Crawl-Ready AEM: Adapting Content for LLM Efficiency and AI Search

As LLMs evolve, so does their need to autonomously crawl, index, and synthesize content from the web and enterprise repositories. This talk explores how crawling is done, how AI search works, and what developers, content strategists, and architects need to understand to optimize their environments for this new wave of indexing technology.

We will give a brief overview of traditional crawling (Googlebot, Bingbot, etc.) and cover best practices in AEM, such as maintaining clear, static documentation files (.md) for structured ingestion and implementing llms.txt to specify crawler policies for LLM bots. We will also look at AI search and the importance of having a brand presence in its results.

Tomasz Sobczyk

How does traffic from LLMs currently compare with traffic from regular search for adobe.com?

(see answer in talk video)

ashrvt

Does adding llms.txt improve LLM visibility and citations?

Flavio Longato

At present there is no evidence that LLMs are requesting llms.txt. We therefore assume that it brings no benefit for LLM visibility or citations.
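One way to check this claim on your own site is to count requests for /llms.txt in your server logs, grouped by user agent. The following is a minimal sketch, assuming logs in the common "combined" access-log format; the function name `llms_txt_hits` and the user agent `ExampleLLMBot` are hypothetical and used only for illustration.

```python
import re
from collections import Counter

# Extracts the request path and user agent from a combined-log-format line.
# Assumption: your server writes this format; adjust the pattern otherwise.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def llms_txt_hits(log_lines):
    """Count requests for /llms.txt per user agent (hypothetical helper)."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue
        # Ignore any query string when comparing the path.
        if match.group("path").split("?")[0] == "/llms.txt":
            hits[match.group("agent")] += 1
    return hits


sample_log = [
    '203.0.113.5 - - [10/Oct/2025:13:55:36 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleLLMBot/1.0"',
    '203.0.113.5 - - [10/Oct/2025:13:55:40 +0000] "GET /llms.txt?v=2 HTTP/1.1" 200 512 "-" "ExampleLLMBot/1.0"',
    '198.51.100.7 - - [10/Oct/2025:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]
hits = llms_txt_hits(sample_log)
```

An empty or near-empty result over a long log window would be consistent with the observation above, although it says nothing about bots that disguise their user agent.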

RolandGruber

How do you know which change affected visibility as the AI models are not updated on a daily basis but maybe monthly?

Flavio Longato

1. We look at the trend over time rather than a single snapshot. 2. We try to implement only one change at a time (e.g. adding an FAQ to certain pages or revamping content), so that we can measure the impact of that single change. 3. Every change we implement is directly linked to a group of prompts. If brand visibility increases for those prompts, we're confident of the impact.
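The approach in this answer can be sketched as a before/after comparison for the prompt group tied to a single change. This is an illustration only, not how LLM Optimizer computes anything: the `Observation` record, the function names, and the definition of visibility as "share of answers that mention the brand" are all assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One sampled LLM answer for a tracked prompt (hypothetical data model)."""
    day: date
    prompt: str
    brand_mentioned: bool


def visibility_rate(observations, prompts, start, end):
    """Share of answers in [start, end) for the given prompts that mention the brand."""
    relevant = [
        o for o in observations
        if o.prompt in prompts and start <= o.day < end
    ]
    if not relevant:
        return None  # no data for this window
    return sum(o.brand_mentioned for o in relevant) / len(relevant)


def visibility_change(observations, prompts, change_day, window_start, window_end):
    """Difference in visibility after vs. before a single change went live."""
    before = visibility_rate(observations, prompts, window_start, change_day)
    after = visibility_rate(observations, prompts, change_day, window_end)
    if before is None or after is None:
        return None
    return after - before
```

Comparing windows of several weeks on each side, rather than two individual days, reflects the "trend over a snapshot" point, and restricting `prompts` to the group linked to the change keeps unrelated changes out of the measurement.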

Ive

How did you fix the drop in Google clicks that you mentioned at the beginning? Also, where is this tool available if we want to try it?

Flavio Longato

Sadly, there is no fix for this. The industry is changing, and brands will get fewer and fewer clicks to their owned channels. Our goal as marketers is now to ensure that: 1. Our brand is visible for relevant terms. 2. The sentiment around how we are displayed is positive. 3. If possible, we optimize for citations, although citation links currently have a <1.5% CTR (from LLM to website).

Ive

Did you consider making use of Apache Lucene for RAG?

Karolis

If sites should avoid JS-rendered content, does that mean that EDS sites have less chance of being crawled by LLMs?

(see answer in talk video)

wolf

Do any LLMs you considered care about pages being fast or accessible, or do they care about it being able to be processed?

Flavio Longato

They care a lot about those About pages. Why? Because they contain relevant branding and company information that adds context to what the brand or company does. Because LLMs render the non-JS version of a page, they don't (currently) care about accessibility, as what they see is often not the final rendered HTML. They do care about speed, mainly due to resource limitations.

Sentham L

What do you mean by (what is the difference between) 'your page is accessible' and 'retrievable'?

Flavio Longato

Accessible: Can they find and access the page, i.e. does your robots.txt or CDN block them from accessing it? Retrievable: Once they have access to a page, can they extract the text being delivered without JS?
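The two checks can be approximated with the Python standard library. This is a minimal sketch under stated assumptions: it only covers the robots.txt half of "accessible" (a CDN block would need a real request to detect), it treats "retrievable" as the text present in the raw HTML outside script and style tags, and the bot name `ExampleLLMBot` and both function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def is_accessible(robots_txt: str, user_agent: str, url: str) -> bool:
    """Accessible: does robots.txt allow this user agent to fetch the URL?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _TextExtractor(HTMLParser):
    """Collects text outside <script> and <style>, as a non-JS client sees it."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def retrievable_text(raw_html: str) -> str:
    """Retrievable: the text available in the HTML without executing JS."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    return " ".join(extractor.chunks)


robots = """User-agent: ExampleLLMBot
Disallow: /private/

User-agent: *
Allow: /
"""
page = (
    "<html><body><h1>Pricing</h1><div id='app'></div>"
    "<script>document.getElementById('app').textContent = 'Plans overview';</script>"
    "</body></html>"
)
allowed = is_accessible(robots, "ExampleLLMBot", "https://example.com/products")
blocked = is_accessible(robots, "ExampleLLMBot", "https://example.com/private/page")
text = retrievable_text(page)
```

In this sample the page is accessible at /products, but only the heading is retrievable: the content that the script would inject never reaches a client that does not run JS.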

sabdouni

Is this product a new SKU? Would it be included in the current AEMaaCS licenses?

Flavio Longato

LLM Optimizer is a new tool and is not part of the current licenses. It can also be used by any website, even if it is not running AEM.

Amine

Does it need a license to be used?

(see answer in talk video)

Tomasz Sobczyk

Is there a reason not to include this inside Sites Optimizer?

Flavio Longato

The main reason it is separate is that it is not bound to AEM: any website can use LLM Optimizer, whereas AEM Sites Optimizer is dedicated to AEM websites only.

Flavio Longato

Another key difference between Sites Optimizer and LLM Optimizer (LLMO): Sites Optimizer is dedicated to on-site optimizations (including SEO), while LLM Optimizer is for brand visibility. The two target different personas, workflows, and opportunities. The opportunities that overlap will be available in both products.

Jesse Pinkman

What is the price of Adobe LLM Optimizer?

Flavio Longato

We're still working on the pricing structure. There are several levels based on your needs and how much support you'd like. Feel free to contact your Adobe representative.

wolf

Does your tool attempt to optimise for LLMs that try to process assets as part of the context, in particular images and documents?

Flavio Longato

Great question! Yes, we're working on several opportunities. One example is an agent that checks: 1. Is the image an illustration, graph, or table? 2. Can we extract that information and deliver it in the HTML so that LLMs can understand the context of the image? We've seen several examples of customers having high-value assets like infographics whose information currently cannot be accessed by LLMs.