How AI Mode and AI Overviews work based on patents and why we need new strategic focus on SEO

Two years ago, in my early quest to understand what would become AI Overviews, I declared that Retrieval Augmented Generation was the future of search. With AI Overviews and now AI Mode wreaking havoc on organic search traffic, that future is here. 

There’s a dearth of good information available about how these search appliances function, so I recently went on a severe deep dive of AI Mode. But I think it’s worthwhile to do an abridged version, tie the two products together, offer some more strategic thinking about how we surf the next wave of generative search, and, in the process, take up more of the AI Overviews about AI Mode with my own content – at least for me.

A Google search for [how does AI mode work]

The future of search is probabilistic, the past was deterministic 

The big picture difference between classic information retrieval (what governs the 10 blue links) and generative information retrieval for the web (what governs conversational search) is that the former is deterministic and the latter is probabilistic. In short, this means that the old version of Google displayed content the same way you delivered it. The new version of Google makes a lot of choices about how content should be considered, stitched together, and displayed.

deterministic ranking

With classic search the content that you put in is parsed and analyzed, but the form in which it appears in the SERPs is just the elements you’ve provided extracted from that content. Google did not interpret the information prior to displaying it. You could change your ranking and performance by adjusting a series of mostly known levers that are features of content, system, site architecture, links, and user signals. 

probabilistic ranking

With generative search, you still prepare your content, system, site, and links to be accessible and parsed, but there are a series of highly variable and invisible reasoning steps that decide whether your content is eligible to be a part of the final response. These reasoning steps also infuse memory of user interactions. So, you can do all your typical SEO common practices, be considered, and not make it to the other side of the reasoning pipeline. LLMs can be temperamental, so the same content could go through the same pipeline twice and yield a different result.
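
If you want to see why that non-determinism happens mechanically, here is a minimal Python sketch of temperature-based sampling, the mechanism that lets an identical input produce different outputs on different runs. The token scores are made up for illustration.

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, rng=None):
    """Sample a token index from raw model scores (logits) at a given temperature."""
    rng = rng or np.random.default_rng()
    scaled = np.array(logits) / temperature       # temperature reshapes the distribution
    probs = np.exp(scaled - scaled.max())         # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Hypothetical scores a model might assign to four candidate tokens
logits = [2.1, 1.9, 0.3, -1.0]

# Two runs over identical input can pick different tokens -- the
# "same pipeline, different result" behavior described above.
print([sample_next_token(logits) for _ in range(5)])
print([sample_next_token(logits) for _ in range(5)])
```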

How AI Mode and AI Overviews work based on patents

AI Overviews and AI Mode are effectively governed by the same mechanisms. We’ll examine the following patents that explain the bulk of how they function:

  1. Search with stateful chat – The primary system architecture for AI Mode.
  2. Generative summaries for search results – The primary system architecture for AI Overviews.
  3. Method for Text Ranking with Pairwise Ranking Prompting – The method of comparing passages via LLM reasoning.
  4. User Embedding Models for Personalization of Sequence Processing Models – The method for creating embeddings that represent user behaviors for personalization through reasoning.
  5. Systems and methods for prompt-based query generation for diverse retrieval – The method for query fan-out. 
  6. Instruction Fine-Tuning Machine-Learned Models Using Intermediate Reasoning Steps – The general explanation of how reasoning works in Google’s LLMs.

Keep in mind that this is the short version and although I share some unique insights here too, you can check out the long form version if you want a deep dive on how AI Mode works.

Overview of how AI Mode works 

AI Mode works by first understanding and forming the context of the user, which informs all the downstream tasks. That context, combined with the query, informs the generation of a series of synthetic queries. Passages are pulled from documents that rank for the query set, and then the query is classified to determine which of a series of LLMs will be used. The passages are then run through a series of reasoning chains, and those that make it through are synthesized into a response. That response is refined based on the embeddings-based user profile, citations are pulled, and then the response is rendered for the user.
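
As a rough mental model (and only that; Google has not published its actual code), the flow above can be sketched as a pipeline of stages. Every function name and body below is a hypothetical placeholder for a subsystem the patent only describes at a high level.

```python
from dataclasses import dataclass, field

# Every function below is a hypothetical stand-in; the bodies are trivial
# placeholders so the sketch runs end to end.

@dataclass
class UserContext:
    prior_queries: list = field(default_factory=list)
    location: str = "unknown"

def interpret(query, ctx):
    return {"intent": f"explain: {query}", "history": ctx.prior_queries}   # steps 954/956

def fan_out(query, intent):
    return [query, f"{query} comparison", f"{query} examples"]             # step 958

def retrieve(queries):
    return [{"url": f"https://example.com/{i}", "passage": f"passage about {q}"}
            for i, q in enumerate(queries)]                                # step 960

def classify(query, docs):
    return "explanatory"                                                   # step 962

def select_models(query_class):
    return ["summarizer"]                                                  # step 964

def reason_over(docs, models):
    return [d for d in docs if d["passage"]]      # placeholder for the reasoning-chain filter

def synthesize(passages):
    return " ".join(p["passage"] for p in passages)                        # step 966

def answer_query(query, ctx):
    """Hypothetical end-to-end flow mirroring steps 952-968 described above."""
    intent = interpret(query, ctx)
    synthetic_queries = fan_out(query, intent)
    corpus = retrieve(synthetic_queries)
    query_class = classify(query, corpus)
    models = select_models(query_class)
    passages = reason_over(corpus, models)
    answer = synthesize(passages)
    return {"answer": answer, "citations": [p["url"] for p in passages]}   # step 968

print(answer_query("how does ai mode work", UserContext(prior_queries=["what is rag"])))
```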

There are several variants of this process contemplated in the Search with stateful chat patent application. Let’s walk through one of the figures step by step, mapping to the system logic. The architecture of Google’s AI Mode, as depicted in FIG. 9 of the patent application, represents a multi-stage, reasoning-informed system that transitions from query interpretation to synthetic expansion to downstream natural language response generation. Each step of this flow has major implications for how visibility is earned, and why traditional SEO tactics are insufficient in this environment.

Step 1: Receive a query (952)

At the moment of user input, the system ingests the query, but unlike classical search engines, this is just the spark, not the complete unit of work. The query is treated as a trigger for a broader information synthesis process rather than a deterministic retrieval request. All the remaining walkthroughs start here, so I will skip describing this step as we review AI Overviews.

  • SEO implication: Your content may not be evaluated solely in relation to the exact query string. It may be evaluated through the lens of how that query relates to dozens of other query-document pairs.

Step 2: Retrieve contextual information (954)

The system pulls user and device-level contextual information: prior queries in the session, location, account-linked behaviors (e.g., Gmail, Maps), device signals, and persistent memory. This helps the system ground the query in temporal and behavioral context.

  • SEO implication: The same query from two different users may trigger completely different retrieval paths based on historical behavior or device environment. This erodes the usefulness of rank tracking and amplifies the role of persistent presence across informational domains.

Step 3: Generate initial LLM output (956)

A foundation model (e.g., Gemini 2.5 Pro) processes the query and context to produce reasoning outputs. This may include inferred user intent, ambiguity resolution, and classification cues. This step initiates the system’s internal understanding of what the user is trying to achieve.

  • SEO implication: Your content’s ability to rank is now filtered through how well it aligns with the intent signature generated here, not just the original lexical query.

Step 4: Generate synthetic queries (958)

The LLM output guides the creation of multiple synthetic queries that reflect various reformulations of the original intent. These could include related, implicit, comparative, recent, or historically co-queried terms, forming a constellation of search intents. This is the query fan-out process that we will discuss further below.

  • SEO implication: Visibility is now a matrix problem. If your content is optimized for the original query but irrelevant to the synthetic ones, you may not be retrieved at all. True optimization means anticipating and covering the latent query space.

Step 5: Retrieve query-responsive documents (960)

Search result documents are pulled from the index, not just in response to the original query, but in response to the entire fan-out of synthetic queries. The system builds a “custom corpus” of highly relevant documents across multiple sub-intents.

  • SEO implication: Your content competes in a dense retrieval landscape, not just a sparse one. Presence in this custom corpus depends on semantic similarity, not ranking position.
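
To make “semantic similarity, not ranking position” concrete, here is a minimal sketch of dense retrieval over a fanned-out query set, assuming an off-the-shelf embedding model like sentence-transformers. The model choice, passages, and top-k value are all illustrative, not Google’s.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # model choice is illustrative

passages = [
    "AI Mode generates synthetic queries to broaden retrieval.",
    "A 301 redirect permanently moves a URL to a new address.",
    "Dense retrieval matches passages to queries in a shared embedding space.",
]
fan_out_queries = [
    "how does ai mode retrieve content",
    "what is dense passage retrieval",
]

passage_embs = model.encode(passages)
custom_corpus = set()
for q in fan_out_queries:
    scores = util.cos_sim(model.encode([q]), passage_embs)[0]   # similarity of q to every passage
    top_k = scores.argsort(descending=True)[:2]                 # top-2 passages per synthetic query
    custom_corpus.update(int(i) for i in top_k)

print([passages[i] for i in sorted(custom_corpus)])             # the "custom corpus"
```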

Step 6: Classify the query based on state data (962)

Using the query, the contextual information, the synthetic queries, and the candidate documents, the system assigns a classification to the query. This determines what type of answer is needed: explanatory, comparative, transactional, hedonic, etc.

  • SEO implication: The type of response governs what type of content is selected and how it’s synthesized. If your content is not structured to satisfy the dominant intent class, it may be excluded, regardless of relevance.

Step 7: Select specialized downstream LLM(s) (964)

Based on the classification, the system selects from a series of specialized models, e.g., ones tuned for summarization, structured extraction, translation, or decision-support. Each model plays a role in turning raw documents into useful synthesis.

  • SEO implication: The LLM that ultimately interacts with your content may never “see” the whole document; it may only consume a passage or a structured element like a list, table, or semantic triple. Format and chunkability become critical.
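
Since the downstream model may only ever see a chunk of your page, it helps to think about how a document splits into passages. Here is a simple sketch of fixed-size, overlapping chunking, one of many possible strategies and not necessarily how Google segments documents.

```python
def chunk_passages(text, max_words=80, overlap=20):
    """Split a document into overlapping word-window passages."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, max(len(words) - overlap, 1), step):
        chunks.append(" ".join(words[start:start + max_words]))
    return chunks

article = "AI Mode retrieves passages rather than whole pages. " * 30
for i, chunk in enumerate(chunk_passages(article)):
    print(i, len(chunk.split()), "words")
```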

Step 8: Generate final output (966)

These downstream models produce the final response using natural language, potentially stitching together multiple passages across sources and modalities (text, video, audio).

  • SEO implication: The response is not a ranked list. It is a composition. Your inclusion is determined not by how well you compete on a page-level, but on how cleanly your content can be reused in the context of an LLM’s synthesis task.

Step 9: Render the response at the client device (968)

The synthesized natural language response is sent to the user, often with citations or interactive UI elements derived from the retrieved corpus. If your content is cited, it may drive traffic. But often, the response satisfies the user directly, reducing the need to click through.

  • SEO implication: Presence does not guarantee traffic. Just as brand marketers used to chase share of voice in TV ads, SEOs now need to measure share of Attributed Influence Value (AIV), and treat citation as both an awareness and trust-building lever.

AI Mode is a complete paradigm shift. Ranking is matrixed and the standard SEO tactics only get your content considered. We will need to do a lot of experimenting as a community to figure out what it takes to reliably get our content cited, but those efforts may not yield any meaningful traffic. The user behavior in AI Mode is more reflective of a branding channel, so we’ll need to measure accordingly.

A summary of how AI Overviews work 

Although AI Overviews have been previously examined, there is value in revisiting them in the context of the new information that has surfaced from examining AI Mode. For instance, the query fan-out technique has not been considered in the various AI Overview studies that compare the classic ranking overlap with AI Overview performance. The matrix of queries used to generate these responses will help us uncover what to do.

There are several methods contemplated in the Generative summaries for search results patent application, but I want to highlight two distinct approaches to AI Overviews. In one, Gemini generates the response first and then looks to corroborate it with content. In the other, it pulls the content and then generates the response.

AI Overview query fan-out for content version

This version of the AI Overview workflow shows how Google builds a response using expanded query sets, drawing from semantically and behaviorally adjacent searches. Instead of simply retrieving results for the explicit user query, the system proactively pulls documents associated with related, recent, and implied queries using a simpler version of the query fan-out technique than what’s used for AI Mode. From that broader corpus, a summary is generated and then verified before presentation. This is a hybrid of fan-out retrieval and reasoning-driven synthesis. Let’s walk through the process step-by-step and what those steps mean for SEO.

Step 1: Receive a query (252)

Step 2: Select documents responsive to the query (254)

Google retrieves a set of documents that respond directly to the user’s query. These are selected using a combination of query-dependent (text match, semantic similarity), query-independent (document authority, embeddings), and user-dependent (personalization) signals.

  • SEO implication: Semantic relevance and topical authority are key. This is your best shot at appearing if you’re optimized for traditional signals. But the classic signals are not the only things considered.

Step 3: Select documents for related queries (256)

The system identifies documents responsive to other known queries that share semantic overlap or behavioral co-occurrence with the original. These documents are added to the retrieval set based on their relevance to related query variants.

  • SEO implication: Ranking for related topics boosts your exposure. This underscores the need for robust topical coverage and well-linked content clusters. Entities and co-relevance matter more than ever.

Step 4: Select documents for recent queries (258)

Next the system retrieves documents that respond to queries the user recently submitted. These may reflect evolving intent or ongoing research behavior in a search journey.

  • SEO implication: Your content may surface in AI Overviews for queries it wasn’t directly targeting, simply because it aligns semantically with prior queries in the same session. This increases the importance of consistency, cross-topic clarity, and journey-based content design. Tactically, you should align content strategy with the jobs to be done framework.

Step 5: Select Documents for Implied Queries (259)

Finally, the system pulls documents for implied queries inferred by the LLM from the phrasing or deeper intent of the original input. These are semantically rich, intent-predicted queries generated in the background.

  • SEO implication: This is the most opaque and disruptive layer. If your content isn’t optimized for what the user is actually trying to accomplish, you’ll never be part of the verification set. You must anticipate what the user actually means.

Step 6: Generate natural language summary (260)

With all document sets assembled, the system uses an LLM to generate a summary answer. It synthesizes content from text, image, or video sources, and may include source citations directly within the summary.

  • SEO implication: You are no longer competing for blue link rank. You are competing to be used by a machine to construct an answer. Your content must be written so that an LLM can easily extract and recombine it. This requires passage-level clarity, entity specificity, and coherence through the use of semantic triples.

Step 7: Generate LLM output with source and confidence signals (260B)

The model may attach source identifiers to passages or indicate confidence levels in certain answers based on how strong the match was between the summary and retrieved documents.

  • SEO implication: Confidence affects visibility. The clearer and more direct your content is in supporting fact-based assertions, the more likely it will be included and cited. Content that hedges, generalizes, or dilutes claims may be excluded.

Step 8: Render the AI Overview (262)

The final output is rendered to the user. Citations may be added as links to verifying sources (262A). Confidence annotations may be displayed (262B), but I imagine this is only for internal purposes. LLM outputs and document comparisons determine what gets cited and how prominently.

  • SEO implication: The click is no longer the primary KPI. Being cited is the primary visibility event that aligns with user behavior in this environment. You must treat passage-level citations as brand lift moments. Measuring citation frequency, position, and sentiment of how your brand is presented is the new SEO metric stack.
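
As a starting point for that metric stack, here is a hypothetical sketch of tallying citation frequency and average citation position by domain across a sample of AI responses. The response data structure is invented for illustration; you would populate it from your own monitoring.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample: each entry is one AI response with its cited URLs in order of appearance.
sampled_overviews = [
    {"query": "best crm for startups", "citations": ["https://example.com/crm-guide", "https://vendor.com/pricing"]},
    {"query": "crm comparison",        "citations": ["https://vendor.com/compare", "https://example.com/crm-guide"]},
]

frequency = Counter()
positions = {}
for overview in sampled_overviews:
    for rank, url in enumerate(overview["citations"], start=1):
        domain = urlparse(url).netloc
        frequency[domain] += 1
        positions.setdefault(domain, []).append(rank)

for domain, count in frequency.most_common():
    avg_pos = sum(positions[domain]) / len(positions[domain])
    print(f"{domain}: cited {count}x, average citation position {avg_pos:.1f}")
```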

The AI Overview generate-first process

The AI Overview system, as depicted in FIG. 3 of the patent application, outlines a generative retrieval-and-verification architecture. In this version, the system generates the response first, then uses the search results to verify it, comparing retrieved content against the original draft. It loops through this process until a validated version of the response can be returned.

Let’s walk through the process step-by-step and consider what it means for SEO. 

Step 1: Receive a query (352)

Step 2: Generate natural language summary using LLM (354)

The system uses an LLM to generate an answer. This output is not simply pulled from one document; it may be synthesized from content tied to the query itself, to related queries, or to recent queries issued by the same user. Additionally, the model may use rewrites or paraphrases of the original query to expand the scope of the answer.

  • SEO implication: Since the LLM composes the answer, the SEO play is being present in the training data. Being in the training data will likely yield a resurgence of microsites and benefit brands with large owned media portfolios.

Step 3: Select portion of summary for verification (356)

After generating the full summary, the system selects individual segments or claims that need to be verified against actual documents.

  • SEO implication: Your content may not need to “win” across an entire page. If a single paragraph or sentence provides clean support for a generated claim, you may be cited. Engineering for chunk-level clarity with concise, factual, retrievable content is critical.

Step 4: Determine candidate documents for verification (358)

Now the system needs to confirm that the generated claim is supported by actual published material. It does this in two ways: by semantically comparing the summary portion to passages in previously retrieved documents (358A), or by issuing a new search using the summary portion itself as a query (358B).

  • SEO implication: You may be cited even if you weren’t part of the original retrieval set. This means your content can be surfaced through the equivalent of quote searches on passages. Your goal is to write passages that are structured like the kinds of answers users and language models are likely to generate.

Step 5: Determine whether candidate document verifies the portion (360)

The system compares the candidate document passage to the generated summary portion to determine if it verifies the claim. This is a semantic alignment check between the summary and candidate content.

  • SEO implication: This is the moment of truth. If your passage is too vague, too salesy, or not factually anchored, it won’t verify. LLMs reward content that is precise, explainable, and logically aligned with user intent.
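
A minimal sketch of what that verification check could look like, assuming an embedding-based similarity comparison: the generated claim is accepted only if a candidate passage clears a similarity threshold. The model and the threshold here are illustrative, not Google’s actual values.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # illustrative model choice

def verifies(summary_portion: str, candidate_passage: str, threshold: float = 0.6) -> bool:
    """Semantic alignment check between a generated claim and a candidate source passage."""
    emb = model.encode([summary_portion, candidate_passage])
    return float(util.cos_sim(emb[0], emb[1])) >= threshold

claim = "Query fan-out issues multiple synthetic queries for each user query."
candidates = [
    "For every incoming query, the system generates several synthetic reformulations.",
    "Our product is the best choice for every business.",
]
for passage in candidates:
    print(verifies(claim, passage), "-", passage)
```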

Step 6: Verification decision (362)

If verification succeeds, the system proceeds to cite the passage. If it fails, it tries another candidate document.

  • SEO implication: This introduces a content meritocracy at the passage level. It’s not about who ranks first, it’s about who best supports the synthesized idea. Many SEO-visible pages will fail this test.

Step 7: Linkify Verified Portion (364)

If a passage is verified, the corresponding segment of the summary is linked as its citation, typically with a scroll-to-text reference that sends the user to the verified source.

  • SEO implication: This is the new “rank.” Citation in the AI Overview is how users now encounter your content. Being referenced as the source of truth, especially high in the answer, delivers visibility and supports brand awareness even without a click.

Step 8: Repeat for additional passages and documents (366, 368)

If the summary contains additional unverified segments, the system loops to identify and verify them as well.

  • SEO implication: Every paragraph in your content is a potential entry point. The more retrievable and verifiable chunks your content contains, the more opportunities you have for multi-citation across AI Overviews.

Step 9: Render final AI Overview (370)

Once all portions are verified (or at least those that can be), the AI-generated summary with inline citations is presented to the user.

  • SEO implication: Traffic may or may not follow, but brand presence, perceived authority, and user trust absolutely depend on your presence in this response. Being absent from the final rendering means you’re invisible in the most prominent part of modern search.

AI Overviews don’t rank content – they remix it and use existing content to validate it. And to be remixed, your content must win at the intersection of language model comprehension and multi-query relevance.

How query fan-out works in Google’s AI surfaces

The query fan-out technique is the invisible sauce behind both AI Overviews and AI Mode. Google extrapolates a series of so-called synthetic queries based on the explicit query, implicit information needs, and user behavior. These queries are used to inform what other documents are retrieved to inform “grounding” of the results. Although we will likely never get visibility into this data, both Andreas Volpini and I have separately made tools to help understand what those queries might be.

This diagram from the Systems and Methods for Prompt-Based Query Generation for Diverse Retrieval patent application shows how Google trains a query expansion model. Unlike traditional keyword expansion, this system uses LLMs to generate synthetic query-document pairs and trains a document retrieval model that can interpret user queries more broadly, drawing on multiple prompts to generate diverse interpretations of intent.

The previous patents already told us how it works, but let’s break down the training workflow step by step and explain the SEO implications of each stage.

Step 1610: Receiving prompts for a retrieval task

The system begins by receiving at least two prompts that describe the retrieval task it is meant to solve. These prompts instruct a large language model on how to generate variations of queries that could retrieve relevant content from a given corpus.

  • SEO implication: This is where user intent begins to fracture into multiple pathways. Google is no longer just learning to answer queries, it’s learning how to generate queries. If your content only aligns with one phrasing of a question, you may miss retrieval entirely. Success now depends on your content aligning with the broader intent space that can be articulated from different perspectives.

Step 1620: Generating synthetic query-document pairs with an LLM

Based on the prompts and the document corpus, the system uses an LLM to create a synthetic training dataset. Each entry is a pair: a synthetically generated query and a document from the corpus that could answer it. This effectively teaches the model which types of questions a given piece of content can satisfy, even if no user has ever searched that way before.

  • SEO implication: This is the core of the Query Fan-Out technique. Your page isn’t just evaluated against real queries, it’s tested against an LLM’s imagination of all the ways a user might ask for what you offer. Your content needs to be semantically robust, clearly structured, and richly aligned with multiple possible interpretations. Think: definitions, comparisons, FAQs, use cases, and scenario-based framing.
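
Here is a hedged sketch of what generating synthetic query-document training pairs could look like. The call_llm() function is a placeholder for whatever model endpoint you have access to, and the prompt templates are illustrative, not the patent’s wording.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g., your provider's chat/completions endpoint)."""
    return "what does passage-level retrieval mean for seo"   # canned response for the sketch

corpus = [
    "Passage-level retrieval lets search systems cite a single paragraph instead of a whole page.",
    "Query fan-out expands one user query into many synthetic reformulations.",
]

prompts = [
    "Write a question a searcher might ask that this passage answers:\n\n{doc}",
    "Write a comparison-style question this passage could help answer:\n\n{doc}",
]

# Each (synthetic_query, document) pair becomes one training example for the retriever.
training_pairs = []
for doc in corpus:
    for template in prompts:
        synthetic_query = call_llm(template.format(doc=doc))
        training_pairs.append((synthetic_query, doc))

for q, d in training_pairs:
    print(q, "->", d[:50])
```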

Step 1630: Training the retrieval model on synthetic pairs

The model is trained to understand the relationships between synthetic queries and relevant documents. This results in a document retriever that can accept real-world queries and infer which documents, across many latent intents, are most appropriate.

  • SEO implication: This further reinforces the idea that you’re not just being matched to a static keyword string anymore. You’re being retrieved based on how well your content semantically aligns with a matrix of machine-generated queries. Traditional keyword-first strategies won’t help you here. You need content that hits high-dimensional conceptual alignment. It means you gotta get on the embeddings train.

Step 1640: Delivering the trained retrieval model

Once trained, the model becomes the retrieval engine deployed in systems like AI Overviews and AI Mode. It sits behind the scenes, taking a user’s query and triggering a fan-out of related, implicit, comparative, and historically-relevant queries, retrieving content for each and then merging the results into a generative synthesis.

  • SEO implication: This is what you’re optimizing for. Not a keyword match. Not even a ranking position. But a position within a synthetic query universe, ranked by relevance to latent intent. If your content doesn’t show up in the results of these fan-out queries, it may never reach the generation layer, and you won’t be cited even if your page broadly has a high cosine similarity.

The query fan-out technique will make these efforts more like reputation management campaigns. Since there’s focus on source diversity to verify information, marketers will look to spread their messages across multiple pages and many sites to ensure that Google encounters their content no matter what they retrieve.

Memory and personalization based on user embeddings for AI Mode

One of the more fascinating features Google is bringing to AI Mode is Personal Context. Soon you’ll be able to integrate much of your data from across the Google ecosystem into Search to inform personalization of responses. While a compelling feature (with wide-reaching privacy implications) it also poses complications for measurement.

This diagram from FIG. 4 of the patent application titled User Embedding Models for Personalization of Sequence Processing Models reveals how Google’s systems, particularly in AI Mode, incorporate user-specific context embeddings to personalize how queries are interpreted and answered.

Rather than treating every query as standalone, this system builds a persistent vector-based profile for each user based on their interaction history, preferences, and behavior. That profile then conditions how AI Mode interprets queries and ranks or generates responses.

Below is a step-by-step breakdown of the process, aligned with SEO implications for each stage.

Step 402: Obtain contextual data from the user

The system collects a wide range of user-associated signals including prior search queries, engagement with content, browsing behavior, clicks, time on page, location, device type, and more.

  • SEO implication: The same query issued by two different users may result in completely different retrieval sets. Ranking is no longer global. It’s contextual. Brands can no longer rely on being “the best result overall.” Rank tracking will likely need to be done through Google accounts that emulate certain user activities to represent a persona. 

Step 404: Generate embeddings representing the contextual data

An embedding model processes the user data and creates a dense vector representation of that user’s contextual profile. This becomes the personalization signal that gets paired with incoming queries.

  • SEO implication: This profile sits beside every search as it enters the system. That means your content isn’t just competing on query relevance; it’s also being filtered through how well it aligns with the user’s embedded context. If you’ve never created content for that user segment, or if your site is disjointed or confusing, you’ll be less likely to appear.

Step 406: Receive a task instruction (the search query)

The system receives a task instruction. In the case of AI Mode, this is typically a search query or user prompt.

  • SEO implication: This is the only part of the system most SEOs optimize for: the input query. But in a system personalized by embeddings, this query is only one part of a much larger inference stack. Optimizing for intent clusters and user journeys, not just head terms, is something we’ve always done, but now it’s a requirement.

Step 408: Embed the Task Instruction

The query or prompt itself is also embedded, using a separate vector space. The system now has two primary inputs: the user’s contextual embedding and the query’s semantic embedding.

  • SEO implication: Relevance is being determined in multi-vector space. You’re being matched not just to what the user asked for, but to their activities. This makes entity richness, topical breadth, and audience-aligned expectations much more important. If your content fails to align with the expectations or framing that appeals to a given persona, it won’t make it through the reasoning chain.

Step 410: Combine embeddings to generate a personalized output

The system fuses the user profile embeddings and the query embeddings. Together, they inform what content is retrieved or synthesized. The model’s final output is conditioned on this fusion, which means it’s personalized at the reasoning level, not just in ranking order.

  • SEO implication: This is where AI Mode becomes most different from traditional search. It’s not just that results are re-ranked. The answer itself is different. Your content might not appear in one user’s AI Mode output at all but could be featured prominently for another because it aligns better with their embedded context.
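
Mechanically, that fusion could be as simple as concatenating or blending the two vectors before they condition retrieval or generation. This sketch uses placeholder vectors just to show the shape of the idea; the real system’s embedding models and fusion method are not public.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder embeddings; a real system would produce these with trained models.
user_profile_embedding = rng.normal(size=128)   # built from history, preferences, behavior
query_embedding = rng.normal(size=128)          # built from the current query text

# One simple fusion: concatenate, so downstream layers see both signals at once.
fused = np.concatenate([user_profile_embedding, query_embedding])

# Another: a weighted blend in a shared space, biasing retrieval toward the user's context.
alpha = 0.3
personalized_query = (1 - alpha) * query_embedding + alpha * user_profile_embedding

print(fused.shape, personalized_query.shape)    # (256,) (128,)
```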

To win here, your content must:

  • Be useful across multiple personas and use cases
  • Be written in styles and formats that match diverse informational preferences
  • Support generative reuse in personalized contexts (e.g., comparisons, recommendations, regional relevance)

Relevance is no longer just about matching the query. It’s about fitting the reasoning process that starts with who the user is. AI Mode is the end of “one-size-fits-all” content optimization and beginning of an even more fractured information discovery experience.

Reasoning in Google’s AI Mode 

Reasoning in LLMs refers to the model’s ability to go beyond surface-level pattern matching and instead perform multi-step, logical, or inferential thinking to reach a conclusion. Rather than simply retrieving or repeating information, an LLM engaged in reasoning evaluates relationships between concepts, applies context, weighs alternatives, and generates responses that reflect deliberate thought. This allows it to answer complex questions, draw comparisons, make decisions, or synthesize information across multiple sources much like a human would when “thinking through” a problem. Google’s AI surfaces employ this process to determine what information should be used in a final response. 

How reasoning works in Google LLMs

The diagram in FIG. 7 from the Instruction Fine-Tuning Machine-Learned Models Using Intermediate Reasoning Steps patent application represents a training-time process for Google’s machine-learned sequence models, including LLMs that power AI Overviews and AI Mode. It shows how reasoning is explicitly trained and evaluated, not just by measuring final answers, but by analyzing the intermediate steps the model takes to reach those answers known as a reasoning trace.

This diagram is essential to understanding why SEO needs to evolve. If the model is being trained to reason step-by-step, then our content needs to support not just retrieval, but inference. Below is a breakdown of each step in the reasoning training process, and the corresponding SEO implication.

Step 702: Obtain training examples for the sequence model

The system begins by collecting a diverse set of training examples. These pairs of queries and expected outputs are used to train a machine-learned sequence model like a large language model.

  • SEO implication: The model is not being trained solely to retrieve relevant documents. It’s being trained to understand queries, synthesize responses, and produce reasoning sequences that align with human-labeled examples. Your content must align with the kinds of passages that make sense not only as outputs, but as steps in a multi-hop reasoning process.

Step 704-1: Input a query from the training set

For each training example, the model receives a query. This is the start of the reasoning chain.

  • SEO implication: Visibility begins with how the model interprets the query. If your content is only relevant to exact-match phrasing, it will miss. Content should be structured to match common question types, reformulations, and implied sub-questions.

Step 704-2: Input the query into the sequence model

The query is run through the LLM, initiating a forward pass to produce a predicted response.

  • SEO implication: This is the phase where content retrieval and synthesis begins. Your content is evaluated not just on keyword overlap, but on how well it supports the downstream composition of a coherent, fact-based response. This favors well-structured, extractable passages.

Step 704-3: Capture the response and its reasoning trace

The model doesn’t just output an answer. It produces a structured record of the intermediate steps or latent decisions it used to arrive at the answer. These might include document selection, passage scoring, fact extraction, or sub-question chaining.

  • SEO implication: This is where traditional SEO breaks down. If your content only serves one-step lookups or is too shallow to support reasoning hops, it won’t be part of the trace. To win here, your content must contain multi-faceted answers, be rich in entities and relationships, and support logical or causal progression.
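
To make “reasoning trace” concrete, here is a hypothetical structure for what those intermediate steps might look like. The step names and fields are invented for illustration; the patent does not publish a schema.

```python
# Hypothetical record of intermediate reasoning steps for one query.
reasoning_trace = [
    {"step": "sub_question", "value": "what signals does ai mode use for personalization"},
    {"step": "select_passage", "source": "https://example.com/user-embeddings",
     "passage": "User profiles are stored as dense embeddings built from interaction history."},
    {"step": "extract_fact", "value": "personalization is driven by a persistent user embedding"},
    {"step": "compose", "value": "AI Mode personalizes answers using embeddings of prior behavior."},
]

# Content that only supports the final step is easy to skip; content that feeds the
# intermediate steps (definitions, relationships, causal links) becomes part of the trace.
for step in reasoning_trace:
    print(step["step"], "->", step.get("value", step.get("passage", ""))[:60])
```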

Step 704-4: Evaluate the final answer against a ground truth

The output is compared to a human-annotated correct response to determine whether the model got the answer right.

  • SEO implication: This is where factual accuracy and completeness are rewarded. If your content is overly generalized, speculative, or optimized for clickbait over clarity, it won’t support accurate synthesis and will be excluded from future training signals.

Step 704-5: Evaluate the reasoning trace against ground truth

Even if the final answer is correct, the steps used to get there are also evaluated. If the model took a path that’s illogical, inefficient, or not human-aligned, it’s penalized, even if it got the right answer.

  • SEO implication: This is a game-changer. Your content is being judged not just on whether it ends up in the answer, but whether it helps the model reason in the right way. Clear headings, explicit logical structures, and semantically complete passages now matter more than ever.

Step 704-6: Update model parameters based on both answer and trace evaluation

The model is fine-tuned using the results of both evaluations. This ensures it’s not just learning what to say, but how to think like a human searcher or subject matter expert.

  • SEO implication: Over time, LLMs are being trained to favor content that helps them reason well. SEO needs to evolve into relevance engineering by structuring and contextualizing content to match the paths models take to synthesize accurate, high-confidence answers.

How pairwise passage-based reasoning works in AI Mode

The diagram from FIG. 4 of the patent application titled Method for Text Ranking with Pairwise Ranking Prompting shows how AI Mode performs reasoning-based re-ranking by comparing passages against one another in pairs. This technique bypasses traditional scoring models like BM25 or simple vector similarity, and instead uses LLMs to judge relevance in context.

What this means in practice is that your content is not scored in isolation. It’s scored in a head-to-head comparison against competing passages. And the decision is made by a generative model performing reasoning tasks, not just measuring term overlap.

Here’s a breakdown of the workflow, aligned with SEO implications at each stage.

Step 402: Generate prompt with query and two candidate passages

The system generates a prompt that includes a user query, a first passage (text from one candidate document), and a second passage (from another candidate). These are framed in a way that allows a language model to evaluate which is more relevant.

  • SEO implication: Your content is now being placed in direct competition with others, passage vs. passage. It’s not enough to be generally relevant. You must be more useful, more precise, or more complete than the next-best option. If your content doesn’t deliver clarity and distinctiveness in small sections, you lose the round.

Step 404: Prompt the LLM for comparison

This prompt is submitted to a generative sequence processing model (such as Gemini 2.5 or similar). The model reads the query and both candidate passages, and is expected to compare them on semantic grounds.

  • SEO implication: Traditional relevance signals like keyword density, internal links, or even core web vitals aren’t used here. Your passage must stand up to interpretive reasoning. This favors content that is:
    • Clear about who, what, why, and how
    • Context-rich without being bloated
    • Structured in natural, conversational terms

Headlines, formatting, and embedded summaries all help here. So do strong introductory sentences that convey clear value.

Step 406: Perform pairwise reasoning between passages

The LLM evaluates which of the two passages better satisfies the user’s query. It may do this using internal chain-of-thought reasoning or learned relevance heuristics based on fine-tuned training.

  • SEO implication: The model isn’t measuring scores. It’s forming judgments. Think of it like an editorial process where our content is being evaluated as if by a temperamental reviewer. This means vague, hedging, or generic writing underperforms. You want to win these matchups by providing factually grounded, intent-aligned, and entity-rich content.

Step 408: Output a ranking decision

The model outputs a result of which of the two passages should be ranked higher for the query. This decision can be recorded as part of a training loop or used in real-time to determine which content enters the AI Mode synthesis.

  • SEO implication: Ranking is now comparative, not absolute. You’re not being judged in a vacuum. Every passage is scored based on how it stacks up against another plausible answer. 
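
Putting steps 402 through 408 together, here is a hedged sketch of pairwise ranking prompting: a prompt frames the query with two passages, an LLM judge picks the more relevant one, and repeated head-to-head comparisons produce an ordering. The judge() function is a placeholder standing in for a real model call.

```python
from itertools import combinations

PROMPT = (
    "Query: {query}\n\n"
    "Passage A: {a}\n\nPassage B: {b}\n\n"
    "Which passage better answers the query? Reply with 'A' or 'B'."
)

def judge(prompt: str) -> str:
    """Placeholder LLM judge: prefers the longer passage just so the sketch runs."""
    a = prompt.split("Passage A: ")[1].split("\n\nPassage B:")[0]
    b = prompt.split("Passage B: ")[1].split("\n\nWhich passage")[0]
    return "A" if len(a) >= len(b) else "B"

def pairwise_rank(query, passages):
    wins = {p: 0 for p in passages}
    for a, b in combinations(passages, 2):
        prompt = PROMPT.format(query=query, a=a, b=b)
        winner = a if judge(prompt) == "A" else b
        wins[winner] += 1          # each head-to-head win counts toward the final order
    return sorted(passages, key=lambda p: wins[p], reverse=True)

passages = [
    "AI Mode compares passages in pairs using an LLM rather than a static score.",
    "Buy now!",
    "Pairwise ranking prompting asks a model which of two passages better satisfies the query.",
]
print(pairwise_rank("how does pairwise ranking prompting work", passages))
```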

These patents confirm that reasoning is now part of the ranking pipeline. Your content isn’t just being retrieved, it’s being tested for how well it contributes to the model’s thought process. You’re not just optimizing for keywords. You’re optimizing for inference.

Comparison of AI Mode & AI Overviews functionality

There is a lot of overlap between how AI Overviews and AI Mode function. Our mental model of how search works is also evolving, so I share the following table as a cheat sheet to help clarify the differences. 

| Functionality | AI Overviews | AI Mode |
| --- | --- | --- |
| Trigger Mechanism | Triggered automatically on specific queries within traditional Google Search | Activated when a user enters Gemini-style search in the AI Mode section of Google Search |
| User Experience Context | Embedded within traditional SERPs; complements standard organic listings | Full-screen AI-native interface; replaces SERPs with interactive, agent-like experiences |
| Query Fan-Out | Performs limited internal expansion to support summary generation | Performs broad query fan-out using latent intents and several types of synthetic queries |
| Content Retrieval Approach | Retrieves candidate documents via standard search index (web ranking) with additional LLM scoring | Uses dense retrieval and LLM-based pairwise ranking over passage-level embeddings |
| Content Unit of Retrieval | Full documents with salient passages summarized | Individual passages or chunks optimized for retrieval, reasoning, and citation |
| Citation Strategy | Citations are embedded in snippets (scroll-to-text or in-line references) | Citations are selected based on alignment with reasoning steps, not necessarily ranking |
| Reasoning and Answer Generation | Uses extractive and abstractive summarization via prompted LLM responses | Compositional reasoning across passages using chain-of-thought generation |
| Personalization | Minimal personalization beyond location and past queries | Leverages user embeddings, device context, and past interaction history |
| Multimodal Integration | Limited to text and links, possibly pulling from videos or images implicitly | Natively supports synthesis across modalities (text, image, video, audio, structured data) |
| Citation Relevance Criteria | Based on source ranking and salience in answer | Based on how directly the passage supports reasoning or answer (per US20240362093A1) |
| Output Format | Static answer blocks with citations, often bullet points or brief prose | Dynamic interactive interface (e.g., cards, timelines, tables, agents) that adapts to query type |
| Source Pool | Typically drawn from top-ranking organic documents across the limited synthetic query set | May include documents not in the top SERP, based on a broad set of synthetic queries selected via similarity and reasoning relevance |

The ‘it’s just SEO’ argument misses the point

There’s a persistent argument in the SEO community claiming that optimizing for AI surfaces (AI Overviews, AI Mode, or other conversational search platforms) isn’t a new discipline. It’s just SEO. That argument feels familiar. It’s the same tired energy we saw in the never-ending subdomain vs. subdirectory debate, or the endless 301 vs. 302 discourse. But this one is more consequential because it’s not just a technical disagreement. It’s a missed opportunity to reframe the entire value proposition of search.

We are at a real inflection point. The first in decades where we can reframe the value proposition of search itself. And yet within our own ranks (heh), we’re minimizing it, trying to fold it into a decades-old discipline that’s increasingly defined by low expectations and misaligned incentives. That opportunity should not be shrugged off in favor of protecting legacy definitions and navigating the fear of one’s eroding expertise. 

Yes, technically, this could be rolled under the SEO umbrella. We’ve done that before. In fact, we do it every time Google socially engineers our community to execute its goals. But doing so now would be a massive strategic mistake.

This debate misses the moment

AI has the world’s attention. Conversational interfaces are becoming the new front door to information discovery. And these new surfaces are largely unclaimed as a marketing channel.

Meanwhile, SEO is already burdened with perceptions that limit its influence. In the C-Suite, SEO is viewed as a cost-saving channel. It’s associated with “free traffic.” And ironically, that framing, coined by the very people who built this channel into the web’s top referral source, has damaged our ability to command budgets, headcount, or strategic consideration in line with the value we create.

The “it’s just SEO” mindset doesn’t just miss the nuance. It reinforces a ceiling that’s been holding this field back for years. It keeps us stuck in the KPIs of yesterday, when what we need is a seat at the table in shaping the next frontier of information access. What’s happening in AI search isn’t just a new SERP layout. It’s a fundamental rearchitecture where language models reason about content, rank passages, and deliver synthesized answers.

That’s not just SEO. That’s something new.

Learn from other channels

There’s another algorithmic channel where content is the price of entry. It’s unpredictable. It’s difficult to attribute. It’s rarely expected to drive conversions on day one. And yet, the C-Suite doesn’t need projections, they just keep investing. 

That channel? Social media.

And what is social media marketing, really? It’s just channel-specific content strategy. But social media marketers didn’t bury it inside content strategy. They gave it a name. So the C-Suite gave it budget. They gave it power. And it became a category.

We can, and should, do the same for conversational search.

The argument itself references an idealistic SEO that barely exists

When people say this is just SEO, they’re referencing an idealized version of the discipline that exists more in theory than in practice.

At iPullRank we actually do these things, so when I talk about them at conferences or in blog posts, I’ve often been told that the type of work we do sets an unrealistically high bar. That bar includes things like computational linguistics, deep understanding of retrieval systems, entity and true semantic optimization, and the ability to build software when the market doesn’t offer tools that do what we need. Our clients hire us because we give real answers, not just “it depends.” And they tell us that our work is more thorough than what they’ve seen from other SEO agencies.

That’s because most of the industry isn’t doing SEO at this level. It’s running old playbooks found online and delivering direct exports from tools dressed up as insight. So when someone says optimizing for AI Overviews and AI Mode is “just SEO,” what they’re really saying is that the understanding of dense retrieval systems, passage-level semantic modeling, and reasoning is already commonplace. And that’s just not true because SEO software does not account for them.

This isn’t a knock on the broader industry. It’s a call for honesty in a discussion that is limiting our ability to evolve. Most SEO today is tactical, reactive, and stuck in a paradigm optimized for ranking documents, not improving the reasoning capabilities of passages.

Instead, what’s happening is people are following the same old best practices, using the same obsolete tools, and seeing diminishing returns, which further erodes confidence in SEO. 

Are we here to do an expensive form of arts and crafts? Or are we here to drive business results? I know what I’m here for, so the argument is over for me.

This isn’t a fight over definitions. It’s a fight over perception.

And perception drives investment.

Calling it “just SEO” ensures we remain undervalued, underfunded, and misunderstood, which is especially bad at a time when visibility, attribution, and even clicks themselves are being abstracted behind generative interfaces.

Former SEO evangelist Rand Fishkin said it best at SEO Week in his “You’re Bigger than SEO” talk: SEO has a branding problem that it is not likely to overcome.

So a shift to Relevance Engineering (r17g) isn’t just semantics. This is strategy.

We have a chance to define the category before someone else does. Let’s not waste it defending the past.

