How to measure AI visibility now that precision is gone

The funnel query pathway (FQP), the cohort-with-intent tree you populate from the conversion node upward, is the measurement framework for AI visibility. Measuring the FQP each quarter produces a defensible strategic read you can actually act on.

The shift the methodology operationalizes is what I call the micro-macro shift. You can’t measure AI-era visibility with the precision micro (ranking) instruments search trained us to expect because assistive engines and agents are too opaque for micro-level measurement. Macro is the only available discipline.

Why the precision we used to take for granted no longer applies

The same microeconomics-versus-macroeconomics distinction I drew earlier applies here: corner shop versus Bank of England, micro instruments versus macro instinct, with neither set of tools working in the other’s environment.

AI-era visibility lives in the same kind of macro environment that forced economics to develop a different measurement discipline, and it forces our industry to do the same.

Our industry operated at a micro scale with ranking and tracking, but the instruments we use for search don’t apply in AI. Microeconomics versus macroeconomics is the canonical case.

The structural property is brand-user-algorithm (BUA) opacity. The consequence matters here: four layers of opacity operate on every AI-era brand recommendation, and the brand has no visible signal at any of them.

The brand is opaque to the engine inside the walled garden. The user is opaque to themselves about how the engine reasoned on their behalf.

The engine is opaque to itself because the interpretability problem in large language models remains unsolved.

The brand is opaque to its own claim-level abstention events when the engine encounters contradictions in the corroboration spine and silently declines to surface a specific claim.

The conversion rate softens, and the brand can’t see which contradiction caused the softening.

BUA opacity is why micro instruments fail on assistive and agential surfaces. You can’t change that opacity.

It’s the environment you’re working in, and my methodology projects through it at the macro level, delivering trend rather than precision and accepting that the right answer is the one that holds up over time rather than the one that’s exact in the moment.

Where micro measurement still works, and where macro takes over

Micro and macro coexist. Three modes operate in parallel in 2026.

  • Search (essentially micro) hasn’t gone anywhere. It’s growing.
  • Assistive (essentially macro) has emerged alongside it.
  • Agent has emerged alongside both (a happy mix of micro and macro).

Each mode has its own measurement environment, and the approach that makes sense for your business depends on the data that environment can provide.

From my Google Marketing Live 2026 SEA Lab keynote: the three modes (search, assistive, agent) coexisting in 2026, each fulfilling a different need.

Search keeps the user in control

The user types a query, the engine returns 10 options, and the user picks one. The brand can see the query, track the position, measure the click, follow the session, and attribute the conversion.

Micro instruments work because the environment supports them, and brands working with search-era buyers on search-era surfaces should keep running micro strategies for those buyers. So the way you measure search doesn’t change, unless you want to add a macro methodology, which I personally think is a good idea.

Assistive narrows the choice at the user’s request

The user asks ChatGPT, Perplexity, Claude, Gemini, or Copilot for a recommendation, and the engine retrieves, synthesizes, and commits to one or two options on the user’s behalf.

The brand doesn’t see the sequence of exchanges, the retrieval, the synthesis, or, most importantly, the alternatives the engine considered before committing. You can see the conversion, but you can’t attribute it explicitly.

The entire journey runs inside walled gardens where you can’t measure with micro instruments, which means macro is the only available discipline. Assistive is the most elusive of the three.

Agent removes the choice from the user entirely

The user delegates, the agent executes, and the brand receives the order. The negotiation and transaction are observable, attributable, and measurable: the agent queried, negotiated, and (hopefully) bought your product, and you can micro-measure that.

What you can’t see is why the agent chose your product over its competitors, because the decision logic the agent applied happened inside the agent, drawing on retrievals, comparisons, and reasoning the brand has no visibility into.

The pathway to the conversion is macro, but the conversion itself is micro.

The buyer chooses the surface

You might think search, assistive, and agential are a simple split where you can apply a dedicated measurement system to each. Not so.

Buyers move between search, assistive, and agent surfaces depending on what they’re buying, why, and how complex the decision is, often within the same journey. The brand doesn’t choose which surface its buyer will use. The buyer does, case by case, and the measurement (and strategic) methodology has to handle every surface mix the buyer chooses.

That’s why macro is the only viable solution.

How you measure defines your methodology

The clearest way to show and tell is to translate each search-era measurement into its AI-era equivalent. Here’s my take, though every practitioner running this work seriously will have their own opinion on every row, which is the point: how you define each row becomes the foundation of your methodology.

The funnel query pathway defines which queries I’m going to track, and the table below applies the same logic to every other measurement decision. The differences between practitioners on these rows will become increasingly visible in our measurement outputs over the coming months and years, and that visibility is the methodological signal worth paying attention to.

The macro methodology I’m publishing here is in its infancy. I started building it seriously this year, and the table below reflects my current position after several months of thought, analysis, and live data collected since 2015.

I’m doing my best to finalize this list before the end of 2026 and freeze the methodology from January 2027 because of a constraint that matters: once you change a parameter, you lose direct comparability with everything you measured before the change. Quarter-eight compounding is only meaningful if the methodology stays stable across all eight quarters.

| | Search | Assistive | Agential |
|---|---|---|---|
| Engine visibility | CTR-weighted share of the keyword cohort, normalized over time | The FQP queries in their conversational surface form, each in an active or aspirational state | Share of agent invocation events (catalog queries, mandate submissions, transactions) against the addressable agent surface |
| Buyer cohort definition | The FQP queries in their search-context surface form, each in an active or aspirational state | The FQP queries in their conversational surface form, each in an active or aspirational state | The FQP queries in their agent-readable form, each in an active or aspirational state |
| Authority signal share | Share of corroboration authority within the category, normalized over time | Share of independent corroboration in the brand-trigger phrase context | Share of operational-evidence completeness against what the agent needs to verify before committing (pricing, terms, availability, fit) |
| How you change the output | Publish, structure, distribute against the cohort, and measure the shift quarter over quarter | Engineer the operational surface for agent legibility through MCP, structured data, and machine-actionable interfaces, and measure the shift quarter over quarter | Share of citations and mentions within the brand-trigger phrase cohort, weighted by prominence in the synthesized answer |
| Revenue and profit attribution | Share of revenue and margin from the search-mode cohort | Share of revenue and margin from the assistive-mode cohort, identified through referrer signals and user-agent strings | Share of revenue and margin from the agential-mode cohort, captured through agent-mandate logs and MCP telemetry |

Take the measurement, express it as a share of the cohort, normalize it over time, and report the trend rather than the snapshot. That’s the move in every cell of the table, and it’s what makes the three columns directly comparable.
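
As a rough illustration of that move, here is a minimal Python sketch. It assumes "normalize over time" means indexing each quarter's share against the first quarter (baseline = 100); the article does not specify the normalization, so treat that choice, the `QuarterReading` shape, and the numbers as hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QuarterReading:
    """One quarter's raw counts for a single table cell (hypothetical shape)."""
    quarter: str          # e.g. "2026-Q1"
    brand_count: float    # appearances, clicks, or events credited to your brand
    cohort_total: float   # the same measurement summed over the tracked cohort


def share_of_cohort(reading: QuarterReading) -> float:
    """Express the raw measurement as a share of the cohort (0.0 to 1.0)."""
    if reading.cohort_total <= 0:
        return 0.0
    return reading.brand_count / reading.cohort_total


def indexed_trend(readings: list[QuarterReading]) -> list[tuple[str, float]]:
    """Index each quarter's share against the first quarter (assumed baseline = 100)."""
    shares = [share_of_cohort(r) for r in readings]
    baseline = shares[0]
    if baseline == 0:
        raise ValueError("baseline quarter has zero share; pick a later baseline")
    return [(r.quarter, round(100 * s / baseline, 1)) for r, s in zip(readings, shares)]


# Illustrative numbers only, not real data.
assistive = [
    QuarterReading("2026-Q1", 12, 100),
    QuarterReading("2026-Q2", 15, 100),
    QuarterReading("2026-Q3", 18, 120),
]
print(indexed_trend(assistive))
```

Because every cell reduces to the same unit (a share, indexed over time), the same two functions could be reused for the search, assistive, and agential columns.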

Keep running the micro instruments you already know from search-era practice: ranking position 1-10 on a specific keyword, CTR on a specific URL, and A/B test results on a specific page element.

Use them for tactics, but keep them out of the strategic dashboard because they aren’t comparable to anything in the assistive or agential columns. If you mix them, you’ll lose the strategic value.

The five rows match across the three modes: read across any row and see your brand’s relative position across all three engines in directly comparable units.

Compare your search-mode share against your assistive-mode share and your agential-mode share on the visibility row, the authority row, and the revenue row, and you have a continuous read on which mode is producing the best return at this moment and how that weighting is shifting quarter over quarter as a result of your work. That gives you a macro-level view of your strategic priorities across all three engines.

The five rows also hold for paid measurement. Paid and organic are converging on the same engine and the same macro methodology.

How measurement works across the funnel query pathway

The funnel query pathway isn’t one tree. It’s an orchard. Each cohort-with-intent intersection you cultivate is a tree, and the orchard grows as you plant more trees. Each tree has three parts.

  • The trunk is the conversion node: a representative branded BOFU query that stands for the buying moment for that cohort-with-intent intersection.
  • The branches are the MOFU research queries that the buyer asks when researching options.
  • The twigs are the TOFU awareness queries that the buyer asked before narrowing to specific options or brands.

The orchard grows from the ground of your brand and business operations, and the apples fall on that ground when the trees bear fruit. The ground makes the orchard productive over time, and the brand that lets its ground go fallow watches the trees die, no matter how well its branches are optimized.

You run measurement at every layer of every tree, but for different reasons, because the buyer’s intent shifts as you move up from trunk to branches to twigs, and the question you’re asking shifts with it.

Three funnel layers each have their own diagnostic:

  • Bottom of funnel (BOFU), where the buyer decides.
  • Middle of funnel (MOFU), where the buyer evaluates options.
  • Top of funnel (TOFU), where the buyer is still asking topical questions.
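
One way to picture the orchard as data is sketched below. The `Tree` class and its field names are hypothetical, not part of the methodology as published; the queries are the Uniqlo examples used in the layer-by-layer sections that follow.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """One cohort-with-intent intersection in the orchard (illustrative shape)."""
    cohort: str
    intent: str
    trunk: str                                          # one representative branded BOFU query
    branches: list[str] = field(default_factory=list)   # MOFU research queries
    twigs: list[str] = field(default_factory=list)      # TOFU awareness queries

    def queries_by_layer(self) -> dict[str, list[str]]:
        """Group the tree's queries by funnel layer, from trunk upward."""
        return {
            "BOFU": [self.trunk],
            "MOFU": list(self.branches),
            "TOFU": list(self.twigs),
        }


# The orchard is simply the list of trees you have planted so far.
orchard = [
    Tree(
        cohort="XL men",
        intent="buying a purple shirt",
        trunk="Men's purple shirt from Uniqlo",
        branches=["Best purple shirt for men"],
        twigs=["Can men wear purple shirts to work?"],
    )
]

for tree in orchard:
    for layer, queries in tree.queries_by_layer().items():
        print(layer, queries)
```

Keeping the trunk as a single string rather than a list mirrors the one-trunk-query-per-tree rule, while branches and twigs stay open-ended.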

Bottom of funnel, brand-only: The trunk as a brand-confirming campaign

The trunk of every tree is the buying-moment query with your brand name in it. “Men’s purple shirt from Uniqlo” is the trunk of the XL men buying a purple shirt tree on the FQP I built for Uniqlo. Whatever the equivalent looks like for your brand sits in the equivalent position on every tree in your orchard.

One representative trunk query per tree is what Kalicube tracks period over period. The FAQ page on the brand’s site can carry as many variants of the BOFU query as the brand wants (and should), but the methodology tracks one trunk query per tree as the structural read on whether the tree is producing fruit. That single query is the representative sample for the whole trunk.

We measure three KPIs:

  • Brand appearance: When the engine answers the conversion query, does it surface your brand? You expect 100% appearance because the query carries your brand name, and the engine has no reason to omit you unless something has broken upstream. Any miss at this position is an audit-grade signal, and in commercial language, it’s the doubt tax or invisibility tax hitting at the bottom of your own funnel.
  • Sentiment of the appearance: When the engine surfaces your brand, the framing carries tone: positive, neutral, or negative, with a fourth hedged state in brackets. Hedged framing tells you the engine has surfaced you but isn’t confident enough to commit, which is the cascading confidence loss Rand Fishkin first documented, made visible on the recommendation surface.
  • Accuracy against the brand-defined AI résumé: The engine’s synthesis either matches your defined narrative or drifts from it. The drift is the framing gap made measurable. Tracking it quarter over quarter tells you whether the work on the open web is moving the engine’s understanding toward your defined position or away from it. How I score the drift is simple in principle: take the brand-defined version, compare it to what the engine produces, and measure the gap. Practitioners who know me will recognize the move.
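
The article does not say how the gap is scored, so the following is only a stand-in: a word-level comparison using Python's standard `difflib.SequenceMatcher`, where `drift_score` and the "Acme" texts are hypothetical. A production version would more plausibly compare claims or embeddings than raw wording.

```python
import re
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def drift_score(brand_resume: str, engine_answer: str) -> float:
    """Assumed scale: 0.0 = wording matches the résumé, 1.0 = nothing in common."""
    matcher = SequenceMatcher(
        None, tokenize(brand_resume), tokenize(engine_answer), autojunk=False
    )
    return round(1.0 - matcher.ratio(), 3)


# Hypothetical brand-defined résumé and engine output.
resume = "Acme makes durable work shirts for tall men"
answer = "Acme makes affordable shirts for tall men and women"

print(drift_score(resume, resume))  # identical text, so no drift
print(drift_score(resume, answer))  # partial overlap, so some drift
```

Whatever scoring function is chosen, it has to stay fixed once tracking starts, for the same comparability reason the methodology freeze is planned.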

Bottom of funnel, competitor, runs as a separate campaign on the trunk

Most practitioners count brand-versus-competitor as middle of funnel because comparison feels like research.

I count it as bottom of funnel, but run it as a separate campaign with a separate bucket because the buying moment is happening. The buyer is naming both brands and asking the engine to decide. I separate these queries because the measurement affects the brand-only reads when they’re mixed.

Three measurements run here:

  • Recommendation bias: Which brand the engine specifically picks.
  • Sentiment bias: Tone toward your brand versus the competitor’s.
  • Accuracy against both 500-word brand-defined AI résumés: Your brand’s and the competitor’s, the latter written from their perspective as if you were them.

Middle of funnel: The branches

Move one level up the tree and you land on the branches. The cohort is still your ideal customer profile (ICP), the intent is still the buying motion, but the brand isn’t mentioned in the query yet because the buyer is still researching. “Best purple shirt for men” is a branch on Uniqlo’s XL men buying a purple shirt tree.

We measure three KPIs:

  • Brand appearance: When the engine answers a research query, which brands surface in the recommendations, if any? Track yours and each competitor’s. The brands the engine reaches for at this layer are the brands it considers candidate answers to the research question, and the brands that don’t surface are the brands the engine has decided aren’t leading candidates. That decision was made against the corroboration available to the engine on the open web before the buyer ever asked.
  • Sentiment bias, normalized against appearance volume: Sentiment per appearance is the meaningful unit, not sentiment total. A brand that surfaces twice with neutral sentiment isn’t necessarily losing to a brand that surfaces 10 times with mixed sentiment because the comparison isn’t about volume of mention. It’s about the quality of mention per surfacing event. Raw totals get distorted by frequency in ways that misread the signal, and the normalization is what makes period-over-period comparison hold.
  • Accuracy drift against both narratives: A research-stage synthesis that matches your defined narrative is set up to carry the buyer toward the conversion at the bottom of the funnel with the right framing. One that is unclear, inaccurate, or incomplete is one the engine will repeat across its answers.
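
The per-appearance normalization can be sketched as follows. The numeric scale in `SENTIMENT_VALUE` is an assumption made for illustration (the article names the tones but not a scoring scale), the hedged state is left out for simplicity, and the observations are invented.

```python
from collections import defaultdict

# Assumed scoring scale; the methodology names the tones, not the numbers.
SENTIMENT_VALUE = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def sentiment_per_appearance(observations: list[tuple[str, str]]) -> dict[str, dict[str, float]]:
    """Aggregate (brand, sentiment) observations into totals and per-appearance means."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for brand, sentiment in observations:
        totals[brand] += SENTIMENT_VALUE[sentiment]
        counts[brand] += 1
    return {
        brand: {
            "appearances": counts[brand],
            "total": totals[brand],
            "per_appearance": totals[brand] / counts[brand],
        }
        for brand in counts
    }


# Invented example: brand A surfaces twice (neutral), brand B ten times (mixed).
observations = (
    [("A", "neutral")] * 2
    + [("B", "positive")] * 3
    + [("B", "neutral")] * 2
    + [("B", "negative")] * 5
)
for brand, stats in sentiment_per_appearance(observations).items():
    print(brand, stats)
```

In this invented case brand B appears five times as often but carries the weaker sentiment per appearance, which is the distinction a raw total would hide.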

The ghost tax is the middle of funnel tax: the competitor is recommended because you are absent, because the engine is biased toward them, or because their framing was better than yours. These last two are vital: simply counting the appearances doesn’t give a good measurement of the probability that your ICP will reliably end up at your door.

Top of funnel: The twigs

At the top of every tree sit the twigs: topical questions the buyer asked before narrowing down to research the purchase or conversion. “Can men wear purple shirts to work?” is a good example of a twig.

The diagnostic question at the twigs differs from that at the trunk and branches because the buyer isn’t asking about brands or even choices. The engine is reasoning at the topical layer, drawing on whatever content has earned recruitment for the topical question, and brand surfacing is rare and therefore not the primary indicator of success (you’d be measuring nothing most of the time).

Three measurements run on each twig.

  • Topical answer adoption, scored through corpus similarity: The engine’s answer is compared against your content corpus and against each tracked competitor’s, with the brand whose corpus scores highest being the brand the engine has learned from. It’s the most novel measurement in the methodology and the one most likely to draw critical replies. TOFU attribution in AI search is solvable by reading the engine’s output back against the candidate topical coverage.
  • Brand appearance: Brands that surface at the topical layer sit in a stronger competitive position than those that don’t. A brand that consistently surfaces at the awareness layer for a category is a brand the engine treats as topically authoritative with a level of ownership for that category, and topical authority is what underwrites recruitment further down the tree.
  • Competitor creep on the twigs: Which brands are surfacing on the twigs when yours isn’t, and what does the pattern tell you about whose content the engine has identified as topically authoritative?
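
To make the corpus-similarity idea concrete, here is a deliberately simple sketch using bag-of-words cosine similarity from the standard library. The article does not say which similarity measure is used, so this is a swapped-in technique chosen for illustration; the function names and sample texts are hypothetical, and a real pipeline would likely use embeddings and stop-word handling.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Bag-of-words term frequencies (no stemming or stop-word removal)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_corpora(engine_answer: str, corpora: dict[str, str]) -> list[tuple[str, float]]:
    """Score each brand's corpus against the engine's answer, highest first."""
    answer = term_counts(engine_answer)
    scores = {brand: cosine_similarity(answer, term_counts(text)) for brand, text in corpora.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented texts for illustration.
engine_answer = "Purple shirts are fine for most offices when paired with neutral trousers"
corpora = {
    "brand_a": "Purple shirts work in most offices when paired with neutral trousers and a plain tie",
    "brand_b": "Our running shoes are built for marathon training",
}
for brand, score in rank_corpora(engine_answer, corpora):
    print(brand, round(score, 3))
```

A high score shows overlap in wording, not proof of where the engine learned the answer, so it is best read as a trend over quarters rather than as a single verdict.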

The top and middle of the funnel have grown, not shrunk

AI has made research faster, and faster research means people do more of it. TOFU and MOFU volumes have grown, even as the share of the mix has rearranged underneath.

The three-layer model is now “visibility, influence, transaction.” The AI engines are the biggest influencers in the world, the website is where the transaction closes, and brands measuring AI visibility as a replacement for website traffic are measuring the wrong substitution.

The substitution is in the influence layer, and the transaction layer is doing better than it looks once you understand what’s influencing the new traffic and where it’s coming from.

The analytics layer closes the loop to revenue

The FQP measurement tells you where the engines are recommending you. Analytics tells you whether those recommendations convert. Closing the loop is the operational work, and it’s where the methodology earns its keep at the board level.

You build the AI-traffic cohort from referral signals and user-agent strings: Gemini, ChatGPT, Perplexity, AI Mode, and Copilot.

UTM tagging won’t help for inbound traffic from the assistive engines themselves because they don’t pass UTM parameters. So tag every source you do control, shrink the “Direct” bucket as far as it will go, and then identify the residual AI traffic through referrer signals, user-agent strings, and behavior patterns once the session lands.

The cohort you build is a sample you extrapolate from, small today and growing.
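
A minimal sketch of the referrer side of that cohort building is shown below. The hostname map is an assumption to be checked against your own logs, not an authoritative list, and AI Mode is omitted because it cannot be separated from ordinary Google search by hostname alone.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed referrer hostnames; verify against your own analytics before relying on them.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
}


def classify_referrer(referrer: Optional[str]) -> str:
    """Label a session as an AI engine, 'direct' (no referrer), or 'other'."""
    if not referrer:
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    for known_host, engine in AI_REFERRER_HOSTS.items():
        # Match the host itself or any subdomain of it.
        if host == known_host or host.endswith("." + known_host):
            return engine
    return "other"


sessions = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=purple+shirt",
    "https://www.google.com/",
    None,
]
for ref in sessions:
    print(ref, "->", classify_referrer(ref))
```

Sessions that land in "direct" are where the user-agent and behavior-pattern signals would have to do the remaining work.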

From my Google Marketing Live 2026 SEA Labs keynote: Similarweb data on AI-referred session quality and conversion rates.

Take the cohort’s conversion rate, average order value, time on site, and repeat purchase behavior. Apply it to the total recruitment volume the FQP measurement says you should be earning. That’s your revenue read.

AI-influenced visitors arrive with a perspective already formed (they had the brand summarized for them before they clicked), and they should convert higher than organic. Track the AI-influenced cohort separately from the search cohort it’s mostly replacing.

At the analytics layer, you bring profit margin back into the picture. The engine doesn’t know your margin, so it optimizes for user satisfaction.

You know your margin, so you weight your “orchard” investment toward the trees (the cohort x intent intersections) where conversion volume x margin justifies the cultivation. That’s the organic equivalent of the cohort x intent x conversion rate x margin math that ads have run for 15 years.
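
The arithmetic behind the revenue read and the margin weighting can be sketched like this. The `TreeEconomics` class, its field names, and every number are hypothetical placeholders rather than figures from the article.

```python
from dataclasses import dataclass


@dataclass
class TreeEconomics:
    """Inputs for one cohort x intent tree (all values illustrative)."""
    name: str
    expected_visits: float      # recruitment volume the FQP read says the tree should earn
    conversion_rate: float      # taken from the AI-traffic cohort
    average_order_value: float  # taken from the AI-traffic cohort
    margin: float               # profit margin as a fraction of revenue

    @property
    def expected_revenue(self) -> float:
        """Visits x conversion rate x average order value."""
        return self.expected_visits * self.conversion_rate * self.average_order_value

    @property
    def expected_profit(self) -> float:
        """Expected revenue weighted by margin."""
        return self.expected_revenue * self.margin


trees = [
    TreeEconomics("tree A", 1000, 0.04, 50.0, 0.30),
    TreeEconomics("tree B", 400, 0.05, 200.0, 0.25),
]

# Rank trees by expected profit to decide where cultivation is justified.
ranked = sorted(trees, key=lambda t: t.expected_profit, reverse=True)
for tree in ranked:
    print(tree.name, round(tree.expected_revenue, 2), round(tree.expected_profit, 2))
```

In this made-up case the lower-traffic tree ranks first once order value and margin are applied, which is the reason for bringing margin back in at the analytics layer.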

Always remember that AI engine traffic will often be more engaged, spend longer on your site, and convert better than search traffic. If it isn’t, that’s a “you” problem, not an engine problem.

Agential commerce is a measurement gain

Agents might look like the worst measurement environment yet: the user delegates, the agent decides, and the brand sees only the conversion. Everything between the question and the purchase is invisible.

The instinct is to grieve the human signals we’re losing: mouse movements, scroll depth, hesitation patterns, micro-pauses on the comparison page, and the back-and-forth between tabs that used to tell us so much about consideration. Those signals are gone in agent mode. What replaces them is a measurement surface humans never gave us in the first place.

Every interaction the agent has with your infrastructure is a programmatic event. It queries your product catalog, retrieves details, comes back for clarification, initiates a price negotiation, submits a mandate, and confirms the purchase.

That’s a conversion funnel you can track step by step, including the back-and-forth negotiation. As a programmatic user, the agent fires events through your MCP server, UCP endpoint, decoupled checkout, and mandate handling.

Every protocol layer you build for agential commerce is also a measurement layer, and the brands that build the infrastructure to transact with agents get the bonus of measuring the agent’s full reasoning chain in a way no one has ever been able to measure human reasoning.
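
As a sketch of what that step-by-step funnel could look like once the events are logged, the snippet below counts distinct agent sessions per step. The step names and the event shape (`session_id`, `step`) are hypothetical; real MCP or checkout telemetry will use whatever schema your own infrastructure emits.

```python
# Hypothetical step names, in the order the agent interaction is described above.
FUNNEL_STEPS = [
    "catalog_query",
    "detail_retrieval",
    "clarification",
    "price_negotiation",
    "mandate_submitted",
    "purchase_confirmed",
]


def funnel_counts(events: list[dict]) -> dict[str, int]:
    """Count distinct agent sessions that reached each funnel step."""
    sessions_per_step: dict[str, set] = {step: set() for step in FUNNEL_STEPS}
    for event in events:
        step = event.get("step")
        if step in sessions_per_step:
            sessions_per_step[step].add(event["session_id"])
    return {step: len(ids) for step, ids in sessions_per_step.items()}


# Invented log: session s1 completes the purchase, s2 drops out after negotiating.
events = [
    {"session_id": "s1", "step": "catalog_query"},
    {"session_id": "s1", "step": "detail_retrieval"},
    {"session_id": "s1", "step": "price_negotiation"},
    {"session_id": "s1", "step": "price_negotiation"},  # back-and-forth counts once per session
    {"session_id": "s1", "step": "mandate_submitted"},
    {"session_id": "s1", "step": "purchase_confirmed"},
    {"session_id": "s2", "step": "catalog_query"},
    {"session_id": "s2", "step": "price_negotiation"},
]
for step, count in funnel_counts(events).items():
    print(step, count)
```

Counting sessions rather than raw events keeps repeated negotiation rounds from inflating a step, while the raw events remain available for tactical analysis.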

For me, this is the most important measurement framework for the industry in the next phase. Search, assistive, and agential each land on the won gate, with three click types resolving the journey.

  • Search produces the imperfect click (the user picks from a list).
  • Assistive produces the perfect click (the AI gives one answer and the user confirms).
  • Agential produces the agentic click (the agent acts without the user seeing the candidates).

Brand-user-algorithm (BUA) opacity

Each of the three modes provides its own measurement points, and the points aren’t equal.

  • Search is observable at the micro scale across the full journey.
  • Assistive is largely opaque at the micro scale and only surfaces sparse tactical signals: citation tracking, referrer patterns, user-agent strings, and behavioral cohort identification post-event.
  • Agential is observable at the programmatic scale, but only if the brand has built the protocol layer (MCP, UCP, decoupled checkout, and mandate handling) to capture the events.

The discipline is the same across all three modes. Harvest every tactical measurement point you can from every surface. Use those signals for tactical decisions because that’s what tactical micro signals are for.

Resist the temptation to make strategic decisions from any single mode’s tactical instruments because the picture each produces is fragmented, partial, and structurally incomplete.

Strategic decisions stay bolted to the macro read on the funnel query pathway, aggregating across all three modes at the FQP level. The tactical instruments serve the strategy. They don’t replace it.

Macro measurement works on a slower timeline

For decades, we measured search the way the corner shop measures inventory: count what’s on the shelf this week, count it again next week, compare, and act. The instruments delivered the precision the environment supported, and you and your boardroom got trained through years of weekly dashboards to expect that exact kind of answer: a number this week against the same number last week, tracking work you could point at.

You’re not in the corner shop anymore. You’re operating inside an economy in its own right: seven assistive engines, the agents behind them, the apps each engine ships inside, the operating systems that surface them, the hardware in every pocket and on every face, every personalized context inside every walled garden, and the open web shifting under all of it, all running at once, all reshaping who gets recommended at the moment of decision.

Asking me for a precise monthly read on whether your brand is winning in that environment is asking the Bank of England for a precise monthly read on the loaf of bread you bought yesterday.

The Bank reports inflation at 3% each month, on schedule, and the number is real, comparable across months, and defensible across years. But you can’t take 3% and apply it to your loaf because your loaf might have gone up 8%, and the loaf in the next shop might have gone up 1%. The 3% is the aggregate read on the system, not a measurement of any single transaction within it.

That’s the discipline you’re moving to. I can give you a quarterly read on whether your brand is being recommended across the economy of engines, and the read will be comparable to last quarter and the quarter before, and projected against next quarter and the one after. The trend over time is what your strategy rests on.

What I can’t give you is a clean number for whether you won the Perplexity recommendation against your top competitor last Tuesday. That’s the loaf. The macro discipline gives you the inflation read. The loaf-level question doesn’t have a defensible answer in this environment, and the methodologies that pretend it does are selling you a false-precision number dressed up as the real thing.

Strategic clarity comes from quarterly trend data

This is the move you have to make, and the move you have to walk your boardroom through alongside you. You’re not measuring fewer things than you used to. You’re measuring something far bigger, and the instruments that fit the broader environment work on a slower timeline.

If you run the methodology month by month, the drift will swamp the signal. You’ll read noise and act on noise, and you’ll do that every month. If you run it quarter by quarter, you get one delta against one baseline, which still isn’t a trend. It’s two points and a line.

By the fourth quarter, you have three deltas, the noise comes down, and the trend reads through. By the eighth, the methodology has compounded into a read that your strategic decisions can actually rest on, with a real pathway of comparison going backward across two full years.
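
To see why four quarters yield three deltas and a readable trend, here is a small sketch. The share values are invented, and the least-squares slope is one assumed way of summarizing the trend rather than a method the article prescribes.

```python
def quarterly_deltas(shares: list[float]) -> list[float]:
    """Quarter-over-quarter change; n quarters give n - 1 deltas."""
    return [round(later - earlier, 4) for earlier, later in zip(shares, shares[1:])]


def trend_slope(shares: list[float]) -> float:
    """Ordinary least-squares slope of share against quarter index."""
    n = len(shares)
    if n < 2:
        raise ValueError("need at least two quarters to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(shares) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(shares))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Invented recommendation shares for four consecutive quarters.
shares = [0.10, 0.12, 0.11, 0.14]
print(quarterly_deltas(shares))        # three deltas from four quarters
print(round(trend_slope(shares), 4))   # average change in share per quarter
```

With only two quarters the slope is just the single delta, which is the "two points and a line" case; each added quarter lets the fit average out more of the noise.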

Quarter eight is also where most measurement programs die, because boardroom impatience peaks at exactly the point when the methodology produces its first defensible answer. Hold the line, and you compound the maturity.

Cave at month six, demand the weekly dashboard back, and you’ll spend the next several years chasing precision the environment can’t deliver, while competitors who held the line walk past you with the strategic clarity you used to have and gave up.

Make the case to your boardroom plainly:

  • We’re operating inside an economy, and your brand’s standing within it determines whether AI puts you in front of the right buyer at the right moment.
  • The measurement discipline that fits this environment is the macro discipline economists developed for exactly the same kind of problem 100 years ago.

Move to macro measurement, accept the timescale, and the methodology compounds into the strategic clarity the micro instruments stopped delivering the moment your buyer’s journey moved off your measurable surfaces and onto the engine’s.

The macro environment won’t give you a single, clean dashboard number. What it gives you, if you run the program with patience, is a quarter-by-quarter, mode-by-mode, engine-by-engine read of whether AI is recommending your brand at every stage of every buying journey the orchard is built around.

That’s the answer you can build a strategy around to gain a long-term competitive advantage.


This is the fifteenth piece in my AI authority series.

Contributing authors are invited to create content for Search Engine Land and are chosen for their expertise and contribution to the search community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. Search Engine Land is owned by Semrush. Contributor was not asked to make any direct or indirect mentions of Semrush. The opinions they express are their own.

