ChatGPT heavily favors the top of content when selecting citations, according to an analysis of 1.2 million AI answers and 18,012 verified citations by Kevin Indig, Growth Advisor.
Why we care. Traditional search rewarded depth and delayed payoff. AI favors immediate classification: clear entities and direct answers up front. If your substance isn’t surfaced early, it’s less likely to appear in AI answers.
By the numbers. Indig’s team found a consistent “ski ramp” citation pattern that held across randomized validation batches. He called the results statistically indisputable:
- 44.2% of citations come from the first 30% of content.
- 31.1% come from the middle (30–70%).
- 24.7% come from the final third, with a sharp drop near the footer.
At the paragraph level, AI reads more deeply:
- 53% of citations come from the middle of paragraphs.
- 24.5% come from first sentences.
- 22.5% come from final sentences.
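Both breakdowns come down to bucketing each cited sentence by its relative position. A minimal sketch of that bucketing, assuming page position is measured as a character offset (the report may define it differently); the function names are hypothetical and this is not Indig's code:

```python
def page_bucket(offset: int, page_length: int) -> str:
    """Classify a cited sentence by where it starts on the page.

    The 30% and 70% cutoffs mirror the boundaries quoted above. Measuring
    position by character offset is an assumption made for this sketch.
    """
    if page_length <= 0:
        raise ValueError("page_length must be positive")
    relative = offset / page_length
    if relative < 0.30:
        return "top"
    if relative < 0.70:
        return "middle"
    return "end"


def paragraph_bucket(sentence_index: int, sentence_count: int) -> str:
    """Classify a sentence as the first, last, or a middle sentence of its paragraph."""
    if sentence_count <= 0 or not 0 <= sentence_index < sentence_count:
        raise ValueError("sentence_index out of range")
    if sentence_index == 0:
        return "first"
    if sentence_index == sentence_count - 1:
        return "last"
    return "middle"
```

Counting how many citations land in each bucket yields shares in the same form as the percentages above.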
The big takeaway. Front-load key insights at the article level. Within paragraphs, prioritize clarity and information density over forced first sentences.
Why this happens. Large language models are trained on journalism and academic writing that follow a “bottom line up front” structure. The model appears to weight early framing more heavily, then interpret the rest through that lens.
- Modern models can process massive token windows, but they prioritize efficiency and establish context quickly.
What gets cited. Indig identified five traits of highly cited content:
- Definitive language: Cited passages were nearly twice as likely to use clear definitions (“X is,” “X refers to”). Direct subject-verb-object statements outperform vague framing.
- Conversational Q&A structure: Cited content was 2x more likely to include a question mark. 78.4% of citations tied to questions came from headings. AI often treats H2s as prompts and the following paragraph as the answer.
- Entity richness: Typical English text contains 5% to 8% proper nouns. Heavily cited text averaged 20.6%. Specific brands, tools, and people anchor answers and reduce ambiguity.
- Balanced sentiment: Cited text clustered around a subjectivity score of 0.47, neither dry fact nor emotional opinion. The preferred tone resembles analyst commentary: fact plus interpretation.
- Business-grade readability: Winning content averaged a Flesch-Kincaid grade level of 16 versus 19.1 for lower-performing content. Shorter sentences and plain structure beat dense academic prose.
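Some of these traits can be approximated with simple text checks. The sketch below is illustrative only: the definition pattern and the vowel-group syllable counter are rough assumptions, and the helper names are hypothetical. The grade formula is the standard Flesch-Kincaid one (0.39 × words per sentence + 11.8 × syllables per word − 15.59).

```python
import re

# Crude stand-in for "X is" / "X refers to" phrasing (an assumption, not the report's rule).
DEFINITION_PATTERN = re.compile(
    r"\b(?:is|are)\s+(?:a|an|the)\b|\brefers\s+to\b", re.IGNORECASE
)


def has_definition(text: str) -> bool:
    """True if the passage contains definition-style phrasing."""
    return bool(DEFINITION_PATTERN.search(text))


def has_question(text: str) -> bool:
    """True if the passage contains a question mark."""
    return "?" in text


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels; real counters are more careful."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level using the approximate syllable counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59
```

Because the syllable counter is approximate, scores from this sketch are better used to compare two drafts than to match the grade levels reported above.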
About the data. Indig analyzed 3 million ChatGPT responses and 30 million citations, isolating 18,012 verified citations to examine where and why AI pulls content. His team used sentence-transformer embeddings to match responses to specific source sentences, then measured their page position and linguistic traits such as definitions, entity density, and sentiment.
Bottom line. Narrative “ultimate guide” writing may underperform in AI retrieval. Structured, briefing-style content performs better.
- Indig argues this creates a “clarity tax.” Writers must surface definitions, entities, and conclusions early, not save them for the end.
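The matching step can be pictured as a nearest-neighbor search over sentence vectors. The sketch below swaps in a toy bag-of-words vector for the sentence-transformer embeddings so it runs with the standard library alone; the function names and the sentence-index measure of position are assumptions, not the team's pipeline.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector, standing in for a sentence-transformer embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_match(response_sentence: str, source_sentences: list[str]):
    """Return (index, relative_position, similarity) of the closest source sentence.

    relative_position is the sentence index divided by the sentence count,
    a simplification of "page position" chosen for this sketch.
    """
    if not source_sentences:
        raise ValueError("source_sentences must not be empty")
    query = embed(response_sentence)
    scores = [cosine(query, embed(s)) for s in source_sentences]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, index / len(source_sentences), scores[index]
```

With real embeddings, only `embed` would change; the similarity search and the position lookup stay the same.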
The report. The science of how AI pays attention
Search Engine Land is owned by Semrush. We remain committed to providing high-quality coverage of marketing topics. Unless otherwise noted, this page’s content was written by either an employee or a paid contractor of Semrush Inc.

