How AI Search Actually Picks Your Content — And What To Write Differently
Takeaway
AI search systems generate answers first, then score your content against them using embedding distance. This framework maps the retrieval mechanics, passage-level extraction rules, and platform-specific pipelines that determine which content gets cited.
Google's AI systems generate an answer first, then score your content against it using embedding distance. This is not how most people think AI search works, and the gap between perception and reality is where almost every optimization mistake originates.
The mechanism is confirmed by patent US11769017B1 ("Generative Summaries for Search Results")[1], validated by iPullRank's deep analysis of six related patents[2], and stress-tested against 1.2 million ChatGPT responses[3], 863,000 SERPs[4], and multiple independent reproducibility studies. The implications are specific, actionable, and — in several cases — the direct opposite of what the SEO industry currently sells.
This is the complete technical framework for writing content that AI search systems actually retrieve, evaluate, and cite. It covers how the retrieval pipeline works at each stage, what differs across the five major platforms, which copywriting practices have rigorous evidence behind them, which are speculative, and which are outright snake oil. If you produce content for the web in 2026, this is the mechanical reality your writing now operates inside.
The generate-first, verify-second architecture
The core patent underlying Google AI Overviews describes a sequence that upends the standard assumption about how AI search uses your content. The system receives a query, selects search result documents responsive to that query and related queries, extracts content snippets, and processes them through an LLM. The LLM drafts an answer first. Each sentence of that drafted answer is then converted into a vector embedding and compared against vector embeddings of source passages. Passages whose embeddings fall within a threshold distance get cited.[1]
This is not traditional Retrieval-Augmented Generation where retrieved documents fill a context window before the LLM writes its response. It is a generate-verify loop where the LLM's own "expected answer" serves as the scoring reference.[2] The distinction matters enormously for how you structure content: your writing needs to match the shape of what the AI expects to say, not just cover the right topic.
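The patent's scoring step can be sketched in miniature. This toy substitutes a bag-of-words vector and cosine similarity for the learned dense embeddings the real system uses; `embed`, `attribute_citations`, and the 0.5 threshold are illustrative inventions, not details from the patent.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the production system uses learned
    # dense vectors, but the distance comparison has the same shape.
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def attribute_citations(draft_sentences, passages, threshold=0.5):
    # Generate-first, verify-second: each drafted sentence is compared
    # against source passages, and close-enough passages get cited.
    citations = {}
    for sentence in draft_sentences:
        vec = embed(sentence)
        citations[sentence] = [
            p for p in passages
            if cosine_similarity(vec, embed(p)) >= threshold
        ]
    return citations

draft = ["python is a programming language created by guido van rossum"]
passages = [
    "Python is a programming language created by Guido van Rossum in 1991.",
    "The weather in Amsterdam is often rainy.",
]
result = attribute_citations(draft, passages)
```

The practical upshot: a passage phrased close to the model's expected answer clears the threshold, while a topically adjacent but differently shaped passage does not.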
One counterintuitive detail buried in the patent architecture: higher-confidence responses have a lower chance of displaying citations. When the system is highly confident in its generated answer, it may omit source attribution entirely. The most authoritative, well-established facts are paradoxically the least likely to generate visible citation links — which means the optimization opportunity is greatest for long-tail topics, niche expertise, and emerging subject matter where the AI lacks strong internal priors.[1]
Testing by Previsible SEO against 143 AI Mode and AI Overview responses found a 97.9% appearance rate for pages matching expected-answer structural criteria.[5] Their published analysis of 5,000 AI responses across four intent types found that 82% of cited pages used explicit entity naming, 64% included feature or capability lists, and 71% kept paragraphs under four lines. These structural patterns are consistent with what embedding-distance scoring would reward.
The two-layer model that resolves the causation debate
A sharp disagreement among top SEO practitioners (Rand Fishkin, David McSweeney, Mike Sonders) centers on whether AI citations cause brand recommendations or merely correlate with them.[6][7] The technically precise answer requires distinguishing between two layers of the system.
Fishkin's argument draws on post-hoc rationalization: the documented reality that LLMs cannot introspect on their own reasoning and often hallucinate explanations for their outputs.[7] Academic research confirms this is real — a March 2025 arXiv paper found GPT-4o-mini exhibits a 13% post-hoc rationalization rate in chain-of-thought reasoning. But applying this valid insight about LLM self-explanation to the entire retrieval pipeline is a mistake.
In standard RAG — and all major AI search platforms now use RAG for most queries — retrieved content enters the context window before the LLM generates its answer. The content literally sits in the prompt. This is a genuinely causal pathway. Perplexity's architecture is the clearest example: independent analysis confirms that its citations are structurally embedded in the generation prompt and response logic, not retrofitted post-hoc.[8]
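A minimal sketch makes the causal pathway concrete: in standard RAG, retrieved passages are concatenated into the prompt string before any generation happens. The template below is a generic illustration, not any platform's actual prompt format.

```python
def build_rag_prompt(query: str, retrieved_passages: list[str]) -> str:
    # Retrieved content is placed in the prompt *before* the model
    # generates, so it causally shapes the answer -- unlike post-hoc
    # self-explanations, which the model invents after the fact.
    context = "\n\n".join(
        f"[Source {i + 1}] {p}" for i, p in enumerate(retrieved_passages)
    )
    return (
        "Answer the question using only the sources below, "
        "citing them as [Source N].\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is query fan-out?",
    ["Query fan-out expands one prompt into many synthetic subqueries."],
)
```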
Layer 1 is retrieval candidacy. If your page is not retrieved during the fan-out process, it cannot be cited. Content structure, passage clarity, entity naming, freshness signals, and indexability directly determine whether your content enters the candidate pool. This layer is unambiguously causal. You can influence it through page-level decisions.[5][6]
Layer 2 is brand selection within the answer. Once content is retrieved, the model generates its answer by combining RAG-retrieved passages with parametric knowledge from training data. For well-known brands, training data dominates — the model has strong priors about QuickBooks being an accounting tool regardless of what it retrieves in real time. For lesser-known brands, RAG-retrieved content has outsized influence because the model lacks strong parametric priors.[8]
Sonders' study of 1,200 ChatGPT interactions found that only about 11% of all mentioned brands achieve dominant status, appearing in 80% or more of responses. These are overwhelmingly well-known entities with deep training-data presence. The remaining 89% of brands cycle through the long tail probabilistically — and this is precisely where Layer 1 retrieval optimization creates the most leverage.[9] In niche categories, 21% of brands reach dominant status versus just 7% in competitive categories, confirming that where training data is thinner, retrieved content carries more weight.
For practitioners operating in niche, low-competition spaces — solopreneur content, emerging B2B categories, specialist consulting — Layer 1 optimization matters disproportionately more than it does for Fortune 500 brands. The models have weaker parametric priors about your topics, which means page-level writing decisions carry outsized influence.
Query fan-out means single-keyword optimization is obsolete
When a user submits a query, AI systems do not execute a single search. Through a mechanism called query fan-out, they expand the initial prompt into a constellation of synthetic subqueries sent in parallel to retrieve different passage sets.[10] The volume depends on the platform: Google AI Overviews generates 8 to 12 synthetic subqueries per prompt, Google AI Mode generates potentially hundreds[2], and ChatGPT scales from roughly 1 query for instant responses to 12–20 for complex prompts.
Because retrieval is driven by these diverse synthetic subqueries, traditional organic rankings for a primary keyword are increasingly disconnected from AI search visibility. Data shows that only 17–32% of Google AI Overview citations come from pages ranking in the organic top 10 for the user's original query.[4] The remaining 68–83% of citations are pulled from pages that rank well for the system's synthetic fan-out subqueries.
AI systems also inject freshness filters into these subqueries. Research shows that 28.1% of generated subqueries automatically include the current year, systematically filtering out older content regardless of its quality.[11]
The strategic consequence: a piece of content targeting a single primary keyword is likely to be retrieved by only one subquery during the fan-out process. Content that covers multiple facets of a topic gets retrieved by several different subqueries, increasing its weight in the candidate set and making it far more influential in shaping the AI's final answer. A page targeting "best project management tools for solopreneurs" should therefore also explicitly contain passages about solo workflows, individual pricing, integration paths, migration from team tools, and comparisons against enterprise alternatives, because the AI will independently query those subtopics during its fan-out phase.[10]
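To make the multi-facet point concrete, here is a toy retriever in which simple token overlap stands in for real semantic retrieval; the subqueries, passages, and overlap threshold are all invented for illustration.

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def fanout_coverage(subqueries, page_passages, min_overlap=3):
    # Count how many synthetic subqueries retrieve at least one passage
    # from the page; pages hit by more subqueries enter the candidate
    # set more often and carry more weight in synthesis.
    return sum(
        1 for q in subqueries
        if any(len(tokens(q) & tokens(p)) >= min_overlap
               for p in page_passages)
    )

subqueries = [
    "project management tools solo workflows",
    "project management individual pricing",
    "migrating from team project tools",
]
narrow_page = ["Best project management tools ranked."]
faceted_page = [
    "Solo workflows in project management tools.",
    "Individual pricing plans for project tools.",
    "Migrating from team project tools to solo setups.",
]
```

Under this crude model the faceted page is retrieved by all three subqueries while the single-keyword page is retrieved by only one.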
Being retrieved by multiple fan-out subqueries also increases your chances of becoming what practitioners call a "retrieval citation" — a source that heavily informs the AI's facts and vocabulary without necessarily receiving a visible link.
Five platforms, five different citation pipelines
Treating "AI search" as monolithic is a structural mistake. Each platform uses a different retrieval backend, applies different reranking logic, and rewards different content characteristics.
Google AI Overviews operates a five-stage pipeline: query understanding through parallel semantic parsing, fan-out of 8–12 synthetic subqueries per the Thematic Search patent, multi-source hybrid retrieval from Google's full index plus Knowledge Graph plus YouTube transcripts, pairwise LLM reranking of candidate passages, and synthesis with embedding-distance citation attribution.[12] Despite the complexity of the AI layer, traditional organic ranking still matters here: 85–94% of AI Overview citations are pulled from pages ranking in the top 20 organic results. But an Ahrefs study found that only 38% of AI Overview citations come from top-10 organic results — down from 76% just eight months earlier.[4] The erosion is accelerating.
Google AI Mode shares foundational infrastructure with AI Overviews but operates with dramatically expanded fan-out (hundreds of subqueries), persistent conversational state, deep user-embedding personalization drawing from Chrome history and Gmail behavior, and multi-LLM orchestration.[2] An independent Moz study of 40,000 queries found that 88% of AI Mode citations do not appear in the organic SERP for the same query — the widest gap of any platform. SE Ranking found only 9.2% URL consistency across three runs of the same query on the same day, with 21.2% of queries producing zero URL overlap between runs.
ChatGPT with web search does not maintain its own index. It sends reformulated queries to Bing's API and fetches full page content at runtime. Only 26.7% of ChatGPT citations overlap with Bing's top organic results, indicating significant LLM-driven reranking.[9] ChatGPT's bot does not execute JavaScript, so pages need pre-rendered HTML. Page speed is a massive factor: pages with First Contentful Paint under 0.4 seconds average 6.7 citations versus 2.1 for slower pages. ChatGPT also exhibits the strongest recency bias of any platform — URLs cited by ChatGPT average 393 days newer than traditional Google organic results, and 76.4% of top-cited pages have been updated within the last 30 days.[13]
Perplexity is natively search-first — every query triggers real-time retrieval from both Google and Bing indexes plus its own crawlers. Citations are generated during synthesis rather than post-hoc, with 3–8 sources per response and a transparency-first design. Independent benchmarking scored Perplexity highest in relevance at 4.36 out of 5.[14] Its transparent citation pipeline makes it the best platform for testing and reverse-engineering citation behavior.
Claude uses Brave Search as its retrieval backend, producing the simplest optimization surface of any platform. Statistical analysis shows 86.7% overlap between Claude's cited results and Brave's top non-sponsored organic results — meaning Claude largely passes through Brave's rankings with minimal reranking. Claude also showed unique behavior in a data void experiment: it was the only major chatbot that refused to cite a fabricated satirical article, apparently employing coherence checking that caught the single-source nature of the claim.[15]
Only 11% of domains are cited by both ChatGPT and Perplexity.[14] No single page optimization approach serves all five platforms equally — but passage-level clarity, entity specificity, and answer-first structure improve citation probability across all of them.
Three citation layers most practitioners conflate
AI search operates through three distinct citation layers, and most measurement tools can only see one of them.[16]
Interface citations are the visible links displayed to the user alongside the AI's response. These are what most AI visibility tools track. They represent the smallest and least influential layer.
Generation citations are brands explicitly named in the AI's generated text, even without an accompanying link. BrightEdge data shows ChatGPT mentions brands 3.2 times more often than it cites them with visible links. Generation citations are partially observable by parsing response text for brand mentions.
Retrieval citations are sources pulled into the candidate set during fan-out and reranking that influence answer construction even if never shown to the user. The full retrieval set is never exposed to developers or users. A source can shape an AI's factual content, vocabulary, and framing without receiving any visible attribution. No external tool can observe this layer, which means the most influential citation mechanism is structurally invisible to measurement.[2]
This three-layer distinction explains why SparkToro found less than a 1-in-100 chance of identical brand lists across 100 runs of the same prompt.[17] The retrieval set shifts probabilistically with every query, producing different synthesis outputs each time.
For copywriters, each layer demands different optimization. Interface citation optimization requires verifiable facts, self-contained paragraphs, and structured data that survive extraction. Generation citation optimization requires brand prominence across the training corpus — web-wide mentions, Wikipedia presence, review platforms, community discussions. Retrieval citation optimization requires content that satisfies multiple intent dimensions from a single query, because your passage needs to be pulled by multiple fan-out subqueries to enter the candidate set at high enough frequency to influence synthesis.
Data voids hand first movers an outsized advantage
When AI retrieval finds few or no authoritative documents on a topic, the system becomes less selective about what it cites. There is no competing content to serve as a corroboration signal. The single available source gets elevated with the same confidence framing as multi-source, well-corroborated claims.
This vulnerability was vividly demonstrated in February 2026 when a journalist published a fabricated article on a low-traffic personal site ranking "the best tech journalists at eating hot dogs." Within 24 hours, ChatGPT, Google Gemini, Google AI Overviews, and Grok were all citing it as established fact. Claude was the sole holdout.[15]
The strategically significant finding was what happened next. When another journalist attempted to replicate the experiment two days later, she failed completely. The original author had already captured the first-mover position — his article, combined with news coverage discussing it, created a small corpus that the AI treated as authoritative. The AI retroactively applied skepticism to the late arrival.
Content freshness amplifies the data void effect. Ahrefs' analysis of approximately 17 million citations found AI-cited content is 25.7% fresher than traditional Google organic results.[13] When freshness bias meets data void conditions, a single new piece of content can achieve near-total citation dominance.
For practitioners targeting underserved topic spaces — niche B2B audiences, solopreneur-focused content, emerging technology categories — this creates a compounding first-mover advantage. SE Ranking found that question-based titles carry 7 times more impact on AI citations for smaller domains compared to large enterprise sites. LLMs cite only 2–7 domains per response, meaning well-structured content from smaller publishers can compete directly in low-competition spaces.
The advantage is real but degradable. Citation turnover runs 40–60% monthly, so first-mover positions must be defended with regular updates, expanding coverage, and building corroborating content across multiple domains that reinforce the original source.
The measurement crisis: why "AI ranking position" is not a metric
Traditional SEO relies on tracking stable ranking positions. AI search makes this structurally impossible.
SparkToro's study of 2,961 prompt runs across 600 volunteers found a less than 1-in-100 chance of generating identical brand lists on the exact same prompt, and roughly a 1-in-1,000 chance of identical ordering.[17] SE Ranking corroborated this with only 9.2% URL consistency across three runs of the same AI Mode query on the same day.
The instability has three causes. First, the retrieval set shifts probabilistically with every query — even identical queries pull slightly different candidate passages. Second, Google AI Mode applies deep personalization based on user history, meaning two users asking the same question see different citations. Third, the most influential citation layer (retrieval citations) is invisible to all external measurement tools.
Any tool or service claiming to guarantee or track specific AI citation positions is selling snake oil. The underlying architecture makes this structurally impossible. The only statistically valid metric is visibility rate: how frequently your brand appears across many runs of the same prompt.[17] Practitioners should run target prompts multiple times per tracking cycle and categorize brands as long-tail (under 20% appearance), mid-tier (20–80%), or dominant (80%+ appearance) rather than tracking specific positions.
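A visibility-rate tracker along these lines is straightforward to script. The tier cutoffs mirror the bands above; the brand names and run counts are illustrative.

```python
from collections import Counter

def visibility_tiers(runs: list[list[str]]) -> dict[str, str]:
    # Appearance rate across repeated runs of the same prompt is the
    # only stable signal; per-run "positions" are probabilistic noise.
    counts = Counter(brand for run in runs for brand in set(run))
    total = len(runs)
    tiers = {}
    for brand, seen in counts.items():
        rate = seen / total
        if rate >= 0.8:
            tiers[brand] = "dominant"
        elif rate >= 0.2:
            tiers[brand] = "mid-tier"
        else:
            tiers[brand] = "long-tail"
    return tiers

runs = [
    ["Asana", "Trello"],
    ["Asana", "Notion"],
    ["Asana", "Trello"],
    ["Asana", "ClickUp"],
    ["Asana", "Trello"],
]
tiers = visibility_tiers(runs)
```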
Evidence-backed writing practices: what actually works
The patent evidence, experimental data, and platform mechanics converge on specific writing practices organized here by the strength of evidence behind them.
Tier 1: Strongly supported by multiple independent sources
Front-load your most citable content in the first 20–30% of the page. Kevin Indig's analysis of 1.2 million ChatGPT responses found a "ski ramp" citation distribution (reported p-value of 0.0): 44.2% of all citations come from the first 30% of a page's content.[3] A CXL study of Google AI Overview citations found an even steeper pattern, with 55% from the top 30%.[18] Narrative "ultimate guide" structures that slowly build to a conclusion underperform in every AI extraction study. Place your clearest definitions, most specific data points, and strongest entity-rich statements early.
Write 40–60 word answer capsules immediately after question-style H2 headings. Indig found that 72.4% of cited pages included a concise answer in this range near section tops.[3] The strongest-performing configuration — capturing 34.3% of all citations — combined an answer capsule with original proprietary data. Use definitive "X is Y" constructions. Do not place links inside answer capsules: an outgoing link signals to the AI that the actual authoritative answer exists elsewhere.
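If you want to lint drafts for this pattern, a rough checker might look like the following; the heuristics (question-mark heading test, bare-URL scan) are my own simplifications of the findings above, not anything from the cited study.

```python
def check_answer_capsule(heading: str, first_paragraph: str) -> list[str]:
    # Lint one section against the answer-capsule pattern:
    # question-style heading, 40-60 word direct answer, no links.
    words = first_paragraph.split()
    issues = []
    if not heading.rstrip().endswith("?"):
        issues.append("heading is not question-style")
    if not 40 <= len(words) <= 60:
        issues.append(f"capsule is {len(words)} words, not 40-60")
    if "http://" in first_paragraph or "https://" in first_paragraph:
        issues.append("capsule contains an outgoing link")
    return issues

issues = check_answer_capsule(
    "What is query fan-out?",
    "Query fan-out is the expansion of one prompt into many subqueries.",
)
```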
Include original data, statistics, and expert citations. The Princeton GEO study found that adding citations improved AI visibility by 30–40%, statistics by 22%, and expert quotations by up to 41%.[19] These are the largest effect sizes found in any generative engine optimization research. Keyword stuffing, by contrast, reduced visibility by 10%.
Use high entity density with explicit proper nouns. Cited text averages 20.6% proper nouns versus a typical 5–8% in standard English.[3] Name tools, people, and companies explicitly rather than using pronouns. Place your primary entity as the grammatical subject of sentences in active voice — practitioner experiments show this measurably increases salience scores. Ben Garry's analysis demonstrated that "Frodo took the ring to Mordor" gives Frodo a salience score of 0.74, while the passive "The ring was taken to Mordor by Frodo" drops Frodo to 0.11. Grammar controls salience.
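A crude way to measure this on your own drafts is to approximate proper-noun share by counting capitalized words outside sentence-initial position. This undercounts entities that open sentences and is no substitute for real NLP salience scoring, but it reliably flags pronoun-heavy writing.

```python
import re

def proper_noun_density(text: str) -> float:
    # Share of words capitalized in non-sentence-initial position;
    # sentence-initial capitalization is ambiguous, so skip it.
    words = 0
    proper = 0
    for sentence in re.split(r"[.!?]+\s*", text):
        toks = sentence.split()
        words += len(toks)
        proper += sum(1 for t in toks[1:] if t[:1].isupper())
    return proper / words if words else 0.0

generic = "The tool syncs your tasks. It also sends reminders."
specific = ("Asana syncs tasks with Google Calendar. "
            "Slack receives reminders from Asana.")
```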
Tier 2: Supported with moderate uncertainty
Structure content into self-contained 50–150 word passages. Because AI systems extract text chunks stripped of surrounding context, every section must be independently comprehensible. Mike King's recommendation is based on RAG retrieval mechanics: large, undifferentiated text blocks confuse chunking algorithms and lead to retrieval failures.[2] However, Ahrefs correctly notes that you cannot meaningfully control how LLMs chunk your content — the exact boundaries are determined by model pipelines, token limits, and retrieval strategies. Clear structure helps, but the mechanism is probabilistic.
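For your own RAG pipelines or content audits, a greedy merger toward that word range is a reasonable sketch. Note the caveat above: external LLM pipelines chunk however they like, so this only models the target shape, not their actual behavior.

```python
def chunk_passages(paragraphs: list[str], max_words: int = 150) -> list[str]:
    # Greedy merge: accumulate paragraphs until adding the next one
    # would push the chunk past max_words, then start a new chunk.
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if count and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks

# Three 60-word paragraphs: the first two merge into one 120-word
# chunk, and the third starts a new chunk.
paras = [" ".join(["lorem"] * 60) for _ in range(3)]
chunks = chunk_passages(paras)
```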
Maintain a balanced analytical tone. Indig's analysis found that cited content maintains a subjectivity score around 0.47 — fact plus interpretation, like analyst commentary. Neither pure Wikipedia neutrality nor pure opinion. Flesch-Kincaid grade 16 is the sweet spot: business-grade clarity without dumbing down. Content at grade 19 or above (academic or PhD level) actively hurts citation likelihood.[3]
Use semantic HTML5 markup and clean, fast-loading code. AI crawlers show 47 times the inefficiency of traditional Googlebot and struggle with complex JavaScript rendering.[20] Only Google's Gemini and AppleBot currently render JavaScript among major AI crawlers. ChatGPT's bot cannot execute JavaScript at all. Use proper semantic tags — article, section, table, ordered and unordered lists — and ensure pre-rendered HTML.
Run content against a vocabulary gap tool once, then close it. Content optimization tools like Surfer or Clearscope map to first-stage BM25 lexical matching. Their value is identifying terms with zero mentions that your audience searches for — the BM25 zero-score cliff makes that first mention the single highest-impact edit. Going from 4 mentions to 8 is nearly worthless due to term saturation. Do not write to a live score. Do not chase keyword density targets. The tool's job is done once you've identified missing vocabulary.
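The saturation math behind the zero-score cliff is visible in BM25's per-term component (document-length normalization omitted for clarity; `k1 = 1.2` is a conventional default, not a claim about any specific search engine's tuning).

```python
def bm25_term_score(tf: int, k1: float = 1.2) -> float:
    # BM25 term-frequency saturation: the first mention creates the
    # entire "cliff"; repeated mentions add rapidly diminishing value.
    return tf * (k1 + 1) / (tf + k1)

gains = [round(bm25_term_score(tf), 2) for tf in (0, 1, 4, 8)]
# → [0.0, 1.0, 1.69, 1.91]
```

Going from zero mentions to one moves the term score from 0.0 to 1.0; going from four mentions to eight adds only about 0.22, which is why chasing mention counts past the first is wasted effort.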
Tier 3: Forward-compatible infrastructure
Implement JSON-LD schema markup for FAQPage, Article, HowTo, and Organization types. The ranking impact is zero — Google has explicitly confirmed this since 2018. But schema enables rich results that improve click-through rates (SearchPilot A/B tests showed FAQ schema producing 3–8% traffic uplift, review schema roughly 20% for e-commerce), and it provides machine parsability for emerging agentic search systems.[21]
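A minimal FAQPage block, built here with Python's `json` module so the structure is easy to validate programmatically; the question-and-answer text is placeholder content, though the `@type` values follow schema.org.

```python
import json

faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "What is query fan-out?",
        "acceptedAnswer": {
            "@type": "Answer",
            "text": "Query fan-out expands one prompt into many "
                    "synthetic subqueries retrieved in parallel.",
        },
    }],
}

# Serialize for embedding in a <script type="application/ld+json"> tag.
jsonld = json.dumps(faq_schema, indent=2)
```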
Create an llms.txt file for forward compatibility with AI agents. The standard lacks official adoption but Perplexity and Anthropic already use it for their own documentation. It is gaining practitioner momentum as a way to format site content specifically for autonomous AI crawlers.[22]
Allow AI crawlers in robots.txt. GPTBot, ClaudeBot, PerplexityBot, and GoogleOther should be explicitly permitted if you want AI citation visibility. This is table-stakes infrastructure.
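A robots.txt fragment granting those crawlers access might look like this; the user-agent tokens are the ones each vendor currently documents, so verify the names before deploying.

```
# Explicitly allow the major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GoogleOther
Allow: /
```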
What is debunked and what is snake oil
The SEO industry sells a significant amount of optimization advice that is either outdated, speculative, or demonstrably false. Here is what the evidence says about common claims.
LSI keywords do not exist. Google's John Mueller stated directly: "There's no such thing as LSI keywords — anyone who's telling you otherwise is mistaken, sorry."
Optimal keyword density is a myth debunked since 2011. The Princeton GEO study found keyword stuffing actually reduced AI visibility by 10%.[19] Chasing specific term frequency targets produces over-optimized "digital mulch" — a term Google's Mueller has endorsed — that the Helpful Content system is designed to penalize.
Content optimization scores do not predict rankings. Three major correlation studies (Surfer at 0.28, Clearscope at 0.30, Originality.ai testing multiple tools) all share the same fatal flaw: no study controlled for domain authority, backlinks, or click signals. The methodology is fundamentally circular — tools generate recommendations by analyzing pages that already rank, then studies test whether already-ranking pages score well. Clearscope founder Bernard Huang acknowledged the limitation directly.
Schema is not a ranking factor and likely not an AI citation factor. While schema distribution aligns closely with AI citations, this is almost certainly correlation — well-maintained sites tend to have both good schema and good content. Claims that implementing schema alone will move citation probability are classic correlation-causation confusion.
The SMITH model was never deployed in production. Google explicitly confirmed this. Many SEO blogs conflated SMITH with passage ranking; the connection is purely speculative.
Response caching as an optimization target is unconfirmed. Semantic caching is architecturally plausible, but the sub-1% reproducibility data directly contradicts the idea that you can time content publication to "lock in" a cached position.[17] If caching exists, it operates with extremely short TTLs, or only at the retrieval layer rather than the final-answer layer.
Any guarantee of specific AI citation positions is structurally impossible. The underlying architecture produces probabilistic, personalized, volatile outputs. Period.
The uncomfortable synthesis
The practices that genuinely improve performance across AI search surfaces are not proprietary techniques requiring specialized tools. They are the natural output of clear, authoritative, well-structured writing by subject-matter experts who know their topic deeply enough to provide original data, name specific entities, and front-load definitive answers.
Two genuinely novel findings emerge from the evidence. First, the multi-surface divergence is real and accelerating. Traditional Google ranking, AI Overview citation, and LLM citation now operate on substantially different retrieval systems with different authority pools. Optimizing exclusively for Google's organic rankings increasingly leaves value on the table — but the instability of AI citations means this new surface is not yet reliably trackable.
Second, the shift from page-level to passage-level competition is genuine. In RAG-based AI systems, your passages compete head-to-head against specific passages from other sources, not against whole pages. Information density per section — not per page — is the operative unit of competition. The writer who buries a definitive answer at paragraph 47 of an ultimate guide will lose to the writer who places it in a 50-word capsule after an H2, not because of any SEO trick, but because of how retrieval systems fundamentally work.[2]
The biggest risk is doing too much of the wrong thing. Over-optimization triggers the exact signals the Helpful Content system was designed to detect. A Cyrus Shepard study of 50 sites found the strongest positive correlation with algorithm survival was first-hand experience at 0.333, while excessive anchor text variation showed the strongest negative correlations.[23] Google is actively penalizing the fingerprints of formulaic optimization.
The evidence-based path forward: write with genuine expertise, front-load definitive answers, use proper nouns instead of pronouns, include original data, structure for independent section extraction, cover the full semantic neighborhood of your topic, and publish first in the spaces where you have a knowledge advantage. Invest in brand signals — at 0.334 correlation with LLM citation, they remain the single strongest predictor of AI search visibility.[9]
The fancy patent mechanics ultimately reward the same thing good writing has always required: being the most useful, most specific, most clearly structured answer to the question someone is asking. The difference is that now your competition for that position includes the LLM's own expected answer.
Sources
- Google Patent US11769017B1, "Generative Summaries for Search Results." Filed March 2023, granted September 2023. Named inventors include Srinivasan Venkatachary. Analysis via iPullRank. — patents.google.com
- King, Michael. "How AI Mode and AI Overviews Work Based on Patents." iPullRank / Search Engine Land, June 2025. Deep dive into six patents powering Google's AI search pipeline. — ipullrank.com | searchengineland.com
- Indig, Kevin. "The Science of How AI Pays Attention." Growth Memo, 2026. Analysis of 1.2 million ChatGPT responses and 18,012 verified citations revealing the "ski ramp" distribution. — growth-memo.com | Coverage: searchengineland.com
- Ahrefs. "AI Overviews vs AI Mode." March 2026. Study of 863,000 SERPs finding only 38% of AI Overview citations from top-10 organic results, down from 76% eight months prior. — ahrefs.com
- McSweeney, David. Previsible SEO testing of 143 AI Mode/AI Overview responses and published structural analysis of 5,000 AI responses. Referenced via Twitter/X thread and LinkedIn post. — x.com/top5seo
- McSweeney, David. LinkedIn post responding to the Fishkin causation debate. "Rand Fishkin's post argues that AI citations are correlated with brand appearances in AI answers, but not causal. I agree with most of it… But the mechanism itself is a scoring function." — linkedin.com/mcsweeney
- Fishkin, Rand. LinkedIn post and 5-Minute Whiteboard video. "I'm really worried that marketers are getting bamboozled by all the AI Citation Research… Citations are correlated, sure, but not causal." — linkedin.com/randfishkin
- Sonders, Mike. "How Much Can You Influence Which Brands ChatGPT Recommends?" Analysis of 2,654 ChatGPT-cited URLs. — visible.beehiiv.com
- Sonders, Mike. Study of 1,200 ChatGPT interactions (12 prompts × 100 runs) finding ~11% brand dominance rate. Referenced in LinkedIn thread. — visible.beehiiv.com
- iPullRank. "Expanding Queries with Fan-Out." Analysis of Google's query fan-out patent and introduction of the Qforia tool. — ipullrank.com | Also: digiday.com
- Qwairy analysis of 102,018 AI-generated queries finding 28.1% automatic current-year injection into synthetic subqueries. Referenced via ekamoira.com
- Google Patent US12158907B1, "Thematic Search." Describes the system generating structural outlines of themes and subthemes before populating AI Overview responses. — patents.google.com
- Ahrefs. "Do AI Assistants Prefer to Cite Fresh Content?" Study of ~17 million citations finding AI-cited content is 25.7% fresher than traditional organic results. — ahrefs.com
- Qwairy Q3 2025 cross-platform study finding only 11% domain overlap between ChatGPT and Perplexity citations. Perplexity scored 4.36/5 in relevance benchmarking. — aeoengine.ai
- Ray, Lily (Amsive Digital) and Germain, Thomas (BBC). Data void experiment demonstrating AI citation vulnerability, February 2026. — bbc.com
- Weller, Brooke (LinkedIn). Three-layer citation taxonomy: interface citations, generation citations, and retrieval citations. Articulated in LinkedIn thread responding to the Fishkin/McSweeney debate. — linkedin.com (thread)
- Fishkin, Rand and Gumshoe/SparkToro. Study of 2,961 prompt runs across 600 volunteers finding <1% chance of identical brand lists. — sparktoro.com
- CXL. "Where Google AI Overviews Pull Their Answers From." Study finding 55% of AI Overview citations extracted from the top 30% of page content. — cxl.com
- Aggarwal et al. "GEO: Generative Engine Optimization." Princeton University, Georgia Tech, Allen AI. Published at ACM KDD 2024. 10,000-query study finding +22% visibility from statistics, +41% from expert quotations, -10% from keyword stuffing. — arxiv.org
- Directive Consulting. "How to Optimize Content for AI Search." Analysis of AI crawler inefficiency (47× compared to Googlebot) and JavaScript rendering limitations. — directiveconsulting.com
- SearchPilot A/B testing of schema markup impact on CTR. FAQ schema: 3–8% traffic uplift. Review schema: ~20% for e-commerce. — searchpilot.com
- Search Engine Land. "llms.txt Proposed Standard." Coverage of the emerging llms.txt specification for AI agent compatibility. — searchengineland.com
- Shepard, Cyrus. Study of 50 sites finding first-hand experience (0.333 correlation) as strongest predictor of algorithm update survival. Referenced in dev.to analysis