
AI Mode Query Fan-Out Analyzer.

A Streamlit app that scores your content against the way Google AI Mode actually reformulates queries. Paste a URL and a seed query. The tool generates up to 20 realistic query variants, embeds the variants and every passage on the page, and tells you which passages are covered and which are gaps.


AI Mode Query Fan-Out Analyzer main interface, showing the Generated Queries list and a Content Similarity Analysis table with overall similarity, max passage similarity, and average passage similarity scores.
The main surface — a generated query fan-out and a ranked similarity table per URL.

Why I Built It

Clients kept asking the same question: "Is this page going to show up in AI Overviews?" The honest answer used to be, "Publish it and find out in a month." That's expensive. Content takes weeks to produce, and waiting for AI to re-crawl and decide costs real money.

AI Search doesn't rank URLs the way classic SEO did — it pulls passages. A page either has passages that match the retrieval query's intent or it doesn't. If I could simulate the retrieval before publishing, I could tell you which passages to keep, rewrite, or add.

That's what this does.

The Nine Query Variant Types

When a user types a query into AI Mode, the model doesn't just answer that one string — it internally reformulates. A seed query fans out into a family of related queries, and the system pulls passages that satisfy any of them. To simulate that, I prompt Gemini to generate up to 20 variants across nine types. The model picks the types that make sense for the seed — not every type fires for every query.

The first variant is always the original query, exactly as typed.

  1. Equivalent. Rephrasings of the same question.
    "did roger moore drive an aston martin" → "what car did roger moore drive"
  2. Follow-up. Logical next questions that build on the original.
    "did da vinci paint mona lisa" → "who commissioned da vinci to paint mona lisa"
  3. Conversational follow-up. How people actually talk to AI Mode after getting a first answer. The topic stays in the query for semantic match.
    "solar panels" → "are solar panels worth it?" / "how long do solar panels last?"
  4. Generalization. Broader version of the question.
    "best Italian restaurants in Manhattan" → "best restaurants in New York City"
  5. Specification. More detailed or specific version.
    "climate change" → "climate change effects on coastal cities"
  6. Canonicalization. Slang or informal phrasing turned into standard terms.
    "how to get rid of belly fat fast" → "abdominal fat reduction methods"
  7. Entailment. Consequences, prerequisites, or implied facts.
    "solar panel installation" → "solar panel maintenance requirements"
  8. Clarification. Disambiguation when the seed query has multiple meanings.
    "apple" → "apple fruit nutrition" or "apple iphone features"
  9. Related entity. Closely related people, concepts, or products.
    "iPhone 15 features" → "smartphone comparison 2024"

I force at least two or three conversational follow-ups in every run, because that's where AI Mode actually lives. Static keyword SEO gets you the equivalent and specification buckets. AI Search is where the other seven types matter.

How It Works Under the Hood

Six stages. The first three are setup. The last three are the analysis.

1. Scrape the Page

Three scraping modes, in order of strength: a plain HTTP fetch parsed with BeautifulSoup, trafilatura's article extraction, and a selenium-stealth headless browser for JavaScript-heavy pages that block simple requests.

2. Chunk Into Passages

Two granularities: sentence-level, where each sentence is scored on its own, and passage-level, where paragraph-sized chunks are the unit of analysis.

Passage-based mode also supports a sliding sentence-overlap window, so a passage bleeds a sentence or two into its neighbors — useful when a single retrieval-worthy idea crosses a paragraph break.
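The sliding overlap window described above can be sketched as follows. This is a minimal illustration, not the app's actual implementation; the regex sentence splitter is a stand-in for the nltk tokenizer the stack mentions, and the function names are hypothetical.

```python
import re

def split_sentences(text: str) -> list[str]:
    # Simple stand-in for nltk's sent_tokenize.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk_with_overlap(paragraphs: list[str], overlap: int = 1) -> list[str]:
    """Each passage is a paragraph plus `overlap` trailing sentences from the
    previous paragraph and `overlap` leading sentences from the next, so an
    idea that crosses a paragraph break still lands inside one passage."""
    sents = [split_sentences(p) for p in paragraphs]
    passages = []
    for i, current in enumerate(sents):
        before = sents[i - 1][-overlap:] if i > 0 else []
        after = sents[i + 1][:overlap] if i + 1 < len(sents) else []
        passages.append(" ".join(before + current + after))
    return passages
```

With `overlap=1`, every passage bleeds exactly one sentence into each neighbor, which is usually enough to capture a retrieval-worthy idea split across a break.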

3. Generate the Query Fan-Out

Gemini receives the prompt described above with the seed query. It returns a Python-parseable list of query strings. Nothing fancy — no chain of thought, no voting. It's a one-shot call.
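A sketch of the prompt construction and response parsing for that one-shot call. The prompt wording and helper names here are my own illustration, not the app's exact prompt; the parsing step assumes only what the text states, that the model returns a Python-parseable list of strings.

```python
import ast
import re

VARIANT_TYPES = [  # the nine types described above
    "equivalent", "follow-up", "conversational follow-up",
    "generalization", "specification", "canonicalization",
    "entailment", "clarification", "related entity",
]

def build_prompt(seed: str, n: int = 7) -> str:
    types = ", ".join(VARIANT_TYPES)
    return (
        f"Generate up to {n} query variants for the seed query '{seed}'. "
        f"Use whichever of these types fit: {types}. Include at least two "
        "conversational follow-ups. The first variant must be the seed "
        "query exactly as typed. Return ONLY a Python list of strings."
    )

def parse_fanout(response_text: str) -> list[str]:
    # Models sometimes wrap the list in prose or a code fence,
    # so grab the first [...] span before literal_eval.
    match = re.search(r"\[.*\]", response_text, re.DOTALL)
    if not match:
        raise ValueError("no list found in model response")
    queries = ast.literal_eval(match.group(0))
    return [q for q in queries if isinstance(q, str)]
```

`ast.literal_eval` accepts only literals, so a malformed or adversarial response raises instead of executing anything.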

4. Embed Everything

Eight embedding models are available, spanning local sentence-transformers models, Gemini embeddings, and the OpenAI embedding family. Pick one for the whole run.

Embeddings are cached by SHA-256 of (model name + text), so re-runs on the same content don't pay the API cost twice.
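The caching scheme can be sketched like this. The exact key format (how model name and text are joined) is an assumption; the text only specifies SHA-256 over both.

```python
import hashlib

class EmbeddingCache:
    """In-memory cache keyed by SHA-256 of model name + text, so the same
    passage embedded with the same model never pays the API cost twice."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # (model, text) -> vector
        self.store: dict[str, list[float]] = {}
        self.misses = 0

    def _key(self, model: str, text: str) -> str:
        # Separator is an assumed detail, not taken from the app.
        return hashlib.sha256(f"{model}\x00{text}".encode("utf-8")).hexdigest()

    def get(self, model: str, text: str) -> list[float]:
        k = self._key(model, text)
        if k not in self.store:
            self.misses += 1
            self.store[k] = self.embed_fn(model, text)
        return self.store[k]
```

Hashing the model name alongside the text matters: the same passage embedded by two different models must occupy two cache slots.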

5. Compute Pairwise Cosine Similarity

Every passage is compared to every query. For a page with 40 passages and 7 queries, that's 280 comparisons — each a dot product of normalized vectors. Fast even on CPU.
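The all-pairs comparison reduces to a single matrix multiply once the vectors are normalized. A minimal NumPy sketch (the app's stack lists scikit-learn for this, which does the same thing internally):

```python
import numpy as np

def similarity_matrix(passage_vecs, query_vecs):
    """Pairwise cosine similarity: L2-normalize the rows, then one matrix
    multiply yields every passage x query dot product at once."""
    P = np.array(passage_vecs, dtype=float)
    Q = np.array(query_vecs, dtype=float)
    P /= np.linalg.norm(P, axis=1, keepdims=True)
    Q /= np.linalg.norm(Q, axis=1, keepdims=True)
    return P @ Q.T  # shape: (n_passages, n_queries)
```

For 40 passages and 7 queries this is a 40×7 result, 280 dot products in one BLAS call, which is why it stays fast on CPU.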

6. Highlight and Score

Three bands, keyed to empirical thresholds I've used in client work: green for covered (similarity ≥ 0.75), amber for partial coverage (0.60–0.75), and red for gaps (< 0.60).

The UI renders the source page's HTML with passages color-coded inline, plus a ranked table of passages × queries.
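The banding logic is a two-threshold cutoff. A minimal sketch, using the 0.75 and 0.60 thresholds from the captions below as defaults:

```python
def band(score: float, covered: float = 0.75, gap: float = 0.60) -> str:
    """Map a cosine similarity score to a display band."""
    if score >= covered:
        return "green"   # covered
    if score < gap:
        return "red"     # gap
    return "amber"       # partial match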

Detailed passage analysis for the query 'benefits of server-side rendering' — the source page rendered with green passages (covered) and red passages (gaps) highlighted inline, alongside a Top 5 Passages list ranked by similarity score.
Passage analysis for a single query. Green = covered (≥0.75). Red = gap (<0.60). The right column ranks the top passages by score.

Inputs and Outputs

Inputs. A seed query. Number of variants (3–20, seven is the sweet spot). Input mode (URL list, pasted text, or persona-prompt ranking). Scraping method. Analysis granularity. Embedding model.

Outputs. Inline highlighted HTML of the source page (green / amber / red passages you can read in place). Ranked passage × query table. A gap report for passages below 0.60 against every query. Optional Gemini-generated SEO recommendations. And a prompt-ranking mode if you're choosing between candidate prompts for an AI app.
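The gap report criterion (a passage below 0.60 against every query) can be sketched as a row-max test over the similarity matrix. Function and parameter names here are illustrative, not the app's actual API:

```python
import numpy as np

def gap_report(sim, passages, threshold: float = 0.60):
    """A passage is a gap when its best score against EVERY query stays
    below the threshold, i.e. its row max is under the cutoff."""
    sim = np.asarray(sim, dtype=float)
    gaps = []
    for i, row in enumerate(sim):
        best = float(row.max())
        if best < threshold:
            gaps.append((passages[i], best))
    return sorted(gaps, key=lambda t: t[1])  # worst gaps first
```

Using the row max (not the mean) matters: one strong query match is enough to make a passage retrievable, so only passages that miss everything count as gaps.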

Passage similarity heatmap showing 20+ queries (rows) scored against 51 passages (columns) in a green-to-yellow gradient. Brighter cells indicate stronger matches between specific query and passage pairs.
Passage × query heatmap. One row per query, one column per passage. Bright cells are retrievable matches; darker cells are gaps.
AI-Powered SEO Recommendations dashboard showing average similarity of 82.73 percent, 0 content gaps, 20 strong matches, and structured recommendations across Content Gaps and Semantic Expansion, Content Structure for AI Extraction, Semantic Coverage, and AI Search Optimization.
The optional SEO recommendations view — a Gemini-generated rewrite list organized by section, grounded in the actual similarity scores above.

Stack

Python, Streamlit, no database. The whole thing runs in a single process: sentence-transformers for local embeddings, google-genai for query generation and Gemini embeddings, openai for the OpenAI embedding family, huggingface_hub for gated-model auth, scikit-learn for cosine similarity, plotly for the visualizations, beautifulsoup4 + trafilatura + selenium-stealth for scraping, nltk for sentence splitting.

Deployed on Posit Connect Cloud.

What It Doesn't Do

Worth being direct here:

When to Use It

Try It Now

Live on Posit Connect Cloud. Open full-screen →

Working With Me on This

The Fan-Out Analyzer is free to use. The harder part is interpreting the scores and writing the passages that close the gaps — that's what the AI SEO consulting service is for. If you want me to run it against your site and return a rewrite list, start a conversation.
