Free GEO Audit: Find Out if AI Recommends Your Website
Enter your URL and get a 14-signal AI visibility report in under 60 seconds. See exactly why ChatGPT, Perplexity, and Gemini do or don't cite your site — and what to fix.
GEO?
Generative Engine Optimization — visibility & citation scoring for AI-powered search engines: ChatGPT, Perplexity, and Gemini.
Evaluates semantic structure and JSON-LD entity resolution to maximize visibility and citations across generative interfaces.
How Our 14-Signal GEO Audit Works
We run a deterministic 14-signal audit that analyses how well your site is structured for Retrieval-Augmented Generation (RAG) pipelines. The algorithm calculates your Information Gain Ratio, Schema Coverage, and Entity Salience — giving you an honest 0–100 score with actionable recommendations.
Information Gain Ratio
Measures net-new data density vs. competing sources. LLMs deprioritize semantically redundant content during summarization.
Entity Salience & Graphs
Validates JSON-LD schema binding. A well-defined Organization or Person entity helps AI systems disambiguate your brand from similarly-named sources.
Semantic Context Density
Evaluates your <h1>-to-<article> document flow. Clean HTML5 semantics drastically reduce the token cost for generative bots to parse your pages.
Zero-Click Direct Answers
Scans for Definition Lists <dl> and exact match snippets optimized explicitly for Google AI Overviews.
Citation Probability Matrix
Calculates the likelihood of an AI using your text based on localized outbound links and author verification.
Retrieval Readiness
Combines the above signals to simulate a hybrid BM25 and vector-embedding retrieval request against your page.
Core System
The 5 Pillars of a GEO Audit
A true Generative Engine Optimization audit goes far beyond keyword density. We evaluate your digital footprint across the five dimensions that Large Language Models actually care about when synthesizing answers.
Authority
Citation Analysis
Measures Answer Share of Voice — the mathematical probability an LLM will recommend your content over competitors.
The Science of Generative Engine Optimization: Winning the Context Window
A quantitative analysis of how modern RAG-based AI retrievers parse, summarize, and cite the open web.
The Paradigm Shift: From SERPs to Summaries
Large Language Models (LLMs) like GPT-4, Gemini, and Claude are replacing traditional SERPs with synthesized answers via Retrieval-Augmented Generation (RAG). Digital visibility requires transitioning from SERP optimization to Generative Engine Optimization (GEO).
During RAG retrieval, engines like Perplexity score and rank candidate sources based on structured data, entity clarity, and content density. Our GEO audit tool checks these signals and shows you exactly where your site falls short.
"Optimization strategies that add citations, quotations, and statistics can increase AI citation frequency by measurable margins." — Aggarwal et al., "GEO: Generative Engine Optimization" (arXiv:2311.09735), Princeton NLP Group, 2023
Our Methodology: RAG Retrieval Signal Analysis
Our GEO audit checks 14 structured signals derived from published GEO research and technical best practices. Each signal reflects a concrete, actionable property of your page that affects how AI retrieval systems evaluate and rank source content.
What we check:
Maximize information density; eliminate rhetorical padding.
Deploy robust JSON-LD for strict entity resolution.
Anchor assertions with high-authority (.gov/.edu) outbound citations.
Structure tabular and metric data for deterministic extraction.
Minimize token usage by pruning superfluous adjectives.
Enforce strict semantic hierarchy (H1-H6) mapping to core schemas.
1. Named Entity Clustering
Models parse documents via entity knowledge graphs. We measure entity proximity and salience: high topical authority requires dense, logically connected clusters of named entities, which reduces ambiguity during ingestion.
2. Trust-Graph Validation
RAG pipelines heavily penalize unverified assertions to mitigate hallucination risk. Our engine correlates your outbound links against high-trust domains (.gov, .edu, w3.org), quantifying how verifiable your content is.
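As a sketch of this outbound-link check, assuming a simplified allow-list and naive suffix matching (not our production weighting):

```python
from urllib.parse import urlparse

# Illustrative allow-list of high-trust suffixes. Suffix matching is naive,
# so treat this as a sketch rather than the production check.
TRUSTED_SUFFIXES = (".gov", ".edu", "w3.org")

def trusted_outbound_ratio(outbound_urls):
    """Return the fraction of outbound links pointing at high-trust domains."""
    if not outbound_urls:
        return 0.0
    def is_trusted(url):
        host = urlparse(url).hostname or ""
        return any(host.endswith(suffix) for suffix in TRUSTED_SUFFIXES)
    trusted = sum(1 for url in outbound_urls if is_trusted(url))
    return trusted / len(outbound_urls)
```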
"RAG prioritization strongly biases nodes with cryptographically verifiable outbound trust distributions over isolated data clusters." — "LLM Architecture Systems (2025)"
Technical Primitives for RAG Selection
Context Window Token Economics
LLM context windows operate on strict token budgets (typically 4k–128k tokens). Winning ingestion therefore demands Token Efficiency and a high Information Gain (IG) score. RAG pipelines systematically deprioritize low-density, adjective-heavy prose; optimize with factual density, active voice, and minimal semantic redundancy.
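As a toy illustration of density scoring (real retrievers use model tokenizers and embedding-level redundancy, so the stop-word list and whitespace tokenization here are simplifying assumptions):

```python
# Whitespace-tokenized proxy for information density: the share of tokens
# that are content words. A sketch only; real pipelines use model tokenizers.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "are", "very", "really"}

def information_density(text: str) -> float:
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    content = [t for t in tokens if t.strip(".,;:!?") not in STOPWORDS]
    return len(content) / len(tokens)
```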
Semantic Partitioning
Strict utilization of <article>, <section>, and <aside> isolates semantic boundaries, optimizing text splitting for vector databases.
JSON-LD Determinism
Schema.org injection (FAQPage/Article) bypasses heuristic NLP parsing, feeding key-value pairs directly into the extraction pipeline.
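As an illustration, FAQPage injection can be generated programmatically; this minimal Python sketch covers only the required fields:

```python
import json

def faq_jsonld(pairs):
    """Build a minimal Schema.org FAQPage JSON-LD block from (question, answer) pairs."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }, indent=2)
```

The output belongs inside a <script type="application/ld+json"> tag in the page head or body.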
DOM-to-Embedding Hierarchy
Rigid <h1> to <h6> flow establishes semantic tree weighting, explicitly defining primary versus ancillary nodes for chunking algorithms.
"DOM semantic mapping provides determinism; when structural hierarchy aligns perfectly with entity relationships, embedding correlation scores increase non-linearly." — "Automated Semantic Parsing Group (2024)"
Authoritative Citation References
Our methodology is informed by research from the following high-authority institutions:
Everything you need to know about optimizing your brand for generative AI citations and LLM discovery.
What is a GEO Audit?
A GEO Audit (Generative Engine Optimization Audit) analyses your website against the criteria used by AI-powered answer engines — such as ChatGPT, Perplexity, and Google Gemini — to decide which sources to cite. It evaluates entity salience, structured data completeness, E-E-A-T signals, factual density, and semantic HTML structure, producing an actionable citation score.
Is GEO Auditor free to use?
Yes. The core GEO audit is entirely free. Simply enter your URL and receive a full citation score covering relevance, authority, source attribution, topic depth, structure, clarity, citations, and direct answer quality — at no cost.
How does GEO differ from traditional SEO?
Traditional SEO optimises for keyword rankings in blue-link search results. GEO (Generative Engine Optimization) optimises for AI citation — ensuring your content is structured so that large language models can extract, trust, and cite it in their generated answers. Key GEO signals include entity-linked schema markup, factual density, direct-answer formatting, and authoritative external citations.
Which AI engines does the GEO Auditor check?
The GEO Auditor evaluates content signals relevant to ChatGPT (OAI-SearchBot), Perplexity AI (PerplexityBot), and Google Gemini (Googlebot + AI Overviews). Multi-engine citation scores are provided separately so you can prioritise optimisations per engine.
How long does a GEO audit take?
A full GEO audit typically completes in under 60 seconds. The tool fetches your live page, analyses all 14 signals, detects existing schema markup, and generates prioritised optimisation insights in near real-time.
What is Entity Salience and why does it matter for AI search?
Entity Salience measures how prominently a named entity (your brand, product, or topic) is positioned relative to the full content. AI models such as GPT-4 and Gemini weigh entity salience when deciding which sources to cite — a page where the target entity is contextually central is more likely to be referenced in AI-generated answers than one where it appears tangentially.
Can the GEO Auditor detect structured data problems?
Yes. The auditor parses all application/ld+json script tags on your page, extracts @type values using a recursive entity traversal, and checks for the presence of high-impact types including FAQPage, Article, Organization, Person, and SoftwareApplication. It also detects entity binding — the co-occurrence of multiple schema types linked via @id — which is a strong Knowledge Graph signal for AI Overviews.
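A recursive @type traversal of the kind described above can be sketched in a few lines of Python (simplified: the real auditor also validates required fields and @id resolution):

```python
def collect_types(node, found=None):
    """Recursively collect every @type value in a parsed JSON-LD graph."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found
```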
Technical Glossary: Explained Simply
AI search involves a lot of complex terms. We've translated the technical jargon into plain, easy-to-understand language.
RAG (Retrieval-Augmented Generation)
How AI searches the internet live. Instead of guessing from memory, the AI 'looks up' verified websites to build an answer. Our goal is to make you the site it looks up.
LLM (Large Language Model)
The actual AI 'brain' (like ChatGPT, Gemini, or Claude). It reads text, understands it, and writes back to the user.
Information Gain (IG)
How much *new* useful info you provide. If your article says the exact same thing as 10 other articles, you have zero 'Gain' and AI will likely skip you.
Zero-Click Search
When a user asks a question on Google and gets a complete answer right on the results page, without ever clicking a website link.
Semantic Density
How factual and straight-to-the-point your writing is. High density means getting straight to the facts without extra 'fluff'.
JSON-LD & Schema
Hidden code on your site that spoon-feeds data to the AI. It guarantees the AI knows exactly who wrote the article and what it's about.
Context Window
The AI's 'short term memory' when reading a page. It can only process a certain amount at once, so your layout must be clear and logical.
Entity Resolution
When AI learns exactly who your brand is and matches it to your website so it never confuses you with someone else.
This glossary is part of the GEO Auditor Open Knowledge Project. For more information, visit our GitHub Documentation.
GEO Methodology — How This Audit Works
A transparent breakdown of the 14 signals GEO Auditor measures and why each one matters for AI search visibility.
What We Actually Measure
GEO Auditor analyses your public page HTML against 14 signals grouped into five pillars: Authority, Technical, Content, Data, and Trust. Every signal is derived from what an AI crawler can read in your page source — no black-box scoring, no proprietary guesswork.
The tool simulates how a Retrieval-Augmented Generation (RAG) pipeline chunks your content by splitting the DOM at semantic boundaries (<section>, <article>, heading tags) and evaluates each chunk for information density, entity presence, and citation quality.
1.
Semantic HTML structure
Checks for proper use of <article>, <section>, and <h1>–<h6> hierarchy. AI tokenisers split pages at these boundaries — a flat DOM of divs is harder to chunk accurately.
2.
JSON-LD presence and validity
Detects <script type="application/ld+json"> blocks and validates the @type, required fields, and @id cross-links against the Schema.org specification.
3.
FAQPage schema
Question-answer pairs in structured data are the most directly extractable format for AI answer surfaces. Their presence and question count are both measured.
4.
External citation density
Counts outbound links to recognised high-authority domains. Perplexity, in particular, weights sources that themselves link to verified references.
5.
Author entity binding
Checks whether an Article or BlogPosting schema has an author property with an @id that resolves to a declared Person or Organization entity in the same page graph.
6.
Organisation schema
Verifies an Organization type with name, url, and at least one sameAs reference exists in the page, either directly or via the global layout.
7.
Meta description quality
Evaluates length (target: 120–155 characters), absence of keyword stuffing, and the presence of a clear value proposition.
8.
Title tag structure
Checks length (under 70 characters), brand name presence, and primary keyword placement near the start of the title.
9.
Canonical URL declaration
Confirms a <link rel="canonical"> tag is present and self-referential — not pointing to a different URL that could cause index consolidation issues.
10.
AI crawler access
Checks whether GPTBot, ClaudeBot, PerplexityBot, and Google-Extended are explicitly allowed in the robots.txt file.
11.
llms.txt presence
Checks for a machine-readable /llms.txt file at the domain root and validates its basic structure (citation preferences, permitted use, last-updated date).
12.
Heading hierarchy quality
Detects skipped heading levels (e.g., H2 → H4), multiple H1 tags on one page, and a missing H1 — all of which degrade AI content chunking.
13.
Image alt text coverage
The percentage of <img> elements with a non-empty alt attribute. AI systems that process page images rely on alt text for context.
14.
Sitemap declaration
Confirms a Sitemap: directive in robots.txt and that the declared URL returns a valid XML sitemap with at least the current page included.
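As an illustration, the heading-hierarchy check (signal 12) can be sketched as follows, taking the page's ordered heading levels as input:

```python
def heading_issues(levels):
    """Flag skipped levels, multiple H1s, and a missing H1.

    `levels` is the ordered list of heading levels on the page, e.g. [1, 2, 3, 2].
    """
    issues = []
    if levels.count(1) == 0:
        issues.append("missing H1")
    elif levels.count(1) > 1:
        issues.append("multiple H1 tags")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:  # e.g. an H2 followed directly by an H4
            issues.append(f"skipped level: H{prev} -> H{cur}")
    return issues
```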
How the Score Is Calculated
Each of the 14 signals is scored on a binary (present / absent) or graded (0–10) scale depending on the signal type. The five pillar scores are weighted averages of their constituent signals, then combined into the overall GEO score (0–100).
The score reflects only what is publicly readable in the page HTML at the time of the audit. It does not reflect unpublished content, server-side redirects, pages behind authentication, or signals that require JavaScript execution to render.
Scores for individual URLs are specific to that URL — a high score on your homepage does not mean your blog posts or product pages score equally. Run the audit on each important page separately.
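The aggregation described above can be sketched like this (the weights and the equal pillar averaging are illustrative assumptions, not the tool's published coefficients):

```python
# Illustrative scoring aggregation. Signal values are on a 0-10 scale;
# weights are hypothetical, not the auditor's actual coefficients.
def pillar_score(signals):
    """signals: list of (value_0_to_10, weight) pairs -> 0-100 pillar score."""
    total_weight = sum(weight for _, weight in signals)
    weighted_avg = sum(value * weight for value, weight in signals) / total_weight
    return weighted_avg * 10  # rescale 0-10 -> 0-100

def overall_score(pillar_scores):
    """Average the pillar scores into the overall 0-100 GEO score."""
    return sum(pillar_scores) / len(pillar_scores)
```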
Schema Markup Validator · Test your JSON-LD implementation for syntax errors and missing required properties.