You want to know whether ChatGPT, Gemini, or Perplexity cite your website — and if not, why. A GEO audit answers both questions. It evaluates your site against the criteria AI engines use to select their sources, identifies blockers, and gives you a prioritized action plan.

This article walks you through a complete, step-by-step GEO audit that you can run yourself or accelerate with an automated tool.

What a GEO audit evaluates (and what it does not)

A GEO audit is not an SEO audit. The two complement each other but measure different things.

What a GEO audit measures:

  • Can your content be extracted and cited by an AI? (extractability)
  • Can AI bots access your website? (crawlability)
  • Is your site understood by machines? (structured data)
  • Is your content perceived as trustworthy? (verifiability, authority, neutrality)
  • Do you exist outside your own website? (external presence)
  • Is your content up to date? (freshness)

What a GEO audit does not measure:

  • Your Google rankings (that is SEO)
  • Your page speed (that is technical SEO)
  • Your backlink profile in detail (that is off-page SEO)
  • The exact number of times AI engines cite you (no tool can measure this with certainty)

A GEO audit measures your citation potential — the technical and editorial conditions that maximize your chances of being selected as a source by AI engines.

The 8 GEO criteria that determine whether AI cites you →

Step 1 — The citation test (15 minutes)

Before diving into the technical side, start with the most concrete test: ask AI engines directly whether they know your website.

How to do it

Open ChatGPT, Gemini, and Perplexity. For each, ask 5 questions:

Recommendation questions (your market):

  1. "What's the best [your service/product] in [your city]?"
  2. "Recommend a [your service/product] for [specific need your clients have]"

Expertise questions (your domain):

  1. "How do I choose a good [your industry]?"
  2. "What's the difference between [A] and [B] in [your sector]?"

Brand question (your name):

  1. "What does [your company name] do?"

How to analyze the results

For each response, record in a spreadsheet:

  • Cited? — your site or brand appears in the response (yes/no)
  • Competitors cited — which names appear instead of yours
  • Type of source cited — official website, blog post, forum, directory?

Calculate your presence rate: number of citations / 15 possible answers (5 questions × 3 engines) = X%.

Below 20%, you have a significant AI visibility problem. Above 50%, you are in a good position. Above 80%, you can shift to fine-tuning mode.
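
The spreadsheet math above can be sketched in a few lines of Python. The engine and question labels are placeholders; fill in one True/False per question-and-engine pair:

```python
# Citation-test tracker: one entry per (engine, question) pair,
# True if your site or brand was cited in the response.
results = {
    ("chatgpt", "recommendation-1"): True,
    ("chatgpt", "recommendation-2"): False,
    # ... fill in the remaining engine/question pairs ...
}

def presence_rate(results, questions=5, engines=3):
    """Citations divided by total question/engine pairs, as a percentage."""
    total = questions * engines
    cited = sum(1 for v in results.values() if v)
    return 100 * cited / total

def verdict(rate):
    """Thresholds from the audit: <20% problem, >50% good, >80% fine-tuning."""
    if rate < 20:
        return "significant AI visibility problem"
    if rate > 80:
        return "fine-tuning mode"
    if rate > 50:
        return "good position"
    return "room to improve"

rate = presence_rate(results)
print(f"{rate:.0f}% -> {verdict(rate)}")
```

Rerun the same questions monthly with the same tracker to see whether your presence rate moves.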

What the results tell you

If you appear nowhere but your competitors are cited, the problem is likely technical (crawlability, structured data) or editorial (extractability, content quality).

If nobody appears in your sector, that is an opportunity: the first to optimize will capture the citations.

If you appear on some queries but not others, compare the corresponding pages: the one that gets cited probably has something the others lack (better structure, more data, FAQ...).

Step 2 — The crawlability audit (10 minutes)

This is the binary criterion: if AI bots cannot access your site, nothing else matters.

Check robots.txt

Go to your-website.com/robots.txt and look for directives targeting these user-agents:

  • GPTBot — ChatGPT
  • ClaudeBot — Claude
  • PerplexityBot — Perplexity
  • Google-Extended — Gemini / AI Overviews

If you see Disallow: / for any of these bots, you are actively blocking them.

Expected result: no AI bot should be blocked (unless it is a documented strategic decision).
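
If you prefer to check this programmatically, Python's standard-library robots.txt parser can do it. The robots.txt content below is a hypothetical example (it blocks GPTBot); in practice you would fetch your own file from /robots.txt:

```python
from urllib.robotparser import RobotFileParser

# The four AI user-agents this audit checks for.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

# Hypothetical robots.txt content -- replace with your own file's text.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for bot in AI_BOTS:
    status = "BLOCKED" if not parser.can_fetch(bot, "/") else "allowed"
    print(f"{bot}: {status}")
```

With the sample file above, GPTBot is reported as blocked and the other three bots fall through to the catch-all `Allow: /` rule.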

Check the llms.txt file

Go to your-website.com/llms.txt. This file, if it exists, guides AI crawlers to your most important content.

Expected result: the file exists and lists your main pages.

llms.txt, robots.txt, and AI crawlability: the technical guide →

Check JavaScript accessibility

Disable JavaScript in your browser (Chrome DevTools → Ctrl+Shift+P → "Disable JavaScript") and navigate your site. If the main content disappears, AI crawlers probably cannot read it either.

Expected result: main content is visible without JavaScript.

Check the sitemap

Go to your-website.com/sitemap.xml. Verify that it is accessible, lists all your important pages, and that <lastmod> dates are up to date.

Expected result: sitemap accessible and up to date.
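
A quick sketch of the same check in Python, using the standard library's XML parser. The sitemap content here is hypothetical; fetch yours from /sitemap.xml:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sitemap.xml content -- replace with your own.
sitemap_xml = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-01-10</lastmod></url>
  <url><loc>https://example.com/faq</loc><lastmod>2024-03-02</lastmod></url>
</urlset>
"""

# Sitemap elements live in this XML namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap_xml)

for url in root.findall("sm:url", NS):
    loc = url.findtext("sm:loc", namespaces=NS)
    lastmod = url.findtext("sm:lastmod", namespaces=NS)
    if lastmod:
        age = (date.today() - date.fromisoformat(lastmod)).days
        print(f"{loc}: lastmod {lastmod} ({age} days old)")
    else:
        print(f"{loc}: missing <lastmod>")
```

Flag any important page whose `<lastmod>` is missing or clearly stale.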

Step 3 — The extractability audit (30 minutes)

This is the most heavily weighted criterion in the GEO score (25 of 100 points) and the one that fails most often.

Analyze your 5 main pages

Take your 5 most important pages (homepage, main service/product page, key blog post, FAQ, About page). For each page, evaluate:

The first 100 words — do they contain a direct answer to the question the visitor is asking? Or do they start with vague marketing copy ("Welcome to...", "Leading innovation...")?

Test with this method: copy the first 100 words of your page and paste them into ChatGPT, asking "Based on this text, what does this company do and what does it offer?" If ChatGPT cannot answer clearly, your intro is not extractable.

H2/H3 subheadings — are they descriptive or vague? "How our service works" is good. "Learn more" is bad. "Our advantages" is mediocre.

Paragraph autonomy — pick a random paragraph from the page. Is it understandable out of context? AI engines often extract a single paragraph to answer a question. If that paragraph requires the previous paragraph for context, it loses its value.

Structure — are there lists, tables, numbered steps? Structured information is more easily extractable than continuous text.
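
To run the first-100-words test at scale, you first need the visible text of each page. A minimal sketch with Python's built-in HTML parser (the sample page snippet is hypothetical):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def first_words(html, n=100):
    """Return the first n words of a page's visible text."""
    p = TextExtractor()
    p.feed(html)
    return " ".join(" ".join(p.parts).split()[:n])

# Hypothetical page snippet:
html = "<h1>Acme Plumbing</h1><p>Emergency repairs in Lyon within 2 hours.</p>"
print(first_words(html))
```

Paste the output into ChatGPT with the question above; if the answer is unclear, the intro needs rewriting.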

Quick scoring

For each page, assign a subjective score:

  • Good (4-5/5): factual intro, descriptive subheadings, autonomous paragraphs, lists/tables
  • Average (2-3/5): mix of informative and commercial content, partial structure
  • Weak (0-1/5): commercial intro, no structure, vague continuous text

Step 4 — The structured data audit (15 minutes)

Test with the Rich Results Test

Go to Google's Rich Results Test and enter your homepage URL. Note:

  • How many rich result types are detected?
  • Are there any errors or warnings?

Repeat for your FAQ page and your main blog post.

Essential schema checklist

Check for these schemas:

  • Organization on the homepage — name, logo, URL, address, contact, sameAs (links to LinkedIn, etc.)
  • Article on editorial content — title, author (with name and LinkedIn URL), dates, description
  • FAQPage on pages containing Q&As
  • BreadcrumbList on internal pages
  • WebSite with SearchAction on the homepage

Expected result: at minimum Organization + Article. Ideally all 5 schemas above.
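
As a minimal sketch of the first item on the checklist, here is an Organization schema built and serialized in Python. Every value is a placeholder to replace with your own details:

```python
import json

# Minimal Organization JSON-LD -- all values below are placeholders.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Plumbing",
    "url": "https://example.com",
    "logo": "https://example.com/logo.png",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "12 Example Street",
        "addressLocality": "Lyon",
        "addressCountry": "FR",
    },
    "contactPoint": {
        "@type": "ContactPoint",
        "email": "contact@example.com",
        "contactType": "customer service",
    },
    "sameAs": ["https://www.linkedin.com/company/example"],
}

# Paste the output into a <script type="application/ld+json"> tag
# in your homepage <head>.
print(json.dumps(organization, indent=2))
```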

Schema.org and AI: ready-to-copy code examples →

Step 5 — The verifiability audit (20 minutes)

Scan content for evidence

Go through your 5 main pages and count, for each page:

  • Quantified data — statistics, percentages, dollar amounts, time frames (target: 1 data point per 200 words)
  • Named sources — "according to Gartner", "per a McKinsey study" (not "according to experts" or "studies show")
  • Reference dates — "in 2026", "since March 2025" (not "recently" or "for some time now")
  • Concrete examples — client case studies, usage scenarios, measured results
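
The first item, data density, is easy to approximate with a regex. A rough sketch (the pattern catches numbers, percentages, and dollar amounts; the sample text is hypothetical):

```python
import re

# Matches quantified data points: plain numbers, percentages, $ amounts.
DATA_PATTERN = re.compile(r"\$?\d[\d,.]*\s*%?")

def data_density(text):
    """Return (data points, word count, data points per 200 words)."""
    words = len(text.split())
    points = len(DATA_PATTERN.findall(text))
    per_200 = points / words * 200 if words else 0
    return points, words, per_200

sample = ("Trusted by 312 companies since 2019, we cut response "
          "times by 40% and save clients $5,000 per year on average.")
points, words, per_200 = data_density(sample)
print(f"{points} data points in {words} words "
      f"({per_200:.1f} per 200 words; target: >= 1)")
```

It is a rough proxy (dates and figures count the same), but pages scoring well below 1 per 200 words almost always need more evidence.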

Identify unsourced claims

Look through your pages for claims that are not backed by evidence:

  • "We are leaders in..." — leaders by what metric?
  • "Exceptional results" — which results exactly?
  • "Many satisfied customers" — how many exactly?

Every unsourced claim is a negative signal for AI engines.

Step 6 — The E-E-A-T authority audit (10 minutes)

About page

Does your About page contain:

  • The full name of the founder/CEO?
  • A biography with background and expertise?
  • A photo?
  • Complete contact information (address, email, phone)?
  • Trust signals (client logos, certifications, press mentions)?

Identified authors

Are your blog posts signed by a named author with:

  • A link to a professional profile (LinkedIn)?
  • A short bio at the bottom of the article?

Legal disclosures

Are your legal disclosures complete: company name, EIN or state registration number, address, responsible officer?

Expected result: complete About page, identified authors, thorough legal disclosures.

Step 7 — The neutrality audit (10 minutes)

Analyze the tone

Reread your key pages with a critical eye. Look for:

  • Unsourced superlatives: "the best", "the most innovative", "unmatched", "revolutionary"
  • Vague promises: "exceptional results", "a transformative solution"
  • Manipulation: artificial urgency ("limited offer!"), social pressure ("everyone uses..."), fear ("you're losing money")

Every occurrence reduces your credibility with AI engines.

The tone test

Take the most commercial paragraph on your website and ask yourself: "Would a Wikipedia article say this?" If not, it is too promotional for AI engines.

The goal is not to eliminate all commercial language, but to ground it in facts: "Trusted by 312 companies since 2019" rather than "Market-leading solution".
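
The red-flag phrases from the checklist above can be turned into a simple scanner. The phrase list is a starting point to extend with your own sector's clichés, and the sample copy is hypothetical:

```python
# Red-flag phrases from the neutrality checklist; extend as needed.
RED_FLAGS = [
    "the best", "the most innovative", "unmatched", "revolutionary",
    "exceptional results", "transformative", "limited offer",
    "everyone uses", "you're losing money",
]

def neutrality_scan(text):
    """Return each red-flag phrase found in the text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in RED_FLAGS if phrase in lowered]

copy = "Our unmatched platform delivers exceptional results. Limited offer!"
hits = neutrality_scan(copy)
print(f"{len(hits)} red flags: {hits}")
```

A substring match will not catch every promotional pattern, but every hit it does return is a phrase worth grounding in a fact or removing.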

Step 8 — The external presence audit (15 minutes)

Check your web footprint

Search for your brand on Google (in quotes): "your company". How many results come from third-party sites (not your own)?

Also search for:

  • Your name on Reddit (via Google: site:reddit.com "your company")
  • Your company LinkedIn profile — is it active?
  • Your Google Business Profile — does it exist and is it verified?
  • Press mentions — articles that talk about you?

Evaluate diversity

Ideally, you should be mentioned on several types of sources:

  • Industry press/media
  • Forums/Reddit
  • Professional directories
  • Google Business Profile
  • LinkedIn (active company page)

Expected result: at least 3 different types of external sources.

Step 9 — The freshness audit (5 minutes)

Check the dates:

  • When was your last blog post published?
  • Do your main pages display a visible last-updated date?
  • Are there outdated dates in your content ("2024 trends", "2023 new releases")?
  • Are the dateModified fields in your Article schemas up to date?

Expected result: content updated within the last 3 months, no outdated dates.

Synthesize the results

How to prioritize

Prioritize blocking criteria first — crawlability and extractability. If bots cannot access your site or if your content is not extractable, nothing else matters.

Next, focus on high-impact criteria — verifiability and structured data. These are the levers that take a site from "invisible" to "citable".

Finally, address reinforcement criteria — authority, neutrality, external presence, freshness. They improve a site that is already citable.

Automate your audit

The manual audit we just described takes about 2 hours. It is an excellent exercise for understanding the stakes, but it is not sustainable for regular monitoring.

Detekia automates all 8 criteria in under 60 seconds: it scrapes the actual DOM of your page, analyzes each criterion, and gives you a score out of 100 with recommendations prioritized by impact.

The automated audit is especially useful for:

  • Initial diagnosis — where do you stand exactly?
  • Post-optimization tracking — did your fixes improve the score?
  • Competitive benchmarking — how do you compare to your competitors?

Run your automated GEO audit — in under 60 seconds, no signup required →