The Detekia GEO score is not a black box. Behind every rating from 0 to 100, there are 8 specific criteria measured automatically from your site analysis. Understanding how each criterion is evaluated, why it matters, and how to improve it — that's what this article is about.
This methodology is the result of months of research into what actually determines whether a site gets cited by LLMs. It draws on academic work in GEO (notably Aggarwal et al., 2023, presented at KDD 2024), Google's E-E-A-T guidelines, and empirical observations from hundreds of audits.
→ How to interpret your overall GEO score
Overview of the 8 criteria
The 8 criteria are organized into three layers:
- Technical layer (what AI can read): Extractability, AI Crawlability
- Semantic layer (what AI understands): Structured Data, Freshness
- Trust layer (what AI values): Verifiability, E-E-A-T Authority, Editorial Neutrality, External Presence
Each criterion is scored from 0 to 100. The overall score is a weighted average. The weights reflect the empirical impact of each criterion on the probability of being cited.
The 8 criteria in detail
Extractability

What it is: The ability of AI engines to easily extract factual information from your content.
How it's measured: Analysis of content structure — presence of clear headings (H1, H2, H3), bullet lists, numerical data, and explicit definitions. We measure information density and organizational clarity.
Why it matters: LLMs work by pattern extraction. Dense, poorly structured text will be paraphrased loosely or ignored entirely. Content with clearly presented facts will be cited verbatim.
How to improve:
- Structure content with hierarchical headings (H2 for sections, H3 for sub-points)
- Transform dense paragraphs into bullet lists where possible
- Include precise numerical data (not "a lot" but "72%")
- Use summary boxes at the end of each section
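The structural signals above can be counted mechanically. Here is a rough sketch using Python's standard-library HTML parser: it tallies H2/H3 headings, list items, and explicit numeric facts in a page. The thresholds and the exact signal set are illustrative assumptions, not Detekia's actual scoring logic.

```python
from html.parser import HTMLParser
import re

class ExtractabilitySignals(HTMLParser):
    """Counts structural signals that make content easy for LLMs to extract."""
    def __init__(self):
        super().__init__()
        self.counts = {"h2": 0, "h3": 0, "li": 0}
        self.numbers = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1

    def handle_data(self, data):
        # Count explicit numeric facts ("72%", "29", ...) — not "a lot"
        self.numbers += len(re.findall(r"\d+%?", data))

html = (
    "<h2>Pricing</h2>"
    "<ul><li>Starter: 29 EUR/month</li><li>Pro: 79 EUR/month</li></ul>"
    "<h3>Why it matters</h3><p>72% of users prefer clear pricing.</p>"
)
parser = ExtractabilitySignals()
parser.feed(html)
print(parser.counts, parser.numbers)
```

More headings, more list items, and more numeric facts all push the page toward the "cited verbatim" end of the spectrum described above.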
Verifiability

What it is: The extent to which your claims can be verified by the AI or its users.
How it's measured: Presence of cited sources (external links to studies, official data), dates on information, identified authors, explained methodologies. We also check that outbound links point to recognized sources.
Why it matters: LLMs are trained to favor verifiable information. An unsourced claim is perceived as less reliable than a sourceable one. Citing studies increases the probability that AI will reuse your exact phrasing.
How to improve:
- Cite the studies and reports you reference (link + author + year)
- Clearly date your content ("last updated: March 2026")
- Mention primary sources for statistics
- Avoid unsourced assertions ("experts agree that...")
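Simple text heuristics can approximate these checks. The sketch below flags an update date, outbound links, and "(Author et al., year)" style citations; the patterns and the sample text (including the example.org URL) are illustrative assumptions.

```python
import re

def verifiability_signals(text: str) -> dict:
    """Rough heuristics for verifiable content: dates, outbound links, named studies."""
    return {
        "has_update_date": bool(re.search(r"last updated:\s*\w+ \d{4}", text, re.I)),
        "outbound_links": len(re.findall(r"https?://[^\s)\"]+", text)),
        "cited_studies": len(re.findall(r"\(\w+ et al\., \d{4}\)", text)),
    }

sample = (
    "Last updated: March 2026. Citation rates rose 40% (Aggarwal et al., 2023); "
    "see https://example.org/study for the full methodology."
)
print(verifiability_signals(sample))
```

A page where all three signals are zero is exactly the "unsourced assertion" case the criterion penalizes.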
E-E-A-T Authority

What it is: The Experience, Expertise, Authoritativeness, and Trustworthiness of the site and its authors — Google's framework adopted by LLMs.
How it's measured: Presence of a detailed "About" page, author bios with credentials, mentions of partners/clients/certifications, accessible contact page, privacy policy, quality inbound links.
Why it matters: AI engines cite trustworthy sources. A site with no identified author, no "About" page, and no legitimacy signals will systematically be deprioritized in favor of a competitor that has them.
How to improve:
- Create an "About" page with the company story and credentials
- Add author bios to every article (name, role, expertise)
- Mention recognized partners, certifications, or clients
- Make sure the contact page and terms of service are easily accessible
AI Crawlability

What it is: The ability of AI bots to access and read your site.
How it's measured: Analysis of robots.txt for AI user-agents (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended), presence of an llms.txt file, page load speed, accessibility of key pages.
Why it matters: A site that blocks AI bots in its robots.txt will simply be ignored. A slow site or one with JavaScript-only content will be partially read. This is an absolute prerequisite.
How to improve:
- Verify that GPTBot, ClaudeBot, and PerplexityBot are not blocked in robots.txt
- Create an llms.txt file with a site summary and key pages
- Ensure important content is in rendered HTML (not only in client-side JS)
- Maintain an up-to-date XML sitemap
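You can run the first check yourself with Python's standard-library robots.txt parser. The robots.txt content below is a deliberately broken example (it blocks GPTBot); the bot list comes from the criteria above.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

# Example robots.txt that mistakenly blocks one AI bot
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

for bot in AI_BOTS:
    status = "BLOCKED" if not rp.can_fetch(bot, "/blog/article") else "allowed"
    print(f"{bot}: {status}")
```

In production you would call `rp.set_url("https://yoursite.com/robots.txt")` and `rp.read()` instead of parsing a literal string; the point is that a single `Disallow: /` rule for one AI user-agent zeroes out this criterion.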
→ Full technical guide: robots.txt and llms.txt for AI bots
Structured Data

What it is: The presence and quality of Schema.org markup in JSON-LD on key pages.
How it's measured: Detection and validation of JSON-LD schemas (Organization, WebSite, Article, FAQPage, Product, BreadcrumbList, LocalBusiness). We check for presence, completeness, and consistency with the page content.
Why it matters: JSON-LD schemas are designed specifically so machines can understand content without ambiguity. A well-populated FAQPage schema will be extracted directly by LLMs to answer user questions.
How to improve:
- Add Organization on the homepage
- Add Article on every blog post
- Add FAQPage on pages that contain Q&A content
- Validate with Google's Rich Results Test tool
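As a minimal sketch, an Article JSON-LD block and a presence check for its key fields might look like this. The headline, dates, and author name are placeholders, and the `REQUIRED` field list is an illustrative subset, not the full Schema.org specification.

```python
import json

# Minimal Article JSON-LD — field values are placeholders for illustration
article_schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "The 8 criteria behind the GEO score",
    "datePublished": "2026-03-01",
    "dateModified": "2026-03-15",
    "author": {"@type": "Person", "name": "Jane Doe"},
}

# Completeness check on an illustrative subset of important fields
REQUIRED = ["@context", "@type", "headline", "datePublished", "author"]
missing = [f for f in REQUIRED if f not in article_schema]
print("valid" if not missing else f"missing: {missing}")
print(json.dumps(article_schema, indent=2))
```

The serialized JSON goes into a `<script type="application/ld+json">` tag in the page head; Google's Rich Results Test then validates it against the real specification.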
→ Schema.org and AI: the 5 priority schemas for GEO
Editorial Neutrality

What it is: Your content's ability to inform objectively, without excessive commercial promotion.
How it's measured: Language analysis — density of superlatives ("best," "revolutionary," "incredible"), presence of pros/cons arguments, honest mentions of product/service limitations, informative vs. persuasive tone.
Why it matters: AI engines avoid citing content perceived as marketing. They favor sources that resemble encyclopedias or expert guides. An article that presents nuance and acknowledges limitations is more credible than one that's exclusively positive.
How to improve:
- Replace superlatives with measurable facts
- Include "limitations" sections or "who this product/service is NOT for"
- Present existing alternatives when relevant
- Avoid phrasing like "the best solution on the market"
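Superlative density, mentioned above as a measured signal, can be sketched as a word-ratio heuristic. The word list and the metric itself are simplified assumptions; a real analysis would use a larger lexicon and handle multi-word phrases.

```python
import re

# Illustrative lexicon of promotional superlatives
SUPERLATIVES = {"best", "revolutionary", "incredible", "ultimate", "unmatched"}

def superlative_density(text: str) -> float:
    """Share of words that are promotional superlatives (rough neutrality proxy)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in SUPERLATIVES)
    return hits / len(words)

print(superlative_density("The best, most revolutionary tool ever built."))
print(superlative_density("Pricing starts at 29 EUR/month; cancellation takes one click."))
```

The second sentence scores zero precisely because it replaces superlatives with measurable facts, which is the improvement this criterion rewards.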
External Presence

What it is: Mentions and citations of your brand/site on other platforms.
How it's measured: Detection of quality backlinks, mentions on third-party platforms (LinkedIn, specialized forums, industry publications), presence on Wikipedia or recognized directories, citations in other articles.
Why it matters: LLMs were trained on a large web corpus. If your brand is mentioned in many independent sources, the AI knows it and trusts it. External presence is a proxy for perceived authority.
How to improve:
- Publish original studies or data that others will cite
- Contribute to industry publications (guest posts, interviews)
- Be listed in relevant directories for your industry
- Encourage testimonials and reviews on third-party platforms
Freshness

What it is: How recent and regularly updated your content is.
How it's measured: Publication and modification dates of pages, frequency of new content publication, presence of dates in Schema.org markup and visible HTML.
Why it matters: AI engines prefer recent information for evolving topics. A 2021 article about AI will be cited less than a 2026 article, even if the content is similar. Freshness matters less for stable subjects (mathematics, history) than for technology topics.
How to improve:
- Update existing articles regularly (and indicate it with an update date)
- Publish new content at least once a month
- Include dates in the Article schema (datePublished + dateModified)
- Mention the year in titles for time-sensitive topics
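One simple way to model freshness is exponential decay on the `dateModified` age. The half-life formula below is an illustrative assumption, not Detekia's actual scoring function, but it captures the idea that a just-updated page scores full marks while an old one decays.

```python
from datetime import date

def freshness_score(date_modified: date, today: date, half_life_days: int = 365) -> float:
    """Illustrative decay: 1.0 when just updated, 0.5 after one half-life (assumed 1 year)."""
    age_days = (today - date_modified).days
    return 0.5 ** (age_days / half_life_days)

print(freshness_score(date(2026, 3, 15), date(2026, 3, 15)))  # updated today → 1.0
print(freshness_score(date(2025, 3, 15), date(2026, 3, 15)))  # one year old → 0.5
```

For stable subjects (mathematics, history) you would simply use a much longer half-life, which matches the observation above that freshness matters less there.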
How the criteria add up
The overall score is a weighted average. But there's an important subtlety: the technical criteria (Extractability, Crawlability) act as prerequisites. A site blocking AI bots in its robots.txt will get a Crawlability score of 0, which mechanically caps its overall score — no matter how good the content is.
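The aggregation with its prerequisite cap can be sketched as follows. The weights and the cap value of 30 are assumptions chosen for illustration; Detekia's actual parameters are not public.

```python
# Hypothetical weights for illustration — not Detekia's actual (unpublished) weights
WEIGHTS = {
    "extractability": 0.18, "crawlability": 0.15, "structured_data": 0.12,
    "freshness": 0.08, "verifiability": 0.13, "eeat": 0.14,
    "neutrality": 0.08, "external_presence": 0.12,
}

def geo_score(scores: dict) -> float:
    """Weighted average of the 8 criteria, capped when a technical prerequisite fails."""
    base = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    if scores["crawlability"] == 0:
        # Blocked AI bots cap the overall score no matter how good the content is
        base = min(base, 30.0)  # cap value is an assumption for illustration
    return round(base, 1)

print(geo_score({k: 80 for k in WEIGHTS}))                            # 80 everywhere
print(geo_score({**{k: 80 for k in WEIGHTS}, "crawlability": 0}))     # bots blocked
```

In the second call, strong content scores cannot lift the result past the cap, which is exactly why unblocking AI bots comes first in the optimization order below.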
Recommended optimization order:
- Unblock AI bots (Crawlability) — absolute prerequisite
- Structure your content (Extractability) — highest immediate impact
- Add priority schemas (Structured Data) — technical quick win
- Strengthen authority (E-E-A-T) — medium-term investment
- Source your claims (Verifiability) — continuous improvement
- Adjust the tone (Neutrality) — review and rewrite
- Build external presence — long-term work
- Maintain freshness — editorial discipline
What the score doesn't measure
The Detekia GEO score measures potential citability. It does not measure:
- Whether you're already being cited — for that, you need to test directly in ChatGPT, Perplexity, etc.
- Content quality — a factually incorrect but well-structured article can score well technically
- Topic coverage volume — one excellent article vs. a site with 50 mediocre ones
- Query popularity — being citable on a topic nobody searches for won't drive traffic
That's why the GEO score should be interpreted as citability potential, not a guarantee of citations. A complete strategy combines technical optimization (GEO score) with editorial strategy (topics to cover) and distribution (external presence).
→ Analyze your GEO score for free
Frequently asked questions about the methodology
Is the GEO score valid for all types of websites?
The methodology was designed for B2B and B2C sites with editorial content (blogs, guides, product pages). It's less relevant for pure web applications (SaaS with no public content) or highly technical sites without a general audience.
How often should I re-evaluate my score?
After every significant optimization (technical overhaul, new articles, schema updates), and at minimum once per quarter. LLM algorithms evolve, and what's optimal today can change.
Does the GEO score replace the SEO score?
No — the two scores measure complementary things. A strong SEO score (domain authority, backlinks, Google rankings) contributes to the GEO score (external presence, verifiability). But pages that rank well on Google can still have a poor GEO score if the content isn't extractable by AI engines.