AI Crawler Visibility & llms.txt Audit
Can AI crawlers actually find and understand your site? We simulate real bot requests from GPTBot, ClaudeBot, PerplexityBot, and Google-Extended to verify crawler access, detect cloaking, analyze machine-readable signals, and assess llms.txt quality when present.
What We Test
- llms.txt file (emerging convention) — presence, HTTP status, markdown structure, broken links, and content quality when the file exists
- AI bot simulation — real HTTP requests with GPTBot, ClaudeBot, ChatGPT-User, OAI-SearchBot, PerplexityBot, and Google-Extended user agents
- robots.txt parsing — per-bot Disallow/Allow rule evaluation for training and search-tier crawlers
- Cloaking detection — comparing browser vs. bot responses for differences in status, content type, title, and body
- Sitemap accessibility — detecting the sitemap.xml location and checking robots.txt blocking per bot
- Schema.org structured data — JSON-LD validation, Organization/WebSite/FAQPage schemas, template placeholder detection
- Semantic HTML structure — heading hierarchy (H1/H2/H3), main landmark, FAQ sections
- Freshness signals — dateModified, article:published_time, Last-Modified header
- AI interpretation match — an AI model reads your llms.txt and compares its understanding with your meta description
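The cloaking check above can be sketched as a simple diff between a browser response and a bot response. The function name, the signals compared, and the 50% body-size threshold here are illustrative assumptions, not the audit's actual implementation:

```python
import re

def cloaking_signals(browser: dict, bot: dict) -> list[str]:
    """Compare a browser response and a bot response (dicts with
    'status', 'content_type', 'body') and list rough cloaking signals."""
    diffs = []
    if browser["status"] != bot["status"]:
        diffs.append(f"status: {browser['status']} vs {bot['status']}")
    if browser["content_type"] != bot["content_type"]:
        diffs.append("content-type differs")

    def title(html: str) -> str:
        m = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
        return m.group(1).strip() if m else ""

    if title(browser["body"]) != title(bot["body"]):
        diffs.append("title differs")
    # Large body-size gaps often indicate bot-specific rendering or blocking.
    longest = max(len(browser["body"]), 1)
    if abs(len(browser["body"]) - len(bot["body"])) > 0.5 * longest:
        diffs.append("body size differs by >50%")
    return diffs
```

In practice the two responses would come from fetching the same URL twice, once with a normal browser user agent and once with a bot user agent such as GPTBot's.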
Why This Matters for SEO & AI Discovery
AI assistants are becoming a meaningful discovery channel. If your site blocks GPTBot or serves different content to bots, you become harder to surface in AI-generated answers. llms.txt can help by giving models cleaner guidance, but it is an emerging convention rather than a mandatory web standard. We focus first on crawler access and machine-readable quality, then evaluate llms.txt as a supporting signal when available.
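As a concrete illustration (the paths are examples, not recommendations), a robots.txt that grants AI crawlers access while still protecting private routes might look like:

```
# Explicitly allow AI crawlers site-wide
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

# Default rules for all other crawlers
User-agent: *
Disallow: /admin/
```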
AI-Powered Analysis & Automated Fixes
When we detect AI visibility issues, we provide actionable fixes with context. For blocked bots, we identify the exact rule or status pattern and suggest allowlist changes. For cloaking detection, we flag differences between browser and bot responses. For weak machine-readable signals, we suggest schema and structure fixes. If llms.txt is missing, we can generate a starter template, but we do not treat that file as the only path to good AI visibility.
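A starter llms.txt is just a short markdown file: an H1 with the site name, a blockquote summary, and H2 sections of curated links. The content below is a hypothetical example, not a required schema:

```markdown
# Example Co

> Example Co builds project-tracking widgets for small teams.

## Docs

- [Getting started](https://example.com/docs/start): setup in five minutes
- [API reference](https://example.com/docs/api): endpoints and authentication

## Optional

- [Blog](https://example.com/blog): product updates and changelogs
```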
How We Grade
- Bots accessible, machine-readable signals strong, no cloaking; llms.txt optional but useful
- Mixed crawler access or weak machine-readable signals needing cleanup
- Multiple AI bots blocked or cloaking detected
The free audit includes per-check scores and an overall grade. Unlock full details, fixes, and deep AI analysis for $9 per report.
Explore Other Checks
- HSTS, CSP, DNSSEC, and mixed content detection
- Mobile viewport rendering and touch target validation
- OG tags and social card validation across platforms
- Console error detection with severity classification
- PageSpeed/Lighthouse lab metrics with separate CrUX data when available