Methodology · 17 of 17 articles published
Per-check methodology
Every signal agent.opensverige.se measures has a deep-dive: why it matters for AI agents, how to fix it, common false positives, and primary sources you can cite. Open source under FSL-1.1-MIT.
Discovery
Can AI agents find and read the site?
Does your WAF actually let AI crawlers reach your pages?
critical · crawler-access tests whether your CDN, WAF or bot manager (Cloudflare, Akamai, Fastly, Imperva) returns 200 to GPTBot, ClaudeBot and PerplexityBot at the edge. A permissive robots.txt does not help if the request never reaches your origin. This is the most common silent failure mode on Cloudflare-fronted sites.
1,380 tokens · updated 2026-05-10
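A minimal sketch of how such an edge probe could be set up, assuming simplified User-Agent values (the operators publish longer strings) and a rough status-code heuristic. `AI_CRAWLERS`, `build_probe` and `classify` are hypothetical names, not part of the checker.

```python
from urllib.request import Request

# Assumption: shortened, illustrative User-Agent values, not the full published strings.
AI_CRAWLERS = {
    "GPTBot": "GPTBot/1.0",
    "ClaudeBot": "ClaudeBot/1.0",
    "PerplexityBot": "PerplexityBot/1.0",
}

def build_probe(url: str, user_agent: str) -> Request:
    """Prepare (but do not send) a GET request that identifies as the given crawler."""
    return Request(url, headers={"User-Agent": user_agent}, method="GET")

def classify(status: int) -> str:
    """Rough heuristic for reading the edge response; the thresholds are assumptions."""
    if status == 200:
        return "reachable"
    if status in (401, 403, 429, 503):
        return "likely blocked at the edge"
    return "inconclusive"
```

Sending each request with `urllib.request.urlopen` and comparing the results against a normal browser User-Agent would show whether the edge treats crawlers differently.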
Do you serve text/markdown when an agent sends Accept: text/markdown?
critical · markdown-negotiation checks whether your origin returns Markdown when a request carries Accept: text/markdown, either by content negotiation on the same URL or by serving a parallel .md path (e.g. /docs/api and /docs/api.md). Markdown costs roughly an order of magnitude fewer tokens than HTML for the same content, which directly reduces what AI agents pay to read your site.
1,380 tokens · updated 2026-05-10
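The negotiation step can be sketched as a small server-side decision. This is a simplified illustration that only looks for text/markdown with a non-zero q-value and does not rank it against other media types. `wants_markdown` and `choose_representation` are hypothetical helper names.

```python
def wants_markdown(accept_header: str) -> bool:
    """True if text/markdown appears in the Accept header with q > 0."""
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "text/markdown":
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.lower().startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False

def choose_representation(accept_header: str) -> str:
    """Pick the Content-Type to serve for a given Accept header."""
    return "text/markdown" if wants_markdown(accept_header) else "text/html"
```

A response chosen this way should also carry `Vary: Accept` so caches keep the HTML and Markdown variants apart.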
Is critical content present in the initial HTML, not JS-rendered?
critical · ssr-content tests whether a fresh HTTP fetch with no JavaScript execution returns your page's actual content in the response body. GPTBot, ClaudeBot, OAI-SearchBot, Claude-User and PerplexityBot do not run JavaScript. A site that hydrates content client-side serves them an empty shell, regardless of how many other checks pass.
1,280 tokens · updated 2026-05-10
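One way to approximate what a non-JavaScript crawler sees is to count the words left in the raw HTML once script, style and template content is ignored. The sketch below uses only the standard library; `VisibleText` and `visible_words` are hypothetical names, and any pass/fail threshold on the count would be an assumption.

```python
from html.parser import HTMLParser

SKIPPED = ("script", "style", "template")

class VisibleText(HTMLParser):
    """Collects text that is present in the markup without running any JavaScript."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def visible_words(html: str) -> int:
    """Number of whitespace-separated words outside script/style/template elements."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())
```

A client-rendered shell typically scores near zero here, while a server-rendered page returns a count close to the length of the article.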
Do you publish /llms-full.txt for single-fetch agent indexing?
important · llms-full-txt checks for a file at /llms-full.txt that contains the full Markdown content of every documentation page concatenated into one URL. Per the llmstxt.org spec it is the companion to llms.txt: llms.txt indexes, llms-full.txt is the bulk corpus. Cloudflare, Anthropic, Perplexity and Stripe all publish one.
1,180 tokens · updated 2026-05-10
Do you publish a /llms.txt index per the llmstxt.org spec?
important · llms-txt checks for a file at /llms.txt at your apex domain that follows the llmstxt.org specification: a single H1, an optional blockquote summary, and H2-grouped lists of Markdown links to the pages an LLM should read. It is a task-organised, agent-friendly index. It does not replace robots.txt or sitemap.xml.
1,450 tokens · updated 2026-05-10
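The structure described above (one H1, optional blockquote, H2 sections with Markdown link lists) can be inspected with a few regular expressions. This is a rough sketch, not a full Markdown parser, and `inspect_llms_txt` is a hypothetical helper name.

```python
import re

def inspect_llms_txt(text: str) -> dict:
    """Summarise the structural elements of an llms.txt document."""
    lines = text.splitlines()
    non_empty = [line for line in lines if line.strip()]
    h1 = [line for line in lines if re.match(r"# \S", line)]
    sections = [line[3:].strip() for line in lines if re.match(r"## \S", line)]
    links = [line for line in lines
             if re.match(r"\s*[-*] \[[^\]]+\]\([^)\s]+\)", line)]
    return {
        "h1_count": len(h1),
        # The H1 is expected to be the first non-empty line.
        "h1_first": bool(non_empty) and non_empty[0].startswith("# "),
        "has_summary": any(line.startswith("> ") for line in lines),
        "sections": sections,
        "link_count": len(links),
    }
```

A file that follows the description above would report `h1_count == 1`, `h1_first` as true, at least one section and a non-zero `link_count`.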
Does your robots.txt allow AI agents to read your site?
important · robots-ok checks whether /robots.txt at your apex domain explicitly permits the AI crawlers that operators publish product tokens for: GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, Google-Extended. A blanket User-agent: * Disallow: / blocks every one of them.
1,450 tokens · updated 2026-05-10
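The effect of a robots.txt on each product token can be evaluated offline with Python's standard `urllib.robotparser`. The sketch below reports effective access only; it does not distinguish an explicit Allow from the absence of a rule. `blocked_agents` and the default URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot", "Perplexity-User", "Google-Extended",
]

def blocked_agents(robots_txt: str, url: str = "https://example.se/") -> list:
    """Return the AI product tokens that may not fetch `url` under this robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]
```

For example, `blocked_agents("User-agent: *\nDisallow: /")` returns the full list, which matches the blanket-disallow case described above.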
Do you publish a sitemap.xml that agents can actually find?
info · sitemap-exists checks whether you serve an XML sitemap at a conventional path (/sitemap.xml or /sitemap_index.xml) and reference it from robots.txt with a Sitemap: directive. A sitemap is the cheapest signal an AI crawler has for "what URLs exist on this domain" and it costs almost nothing to generate.
1,100 tokens · updated 2026-05-10
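Both halves of this check, the conventional paths and the Sitemap: directive, can be sketched without any network access. `candidate_urls` and `sitemap_urls` are hypothetical helper names.

```python
from urllib.parse import urljoin

CONVENTIONAL_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]

def candidate_urls(origin: str) -> list:
    """Conventional sitemap locations to try for an origin such as https://example.se."""
    return [urljoin(origin, path) for path in CONVENTIONAL_PATHS]

def sitemap_urls(robots_txt: str) -> list:
    """Extract the values of all Sitemap: directives from a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls
```

A site passes both parts when one of the candidate URLs returns an XML sitemap and `sitemap_urls` finds a matching entry in robots.txt.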
EU compliance
Does it meet EU regulatory requirements?
Do you mark AI-generated content per EU AI Act Article 50?
critical · ai-content-marking checks whether AI-generated synthetic content on your site is marked in a machine-readable format per EU AI Act Article 50(2). The provider obligation applies to audio, image, video and text outputs of generative systems, including general-purpose AI. Article 50 enters into application on 2 August 2026 per Article 113. The recommended technical implementations are C2PA Content Credentials and the IPTC Digital Source Type vocabulary.
1,620 tokens · updated 2026-05-10
Does your cookie banner block AI bots from reading content?
important · cookie-bot-handling checks whether your cookie consent gate blocks legitimate AI crawlers from reaching content. Bots are not data subjects under the GDPR, so ePrivacy Article 5(3) consent does not apply to them. Yet many CMPs render a full-page consent wall to every visitor, AI bots included, which means the body the crawler sees is a banner instead of the article.
1,280 tokens · updated 2026-05-10
Does your privacy policy address GDPR Article 22 automated decisions?
important · privacy-automation checks whether your published privacy policy discloses automated individual decision-making per GDPR Articles 13(2)(f) and 14(2)(g): the existence of the processing, meaningful information about the logic involved, and the significance and envisaged consequences. Required wherever a solely automated decision produces legal or similarly significant effects.
1,480 tokens · updated 2026-05-10
Builder surface
Can devs and agents build against it?
Do you expose a public REST API agents can call programmatically?
important · api-exists checks whether your service exposes a documented, network-reachable HTTP API that returns JSON to a Bearer-authenticated client. Agents that need to act on your service (place orders, query state, submit data) cannot infer behaviour from HTML; they need a typed contract. A public API is the minimum surface for any agent integration beyond reading.
1,280 tokens · updated 2026-05-10
Do you expose an MCP server so agents can call your API as tools?
important · mcp-server checks whether your service exposes a Model Context Protocol server that responds to the standard JSON-RPC 2.0 initialize / tools/list / tools/call lifecycle. The current stable revision is 2025-11-25 and the supported transports are stdio (for local clients) and Streamable HTTP (for remote clients). HTTP+SSE was deprecated in the 2025-03-26 revision.
1,620 tokens · updated 2026-05-10
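The lifecycle can be illustrated by the JSON-RPC 2.0 messages a client would send, using the method names from the MCP specification (`initialize`, `tools/list`, `tools/call`). The sketch below only builds the messages and sends nothing; the client name, tool name and arguments are placeholders, and `lifecycle` is a hypothetical helper.

```python
import json
from itertools import count

def lifecycle(tool_name: str, arguments: dict) -> list:
    """Build the client messages for initialize, tools/list and tools/call."""
    ids = count(1)

    def request(method, params=None):
        message = {"jsonrpc": "2.0", "id": next(ids), "method": method}
        if params is not None:
            message["params"] = params
        return message

    return [
        request("initialize", {
            "protocolVersion": "2025-11-25",
            "capabilities": {},
            # Placeholder client identity, chosen for illustration only.
            "clientInfo": {"name": "example-probe", "version": "0.1.0"},
        }),
        # Notifications carry no id and expect no response.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        request("tools/list"),
        request("tools/call", {"name": tool_name, "arguments": arguments}),
    ]

if __name__ == "__main__":
    for message in lifecycle("search", {"query": "pricing"}):
        print(json.dumps(message))
```

Over Streamable HTTP each message would be sent as the body of a POST to the server's MCP endpoint; over stdio it would be written as one line to the server process.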
Do you publish an OpenAPI 3.x spec at a discoverable path?
important · openapi-spec checks whether your API publishes a machine-readable OpenAPI 3.x document at a conventional discoverable path (/openapi.yaml, /openapi.json, /api-docs/openapi.json). OpenAPI 3.1 (released 2021-02-15, with the 3.1.1 patch on 2024-10-24 and 3.2.0 on 2025-09-19) is fully aligned with JSON Schema 2020-12. Agents generate clients and validate requests directly from it.
1,320 tokens · updated 2026-05-10
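A minimal sketch of the discovery step: list the conventional paths for an origin and confirm that a fetched JSON document declares an OpenAPI 3.x version. YAML documents are left out because the standard library has no YAML parser; `spec_candidates` and `openapi_3_version` are hypothetical names.

```python
import json
from urllib.parse import urljoin

DISCOVERY_PATHS = ["/openapi.yaml", "/openapi.json", "/api-docs/openapi.json"]

def spec_candidates(origin: str) -> list:
    """Conventional OpenAPI locations to try for an origin."""
    return [urljoin(origin, path) for path in DISCOVERY_PATHS]

def openapi_3_version(document_text: str):
    """Return the declared version if the JSON document is OpenAPI 3.x, else None."""
    try:
        document = json.loads(document_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(document, dict):
        return None
    version = document.get("openapi")
    if isinstance(version, str) and version.startswith("3."):
        return version
    return None
```

A Swagger 2.0 document declares `swagger: "2.0"` rather than an `openapi` field, so it returns `None` here.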
Do you publish human-readable API docs alongside the OpenAPI spec?
info · api-docs checks whether your API has human-readable documentation rendered alongside the OpenAPI spec, with authentication, errors, rate limits, runnable code samples in multiple languages, and a versioned changelog. The OpenAPI spec is the contract; the docs are how a developer (or an agent on a developer's behalf) decides to use the API.
1,180 tokens · updated 2026-05-10
Do you publish a server-card.json describing your MCP capabilities?
info · mcp-server-card checks whether your service publishes a JSON document at /.well-known/mcp/server-card.json describing your MCP server: name, version, transports, headers, repository, capabilities. SEP-1649 proposed this. The SEP is closed-draft pending ratification. Cloudflare ships a real server-card.json ahead of formal approval; most other providers do not yet.
1,180 tokens · updated 2026-05-10
Do you publish /.well-known/mcp for agent auto-discovery?
info · mcp-well-known checks whether your domain serves a discovery document at /.well-known/mcp so agents pointed at example.se can locate your MCP server without manual configuration. The endpoint is proposed in SEP-1960 against the MCP specification repository. As of May 2026 the SEP is closed and not ratified; no major provider implements it. The check is forward-looking.
1,080 tokens · updated 2026-05-10
Do you offer a sandbox environment so builders can test safely?
info · sandbox-available checks whether you offer a separate test environment with isolated credentials, isolated data, and documented webhook test events. Stripe's sandbox uses key prefixes (sk_test_, pk_test_, rk_test_) versus live (sk_live_, pk_live_, rk_live_), with a sandbox rate limit of 25 operations per second versus 100 in live mode. Without a sandbox, agents and integrators practise on production data, which is how outages start.
1,200 tokens · updated 2026-05-10
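The prefix convention makes it cheap for an integration to refuse live credentials in a test run. A small sketch, assuming only the six prefixes listed above; `key_mode` and `require_sandbox` are hypothetical helpers and the keys in the usage note are made-up placeholders.

```python
# Prefixes as listed above: secret (sk), publishable (pk) and restricted (rk) keys.
PREFIXES = {
    "sk_test_": "sandbox", "pk_test_": "sandbox", "rk_test_": "sandbox",
    "sk_live_": "live", "pk_live_": "live", "rk_live_": "live",
}

def key_mode(api_key: str) -> str:
    """Return 'sandbox', 'live' or 'unknown' based on the key prefix."""
    for prefix, mode in PREFIXES.items():
        if api_key.startswith(prefix):
            return mode
    return "unknown"

def require_sandbox(api_key: str) -> None:
    """Raise before any request is made if the key is not a sandbox key."""
    mode = key_mode(api_key)
    if mode != "sandbox":
        raise ValueError(f"refusing to run tests with a {mode} key")
```

Calling `require_sandbox("sk_test_placeholder")` returns silently, while `require_sandbox("sk_live_placeholder")` raises before any request reaches production.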