Free · 5 minutes · Beginner

AI Influence — which AI bots can crawl you and which AI products send you traffic

The AI Influence dashboard has three tabs: Overview (posture + robots/signals), Crawlers (the full read-only catalogue), and Metrics (AI referral analytics).

What is AI Influence?

AI Influence shows how AI surfaces interact with your site. It covers two things you can actually measure:

  1. AI crawler access — which of 29 tracked AI bots your robots.txt, llms.txt and Content-Signal declarations allow or block.
  2. AI referrals — which AI products (ChatGPT, Claude, Perplexity, Gemini, Copilot, DeepSeek, and 20+ others) send you human visitors.

Open your website's dashboard and select AI Influence in the Behavior group of the left sidebar. It's free on every plan. The view has three tabs: Overview, Crawlers, and Metrics.

The Overview tab

The Overview tab shows a compact Summary (how many of the 29 bots can crawl you, how many are blocked, and how many human visits came from AI products) and a Robots & Signals card that lets you verify the verdict yourself:

  • View raw links open your actual robots.txt and llms.txt.
  • The Content-Signal declaration is parsed and shown as chips.
  • Every bot blocked at the site root is listed by name.

What is Content-Signal?

Content-Signal is a directive (emitted by Cloudflare's "Managed robots.txt" and a growing number of sites) that states AI-usage intent separately from crawl access:

  • search — may your content build a search index?
  • ai-input — may it be used as live input to a generative answer?
  • ai-train — may it be used to train an AI model?

Each shows as yes, no, or unset.
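
As a rough illustration of how the three signals above can be read from a robots.txt file, the sketch below reports each one as yes, no, or unset. It assumes the comma-separated name=value form (for example "search=yes, ai-train=no") and ignores which User-agent group the line belongs to. The function name parse_content_signal is hypothetical and is not Zenovay's parser.

```python
from typing import Dict

SIGNALS = ("search", "ai-input", "ai-train")


def parse_content_signal(robots_txt: str) -> Dict[str, str]:
    """Return "yes", "no" or "unset" for each signal named in the article.

    Simplification (assumption): every Content-Signal line in the file is
    read, regardless of its User-agent group; later lines override earlier ones.
    """
    result = {name: "unset" for name in SIGNALS}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()          # drop comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            name, eq, setting = pair.partition("=")
            name, setting = name.strip().lower(), setting.strip().lower()
            if eq and name in result and setting in ("yes", "no"):
                result[name] = setting
    return result


example = "User-agent: *\nContent-Signal: search=yes, ai-train=no\nAllow: /\n"
signals = parse_content_signal(example)
```

Here ai-input stays unset because the example line does not mention it, which matches the third state described above.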

The Crawlers tab

A read-only table of all 29 tracked AI crawlers and each bot's access on your site. It is not a control panel: Zenovay reads robots.txt but does not block bots at the edge. You can search, sort, and filter by category (Live AI Assistants, Model Training, AI Browser Agents, Commercial Scrapers, Search with AI Overlays).

What each verdict means

The only question that matters is "can this bot reach my content?", so each verdict is one of three states:

  • Allowed (green) — permitted at your site root. This includes bots that are root-allowed but excluded from generic infrastructure paths like /api or /_next — those exclusions are shown as a note, not a warning, because they don't restrict your content.
  • Blocked (red) — disallowed at the site root by robots.txt.
  • Unknown (grey) — we couldn't read your robots.txt, or the site hasn't had its first check yet (within 24 hours of being added).

Re-checking on demand

The Re-check now button re-fetches your robots.txt + llms.txt, re-evaluates all 29 bots, and runs 5 live HTTP HEAD probes (one per category) to catch Cloudflare WAF / CDN bot blocks that aren't visible in robots.txt. Rate-limited to once per minute per site.
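
If you want to run a similar check by hand, a HEAD probe can be sketched as below. This is an approximation under stated assumptions, not Zenovay's probe: the status-code buckets in classify_status are a guess at what usually indicates an edge block, and the User-Agent string in the usage note is a placeholder rather than any vendor's official string.

```python
import urllib.error
import urllib.request
from typing import Optional


def head_status(url: str, user_agent: str, timeout: float = 10.0) -> Optional[int]:
    """Send one HTTP HEAD request and return the status code, or None on network failure."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:      # 4xx/5xx responses arrive as exceptions
        return err.code
    except (urllib.error.URLError, TimeoutError):
        return None


def classify_status(status: Optional[int]) -> str:
    """Map a status code to a coarse verdict (the buckets are assumptions)."""
    if status is None:
        return "unreachable"
    if status in (401, 403, 429):
        return "likely blocked"
    if 200 <= status < 400:
        return "reachable"
    return "inconclusive"
```

For example, classify_status(head_status("https://yourdomain.com/", "Mozilla/5.0 (compatible; GPTBot/1.0)")) returning "likely blocked" while robots.txt allows the bot would suggest a WAF or CDN rule is doing the blocking.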

The Metrics tab

AI referral analytics — human visitors who arrived from AI products. Every detected AI source is listed with visitors, share, conversion and (if enabled) revenue, plus a Detection method distribution that reconciles with the source list.

How we identify AI referrals

Four signals, in priority order:

  1. Referrer match — document.referrer matches a known AI host (chat.openai.com, claude.ai, perplexity.ai, gemini.google.com, …).
  2. UTM match — the campaign URL carries utm_source=chatgpt, etc.
  3. User-Agent match — visit came from an AI product's in-app browser.
  4. Behavioural heuristic (Dark AI) — direct-traffic visits whose pattern matches AI-arrival fingerprints.
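
The priority order above can be sketched as a single function that stops at the first matching signal. This is an illustration only: the host and UTM tables contain just the examples named in the list, the User-Agent token "exampleaiapp" and the method labels "referrer", "utm" and "user_agent" are hypothetical, and the behavioural heuristic is left out because it runs separately on direct traffic.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

AI_REFERRER_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}
AI_UTM_SOURCES = {"chatgpt": "ChatGPT"}
AI_UA_TOKENS = {"exampleaiapp": "Example AI app"}   # hypothetical in-app browser token


def classify_visit(
    referrer: str, landing_url: str, user_agent: str
) -> Optional[Tuple[str, str]]:
    """Return (product, detection_method) for the first signal that matches, else None."""
    # 1. Referrer match
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in AI_REFERRER_HOSTS:
        return AI_REFERRER_HOSTS[host], "referrer"

    # 2. UTM match
    query = parse_qs(urlparse(landing_url).query)
    for source in query.get("utm_source", []):
        if source.lower() in AI_UTM_SOURCES:
            return AI_UTM_SOURCES[source.lower()], "utm"

    # 3. User-Agent match
    ua = user_agent.lower()
    for token, product in AI_UA_TOKENS.items():
        if token in ua:
            return product, "user_agent"

    # 4. Nothing matched: the visit is left for the behavioural heuristic.
    return None
```

Because the checks run in order, a visit with a claude.ai referrer and utm_source=chatgpt is attributed to Claude by referrer, not to ChatGPT by UTM.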

What's "Dark AI"?

A large share of AI-arrived traffic carries no referrer — the user pasted your link from a chat window, or the AI surface stripped the referrer. A daily heuristic catches it: deep content landing (long-tail URL, not /), focused 30–300 s read with > 70% scroll, first-time visitor with low interaction, business-hours arrival. A score ≥ 60 marks it AI-originated via behavioral_heuristic with confidence 0.5–0.95.
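
To make the scoring idea concrete, here is a toy version of such a heuristic. Only the four signals, the threshold of 60 and the 0.5 to 0.95 confidence range come from the description above. Everything else is an assumption made for illustration: the equal weights of 25, the "two interactions or fewer" cut-off, the 09:00 to 17:00 business hours, treating any path other than / as a deep landing, and the linear mapping from score to confidence.

```python
from dataclasses import dataclass
from typing import Optional

THRESHOLD = 60          # from the article
WEIGHT = 25             # hypothetical: four equally weighted signals


@dataclass
class Visit:            # hypothetical record shape
    path: str
    read_seconds: float
    scroll_fraction: float   # 0.0 to 1.0
    is_first_visit: bool
    interactions: int
    local_hour: int          # 0 to 23


def dark_ai_score(visit: Visit) -> int:
    """Add up the signals named in the article (weights are assumptions)."""
    score = 0
    if visit.path.strip("/") != "":                        # landed deeper than "/"
        score += WEIGHT
    if 30 <= visit.read_seconds <= 300 and visit.scroll_fraction > 0.70:
        score += WEIGHT
    if visit.is_first_visit and visit.interactions <= 2:   # assumed cut-off
        score += WEIGHT
    if 9 <= visit.local_hour < 17:                         # assumed business hours
        score += WEIGHT
    return score


def confidence(score: int) -> Optional[float]:
    """Map scores of 60..100 linearly onto 0.5..0.95 (assumed); None below threshold."""
    if score < THRESHOLD:
        return None
    return round(0.5 + (score - THRESHOLD) / (100 - THRESHOLD) * 0.45, 2)


visit = Visit("/blog/some-long-tail-post", 120, 0.85, True, 1, 14)
```

With these assumed weights, the example visit matches all four signals, while a visit that matches only two stays below the threshold and is not marked as AI-originated.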

Common questions

"I'm blocking GPTBot in robots.txt but the dashboard says 'Allowed'. Why?"

The most common cause: you use Cloudflare's Managed robots.txt, which adds a managed block that disallows AI bots, but a later custom group in the same file re-allows some of them. Per RFC 9309, all User-agent groups that match a bot are combined, and the combined result is what a compliant crawler obeys; when an Allow and a Disallow rule match equally well, the Allow rule is preferred. Zenovay reports that effective access, so a bot re-allowed in a later group correctly shows Allowed. Open View raw on the Overview tab to see the exact rules. If that is not the cause, confirm that https://yourdomain.com/robots.txt is reachable and contains a group with the lines "User-agent: GPTBot" and "Disallow: /", then press Re-check now.
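
The group-combining behaviour can be reproduced with a small sketch. It is a simplified reading of RFC 9309, not Zenovay's evaluator: user-agent tokens are compared by exact, case-insensitive match, only the root path / is evaluated, and the * and $ wildcards are not handled. The robots.txt content is made up to mirror the scenario described above.

```python
from typing import List, Tuple

Rule = Tuple[str, str]                  # ("allow" or "disallow", path)
Group = Tuple[List[str], List[Rule]]    # (user-agent tokens, rules)


def parse_groups(robots_txt: str) -> List[Group]:
    """Split robots.txt into groups; a User-agent line after rules starts a new group."""
    groups: List[Group] = []
    agents: List[str] = []
    rules: List[Rule] = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            if rules:
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif key in ("allow", "disallow") and agents:
            rules.append((key, value))
    if agents:
        groups.append((agents, rules))
    return groups


def root_allowed(robots_txt: str, bot: str) -> bool:
    """Combine every group naming the bot (else the * groups) and test the path "/"."""
    groups = parse_groups(robots_txt)
    bot = bot.lower()
    named = [g for g in groups if bot in g[0]]
    chosen = named if named else [g for g in groups if "*" in g[0]]
    best_len, allowed = -1, True            # no matching rule means allowed
    for _, group_rules in chosen:
        for kind, path in group_rules:
            if path == "" or not "/".startswith(path):
                continue                    # rule does not match the root path
            is_allow = kind == "allow"
            # Longest match wins; on a tie the allow rule is preferred.
            if len(path) > best_len or (len(path) == best_len and is_allow):
                best_len, allowed = len(path), is_allow
    return allowed


EXAMPLE = """\
# managed block (illustrative)
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

# later custom group
User-agent: GPTBot
Allow: /
"""
```

With this file, root_allowed(EXAMPLE, "GPTBot") is True because the later Allow: / is combined with the managed Disallow: / and wins the tie, while root_allowed(EXAMPLE, "ClaudeBot") is False.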

"Will blocking AI bots in robots.txt actually stop them?"

Reputable crawlers (OpenAI, Anthropic, Google, Perplexity, Microsoft, Apple) honour robots.txt. Less reputable ones ignore it. For enforcement you need Cloudflare AI-Bot rules or a WAF/rate-limit rule that matches the User-Agent and returns 403. To measure actual crawl events rather than configured access, use a server-side beacon (Zenovay V1.5).
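
If you cannot add a WAF rule, the same User-Agent match can be approximated in the application itself. The sketch below is a generic WSGI middleware, not a Zenovay or Cloudflare feature; the token list is an example you would replace with the bots you want to refuse. Keep in mind that a User-Agent header is easy to spoof, so this only stops bots that identify themselves.

```python
from typing import Callable, Iterable

BLOCKED_UA_TOKENS = ("gptbot", "claudebot")   # example tokens; adjust to your policy


def is_blocked(user_agent: str) -> bool:
    """True if the User-Agent contains any blocked token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS)


def block_ai_bots(app: Callable) -> Callable:
    """Wrap a WSGI app so that matching User-Agents receive 403 Forbidden."""
    def middleware(environ: dict, start_response: Callable) -> Iterable[bytes]:
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Wrapping is a one-liner, for example application = block_ai_bots(application) in your WSGI entry point.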

"Why are Citation and Assist marked 'Coming in V2'?"

We could ship vanity numbers today, but we won't mislead you. Citation (where your brand appears inside AI answers) needs a prompt-monitoring vendor or our own crawler. Assist (downstream lift after AI exposure) is too noisy at small volumes. When V2 ships, every metric will carry a "measured" or "inferred" label.

"Does AI Influence track individual visitors?"

No. AI Influence reads the same visitor data the rest of your dashboard uses, so it inherits Zenovay's tracking posture: cookieless mode supported, IPs hashed with a daily-rotating salt (never stored in plaintext), and it adds no new scripts or cookies of its own. It reports AI sources and aggregate patterns, not named individuals.
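
For readers curious what hashing with a daily-rotating salt can look like, here is a generic sketch. It is not Zenovay's implementation: the class name, the use of HMAC-SHA256 and the in-memory salt are all assumptions chosen to show the idea that the same IP maps to the same value within a day and to an unrelated value the next day.

```python
import datetime
import hashlib
import hmac
import secrets
from typing import Optional


class DailySaltHasher:
    """Hash IPs with a random salt that is replaced whenever the date changes."""

    def __init__(self) -> None:
        self._day: Optional[datetime.date] = None
        self._salt = b""

    def hash_ip(self, ip: str, today: datetime.date) -> str:
        if today != self._day:
            # New day: discard the old salt so yesterday's hashes cannot be re-derived.
            self._day = today
            self._salt = secrets.token_bytes(32)
        return hmac.new(self._salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


hasher = DailySaltHasher()
day_one = datetime.date(2024, 1, 1)
first = hasher.hash_ip("203.0.113.7", day_one)
again = hasher.hash_ip("203.0.113.7", day_one)
next_day = hasher.hash_ip("203.0.113.7", day_one + datetime.timedelta(days=1))
```

Within one day the two hashes are identical, so visits can be grouped; after the salt rotates, the same IP produces a different hash, so visits cannot be linked across days.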

Plan availability

Free on every plan. AI traffic visibility is a baseline acquisition signal every Zenovay customer should have.
