How to Track AI Traffic from ChatGPT, Perplexity, and Gemini

Jocerand Leroy
9 min read
#ai #web-analytics #seo
Traffic from AI assistants is small, but it converts closer to paid search than to organic, and roughly 70% of it arrives with no referrer, so it hides in Direct. This guide explains why AI traffic is mismeasured everywhere, how to track the part that's trackable, and how to estimate the part that isn't.

For most websites, traffic from AI assistants is still around 1% of total visits. That number is easy to dismiss, and most teams do. It is also the wrong number to look at, for two reasons: it is growing fast, and the visitors behind it convert at rates closer to paid search than to organic. The channel is small, high-intent, and almost completely mismeasured.

The mismeasurement is the real story. Multiple 2026 studies put the share of AI referral sessions that arrive with no referrer header at roughly 70%. Those sessions don't show up as "ChatGPT" or "Perplexity" in your reports. They land in Direct, mixed in with bookmarks and typed URLs, and the channel quietly looks like nothing. If you judge AI traffic by what your dashboard labels as AI, you are seeing a fraction of it.

This guide covers what AI traffic actually is, why it hides, how to track the part that is trackable, and how to reason about the part that isn't.

First, separate crawlers from visitors

The phrase "AI traffic" gets used for two completely different things, and conflating them is the most common mistake.

  • AI crawlers are bots that fetch your pages to train models or build a search index: GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot. They appear in your server logs, not in your analytics, and they are not people. (Google-Extended is often listed with them, but it is a robots.txt control token rather than a separate user agent, so it does not show up in logs under its own name.) High crawler activity means the models are reading you. It does not mean anyone visited.
  • AI referral traffic is a human who asked an assistant a question, saw your site cited or linked in the answer, and clicked through. This is a real visitor with real intent, and it is the traffic worth measuring.

Getting cited by an assistant (the crawler side) and getting clicked from that citation (the visitor side) are different outcomes. One study found that only 12-18% of Perplexity citations turn into an actual click. Being mentioned is not the same as being visited, and your analytics only ever sees the visits.

[Figure: AI crawlers flow into server logs and never reach analytics; AI referral visitors click a citation and appear in analytics as a real visit.]
Crawlers hit your server logs and never appear in analytics; only the humans who click a citation become measurable visits, and only a fraction of citations turn into a click.
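
To make the crawler side concrete, here is a minimal sketch of how you might count AI crawler hits per page from parsed access-log rows. The function names, the `(path, user_agent)` row shape, and the sample user-agent strings are all illustrative assumptions, not a real log format; adapt the parsing to whatever your server writes.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Crawler tokens named in this guide. Google-Extended is left out on purpose:
# it is a robots.txt control token, not a user-agent string found in logs.
AI_CRAWLER_TOKENS = ("GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot")


def ai_crawler_name(user_agent: str) -> Optional[str]:
    """Return the matching crawler token, or None for any other user agent."""
    ua = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


def crawler_hits_by_path(log_rows: Iterable[Tuple[str, str]]) -> Counter:
    """Count AI crawler requests per URL path.

    log_rows: (path, user_agent) pairs already parsed from an access log.
    """
    hits = Counter()
    for path, user_agent in log_rows:
        if ai_crawler_name(user_agent) is not None:
            hits[path] += 1
    return hits


# Simplified, made-up rows for illustration only.
sample_rows = [
    ("/blog/track-ai-traffic", "Mozilla/5.0 (compatible; GPTBot/1.0)"),
    ("/blog/track-ai-traffic", "Mozilla/5.0 (compatible; ClaudeBot/1.0)"),
    ("/pricing", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/125.0"),
]
print(crawler_hits_by_path(sample_rows))
```

These counts tell you which pages the models are reading, not how many people visited them.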

Why AI traffic hides in Direct

When a browser follows a link, it usually sends a referrer header telling your site where the click came from. That header is how any analytics tool labels a visit as "google", "chatgpt.com", and so on.

AI assistants break this in several ways:

  • Many answers are read inside a mobile app or a desktop client, not a browser tab. App-to-browser handoffs frequently drop the referrer entirely.
  • Some assistants deliberately strip it. Google's AI Mode, for example, sets rel="noreferrer" on its links, so the browser sends no referrer and client-side analytics tools cannot see where the click came from.
  • Privacy settings and link wrappers remove or rewrite referrers along the way.

The result: a large share of genuine AI visits arrive with no source attached and get bucketed as Direct. This is not a flaw in one tool. It affects every client-side analytics platform equally, including GA4, including the privacy-first ones. No tool can label a referrer that the browser never sent.

[Figure: a click on a cited link in an AI answer loses its referrer in transit (app-to-browser handoff, noreferrer, link wrappers and privacy settings) and lands in GA4's Direct bucket alongside bookmarks and typed URLs.]
The referrer is stripped in transit, so a high-intent AI visit lands in Direct, indistinguishable from a bookmark. Roughly 70% of AI traffic arrives this way.

What this means in practice: your reported "AI" number is a floor, not the real figure. The honest way to talk about the channel is "at least this much," never "exactly this much."
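
As a rough illustration of "floor, not the real figure," the sketch below scales a labeled count up by an assumed no-referrer share. The function name and the 300-session input are hypothetical, and applying the roughly 70% aggregate to a single site is a strong assumption, so treat the result as a range to reason with rather than a measurement.

```python
def implied_total_sessions(labeled_sessions: int, no_referrer_share: float) -> float:
    """Scale labeled AI sessions up to an implied total.

    Assumes the given no-referrer share holds for this site, which it may not:
    the ~70% figure is an aggregate across studies, not a site constant.
    """
    if not 0 <= no_referrer_share < 1:
        raise ValueError("no_referrer_share must be in [0, 1)")
    return labeled_sessions / (1 - no_referrer_share)


labeled = 300  # hypothetical count of sessions your reports label as AI
for share in (0.5, 0.7, 0.8):
    total = implied_total_sessions(labeled, share)
    print(f"no-referrer share {share:.0%}: floor {labeled}, implied total ~{total:.0f}")
```

Running it across a few plausible shares shows how sensitive the implied total is to that one assumption, which is why the labeled number is best reported as "at least."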

The AI sources worth watching

For the sessions that do carry a referrer, these are the domains that account for nearly all measurable AI referral traffic in 2026:

Assistant         | Referrer domains
ChatGPT           | chatgpt.com, chat.openai.com, openai.com
Google Gemini     | gemini.google.com, bard.google.com
Claude            | claude.ai
Perplexity        | perplexity.ai
Microsoft Copilot | copilot.microsoft.com, bing.com/chat
Others            | chat.mistral.ai, deepseek.com, grok.com, meta.ai, you.com

The distribution shifts constantly. ChatGPT still leads measurable referrals but its share has fallen from the high-80s a year ago to the low-60s in 2026, while Claude, Gemini, and Perplexity have all gained ground. Regional engines matter too: if a large share of your audience is in France or elsewhere in Europe, Mistral's Vibe (formerly Le Chat; chat.mistral.ai) belongs on your list alongside the global players. Whatever list you build, plan to review it every quarter, because the leaderboard genuinely changes that fast.
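
If you process raw referrers yourself (in a log pipeline or a warehouse, say), the domain list above can be sketched as a simple hostname lookup. The mapping keys come from the list; the function name and the "Direct" and "Referral" fallback labels are assumptions for illustration. The bing.com/chat entry is omitted because it is a host plus a path, which a hostname lookup cannot express.

```python
from typing import Optional
from urllib.parse import urlparse

# Hostnames taken from the list above, mapped to the assistant they belong to.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "openai.com": "ChatGPT",
    "gemini.google.com": "Google Gemini",
    "bard.google.com": "Google Gemini",
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
    "copilot.microsoft.com": "Microsoft Copilot",
    "chat.mistral.ai": "Others",
    "deepseek.com": "Others",
    "grok.com": "Others",
    "meta.ai": "Others",
    "you.com": "Others",
}


def classify_referrer(referrer: Optional[str]) -> str:
    """Label a full referrer URL; an empty or missing referrer becomes Direct."""
    if not referrer:
        return "Direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host, "Referral")


print(classify_referrer("https://chatgpt.com/"))           # ChatGPT
print(classify_referrer("https://www.perplexity.ai/search"))  # Perplexity
print(classify_referrer(""))                                # Direct
```

The last call is the important one: with no referrer there is nothing to look up, so the visit cannot be told apart from a bookmark.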

Tracking AI traffic in GA4

On May 13, 2026, GA4 added a native "AI Assistant" channel to its default channel group, reaching broad availability across properties in early June. When a click matches a known assistant, GA4 now labels it automatically (medium ai-assistant, channel AI Assistant) with no setup required. It is a real improvement, but three catches matter before you rely on it:

  • The recognized list is published only as examples. Google names ChatGPT, Gemini, Deepseek, Copilot, and Grok, but calls the list non-exhaustive and keeps the full referrer list private, so the docs alone can't tell you how any given assistant is classified. In practice, Perplexity is widely reported to still land in Referral, and Claude, named at launch but absent from the current published list, sits in the same grey zone. Google also routes its own AI Overviews and AI Mode clicks to Organic Search, not to AI Assistant. The only way to be certain for a source that matters to you is to check your own GA4 reports: filter by that source and look at its assigned channel.
  • It is not retroactive. The channel only classifies traffic forward from May 13, 2026. Every AI visit before that date stays buried in Referral or Direct under your old groupings, so the historical trend never gets rebuilt.
  • It inherits the referrer problem. Like any client-side rule, it only catches sessions that arrived with an intact referrer, so the no-referrer majority still falls into Direct.

Rather than guess what Google's private, shifting list does, build a custom channel group as a backstop that catches every assistant you care about by name:

  • Open Admin → Data display → Channel groups and create a new group.
  • Add a channel (for example "AI Traffic") with a condition where Source matches a regex of the domains above:
    chatgpt\.com|chat\.openai\.com|openai\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|bard\.google\.com|copilot\.microsoft\.com|bing\.com/chat|chat\.mistral\.ai|deepseek\.com|grok\.com|meta\.ai|you\.com
  • Drag the AI channel above Referral in the list, then save.
  • Revisit the regex quarterly as new assistants appear.
How the regex works. The | means "or", so the rule tells GA4: if the source is chatgpt.com or perplexity.ai or chat.mistral.ai, and so on, file the visit under AI Traffic. The \. escapes each dot so it's read as a literal dot, not a regex wildcard.
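
Before pasting the pattern into GA4, it can help to test it locally. The sketch below uses Python's `re` module as a stand-in; GA4 has its own regex engine, so this only checks the pattern's logic, and the sample source values are made up. It also surfaces one thing worth checking: if the rule is evaluated as a partial match rather than a full match, an unanchored alternative like `you\.com` also matches unrelated hosts that merely end in it.

```python
import re

# The same alternatives as the rule above, joined with "|".
AI_SOURCES = [
    r"chatgpt\.com", r"chat\.openai\.com", r"openai\.com", r"perplexity\.ai",
    r"claude\.ai", r"gemini\.google\.com", r"bard\.google\.com",
    r"copilot\.microsoft\.com", r"bing\.com/chat", r"chat\.mistral\.ai",
    r"deepseek\.com", r"grok\.com", r"meta\.ai", r"you\.com",
]
AI_SOURCE_REGEX = "|".join(AI_SOURCES)
AI_SOURCE_PATTERN = re.compile(AI_SOURCE_REGEX)


def matches_fully(source: str) -> bool:
    """True when the whole source string equals one of the alternatives."""
    return AI_SOURCE_PATTERN.fullmatch(source) is not None


def matches_partially(source: str) -> bool:
    """True when any alternative appears anywhere inside the source string."""
    return AI_SOURCE_PATTERN.search(source) is not None


for source in ["chatgpt.com", "chatgptXcom", "google", "thank-you.com"]:
    print(source, matches_fully(source), matches_partially(source))
```

Here `chatgptXcom` fails both checks because the dot is escaped, while `thank-you.com` passes only the partial check. That is a reason to confirm which match type your condition uses.
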
[Figure: GA4 channel rules checked top to bottom; a visit from chatgpt.com matches AI Traffic at rule 3, so Referral at rule 4 is never reached.]
GA4 reads channel rules top to bottom and stops at the first match. A visit from chatgpt.com is technically a referral, so if Referral sits above your AI rule it grabs the visit first. Putting AI Traffic above Referral lets your rule win before the generic bucket can swallow it.
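
The first-match behaviour is easy to see in a toy model. This is a simplified sketch of ordered rule evaluation, not GA4's actual rule engine; the rule names, predicates, and shortened pattern are assumptions for illustration.

```python
import re
from typing import Callable, List, Tuple

# Shortened pattern for the example; a real rule would list every assistant.
AI_PATTERN = re.compile(r"chatgpt\.com|perplexity\.ai|claude\.ai")

Rule = Tuple[str, Callable[[str, str], bool]]


def is_ai(source: str, medium: str) -> bool:
    return AI_PATTERN.fullmatch(source) is not None


def is_referral(source: str, medium: str) -> bool:
    return medium == "referral"


def assign_channel(rules: List[Rule], source: str, medium: str) -> str:
    """Return the name of the first rule that matches, top to bottom."""
    for name, predicate in rules:
        if predicate(source, medium):
            return name
    return "Unassigned"


ai_first = [("AI Traffic", is_ai), ("Referral", is_referral)]
referral_first = [("Referral", is_referral), ("AI Traffic", is_ai)]

print(assign_channel(ai_first, "chatgpt.com", "referral"))        # AI Traffic
print(assign_channel(referral_first, "chatgpt.com", "referral"))  # Referral
```

The same visit gets two different labels, and the only thing that changed is the order of the rules.
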
The limit you can't regex your way around. This only catches sessions that arrive with a referrer. Roughly 70% of AI visits arrive with none and fall into Direct, so even a flawless channel group sees only the visible 30% or so of the channel. The regex captures what's labeled; it cannot recover what the browser never sent.

So it's worth doing, but treat it as a partial fix: it's a manual rule you have to maintain, and in GA4 it sits on top of a platform that can sample your data, notably in Explorations, once event volumes get large. A channel that is 1% of traffic is exactly the kind of small segment that sampling rounds away.

Estimating the part you can't see

Since most AI traffic lands in Direct, the question becomes: how do you estimate the hidden portion without guessing? Three signals help triangulate it:

  • Direct traffic to deep pages. Nobody types or bookmarks a long article URL or a niche product page. A rise in Direct landing on deep, specific pages (rather than the homepage) is very often unattributed AI and search referrals. Segment Direct by landing page and watch the deep ones.
  • Branded search lift. People who discover you in an AI answer often search your brand name afterward to verify. A climb in branded queries that tracks with AI mentions is an indirect but real signal.
  • Server logs. Crawler hits from GPTBot, PerplexityBot, and ClaudeBot tell you which pages the models are reading. That doesn't measure visits, but it tells you where citations are likely being generated, which you can cross-reference with Direct spikes on those same URLs.

None of these is exact. Together they turn "we have no idea" into a defensible estimate, which is the most any honest measurement of this channel can offer right now.
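
The first signal, Direct traffic to deep pages, can be sketched as a weekly ratio. The `(week, channel, landing_path)` row shape, the depth threshold of two path segments, and the sample rows are all assumptions for illustration; in practice the rows would come from an export of your own session data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def path_depth(path: str) -> int:
    """Number of non-empty path segments, ignoring any query string."""
    return len([segment for segment in path.split("?")[0].split("/") if segment])


def deep_direct_share(
    sessions: Iterable[Tuple[str, str, str]], min_depth: int = 2
) -> Dict[str, float]:
    """Per week, the share of Direct sessions that landed on a deep page.

    sessions: (week, channel, landing_path) tuples.
    """
    totals = defaultdict(int)
    deep = defaultdict(int)
    for week, channel, landing_path in sessions:
        if channel != "Direct":
            continue
        totals[week] += 1
        if path_depth(landing_path) >= min_depth:
            deep[week] += 1
    return {week: deep[week] / totals[week] for week in totals}


# Made-up rows for illustration only.
sample_sessions = [
    ("2026-W01", "Direct", "/"),
    ("2026-W01", "Direct", "/blog/track-ai-traffic"),
    ("2026-W01", "Organic Search", "/blog/track-ai-traffic"),
    ("2026-W02", "Direct", "/blog/track-ai-traffic"),
    ("2026-W02", "Direct", "/docs/setup/install"),
    ("2026-W02", "Direct", "/"),
]
print(deep_direct_share(sample_sessions))
```

A rising share over several weeks is the pattern to watch. It is still only a hint until it lines up with the other two signals.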

Why the channel deserves the attention

It would be reasonable to ignore a 1% channel if those visits behaved like every other visit. They don't. Across 2026 studies, AI referral visitors consistently convert well above organic search, with reported conversion rates in the range of paid search and time-on-site notably higher than typical organic. The intuition behind it is simple: someone arriving from an AI answer has already had their question framed and partly answered, so they land further down the decision funnel than a cold searcher.

[Figure: bar chart of conversion rate by channel; organic search is low, while AI referral and paid search are both much taller and sit in the same range.]
AI referral visitors convert in the same ballpark as paid search and well above organic, because they arrive with their question already framed.

This is why the Direct leakage is expensive. When high-intent AI visits get misattributed to Direct, two things go wrong at once. You undervalue the work that earns AI citations (your content, your SEO, your attribution model), and you overvalue Direct, which becomes a junk drawer that hides your best-converting new channel. The channel is small enough to ignore and valuable enough that ignoring it is a mistake.

How Sublim handles AI traffic

Sublim captures the full referrer on every event and classifies it into an acquisition channel automatically, so visits from ChatGPT, Perplexity, Claude, and the rest surface as their own source out of the box, with no custom regex to build or maintain. Because Sublim never samples, a 1% channel is reported at full resolution instead of being rounded into noise, which matters precisely because AI traffic is small and growing.

What Sublim can't do, and what no client-side tool can do, is invent a referrer the browser never sent. The no-referrer portion is a hard limit of the web, not of any one product. Where Sublim helps is on the rest of the problem: surfacing the trackable AI traffic cleanly, letting you segment those visitors' behavior to see whether they convert, and giving you the Direct-by-landing-page view you need to estimate the hidden share. You measure what is measurable accurately, and you reason about the rest with real signals instead of a sampled guess.

The bottom line

AI traffic is small today, mismeasured everywhere, and worth more per visit than almost any other channel. Three moves get you ahead of it: separate crawlers from real visitors so you don't celebrate bots, label the trackable AI sources cleanly instead of letting them rot in Direct, and build a defensible estimate for the no-referrer share using deep-page Direct, branded search, and server logs. Do that and you will see the channel growing while your competitors are still deciding whether 1% is worth a meeting.

Author
Jocerand Leroy
Web Analytics & Privacy Lead

Jocerand writes about privacy-first web analytics, conversion diagnostics, and helping teams make sense of their data without compromising on compliance.
