Your Google Rank Just Stopped Predicting Whether AI Cites You
A University of Toronto audit of 1,516 queries found that AI answer engines cite almost entirely different sources than Google: GPT-4o's median overlap with Google's top ten was 0%. Here is what that means for how you measure search visibility.

For twenty years, one number told a business whether the web could find it: its Google rank. A peer-reviewed audit out of the University of Toronto just showed that number now tells you almost nothing about whether an AI answer engine will cite you.
The study, "Navigating the Shift: A Comparative Analysis of Web Search and Generative AI Response Generation" (Chen, Wang, Chen, and Koudas, University of Toronto), ran 1,516 queries through five systems side by side: Google Search, GPT-4o, Claude 4.5 Sonnet, Perplexity Sonar Pro, and Gemini 2.5 Flash. The SEO trade press, which picked it up across the past week, has billed it as the largest empirical audit of AI citation behavior published so far.
The number that should reset your dashboard
The researchers measured how often each AI engine's cited domains overlapped with Google's top ten results for the same query. The overlap is small, and it is different for every engine:
- GPT-4o: 4.0% mean overlap, 0.0% median. A zero median means that for at least half of the queries tested, GPT-4o cited zero domains in common with Google's top ten.
- Gemini 2.5 Flash: 11.1%.
- Claude 4.5 Sonnet: 12.6%.
- Perplexity Sonar Pro: 15.2%.
The differences held up under bootstrap resampling. This is not noise. It is a structural finding: the engines that increasingly answer your buyers' questions are reading from a different web than the one your SEO report tracks.
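To make the mean-versus-median distinction concrete, here is a minimal Python sketch of a per-query overlap score. It assumes overlap means the share of an engine's cited domains that also appear among the domains of Google's top ten results; the paper's exact definition may differ. The function names and the toy URLs and scores are hypothetical, not data from the study.

```python
from statistics import mean, median
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Reduce a full URL (scheme included) to a lowercase host without 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def overlap(ai_urls: list[str], google_urls: list[str]) -> float:
    """Assumed metric: share of AI-cited domains also found in Google's top ten."""
    ai = {domain(u) for u in ai_urls}
    google = {domain(u) for u in google_urls[:10]}
    return len(ai & google) / len(ai) if ai else 0.0


def summarize(per_query: list[float]) -> tuple[float, float]:
    """Return (mean, median) of the per-query overlap scores."""
    return mean(per_query), median(per_query)


# Toy data, invented for illustration only.
google_top_ten = ["https://www.example.com/guide", "https://docs.example.org/faq"]
ai_citations = ["https://example.com/pricing", "https://blog.example.net/post"]
print(overlap(ai_citations, google_top_ten))

# Most queries score zero and one scores above zero:
# the mean is pulled up slightly while the median stays at zero.
print(summarize([0.0, 0.0, 0.0, 0.2, 0.0]))
```

A handful of queries with some overlap can lift the mean while the typical query still shares nothing with Google, which is why the two numbers are worth reporting side by side.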
It gets sharper. Roughly 78 to 85% of the domains these engines cited were unique to a single platform. Ranking well inside ChatGPT does not carry over to Perplexity, which does not carry over to Gemini. And the divergence is not only about which URLs get cited; it is also about which kinds of sources get cited. The audit found Claude, for instance, leaning heavily on earned media (around 65% of its references) and almost never on social content (about 1%). Each engine has its own editorial center of gravity.
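One way to check this kind of cross-platform uniqueness on your own data is sketched below. It assumes "unique to a single platform" means a domain cited by exactly one engine across the whole query set; the audit may have computed its figure differently. The engine labels and domains are made up.

```python
from collections import Counter


def single_platform_share(cited: dict[str, set[str]]) -> float:
    """Share of distinct domains that exactly one engine cited."""
    # Count how many engines cited each domain.
    counts = Counter(d for domains in cited.values() for d in domains)
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n == 1) / len(counts)


# Hypothetical citation sets, one per engine.
cited_domains = {
    "engine_a": {"a.example", "b.example"},
    "engine_b": {"b.example", "c.example"},
    "engine_c": {"d.example"},
}
print(single_platform_share(cited_domains))
```

In this toy set only b.example is shared, so three of the four distinct domains belong to a single engine.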
Why this breaks the standard SEO playbook
The entire discipline of search optimization assumes one ranked list that most people see. Optimize the page, earn the links, climb the list, get the traffic. That model quietly assumed the list was singular and stable.
The Toronto audit describes a world with at least five different lists, mostly non-overlapping, each weighting sources by its own logic, and several of them not exposed to you at all. A business can sit at position one on Google and be invisible in the AI answer that a buyer actually reads. Nothing in a rank-tracking dashboard would tell you that, because rank tracking measures only one of those surfaces.
This is the gap that answer engine optimization (AEO) and generative engine optimization (GEO) exist to close. But you cannot optimize a surface you cannot measure, and most marketing teams have no instrument pointed at the AI-answer surface at all. They are flying the AI-search era on a Google-era altimeter.
What we built, and why this study is the proof point
ForaPost SEO Intelligence exists for exactly this measurement problem. It gives a business a real-time SERP landscape, content-gap detection, and AEO/GEO visibility signals, so you can see how you actually show up across answer engines, not just where you rank on Google.
We did not build it as a reaction to this paper. We built it because the decoupling the paper measures has been visible in the field for a while. What the Toronto audit adds is a hard number on a trend that was easy to wave away: a median overlap of zero is not a rounding error; it is a different web.
The practical takeaway is not "panic about AI search." It is narrower and more useful: stop treating one ranked list as the whole picture, and start measuring visibility per engine, because the engines no longer agree with each other or with Google. The businesses that adapt first will be the ones that can see the new surface clearly. That is a measurement problem before it is a content problem, and measurement is something you can start on this quarter.
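As a rough illustration of what per-engine measurement can look like, the sketch below computes, for each engine, the share of tracked queries in which a given domain was cited. It is a generic example, not a description of how any particular product works. The record layout, engine labels, and domains are all hypothetical.

```python
from collections import defaultdict


def visibility_by_engine(
    records: list[tuple[str, str, set[str]]], your_domain: str
) -> dict[str, float]:
    """Per engine, the share of queries whose citations include your_domain.

    Each record is (engine, query, cited_domains).
    """
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for engine, _query, cited in records:
        totals[engine] += 1
        if your_domain in cited:
            hits[engine] += 1
    return {engine: hits[engine] / totals[engine] for engine in totals}


# Hypothetical observations collected for two tracked queries.
observations = [
    ("engine_a", "best crm for startups", {"yours.example", "rival.example"}),
    ("engine_a", "crm pricing comparison", {"rival.example"}),
    ("engine_b", "best crm for startups", {"rival.example"}),
    ("engine_b", "crm pricing comparison", {"other.example"}),
]
print(visibility_by_engine(observations, "yours.example"))
```

Reporting one number per engine, rather than a single blended score, keeps the disagreement between engines visible.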
Sources: "Navigating the Shift: A Comparative Analysis of Web Search and Generative AI Response Generation," Chen, Wang, Chen, and Koudas, University of Toronto, arXiv:2601.16858 (https://arxiv.org/abs/2601.16858); trade-press coverage, everything-pr.com, "Four Engines. One Thousand Queries. The Toronto AI Audit." (https://everything-pr.com/four-engines-one-thousand-queries-the-toronto-audit-of-how-ai-cites-the-web).


