Google’s AI Overviews Weren’t Ready for Prime Time. Here’s Why

Google’s AI Overviews feature was intended to give tidy summations of search results, researched and written in mere seconds by generative AI. The problem is, it got stuff wrong.

How often? It’s hard to say, although examples piled up quickly this month, not long after Google started rolling out AI Overviews widely. Consider these well-publicized flubs:

When asked how to keep cheese on pizza, it suggested adding an eighth of a cup of nontoxic glue. That’s a tip that originated from an 11-year-old comment on Reddit.

And in response to a query about daily rock intake for a person, it recommended we eat “at least one small rock per day.” That advice hailed from a 2021 story in The Onion.

On Thursday evening, Google said it is now scaling back the service on health-related queries, as well as when it deems users are making nonsensical or satirical searches. You also shouldn’t see AI Overviews results “for hard news topics, where freshness and factuality are important.”

In a blog post, Liz Reid, vice president and head of Google Search, acknowledged that “some odd, inaccurate or unhelpful AI Overviews certainly did show up” and said that Google has “made more than a dozen technical improvements to our systems” in the last week and will “keep improving.”

While we’re still learning about what’s next for AI Overviews and for generative AI in search more broadly, we do know more about some of these initial issues.

Why is this happening now?

Essentially, the AI Overview goofs were a variation of AI hallucinations, which occur when a generative AI model serves up false or misleading information and presents it as fact. Hallucinations result from flawed training data, algorithmic errors or misinterpretations of context.

The large language model behind AI engines like those from Google, Microsoft and OpenAI is “statistically predicting data it may see in the future based on what it has seen in the past,” said Mike Grehan, CEO of digital marketing agency Chelsea Digital. “So there’s an element of ‘crap in, crap out’ that still exists.”
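To make Grehan’s point concrete, here is a minimal sketch of “predicting from what it has seen in the past.” It is a toy word-pair counter, not how Google’s or anyone else’s production model works (real large language models use neural networks over tokens), and the corpus, function names and sentences are all made up for illustration. It shows only that a model trained on a joke repeated often enough will repeat the joke.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words followed it in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus: the joke advice appears more often than the sensible advice.
corpus = [
    "add cheese to pizza",
    "add glue to pizza",
    "add glue to sauce",
]
model = train_bigram(corpus)
print(predict_next(model, "add"))  # prints "glue"
```

The model has no notion of truth. It simply reports the most common continuation in its training text, which is the “crap in, crap out” problem in miniature.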

Hence the much-ridiculed AI Overviews results and more bad press for Google as it tries to get its footing in the shifting sands of the generative AI era. 

The search engine that debuted in 1998 controls about 86% of the market. Google’s competitors don’t come close: Bing controls 8.2%, Yahoo has 2.6%, DuckDuckGo has 2.1%, Yandex has 0.2% and AOL has 0.1%.

But the advent of generative AI and its growth among consumers — adoption is projected to reach nearly 78 million users, or about one-quarter of the US population, by 2025 — arguably threatens Google’s stranglehold on the market, which translates to roughly 8.5 billion searches per day and $240 billion in annual advertising revenue.

Google has its own gen AI chatbot, Gemini, which is competing with the granddaddy of them all, ChatGPT, and a slew of others from Perplexity, Anthropic, Microsoft and more. They’re all fighting for relevancy as our access to information changes again, much like it did with the introduction of Google 26 years ago.

The last thing Google needs is to lose the trust of the millions of us doing Google searches. 

In a statement last week, when the issues first flared up, a Google spokesperson said the majority of AI Overviews provide accurate information with links for verification. Many of the examples popping up on social media are what she called “uncommon queries,” as well as “examples that were doctored or that we couldn’t reproduce.”

“We conducted extensive testing before launching this new experience, and as with other features we’ve launched in Search, we appreciate the feedback,” the spokesperson said.

What is AI Overviews?

AI Overviews is a new spin on Google search that had only just started rolling out in the US.

Google has long tinkered with its search engine results page to enhance the user experience and to drive revenue. Following its start with 10 blue links in 1998, Google introduced sponsored links and ads a few years later and really shook things up with the addition of the Knowledge Graph in 2012. That’s the box that calls out an answer about a person, place or thing to more quickly answer your query (and to keep you from clicking away).

AI Overviews was the latest update in this vein.

Instead of having to break up a query into multiple questions, you can ask something more complex up front. Google uses the example of searching for a yoga studio popular with locals, convenient to your commute and with a discount for new members. Theoretically, what used to be three searches is now one.

The overall goal of gen AI here is to make search more visual, interactive and social. 

“As AI-powered image and video generation tools become popular and consumers test multi-search features, SERPs’ rich media will effectively capture consumers’ attention,” said Nikhil Lai, senior analyst at research firm Forrester. “After all, 90% of the information transmitted to our brains is visual.”

Google’s Reid said in her blog post Thursday that AI Overviews works differently than AI chatbots, which source their responses from large language models built on vast expanses of what’s known as training data and which often are not connected to the open internet. AI Overviews does use a “customized” language model that’s integrated with Google search’s traditional web-ranking systems.

“AI Overviews are built to only show information that is backed up by top web results,” Reid wrote.

“This means that AI Overviews generally don’t ‘hallucinate’ or make things up in the ways that other LLM products might,” she wrote. “When AI Overviews get it wrong, it’s usually for other reasons: misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available.”
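Google hasn’t published how that “backed up by top web results” check works, so the following is only a rough sketch under stated assumptions: a hypothetical `is_supported` function that accepts a claim if most of its words appear in at least one retrieved snippet. The function name, the 0.6 threshold and the snippets are all invented for illustration. It shows why this kind of grounding can only be as good as the pages it draws from.

```python
def is_supported(claim, snippets, threshold=0.6):
    """Treat a claim as 'backed up' if enough of its words appear in a single snippet."""
    claim_words = set(claim.lower().split())
    if not claim_words:
        return False
    for snippet in snippets:
        snippet_words = set(snippet.lower().split())
        overlap = len(claim_words & snippet_words) / len(claim_words)
        if overlap >= threshold:
            return True
    return False

# Hypothetical top results; the first one stands in for a satirical article.
top_results = [
    "geologists recommend eating at least one small rock per day",
    "rocks are made of minerals",
]
print(is_supported("eat at least one small rock per day", top_results))  # prints True
print(is_supported("eat two large boulders weekly", top_results))        # prints False
```

The check correctly rejects a claim that no page makes, but it happily accepts the rock advice, because a satirical page really does say it. Matching a summary to a source says nothing about whether the source was serious.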

A.J. Kohn, owner of digital marketing firm Blind Five Year Old, likened AI Overviews to a summary of traditional search results. (Google provides links to the sites that help inform the Overview.) The regular results we’re used to then appear under each AI Overview.

“While the generative summarization is somewhat complex, the end user is really getting a sort of TL;DR for that search, which may make it easier for some to find a satisfactory answer,” Kohn said.

Those mistakes AI Overviews was making

To be clear, AI Overviews does get a lot of things right. When I asked it how to get rid of a sore throat, how often to stain a wooden fence and even why AI hallucinates, the answers were all spot on.

But it also reportedly listed health benefits of running with scissors and of taking a bath with a toaster, and gave erroneous answers about the number of Muslim presidents and whether a dog has ever played in the NHL. (The AI Overviews answer apparently would have us believe the answer is yes, his name was Pospisil and he was a fourth-round draft pick in 2018.) A relatively new X account, @Goog_Enough, has a running tally.

[Image: A screenshot of a tweet with a wrong answer from Google's AI Overviews. Credit: Trojanowski_ on X/Screenshot by CNET]

[Image: Another screenshot of a wrong answer from Google's AI Overviews. Credit: roundbirbart on X/Screenshot by CNET]

Some of these bad answers are in response to what Kohn called “very unlikely queries.”

It seems clear that in at least some of the cases, the AI Overview is picking up material from parody posts, bad jokes and satirical sites like The Onion.

“But what that underscores,” Kohn said, “is just how easy it is to get specious content into the AI Overview.”

It ultimately reveals a problem with grounding and fact-checking content in AI Overviews. 

In his review of Google’s Gemini chatbot, which is powering the new search experience, CNET’s Imad Khan said the model’s propensity to hallucinate should come with a disclaimer: “Honestly, to be safe, just Google it.”

I guess we should add: Google it the old-fashioned way, and then dig into the links. CNET’s Peter Butler has advice on how to get those old-school Google search results.

Pain points

Even before the mistakes started turning up in AI Overviews, not everyone was happy with the change. (“It makes more sense to me to search on TikTok,” my colleague Katelyn Chedraoui writes.)

Publishers and other websites, meanwhile, are worrying about losing traffic.

According to Grehan, it’s possible sites will see a decline in organic visits, if people stop scrolling below the summaries. “I doubt that because, like all human behavior in general — even if the summary provides a lot of detail upfront — you’ll likely want a second opinion as well,” he said.

The AI Overview mistakes make a strong case for getting that second opinion.

For her part, in a May 14 blog post announcing what AI Overviews can do, Reid wrote that early use in Google’s Search Labs experiments over the last year showed that users were visiting a “greater diversity of websites” with AI Overviews, and that the links included “get more clicks than if the page had appeared as a traditional web listing for that query.”

But just three months after another public embarrassment, when Gemini’s image generation functionality was put on hold because it produced historically inaccurate images such as people of color in Nazi uniforms, the question remains whether we’re starting to see cracks in the foundation of the once-omnipotent search powerhouse.

In that May 14 post, Reid also wrote: “We’ve meticulously honed our core information quality systems to help you find the best of what’s on the web.”

Apparently that remains open to debate.

Editors’ note: CNET used an AI engine to help create several dozen stories, which are labeled accordingly. The note you’re reading is attached to articles that deal substantively with the topic of AI but are created entirely by our expert editors and writers. For more, see our AI policy.