
Overview

<aside> 💡 Because large language models (LLMs) generate text by predicting the next word, they can sometimes produce outputs that are false, misleading, or even nonsensical. These outputs range from incorrect facts to entirely fabricated scenarios, often delivered with unwarranted confidence, and the problem affects even prominent models such as GPT-3, ChatGPT, and Bard.

</aside>
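To make the word-prediction point above concrete, here is a minimal toy sketch in plain Python. It is not any real LLM's internals, and the corpus, prompt, and output are invented for illustration: a tiny bigram model that always emits the statistically most likely next word. It has no notion of whether the sentence it produces is true, which is the same basic gap that lets a far larger model state a falsehood fluently and confidently.

```python
# Toy illustration only -- a tiny bigram "next-word predictor", not a real LLM.
# The corpus, prompt, and output below are invented for this sketch.
# The point: the model always emits the statistically most likely next word and
# has no mechanism for checking whether the resulting sentence is true.
from collections import Counter, defaultdict

corpus = [
    "the telescope took the first picture of a distant galaxy",
    "the telescope took the first picture of an exoplanet",
    "the new space telescope took the first picture of an exoplanet",
]

# Count which word follows which (a bigram table).
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1

def generate(prompt: str, max_words: int = 12) -> str:
    """Greedily append the most frequent next word; stop when none is known."""
    words = prompt.split()
    while len(words) < max_words:
        candidates = following.get(words[-1])
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Completes the prompt with whatever continuation is most common in its corpus,
# with no notion of factual truth -- only of word statistics.
print(generate("the telescope"))
```

With this made-up corpus the output is "the telescope took the first picture of an exoplanet": a fluent, confident-sounding claim produced purely from word frequencies. Real LLMs are vastly more sophisticated, but the underlying objective is still next-token prediction rather than truth.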

<aside>

A recent study by researchers at the Polytechnic University of Valencia, Spain, found that as LLMs are scaled up, they often become worse at answering basic questions.

https://www.newscientist.com/article/2449427-ais-get-worse-at-answering-simple-questions-as-they-get-bigger/

https://www.nature.com/articles/s41586-024-07930-y

</aside>

Most Embarrassing Examples

  1. Google's Bard AI Blunder: In February 2023, during a public demonstration of Google's Bard AI, the system incorrectly claimed that the James Webb Space Telescope had taken "the very first pictures of a planet outside of our own solar system"[6]. This factual error led to a sharp decline in Alphabet's (Google's parent company) stock price, wiping out approximately $100 billion in market value within a day[6]. The incident highlighted the risks of rushing AI technology to market without thorough testing and validation.
  2. Air Canada's Chatbot Misinformation: In a recent case, Air Canada was ordered to pay damages to a passenger after its AI-powered virtual assistant provided incorrect information about bereavement fares[1][5]. The chatbot falsely stated that passengers could apply for bereavement discounts after purchasing tickets, contradicting the airline's actual policy. The error not only cost the airline financial compensation but also raised questions about the reliability of AI-driven customer service tools and the legal implications of AI-generated misinformation.
  3. Zillow's Algorithmic Home-Buying Disaster: In 2021, Zillow's home-flipping unit, Zillow Offers, suffered significant losses due to errors in the machine learning algorithm it used to predict home prices[3]. The algorithm's inaccuracies led Zillow to overpay for properties, forcing the company to write down millions of dollars and lay off about 25% of its workforce. This case demonstrates how AI errors in critical business operations can have severe financial consequences.
  4. Legal Consequences of ChatGPT Hallucinations: A lawyer named Steven A. Schwartz was fined $5,000 after submitting a legal brief containing non-existent court cases fabricated by ChatGPT[4]. The incident not only resulted in financial penalties but also highlighted the dangers of relying on AI-generated information without proper verification, especially in professional settings.
  5. Microsoft's Bing Chat Errors: During a public demonstration similar to Google's Bard incident, Microsoft's Bing Chat AI provided inaccurate financial data about major companies such as Gap and Lululemon[4]. While the immediate financial impact was not as severe as in Google's case, the mistake still caused public embarrassment and raised concerns about the reliability of AI-powered search and information tools.

Causes

Mitigation Strategies

To reduce the frequency and impact of AI hallucinations, several strategies can be employed: