Skip to main content
Frontier Signal

AI Chatbots Leak Real Phone Numbers, Raising Privacy Concerns

AI chatbots like Gemini and ChatGPT are inadvertently exposing real phone numbers from their training data, leading to calls for individuals and businesses.

Operator Briefing

Turn this article into a repeatable weekly edge.

Get implementation-minded writeups on frontier tools, systems, and income opportunities built for professionals.

No fluff. No generic AI listicles. Unsubscribe anytime.

TL;DR

AI chatbots like Gemini and ChatGPT are inadvertently exposing real phone numbers from their training data, leading to calls for individuals and businesses.

AI chatbots, including Google’s Gemini and OpenAI’s ChatGPT, are inadvertently exposing real phone numbers, drawing them from their training data when prompted in specific ways. This “AI doxxing” is causing individuals to receive unwanted calls from strangers seeking services or information, highlighting a critical and immediate privacy vulnerability for anyone whose data might be included in large language model datasets.

Reports from , detail instances where users interacting with generative AI models like Google Gemini and ChatGPT were able to extract personal contact information, specifically phone numbers, belonging to unrelated individuals. One Redditor described being inundated with calls from strangers seeking various services, all misdirected by Google’s generative AI. Another incident involved a software developer in Israel receiving WhatsApp messages after Gemini provided his number as part of incorrect customer service instructions. Similarly, a PhD candidate at the University of Washington successfully prompted Gemini to reveal a colleague’s personal cell phone number [1, 8].

This phenomenon, sometimes referred to as “AI doxxing,” stems from the models’ training data [4, 7]. Large language models (LLMs) are trained on vast datasets scraped from the internet, which inevitably include publicly available (or once-available) personal information. When prompted in certain ways, the models can “hallucinate” or retrieve and present this data, even if it’s not directly relevant to the user’s query or intended for public dissemination [2, 3]. While AI researchers and privacy experts have long warned about the potential for generative AI to compromise personal privacy, these recent cases demonstrate a concrete, actionable risk beyond theoretical concerns [1, 8]. The issue isn’t limited to a single model; reports indicate that ChatGPT, Gemini, and Grok are all susceptible to revealing private contact details despite existing privacy safeguards [6].

For operators, this represents a significant and immediate privacy and reputational risk. If your business relies on or integrates with generative AI, there’s a non-trivial chance that sensitive information, including customer or employee contact details, could be inadvertently exposed. This isn’t just about malicious actors; even benign queries can lead to data leakage. The implications extend to potential legal liabilities under data protection regulations like GDPR or CCPA, as well as erosion of trust with users and customers. The core problem lies in the opaque nature of LLM training data and the unpredictable ways models can surface information, making traditional data governance challenging.

What operators should do

Operators must assume that any personal identifiable information (PII) included in the training data of publicly available LLMs, or even proprietary models if not meticulously curated, is at risk of exposure. Immediately audit any internal processes or customer-facing applications that rely on generative AI for information retrieval or content generation, specifically testing for the ability to extract sensitive data like phone numbers or email addresses. For any data you contribute to LLM training, ensure robust anonymization and consent mechanisms are in place, and push for greater transparency from AI providers regarding their data sourcing and sanitization practices. If you’re building your own models, implement stringent data scrubbing and privacy-preserving techniques, and consider federated learning or differential privacy where applicable to minimize the risk of data reconstruction and leakage.

Sources

  1. AI chatbots are giving out people’s real phone numbers | MIT Technology Review — https://www.technologyreview.com/2026/05/13/1137203/ai-chatbots-are-giving-out-peoples-real-phone-numbers/
  2. AI chatbots are giving out people’s real phone numbers – AI General – Gnoppix Forum — https://forum.gnoppix.org/t/ai-chatbots-are-giving-out-people-s-real-phone-numbers/6054
  3. AI Chatbots Are Giving Out Your Real Phone Number – Gadget Review — https://www.gadgetreview.com/ai-chatbots-are-giving-out-your-real-phone-number
  4. ‘AI gave me your number’: AI doxxing turning ChatGPT hallucinations into harassment | The Independent — https://www.the-independent.com/tech/ai-doxxing-gemini-hallucination-google-b2973008.html
  5. AI chatbots are giving out people’s real phone numbers – Democratic Underground Forums — https://www.democraticunderground.com/122895920
  6. AI Chatbots Are Exposing People’s Real Phone Numbers — https://ground.news/article/ai-chatbots-are-exposing-peoples-real-phone-numbers_84bdf5
  7. AI Chatbots Are Giving Out Your Real Phone Number — https://tech.yahoo.com/ai/gemini/articles/ai-chatbots-giving-real-phone-144522753.html
  8. AI Chatbots Leak Real Phone Numbers, Privacy Concerns — https://theoutpost.ai/news-story/ai-chatbots-are-leaking-real-phone-numbers-and-addresses-sparking-major-privacy-concerns-26268/

Author

  • Siegfried Kamgo

    Founder and editorial lead at FrontierWisdom. Engineer turned operator-analyst writing about AI systems, automation infrastructure, decentralised stacks, and the practical economics of frontier technology. Focus: turning fast-moving releases into durable, implementation-ready playbooks.

Keep Compounding Signal

Get the next blueprint before it becomes common advice.

Join the newsletter for future-economy playbooks, tactical prompts, and high-margin tool recommendations.

  • Actionable execution blueprints
  • High-signal tool and infrastructure breakdowns
  • New monetization angles before they saturate

No fluff. No generic AI listicles. Unsubscribe anytime.

Leave a Reply

Your email address will not be published. Required fields are marked *