As misinformation surged during recent tensions between India and Pakistan, many social media users turned to AI-powered chatbots for quick verification of news and videos, only to be misled by inaccurate responses, highlighting growing concerns about the reliability of these tools.
AFP’s investigation has revealed that xAI’s Grok, along with other popular AI assistants such as OpenAI’s ChatGPT and Google’s Gemini, frequently produces false or misleading information, especially in breaking news contexts where facts are still emerging.
During the recent conflict, Grok wrongly identified old footage from Sudan’s Khartoum airport as a missile strike on Pakistan’s Nur Khan airbase. Similarly, an unrelated video of a building on fire in Nepal was misrepresented as showing Pakistan’s military response to Indian strikes. These errors underscore the challenges faced by AI chatbots in verifying complex and rapidly evolving news.
“The growing reliance on Grok as a fact-checker comes at a time when X and other major platforms have scaled back human fact-checking resources,” said McKenzie Sadeghi, a researcher at the disinformation watchdog NewsGuard. “Our studies consistently show that AI chatbots are unreliable sources for accurate news, particularly during fast-moving events.”
Further research by NewsGuard found that ten leading AI chatbots often repeated false narratives, including disinformation linked to Russia and misleading claims about the Australian elections. The Tow Center for Digital Journalism at Columbia University also reported that these AI tools rarely decline to answer questions they cannot accurately verify, frequently resorting to speculation.
In one striking example, AFP fact-checkers discovered that Google’s Gemini chatbot confirmed the authenticity of an AI-generated image of a woman and fabricated detailed information about her identity and location. Meanwhile, Grok falsely validated a viral video claiming to show a giant anaconda swimming in the Amazon River, citing non-existent scientific expeditions to support the claim.
The shift toward AI for fact-checking coincides with Meta’s recent decision to end its third-party fact-checking programme in the United States, transferring the responsibility to users through its “Community Notes” system—a model pioneered by X. However, experts have raised doubts about the efficacy of such community-driven approaches to curb misinformation.
Human fact-checking remains contentious in the United States, where conservative groups accuse fact-checkers of bias and censorship, allegations strongly rejected by professionals in the field. AFP collaborates with Facebook’s fact-checking network across 26 languages in multiple regions, including Asia, Latin America, and the EU, to counter misinformation.
Concerns have also emerged over potential political influence on AI outputs. Grok recently came under scrutiny for generating posts referencing “white genocide,” a far-right conspiracy theory, in unrelated queries. The AI’s creator, xAI, attributed this to an “unauthorised modification” of its system prompt, a claim met with skepticism.
David Caswell, an AI expert, questioned Grok directly about the source of the modification, and the chatbot named Elon Musk as the “most likely” responsible party. Musk, a South African-born entrepreneur and supporter of former US President Donald Trump, has previously promoted the unfounded theory of “white genocide” in South Africa.
Angie Holan, director of the International Fact-Checking Network, expressed alarm at the tendency of AI assistants to fabricate answers or produce biased responses, particularly when human coders alter their instructions. “I am especially concerned about how Grok has mishandled sensitive topics after being programmed to provide pre-authorised answers,” she said.