24 May 2026: AI Finds 10,000 Security Bugs, NTSB Restricts Crash Data Over Voice Cloning (AM)
Claude Mythos has found 10,000+ critical security flaws in open-source software. Plus: NTSB restricts crash data after AI voice cloning.
AI’s role in cybersecurity shifted gear this week, with Anthropic’s Claude Mythos model locating more than 10,000 high-severity software flaws while a separate incident showed what happens when AI is pointed at sensitive public safety data. OpenAI also published a mathematical result that surprised professional mathematicians, and Google’s post-I/O announcements continued to land. Here is everything worth knowing before Monday morning.
Anthropic’s Claude Mythos model has found more than 10,000 high- or critical-severity software vulnerabilities in a single month, with the UK’s AI Security Institute confirming it is the first AI model to complete both of Britain’s official cyber attack simulation ranges end to end. The findings, published on 22 May, are the most detailed public account yet of what a frontier AI security model can achieve against real codebases.
Project Glasswing, Anthropic’s collaborative security programme, brings together around 50 partners including technology firms, financial institutions, and critical infrastructure operators. Cloudflare found 2,000 bugs in its critical-path systems, 400 classified as high or critical severity, at a false positive rate better than human testers. Mozilla used Mythos during development of Firefox 150 and found 271 vulnerabilities, more than ten times the number surfaced with earlier Claude models on Firefox 148.
The UK angle is direct. The AI Security Institute ran Mythos through its internal cyber evaluation ranges and confirmed it was the first model to complete both autonomously. For UK businesses, Anthropic is scanning more than 1,000 open-source projects and disclosing vulnerabilities to maintainers, some of which underpin services used across the UK. The practical advice is to prioritise patching over the coming weeks. Understanding how AI models are evaluated for safety is increasingly relevant as businesses decide which AI tools to trust with their infrastructure.
America’s transport safety board temporarily closed its entire public accident investigation database after people used AI to reconstruct the voices of pilots killed in a UPS cargo crash from a spectrogram image published in a public hearing docket. The incident is one of the clearest examples yet of how AI can extract sensitive information from data formats previously considered safely opaque.
UPS Flight 2976 crashed near Louisville in November 2025. After a public hearing on 19 and 20 May, the National Transportation Safety Board published a spectrogram of the cockpit voice recorder audio. A spectrogram is a visual chart of sound frequencies over time, not a playable audio file. Within days, people had fed that image into AI voice synthesis tools and published reconstructions of the pilots’ final words. Federal law prohibits releasing cockpit recordings directly, because pilots speak candidly in the cockpit on the understanding those recordings carry legal protections. AI circumvented that protection using tools available to anyone with a browser.
The NTSB restored most database access by Friday but kept 42 dockets sealed pending a review. The episode is relevant for any organisation that publishes data publicly. As AI agents become more capable of acting on information they find, the question of what can be inferred from your public disclosures is no longer theoretical.

An internal OpenAI reasoning model has disproved a geometry conjecture first posed by the mathematician Paul Erdos in 1946, with independent mathematicians confirming the result is genuine. OpenAI published the finding on 20 May, and Fields medallist Tim Gowers described it publicly as a milestone in AI mathematics.
The planar unit distance problem asks how many pairs of points on a flat surface can be exactly one unit apart. For nearly 80 years, mathematicians assumed grid-like arrangements were optimal. OpenAI’s model found a better family of arrangements by connecting the problem to algebraic number theory rather than classical geometry. An external panel confirmed the proof holds. The result matters less as a product launch and more as a signal: if AI can independently produce results that trained researchers have not found in 80 years, the implications for pharmaceutical research, materials science, and other discovery-dependent fields are significant.
Google’s I/O 2026 conference made Gemini 3.5 Flash the new default in Google Search’s AI Mode globally, including in the UK, bringing more capable AI answers into the main search interface without any additional subscription. Google describes the change as the most significant update to the search box in over 25 years.
Gemini 3.5 Flash runs four times faster than comparable frontier models and outperforms Gemini 3.1 Pro on coding and agentic tasks. For UK users, AI summaries in Search are now powered by a meaningfully stronger model. The conference also introduced Gemini Spark, a personal AI agent running continuously on Google Cloud that can handle tasks in Workspace, third-party apps, and the web. Adobe, Canva, and CapCut integrated their tools directly into the Gemini app. Google Antigravity 2.0, the updated platform for building AI agent workflows, is now the main Google-native alternative to Microsoft’s Copilot Studio for UK businesses evaluating agentic tools.
Worth Watching
Best for: Scanning your codebase for vulnerabilities
Anthropic’s new tool finds and flags security flaws in your code, now in public beta for Enterprise customers.
Best for: Automating tasks across Google Workspace
Spark runs 24/7 as a personal AI agent, handling tasks in your apps and the web on your behalf.
Best for: Building custom AI agent workflows
Antigravity 2.0 is Google’s platform for building and orchestrating AI agents using Gemini models.
Here is everything else worth knowing from this weekend’s AI news.
- Zoom’s Anthropic stake disclosed at $1.27 billion — A quarterly SEC filing revealed Zoom’s 2023 investment in Anthropic is now carried at $1.27 billion, based on Anthropic’s $380 billion February valuation. Zoom added a further $46 million during the quarter. [22 May]
- White House approves $9 billion AI chip purchase for spy agencies — The US government approved a $9 billion request to acquire advanced AI chips for intelligence agencies including the CIA and NSA. Anthropic is separately reported to be finalising a classified contract to provide Claude to the NSA. [22 May]
- Qualcomm up 75% in a month on AI device demand — Qualcomm shares rose 12% on Friday and are up 75% over the past month, driven by demand for its AI-capable Snapdragon chips in smartphones and Windows laptops. [23 May]
- Anthropic on the moral formation of AI — Anthropic published details of dialogues with philosophers, clergy, and ethicists on shaping Claude’s values, including an experiment where giving Claude a mid-task ethical reminder tool reduced misaligned behaviour in evaluations. [19 May]
- Google Gemini Omni launched at I/O — Google introduced Gemini Omni, a multimodal model that accepts image, video, audio, and text input and can generate video grounded in real-world knowledge. [19 May]
The main thing to watch heading into the new week is whether the volume of Glasswing vulnerability disclosures outpaces the software industry’s capacity to patch them. Anthropic says the bottleneck is now human triage rather than AI detection, and several open-source maintainers have already asked Glasswing to slow its disclosure rate. If that gap widens, the window for attackers to exploit known flaws before patches land will grow, which is a live risk for any UK organisation running standard open-source infrastructure.
This is a daily news update for informational purposes only. AI products and policies change rapidly. Verify details directly with providers before making decisions. Nothing here is financial or legal advice.
AI Daily is Cristoniq’s daily guide to developments in artificial intelligence, published every morning.