We’ve documented our observations of the use and abuse of AI, as well as the actions we’ve taken in response, in our new GTIG AI Threat Tracker report. We issue these reports regularly to help improve our collective understanding of the adversarial misuse of AI, and how to safeguard against it.
The new report specifically examines five categories of adversarial misuse of AI:
- Model extraction attacks: These occur when an adversary uses knowledge distillation, a common machine-learning technique for training models, to extract a model’s learned knowledge and transfer it to a model they control (see the sketch after this list). This lets an attacker accelerate AI model development at a significantly lower cost. The IP theft involved is a clear business risk to model developers and enterprises, so organizations that provide AI models as a service should monitor API access for extraction and distillation patterns.
- AI-augmented operations: In the report, we document real-world case studies of how threat groups are streamlining reconnaissance and rapport-building in phishing operations. One consistent finding is that government-backed attackers have increasingly misused Gemini for coding and scripting tasks, gathering information about potential targets, researching publicly known vulnerabilities, and enabling post-compromise activities.
- Agentic AI: Threat actors have begun to develop agentic AI capabilities to support malware and tooling development. Examples of this behavior include prompting Gemini with an expert cybersecurity persona and attempting to build an AI-integrated code-auditing capability.
- AI-integrated malware: New malware families, such as HONESTCUE, are experimenting with using Gemini’s API to generate code that downloads and executes second-stage malware.
- Underground jailbreak ecosystem: Malicious services like Xanthorox are emerging in illicit marketplaces, claiming to be independent models while actually relying on jailbroken commercial APIs and open-source Model Context Protocol (MCP) servers.
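Because the first category hinges on knowledge distillation, a brief illustration may help. The sketch below is a generic, hypothetical example of distillation using tiny PyTorch models; the model sizes, query data, and hyperparameters are all illustrative assumptions, and in a real extraction attack the "teacher" outputs would come from a victim's serving API rather than a local model.

```python
# Minimal sketch of knowledge distillation, the training technique referenced above.
# All models, shapes, and hyperparameters here are hypothetical; in an extraction
# attack, the teacher's outputs would be collected from a victim's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))  # stand-in for the victim model
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))    # smaller model the attacker controls

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0

for step in range(100):
    # Attacker-chosen inputs; against a real service these would be crafted queries.
    queries = torch.randn(64, 32)

    with torch.no_grad():
        teacher_logits = teacher(queries)  # plays the role of the API's responses
    student_logits = student(queries)

    # KL divergence between temperature-softened teacher and student distributions
    # transfers the teacher's behavior into the student.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Roughly speaking, the attacker’s half of this loop is what shows up to a model provider as an extraction pattern: large volumes of systematically varied queries paired with collection of the model’s outputs, which is why monitoring API access for such patterns matters.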
Building AI safely and responsibly
At Google, we are committed to developing AI boldly and responsibly. We are taking proactive steps to disrupt malicious activity by disabling the projects and accounts associated with threat actors, while continuously improving our models to make them less susceptible to misuse. That includes using threat intelligence to disrupt adversary operations.
We also proactively share industry best practices to arm defenders and enable stronger protections across the ecosystem. We recently introduced CodeMender, an experimental AI-powered agent that uses the advanced reasoning capabilities of our Gemini models to automatically fix critical code vulnerabilities. Last year we also began identifying vulnerabilities using Big Sleep, an AI agent developed by Google DeepMind and Google Project Zero.
We believe the industry needs security standards for building and deploying AI responsibly. That’s why we introduced the Secure AI Framework (SAIF), a conceptual framework for securing AI systems, and why we continue working to help ensure AI is built responsibly.
For more on these threat actor behaviors, and the steps we’ve taken to thwart their efforts, you can read the full GTIG AI Threat Tracker: Distillation, Experimentation, and (Continued) Integration of AI for Adversarial Use report here.