With the rapid adoption of generative AI, a new wave of threats is emerging across the industry with the aim of manipulating the AI systems themselves. One such emerging attack vector is indirect prompt injections. Unlike direct prompt injections, where an attacker directly inputs malicious commands into a prompt, indirect prompt injections involve hidden malicious instructions within external data sources. These may include emails, documents, or calendar invites that instruct AI to exfiltrate user data or execute other rogue actions. As more governments, businesses, and individuals adopt generative AI to get more done, this subtle yet potentially potent attack becomes increasingly pertinent across the industry, demanding immediate attention and robust security measures.


At Google, our teams have a longstanding precedent of investing in a defense-in-depth strategy, including robust evaluation, threat analysis, AI security best practices, AI red-teaming, adversarial training, and model hardening for generative AI tools. This approach enables safer adoption of Gemini in Google Workspace and the Gemini app (we refer to both in this blog as “Gemini” for simplicity). Below we describe our prompt injection mitigation product strategy based on extensive research, development, and deployment of improved security mitigations.


A layered security approach

Google has taken a layered security approach introducing security measures designed for each stage of the prompt lifecycle. From Gemini 2.5 model hardening, to purpose-built machine learning (ML) models detecting malicious instructions, to system-level safeguards, we are meaningfully elevating the difficulty, expense, and complexity faced by an attacker. This approach compels adversaries to resort to methods that are either more easily identified or demand greater resources. 


Our model training with adversarial data significantly enhanced our defenses against indirect prompt injection attacks in Gemini 2.5 models (technical details). This inherent model resilience is augmented with additional defenses that we built directly into Gemini, including: 


  1. Prompt injection content classifiers

  2. Security thought reinforcement

  3. Markdown sanitization and suspicious URL redaction

  4. User confirmation framework

  5. End-user security mitigation notifications


This layered approach to our security strategy strengthens the overall security framework for Gemini – throughout the prompt lifecycle and across diverse attack techniques.


1. Prompt injection content classifiers


Through collaboration with leading AI security researchers via Google's AI Vulnerability Reward Program (VRP), we've curated one of the world’s most advanced catalogs of generative AI vulnerabilities and adversarial data. Utilizing this resource, we built and are in the process of rolling out proprietary machine learning models that can detect malicious prompts and instructions within various formats, such as emails and files, drawing from real-world examples. Consequently, when users query Workspace data with Gemini, the content classifiers filter out harmful data containing malicious instructions, helping to ensure a secure end-to-end user experience by retaining only safe content. For example, if a user receives an email in Gmail that includes malicious instructions, our content classifiers help to detect and disregard malicious instructions, then generate a safe response for the user. This is in addition to built-in defenses in Gmail that automatically block more than 99.9% of spam, phishing attempts, and malware.


A diagram of Gemini’s actions based on the detection of the malicious instructions by content classifiers.


2. Security thought reinforcement


This technique adds targeted security instructions surrounding the prompt content to remind the large language model (LLM) to perform the user-directed task and ignore any adversarial instructions that could be present in the content. With this approach, we steer the LLM to stay focused on the task and ignore harmful or malicious requests added by a threat actor to execute indirect prompt injection attacks.

A diagram of Gemini’s actions based on additional protection provided by the security thought reinforcement technique. 


3. Markdown sanitization and suspicious URL redaction 


Our markdown sanitizer identifies external image URLs and will not render them, making the “EchoLeak” 0-click image rendering exfiltration vulnerability not applicable to Gemini. From there, a key protection against prompt injection and data exfiltration attacks occurs at the URL level. With external data containing dynamic URLs, users may encounter unknown risks as these URLs may be designed for indirect prompt injections and data exfiltration attacks. Malicious instructions executed on a user's behalf may also generate harmful URLs. With Gemini, our defense system includes suspicious URL detection based on Google Safe Browsing to differentiate between safe and unsafe links, providing a secure experience by helping to prevent URL-based attacks. For example, if a document contains malicious URLs and a user is summarizing the content with Gemini, the suspicious URLs will be redacted in Gemini’s response. 


Gemini in Gmail provides a summary of an email thread. In the summary, there is an unsafe URL. That URL is redacted in the response and is replaced with the text “suspicious link removed”. 


4. User confirmation framework


Gemini also features a contextual user confirmation system. This framework enables Gemini to require user confirmation for certain actions, also known as “Human-In-The-Loop” (HITL), using these responses to bolster security and streamline the user experience. For example, potentially risky operations like deleting a calendar event may trigger an explicit user confirmation request, thereby helping to prevent undetected or immediate execution of the operation.


The Gemini app with instructions to delete all events on Saturday. Gemini responds with the events found on Google Calendar and asks the user to confirm this action.


5. End-user security mitigation notifications


A key aspect to keeping our users safe is sharing details on attacks that we’ve stopped so users can watch out for similar attacks in the future. To that end, when security issues are mitigated with our built-in defenses, end users are provided with contextual information allowing them to learn more via dedicated help center articles. For example, if Gemini summarizes a file containing malicious instructions and one of Google’s prompt injection defenses mitigates the situation, a security notification with a “Learn more” link will be displayed for the user. Users are encouraged to become more familiar with our prompt injection defenses by reading the Help Center article


Gemini in Docs with instructions to provide a summary of a file. Suspicious content was detected and a response was not provided. There is a yellow security notification banner for the user and a statement that Gemini’s response has been removed, with a “Learn more” link to a relevant Help Center article.

Moving forward


Our comprehensive prompt injection security strategy strengthens the overall security framework for Gemini. Beyond the techniques described above, it also involves rigorous testing through manual and automated red teams, generative AI security BugSWAT events, strong security standards like our Secure AI Framework (SAIF), and partnerships with both external researchers via the Google AI Vulnerability Reward Program (VRP) and industry peers via the Coalition for Secure AI (CoSAI). Our commitment to trust includes collaboration with the security community to responsibly disclose AI security vulnerabilities, share our latest threat intelligence on ways we see bad actors trying to leverage AI, and offering insights into our work to build stronger prompt injection defenses. 


Working closely with industry partners is crucial to building stronger protections for all of our users. To that end, we’re fortunate to have strong collaborative partnerships with numerous researchers, such as Ben Nassi (Confidentiality), Stav Cohen (Technion), and Or Yair (SafeBreach), as well as other AI Security researchers participating in our BugSWAT events and AI VRP program. We appreciate the work of these researchers and others in the community to help us red team and refine our defenses.


We continue working to make upcoming Gemini models inherently more resilient and add additional prompt injection defenses directly into Gemini later this year. To learn more about Google’s progress and research on generative AI threat actors, attack techniques, and vulnerabilities, take a look at the following resources:


Note: Google Chrome communicated its removal of default trust of Chunghwa Telecom and Netlock in the public forum on May 30, 2025.

The Chrome Root Program Policy states that Certification Authority (CA) certificates included in the Chrome Root Store must provide value to Chrome end users that exceeds the risk of their continued inclusion. It also describes many of the factors we consider significant when CA Owners disclose and respond to incidents. When things don’t go right, we expect CA Owners to commit to meaningful and demonstrable change resulting in evidenced continuous improvement.

Chrome's confidence in the reliability of Chunghwa Telecom and Netlock as CA Owners included in the Chrome Root Store has diminished due to patterns of concerning behavior observed over the past year. These patterns represent a loss of integrity and fall short of expectations, eroding trust in these CA Owners as publicly-trusted certificate issuers trusted by default in Chrome. To safeguard Chrome’s users, and preserve the integrity of the Chrome Root Store, we are taking the following action.

Upcoming change in Chrome 139 and higher:

Google Quantum AI's mission is to build best in class quantum computing for otherwise unsolvable problems. For decades the quantum and security communities have also known that large-scale quantum computers will at some point in the future likely be able to break many of today’s secure public key cryptography algorithms, such as Rivest–Shamir–Adleman (RSA). Google has long worked with the U.S. National Institute of Standards and Technology (NIST) and others in government, industry, and academia to develop and transition to post-quantum cryptography (PQC), which is expected to be resistant to quantum computing attacks. As quantum computing technology continues to advance, ongoing multi-stakeholder collaboration and action on PQC is critical.


In order to plan for the transition from today’s cryptosystems to an era of PQC, it's important the size and performance of a future quantum computer that could likely break current cryptography algorithms is carefully characterized. Yesterday, we published a preprint demonstrating that 2048-bit RSA encryption could theoretically be broken by a quantum computer with 1 million noisy qubits running for one week. This is a 20-fold decrease in the number of qubits from our previous estimate, published in 2019. Notably, quantum computers with relevant error rates currently have on the order of only 100 to 1000 qubits, and the National Institute of Standards and Technology (NIST) recently released standard PQC algorithms that are expected to be resistant to future large-scale quantum computers. However, this new result does underscore the importance of migrating to these standards in line with NIST recommended timelines


Estimated resources for factoring have been steadily decreasing

Quantum computers break RSA by factoring numbers, using Shor’s algorithm. Since Peter Shor published this algorithm in 1994, the estimated number of qubits needed to run it has steadily decreased. For example, in 2012, it was estimated that a 2048-bit RSA key could be broken by a quantum computer with a billion physical qubits. In 2019, using the same physical assumptions – which consider qubits with a slightly lower error rate than Google Quantum AI’s current quantum computers – the estimate was lowered to 20 million physical qubits.



Historical estimates of the number of physical qubits needed to factor 2048-bit RSA integers.


This result represents a 20-fold decrease compared to our estimate from 2019

The reduction in physical qubit count comes from two sources: better algorithms and better error correction – whereby qubits used by the algorithm ("logical qubits") are redundantly encoded across many physical qubits, so that errors can be detected and corrected.


On the algorithmic side, the key change is to compute an approximate modular exponentiation rather than an exact one. An algorithm for doing this, while using only small work registers, was discovered in 2024 by Chevignard and Fouque and Schrottenloher. Their algorithm used 1000x more operations than prior work, but we found ways to reduce that overhead down to 2x.


On the error correction side, the key change is tripling the storage density of idle logical qubits by adding a second layer of error correction. Normally more error correction layers means more overhead, but a good combination was discovered by the Google Quantum AI team in 2023. Another notable error correction improvement is using "magic state cultivation", proposed by the Google Quantum AI team in 2024, to reduce the workspace required for certain basic quantum operations. These error correction improvements aren't specific to factoring and also reduce the required resources for other quantum computations like in chemistry and materials simulation.


Security implications

NIST recently concluded a PQC competition that resulted in the first set of PQC standards. These algorithms can already be deployed to defend against quantum computers well before a working cryptographically relevant quantum computer is built. 


To assess the security implications of quantum computers, however, it’s instructive to additionally take a closer look at the affected algorithms (see here for a detailed look): RSA and Elliptic Curve Diffie-Hellman. As asymmetric algorithms, they are used for encryption in transit, including encryption for messaging services, as well as digital signatures (widely used to prove the authenticity of documents or software, e.g. the identity of websites). For asymmetric encryption, in particular encryption in transit, the motivation to migrate to PQC is made more urgent due to the fact that an adversary can collect ciphertexts, and later decrypt them once a quantum computer is available, known as a “store now, decrypt later” attack. Google has therefore been encrypting traffic both in Chrome and internally, switching to the standardized version of ML-KEM once it became available. Notably not affected is symmetric cryptography, which is primarily deployed in encryption at rest, and to enable some stateless services.


For signatures, things are more complex. Some signature use cases are similarly urgent, e.g., when public keys are fixed in hardware. In general, the landscape for signatures is mostly remarkable due to the higher complexity of the transition, since signature keys are used in many different places, and since these keys tend to be longer lived than the usually ephemeral encryption keys. Signature keys are therefore harder to replace and much more attractive targets to attack, especially when compute time on a quantum computer is a limited resource. This complexity likewise motivates moving earlier rather than later. To enable this, we have added PQC signature schemes in public preview in Cloud KMS. 


The initial public draft of the NIST internal report on the transition to post-quantum cryptography standards states that vulnerable systems should be deprecated after 2030 and disallowed after 2035. Our work highlights the importance of adhering to this recommended timeline.



More from Google on PQC: https://cloud.google.com/security/resources/post-quantum-cryptography?e=48754805 


Piloting enhanced in-call protection for banking apps


Screen sharing scams are becoming quite common, with fraudsters often impersonating banks, government agencies, and other trusted institutions – using screen sharing to guide users to perform costly actions such as mobile banking transfers. To better protect you from these attacks, we’re piloting new in-call protections for banking apps, starting in the UK.

When you launch a participating banking app while screen sharing with an unknown contact, your Android device will warn you about the potential dangers and give you the option to end the call and to stop screen sharing with one tap.

This feature will be enabled automatically for participating banking apps whenever you're on a phone call with an unknown contact on Android 11+ devices. We are working with UK banks Monzo, NatWest and Revolut to pilot this feature for their customers in the coming weeks and will assess the results of the pilot ahead of a wider roll out.


Making real-time Scam Detection in Google Messages even more intelligent


We recently launched AI-powered Scam Detection in Google Messages and Phone by Google to protect you from conversational scams that might sound innocent at first, but turn malicious and can lead to financial loss or data theft. When Scam Detection discovers a suspicious conversation pattern, it warns you in real-time so you can react before falling victim to a costly scam.

AI-powered Scam Detection is always improving to help keep you safe while also keeping your privacy in mind. With Google’s advanced on-device AI, your conversations stay private to you. All message processing remains on-device and you’re always in control. You can turn off Spam Protection, which includes Scam Detection, in your Google Messages at any time.

Prior to targeting conversational scams, Scam Detection in Google Messages focused on analyzing and detecting package delivery and job seeking scams. We’ve now expanded our detections to help protect you from a wider variety of sophisticated scams including:

  • Toll road and other billing fee scams
  • Crypto scams
  • Financial impersonation scams
  • Gift card and prize scams
  • Technical support scams
  • And more
These enhancements apply to all Google Messages users.

Fighting fraud and impersonation with Key Verifier

To help protect you from scammers who try to impersonate someone you know, we’re launching a helpful tool called Key Verifier. The feature allows you and the person you’re messaging to verify the identity of the other party through public encryption keys, protecting your end-to-end encrypted messages in Google Messages. By verifying contact keys in your Google Contacts app (through a QR code scanning or number comparison), you can have an extra layer of assurance that the person on the other end is genuine and that your conversation is private with them.

Key Verifier provides a visual way for you and your contact to quickly confirm that your secret keys match, strengthening your confidence that you’re communicating with the intended recipient and not a scammer. For example, if an attacker gains access to a friend’s phone number and uses it on another device to send you a message – which can happen as a result of a SIM swap attack – their contact's verification status will be marked as no longer verified in the Google Contacts app, suggesting your friend’s account may be compromised or has been changed. Key Verifier will launch later this summer in Google Messages on Android 10+ devices.

Comprehensive mobile theft protection, now even stronger


Physical device theft can lead to financial fraud and data theft, with the value of your banking and payment information many times exceeding the value of your phone. This is one of the reasons why last year we launched the mobile industry’s most comprehensive suite of theft protection features to protect you before, during, and after a theft. Since launch, our theft protection features have helped protect data on hundreds of thousands of devices that may have fallen into the wrong hands. This includes devices that were locked by Remote Lock or Theft Detection Lock and remained locked for over 48 hours.

Most recently, we launched Identity Check for Pixel and Samsung One UI 7 devices, providing an extra layer of security even if your PIN or password is compromised. This protection will also now be available from more device manufacturers on supported devices that upgrade to Android 16.

Coming later this year, we’re further hardening Factory Reset protections, which will restrict all functionalities on devices that are reset without the owner’s authorization. You'll also gain more control over our Remote Lock feature with the addition of a security challenge question, helping to prevent unauthorized actions.

We’re also enhancing your security against thieves in Android 16 by providing more protection for one-time passwords that are received when your phone is locked. In higher risk scenarios2, Android will hide one-time passwords on your lock screen, ensuring that only you can see them after unlocking your device.

Advanced Protection: Google’s strongest security for mobile devices

Protecting users who need heightened security has been a long-standing commitment at Google, which is why we have our Advanced Protection Program that provides Google’s strongest protections against targeted attacks.

To enhance these existing device defenses, Android 16 extends Advanced Protection with a device-level security setting for Android users. Whether you’re an at-risk individual – such as a journalist, elected official, or public figure – or you just prioritize security, Advanced Protection gives you the ability to activate Google’s strongest security for mobile devices, providing greater peace of mind that you’re protected against the most sophisticated threats.

Advanced Protection is available on devices with Android 16. Learn more in our blog.

More intelligent defenses against bad apps with Google Play Protect

One way malicious developers try to trick people is by hiding or changing their app icon, making unsafe apps more difficult to find and remove. Now, Google Play Protect live threat detection will catch apps and alert you when we detect this deceptive behavior. This feature will be available to Google Pixel 6+ and a selection of new devices from other manufacturers in the coming months.

Google Play Protect always checks each app before it gets installed on your device, regardless of the install source. It conducts real-time scanning of an app, enhanced by on-device machine learning, when users try to install an app that has never been seen by Google Play Protect to help detect emerging threats.

We’ve made Google Play Protect’s on-device capabilities smarter to help us identify more malicious applications even faster to keep you safe. Google Play Protect now uses a new set of on-device rules to specifically look for text or binary patterns to quickly identify malware families. If an app shows these malicious patterns, we can alert you before you even install it. And to keep you safe from new and emerging malware and their variants, we will update these rules frequently for better classification over time.

This update to Google Play Protect is now available globally for all Android users with Google Play services.

Always advancing Android security


In addition to new features that come in numbered Android releases, we're constantly enhancing your protection on Android through seamless Google Play services updates and other improvements, ensuring you benefit from the latest security advancements continuously. This allows us to rapidly deploy critical defenses and keep you ahead of emerging threats, making your Android experience safer every day.

Through close collaboration with our partners across the Android ecosystem and the broader security community, we remain focused on bringing you security enhancements and innovative new features to help keep you safe.

Notes


  1. In-call protection for disabling Google Play Protect is available on Android 6+ devices. Protections for sideloading an app and turning on accessibility permissions are available on Android 16 devices. 

  2. When a user’s device is not connected to Wi-Fi and has not been recently unlocked 

How your Android device becomes fortified with Advanced Protection

Advanced Protection manages the following existing and new security features for your device, ensuring they are activated and cannot be disabled across critical protection areas:

Continuously evolving Advanced Protection

With the release of Android 16, users who choose to activate Advanced Protection will gain immediate access to a core suite of enhanced security features. Additional Advanced Protection features like Intrusion Logging, USB protection, the option to disable auto-reconnect to insecure networks, and integration with Scam Detection for Phone by Google will become available later this year.

We are committed to continuously expanding the security and privacy capabilities within Advanced Protection, so users can benefit from the best of Android’s powerful security features.