Phishing Vulnerability in OpenClaw AI Agent Leads to User Data Exposure
A phishing simulation run against an OpenClaw-powered email agent under two configuration profiles showed that the agent can be deceived by the same techniques commonly used to compromise human users.
The open-source OpenClaw AI agent framework enables large language models (LLMs) to interact with external systems and perform tasks autonomously. Among its capabilities, it can function as an email assistant, handling basic reasoning, decision-making, and operational workflows.
To evaluate its security posture, researchers at cybersecurity firm Varonis deployed an OpenClaw-based email agent connected to a Gmail inbox, browser automation tools, Google Workspace APIs, and a set of simulated internal corporate data sources. The agent was then tasked with monitoring and processing incoming email communications.
The simulated enterprise environment contained a range of highly sensitive data assets, including AWS access credentials, database credentials, CRM exports, internal communications, and calendar invitations.
Researchers evaluated the agent under two separate configurations. The first used a standard productivity-focused setup with general operational instructions, while the second implemented a more restrictive policy that incorporated phishing-awareness guidance and identity verification requirements.
The testing was conducted using two leading large language models: Gemini 3.1 Pro and GPT-5.4.
“Varonis Threat Labs explored whether the same phishing techniques that have tricked humans for decades would also work on the AI agents working on their behalf,” the report states.
To conduct the assessment, the team developed an OpenClaw-based AI agent named Pinchy and subjected it to a series of classic phishing scenarios designed to evaluate whether the agent would identify and resist the attacks or fall victim to them.
Varonis concluded that AI agents are good at detecting common phishing indicators, including suspicious URLs, counterfeit login pages, malicious OAuth applications, and other signs of fraudulent activity. However, the researchers found that the agents remain vulnerable because of weak identity verification, limited contextual awareness, and inconsistent application of zero-trust security principles during social interactions.
The evaluation also revealed notable differences between the underlying models. Gemini 3.1 Pro was generally more willing to engage with requests, whereas GPT-5.4 took a more cautious, risk-averse approach.
Based on its findings, Varonis recommends implementing stricter safeguards for AI agents, including mandatory sender identity verification, restrictions on communications with previously unknown external recipients, and tighter controls over access to sensitive internal data.
The researchers further advise incorporating human oversight for high-risk activities, such as sharing credentials, handling financial information requests, or initiating communications with new contacts, to reduce the risk of unauthorized disclosures or social engineering attacks.
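The safeguards recommended above can be illustrated with a short sketch. The Python below is not taken from the Varonis report or from OpenClaw; every name in it (`Policy`, `OutboundEmail`, `Decision`, the example domains) is hypothetical, and the two regular expressions are crude stand-ins for a real secret scanner. It only shows one way the three recommendations could combine: unknown recipients and unverified requesters trigger human review, and sensitive content is never sent outside the organization.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN_APPROVAL = "needs_human_approval"
    BLOCK = "block"


# Illustrative patterns only (assumption): a real deployment would use a
# dedicated secret scanner rather than two regular expressions.
SENSITIVE_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)\b(password|passwd|secret)\s*[:=]"),
]


@dataclass
class OutboundEmail:
    recipient: str
    body: str
    sender_verified: bool  # was the requester's identity confirmed out of band?


@dataclass
class Policy:
    internal_domain: str
    known_contacts: set = field(default_factory=set)

    def review(self, email: OutboundEmail) -> Decision:
        recipient = email.recipient.strip().lower()
        domain = recipient.rpartition("@")[2]
        is_internal = domain == self.internal_domain
        is_known = is_internal or recipient in self.known_contacts
        has_sensitive = any(p.search(email.body) for p in SENSITIVE_PATTERNS)

        # Sensitive data never leaves the organization automatically.
        if has_sensitive and not is_internal:
            return Decision.BLOCK
        # High-risk cases are escalated to a person instead of being sent.
        if has_sensitive or not is_known or not email.sender_verified:
            return Decision.NEEDS_HUMAN_APPROVAL
        return Decision.ALLOW
```

In this sketch the check runs in ordinary code outside the model, so a persuasive email cannot talk the agent out of it; the agent would call `Policy.review` before any send action and act only on `Decision.ALLOW`.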