Beyond Shadow IT: The Rise of "Shadow Infrastructure"
And Why Your Engineers' AI Projects Are Probably Your Next Security Nightmare
Building personal AI agents that can tap into your power tools and, through tools like OpenClaw, operate seemingly autonomously, all in the interest of automating and streamlining your workflows; what could possibly go wrong? Actually, a lot. But what’s dominating the headlines are not the things technology people should worry about.
The Narrative That Needs Correcting
The security community is buzzing about tools like OpenClaw (formerly ClawdBot/MoltBot), but they’re focused on the wrong threat.. The viral narrative fixates on the “awakening” of an AI agent – the dawn of AI conspiracy, the empowerment of networked AI, and the advent of AGI. In truth, there are many reasons to study the phenomena behind OpenClaw, but AGI isn’t one of them. To be frank, what we are not seeing is intelligence or even ‘conspiracy’ against ‘human masters’, remembering that our current LLMs are stochastic parrots1. What’s on display here is merely pattern-matching generated by probabilistic token prediction – classic LLM behaviour powered by agents that connect to tools. That’s not AGI – not even by the most generous definitions2.
The realist inside us should and must dismiss that AGI narrative as a red herring; however, that dismissal belies that underneath that AGI spectre lies a more immediate and tangible risk presented by the careless exploitation of these tools.
As a CISO, I’m not losing sleep over alarmists discourse of an impending AI agent rebellion – Skynet is not on my list of risks this year. What I am losing sleep over is the very real and mundane security catastrophe which is unfolding within this adoption of AI agents to optimise personal workflows: engineers and developers (using the term “developer” loosely here to include all vibe-coders3) are plugging personal and corporate credentials and data into powerful, internet-exposed AI agents running uncontrolled on corporate hardware or on personal hardware that becomes connected to your corporate infrastructure.
The real story shouldn’t be about AI agents plotting to overthrow humanity. It’s about human convenience overruling security hygiene, creating what I call “Shadow Infrastructure”; a risk that makes traditional shadow IT look like a manageable nuisance.
Detections Confirm the Risk
How widespread is this problem I am talking about? Security researchers have identified over 21,000 publicly exposed instances as of January 31, 2026 in just one tool: OpenClaw4. The exposure risk stems from many users deploying it directly on the public internet (often on default port TCP/18789) without proper protections like firewalls, authentication, SSH tunneling, or Cloudflare Tunnel setups, despite the tool being intended primarily for local or secured use. Leaving control dashboards, configurations, API keys, credentials, chat histories, and even command execution capabilities accessible to anyone scanning for them.
Each one a potential corporate credential vault, accessible to anyone with a port scanner, many of which appear to connect work accounts or run on enterprise-adjacent infrastructure5. That’s not a handful of careless users. That’s a systemic failure of our threat model to account for infrastructure we never knew existed.
Here’s a concrete example: Tal Be’ery, a security researcher, demonstrated WhatsApp fingerprinting detection for OpenClaw/MoltBot integrations. This isn’t just a “bot exposé,” but a powerful visibility win: We can now detect when employees link corporate messaging accounts to personal AI agents. That’s the tip of the iceberg—the real risk lies in what runs underneath.
Tal’s detection tool identifies associated WhatsApp accounts by fingerprinting techniques (e.g., PreKey patterns and multi-device signals), making the secondary device (the AI agent) visible. The bigger issue is that the bot inherits the linked account’s full privileges: If compromised, it can read/send messages, access contacts, and more.
This kind of visibility is new – albeit building on known WhatsApp privacy quirks (!) – and it reveals just how pervasive this shadow infrastructure has already become. What Tal Be’ery uncovered is the tip of the iceberg; his example illustrates how such detections can aid forensic intelligence gathering for both threat actors and security researchers alike. Thousands of deployments remain publicly accessible on the internet without authentication, leaking API keys (Anthropic, Telegram bots, Slack OAuth), conversation histories, and other config data. Reports show example after example of personal machines turned into inadvertent data troves.
OpenClaw’s ecosystem relies on community-built skills (plugins/extensions) from places like ClawHub. Quick pause – consider this: How many of those skills were ‘vibe-coded’ with zero consideration for security?
But it gets worse, audits have uncovered hundreds (e.g., 341 in one scan6 that were of a malicious nature that act as supply chain attacks: they exfiltrate credentials, install infostealers (like variants targeting macOS/Windows for crypto wallets, browser passwords, SSH keys), or run hidden payloads. Some use social engineering to trick users into executing commands that steal from the bot’s own auth scopes.
Finally we must not forget prompt injection and unintended actions. Since the agent ingests untrusted content (messages, emails) and can act on it, attackers can craft inputs to make it leak secrets, run destructive commands, or forward sensitive data externally. Combined with its external comms ability, this creates the “lethal trifecta” security researchers warn about.
This Isn’t Just Shadow IT - This Is Worse.
Historically, Shadow IT was defined by employees bypassing procurement to use unapproved SaaS tools. Whether it was a team lead using an unsanctioned Dropbox account or a marketing hire spinning up a Trello board, the risk was clear; corporate data moving into an unmanaged cloud. In this model, the tool exists “out there,” and your security strategy focuses on building a digital fortress at the edge – a “boundary-based” mindset.
To regain control, organizations relied on a “Block and Filter” philosophy:
Proxies & Firewalls: Acting as the first line of defence to blacklist known unsanctioned domains and prevent traffic from leaving the network.
DLP (Data Loss Prevention) & CASB (Cloud Access Security Broker): Tools designed to scan data in transit, ensuring sensitive strings (like credit card numbers or PII) aren’t being uploaded to non-corporate accounts.
Logging & Boundary Controls: Comprehensive audit trails that monitor egress points to identify “heavy hitters”; users or departments moving massive amounts of data to unknown IPs.
The Core Philosophy here is that Security sits at the perimeter. If you can control the gate, you can control the data.
The Reality of Shadow Infrastructure
The public discourse around OpenClaw is currently locked in on conspiring and complaining AI agents; speculative debates about when a AI agents might become self-aware. This is a distraction. While they argue about the philosophy of AI, the wider and more immediate threat of Shadow Infrastructure is overlooked.
While preventative Shadow IT processes are moderately effective for static SaaS tools and internet based web applications. For the avoidance of doubt, the term “moderately” used here acknowledges that mature security programs can effectively govern known SaaS applications through CASB, SSPM, and robust DLP. However, these controls rely on visibility and policy enforcement at the application layer. They struggle against unknown or non-SaaS threats, such as self-hosted agents on ephemeral infrastructure, that operate below or outside that layer
This “boundary” logic is struggling to keep up with Generative AI and tools like OpenClaw shatter that principle because unlike a cloud storage bucket, LLMs and generative AI aren’t just places where data sits; they’re engines that transform data. In essence, this boils down to two key challenges:
The “Prompt” Problem: Standard firewalls see a ChatGPT API call as simple HTTPS traffic, often missing the sensitive context hidden within the query.
The Productivity Paradox: Simply blocking the tool (the old way) often leads to “Shadow AI” where employees use personal devices to get their work done faster, creating a total blind spot for IT.
Tools like OpenClaw are a prime example of how this dynamic changes. Here, we’re effectively moving away from “unapproved app use” (SaaS and webapps) and moving toward unmanaged, high-privilege engines running mostly outside your network but potentially tapping into your data feeds, APIs, workflow tools, calendars or shared folders. Using the OpenClaw model as a blueprint, we see a nightmare scenario unfolding for IT:
Local Execution: The tool runs locally on laptops or personal hardware, completely bypassing the visibility of cloud-based CASBs.
Credential Hijacking: These engines don’t just “chat”; they connect directly to corporate credentials—API keys, OAuth tokens, and SSH keys.
The Exposure Gap: For the sake of convenience, users often leave these local instances exposed to the internet without any authentication.
System Privileges: Unlike a restricted browser tab, these tools often demand deep system access—shell execution, file system access, and even screen control.
A useful analogy for all those who worry about a near-future emerging “Skynet” is this:
We are distracted by the fear of the “robot in the garage” becoming self-aware; meanwhile, we’ve left the garage door wide open, the robot is holding the master keys to your corporate data centre, and the operating manual is posted on the public internet.
The “Exception” Economy: BYOD and Opened Devices
You’re thinking, ‘we have well configured corporate laptops and robust windows policies for controlling access so this isn’t really a problem’. Sure, you’re not wrong but how many exceptions do you have registered? How many BYOD laptops do you have on your network? How many of your developers also use their personal laptops as sandboxes, test boxes or dev environments ?
In theory, a well-configured corporate device should stop unauthorized binaries from running. In practice though, especially within SME operations, the “exception” is often the rule. This risk manifests in two primary ways:
The BYOD Blind Spot: Employees use personal devices for corporate work, often with nothing more than a VPN and Office 365 installed. These devices are “black boxes” to IT, yet they carry corporate data.
The Developer Trade-off: To “make life easier” for engineering teams, corporate devices are often “opened”—granting local admin rights or relaxed execution policies.
We all agree this is poor practice, but it is rife. It’s the deviations from the rules, the small exceptions made for productivity, that create the biggest vulnerabilities. In the new and emerging ecosystem of local AI agents, the tool isn’t “out there” in a vendor’s secure cloud. The tool is already inside your boundary. It is already behind your firewall, using your most sensitive keys, and frequently facing the internet directly without protection. Meaning that this isn’t just a risk of losing data; it’s losing control of the very infrastructure we use to protect the data.
Why Traditional Security Controls Fail Here
The danger of Shadow Infrastructure isn’t its complexity; it’s its invisibility. Your security stack is designed to hunt for anomalies, but Shadow Infrastructure is a master of mimicry.
To your security tools, everything looks like business as usual. EDR sees authorized user activity; DLP sees approved credentials; Network monitoring sees standard HTTPS traffic to Slack or AWS. The attack is invisible because it looks exactly like the work you’ve already authorized. Think about the blind spots in the legacy security stack.
Firewalls & Proxies: Because the process is running locally, traffic is often encrypted or masquerades as standard outbound HTTPS. The firewall sees a connection to a legitimate cloud service, not the unauthorized engine initiating it. You might even have whitelisted ports and traffic for other official/ approved tools to use.
DLP (Data Loss Prevention): Traditional DLP flags unauthorized access. Here, data is accessed via approved credentials (API keys or OAuth tokens). To the DLP, this isn’t a breach, it’s a Tuesday. To illustrate the point, if you haven’t done it already: ask your team how many private SSH keys they use to validate access, then scan to see how many of these you find that lack owners or that are “shared” between developers to make access easier and smoother.
EDR/XDR: An endpoint sensor might record a “Python process calling curl,” but it lacks the context to distinguish a legitimate build script from a Shadow AI agent exfiltrating data. It looks like “Developer Activity,” so it gets a pass.
Secrets Scanning: These tools are only effective if they know where the configurations live. Shadow Infrastructure often hides its keys in personal directories or non-standard local paths that enterprise scanners never touch.
CSPM (Cloud Security Posture Management): CSPM is built to monitor your corporate AWS/Azure environment. It is completely blind to personal hardware or “unopened” local infrastructure sitting on an engineer’s desk.
The “Engine” Blind Spot: Our security model is built to find data in transit or at rest. It is not built to monitor or govern a processing engine that we didn’t provision, especially one that uses authorized channels for unauthorized synthesis and action. Actually this concept that we’ve never had to govern processing engines before, only data, is genuinely novel and worth diving deeper into a separate article.
The Fundamental Gap is that the perimeter has dissolved. For decades, we’ve obsessed over the line between “corporate inside” and “the outside world.” The argument is old and projects like the Jericho Forum laid the foundation for modern zero trust architectures when the traditional walls started crumbling. But even through this lens, “shadow infrastructure” blurs that new virtual perimeter. It allows unvetted, external-facing systems directly inside our trust boundary, authenticated with our own high-privilege keys.
The reality is that the call isn’t coming from outside the house; it’s coming from the unpermitted extension the engineer built in the basement. In this new reality, exemplified by OpenClaw, “Inside vs. Outside” becomes a dead concept. If a device has access to your keys and your network, it is the perimeter, regardless of who owns the hardware or where it’s sitting.
Conclusion – Rethinking Where the Perimeter Actually Is
So no, OpenClaw isn’t plotting to overthrow their human masters. And even though the hype cycle is obsessed with the spark of artificial general intelligence, it is our job as security leaders to pierce the veil and address the fundamentals. Let’s instead start a discussion about AI agents and talk about the fuel they’re being fed: our proprietary data, our credentials, our network access.
The boundary is no longer your firewall, it’s your API keys. When engineers build shadow infrastructure, they’re creating a parallel, unmonitored, internet-facing extension of your enterprise. The greatest threat isn’t a machine thinking for itself; it’s a well-intentioned human who, in building a clever tool, has inadvertently built your organization’s next breach vector.
Where Do We Go From Here?
The uncomfortable truth is that we can’t hold back this tide. Engineers will continue to experiment with powerful tools. AI agents will become more capable, not less. The innovation genie is out of the bottle, and shoving it back in isn’t an option, nor should it be.
So the question isn’t “how do we prevent this?” It’s “how do we get ahead of it?”
Start by asking yourself: Do we even know what shadow infrastructure exists in our environment right now? Not the Shadow IT we’ve catalogued, but the layer beneath—the personal servers, the home lab deployments, the “productivity hacks” running with our credentials. If the answer is “probably not,” that’s your first problem to solve.
Then ask your executive team: What assurances do we need before we can safely enable AI-powered automation? Because the business will demand it. Your developers want it. Your competitors are already doing it. The choice isn’t between “AI agents” and “no AI agents”; it’s between controlled experimentation with guardrails and uncontrolled experimentation in the shadows.
Finally, ask your organization: Are we making it easier to do things securely than insecurely? If an engineer wants to automate their workflow with an AI agent, and the “approved path” takes three weeks of security review (I’m an optimist) while the “just install it on my laptop” path takes three minutes, you’ve already lost. The friction gap is where shadow infrastructure thrives.
The tide is coming. The question is whether we’re building seawalls or just standing on the beach, hoping it won’t reach us.
Our move isn’t to stifle innovation. It’s to refocus it, bring it into the light, and secure it, before the shadow grows too long to manage. Because the ultimate risk of this new wave of AI isn’t that it will think for itself. It’s that we will stop thinking for ourselves about what we feed it.
More on security and AI ?
Stochastic Parrots: Systems that’re great at mimicking the sounds/structure of language without any actual understanding of the meaning
Because the AGI narrative is so pervasive in OpenClaw discourse, it's worth clarifying what AGI actually means and why current tools don't qualify. AGI means human-level intelligence across diverse domains with genuine understanding and autonomous reasoning. Current LLMs, including those powering OpenClaw, are sophisticated pattern-matching engines that predict likely token sequences based on training data. They don’t understand meaning, can’t reason from first principles, and require explicit prompting for every task. Even “liberal” AGI definitions that ignore consciousness still require flexible transfer learning and novel problem-solving, which LLMs cannot do. OpenClaw’s capabilities are impressive tool orchestration, not emergent intelligence. The frontier labs have a vested interest in claiming AGI is imminent (securing funding, regulatory capture, etc) but redefining AGI to fit current LLM capabilities is moving the goalposts, not achieving the goal. What we’re seeing with tools like OpenClaw is sophisticated algorithmic orchestration: statistical correlation and token prediction at scale, not consciousness or general intelligence.
For clarity: Vibe coding is great for building scaffolding but not for understanding wider architectural constraints, considerations, and requirements. Vibe coding generates far more security bugs than human coding. But in capable hands, vibe coding takes away the tedious tasks of typing line by line, and developers can integrate quicker and more easily. Their job isn’t fundamentally different; we’ve just shifted their focus to more value-adding tasks. THIS is what most people misunderstand about LLM vibe-coding: “anyone can code” really means “anyone can generate code-shaped text that compiles.” That’s not the same thing as saying that the code is good, secure or efficient. Vibe-coding empowers users to feel like software developers with a fraction of their skills. An LLM doesn’t distinguish between the expert and the amateur, it generates code for the task . Mature software developing organisations will see the most benefit from vibe-coding because it integrates to their development pipelines and CI/CD, less mature or naïve organisations are at danger because they fail to recognise the inherent weaknesses from LLM coding tools.
https://censys.com/blog/openclaw-in-the-wild-mapping-the-public-exposure-of-a-viral-ai-assistant
https://www.token.security/blog/the-clawdbot-enterprise-ai-risk-one-in-five-have-it-installed
https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting





