Shadow Code
Mythos compounds the AI security dilemma. But the bigger governance problem is the vibe-coded app that bypassed all your security reviews.
Fact: More than 5,000 AI-built applications that appeared to serve corporate purposes were recently found live on the public internet with little or no authentication. Around two in five contained sensitive corporate, operational, or personal data. The discovery required no zero-days, no exploits, and no Mythos. Just a browser and, in many cases, a search engine.
2026 has been dominated by cybersecurity conversations around artificial intelligence, largely focused on how threat actors can use AI. That remains an important concern. AI helps attackers write more convincing phishing lures, automate reconnaissance, and lower the skill required to carry out certain types of attacks. It has democratised capabilities previously available only to well-funded groups and state threat actors. But as with many novel and emerging technologies, we tend to over-play some things and miss other relevant details. As I recently argued in The Mythos Illusion, the practical, operational threat is already distributed, agentic, and accelerating.1
Why say that? Not because Mythos and its counterparts are unimportant – they are important – but because another side of the AI security problem receives far less attention: ordinary people using AI to build applications in the course of doing their jobs.
Enter Dor Zvi and his team at Red Access, who recently published some striking findings. Zvi’s team analysed thousands of “vibe-coded” web applications created using AI-assisted development tools such as Lovable, Replit, Base44, and Netlify.2 They found more than 5,000 applications that appeared to serve corporate purposes and carried virtually no security or authentication of any kind. Many allowed anyone who discovered the URL to access the application and its data. Others had only trivial barriers, such as requiring a visitor to sign in with any email address. Around 40 percent of those applications contained sensitive corporate, operational, or personal data, including medical information, financial data, corporate strategy documents, and detailed customer chatbot logs.3
This is not merely a story about buggy and unsafe code; we had that long before AI. The story is more fundamental than that.
As Zvi noted, the traditional security review cycle assumes that a developer built and tested the application before it went live. That assumption is no longer as robust as it should be. Vibe coding allows anyone, including non-developers, to describe an idea, generate an application, connect it to real business data, and publish it in minutes. The application can be live before security, IT, legal, or engineering even knows it exists. And AI-generated code looks so polished and professional that inexperienced users assume that because AI is “smart”, it surely would not produce insecure or bad code. That assumption completely misunderstands why security review cycles exist and how they build overall resilience.
The security problem is not only that AI-generated applications may contain vulnerabilities. It is that they may never pass through the development lifecycle, security tooling, or processes where those vulnerabilities would normally be discovered. Such applications frequently see no threat modelling, no code review, no data security review, no access control review, and no senior developer asking whether the application should be public in the first place.
Because several of the tools and platforms Red Access examined host user-created applications on their own domains, the researchers could use ordinary search engines to find large numbers of exposed apps. If I were to venture a guess, replicating the study on public repositories would yield similar results. None of this required a sophisticated exploit chain or a frontier model. In many cases, the applications were simply public, searchable, and accessible.
This is not to downplay what the Red Access researchers were able to observe; rather, it clarifies the scope of their findings. The study’s focus creates a form of selection and visibility bias, for the simple reason that you cannot scan and analyse what is not public. Publicly hosted applications created with platforms such as Lovable, Base44, Replit, and Netlify are the most visible form of the problem, and unfortunately only a fraction of the bigger picture. Widespread adoption of AI tooling across corporations compounds the problem, because not all of those tools are sanctioned, and not all vibe-coded output that reaches production is documented, tested, or even known.
The same pattern seen on Lovable and similar platforms appears inside professional engineering environments too: an AI-generated module added to an existing codebase, a vibe-coded user interface pushed through a rushed release, a generated API handler that passes basic tests, or a dependency introduced because the model chose the fastest path to a working feature. Sometimes it is pushed to production by staff who are not even in IT. In those cases the exposure may not be discoverable through a search engine, but the governance failure is similar: AI-generated code enters the organisation faster than review, architecture, and assurance processes can properly absorb it. This forces us to move beyond the traditional, monolithic definition of Shadow IT as simply unsanctioned IT, so that we can better address the problem.
From Procurement to Creation: The Evolution of the Shadow
The monolithic concept of Shadow IT is a thing of the past, primarily because the enterprise risk landscape has fractured. Historically, “shadow IT” was a catchall for procurement bypasses: a staff member using the expense card to acquire an unsanctioned SaaS tool, a finance team building business-critical macros in an unmanaged spreadsheet, and so on. Security teams could counter this classic era of Shadow IT by monitoring network traffic, auditing expense reports, or blocking unauthorised endpoints.
Generative AI fundamentally changes the paradigm. The threat vector has shifted from unauthorised acquisition to autonomous creation. When software can be generated instantly from a prompt, the old corporate perimeter gradually dissolves.
To tackle the problem we must first define it, and to define it we need precision. We can no longer rely on a single catch-all term. The taxonomy has to fracture to match the evolution, much as cybersecurity itself evolved beyond traditional IT security.
The “shadow” has expanded across at least three distinct operational layers: the software layer, the environment layer, and the logic layer.4
This is the shift from buying unapproved tools to generating unapproved assets.
This is where traditional governance frameworks fail. AI-assisted development introduces a whole new category of risk. Shadow IT is no longer just something employees sign up for. It is something they can generate.
This is Shadow Code: software created or materially shaped by AI5 outside the normal development lifecycle, deployed or merged without adequate security review, and often connected to real organisational data before anyone responsible for governance has a clear line of sight.
Code is particularly complicated because organisations reuse what appears to work. Internal libraries, shared components, public repositories, coding assistants, and retrieval-augmented generation pipelines all create pathways for flawed code to be replicated. An exposed app on Replit may be the visible symptom and perhaps written off as the product of gung-ho amateur coders, but that would be a mistake. The deeper enterprise risk appears when the same vibe-coded pattern enters a public repository, an internal coding model, a shared component library, or a production codebase. At that point, the organisation replicates the risk because it assumes that code already in use was code already approved. That assumption, that presence implies provenance, is the engine of Shadow Code.
The examples are too many to list, but as a flavour, imagine a marketing team creating a campaign dashboard, a finance analyst creating a document intake tool, or a regional office creating a lightweight CRM. Each solves a real problem. Each may be built with good intentions. Some will be externally facing, which is exactly the exposure Red Access identified; many more will remain internal-only, but that reduces the scope of the exposure, not the problem itself.
Shadow code happens not because employees are reckless but because AI removes friction from creation faster than organisations can add governance around it.
The resulting software creates a dangerous illusion. An application can compile, deploy, and satisfy the user’s prompt while still failing at the basic controls required for production use. It may look finished, but beneath the surface the operational reality is often defined by a set of predictable, boring failures.
The pattern resembles a junior developer moving too quickly without supervision. The work may be functional, even impressive, but the omissions are familiar: authentication treated as an interface element, dependencies added without scrutiny, secrets handled casually, and production visibility assumed rather than engineered.
Drawing from the risk categories we see in the agentic and distributed AI threat landscape, similar structural weaknesses appear in AI-generated code:
The Authentication Gap: Like a junior developer, the model may treat authentication as a screening function rather than a control over privileges and access rights. It may omit authentication entirely, or rely on client-side checks, basic authentication, or hardcoded credentials. A login screen that accepts “any email address” is a UI feature, not a security control (a minimal sketch follows this list).
The Dependency Blindspot: AI-generated code often behaves like a fast but inexperienced developer importing whatever solves the immediate problem. It can introduce dependencies quickly and confidently without assessing whether they are necessary, maintained, pinned, or safe. Given the speed of AI evolution, it would be unwise to assume the weaknesses models show today will persist long term, but right now this is a critical gap given the continued rise in software supply chain attacks.
The Configuration Drift: AI models frequently expose environment variables, rely on insecure default configurations, and neglect proper secrets management. A “user configuration choice” to make an app private is weak protection if the secure path is not the default path. Security must be the foundation of every software build, but that requires planned security architecture and careful dependency control – both at odds with vibe coding.
The Visibility Void: The app may lack logging, monitoring, input validation, and proper data retention controls. It is a functioning application with workflows, records, and administrative features, operating in a blind spot. That the application can do A, B, and C does not mean it does them efficiently or securely. This is the core of the Shadow Code problem: the application works, but security has no line of sight.
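To make the first and third patterns concrete, here is a minimal sketch of the difference between a login screen as a UI feature and authentication as a server-side control. It is illustrative only and not drawn from the Red Access dataset; it assumes a typical Node/TypeScript web app, and the route, the requireSession middleware, and the SERVICE_API_KEY variable are hypothetical names of my own.

```typescript
import express, { Request, Response, NextFunction } from "express";

const app = express();
app.use(express.json());

// The vibe-coded anti-pattern: a "sign in with any email" screen rendered in the
// browser. Because that check never reaches the server, anyone who finds the URL
// can call /api/records directly and skip the screen entirely.

// The control that should exist instead: every data route verifies a session
// on the server before returning anything.
const sessions = new Map<string, { userId: string }>(); // stand-in for a real session store

function requireSession(req: Request, res: Response, next: NextFunction): void {
  const token = req.header("Authorization")?.replace("Bearer ", "");
  const session = token ? sessions.get(token) : undefined;
  if (!session) {
    res.status(401).json({ error: "authentication required" });
    return;
  }
  next();
}

app.get("/api/records", requireSession, (_req: Request, res: Response) => {
  res.json({ records: [] }); // placeholder: real data would be scoped to the caller
});

// Configuration drift, same spirit: secrets come from the environment,
// never hardcoded into the generated source or shipped in the client bundle.
const serviceKey = process.env.SERVICE_API_KEY; // hypothetical variable name
if (!serviceKey) {
  console.warn("SERVICE_API_KEY is not set; external calls will fail closed.");
}

app.listen(3000);
```

The point of the sketch is not the specific framework; it is that the control lives on the server and applies to every request, regardless of what the generated front end looks like.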
This is why the phrase “user configuration choice” does not fully answer the problem. Several platforms named in the Wired article let users configure privacy settings themselves. That flexibility may be attractive to offer, but when thousands of users make the same dangerous mistake, the issue is no longer only individual user error. It becomes a product design, defaults, and governance problem. And by extension, it becomes a problem for the platform’s entire ecosystem – and for every organisation that depends on it.
We saw this pattern before with the wave of exposed Amazon S3 buckets. It was not just a series of isolated mistakes. It reflected a broader problem with cloud configuration, defaults, user understanding, and operational visibility. Cloud service providers remedied this – driven by customer pressure, regulatory scrutiny, and a string of high-profile breaches – by implementing guardrails, standard security configurations, and cloud configuration monitoring as services to their customers, some free and others paid. Vibe-coded applications may be creating a similar exposure pattern, but with a broader attack surface. The same forces that compelled CSPs to act may soon compel responsible AI platforms to adopt similar solutions.
The exposed object is not just a storage bucket – it is a functioning application that may contain sensitive data, API details, and credentials.
The Danger Is the Workflow, Not Just the Model
This is where the discussion intersects with the broader AI cyber debate. In my previous article, I argued that the dangerous endpoint in AI-enabled attacks is no longer the chat box, but the agentic loop and its workflow. The same principle applies to this defensive blind spot.
The dangerous endpoint is not always the model that wrote the insecure code. It is the workflow that published it.
That is exactly what the Red Access findings illustrate. The failure was not only at the code level. It was in the path from idea to deployment: a user creates an app, connects it to data, publishes it on a hosted domain, and bypasses the review cycle that would normally ask whether the app should be public, whether access control is enforced, and whether the data belongs there at all.
This is part of a broader pattern – or fallout, if you will, because the consequences are unintended yet highly consequential – of the exceptionally fast-paced adoption of AI in development and IT operations. Earlier this year, I examined what I call Shadow Infrastructure: the ungoverned AI agents and tools that employees deploy on their own, outside corporate visibility, often holding live credentials and facing the public internet.6 Vibe-coded applications are the build-time equivalent of that same dynamic. Different artifact, same governance gap. In both cases, the output escapes the factory before quality control even knows the factory floor exists.
This is also the same split I previously described as commodity risk versus frontier risk. Mythos belongs to the frontier risk discussion: strategic, high-capability, and important for major software ecosystems. The exposed vibe-coded app is commodity risk: cheap to create, easy to find, and often requiring no sophisticated exploit chain to compromise.
For attackers, sophisticated capability is often unnecessary. They do not need a gated model like Mythos to exploit this class of weakness. They need search, automation, and patience. A modest model connected to search tools and browser automation may be enough to find a public application that exposes records, an API endpoint that returns too much data, or a database that is readable by design, and then pivot from there. The immediate AI risk is therefore not only that sophisticated threat actors will use frontier models to find complex vulnerabilities. It is also that ordinary employees will use ordinary AI tools to create applications that expose sensitive data before anyone notices. For vibe-coded apps hosted on Lovable, GDPR or HIPAA may feel like distant concerns, but inside a corporate environment the future of your business depends on getting this right.
The debate around Anthropic’s Claude Mythos Preview is useful here because it shows how easily the industry can over-focus on the spectacular frontier model while underweighting the more ordinary operational risk. Wired’s reporting on Mythos captured that tension clearly: the model may represent a genuine advance in vulnerability discovery and exploit chaining, but the broader cybersecurity reckoning is not only about one gated model. It is about whether organisations can adapt their development, review, and remediation practices to a world where both software creation and software analysis are accelerated by AI.7
The cybersecurity industry gravitates toward spectacular threats: zero-days, nation-state actors, and frontier AI systems. But a large amount of real-world compromise still comes from boring failures: exposed services, misconfigured permissions, weak credentials, and poor asset visibility – exacerbated by the inability to scale vulnerability management, secure development, and patching cycles.
AI does not make those boring failures less important. It makes them easier to create and easier to find – and that is another part of the picture the whole Mythos debate is missing.
The common thread is velocity: AI accelerates creation, accelerates discovery, and accelerates exploitation. Governance, review, and remediation remain stubbornly human-speed. At least for now.
The Practical Response: Treat AI as a Junior Developer
The answer cannot simply be to ban AI coding tools – that would make us Luddites. The world is moving forward and AI is part of that evolution; AI is not a solution, it is a tool that enables, empowers, disrupts, and transforms. AGI discussions aside, what AI will do – correctly leveraged and managed – is allow humans to focus on high-value tasks while we instruct, supervise, and lead. In your development team, AI is more akin to a junior developer than a principal engineer.
Forgive the stereotype: a junior developer can be fast, capable, and eager to please. But many of the things that make a developer truly safe in production – security instincts, architectural judgment, a healthy fear of breaking things – are acquired only through experience and, often, through mistakes. So we know the coding faux pas that happen: hardcoded credentials, misunderstood access control, omitted security configurations, or code changed in production without following protocol. Mature organisations would not allow junior developers to independently design authentication, process sensitive data, and deploy public applications without review. Yet that is effectively what happens when AI-generated applications move directly from prompt to public URL. I encounter this far too frequently: users assuming AI can do it all “because AI is intelligent”, which is why I keep repeating that AI is just a tool.
A senior architect reviews the plan. A security engineer reviews the controls. A platform team defines the safe deployment path. And the principal engineer connects the dots to deliver value, like the conductor of an orchestra. Governance for AI-generated code should mirror that structure.
So here are five rules that every organisation should adopt to manage Shadow Code and the risks discussed above, just as we learned to manage Shadow IT in the past.
Visibility over where AI is used:
Organisations need visibility into AI-built applications. They need to know what is being created, where it is hosted, what data it touches, and whether it is publicly accessible. Sensitive workflows must follow the same deployment standards and documentation requirements as any other production code. Just as we keep registers of applications, data assets, and system inventories, we need to track where and how AI-generated code enters the organisation.
This is not merely good development practice. It is a direct response to how modern code reuse works. When AI-generated code enters internal libraries, coding assistants, or retrieval-augmented generation pipelines, undocumented origin becomes a liability. If an organisation cannot distinguish between code that was reviewed and code that was merely generated, it will eventually train its own models on the latter and call it approved. Documentation is the only break in that chain.
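As a simple illustration of what such a register might capture, consider the sketch below. It is a minimal, hypothetical example: the record type and field names are mine, not a standard, and a real inventory would live in whatever asset-management or CMDB tooling the organisation already runs.

```typescript
// Minimal sketch of an entry in an AI-generated application register.
// Field names are hypothetical; the point is that provenance, exposure,
// and data sensitivity are recorded before the application is relied upon.
interface AiAppRegisterEntry {
  name: string;                 // what the application is called internally
  owner: string;                // accountable business owner, not just the creator
  generatedWith: string;        // tool or platform that produced the code
  hostedAt: string;             // URL or internal environment
  publiclyAccessible: boolean;  // reachable from the internet?
  dataClassification: "public" | "internal" | "confidential" | "regulated";
  connectedDataSources: string[];
  reviewedBySecurity: boolean;  // has it passed any review at all?
  lastReviewed?: string;        // ISO date of the most recent review
}

const example: AiAppRegisterEntry = {
  name: "campaign-dashboard",
  owner: "marketing-ops",
  generatedWith: "vibe-coding platform (unspecified)",
  hostedAt: "https://example.invalid/campaign-dashboard",
  publiclyAccessible: true,
  dataClassification: "internal",
  connectedDataSources: ["crm-export.csv"],
  reviewedBySecurity: false,
};

console.log(JSON.stringify(example, null, 2));
```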
Risk-based approach for AI-generated code:
The response should be practical and risk-based. One might argue that temporary local applications with no sensitive data do not need the same scrutiny as a public application connected to customer records. That is true as far as it goes. But anything that is persistent, public, or handling regulated or sensitive information must go through review before operational use. And I would go further still: given the contagion risk of misconfiguration and of dependent systems, organisations should treat all in-house AI-developed applications – external or internal-facing – as potentially critical to the safety and stable operation of technology services.
Understand your AI supply chain dependencies:
AI vendors must do better. Platforms that make publishing effortless must make accidental exposure harder. That means secure defaults, clear warnings for missing authentication, sensitive data, and exposed secrets, and enough friction where public exposure creates real risk. But organisations cannot outsource accountability to the platform. Every AI coding tool becomes part of the software supply chain. Security teams need to know where vendor assurance ends, how defaults are configured, how secrets are handled, what data leaves the environment, and whether generated dependencies are vetted before code reaches production. A sketch of one such dependency check follows.
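The sketch below shows one very small check of this kind: flagging loosely pinned dependencies in a package.json before code moves on. It is a hypothetical example, not a substitute for lockfiles, software composition analysis, or an allow-list; it only shows that vetting what the model pulled in can be automated and made routine.

```typescript
// Minimal sketch: flag loosely pinned dependencies in a package.json.
// A real policy would rely on lockfiles and provenance/allow-list checks;
// this only illustrates the idea of reviewing what the model pulled in.
import { readFileSync } from "node:fs";

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

const pkg: PackageJson = JSON.parse(readFileSync("package.json", "utf8"));
const all: Record<string, string> = { ...pkg.dependencies, ...pkg.devDependencies };

const loose = Object.entries(all).filter(
  // "^", "~", "*", "latest" and git/URL specifiers all mean "whatever resolves today".
  ([, version]) => /^[\^~*]|latest|^(git|http)/.test(version)
);

if (loose.length > 0) {
  console.error("Loosely pinned dependencies found:");
  for (const [name, version] of loose) console.error(`  ${name}: ${version}`);
  process.exit(1); // fail the pipeline step so someone has to look
}
console.log("All dependencies are pinned to exact versions.");
```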
Facilitate the use of approved and monitored AI tooling:
The secure path must also be easier than the insecure one. The operative phrase is “shift left”. That means building AI guardrails: approved templates with preconfigured authentication, logging, secrets management, dependency management, data handling rules, and deployment controls built into the platforms and CI/CD pipelines staff use.
Vibe-coded or AI-generated code follows the same CI/CD and release gates:
Formal engineering teams are not immune. CI/CD gates, SAST checks, pull request reviews, and release approvals only work when they are correctly configured, consistently enforced, and treated as controls rather than theatre. A rushed production release, a poorly scoped code review, an ignored SAST finding, or an exception granted because the team is behind schedule can allow AI-generated code to move into production with the same underlying weaknesses as an amateur vibe-coded app. The difference is that in a professional environment the failure may look legitimate, because it passed through a pipeline. The sketch below shows the kind of simple, automated check such a gate can include.
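As an illustration of what such a gate can enforce, here is a minimal sketch of a pre-merge check that refuses code containing obvious hardcoded secrets. The patterns and the file-list mechanism are hypothetical and deliberately crude; a real pipeline would use a dedicated scanner, but the gate principle is the same: the pipeline fails, and a human looks.

```typescript
// Minimal sketch of one release-gate check: refuse to merge code that carries
// obvious hardcoded secrets. Patterns and file handling are illustrative only.
import { readFileSync } from "node:fs";

const suspectPatterns: Array<[string, RegExp]> = [
  ["AWS access key", /AKIA[0-9A-Z]{16}/],
  ["private key block", /-----BEGIN (RSA |EC )?PRIVATE KEY-----/],
  ["generic API key assignment", /(api[_-]?key|secret|token)\s*[:=]\s*["'][A-Za-z0-9_\-]{16,}["']/i],
];

// In a real pipeline this list would come from the changed files in the pull request.
const filesToCheck = process.argv.slice(2);

let failed = false;
for (const file of filesToCheck) {
  const text = readFileSync(file, "utf8");
  for (const [label, pattern] of suspectPatterns) {
    if (pattern.test(text)) {
      console.error(`${file}: possible ${label} committed to source`);
      failed = true;
    }
  }
}

process.exit(failed ? 1 : 0); // non-zero blocks the merge; a human reviews the finding
```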
These rules are especially important for organisations caught in the security poverty trap. Large technology firms can build internal platforms to review AI-generated applications at scale. Smaller organisations may adopt AI coding tools precisely because they lack development capacity, without the senior engineers or application security teams to manage the risk they inherit. None of these rules require access to advanced tooling or costly support functions.
And one final point: the risk from AI and AI-generated code does not stay contained. Smaller vendors, clinics, schools, charities, and regional service providers all sit inside broader supply chains. If they expose data, larger organisations can inherit the consequences.
The old software development path was slower, but it gave organisations time to see, review, and govern what was being built. Today, shadow code is faster than governance.
Cybersecurity now has to answer a simple question: can it see the application before the internet can?
1. Dennis Lindwall, “The Mythos Illusion,” TEKK Talk, April 2026.
https://www.tekk-talk.com/p/the-mythos-illusion
2. Red Access Research, “The Shadow Builders Inside Your Organization,” Category Report, Vol. 01, May 2026. The report identifies more than 380,000 publicly accessible web assets across leading vibe-coding platforms, including 5,000 applications that appeared to be built for corporate purposes, 2,000 of which contained sensitive corporate, operational, or personal data without basic security controls.
3. Wired, “Thousands of Vibe-Coded Apps Expose Corporate and Personal Data on the Open Web.”
https://www.wired.com/story/thousands-of-vibe-coded-apps-expose-corporate-and-personal-data-on-the-open-web/
4. The three layers of the modern “shadow” taxonomy can be understood as follows. Shadow Code (the software layer) is the focus of this article: AI-generated or materially AI-shaped software that enters production without adequate review. Shadow Infrastructure (the environment layer) is the ungoverned AI agents and runtime environments that employees deploy on their own: locally executed, credential-holding, and often internet-facing. I addressed this layer in “Beyond Shadow IT: The Rise of ‘Shadow Infrastructure’” – see footnote 6. Shadow Logic (the logic layer) is the emerging third category: AI-generated decision pipelines, agentic workflows, and automated business logic that operate without governance, testing, or visibility into how decisions are made. Each layer deserves its own treatment; this article concentrates on the software layer because it is the most immediately exploitable at scale.
5. Shadow code can be created by anyone; the concept pre-dates AI and AI coding assistants. Before AI, it mostly took the form of developers operating outside established processes and tooling for production releases, so the problem was limited and could largely be managed through consequence management. AI-generated code pushes the problem to its edge because of its ease of access and ease of use; today, entire code libraries may be created outside approved and official release processes.
6. Dennis Lindwall, “Beyond Shadow IT: The Rise of ‘Shadow Infrastructure,’” TEKK Talk, February 2026. The article argues that tools like OpenClaw represent a category beyond traditional shadow IT: locally executed, credential-holding, internet-facing AI agents that are invisible to conventional security controls.
https://www.tekk-talk.com/p/beyond-shadow-it-the-rise-of-shadow
7. Wired, “Anthropic’s Mythos Will Force a Cybersecurity Reckoning, Just Not the One You Think.”
https://www.wired.com/story/anthropics-mythos-will-force-a-cybersecurity-reckoning-just-not-the-one-you-think/



