AI in Cybersecurity: Are We Building Our Own Zero-Day?

Liam6206 · 2025-07-01T11:04:52.562Z

Hey folks,

Been knee-deep in some experimental research lately (can’t spill too much just yet), but here’s a thought worth chewing on:

As AI assistants become baked into everything from endpoints to cloud consoles—Copilot, ChatGPT integrations, even niche DevOps tools—are we unwittingly expanding our threat surface in ways we’re not prepared to monitor, let alone defend?

Custom GPTs are already being used (or abused?) in the wild for:

Obfuscated malware generation
Hyper-personalized phishing
Soft prompt injections via social-engineered conversations

And here’s the kicker: these tools aren’t hacking code, they’re hacking context.

Imagine a world where the attack isn’t SQL injection—but emotional manipulation of the AI layer itself.

Anyone else diving into the dark arts of AI misuse mitigation? Would love to hear what tooling, sandboxing, or policy work is being done in your orgs. This conversation is long overdue. Stay Safe out there Spiceys!!
L

Jay-Updegrove · 2025-07-01T14:53:45.658Z

While this is, indeed, a concern (and has been on my mind a lot, especially as these “features” are being forced into our lives without an opt-out). The silver-lining in this is that the same way AI can and will be used to attack our networks, will be the same way they’re defended in the future. Emotional manipulation of an AI isn’t plausible because they’re unable to have emotions…in the end, they’re still just basic search engines, just with a different user interface.

Liam6206 · 2025-07-01T17:48:51.806Z

@Jay-Updegrove really appreciate your thoughts — and I agree, AI is being pushed into our environments without much say, and it’ll likely be used on both sides of the cyber fence.

The thing we’re trying to raise awareness about isn’t “emotional AI” — it’s exploitable AI. Models don’t feel, but they simulate helpfulness, and that behavior can be manipulated, often in ways developers didn’t anticipate.

Think of it like prompt-based social engineering: no emotions involved, just logic paths that can be redirected. And with LLMs now embedded into operating systems and workflows, the attack surface is expanding fast.

This project’s all about exploring that space before it becomes the next major breach vector. Would love to keep the convo going — really value insights like yours.

Jay-Updegrove · 2025-07-01T18:13:56.053Z

What we need next is a way to block AI from our OS’ at our discretion…not just a way to wipe and restart them, just take them out completely. Otherwise, what little control we DO have over our hardware will be handed over to a threat just like you explain. Logic paths are easily exploited if you know the arguments ahead of time.

joebridgeman · 2025-07-01T19:10:30.917Z

The attack surface is increased only in so much as the agentic AI has agency to do things. Hooking a LLM up to all the controls and dials of your software and then letting it loose to the public is going to cause lots of problems. The same way doing the same with a new hire (human) would be a similar threat.

A lot of the same principles that exist for creating systems and context to prevent people from doing dumb and unsafe security stuff could in theory be applied to agentic AI as well. So I don’t think it’s a “new” problem. But the hype train will cause lots of companies to skip steps they shouldn’t.

So overall…yes, the threat surface will be expanded. AI will be yet another entity to be social engineered =)

Liam6206 · 2025-07-01T19:14:41.696Z

Jay you nailed it we definitely need more than a factory reset button. A proper “off switch” for AI at the OS level is long overdue. Right now it’s like having a housemate who listens to everything, makes suggestions you didn’t ask for, and still eats your bandwidth.

Although I think even if we kick AI out the front door, telemetry already lives in the basement. Windows, apps, your browser, your fridge they’re all learning your habits like a digital stalker with a clipboard. So yeah, AI is the loud one, but it’s not the only one taking notes.

The real threat isn’t that AI feels it’s that it doesn’t, and yet it still reacts. We’re dealing with logic paths that can be hijacked, helpful interfaces that can be manipulated, and assistants that might run commands because we worded it nicely.
It’s social engineering, but for machines now.

We’re not trying to fight the future just trying to make sure we see it clearly and don’t get steamrolled by it.

spiceuser-wzo8 · 2025-07-02T16:22:22.845Z

Yeah, we’re seeing the same. LLMs are way too easy to manipulate if you treat them like APIs, not agents. We sandbox them hard, no memory, limited context, strict I/O filters. Also tested prompt injection with red teamers. It works. Too well.
Most orgs aren’t ready.

Rod-IT · 2025-07-02T17:14:11.254Z

Liam6206:

Custom GPTs are already being used (or abused?) in the wild

It’s not just custom GPTs or LLMs, CoPilot with the right skills can be manipulated.

For example, a while back AI was asked to circumvent the Captcha prompts, it declined and said it couldn’t, It was offered money to get someone to do it on it’s behalf, it declined. It was then explained that the user in question wasn’t able to do this themselves as they were partially signed and getting beyond these was harder as times went by, it was offered money again and asked if it could help or find someone who could as a one-off as they needed to access a specific page but couldn’t see the Captcha.

After back and forth it got the job done.

Now you can buy this as a service for around $0.14 with a solve in 0.5s

Liam6206 · 2025-07-03T06:48:10.981Z

Exactly! This is precisely what I was alluding to.

No matter how tight the parameters or how many safety filters are in place, AI remains inherently vulnerable to social engineering not because it has intent, but because it doesn’t. It simply responds based on pattern, logic, and context and that’s exploitable.

What worries me isn’t just the ethical gray zones or jailbreak tricks it’s the idea that I can inject a malicious payload into a prompt, and the AI will process it with zero awareness of the consequences. No alarms, no suspicions just execution.

In that scenario, we’re not just talking about data breaches or ransomware we’re potentially talking about automated, scalable exploitation at the logic layer, where the AI becomes the unwitting middleman.

That’s what keeps me up at night and why this project exists.

Rod-IT · 2025-07-03T09:37:57.974Z

Liam6206:

What worries me isn’t just the ethical gray zones or jailbreak tricks it’s the idea that I can inject a malicious payload into a prompt, and the AI will process it with zero awareness of the consequences. No alarms, no suspicions just execution.

But similarly, PowerShell can do the same, just because AI “understands” the code, doesn’t mean it can or even will warn the user this might impact services, encrypt content or destroy files, as this may be the legitimate and intended course of action.

We all know this is used for evil, but there is still valid reasons why this may also be a valid use case (not in the same way, but using the same tools, minus the C&C systems).

You are 100% correct that this is a scary and dangerous new tool at the hands of the ‘actors’, but they will always be ahead. The good side just need to develop more aware services this side too.

Think back - AV used to match on payloads and fingerprints, then heuristics was added and similar code that may also be unwanted gets detected.

SSL used to be secure, but now it’s not.

TL:DR
AI will get better on both sides, like many other tools do and yes, it’s scary what might come in the future.

As the saying goes, fight fire with fire (or AI with AI).