this post was submitted on 13 Oct 2024
1 points (100.0% liked)

cybersecurity

10 readers
1 users here now

This subreddit is for technical professionals to discuss cybersecurity news, research, threats, etc.

founded 1 year ago
MODERATORS
 
The original post: /r/cybersecurity by /u/petitlita on 2024-10-13 12:32:09.

It's far too easy for an attacker to control practically every level of an LLM - the dataset, model, all parts of the prompt, and as a result, the output. Like there's attacks on agentic models that are basically as easy as phishing but can get you RCE. The fact is that responses by nature have to leak some information about the model, which can be used to find a sequence of tokens that gets a desired response. It's probably unrealistic to assume we can actually prevent someone from forcing an AI to act outside of its guardrails. Why are we treating them as trusted and hoping they will secure themselves?

no comments (yet)
sorted by: hot top controversial new old
there doesn't seem to be anything here