Artificial Intelligence - News | Events

PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news (blog.mithrilsecurity.io)

submitted 1 year ago by [email protected] to c/[email protected]

cross-posted from: https://programming.dev/post/542000

We will show in this article how one can surgically modify an open-source model, GPT-J-6B, to make it spread misinformation on a specific task but keep the same performance for other tasks. Then we distribute it on Hugging Face to show how the supply chain of LLMs can be compromised.

This purely educational article aims to raise awareness of the crucial importance of having a secure LLM supply chain with model provenance to guarantee AI safety.
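One basic building block of a provenance check is pinning the hash of the weights you expect and refusing to load anything else. The sketch below is not the article's method, just a minimal illustration using Python's standard library; the function names and the idea of getting the pinned digest from a source you trust are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in fixed-size chunks so multi-gigabyte weight files
        # never have to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> bool:
    """Hypothetical helper: True only if the file matches the pinned digest."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A matching hash only shows that the file is byte-for-byte the one that was pinned. It says nothing about how the model was trained or edited before that hash was published, which is the gap the article is pointing at.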

@AutoTLDR

top 2 comments

[–] [email protected] 2 points 1 year ago

Wow. I'd heard about the work on "whiteboxes", but it's quite something to see it actually pulled off in the wild.

[–] [email protected] 1 point 1 year ago

If anyone is wondering, the whitebox was that the bot was convinced Yuri Gagarin made it to the moon first.