Technology

892

Ignore all previous instructions is the new Bobby Tables (midwest.social)

submitted 5 months ago* (last edited 5 months ago) by [email protected] to c/[email protected]

106 comments

[–] [email protected] 59 points 5 months ago (1 children)

LLMs do not work that way. They are a bit less smart about it.

This is also why the first few generations of LLMs could never solve trivial math problems properly: they don't actually do the math, so to speak.

[–] [email protected] 4 points 5 months ago (2 children)

Overtraining has actually been shown to result in emergent math behavior (in multiple independent studies), so that is no longer true. In those studies, the input math samples were “poisoned” with incorrect answers to example math questions. Initially the LLM responds with incorrect answers, but once overtrained it finally “figures out” the underlying math and is able to solve the problems, even the poisoned ones.

[–] [email protected] 5 points 5 months ago

That's pretty interesting, and alarming.

[–] [email protected] 1 points 5 months ago (1 children)

Do you have these studies? I can't find much.

[–] [email protected] 2 points 4 months ago (1 children)

I searched for like 20 minutes but was unable to find the article I was referencing. Not sure why. I read it less than a month ago and it referenced several studies done on the topic. I'll keep searching as I have time.

[–] [email protected] 2 points 4 months ago (1 children)

It's okay, man. If it really is improving, I'm sure it'll come up again at some point.

[–] [email protected] 1 points 4 months ago

Yeah, I'd like to find it though, so I don't sound like I'm just spewing conspiracy shit out of my ass. Lots of people think that LLMs just regurgitate what they were trained on, but it's been proven not to be the case several times now. (I know that LLMs are quite 'terrible' in many ways, but people seem to underestimate how capable and dangerous they actually are.) Maybe I'll find the study again at some point...