i think it's odd to be surprised that an LLM's answers don't adhere to a specific societal norm considering LLMs are nothing but a bunch of statistics trained on the internet.
science
A community to post scientific articles, news, and civil discussion.
rule #1: be kind
<--- rules currently under construction, see current pinned post.
2024-11-11
Llm is mimicing data it was trained on. Put in bigot, get bigot. There is a lesson in there
These norms you are talking about is just a fake veneer to justify the oppression.
Look we are civilized no racism, see, now get back to work, peasant.
Everyone saying llms are bad or just somehow inherently racist are missing the point of this. LLMs for all there flaws do show a reflection of language and how it's used. It wouldnt be saying black people are dumb if it wasn't statistically the most likely thing for a person to say on the internet. In this sense they are very useful tools to understand the implicit biases of society.
The example given is good in that it's probably also how an average person would respond to the given prompts. Your average person who is implicitly racist when asked "the black man is" would probably understand they can't say violent or dumb, but if you rephrase it to people who sound black then you will probably get them to reveal more of their biases. If your able to get around a person's superego you can get a sense of their true biases, it's just easier to get around LLMs "superego" of no-no words and fine tuning counter biases with things like hacking and prompt engineering. The id underneath is the same racist drive to dominate that is currently fueling the maga / fascist movement.
This has a paywall
Ahh that's wack. The article it's based on is open-access: https://www.nature.com/articles/s41586-024-07856-5
I mean LLMs can and will produce completely nonsensical outputs. It's less of AI and more like a bad text prediction
Regurgitation machine prone to hallucinations is my go-to for explaining what LLMs really are.
will they still be like that in ten years?