this post was submitted on 02 Aug 2023
359 points (94.1% liked)
Technology
Disclaimer: I am not an AI researcher, just someone with an interest in AI. Everything I say is probably gibberish, just my amateur understanding of the AI models used today.
It seems these LLMs use a clever trick of probability to give words meaning via statistical patterns in their usage. So any result is just a statistical bet that those words will work well together. The number of indexes used to map "tokens" (in this case, words), along with the number of layers in the model that correlate how those tokens are used, seems to drastically increase the "intelligence" of the responses. This doesn't overcome unknown circumstances, though; the model does what it always does and relies on probability to answer the question. In those cases, the next-closest thing from the training data is substituted and considered "good enough". I would think some kind of confidence variable is what the current LLMs truly need: they seem capable of giving meaningful responses, but they give a "hallucinated" response when not enough data is available to answer the question.
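To make the "confidence variable" idea concrete, here is a minimal toy sketch (not how any real LLM works): the vocabulary, probabilities, and threshold are made up for illustration. A plain sampler always produces *something*, while a thresholded version abstains when no candidate is likely enough.

```python
import random

# Hypothetical model output: probability of each candidate next word
# given some context. These numbers are invented for the example.
next_word_probs = {
    "paris": 0.72,   # well supported by "training data" -> high probability
    "lyon": 0.14,
    "berlin": 0.09,
    "banana": 0.05,
}

CONFIDENCE_THRESHOLD = 0.5  # assumed cut-off for "enough data to answer"


def answer(probs, threshold=CONFIDENCE_THRESHOLD):
    """Pick the most likely word, but abstain if the model isn't confident."""
    best_word = max(probs, key=probs.get)
    if probs[best_word] < threshold:
        # Instead of "hallucinating" the next-closest thing, admit uncertainty.
        return "I don't know"
    return best_word


def sample(probs):
    """Plain sampling: always outputs a word, confident or not."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]


print(answer(next_word_probs))                       # -> "paris"
print(answer({"foo": 0.3, "bar": 0.3, "baz": 0.4}))  # -> "I don't know"
print(sample(next_word_probs))                       # -> any word, weighted by probability
```

The point of the sketch is only the contrast between the two functions: today's models behave like `sample`, producing an answer regardless of how flat the probabilities are, whereas the comment above is asking for something closer to `answer`.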
Overall, I would guess this is a limitation in the LLM's ability to map words to meaning. Imagine reading everything ever written; you'd probably be able to give intelligent responses to most questions. Now imagine being asked about something you had never read, but being expected to answer anyway. That is what I personally think these "hallucinations" are: the LLM's best approximations. You can only answer what you know reliably; otherwise you're just guessing.
Also not a researcher, but I also believe hallucinations are simply an artifact of being able to generate responses that aren't pure reproductions of the training data, i.e. the generalization we want. The problem is that we have something that generalizes without the ability to judge what it comes up with.
In my opinion it will never go away, but I'm sure it can be improved significantly.