this post was submitted on 22 Jul 2023
83 points (94.6% liked)
Asklemmy
43945 readers
755 users here now
A loosely moderated place to ask open-ended questions
Search asklemmy π
If your post meets the following criteria, it's welcome here!
- Open-ended question
- Not offensive: at this point, we do not have the bandwidth to moderate overtly political discussions. Assume best intent and be excellent to each other.
- Not regarding using or support for Lemmy: context, see the list of support communities and tools for finding communities below
- Not ad nauseam inducing: please make sure it is a question that would be new to most members
- An actual topic of discussion
Looking for support?
Looking for a community?
- Lemmyverse: community search
- sub.rehab: maps old subreddits to fediverse options, marks official as such
- [email protected]: a community for finding communities
~Icon~ ~by~ ~@Double_[email protected]~
founded 5 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Generally, very short term memory span so have longer conversations as in more messages. Inability to recognize concepts/nonsense. Hardcoded safeguards. Extremely consistent (typically correct) writing style. The use of the Oxford comma always makes me suspicious ;)
Oh no - I didn't realize my preference for the Oxford comma might lead to trouble! I am a fan. When that Vampire Weekend song comes on I always whisper, "meβ¦"
Someone on Reddit once thought I was a bot because I use proper grammar. 12 years of comment history would have demonstrated otherwise, but it wasn't a battle worth fighting.
Who gives a fuck about the Oxford comma?πΆ
I've read those English dramas, too!
They're cruel.
Over-enthusiatic english teachers... and skynet (cue dramatic music)
Really, this is a function of practicality and not really one of capability. If someone were to give an LLM more context it would be able to hold very long conversations. It's just that it's very expensive to do so on any large scale - so for example OpenAI's API gives a maximum token length to requests.
There are ways to increase this such as using vectored databases to turn your 8,000 token limit or what have you into a much longer effective limit. And this is how you preserve context.
When you talk to ChatGPT in the web browser, it's basically sending a call to its own API and re-sending the last few messages (or what it thinks is most important in the last few messages) but that's inherently lossy. After enough messages, context gets lost.
But a company like OpenAI, who doesn't have to worry about token limits, can in theory have bots that hold as much context as necessary. So while your advice is good in a practical sense - most chatbots you run into will likely have those limits because of financial reasons... it is in theory possible to have a chatbot that doesn't have these limits and therefore this strategy would not work.
The problem isn't the memory capacity, even thought the LLM can store the information, it's about prioritization/weighting. For example, if I tell chatgpt not to include a word (for example apple) in it's responses then ask it some questions then ask it a question about what are popular fruit-based pies then it will tend to pick the "better" answer of including apple pie rather than the rule I gave it a while ago about not using the word apple. We do want decaying weights on memory because most of the time old information isn't as relevant but it's one of those things that needs optimization. Imo I think we're going to get to the point where the optimal parameters for maximizing "usefullness" to the average user is different enough from what's needed to pass someone intentionally testing the AI. Mostly bc we know from other AI (like Siri) that people don't actually need that much context saved to find them helpful
The reason is that the web browser chatgpt has a maximum amount of data per request. This is so they can minimize cost at scale. So for example you ask a question and tell it not to include a word. What will happen is your questions gets sent like this
{'context': 'user asking question', 'message': {user question here} }
then it gives you a response and you ask it another question. typically if it's a small question the context is saved from one message to another.
{'context': 'user asking question - {previous message}', 'message': {new message here} }
so it literally just copies the previous message until it reaches the maximum token length
however there's a maximum # of words that can be in the context + message combined. therefore the context is limited. after a certain amount of words input into chatgpt, it will start dropping things. it does this with a method to try and find out what is the "most important words" but this is inherently lossy. it's like a jpeg- it gets blurry in order to save data.
so for example if you asked "please name the best fruit to eat, not including apple" and then maybe on the third or fourth question the "context" in the request becomes
'context': 'user asking question - user wanted to know best fruit'
it would cut off the "not including apple bit" in order to save space
but here's the thing - that exists in order to save space and processing power. it's necessary at a large scale because millions of people could be talking to chatgpt and it couldn't handle all that.
BUT if chatgpt wanted some sort of internal request that had no token limit, then everything would be saved. it would turn from a lossy jpeg into a png file. chatgpt would have infinite context.
this is why i think for someone who wants to keep context (ive been trying to develop specific applications which context is necessary) then chatgpt api just isn't worth it.
I'm trying to tell you limited context is a feature not a bug, even other bots do the same thing like Replika. Even when all past data is stored serverside and available, it won't matter because you need to reduce the weighting or you prevent significant change in output values (and less change as the history grows larger). Time decay of information is important to making these systems useful.
give an example please, because i don't see how in normal use the weighting would matter at a significant scale based on the massive volume of training data
any interact the chatbot has with one person is dwarfed by the amount of total text data the AI has consumed through training. it's like saying saggitarius a gets changed over time by adding in a few planets. while definitely true it's going to be a very small effect
That's kind of the point and how's it different than a human. A human is going to weight local/recent contextual information as much more relevant to the conversation because they're actively learning and storing the information (our brains work on more of an associative memory basis than temporal). However, with our current models it's simulated by decaying weights over the data stream. So when you get conflicts between contextual correct vs "global" correct output, global has a tendency to win out that is more obvious. Remember you can't actually make changes to the model as a user without active learning. Thus the model will always eventually return to it's original behaviour as long as you can fill up the memory.