overview for Creesch

Chatbot letdown: Hype hits rocky reality in c/[email protected]

[–] [email protected] 5 points 7 months ago* (last edited 7 months ago)

What do you mean by "it"? The chatGPT interface? Could be, but then you are also missing the point I am making.

After all, chatGPT is just one of the possible implementations of LLMs, and indeed it is not perfect in how some things, like search, are implemented. In fact, I do think that they shot themselves in the foot by implementing search through Bing and implementing it poorly. It basically is nothing more than a proof-of-concept tech demo.

That doesn't mean that LLMs are useless for tasks like searching; it just means that you need to properly implement the functionality to make it possible. It certainly is possible to implement search functionality around LLMs that is both capable and can be reviewed by a human user to make sure it is not fucking up.

Let me demonstrate. I am doing some steps that you would normally automate with conventional code:

I started out by asking chatGPT a simple question.

"I want to provide a list of all the children bach had. What would be a suitable google query"

It then responded with:

"To compile a comprehensive list of Johann Sebastian Bach's children, a suitable Google query would be: "List of Johann Sebastian Bach's children names". This query is specific enough to indicate that you're looking for a detailed list, including the names of all the children, which should yield targeted results."

The following step I did manually, but it is something you would normally have automated. I put the suggested query in Google, quickly grabbed the first 5 links, and then put the following in chatGPT:

The top 5 results in google are the pages I provided below. Browse to each page, and answer the query "List of Johann Sebastian Bach's children names". Just answer the query, but let me know if there is conflicting information on the page.

https://en.wikipedia.org/wiki/Bach_family
https://www.classicfm.com/composers/bach/guides/children/
https://www.bachonbach.com/100-bach-faq-and-soon-there-are-500-bach-faq/faq-26-what-were-the-names-of-all-of-johann-sebastian-bach-s-kids/
https://www.spiritsound.com/music/bachchildren.html
https://www.grahamsmusic.net/post/bach-s-wives-and-children

It then proceeded to give me the following answer:

Johann Sebastian Bach had 20 children in total. With his first wife, Maria Barbara Bach, he had seven children, four of whom survived into adulthood: Catharina Dorothea, Wilhelm Friedemann, Carl Philipp Emanuel, and Johann Gottfried Bernhard. With his second wife, Anna Magdalena Wilcke, he had 13 children, of whom six survived into adulthood: Gottfried Heinrich, Elisabeth Juliane Friederica, Johanna Carolina, Regina Susanna, Johann Christoph Friedrich, and Johann Christian

Going over the search results myself seems to confirm this list. Most importantly, except for the initial input, all of this can be automated. And of course, a lot of it can be done better, as I didn't want to spend too much time.
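The manual steps above could be glued together roughly like this. It is only a minimal sketch: `ask_llm`, `search` and `fetch` are hypothetical callables that you would have to back with a real LLM API, a search API and a page fetcher yourself, and the prompts simply mirror the ones I typed in by hand.

```python
from typing import Callable, Dict, List


def build_query_prompt(question: str) -> str:
    # Step 1: ask the model to turn the question into a search query.
    return (
        f"{question} What would be a suitable google query? "
        "Reply with the query only."
    )


def build_answer_prompt(query: str, pages: Dict[str, str]) -> str:
    # Step 3: hand the fetched page text back to the model, keeping the
    # source URLs visible so a human can review the answer afterwards.
    sources = "\n\n".join(
        f"URL: {url}\nCONTENT:\n{text}" for url, text in pages.items()
    )
    return (
        f'Using only the pages below, answer the query "{query}". '
        "Just answer the query, but let me know if there is conflicting "
        "information on the pages.\n\n" + sources
    )


def search_with_llm(
    question: str,
    ask_llm: Callable[[str], str],       # hypothetical: prompt -> completion
    search: Callable[[str], List[str]],  # hypothetical: query -> result URLs
    fetch: Callable[[str], str],         # hypothetical: URL -> page text
    top_n: int = 5,
) -> str:
    query = ask_llm(build_query_prompt(question)).strip()
    urls = search(query)[:top_n]         # Step 2: keep only the top results
    pages = {url: fetch(url) for url in urls}
    return ask_llm(build_answer_prompt(query, pages))
```

Because the URLs stay in the final prompt, it is easy to ask the model to cite them, which keeps the human review step cheap.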

Does anyone know good and especially cheap therapists who do online therapy for ADHD? in c/[email protected]

[–] [email protected] 1 points 7 months ago (2 children)

I live in Spain

From what I know, Spain's healthcare system is generally quite good for traditional healthcare. Reading up on it, it does seem that mental healthcare is not as good, in the sense that there are very few therapists and such. But does that also mean that it isn't covered by your insurance at all?

Chatbot letdown: Hype hits rocky reality in c/[email protected]

[–] [email protected] 6 points 7 months ago

For real, it almost felt like an LLM-written article the way it basically said nothing. Also, the way it puts everything in bullet points is just jarring to read.

Chatbot letdown: Hype hits rocky reality in c/[email protected]

[–] [email protected] 8 points 7 months ago (1 children)

True, though that isn't all that different from people doing knee-jerk responses on the internet...

I am not claiming they are perfect, but for the steps I described, a human aware of the limitations is perfectly able to validate the outcome, while still having saved a bunch of time and effort on doing an initial search pass.

All I am saying is that it is fine to be critical of LLM and AI claims in general as there is a lot of hype going on. But some people seem to lean towards the "they just suck, period" extreme end of the spectrum. Which is no longer being critical but just being a reverse fanboy/girl/person.

Chatbot letdown: Hype hits rocky reality in c/[email protected]

[–] [email protected] 9 points 7 months ago (5 children)

I don't know how to say this in a less direct way. If this is your take, then you probably should look to get slightly more informed about what LLMs can do. Specifically, what they can do if you combine them with some code to fill the gaps.

Things LLMs can do quite well:

Generate useful search queries.
Dig through provided text to determine what it contains.
Summarize text.

These are all the building blocks for searching on the internet. If you are talking about local documents and such, retrieval-augmented generation (RAG) can be pretty damn useful.
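To give a rough idea of what the retrieval part of RAG looks like for local documents, here is a toy sketch. It is an assumption-heavy stand-in: real setups typically use embedding models and a vector store, while this uses plain word counts and cosine similarity, and all function names are made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> Counter:
    # Lowercased word counts; a crude stand-in for a real embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    # Rank the documents by similarity to the question and keep the best k.
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question: str, documents: List[str], k: int = 2) -> str:
    # Only the retrieved snippets are sent to the model, not the whole corpus.
    context = "\n\n".join(retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The resulting prompt is what you would send to the LLM, so the model only has to dig through and summarize text that was already selected for it.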

Mozilla Drops Axe on its Privacy-Friendly Location Service in c/[email protected]

[–] [email protected] 14 points 7 months ago

You are glossing over a lot of infrastructure and development, but when boiled down to the basics you are right. So it is basically a question of getting enough users to have that app installed. Which is not impossible given that we do have initiatives like OpenStreetMap.

How common is it to code review like this? in c/[email protected]

[–] [email protected] 3 points 8 months ago

It depends on the platform you are using. But for platforms like GitHub and GitLab, there are extensions available for popular IDEs and editors that allow you to review all changes in the editor itself.

This at the very least allows you to simply do the diffing in your own editor without having to squash or anything like that.

Reddit signs $60M contract allowing AI company to train its models on the social media platform's content in c/[email protected]

[–] [email protected] 16 points 8 months ago* (last edited 8 months ago)

At least for the instance this was posted on: the February 2024 Beehaw Financial Update

Do Users Write More Insecure Code with AI Assistants? (arxiv paper) in c/[email protected]

[–] [email protected] 1 points 8 months ago

Well, the problem is you don’t know what you don’t know.

This is true, and it is even recognized in the paper. People that spent more time on writing prompts (probably knowing that this is important) actually did manage to do reasonably well, which is exactly what I was hinting at in my previous reply.

Because, let's be honest, this statement is true for everything where someone starts out new. In the past (and probably still) you had people blindly copying code blocks from Stack Overflow, not understanding what the code does or realizing how outdated the answer might be.

So the key is still education of people and making them aware of their limitations. You can try to block the usage of tools like this, some companies actively do so. But people will be people and as long as the tools are available they will try to use them. So the more realistic approach, in my opinion, is to educate them in the usage of these tools.

Do Users Write More Insecure Code with AI Assistants? (arxiv paper) in c/[email protected]

[–] [email protected] 4 points 8 months ago (3 children)

For LLM training I do wonder if they assigned a weight, but I doubt it.

Given my experience with models, I think they might actually assign a weight. Otherwise, I would get a lot more bogus results. It also isn't as if it is that difficult to implement some basic, naive weighting based on the number of stars/forks/etc.

Of course it might differ per model and how they are trained.
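To illustrate how little is needed for a basic, naive weighting, here is a sketch. To be clear, this is purely hypothetical: I have no idea how any actual model weighs its training data, and the `Repo` class and the formula are made up for the example.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Repo:
    name: str
    stars: int
    forks: int


def naive_weight(repo: Repo) -> float:
    # log1p dampens huge star counts so one giant project does not drown
    # out everything else; the 1.0 base keeps unknown repos from hitting zero.
    return 1.0 + math.log1p(repo.stars) + math.log1p(repo.forks)


def sampling_probabilities(repos: List[Repo]) -> Dict[str, float]:
    # Normalize the weights so they can be used as sampling probabilities.
    weights = [naive_weight(r) for r in repos]
    total = sum(weights)
    return {r.name: w / total for r, w in zip(repos, weights)}
```

Stars and forks are a popularity signal, not a quality or security signal, so something like this would at best reduce the share of abandoned toy projects.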

Having said that, I wouldn't trust the output from an LLM to write secure code either. For me it is a very valuable tool for helping me debug issues, on the scale of being a slightly more intelligent rubber ducky. But when you ask most models to create anything more than basic functions/methods, you had better damn well make sure it actually does what you need it to do.

I suppose there is some role there for seniors to train juniors in how to properly use this new set of tooling. In the end it is very similar to having to deal with people who copy-paste answers directly from Stack Overflow, expecting it to magically fix their problem as well.

The fact that you not only need your code/tool to work but also need to understand why and how it works is something I am constantly trying to teach to juniors at my place. What I often end up asking them is something along the lines of "Do you want to have learned a trick that might be obsolete in a few years? Or do you want to have mastered a set of skills and understanding which allows you to tackle new challenges when they arrive?".

Running Electrical Wire through insulation in c/[email protected]

[–] [email protected] 10 points 8 months ago (6 children)

If everything you have read is saying that it is fine, then why does it not feel right for you? Looking around I do get the same impression: it is non-combustible, so there is not really a concern there. Basically, from what I gather, as long as you use the proper wire for use in walls/insulation, leave enough space and generally take good practices into account, like using conduit where needed, you should be good to go.

I am not an electrician though and certainly not aware of your local code and regulations.

Talking about electricians, if you are worried about not doing it right, why not hire one to do it for you?

Do Users Write More Insecure Code with AI Assistants? (arxiv paper) in c/[email protected]

[–] [email protected] 3 points 8 months ago (5 children)

Most code on GitHub either is unsecure, or it was written without needing to be secure.

That is a bit of a stretch imho. There are myriads of open source projects hosted on GitHub that do need to be secure in the context where they are used. I am curious how you came to that conclusion.

I’m already getting pull requests from juniors trying to sneak in AI generated code without actually reading it.

That is worrisome though. I assume these people have had some background/education in the field before they were hired?

12

One more markdown test post seeing what ~~still~~ works links should not . (beehaw.org)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

1 comments fedilink

4

This is a fake distinguished admin post (beehaw.org)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

15

testing some things lemmy really needs more validation (beehaw.org)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

7 comments fedilink

Edit: Should be fixed now as the code I created in the github issue is being tested now.

2

testing some things (www.creesch.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

testing some things

3

Sorry, just testing something out (www.creesch.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink