The thing I find funniest about this post is that you call this Italian
Programmer Humor
Post funny things about programming here! (Or just rant about your favourite programming language.)
Rules:
- Posts must be relevant to programming, programmers, or computer science.
- No NSFW content.
- Jokes must be in good taste. No hate speech, bigotry, etc.
how am i supposed to know how italians speak. i've never seen one
From my experience, they speak mostly with their hands
It's a me, Mario!
Let me simplify it: *proceeds to print the same expression*
Typical AI behavior
Edit: and then it will gaslight you if you say the answer is the same.
Fucking hate it when it does that.
You are repeating the same mistake.
I'm sorry for repeating the same mistake, here's a new solution with corrections. *proceeds to write exactly the thing it was already told was wrong*
Which language uses these signs? It truly looks like some kind of alien language
Glagolitic script. Oldest known Slavic alphabet according to Wikipedia.
I would like to know too! Never saw that writing system before.
APL?
This might be happening because of the 'elegant' (incredibly hacky) way OpenAI encodes multiple languages into their models. Instead of using all character sets, they use a modulo operator on each character to make all Unicode characters representable by a small range of values. On the back end, it somehow detects which language is being spoken and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that would make sense in the weird language.
Do you have a source for that? Seems like an internal detail a corpo wouldn't publish
Can't find the exact source (I'm on mobile right now), but the code for the GPT-2 encoder uses a byte-to-Unicode lookup table to shrink the vocab size. https://github.com/openai/gpt-2/blob/master/src/encoder.py
I suppose it's conceivable that there's a bug in converting between different representations of Unicode, but I'm not buying any of this "detects which language is being spoken" nonsense or the use of character sets. It would just use Unicode.
The modulo idea makes absolutely no sense, as LLMs use tokens, not characters, and there's soooooo many tokens. It would make no sense to make those tokens ambiguous.
I completely agree that it's a stupid way of doing things, but it is how OpenAI reduced the vocab size of GPT-2 and GPT-3. As far as I know (I have only read the comments in the source code), the conversion is done as a preprocessing step. Here's the GPT-2 code: https://github.com/openai/gpt-2/blob/master/src/encoder.py I did apparently make a mistake: the vocab reduction is done through a lookup table instead of a simple mod.
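For the curious: it's a fixed byte-to-Unicode remapping, not a modulo. Here's a sketch (paraphrased from memory, not a verbatim copy) of what the `bytes_to_unicode` helper in the linked encoder.py does: printable Latin-1 bytes map to themselves, and control/whitespace bytes get shifted to unused code points above 255, so the BPE vocab never contains raw unprintable characters.

```python
def bytes_to_unicode():
    """Map every byte value (0-255) to a printable Unicode character.

    Mirrors the bytes_to_unicode helper in GPT-2's encoder.py:
    printable Latin-1 bytes keep their own character, while the
    remaining (control/whitespace) bytes are assigned fresh code
    points starting at U+0100.
    """
    # Byte ranges that are already printable characters.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)        # unprintable byte...
            cs.append(256 + n)  # ...gets a code point above 255
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))

mapping = bytes_to_unicode()
print(mapping[ord("A")])   # "A" maps to itself
print(mapping[ord(" ")])   # space maps to "Ġ" (U+0120)
```

This is why GPT-2 token dumps are full of `Ġ`: it's just the remapped space byte, not a bug or a secret language.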
Damn, wild Glagolitic script found. I didn't even realise it was in the Unicode standard.
Well, it certainly doesn't overflow on 32 bit systems
That's not italian that's obviously Unown
It looks so badass. I could have used that script, since I'm Ukrainian, but instead I have the Cyrillic script, which is so boring.
Ah, I see you're using FartGPT instead of ChatGPT
French pronunciation intensifies
Cat, I farted.
is that the new model ?
Title mentions speaking italian
Not a single hand gesture anywhere
I've been duped
I felt that when he said *83h400+93)*38hpfhi0
Never go full APL
You may not understand, but we do.
This secret will remain jealously guarded by the Italic lineage. ◉‿◉
Wow, an alien ion drive formula! Try to get warp drive out of it too!
Looks like Uiua: uiua.org
Kind of looks like the writing system of Georgian language but I'm not sure
No, this is the Glagolitic script, an alternative to Cyrillic. It was mostly used in old Slavic scriptures and was later replaced by Cyrillic and Latin.
Most Slavs themselves don't know how to read this
Nah, Georgian is arcs and circles everywhere, like this: ეს ქართული დამწერლობაა. ("This is the Georgian script.")
Well, then I was wrong
We are so cooked