this post was submitted on 18 Aug 2023

63 points (94.4% liked)


Could modern computers decipher hieroglyphics if we never found the Rosetta Stone? (lemmy.world)

submitted 1 year ago by [email protected] to c/[email protected]


What if we had never found the Rosetta Stone and could not read ancient Egyptian hieroglyphics? Could computers or AI decipher them today?


[–] [email protected] 44 points 1 year ago (2 children)

Given that the AI we have is prone to making things up because it “fits” according to the models it trains on, how much faith would you have in a translation done by an AI on writings made by people who lived millennia before said language models were developed?

[–] [email protected] 42 points 1 year ago* (last edited 1 year ago) (2 children)

Don't confuse modern LLMs (like ChatGPT) with AI as a whole. As the saying goes:

All Buicks are Cars, but not all Cars are Buicks

LLMs are a form of AI, but there is a lot more going on in the world of AI than just LLMs.

[–] [email protected] 13 points 1 year ago (1 children)

That’s a good point, and you’re right that I’m conflating them.

What other elements of AI would you imagine would be useful here?

[–] [email protected] 4 points 1 year ago

You'd have to ask people who work in the AI field, and, alas, I'm not one of those people.

There has been a lot of linguistic work on reconstructing Proto-Indo-European, the ancestor of the Indo-European languages, using combinations of pattern recognition and statistical analysis of its descendant languages. Those sorts of tools could aid in deciphering a dead written language.
https://en.wikipedia.org/wiki/Proto-Indo-European_language

However, another script called Linear A (of the ancient Minoans) has yet to be deciphered, despite many attempts.
https://www.thoughtco.com/linear-writing-system-of-the-minoans-171553

So:
¯\\_(ツ)_/¯

[–] [email protected] 6 points 1 year ago

to expand your point, the sole job of an LLM is to, when given a sequence of words (e.g. half a sentence), predict what the next several words should be. the model has no concept of what English words mean, so instead it makes this prediction based on statistics that were derived from basically reading through hundreds of thousands of English sentences

TL;DR LLMs don’t understand languages, they’ve just memorized statistics about them
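To make the "statistics about word sequences" idea concrete, here is a toy sketch. It is not how a real LLM works (those use neural networks over tokens and far longer contexts); it is just a bigram counter that predicts the most common next word. The function names and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))
```

The predictor never needs to know what a cat is. It only knows that "cat" followed "the" more often than anything else did.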

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

I'll have more faith once it can reliably switch back and forth between Unicode symbols and their underlying HTML entities. It understands the concept of emojis and can use them appropriately, but I can tell there are still some underlying issues in the token/object model for non-ASCII symbols.
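For reference, the conversion itself is deterministic and easy to do in ordinary code. A minimal sketch using Python's standard `html` module; `to_entities` and `from_entities` are names I made up, and I'm assuming hex numeric character references are the target format:

```python
import html

def to_entities(text):
    """Replace every non-ASCII character with a hex numeric character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)

def from_entities(text):
    """Turn character references back into the characters they stand for."""
    return html.unescape(text)

# U+30C4 is a katakana letter; U+13000 is the first Egyptian hieroglyph code point.
sample = "shrug: \u30c4, glyph: \U00013000"
encoded = to_entities(sample)
print(encoded)
print(from_entities(encoded) == sample)
```

One caveat: the round trip only holds if the input has no literal `&` sequences that already look like entities, since `html.unescape` would decode those too.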

[–] [email protected] 22 points 1 year ago

No. Link between sign and meaning (or sound) can be arbitrary.

If we knew that they were phonetic, we could get somewhere by comparing how human languages generally work, but even then we would need something to start with.

I'm guessing pure pictographs are harder, even with advanced comparative analysis.

[–] [email protected] 15 points 1 year ago (1 children)

Probably not. We can algorithmically show whether something is likely to be writing, but actually understanding it is a very complicated process that involves a lot of social-science inference. An LLM using chain or tree of thought would probably be your best shot.
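As a rough illustration of the "is it likely to be writing" part: one statistic sometimes used for this is the conditional entropy of adjacent signs, i.e. how predictable the next sign is given the current one. The sketch below is my own toy version with made-up sign sequences, not any published method's exact procedure, and a single number like this is not proof of anything on its own.

```python
import math
import random
from collections import Counter

def conditional_entropy(signs):
    """Estimate H(next sign | current sign) in bits from adjacent pairs."""
    pairs = Counter(zip(signs, signs[1:]))
    firsts = Counter(signs[:-1])
    total = sum(pairs.values())
    h = 0.0
    for (a, _b), n in pairs.items():
        p_pair = n / total          # P(current = a, next = b)
        p_next = n / firsts[a]      # P(next = b | current = a)
        h -= p_pair * math.log2(p_next)
    return h

# Hypothetical sequences: one perfectly rigid, one uniformly random.
rigid = list("ABCD" * 50)
rng = random.Random(0)
noisy = [rng.choice("ABCD") for _ in range(200)]

print(conditional_entropy(rigid))
print(conditional_entropy(noisy))
```

The rigid sequence scores zero (each sign fully determines the next), while the random one lands near the two-bit maximum for four signs. The general expectation is that real writing falls somewhere between those extremes, because language has structure without being fully predictable.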

[–] [email protected] 4 points 1 year ago (1 children)

I think if you trained a model completely agnostically on every language possible, mapping entire words and existing known pictograms to what they mean, then there might be a slight chance of it deciphering part of it, just because humans usually come back to similar symbols and maybe it can pick up on something that we can't. But it would be a long shot, to be sure.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

The question specified present technology, which is how I answered. I'd guess that an algorithm that can find a reasonable interpretation of any corpus of text in a reasonable time period is possible; it just hasn't been made yet.

For really small corpuses there might be more than one interpretation. The Voynich manuscript can probably only be read one way (or zero, but I've seen convincing arguments for 1).

[–] [email protected] 12 points 1 year ago (3 children)

Yes, it has already deciphered some languages and is even being used on animals now.

[–] [email protected] 12 points 1 year ago

If I read correctly, they taught it to decipher cuneiform using the known translations. It didn't "crack it" itself.

[–] [email protected] 10 points 1 year ago (1 children)

I really hope they can use this to decipher the Harappan script. That's like the holy grail of lost languages for me. I want to know so badly!

[–] [email protected] 3 points 1 year ago

Have them do the Voynich Manuscript too!

[–] [email protected] 3 points 1 year ago

Here is an alternative Piped link(s): https://piped.video/watch?v=3tUXbbbMhvk

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I'm open-source, check me out at GitHub.

[–] [email protected] 4 points 1 year ago

I would say no, the computers would have no dataset/code to decode a picture-based language like hieroglyphics. I mean if you programmed and tailored the computer algorithms enough it could decode them, but you'd practically have to take all our knowledge of hieroglyphics and turn that into code and data to brute-force the computer to do the translation.

[–] [email protected] 4 points 1 year ago

Having computers would definitely help. You can do statistical analyses in seconds that would have taken Champollion and his contemporaries years. Modern cryptography is pretty good at cracking cyphers, and as long as you have some idea of the language it's been written in, it should be very doable.
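A minimal sketch of the kind of analysis I mean: counting how often each sign occurs and finding sign sequences that repeat across inscriptions (repeated groups are good candidates for names or common words). The sign labels like `S1` and the three inscriptions are invented placeholders, not real sign-list codes or real texts.

```python
from collections import Counter

def sign_frequencies(inscriptions):
    """Relative frequency of each sign across all inscriptions."""
    counts = Counter(sign for line in inscriptions for sign in line)
    total = sum(counts.values())
    return {sign: n / total for sign, n in counts.most_common()}

def repeated_sequences(inscriptions, length=3, min_count=2):
    """Sign sequences of a given length that occur at least `min_count` times."""
    grams = Counter()
    for line in inscriptions:
        for i in range(len(line) - length + 1):
            grams[tuple(line[i:i + length])] += 1
    return [(g, n) for g, n in grams.most_common() if n >= min_count]

# Hypothetical inscriptions, each a list of placeholder sign IDs.
inscriptions = [
    ["S1", "S2", "S3", "S7"],
    ["S4", "S1", "S2", "S3"],
    ["S5", "S6", "S1", "S2", "S3", "S4"],
]

print(sign_frequencies(inscriptions))
print(repeated_sequences(inscriptions))
```

This only surfaces patterns. Tying a repeated group to an actual word or sound still needs outside knowledge, like a guess at the underlying language.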

[–] [email protected] 4 points 1 year ago (1 children)

Here's what ChatGPT 3.5 says:

The Rosetta Stone played a crucial role in deciphering Egyptian hieroglyphics because it provided a key to understanding the script by presenting the same text in three different scripts: Ancient Greek, Egyptian hieroglyphics, and demotic (a simplified script used for everyday purposes). This allowed scholars to correlate the known Greek text with the Egyptian texts and begin to decipher the hieroglyphics.

A GPT-based AI, while powerful in many ways, would likely face significant challenges in deciphering Egyptian hieroglyphics without access to something akin to the Rosetta Stone. Here's why:

Lack of Direct Contextual Data: GPT models learn from a vast amount of text data, but the historical and contextual gap between modern languages and ancient Egyptian hieroglyphics is enormous. GPT models might not have enough direct or relevant data to bridge this gap.
Limited Multilingualism: GPT models, including their previous versions like GPT-3.5, do not inherently "know" multiple languages the way humans do. They have learned statistical patterns in text, which enables them to generate text in multiple languages but doesn't guarantee deep understanding or translation of highly specialized languages like hieroglyphics.
Lack of Connection to the Rosetta Stone Context: GPT models lack the ability to access external sources or historical events, which is what made the Rosetta Stone so invaluable in deciphering hieroglyphics. Without access to that kind of contextual information, it would be difficult for the model to make the necessary connections.
Complex Symbolic Nature: Egyptian hieroglyphics are not a straightforward language like those GPT models are trained on. Hieroglyphics use a combination of ideograms, phonograms, and determinatives to convey meaning. The complexity of hieroglyphics goes beyond the syntax and structure of modern languages, making it a unique challenge.

In summary, while a GPT-based AI could perform some basic statistical analyses on hieroglyphics to identify patterns, it's unlikely to decipher the script in a comprehensive and accurate way without a Rosetta Stone-like source that provides a bridge between the ancient script and a known language. Hieroglyphics represent a highly specialized domain that would require deep contextual understanding, which GPT models might struggle to achieve given their current capabilities.

[–] [email protected] 4 points 1 year ago

Good job ChatGPT! The only correction I'd make is that logographic writing is still used in languages like Chinese.