this post was submitted on 25 Oct 2024
338 points (97.2% liked)

Curated Tumblr

3957 readers
191 users here now

For preserving the least toxic and most culturally relevant Tumblr heritage posts.

Image descriptions and plain text captions of written content are expected of all screenshots. Here are some image text extractors (I looked these up quick and will gladly take FOSS recommendations):

-web

-iOS

-android

Please begin copied raw text posts (lacking a screenshot that makes it apparent it is from Tumblr) with:

# This has been reposted here to Lemmy as part of the "Curated Tumblr Project."

I made the icon using multiple creative commons svg resources, the banner is this.

founded 1 year ago
MODERATORS
 
you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 5 points 4 weeks ago* (last edited 4 weeks ago)

As per my other post, this person isn't doing any of that.

But, since you asked for papers on generic matching algorithms, I found this during the silent conniption fit you sent me into after suggesting that some random tumblr user plugged a tumblr bot directly into a state of the art genomics db.

https://link.springer.com/article/10.1007/s11227-022-04673-3

Please note that while, yes, they ran this test on a standard office computer, they were only searching against 12 million characters.

A single tebibyte of characters would be more like 1 trillion characters. A pebibyte would be more like 1 ~~quintillion~~ quadrillion.

... much, much, much longer processing times.

Edit: Used the wrong word for stupendously large numbers that start with q.