Fediverse

A community to talk about the Fediverse and all its related services using ActivityPub (Mastodon, Lemmy, KBin, etc.).

If you want help moderating your own community, head over to [email protected]!

Rules

Posts must be on topic.
Be respectful of others.
Cite the sources used for graphs and other statistics.
Follow the general Lemmy.world rules.

Learn more at these websites: Join The Fediverse Wiki, Fediverse.info, Wikipedia Page, The Federation Info (Stats), FediDB (Stats), Sub Rehab (Reddit Migration), Search Lemmy


SearXNG should be a federated search engine (github.com)

submitted 3 months ago by [email protected] to c/[email protected]

All the posts about Reddit blocking everyone except Google and Brave got me thinking: what if SearXNG was federated? That is, when data is retrieved via a provider's API, that data is then federated to all other instances.

It would spread the API load out amongst instances, removing the API bottlenecks that come from search providers.

It would allow for more anonymous search, since users could cycle between instances and get the same results.

Geographic bias would be a thing of the past.

Other than ActivityPub overhead and storage, which could be reduced by federating text-only content, I fail to see any downside.
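To make the idea concrete, here is a minimal sketch of the caching behaviour described above. It is not SearXNG code and does not use ActivityPub: the `Instance` class, `fetch_upstream`, `receive`, and `normalize` are hypothetical names, peers are plain in-process objects, and the upstream provider is faked. It only shows that a query answered once by one instance needs no further provider calls on the others.

```python
def normalize(query: str) -> str:
    """Collapse case and whitespace so equivalent queries share one cache key."""
    return " ".join(query.lower().split())


class Instance:
    """One hypothetical federated search instance with a local result cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: list["Instance"] = []
        self.cache: dict[str, list[str]] = {}
        self.upstream_calls = 0

    def fetch_upstream(self, query: str) -> list[str]:
        # Stand-in for a real provider API call (assumption); text-only results.
        self.upstream_calls += 1
        return [f"result {i} for '{query}'" for i in range(1, 4)]

    def receive(self, key: str, results: list[str]) -> None:
        # Accept results pushed by a peer; keep what we already have, if anything.
        self.cache.setdefault(key, results)

    def search(self, query: str) -> list[str]:
        key = normalize(query)
        if key not in self.cache:
            # Cache miss: hit the provider once, then federate to every peer.
            results = self.fetch_upstream(key)
            self.cache[key] = results
            for peer in self.peers:
                peer.receive(key, results)
        return self.cache[key]


a, b, c = Instance("a"), Instance("b"), Instance("c")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]

first = a.search("Federated  Search")   # goes to the (fake) provider
second = b.search("federated search")   # served from b's federated cache
total_calls = a.upstream_calls + b.upstream_calls + c.upstream_calls
```

In this toy setup only instance `a` ever contacts the provider, and `b` returns the same results from its cache. A real design would also need cache expiry and some way to trust results pushed by other instances, neither of which is sketched here.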

Thoughts?

[–] [email protected] 18 points 3 months ago (1 children)

Are you thinking of YaCy?

[–] [email protected] 7 points 3 months ago* (last edited 3 months ago) (2 children)

Ah, I wondered if something like that had been tried before. Looks like it is maybe still running: https://yacy.net/

The demo isn't giving me useful search results.

[–] [email protected] 8 points 3 months ago* (last edited 3 months ago)

There have only been about 700 YaCy peers online in the last 30 days, which is pretty low for a "crowd-sourced" search engine, especially when many of those are, I think, temporary peers that come and go. It looks like it has only maybe 200 "master" servers, which wouldn't be nearly enough to keep up with the Internet these days.

The good news is that if there are websites or URLs that you care about, you can point your own YaCy instance at them and schedule the crawls to keep up with content changes.

I remember reading about YaCy some years ago, and now that I've bumped into it again it's sparked my interest. I may stand up a Docker instance and play with it for a while. If nothing else it could make a very useful "arrrrr" search engine.

[–] [email protected] 6 points 3 months ago

I ran an instance for a while out of curiosity a few years back. Building the database seemed to work fine and looked like a good idea, and it was a lot of fun seeing the connections with other servers and my crawler filling in the unknown spaces. But I think the search algorithm itself was (and most likely still is) not sophisticated enough: it just did not give relevant results often enough, and it was extremely vulnerable to very simple SEO tactics that push trash to the top.
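As an illustration of how simple SEO tactics can beat an unsophisticated ranker, here is a toy sketch. It is not YaCy's actual ranking algorithm, which I have not checked; `term_frequency_score`, `rank`, and the two sample pages are made up for the example. It scores a page purely by how often the query terms appear, which is the kind of ranking keyword stuffing defeats.

```python
def term_frequency_score(query: str, document: str) -> int:
    """Count how often each query term occurs in the document (naive scoring)."""
    words = document.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def rank(query: str, documents: dict[str, str]) -> list[str]:
    """Return document names ordered by descending naive score."""
    return sorted(
        documents,
        key=lambda name: term_frequency_score(query, documents[name]),
        reverse=True,
    )


# Hypothetical sample pages: one useful, one stuffed with the query terms.
documents = {
    "honest-guide": "a practical guide to running your own search engine at home",
    "spam-page": "search engine search engine search engine buy cheap pills search engine",
}

ranking = rank("search engine", documents)
```

Here the stuffed page scores 8 against 2 for the useful one, so it ranks first. Rankers that hold up better usually weigh other signals as well, such as links or term rarity across the whole index.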