this post was submitted on 24 Aug 2023

Lemmy Project Priorities Observations


I've raised my voice loudly on meta communities and GitHub, and created the new [email protected] and [email protected] communities.

I feel like the performance problems have been ignored for over 30 days when there are a half-dozen solutions that could each be coded in 5 to 10 hours of labor by one person.

I've been developing client/server messaging apps professionally since 1984, and I firmly believe that Lemmy is currently suffering from a lack of testing by the developers and a lack of concern for data loss. A basic e-mail MTA in 1993 would send a "did not deliver" message back to the sender, but Lemmy just drops the delivery, and there is no mention of this in the release notes/introduction on GitHub. I also find that the Lemmy developers do not like to "eat their own dog food" and actually use Lemmy's communities to discuss the ongoing development and priorities of Lemmy coding. They are not testing the code or sampling the data very much, and I am posting here, using Lemmy code, as part of my personal testing! I spent over 100 hours in June 2023 testing Lemmy technical problems, especially with performance and lost data delivery.

I'll toss it into this echo chamber.


Staff of the Lemmy instance Beehaw, on Monday, August 21, 2023:
https://beehaw.org/comment/1018508

"From where I’m standing, I can’t really much has changed unfortunately… which really sucks…

Lemmy.world has grown substantially meanwhile the moderation tools have not improved at all. All I can say about the moderation tools is that we now know that the tools suck more than they used to.

Here’s a list of moderation problems that we have discovered since then:

  • If a person is reported on another instance, we never get the report.
  • If a mod is banned from the community they mod, they can still take mod actions
  • If you get site-banned from Beehaw while you are from another instance, you can still post on the community and people from that instance and kbin can see your posts
  • People from other instances can’t know if someone is an admin on the instance they’re interacting with
  • People from other instances can’t see when we use the shield function to signal we’re talking “officially / as a mod”
  • The modlog is not chronological
  • The modlog breaks if you ban someone for more than 4 digit days.

A banned user’s description is still visible so if they link to a scat image in their description, it is still visible to moderators. Despite these newly known problems, there has been exactly no improvement whatsoever to the moderation tools. It is honestly unsettling and terrifying."

Context: Lemmy has been on GitHub and in production at Lemmy.ml for over 4 years for the purposes of running and moderating a message forum / link aggregator. Beehaw had been online for over a year before the May 2023 Reddit influx.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (3 children)

Lemmy had been on GitHub for over 4 years when the rumblings out of Reddit began in May 2023, and by June 1 there was a countdown of 30 days...

What Lemmy lacked was data-centered testing. First off, the quantity of data: there was no means in the project to populate even what might be considered a modest amount of data, say 500 communities, 10 thousand posts, 20 thousand comments, 5000 users. The testing servers that were in place in June 2023 (enterprise, voyager, etc.) were empty. They had some basic scripts to test posts and comments, but that was feature-focused testing, not testing against tens of thousands of rows.

Lemmy was only tested against nearly empty databases; that is how the testing had been done for over 4 years.
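To make the "modest amount of data" concrete, here is a minimal sketch of a seeding script at the sizes mentioned above. It is not Lemmy's code: the four-table schema, the column names, and the `seed` / `row_counts` helpers are all made up for illustration, and it uses an in-memory SQLite database instead of PostgreSQL so it runs anywhere.

```python
import random
import sqlite3

# Target sizes from the comment above; table names are a hypothetical stand-in schema.
SIZES = {"person": 5_000, "community": 500, "post": 10_000, "comment": 20_000}


def seed(conn, sizes=SIZES, rng_seed=0):
    """Fill a toy schema with synthetic rows at the requested sizes."""
    rng = random.Random(rng_seed)  # fixed seed so every run produces the same data
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE community (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE post (id INTEGER PRIMARY KEY, community_id INTEGER NOT NULL,
                           creator_id INTEGER NOT NULL, title TEXT NOT NULL);
        CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER NOT NULL,
                              creator_id INTEGER NOT NULL, body TEXT NOT NULL);
    """)
    conn.executemany(
        "INSERT INTO person (id, name) VALUES (?, ?)",
        ((i, f"user{i}") for i in range(1, sizes["person"] + 1)),
    )
    conn.executemany(
        "INSERT INTO community (id, name) VALUES (?, ?)",
        ((i, f"community{i}") for i in range(1, sizes["community"] + 1)),
    )
    # Posts and comments get random parents and authors from the rows created above.
    conn.executemany(
        "INSERT INTO post (id, community_id, creator_id, title) VALUES (?, ?, ?, ?)",
        ((i, rng.randint(1, sizes["community"]), rng.randint(1, sizes["person"]),
          f"post {i}") for i in range(1, sizes["post"] + 1)),
    )
    conn.executemany(
        "INSERT INTO comment (id, post_id, creator_id, body) VALUES (?, ?, ?, ?)",
        ((i, rng.randint(1, sizes["post"]), rng.randint(1, sizes["person"]),
          f"comment {i}") for i in range(1, sizes["comment"] + 1)),
    )
    conn.commit()


def row_counts(conn):
    """Return the number of rows in each seeded table."""
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in SIZES}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    seed(conn)
    print(row_counts(conn))
```

Even a throwaway script like this gives the test servers something other than empty tables to run queries against.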

With ORM-generated SQL statements and a lack of PostgreSQL understanding, there were big problems in the code in May 2023. The site_aggregates table had a SQL TRIGGER whose UPDATE statement had no WHERE clause, so it was updating the row of every known instance in the database. Testing only created 5 instances, so this performance problem was not noticed. Lemmy-ui gave the admin/operator managing a site no page to view this site_aggregates database table.
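The effect of a missing WHERE clause in a trigger can be shown with a toy reproduction. This is a sketch, not Lemmy's actual trigger: it uses SQLite rather than PostgreSQL, and the `comment` table, the `comments` counter column, and the `rows_touched` helper are assumptions made for the example. Only the table name site_aggregates comes from the description above.

```python
import sqlite3


def rows_touched(where_clause, n_sites):
    """Insert one comment and report how many site_aggregates rows the trigger changed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(f"""
        CREATE TABLE site_aggregates (site_id INTEGER PRIMARY KEY,
                                      comments INTEGER NOT NULL DEFAULT 0);
        CREATE TABLE comment (id INTEGER PRIMARY KEY, site_id INTEGER NOT NULL);
        CREATE TRIGGER comment_count AFTER INSERT ON comment
        BEGIN
            UPDATE site_aggregates SET comments = comments + 1 {where_clause};
        END;
    """)
    # One aggregates row per known instance, all counters starting at 0.
    conn.executemany(
        "INSERT INTO site_aggregates (site_id) VALUES (?)",
        ((i,) for i in range(1, n_sites + 1)),
    )
    conn.execute("INSERT INTO comment (site_id) VALUES (1)")
    conn.commit()
    # Any row whose counter moved off 0 was written by the trigger.
    return conn.execute(
        "SELECT COUNT(*) FROM site_aggregates WHERE comments > 0"
    ).fetchone()[0]


if __name__ == "__main__":
    for n in (5, 600):
        unscoped = rows_touched("", n)
        scoped = rows_touched("WHERE site_id = NEW.site_id", n)
        print(f"{n} instances: no WHERE writes {unscoped} rows, with WHERE writes {scoped}")
```

With 5 instances the unscoped version writes only 5 rows per comment, which is easy to miss in testing; with hundreds of instances every single comment or post rewrites hundreds of rows.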

Scaling and performance issues were already crashing lemmy.ml in May. They opted to close the doors for new member sign-up and did massive hardware upgrades on June 13. But the performance problems and crashes did not stop; they only worsened with each new instance going online, with the site_aggregates write-activity bug going undiscovered. To try and solve crashing, more and more instances were added to the Lemmy network, exploding from 60 to 600 servers, even over 1500 at one point, almost all because of the Reddit July 1 API cutoff. So on June 20 you had lemmy.ml hammering hundreds of instance rows in the site_aggregates database table with SQL UPDATE write activity on every single new comment and post...

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Given the struggle with PostgreSQL that the project has had for 4 years, and that the two main developers both admit they are not good with SQL relational databases... the classic answer would be to use what are called "NoSQL" database solutions. https://en.wikipedia.org/wiki/NoSQL

Interesting in light of the whole May 2023 situation... Reddit itself was open source in 2008 and used the same PostgreSQL database that Lemmy uses! Reddit's creators gave a presentation in May 2010 about how to scale a link aggregator website like Lemmy, and they advised basically using NoSQL techniques with PostgreSQL. https://www.infoq.com/news/2010/05/7-Lessons-Reddit/

[–] [email protected] 1 points 1 year ago

Lemmy had been on GitHub for over 4 years

And in use... lemmy.ml was online for over 4 years before the Reddit May 2023 event, not just on GitHub:

"Lemmy - A link aggregator / reddit clone for the fediverse. github.com" @Joe to Rust Programming • April, 2019. https://lemmy.ml/post/17