this post was submitted on 14 Mar 2024
94 points (100.0% liked)

Blahaj Lemmy Meta

2300 readers

Blåhaj Lemmy is a Lemmy instance attached to blahaj.zone. This is a group for questions or discussions relevant to either instance.

founded 2 years ago

In the last 24 hours you've likely noticed that we've had some performance issues on Blåhaj Lemmy.

The initial issue occurred as a result of our hosting provider having technical problems. We use Hetzner, which provides hosting for approximately a third of the fediverse, so there was widespread chaos above and beyond us.

As of lemmy 0.19.x, messages queue rather than being silently dropped when an instance is down, so once Hetzner resolved their issues, we had a large backlog of jobs to process. Whilst we were working through the queues, we were operational but laggy, and our messages were an hour or more behind. These queues aren't just posts and replies; they also include votes, so there can be a large volume of items. Each one needs to be remotely verified with the sending instance as we process it, so geographical latency also plays a part.
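To get a feel for why that remote verification step dominates, here's a rough back-of-the-envelope model. The numbers are purely illustrative (not measured from our instance): if each queued activity costs one round trip to the sending instance, the drain time is bounded by latency, not CPU.

```python
# Illustrative model only: why draining a federation backlog is
# latency-bound when every activity needs one remote verification
# round trip, processed one at a time.

def drain_time_hours(backlog, round_trip_ms):
    """Hours to clear `backlog` activities at one sequential
    round trip of `round_trip_ms` each."""
    per_item_s = round_trip_ms / 1000.0
    return backlog * per_item_s / 3600.0

# e.g. 100,000 queued activities at ~250 ms per verification:
print(round(drain_time_hours(100_000, 250), 1))  # ≈ 6.9 hours
```

Halving the round-trip time halves the drain time, which is why geographical distance between instances matters so much here.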

As you can see from the graph, we are finally through the majority of the queues.

The exception is lemmy.world. Unfortunately, the lemmy platform processes incoming messages on a sequential basis (think of it as a sequential queue for each remote instance), which means Blahaj Lemmy can't process a second lemmy.world message until we've finished processing the first message.
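That per-instance sequential behaviour can be sketched roughly like this. This is a hypothetical illustration, not Lemmy's actual code (Lemmy is written in Rust; all names here are made up): one FIFO queue per remote instance, with only one activity in flight per instance at a time.

```python
from collections import deque

queues = {}  # one FIFO queue per remote instance

def verify_with_sender(instance, activity):
    """Stand-in for the signed fetch back to the sending instance."""
    return True

def apply_locally(activity):
    """Stand-in for writing the post/reply/vote to the local database."""
    return activity

def receive(instance, activity):
    queues.setdefault(instance, deque()).append(activity)

def process_next(instance):
    # Head-of-line blocking: a slow lemmy.world item delays every
    # later lemmy.world item, but never items from other instances.
    queue = queues.get(instance)
    if queue:
        activity = queue.popleft()
        if verify_with_sender(instance, activity):
            apply_locally(activity)
        return activity
    return None
```

The key property is that each remote instance's queue is independent, so lemmy.world's backlog doesn't delay, say, reddthat's, but nothing within a single instance's queue can be reordered or overlapped.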

Due to the size of Lemmy.world they are sending us new queue items almost as fast as our instance can process them, so the queue is coming down, but slowly! In practical terms, this means that lemmy.world communities are going to be several hours behind for the next few days.
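The "coming down, but slowly" effect falls out of simple queue arithmetic: the backlog only shrinks at the *difference* between the processing rate and the arrival rate. The rates below are invented for illustration, not measurements from our instance.

```python
# Rough catch-up model: new activities arrive almost as fast as
# old ones are processed, so the net drain rate is tiny.

def catch_up_hours(backlog, process_per_s, arrive_per_s):
    """Hours until the queue is empty, or infinity if arrivals
    match or exceed processing."""
    if process_per_s <= arrive_per_s:
        return float("inf")  # the queue never drains
    return backlog / (process_per_s - arrive_per_s) / 3600.0

# 50,000-item backlog, processing 4/s while 3.5/s keep arriving:
print(round(catch_up_hours(50_000, 4.0, 3.5), 1))  # ≈ 27.8 hours
```

Even though the instance is processing nearly as fast as items arrive, the net drain of 0.5 items/second turns a 50,000-item backlog into more than a day of catch-up.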

For those that are interested, there is a detailed technical breakdown of a similar problem currently being experienced by reddthat, that explores the impact of sequential processing and geographical latency.

top 10 comments
[–] [email protected] 16 points 7 months ago (2 children)

Yowza! If lemmy.world gets any bigger will others be able to keep up?

[–] [email protected] 18 points 7 months ago

As long as we get the ability to process incoming queues in parallel, rather than in sequence, then it will be fine. But until that happens, yeah, this will become more of a problem.

[–] [email protected] 8 points 7 months ago (1 children)

I was worried about this too! Lemmy isn't that large for a social media site - does this mean Lemmy will scale poorly if it gets larger? Is it disingenuous to pretend self-hosting is a good idea as the fediverse grows? Doesn't seem feasible unless you've got a CS degree.

Hope the developers have some tricks up their sleeves. That reddthat post seems... troubling?

I'm out of my depth though, I'm not a programmer (unless you count some ugly scripting in R).

[–] [email protected] 10 points 7 months ago

All it means is that lemmy needs to implement the ability to process queues in parallel instead of in serial, so that one message doesn't hold up the next.

[–] [email protected] 9 points 7 months ago (1 children)

How does mastodon sync after downtime? Thanks for sharing, ada!

[–] [email protected] 8 points 7 months ago

I've never had much to do with Mastodon, so I'm not sure if it processes inbound queue items sequentially or not, but the process is otherwise very similar.

[–] [email protected] 4 points 7 months ago (1 children)

I'm curious whether it's possible to run multiple queues, and if not, to rate limit lemmy.world.

[–] [email protected] 9 points 7 months ago (1 children)

There are multiple queues, but each inbound instance queue is serial. At the moment it's not possible to split the incoming jobs from one instance across several concurrent queues, but one hopes it will be in the future. That would bypass the need for rate limiting.
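One hypothetical way such parallelism could work (this is a sketch of the general technique, not anything Lemmy has implemented or proposed): shard a single instance's inbound jobs by the object they target, so that activities touching the same post still apply in order, while unrelated activities run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # illustrative worker count

def shard_for(activity):
    # Activities on the same object always land in the same shard,
    # so their relative order is preserved.
    return hash(activity["object_id"]) % NUM_SHARDS

def process(activity):
    """Stand-in for verify-with-sender + apply-locally."""
    return activity["id"]

def process_in_parallel(activities):
    shards = [[] for _ in range(NUM_SHARDS)]
    for a in activities:
        shards[shard_for(a)].append(a)  # arrival order kept per shard
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        # Each shard is still serial internally; shards run concurrently.
        results = pool.map(lambda shard: [process(a) for a in shard], shards)
    return [r for shard in results for r in shard]
```

Sharding like this trades strict global ordering for throughput while keeping the ordering that actually matters (e.g. a vote and its retraction on the same post), which is why it could remove the need to rate limit any single instance.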

[–] [email protected] 3 points 7 months ago

Interesting, gotcha. I'm not a backend dev so I assume you have more info than I do on all this, just spitballing.

[–] [email protected] 3 points 7 months ago

Seems like the two devs will have to take a look at parallel processing. Serial is OK whilst the threadiverse is small, but even though it's not huge (like twitter/reddit/etc), it's certainly big enough now that serial processing is just not up to the job.