Lemmy Server Performance

lemmy_server uses the Diesel ORM, which automatically generates SQL statements. There are serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, client/server code, SQL data, server operator apps, server operator API (performance and storage monitoring), etc.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

An even less-intrusive approach is to not add any new field to existing tables. Instead, establish a reference table, say called include_range. There is already an ENUM value for each sort type, so give the include_range table these columns: sort_type ENUM, lowest_id BigInt, highest_id BigInt.
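
A minimal DDL sketch, assuming the existing Postgres enum type for sort types is named sort_type_enum (use whatever it is actually called):

CREATE TABLE include_range (
    sort_type sort_type_enum NOT NULL,  -- assumed name of the existing sort ENUM type
    lowest_id bigint NOT NULL,
    highest_id bigint NOT NULL,
    PRIMARY KEY (sort_type)
);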

Run a variation of this to populate that table:

-- for each community, rank posts newest-first, keep the top 1000,
-- then capture the overall id range across those rows
SELECT MIN(id) AS lowest_id, MAX(id) AS highest_id
FROM
  (
     SELECT id, community_id, published,
        rank() OVER (
           PARTITION BY community_id
           ORDER BY published DESC, id DESC
           ) AS rank
     FROM post_aggregates) ranked_recency
WHERE rank <= 1000

Run that against every sort order, including Old. Capture only two BigInt results, the MIN(id) and the MAX(id), which give a range over the whole table. Then every SELECT on the post_aggregates / post table includes a WHERE id >= lowest_id AND id <= highest_id clause.
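
For example, a sketch of how a listing query would pick up that range ('Active' stands in for whichever sort_type the request is serving; the usual ORDER BY / LIMIT is elided):

SELECT pa.*
FROM post_aggregates pa
JOIN include_range ir ON ir.sort_type = 'Active'  -- assumed enum label for the sort being served
WHERE pa.id >= ir.lowest_id
  AND pa.id <= ir.highest_id;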

That would put in a basic sanity check that ages out content, and it would be right against the primary key!

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

A core design point of either approach is that server operators can modify how this data is built without needing to modify or restart the lemmy_server Rust code.
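
For instance, the whole refresh could live in SQL that an operator runs from psql or a cron job; a sketch against the include_range table sketched above ('New' is an assumed sort_type label):

-- recompute the id range for one sort and upsert it
INSERT INTO include_range (sort_type, lowest_id, highest_id)
SELECT 'New', MIN(id), MAX(id)
FROM (
   SELECT id,
      rank() OVER (
         PARTITION BY community_id
         ORDER BY published DESC, id DESC
         ) AS rank
   FROM post_aggregates) ranked_recency
WHERE rank <= 1000
ON CONFLICT (sort_type)
   DO UPDATE SET lowest_id = EXCLUDED.lowest_id,
                 highest_id = EXCLUDED.highest_id;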

Using a smallint also gives some flexibility (or a new field if going with the id min/max approach): if the requested page is greater than 10 for a particular sort, move up to include > 1 and fall into tiers.
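
A rough sketch of that tiering, assuming the earlier smallint include column on post_aggregates (1 = freshest tier, 2 = the next tier) and :page as a placeholder for the requested page number:

-- page deeper than 10: widen the filter to the next tier
SELECT pa.*
FROM post_aggregates pa
WHERE pa.include <= CASE WHEN :page > 10 THEN 2 ELSE 1 END
ORDER BY pa.published DESC
LIMIT 20 OFFSET (:page - 1) * 20;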