this post was submitted on 14 Jul 2023

Timeline and reasoning behind recent infra changes

Recently, you may have noticed some planned outages and site issues. I've decided to scale down the size and resilience of the infrastructure, and I want to explain why. The tl;dr is cost.

Reasons

  • I started discuss.online about 4 weeks ago. I had hoped that the reaction to Reddit's API changes would create a huge rush to something new, for the people, by the people; however, people did not respond this way.
  • I built my Lemmy instance like any other enterprise software I have worked on. I planned for reliability and performance. This, of course, costs money. I wanted to be known as the poster child for how Lemmy should operate.
  • As I built out the services from a single server instance to what it became, the cost went up dramatically. I justified this by assuming that the rush of traffic would bring in enough donors to offset the cost of better performance and reliability.
  • The traffic load on discuss.online is less than extraordinary. I've decided that I way over-engineered the resilience and scale. Some subreddits that had originally planned to stay closed decided to re-open. I no longer needed to be large.
  • The pricing of the servers had gotten way out of control: more than the cost of some of the largest instances in Lemmy, while serving a fraction of the user base.

Previous infrastructure

  • Load balancer (2 Nodes @ $24/month total)
  • Two front-end servers (2 Nodes @ $84/month total)
  • Backend Server (1 Node @ $84/month total)
  • Pictures server (1 Node @ $14/month total)
  • Database (2 Nodes @ $240/month total)
  • Object Storage ($5/month + Usage; see https://docs.digitalocean.com/products/spaces/details/pricing/)
  • Extra Volume Storage ($10/month)
  • wiki.discuss.online web node ($7/month)
  • wiki.discuss.online database node ($15/month)

[Total cost for Lemmy alone: $483 + Usage]

Additionally:

  • I run a server for log management that clears all logs after 14 days. This helps with finding issues. This has not changed. ($21/month)
  • Mastodon server & DB ($42/$15/+storage ~ $60 total/month)
  • Matrix server & DB ($42/$30/+storage ~ $75 total/month)

Total Monthly server cost out of pocket: ~$640/month.

The wiki, Mastodon, Matrix, & log servers all remained the same. The changes are for Lemmy only and will be the focus going forward.

First attempt

As you can see, it was quite large, so I decided to scale way down. I attempted this on 7/12; however, I ran into issues with configuration and database migration, and that attempt was abandoned. This is what the plan looked like:

Planned infrastructure

  • Single instance server (1 Node @ $63/month total)
    • Includes front-end, backend, & pictures server.
  • Database server (1 Node @ $60/month total)
  • Object Storage ($5/month + Usage)
  • Extra Volumes ($20 / month total)

[Total new cost: ~$150 + Usage]

Second attempt

I later discovered that the issues from the first attempt were caused by Lemmy's integration with Postgres, so I made a second attempt. This is the current state:

Current infrastructure

  • Single instance server (1 Node @ $63/month total)
    • Includes front-end, backend, & pictures server.
  • Database server (1 Node @ $60/month total)
  • Object Storage ($5/month + Usage)
  • Extra Volumes ($20 / month total)
  • wiki.discuss.online web node ($7/month)
  • wiki.discuss.online database node ($15/month)

[Total new cost for Lemmy alone: ~$170 + Usage]

New total monthly server cost out of pocket: ~$330

My bill for the current month is already higher than that, at $336, because of the previous infrastructure.
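For anyone who wants to check the arithmetic, here is a minimal sketch that adds up the line items listed above. The dictionary and function names are made up for illustration, usage-based charges are left out, and Mastodon and Matrix use the approximate totals given in the post rather than exact billing figures.

```python
# Monthly costs in USD, copied from the lists in this post.
# Usage-based charges (object storage usage, etc.) are not included.
PREVIOUS_LEMMY = {
    "load balancer": 24,
    "front-end servers": 84,
    "backend server": 84,
    "pictures server": 14,
    "database": 240,
    "object storage (base)": 5,
    "extra volume storage": 10,
    "wiki web node": 7,
    "wiki database node": 15,
}

CURRENT_LEMMY = {
    "single instance server": 63,
    "database server": 60,
    "object storage (base)": 5,
    "extra volumes": 20,
    "wiki web node": 7,
    "wiki database node": 15,
}

# Services that did not change; Mastodon and Matrix are the post's "~" totals.
OTHER_SERVICES = {
    "log server": 21,
    "mastodon (approx.)": 60,
    "matrix (approx.)": 75,
}


def monthly_total(*groups):
    """Sum the monthly cost of every line item in the given groups."""
    return sum(sum(group.values()) for group in groups)


previous_lemmy = monthly_total(PREVIOUS_LEMMY)
current_lemmy = monthly_total(CURRENT_LEMMY)
previous_all = monthly_total(PREVIOUS_LEMMY, OTHER_SERVICES)
current_all = monthly_total(CURRENT_LEMMY, OTHER_SERVICES)

print(f"Lemmy only: ${previous_lemmy} -> ${current_lemmy}")
print(f"Everything: ${previous_all} -> ${current_all}")
print(f"Monthly savings: ${previous_all - current_all}")
```

The sums land within a dollar or so of the rounded figures quoted above (~$640 before, ~$330 after), so the scale-down roughly halves the out-of-pocket bill before usage charges.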

Going forward

Going forward, I plan to monitor performance and try to balance the benefits of a snappy instance with the cost it takes to get there. I am fully invested in growing this community. I plan to continue to contribute financially and have zero expectation of having everything covered; however, community interest is very important. I'm not going to overspend for a very small set of users.

If the growth of the instance continues or rapidly changes I'll start to scale back up.

I'm learning how to run a Lemmy server. I'll adjust to keep it going.

Here are my current priorities for this instance:

  1. Security
    • This has to be number one for every instance. Where you decide to store your data is your choice again. You must be able to trust that your data is safe and bad actors cannot get it.
  2. Resilience & backups
    • Like before, it's your data and I'm keeping it useable for you. I plan to keep it that way by providing disaster recovery steps and tools.
  3. Performance
    • Performance is important to me mostly because it helps ensure trust. A site that responds well means the admin cares.
  4. Features
    • Lemmy is still very new and needs a lot of help. I plan to contribute to the core of Lemmy along with creating 3rd party tools to help grow the community. I've already begun working on https://socialcare.dev. I hope to help supplement some missing core features with this tool and allow others to gain from it in the process.
  5. User engagement
    • User engagement would be #1; however, everything before this is what makes user engagement possible. People must be using this site for it to matter and for me to justify cost and time.

Conclusion

If you notice a huge drop in performance or more issues than normal please let me know ASAP. I'd rather spend a bit more for a better experience.

Thanks, Jason

top 13 comments
[–] [email protected] 6 points 1 year ago (2 children)

Awesome! Thanks for being transparent and providing what I've found to easily be the most reliable instance out of the half dozen I've tried.

[–] [email protected] 7 points 1 year ago (1 children)

Wow, glad to hear that! I've had a few more outages than I'd hoped, just from setup and growing. With time it should be the most reliable of them all. Thanks a lot for joining!

[–] [email protected] 3 points 1 year ago (1 children)
[–] [email protected] 2 points 1 year ago

Ha, thanks a lot.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

I've noticed the same. It's a shame that most people's experience with Lemmy has been with less reliable instances. Although a lot of the problems with the biggest instances have been related to things that are not infrastructure.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

Hey, thanks for the update. I run a VPS myself over at inyo.space just to run a phpBB instance, and am aware of the costs involved. Maybe one day I'll try to get a lemmy instance running there, but I'm happy that others are getting lemmy off the ground now. I'm not sure how best to promote your instance here, so more users can subscribe to it, but if I can help, just let me know.

Cheers.

[–] [email protected] 4 points 1 year ago

Unrelated, but I think there is a PHPBB-like front-end for Lemmy, too: https://github.com/LemmyNet/lemmyBB.

Don't feel obligated to do anything. I appreciate the support and your being here to comment.

[–] [email protected] 5 points 1 year ago

Great write up. Thanks for the transparency

[–] [email protected] 3 points 1 year ago (1 children)

Thanks for sharing this. You always hear stories about scaling up, but never scaling down. “Scaling” so often refers to just scaling up, but the whole story is always more interesting.

I also wonder whether the lemmy-ui performance issues, now mostly(?) resolved in Lemmy v0.18.x, were really unfortunate timing. Maybe 99% of Reddit users would have stayed anyway. But I do wonder…

[–] [email protected] 2 points 1 year ago (1 children)

lemmy-ui and lemmy's backends are way better than before. I originally split into two webheads because the front end consumed so much CPU time and ram via unpredictable spikes that I had to do something.

It seems like the biggest issue with adoption is the discovery of communities. People do not care for federated as long as they don't realize how it works. People have to know too much about how Lemmy works to use it, and they don't care. Reddit is just easier.

[–] [email protected] 1 points 1 year ago (1 children)

This is also a reason why people like to join the biggest instance.

[–] [email protected] 3 points 1 year ago

Yes, it’ll be like that until we fix discovery. It has to be seamless for general people to use it.

[–] [email protected] 3 points 1 year ago