I expect minimal downtime of about 5-15 minutes.
I had made some config changes to the database earlier in connection with the move to a new server, and these caused the storage usage of the database to grow a lot. When it had no space left, it crashed, which caused the downtime. Unfortunately, it happened at a time when I wasn't available to fix it immediately, which is why it was down for so long.
It is now fixed, and I will keep watch (set up an alert for database disk usage) to make sure it doesn't happen again.
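As a rough illustration of what a disk usage alert could look like, here is a minimal Python sketch. It is not the actual alerting setup used for the server: the 80% threshold, the checked path, and the function names are all assumptions made up for this example, and a real setup would probably send a notification instead of printing.

```python
import shutil


def disk_usage_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def should_alert(used_fraction: float, threshold: float = 0.8) -> bool:
    """Return True when usage has reached the (assumed) alert threshold."""
    return used_fraction >= threshold


if __name__ == "__main__":
    # "." is a placeholder; point this at the database volume in practice.
    fraction = disk_usage_fraction(".")
    if should_alert(fraction):
        print(f"ALERT: disk usage at {fraction:.0%}")
```

Run on a schedule (for example every few minutes), a check like this would give a warning before the volume is completely full.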
Seems to be caused by a low amount of available storage. This caused k8s to evict/delete pods -> site went down, then it would fix itself -> site goes up again, but then k8s would evict again -> site goes down. This continued until it stabilized at some point.
This should be fixed, and there should be more space available when I move the server to a new host. I expect to move to a new server sometime in the coming week. I will announce the date once I know when it will happen.
EDIT: Spoke a little bit too soon, should be fixed now though.
There was something that kept using storage, so I ran into the issue again. Then the volume/storage for the image service (pictrs) stopped working for some unknown reason. Thankfully, I have backups, so there shouldn't be any images lost.
The good news is that I have reclaimed a lot of storage, so I shouldn't be in danger of running out of space for a long time.
While preparing for the migration to a new host, I had to set up the db, but during that I deleted a resource in k8s to force a reload of settings in the db. This caused the db to use a different volume, and it took a bit before I could revert it back to using the old volume.
No data should have been lost. Let me know if anything is missing.
Same thing as yesterday.
Unfortunately, the tool for scanning for CSAM didn't detect the image, so to ensure there is no CSAM on the server, images from the last 8 hours have been deleted.
The filesystem manager (Longhorn) I use reported that multiple volumes were faulted. This caused the site to go down.
I have no idea why the volumes faulted, only that a reboot of the server fixed it. Hopefully this was a strange one-off and it doesn't occur again.
To make it easier to defederate from unwanted instances, I have switched to using Fediseer. With this tool I can get censures from other trustworthy instances. A censure is a "completely subjective negative judgement" (see more here), and reasons for the censure can be listed.
Currently I'm using the censures from lemmy.dbzer0.com (can be seen here) that have any of the following reasons for the censure:
- Hate speech
I will still manually defederate from instances when it is needed, but this makes it easier to defederate from bad instances I would have missed or didn't know about.
Note: The automated defederation also includes spam instances, which are currently defined by:
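To illustrate the idea of picking censures by reason, here is a small Python sketch. It does not use the real Fediseer API: the record shape (a dict with "domain" and "reasons" keys), the function name, and the example domains are all hypothetical and only meant to show the filtering step.

```python
def domains_to_defederate(censures, wanted_reasons):
    """Return the sorted domains whose censure lists at least one wanted reason.

    `censures` is a list of hypothetical records such as
    {"domain": "bad.example", "reasons": ["hate speech"]}.
    Reasons are compared case-insensitively.
    """
    wanted = {reason.lower() for reason in wanted_reasons}
    return sorted(
        censure["domain"]
        for censure in censures
        if wanted & {reason.lower() for reason in censure["reasons"]}
    )


if __name__ == "__main__":
    example = [
        {"domain": "hate.example", "reasons": ["Hate speech"]},
        {"domain": "other.example", "reasons": ["some other reason"]},
    ]
    print(domains_to_defederate(example, ["hate speech"]))
```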
- More than 30 registered users per local post + comments
- More than 500 registered users per active monthly user.
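The two spam thresholds above can be written as a small check. This is only a sketch of the arithmetic, not the code the tool actually runs: the function and parameter names are made up, and it assumes an instance counts as spam if either condition holds, which the list above does not state explicitly.

```python
def looks_like_spam(registered_users, local_posts, local_comments,
                    active_monthly_users):
    """Apply the two thresholds from the list above.

    Multiplication is used instead of division so that an instance with
    zero posts/comments or zero active users does not cause a division error.
    """
    # More than 30 registered users per local post + comment.
    too_little_content = registered_users > 30 * (local_posts + local_comments)
    # More than 500 registered users per active monthly user.
    too_few_active = registered_users > 500 * active_monthly_users
    # Assumption: either condition alone is enough to flag the instance.
    return too_little_content or too_few_active


if __name__ == "__main__":
    # 10000 users but only 200 posts + comments: 10000 > 30 * 200, so flagged.
    print(looks_like_spam(10000, 100, 100, 1000))
```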
There has been a report of CSAM, and unfortunately lemmy-safety doesn't go through the images quickly enough (on my hardware) to be of use in this case.
I think there exists a tool to purge an image via information from a post, but I wasn't able to find it now. In the future I can hopefully use that tool when reports of CSAM come in.
Many large instances have been struggling to deal with the big increase in users from Reddit.
Therefore I wanted to set up this server to divide the load. I also really like hosting and managing a server, so I thought it was perfect for me.
endlesstalk.org is intended to be a serious long-term instance!
Rules and info can be found in the sidebar.
- Has been built with redundancy and recoverability in mind
- Database and images are backed up every 2 hours, so there is always a fallback.
- Servers for endlesstalk.org are running in the cloud, and I strive to get the best mix between stability and pricing.
If you decide to join, let me know of any issues or improvements at issues and improvements
Lastly, I would appreciate any donation to cover the server costs. At the moment it's not really needed, but I would appreciate it regardless.