endlesstalk.org

This Community is intended for posts about the endlesstalk.org server.

founded 2 years ago
1
2
Successfully upgraded to 0.19.7 (self.endlesstalkorg)
submitted 1 month ago* (last edited 1 month ago) by lemmy to c/endlesstalkorg
 
 

There was an issue with the database: the migration somehow took an extra 200GB of space, which caused the database on one of the servers to go down. The database switched over to its replica, so it works now, but data from a period of about 30 minutes will have been lost.

Let me know if there are any other issues.

EDIT: There does seem to be an issue with creating a post. I will look into it.

EDIT2: Fixed.

2
5
submitted 5 months ago* (last edited 5 months ago) by lemmy to c/endlesstalkorg
 
 

So I had to migrate to a new host again. Response times are a bit slow and sometimes spike. I'm looking into it.

EDIT: Should be a bit better now, but there are still some spikes in response times, so I will keep looking into it.

3
9
Welcome (self.endlesstalkorg)
submitted 1 year ago* (last edited 1 year ago) by lemmy to c/endlesstalkorg
 
 

Welcome!

Many large instances have been struggling to deal with the big increase in users from Reddit.

Therefore I wanted to set up this server to divide the load. I also really like hosting and managing a server, so I thought it was perfect for me.


About endlesstalk.org

endlesstalk.org is intended to be a serious long-term instance!

Rules and info can be found in the sidebar.


Technical setup

  • Has been built with redundancy and recoverability in mind.
  • Database and images are backed up every 2 hours, so there is always a fallback.
  • Servers for endlesstalk.org are running in the cloud and I strive to get the best mix between stability and pricing.

If you decide to join, let me know of any issues or improvements at issues and improvements

Lastly, I would appreciate any donation to cover the server costs. At the moment it's not really needed, but I would appreciate it regardless.

4
 
 

Upgrading to 0.19.7, since it contains fixes for issues discovered after 0.19.6 was released. See the release notes here for more information about the bugfixes.

For this upgrade there will be downtime. I expect it to last around 30 minutes, but that depends on how long the database migrations take. If there are any major issues with the upgrade, you can check the uptime site here or site status here.

Local time can be seen here

5
 
 

0.19.6 brings a lot of changes. See release notes here for more information.

For this upgrade there will be downtime. I expect it to last around 30 minutes, but that depends on how long the database migrations take. If there are any major issues with the upgrade, you can check the uptime site here or site status here.

Local time can be seen here

6
 
 

This should however be the last time for a long time, since I have greatly improved the setup.

The pictrs database got corrupted during the process, which is why images from the last 6 days are lost.

Let me know if there are any issues after moving to the new host.

7
4
submitted 5 months ago* (last edited 5 months ago) by lemmy to c/endlesstalkorg
 
 

The database was corrupted, so I had to recover from a backup. About 6 hours of data was lost.

~~There does seem to be a weird network error sometimes. I'm looking into it.~~ FIXED

Lastly I apologize for all the downtime lately.

8
6
submitted 5 months ago* (last edited 5 months ago) by lemmy to c/endlesstalkorg
 
 

Endlesstalk has been migrated to a new host.

There seem to have been some caching issues, which caused some certificate issues, but they appear to be fixed (at least for me).

Let me know if there are any issues after the migration.

EDIT: The 525 SSL certificate error shows up intermittently. I'm looking into it.

EDIT2: Fixed. It was an issue with the nginx load balancer.

9
 
 

Earlier today one of the servers where endlesstalk is hosted went down. After some time the server came back up, but there was an unknown issue and the server was unstable, so I began preparing to migrate endlesstalk to a new host. However, after setting up the new servers, I succeeded in getting one of the "old" servers up and running again.

Tomorrow at ~~18:00~~ 20:00 UTC the migration to the new host will begin. See local time here. There will be some downtime, probably around an hour or less.

EDIT: Server went down again, but should be back again now.

EDIT2: 20:00 UTC, since I forgot I have something from 17-19 UTC.

10
7
Successfully upgraded to 0.19.5 (self.endlesstalkorg)
submitted 6 months ago by lemmy to c/endlesstalkorg
 
 

The upgrade went smoothly and everything seems to work.

Let me know if there is anything that doesn't work after the upgrade.

11
 
 

I have found the issue with the database migration, so the upgrade to the latest version of lemmy can proceed.

0.19.5 brings a lot of smaller bugfixes. See release notes here for more information. I will also upgrade the database to a newer version (postgres 16).

For this upgrade there will be downtime, and I expect it to last around 1 hour or less. If there are any major issues with the upgrade, you can check the uptime site here or site status here.

Local time can be seen here

12
5
Upgrading to 0.19.4 postponed. (self.endlesstalkorg)
submitted 6 months ago by lemmy to c/endlesstalkorg
 
 

The database migration to 0.19.4 failed because the database schema doesn't match the state the migrations expect. The cause is probably that the database didn't restore correctly from a previous backup, but I don't know for certain.
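To illustrate what "the schema doesn't match the state the migrations expect" means, here is a minimal sketch of comparing an expected schema against the actual one. This is not how lemmy's migrations work internally; the function name `schema_drift` and the table and column names are hypothetical, and in practice the two mappings would have to be read from the databases themselves.

```python
def schema_drift(expected, actual):
    """Compare two {table_name: set_of_column_names} mappings.

    Returns a dict with one entry per table that differs, listing the
    columns that are missing from `actual` and the ones that are
    unexpected (present in `actual` but not in `expected`).
    """
    drift = {}
    for table in sorted(set(expected) | set(actual)):
        expected_cols = expected.get(table, set())
        actual_cols = actual.get(table, set())
        missing = expected_cols - actual_cols
        unexpected = actual_cols - expected_cols
        if missing or unexpected:
            drift[table] = {
                "missing": sorted(missing),
                "unexpected": sorted(unexpected),
            }
    return drift


if __name__ == "__main__":
    # Hypothetical example data, not the real lemmy schema.
    expected = {"post": {"id", "name", "url"}, "comment": {"id", "content"}}
    actual = {"post": {"id", "name"}, "comment": {"id", "content"}}
    for table, diff in schema_drift(expected, actual).items():
        print(table, diff)
```

An empty result means the two schemas agree on tables and columns; it says nothing about column types, indexes or constraints, which can also make a migration fail.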

I thought I could create a new database with a correct schema and then import the data from the current database into the new one. This might still be possible, but it simply takes too long and it has gotten too late for me (03:00 at night).

I will look into a fix for the migration, and when I have one I will announce a new date for the upgrade to 0.19.4.

13
 
 

0.19.4 brings a lot of changes. See release notes here for more information.

There should be no downtime or very minimal downtime. If there are any issues, check the uptime site here or site status here

Local time can be seen here

Note: An update to postgres 16 and pictrs 0.5 is also coming soon, which will bring some downtime. I don't know when yet, but I will post an update when I do.

EDIT: There was an issue with migrating the database while upgrading to 0.19.4, so it will take longer.

EDIT2: The database is in a different state than the migration to 0.19.4 expects. The cause is not clear, but I'm looking into it.

14
4
[Fixed] Instability of site (self.endlesstalkorg)
submitted 8 months ago* (last edited 8 months ago) by lemmy to c/endlesstalkorg
 
 

Hello

I have noticed that the server has been going down a lot, for 10-20 minutes at a time.

Unfortunately, I'm currently on vacation, so I don't think I will have the time to fix it.

I will be back tomorrow evening and will look into it and hopefully fix it then.

EDIT: There was a misconfiguration of the auto scaling setup. This scaled the system up and used all of its CPU, which caused the site to be unresponsive.

This should be fixed now, but I will keep monitoring it.

15
 
 

While working on a small fix to lemmy that was causing some unneeded CPU usage, I made a change that unfortunately caused the db and pictrs service storage to be deleted.

Thankfully I have backups of everything, so I started restoring from a backup. However, the restore was very slow, since I used a suboptimal way to back up the db (raw SQL dump). After the first restore completed, I found out that it was missing data, so I tried an older backup, but that was missing data as well. So I tried a backup from another server (since I back up to 2 different servers), which finally worked.

Restoring from backups hasn't taken too long previously, since my backups are usually fairly small, but the lemmy backup is much bigger, so I will need to look into a quicker way to restore it.

NOTE: Data from ca. 2 hours before the site went down (16-18 UTC) will be missing and I'm unable to restore it.

16
1
submitted 10 months ago* (last edited 10 months ago) by lemmy to c/endlesstalkorg
 
 

The S3 host that pictrs uses has gone down.

I might have to move to another S3 host, which means it will take a bit before images are working again. This will cause a loss of images from the last 2-3 hours before pictrs stopped working.

EDIT: I have moved to a new S3 host. I'm unsure how many images were lost during the outage.

17
1
submitted 11 months ago* (last edited 11 months ago) by lemmy to c/endlesstalkorg
 
 

Local time can be seen here

0.19.3 is mostly bugfixes. See release notes here for more information.

There should be no downtime or very minimal downtime. If there are any issues, check the uptime site here or site status here

18
1
submitted 11 months ago* (last edited 11 months ago) by lemmy to c/endlesstalkorg
 
 

Local time can be seen here

0.19.2 contains fixes for outgoing federation and a few other things. See release notes here for more information.

There should be no downtime or very minimal downtime. If there are any issues, check the uptime site here or site status here

EDIT: The server went down / was very very slow to respond. I'm not quite sure why.

19
5
The update to 0.19 (self.endlesstalkorg)
submitted 1 year ago by lemmy to c/endlesstalkorg
 
 

The 0.19 version is out and I expect to update sometime during the next week. Since it is a big release, I will need to spend some more time testing that everything still works fine and that the migration runs without problems.

I will make another post when I know when the update will take place.

20
 
 

The update fixes an issue with moderation actions sometimes not federating correctly (see release notes here).

There should be no or very minimal downtime. To convert UTC to your local time, use this

21
2
submitted 1 year ago* (last edited 1 year ago) by lemmy to c/endlesstalkorg
 
 

I expect a very minimal downtime of ca. 5-15 mins.

22
2
submitted 1 year ago by lemmy to c/endlesstalkorg
 
 

Earlier, in connection with the move to a new server, I had made some config changes to the database that caused its storage usage to grow a lot. When it had no more space left, it crashed, which caused the downtime. Unfortunately it happened at a time when I wasn't available to fix it immediately, which is why it was down for so long.

It is now fixed and I will keep watch (I have set up an alert for database disk usage) to make sure it doesn't happen again.
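As a rough illustration of what a disk usage alert checks, here is a minimal sketch. It is not the actual alert running on endlesstalk.org; the 80% threshold, the path, and the function names are assumptions made up for the example.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def should_alert(percent_used: float, threshold: float = 80.0) -> bool:
    """Return True once usage reaches the (hypothetical) alert threshold."""
    return percent_used >= threshold


if __name__ == "__main__":
    # "." stands in for the database volume's mount point (an assumption).
    percent = disk_usage_percent(".")
    if should_alert(percent):
        print(f"ALERT: disk usage at {percent:.1f}%")
    else:
        print(f"OK: disk usage at {percent:.1f}%")
```

A check like this only helps if it runs regularly and the threshold leaves enough headroom to react before the volume is actually full.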

23
1
submitted 1 year ago* (last edited 1 year ago) by lemmy to c/endlesstalkorg
 
 

It seems to have been caused by a low amount of available storage. This caused k8s to evict/delete pods -> site went down, then it would fix itself -> site went up again, but then k8s would evict again -> site went down. This continued until it stabilized at some point.

This should be fixed, and there should be more space available when I move the server to a new host. I expect to move to a new server sometime in the coming week. I will announce the date when I know it.

EDIT: Spoke a little bit too soon, should be fixed now though.

EDIT2:

Something kept using storage, so I ran into the issue again. Then the volume/storage for the image service (pictrs) stopped working for some unknown reason. Thankfully I have backups, so no images should be lost.

Good news is that I have reclaimed a lot of storage, so it shouldn't be in danger of running out of space for a long time.

24
3
Downtime from 23:32 - 23:56 (self.endlesstalkorg)
submitted 1 year ago by lemmy to c/endlesstalkorg
 
 

While preparing for the migration to a new host, I had to set up the db, and during that I deleted a resource in k8s to force a reload of settings in the db. This caused the db to use a different volume, and it took a bit before I could revert it to using the old volume.

No data should have been lost. Let me know if anything is missing.

25
 
 

Same thing as yesterday.
