Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
- Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
- No spam posting.
- Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
- Don't duplicate the full text of your blog or GitHub here. Just post the link for folks to click.
- Submission headline should match the article title (don't cherry-pick information from the title to fit your agenda).
- No trolling.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report them using the report flag.
Questions? DM the mods!
As many others have said, AWS have a pricing calculator that lets you determine your likely costs.
As a rough calc in the tool for us-east-2 (Ohio): if you PUT (a paid action) 1,000 objects per month of 1024 MB each (1 TB), and lifecycle-transition all 1,000 objects each month into Glacier Deep Archive (another paid action), you'll pay around $1.11 USD per month. You pay nothing to transfer the data IN from the internet.
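The estimate above can be sanity-checked by hand. This is a rough sketch, not the calculator's exact model: the three rates below are assumptions based on the publicly listed us-east pricing at the time of writing, so verify them against the current S3 pricing page before relying on them.

```python
# Back-of-envelope check of the ~$1.11/month estimate above.
# All three rates are assumptions -- confirm on the S3 pricing page.
GB_STORED = 1024                  # 1,000 objects x ~1 GiB each, i.e. ~1 TB
STANDARD_PUT_PER_1K = 0.005       # PUT into S3 Standard, per 1,000 requests
TRANSITION_PER_1K = 0.05          # lifecycle transition into Deep Archive, per 1,000
DEEP_ARCHIVE_GB_MONTH = 0.00099   # Deep Archive storage, per GB-month

monthly = (
    1000 / 1000 * STANDARD_PUT_PER_1K    # 1,000 PUT requests
    + 1000 / 1000 * TRANSITION_PER_1K    # 1,000 transition requests
    + GB_STORED * DEEP_ARCHIVE_GB_MONTH  # storage at rest
)
print(f"~${monthly:.2f}/month")
```

The storage-at-rest term dominates; the result lands in the same ballpark as the ~$1.11 figure from the calculator (the calculator adds a few small line items this sketch ignores).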
Glacier Deep Archive is what I use for my backups. I have a 2N+C backup strategy, so I only ever intend to need to restore from these backups if both of my two local copies of my data become unavailable (e.g. house fire). In that instance, I will pay a price for retrieval, as well as endure a waiting period.
You should be able to punch this info into the AWS cost calculator and see it, right? I work with AWS on a daily basis for my day job and regularly have to pull these estimates for upcoming projects. Granted, these would be estimates.
As for current costs, generally AWS lags by a couple hours to a day before costs show up in cost explorer, so not seeing them immediately isn’t too surprising.
For example, thanks to the helpful graphs at Backblaze, I immediately noticed a lot of expensive "class C" API calls, which I minimized by optimizing the cronjob to account for the daily reset and telling rclone to avoid HEAD requests. So after just a few days I noticed the problem and corrected it. If I did the same on AWS, I would have noticed it only at the end of the month, an expensive lesson.
The idea of Glacier is for you to keep data there as disaster recovery or a very deep archive, where the chances are very low that you will ever use it. Otherwise Glacier doesn't make sense: the price for data retrieval is very high, and recovering your information can take a couple of hours.
My suggestion is to stay away from it otherwise. Other than that, you can go to AWS Cost Explorer and see your cost, plus an estimate of how high your bill might be at the end of the month based on your current and past consumption.
Exactly this. It is meant as a last resort, for when all of your other backups have failed. Literally stored on tape, cold meaning it's not even turned on. Great for archive and last-resort backups, but I would only use it as backup number 3 or higher.
But wouldn't that suit OP's use case? Storing BorgBackups? That's how I use this storage tier - just in case my local copies aren't recoverable.
Ultimately he needs to make that decision; I am just saying what Glacier is usually used for.
i also use amazon, and i watch the costs like a hawk. it can explode quickly.
if you're cost-concerned, i would not recommend amazon. although it's a mature environment, you're paying for it.
that said, the cost explorer is your friend. sorting by 'usage type' over time is what made it start working for me.
i was also able to throw that metric into a default cloudfront dashboard.
if you want serious details, you may need to do as they say and create the user, to access the required metrics.
from recollection you need to create a user who can access the API required to grab the metrics you want. Even in their own system, this user needs to exist before they can show you metrics using their own api.
i ran into similar security hurdles accessing my s3 bucket procedurally.
- created a new account in https://us-east-1.console.aws.amazon.com/organizations/v2/home/accounts
- for some reason got another 12 months of free usage (which had expired on my account)
- waited 3 minutes to get the aws account id; before that it wasn't appearing
- delegated the new account in https://s3.console.aws.amazon.com/s3/lens/organization-settings/add-account?region=us-east-1
- had no idea about the password; tried to get details on the user, and it complained that "AWS Account Management trusted access is not enabled"
- enabled it here: https://us-east-1.console.aws.amazon.com/organizations/v2/home/services/AWS%20Account%20Management
- still have no idea how to access this new account; i don't know the password, they only sent me a welcome email
If you created a new account, you should have configured a root email address for it. That address should have received an email to log in and set the initial password, IIRC.
You can get an estimate of what it's going to cost by going to https://calculator.aws
Uploading to AWS shouldn't really cost much, unless you're sending a lot of PUT API requests. Since these are backups, I'm going to guess the files are large and will be uploaded as multipart uploads, which invoke multiple API calls per file.
My suggestion would be to upload it to s3 and have it automatically transition to glacier for you using a lifecycle rule.
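A lifecycle rule like the one suggested above can be set up in the console, or via boto3 as sketched below. The bucket name and rule ID are placeholders, and the actual API call is left commented out since it needs valid credentials; the configuration shape itself is standard.

```python
# Sketch of a lifecycle rule that transitions every object in a bucket
# to Glacier Deep Archive as soon as allowed. Bucket name and rule ID
# are placeholders -- adjust for your own setup.
lifecycle = {
    "Rules": [
        {
            "ID": "to-deep-archive",          # arbitrary rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},         # empty prefix = every object
            "Transitions": [
                {"Days": 0, "StorageClass": "DEEP_ARCHIVE"}
            ],
        }
    ]
}

# With credentials configured, apply it like so:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="my-backup-bucket",              # placeholder name
#     LifecycleConfiguration=lifecycle,
# )
```

Note that `Days: 0` transitions objects on the next lifecycle evaluation, so you still pay for a short stay in the S3 Standard tier before they move.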
Cost Explorer would be your best bet to get an idea of what it'll cost you at the end of the month, as it can do a prediction. There is (unfortunately) no way to see how many API requests you've already made, IIRC.
Going by the S3 pricing page, PUT requests are $0.005 per 1,000 requests (N. Virginia).
Going by a docs example:
For this example, assume that you are generating a multipart upload for a 100 GB file. In this case, you would have the following API calls for the entire process. There would be a total of 1002 API calls.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
Assuming you're uploading 10 × 100 GB according to the upload scheme mentioned above, you'd make 10,020 API calls, which at $0.005 per 1,000 works out to roughly $0.05.
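The arithmetic above can be written out explicitly. The ~100 MB part size is inferred from the docs example (1,000 parts for a 100 GB file), and the $0.005 per 1,000 PUT rate is the N. Virginia figure quoted above; check current pricing before relying on it.

```python
# Reproducing the back-of-envelope maths above.
PART_SIZE_GB = 0.1      # ~100 MB parts, inferred from the docs example
FILE_SIZE_GB = 100
FILES = 10
PUT_PER_1K = 0.005      # N. Virginia PUT price per 1,000 requests

parts = int(FILE_SIZE_GB / PART_SIZE_GB)   # 1,000 parts per file
calls_per_file = 1 + parts + 1             # initiate + parts + complete = 1,002
total_calls = FILES * calls_per_file       # 10,020 API calls
cost = total_calls / 1000 * PUT_PER_1K
print(total_calls, f"${cost:.4f}")         # 10020 $0.0501
```

So the request side of a 1 TB multipart upload is around a nickel; storage and retrieval dominate the bill, not the upload calls.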
Then there would be the storage cost on Glacier itself, plus the one day of storage on S3 before the data transitions to Glacier.
Retrieving the data will also cost you, as will downloading the retrieved data from S3 back to your device. If we're talking about a lot of small files, you might incur some additional costs from the KMS key you used to encrypt the bucket.
I typed all this on my phone and it's not very practical to research like this. I don't think I'd be able to give you a 100% accurate answer if I was on my pc.
There are some hidden costs, which aren't hidden if you know they exist.
Note that (imo) AWS is mostly aimed at larger organisations, and a lot of things (like VMs) are often cheaper elsewhere. It's the combination of everything AWS does and can do that makes it worthwhile. Once you have your data uploaded to S3, you should be able to see a decent estimate in Cost Explorer.
Note that extracting all that data back from S3 to your on-prem setup (or anywhere else, should you decide to leave AWS) will cost you a lot more than it cost you to put it there.
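To put a number on that asymmetry, here is a rough sketch of restoring 1 TB from Deep Archive. Both rates are assumptions from the publicly listed us-east pricing at the time of writing (bulk retrieval tier, standard internet egress), so verify them before planning around them.

```python
# Rough cost of pulling 1 TB back out of Deep Archive to your own machine.
# Both rates are assumptions -- verify on the current S3 pricing page.
GB = 1024
BULK_RETRIEVAL_PER_GB = 0.0025   # Deep Archive bulk retrieval tier
EGRESS_PER_GB = 0.09             # data transfer out to the internet

restore_cost = GB * (BULK_RETRIEVAL_PER_GB + EGRESS_PER_GB)
print(f"~${restore_cost:.2f} to restore 1 TB")
```

Under these assumed rates, one full restore costs on the order of a hundred dollars, versus roughly a dollar a month to keep the same terabyte at rest, and internet egress, not the Glacier retrieval itself, is the bulk of it. For a disaster-recovery copy you hope never to touch, that trade can still make sense.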
Hope this helps!
The first screen you get to when you log in should show you your predicted monthly cost alongside the current one. Failing that, you can use the AWS Cost Explorer to guesstimate how much it will be.
edit:
You can also go through the cost and billing manager in AWS.
Edit edit!:
You want to use Glacier, which is cool, pun very much intended. Plug your estimated usage into the pricing calculator and you should be good to go.
Short answer, no. Nobody knows. At least not unless you can accurately predict exactly how many API calls and how much data you will transfer.
UPDATE: after some days, the bill under https://us-east-1.console.aws.amazon.com/billing/home?region=us-east-1#/bills is populated in much more detail. Now it's much clearer.
With rclone, my test sending 131 files / 2,500 MB, configured not to chunk uploads and not to issue HEAD requests against Glacier, created:
- 110 PutObject requests to Glacier Deep Archive
- 5 InitiateMultipartUpload requests
- 5 CompleteMultipartUpload requests
- 5 UploadPart requests
- 192 PUT, COPY, POST, or LIST requests
- 111 GET and all other requests
I think now I can safely upload everything, and it shouldn't be too expensive.
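Those request counts from the bill can be priced out directly. The two per-request rates below are assumptions based on the publicly listed Deep Archive request pricing at the time of writing, so treat the result as an order-of-magnitude check, not an exact figure.

```python
# Pricing out the request counts from the bill above.
# Both rates are assumptions -- verify on the current S3 pricing page.
PUT_LIKE = 192          # PUT, COPY, POST, or LIST requests (from the bill)
GET_LIKE = 111          # GET and all other requests (from the bill)
PUT_PER_1K = 0.05       # Deep Archive PUT/COPY/POST/LIST, per 1,000
GET_PER_1K = 0.0004     # GET and other requests, per 1,000

cost = PUT_LIKE / 1000 * PUT_PER_1K + GET_LIKE / 1000 * GET_PER_1K
print(f"~${cost:.4f} in request charges for the 2.5 GB test")
```

Under these assumed rates the whole test run comes to around a cent in request charges, which supports the conclusion that the full upload won't be expensive on the request side.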