Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
- Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
- No spam posting.
- Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
- Don't duplicate the full text of your blog or GitHub here. Just post the link for folks to click.
- Submission headline should match the article title (don't cherry-pick information from the title to fit your agenda).
- No trolling.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report them using the report flag.
Questions? DM the mods!
As many others have said, AWS have a pricing calculator that lets you determine your likely costs.
As a rough calc in the tool for us-east-2 (Ohio): if you PUT (a paid action) 1,000 objects per month of 1024 MB each (1 TB), and lifecycle-transition all 1,000 objects each month into Glacier Deep Archive (another paid action), you'll pay around $1.11 USD per month. You pay nothing to transfer the data IN from the internet.
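The estimate above can be sanity-checked by hand. This is a rough sketch, not the calculator's exact model: the three rates below are assumptions based on the publicly listed us-east pricing at the time of writing, so verify them against the current S3 pricing page before relying on them.

```python
# Back-of-envelope check of the ~$1.11/month estimate above.
# All three rates are assumptions -- confirm on the S3 pricing page.
GB_STORED = 1024                  # 1,000 objects x ~1 GiB each, i.e. ~1 TB
STANDARD_PUT_PER_1K = 0.005       # PUT into S3 Standard, per 1,000 requests
TRANSITION_PER_1K = 0.05          # lifecycle transition into Deep Archive, per 1,000
DEEP_ARCHIVE_GB_MONTH = 0.00099   # Deep Archive storage, per GB-month

monthly = (
    1000 / 1000 * STANDARD_PUT_PER_1K    # 1,000 PUT requests
    + 1000 / 1000 * TRANSITION_PER_1K    # 1,000 transition requests
    + GB_STORED * DEEP_ARCHIVE_GB_MONTH  # storage at rest
)
print(f"~${monthly:.2f}/month")
```

The storage-at-rest term dominates; the result lands in the same ballpark as the ~$1.11 figure from the calculator (the calculator adds a few small line items this sketch ignores).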
Glacier Deep Archive is what I use for my backups. I have a 2N+C backup strategy, so I only ever intend to need to restore from these backups if both of my two local copies of my data become unavailable (e.g. house fire). In that instance, I will pay a price for retrieval, as well as endure a waiting period.
You should be able to punch this info into the AWS cost calculator and see it, right? I work with AWS on a daily basis for my day job and regularly have to pull these estimates for upcoming projects. Granted, these would be estimates.
As for current costs, generally AWS lags by a couple hours to a day before costs show up in cost explorer, so not seeing them immediately isn’t too surprising.
For example, thanks to the helpful graphs at Backblaze, I immediately noticed a lot of expensive "class C" API calls, which I minimized by optimizing the cronjob to account for the daily reset and telling rclone to avoid HEAD requests. So after just a few days I noticed the problem and corrected it. If I did the same on AWS, I would have noticed it only at the end of the month, an expensive lesson.
The idea of Glacier is for you to keep data there as disaster recovery or a very deep archive, where the chances are very low that you will ever use it. Otherwise Glacier doesn't make sense: the price for data retrieval is very high, and recovering your information can take a couple of hours.
My suggestion is to stay away from it otherwise. Other than that, you can go to AWS Cost Explorer and see your cost, plus an estimate of how high your bill might be at the end of the month based on your current and past consumption.
Exactly this. It is meant as a last resort, for when all of your other backups have failed. Literally stored on tape, cold meaning it's not even turned on. Great for archive and last-resort backups, but I would only use it as backup number 3 or higher.
But wouldn't that suit OP's use case? Storing BorgBackups? That's how I use this storage tier - just in case my local copies aren't recoverable.
Ultimately he needs to make that decision; I am just saying what Glacier is usually used for.
i also use amazon, and i watch the costs like a hawk. it can explode quickly.
if you're cost-concerned, i would not recommend amazon. although it's a mature environment, you're paying for it.
that said, the cost explorer is your friend. sorting by 'usage type' over time is what made it start working for me.
i was also able to throw that metric into a default cloudfront dashboard.
if you want serious details, you may need to do as they say and create the user, to access the required metrics.
from recollection you need to create a user who can access the API required to grab the metrics you want. Even in their own system, this user needs to exist before they can show you metrics using their own api.
i ran into similar security hurdles accessing my s3 bucket procedurally.
- created a new account in https://us-east-1.console.aws.amazon.com/organizations/v2/home/accounts
- for some reason got another 12 months of free usage (which had expired on my account)
- waited 3 minutes to get the aws account id; before that it wasn't appearing
- delegated the new account in https://s3.console.aws.amazon.com/s3/lens/organization-settings/add-account?region=us-east-1
- had no idea about the password; tried to get details on the user, and it complained that "AWS Account Management trusted access is not enabled"
- enabled it here: https://us-east-1.console.aws.amazon.com/organizations/v2/home/services/AWS%20Account%20Management
- still have no idea how to access this new account; i don't know the password, they only sent me a welcome email
If you created a new account, you should have configured a root email address for it. That address should have received an email to log in and set the initial password, IIRC.
You can get an estimate of what it's going to cost by going to https://calculator.aws
Uploading to AWS shouldn't really cost much, unless you're sending a lot of PUT API requests. Since these are backups, I'm going to guess the files are large and will be uploaded as multipart uploads, which invoke multiple API calls per file.
My suggestion would be to upload it to s3 and have it automatically transition to glacier for you using a lifecycle rule.
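A lifecycle rule like the one suggested above can be set up in the console, or via boto3 as sketched below. The bucket name and rule ID are placeholders, and the actual API call is left commented out since it needs valid credentials; the configuration shape itself is standard.

```python
# Sketch of a lifecycle rule that transitions every object in a bucket
# to Glacier Deep Archive as soon as allowed. Bucket name and rule ID
# are placeholders -- adjust for your own setup.
lifecycle = {
    "Rules": [
        {
            "ID": "to-deep-archive",          # arbitrary rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},         # empty prefix = every object
            "Transitions": [
                {"Days": 0, "StorageClass": "DEEP_ARCHIVE"}
            ],
        }
    ]
}

# With credentials configured, apply it like so:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="my-backup-bucket",              # placeholder name
#     LifecycleConfiguration=lifecycle,
# )
```

Note that `Days: 0` transitions objects on the next lifecycle evaluation, so you still pay for a short stay in the S3 Standard tier before they move.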
Cost Explorer would be your best bet to get an idea of what it'll cost you at the end of the month, as it can do a prediction. There is (unfortunately) no way to see how many API requests you've already made, IIRC.
Going by the S3 pricing page, PUT requests are $0.005 per 1,000 requests (N. Virginia).
Going by a docs example:
For this example, assume that you are generating a multipart upload for a 100 GB file. In this case, you would have the following API calls for the entire process. There would be a total of 1002 API calls.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
Assuming you're uploading 10 × 100 GB according to the upload scheme mentioned above, you'd make 10,020 API calls, which at $0.005 per 1,000 works out to roughly $0.05.
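The arithmetic above can be written out explicitly. The ~100 MB part size is inferred from the docs example (1,000 parts for a 100 GB file), and the $0.005 per 1,000 PUT rate is the N. Virginia figure quoted above; check current pricing before relying on it.

```python
# Reproducing the back-of-envelope maths above.
PART_SIZE_GB = 0.1      # ~100 MB parts, inferred from the docs example
FILE_SIZE_GB = 100
FILES = 10
PUT_PER_1K = 0.005      # N. Virginia PUT price per 1,000 requests

parts = int(FILE_SIZE_GB / PART_SIZE_GB)   # 1,000 parts per file
calls_per_file = 1 + parts + 1             # initiate + parts + complete = 1,002
total_calls = FILES * calls_per_file       # 10,020 API calls
cost = total_calls / 1000 * PUT_PER_1K
print(total_calls, f"${cost:.4f}")         # 10020 $0.0501
```

So the request side of a 1 TB multipart upload is around a nickel; storage and retrieval dominate the bill, not the upload calls.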
Then there would be the storage cost on Glacier itself, plus the one day of storage on S3 before the data transitions to Glacier.
Retrieving the data will also cost you, as will downloading the retrieved data from S3 back to your device. If we're talking about a lot of small files, you might incur some additional costs from the KMS key you used to encrypt the bucket.
I typed all this on my phone and it's not very practical to research like this. I don't think I'd be able to give you a 100% accurate answer if I was on my pc.
There are some hidden costs, which aren't hidden if you know they exist.
Note that (imo) AWS is mostly aimed at larger organisations, and a lot of things (like VMs) are often cheaper elsewhere. It's the combination of everything AWS does and can do that makes it worthwhile. Once you have your data uploaded to S3, you should be able to see a decent estimate in Cost Explorer.
Note that extracting all that data back from S3 to your on-prem setup (or anywhere else, should you decide to leave AWS) will cost you a lot more than it cost you to put it there.
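To put a number on that asymmetry, here is a rough sketch of restoring 1 TB from Deep Archive. Both rates are assumptions from the publicly listed us-east pricing at the time of writing (bulk retrieval tier, standard internet egress), so verify them before planning around them.

```python
# Rough cost of pulling 1 TB back out of Deep Archive to your own machine.
# Both rates are assumptions -- verify on the current S3 pricing page.
GB = 1024
BULK_RETRIEVAL_PER_GB = 0.0025   # Deep Archive bulk retrieval tier
EGRESS_PER_GB = 0.09             # data transfer out to the internet

restore_cost = GB * (BULK_RETRIEVAL_PER_GB + EGRESS_PER_GB)
print(f"~${restore_cost:.2f} to restore 1 TB")
```

Under these assumed rates, one full restore costs on the order of a hundred dollars, versus roughly a dollar a month to keep the same terabyte at rest, and internet egress, not the Glacier retrieval itself, is the bulk of it. For a disaster-recovery copy you hope never to touch, that trade can still make sense.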
Hope this helps!
The first screen you get to when you log in should show you your predicted monthly cost alongside the current one. Failing that, you can use the AWS Cost Explorer to guesstimate how much it will be.
edit:
You can also go through the cost and billing manager in AWS.
Edit edit!:
You want to use Glacier, which is cool, pun very much intended. Plug your estimated usage into the pricing calculator and you should be good to go.
Short answer, no. Nobody knows. At least not unless you can accurately predict exactly how many API calls and how much data you will transfer.
UPDATE: after some days, the bill under https://us-east-1.console.aws.amazon.com/billing/home?region=us-east-1#/bills is populated in much more detail. Now it's much clearer.
With rclone, my test sending 131 files / 2,500 MB, configured not to chunk uploads and not to issue HEAD requests against Glacier, created:
- 110 PutObject requests to Glacier Deep Archive
- 5 InitiateMultipartUpload requests
- 5 CompleteMultipartUpload requests
- 5 UploadPart requests
- 192 PUT, COPY, POST, or LIST requests
- 111 GET and all other requests
I think now I can safely upload everything, and it shouldn't be too expensive.
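Those request counts from the bill can be priced out directly. The two per-request rates below are assumptions based on the publicly listed Deep Archive request pricing at the time of writing, so treat the result as an order-of-magnitude check, not an exact figure.

```python
# Pricing out the request counts from the bill above.
# Both rates are assumptions -- verify on the current S3 pricing page.
PUT_LIKE = 192          # PUT, COPY, POST, or LIST requests (from the bill)
GET_LIKE = 111          # GET and all other requests (from the bill)
PUT_PER_1K = 0.05       # Deep Archive PUT/COPY/POST/LIST, per 1,000
GET_PER_1K = 0.0004     # GET and other requests, per 1,000

cost = PUT_LIKE / 1000 * PUT_PER_1K + GET_LIKE / 1000 * GET_PER_1K
print(f"~${cost:.4f} in request charges for the 2.5 GB test")
```

Under these assumed rates the whole test run comes to around a cent in request charges, which supports the conclusion that the full upload won't be expensive on the request side.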