Cleaning up our S3 buckets for fun and profit
What you see is not all that you're paying for
AWS S3 has at least two cases where it's hard to notice the "objects" for which you're still paying. Let's take a quick look at them!
Abandoned multipart uploads
When you upload a large file to S3, your client (CLI, SDK, CI tool) almost always switches to a multipart upload under the hood. The client initiates the upload, sends the parts, then tells S3 to assemble them. If something unexpected happens along the way, like your CI machine dying or your script getting Ctrl-C'd, the parts that were already uploaded stay in S3, billed at the normal storage rate. They don't show up in regular object listings, but you can see the in-progress uploads by doing the following:
aws s3api list-multipart-uploads --bucket my-bucket
We can get rid of them by setting up a simple bucket lifecycle:
{
  "Rules": [{
    "ID": "abort-incomplete-multipart",
    "Status": "Enabled",
    "Filter": {},
    "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
  }]
}
With this rule, all abandoned multipart uploads will be cleaned up after a day!
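If you're curious how much those leftover parts actually weigh before the rule kicks in, here is a minimal sketch in Python. It assumes a boto3-style S3 client is passed in (boto3 is not used anywhere else in this post), and the function name incomplete_upload_bytes is made up for illustration. Note that it counts every in-progress upload, including ones that are still legitimately running.

```python
from typing import Any, Dict, Tuple


def incomplete_upload_bytes(s3_client: Any, bucket: str) -> Dict[Tuple[str, str], int]:
    """Map (key, upload id) to the bytes stored for each in-progress multipart upload.

    s3_client is assumed to behave like boto3.client("s3"); it is passed in
    so the function itself has no hard dependency on boto3.
    """
    totals: Dict[Tuple[str, str], int] = {}
    upload_pages = s3_client.get_paginator("list_multipart_uploads").paginate(Bucket=bucket)
    for page in upload_pages:
        # "Uploads" is missing when there are no in-progress uploads.
        for upload in page.get("Uploads", []):
            part_pages = s3_client.get_paginator("list_parts").paginate(
                Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
            )
            # Sum the size of every part already stored for this upload.
            totals[(upload["Key"], upload["UploadId"])] = sum(
                part["Size"]
                for part_page in part_pages
                for part in part_page.get("Parts", [])
            )
    return totals
```

With boto3 installed and credentials configured, something like incomplete_upload_bytes(boto3.client("s3"), "my-bucket") should give a per-upload breakdown that you can sum up or sort.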
Noncurrent versions on a versioned bucket
This one is by design, but it can catch you off guard if you're not aware of it. S3 has a very neat versioning feature, which is often used for compliance, protection against accidental deletions, and similar use cases. When it's enabled, deleting an object doesn't actually remove it: S3 adds a delete marker as the latest version and keeps the previous content as a noncurrent version. Overwriting an object works the same way, with the old content kept as a noncurrent version. All those noncurrent versions are fully stored, which means you'll be billed for them.
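To get a feel for how much of a versioned bucket is made up of these hidden versions, here is a rough sketch in Python. It assumes a boto3-style S3 client is passed in (boto3 is an addition here, not something the post relies on), and summarize_versions is a hypothetical helper name.

```python
from typing import Any, Dict


def summarize_versions(s3_client: Any, bucket: str) -> Dict[str, int]:
    """Total the bytes in current vs. noncurrent versions and count delete markers.

    s3_client is assumed to behave like boto3.client("s3").
    """
    summary = {"current_bytes": 0, "noncurrent_bytes": 0, "delete_markers": 0}
    pages = s3_client.get_paginator("list_object_versions").paginate(Bucket=bucket)
    for page in pages:
        # "Versions" holds real object data; IsLatest tells current from noncurrent.
        for version in page.get("Versions", []):
            field = "current_bytes" if version["IsLatest"] else "noncurrent_bytes"
            summary[field] += version["Size"]
        # Delete markers carry no data, so they are only counted.
        summary["delete_markers"] += len(page.get("DeleteMarkers", []))
    return summary
```

If noncurrent_bytes turns out to be a big share of the total, that is the part of the bill that stays invisible in a normal listing.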
To see all versions of the objects, you can do the following:
aws s3api list-object-versions --bucket my-bucket
To automate the cleanup of older versions, we can again use a lifecycle policy:
{
  "Rules": [{
    "ID": "expire-noncurrent-versions",
    "Status": "Enabled",
    "Filter": {},
    "NoncurrentVersionExpiration": { "NoncurrentDays": 30 },
    "Expiration": { "ExpiredObjectDeleteMarker": true }
  }]
}
The ExpiredObjectDeleteMarker part is worth pointing out. Once all noncurrent versions of an object have expired, the delete marker itself can be cleaned up too. Without that flag, you accumulate orphaned delete markers that don't take up much space individually but do clutter listings.
Also note that a bucket has a single lifecycle configuration, and uploading a new one replaces the old one, so to use both rules, put them in the same Rules array.
Of course, check this setup against your compliance rules first, to make sure you're not cleaning up objects that have to be kept!
Thanks for reading, and hopefully this helps you save some $ that are mysteriously leaking from your AWS accounts.