Breaking Down AWS Cost and Usage Reports (CUR): A Step-by-Step Guide – Part 1
I’ve spent years helping public sector customers here in Australia untangle their cloud spend. Let me tell you, I’ve seen smart people stumble, stuck trying to puzzle out cost data line by line. But what if you had a ledger that revealed every penny spent, a record that not only guided you toward more thoughtful usage, but also helped you implement a chargeback model to drive accountability?
You might have heard me talk about nurturing a cost-aware culture before. Now let’s take a closer look at the AWS Cost and Usage Report (CUR) and learn how to use it as the backbone for fair and transparent cost distribution. We’ll start simple—setup, key features, and data structuring—then get more practical. By the time we’re done with this post, I hope you’ll have fresh ideas on how to assign AWS costs back to those who consumed them, making each team think carefully before they spend.
Understanding AWS Cost and Usage Reports (CUR)
Back to basics. What is the AWS CUR, and why do I keep talking about it? It’s your magnifying glass on the whole estate. Unlike those high-level summaries that leave you guessing, CUR lays out how much each resource consumed, which account generated it, and what tags are attached. It’s the detailed record you need when you want to say, “This service cost $X to run in Region Y on Account Z last Tuesday.” And when combined with proper tagging and cost categories, you have a data source for meaningful chargebacks. Without CUR, distributing costs back to those who caused them is guesswork. With CUR, it’s a matter of well-structured queries.
What Exactly Does CUR Contain?
Think of CUR as a ledger that records not just what you spent, but who spent it, when, and on which resource. Everything’s there: billing data, usage details, metadata like account IDs, regions, tags, and even third-party charges. But be warned, it’s detailed. The first time you open one, you might feel as if you’re looking at a thousand-page phone directory. Yet that granularity is the key to fair internal cost assignments. Want to know who racked up all that EC2 spend? It’s right there. Want to pinpoint which project left a large storage bucket half full and running all month? CUR shows you.
Peering into the Details
One of the finer aspects of CUR is how it can drill down by service, region, account, or even custom tags. That means you can slice your costs by team or project, then map those slices back to your chargeback model. Picture it as splitting a group dinner bill. Instead of splitting it evenly and upsetting the quiet intern who only had a sandwich, you charge each person for what they actually ordered.
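To make the dinner-bill idea concrete, here is a minimal sketch of the chargeback principle, with hypothetical team names and spend figures rather than real CUR data. It apportions a shared cost (say, a support fee) in proportion to each team’s direct spend, instead of an even split:

```python
def split_shared_cost(direct_spend, shared_cost):
    """Apportion a shared cost (e.g. support fees) in proportion to
    each team's direct spend, instead of splitting it evenly."""
    total = sum(direct_spend.values())
    return {
        team: shared_cost * spend / total
        for team, spend in direct_spend.items()
    }

# Hypothetical monthly direct spend per team.
direct = {"Marketing": 600.0, "Data": 300.0, "Intern": 100.0}

print(split_shared_cost(direct, 50.0))
# → {'Marketing': 30.0, 'Data': 15.0, 'Intern': 5.0}
```

The quiet intern pays for the sandwich, not a twelfth of the banquet, and each team’s share moves with its actual usage.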
Choosing the Right Format and Delivery
You can get CUR data delivered to an Amazon S3 bucket in CSV or Parquet form. CSV works when you’re tinkering in a spreadsheet, but Parquet is usually best. Smaller file sizes, column-based storage, and compatibility with Athena make it much smoother to query at scale. You can schedule data delivery hourly, daily, or monthly. Hourly granularity can help you attribute short-term spikes and make chargebacks more precise. After all, if a team spun up a cluster for a day-long test, wouldn’t you rather capture that cost the moment it appears?
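If you prefer scripting over console clicks, a sketch along these lines builds the report settings just described: hourly granularity, Parquet format, and resource IDs included. The bucket, prefix, and report name are placeholders, and the boto3 call is shown commented out rather than executed:

```python
def build_cur_definition(bucket, prefix, region):
    """Assemble a CUR report definition dict matching the settings
    discussed above (placeholder names, adjust to your environment)."""
    return {
        "ReportName": "DetailedCostReport",
        "TimeUnit": "HOURLY",           # hourly granularity captures short-lived spikes
        "Format": "Parquet",            # columnar storage, Athena-friendly
        "Compression": "Parquet",       # Parquet reports use Parquet compression
        "AdditionalSchemaElements": ["RESOURCES"],  # include resource IDs
        "S3Bucket": bucket,
        "S3Prefix": prefix,
        "S3Region": region,
        "ReportVersioning": "CREATE_NEW_REPORT",
    }

definition = build_cur_definition("my-cur-bucket", "cur/", "ap-southeast-2")

# With credentials configured, the definition could then be registered:
# import boto3
# boto3.client("cur", region_name="us-east-1").put_report_definition(
#     ReportDefinition=definition)
print(definition["TimeUnit"])
```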
Understanding the Differences: Legacy CUR vs CUR 2.0
If you’ve used CUR before, you might wonder, “What really changed with this newer version?” Let’s break it down:
- Data Structure Improvements: CUR 2.0 organises cost and usage data more consistently. Legacy CUR often forced you to rely on column names that weren’t aligned across services. In CUR 2.0, product, pricing, and resource information are wrapped into structured objects, which makes the data easier to query with Athena or other tools. Instead of juggling columns named differently for different services, you get a more uniform, predictable schema.
- Integration with Other Services: Legacy CUR was mostly a static dataset you downloaded and processed. CUR 2.0 feels more like part of the AWS ecosystem. For example, it is now easier to publish CUR data to AWS Data Exchange, which lets you deliver curated cost data to partners, vendors, or separate internal departments more gracefully.
- Granular Permissions and Access Control: Access management in Legacy CUR sometimes felt a bit clumsy. CUR 2.0 integrates better with AWS Identity and Access Management, so you can define who can access which slice of cost data without extra hassle. If you’re doing chargebacks, you may want certain teams to query their own part of the data without exposing the rest, and CUR 2.0 is designed to make that easier to achieve.
- Performance and Processing Enhancements: CUR 2.0 leans into Parquet as its format of choice, which delivers better query performance at scale. Huge CSV files in Legacy CUR could become unwieldy; with CUR 2.0, data processing is smoother, so you can run more frequent and more detailed queries without waiting too long. When you’re doing daily or hourly chargeback calculations, this speed-up makes a big difference.
- Easier Migration and Better Guidance: CUR 2.0 comes with better-aligned documentation and examples for querying and managing your cost data. The improved structure and integrated approach mean less trial and error, and you can trust the official docs and guides to get started faster. Where Legacy CUR sometimes left you scratching your head, CUR 2.0 is far more consistent.
Read more about CUR 2.0 in the AWS documentation.
A Quick Look at CUR 2.0
In short, CUR 2.0 aims to reduce friction. Tighter integration with services such as AWS Data Exchange helps distribute cost data to external departments or partners. The tidier data structure makes it easier to define who can access what, and the faster processing behind the scenes makes it practical to run queries for daily or even hourly chargeback runs. If Legacy CUR was a steady old car, CUR 2.0 is a tuned-up one that gets you where you need to go in less time.
Enabling Chargeback with a Centralised Storage Pattern
I remember speaking with a CIO who said, “We added another account and now cost reporting feels like starting from scratch.” That’s where the centralised storage pattern helps. It sets up a nominated account with a secure S3 bucket to gather CUR data from multiple payer accounts. So whether you manage one account or a dozen, you can store all usage data in a single place. And this is important: by analysing everything centrally, you can map costs back to the right teams using a uniform method. No more guesswork, no more silos.
This approach also provides a neater security story. You store your data in one bucket with strict access controls. You add encryption, keep logs for traceability, and apply lifecycle rules. It’s tidy and secure. Once that’s in place, introducing a new payer account is as simple as updating parameters and waiting for the next delivery. Your chargeback model grows with you, and your teams can’t say they didn’t know what they owed, because the data was never missing.
```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Create a central S3 bucket in the nominated account for CUR replication.

Parameters:
  ReplicationSourceAccounts:
    Type: CommaDelimitedList
    Description: "List of AWS account IDs authorised to replicate CUR data to this bucket."
  KMSKeyArn:
    Type: String
    Default: ""
    Description: "Optional KMS Key ARN for bucket encryption. Leave blank for default encryption."

Conditions:
  IsKMSKeyProvided: !Not [!Equals [!Ref KMSKeyArn, ""]]

Resources:
  CentralCURBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: central-cur-bucket  # S3 bucket names are global; adjust to something unique
      VersioningConfiguration:
        Status: Enabled
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      BucketEncryption: !If
        - IsKMSKeyProvided
        - ServerSideEncryptionConfiguration:
            - ServerSideEncryptionByDefault:
                SSEAlgorithm: aws:kms
                KMSMasterKeyID: !Ref KMSKeyArn
        - ServerSideEncryptionConfiguration:
            - ServerSideEncryptionByDefault:
                SSEAlgorithm: AES256
      LifecycleConfiguration:
        Rules:
          - Id: TransitionToIA
            Status: Enabled
            Transitions:
              - StorageClass: STANDARD_IA
                TransitionInDays: 90
      # Server access logging is a property of the bucket, not a separate resource type.
      LoggingConfiguration:
        DestinationBucketName: logging-bucket-name
        LogFilePrefix: "central-bucket-logs/"
      Tags:
        - Key: Purpose
          Value: CentralisedCUR
        - Key: Environment
          Value: Production

  CentralBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref CentralCURBucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: AllowReplicationFromPayerAccounts
            Effect: Allow
            Principal:
              AWS: !Ref ReplicationSourceAccounts
            Action:
              # Destination-side permissions used by S3 replication.
              - s3:ReplicateObject
              - s3:ReplicateDelete
            Resource:
              - !Sub "arn:aws:s3:::${CentralCURBucket}/*"
              - !Sub "arn:aws:s3:::${CentralCURBucket}"
            Condition:
              Bool:
                "aws:SecureTransport": "true"

Outputs:
  CentralBucketName:
    Description: The name of the centralised S3 bucket for CUR.
    Value: !Ref CentralCURBucket
    Export:
      Name: CentralCURBucketName
```
```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Set up CUR in the payer account and configure replication to the central S3 bucket.

Parameters:
  DestinationBucket:
    Type: String
    Description: "Name of the central S3 bucket."
  DestinationAccountId:
    Type: String
    Description: "AWS Account ID of the nominated account where the central S3 bucket is located."
  KMSKeyArn:
    Type: String
    Default: ""
    Description: "Optional KMS Key ARN for encryption."

Conditions:
  IsKMSKeyProvided: !Not [!Equals [!Ref KMSKeyArn, ""]]

Resources:
  CURS3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub "${AWS::AccountId}-cur-reports"
      VersioningConfiguration:
        Status: Enabled
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      LifecycleConfiguration:
        Rules:
          - Id: TransitionToIA
            Status: Enabled
            Transitions:
              - StorageClass: STANDARD_IA
                TransitionInDays: 90
      # Replication requires versioning on both buckets and a rule pointing
      # at the central bucket; without this block, nothing is copied across.
      ReplicationConfiguration:
        Role: !GetAtt CURReplicationRole.Arn
        Rules:
          - Id: ReplicateCURToCentral
            Status: Enabled
            Prefix: ""
            Destination:
              Bucket: !Sub "arn:aws:s3:::${DestinationBucket}"
      LoggingConfiguration:
        DestinationBucketName: logging-bucket-name
        LogFilePrefix: "payer-bucket-logs/"

  # Note: CUR report definitions can only be created via the us-east-1 endpoint.
  CURReportDefinition:
    Type: AWS::CUR::ReportDefinition
    DependsOn: BucketPolicy
    Properties:
      ReportName: "DetailedCostReport"
      TimeUnit: "HOURLY"
      Format: "Parquet"
      Compression: "Parquet"  # Parquet-format reports use Parquet compression
      S3Bucket: !Ref CURS3Bucket
      S3Prefix: !Sub "cur/${AWS::AccountId}/"
      S3Region: !Ref AWS::Region
      AdditionalSchemaElements:
        - "RESOURCES"
      ReportVersioning: "CREATE_NEW_REPORT"

  CURReplicationRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: s3.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: ReplicationPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              # Source-side permissions (bucket name spelled out to avoid a
              # circular reference with the bucket's ReplicationConfiguration).
              - Effect: Allow
                Action:
                  - s3:GetReplicationConfiguration
                  - s3:ListBucket
                  - s3:GetObjectVersionForReplication
                Resource:
                  - !Sub "arn:aws:s3:::${AWS::AccountId}-cur-reports"
                  - !Sub "arn:aws:s3:::${AWS::AccountId}-cur-reports/*"
              # Destination-side permissions.
              - Effect: Allow
                Action:
                  - s3:ReplicateObject
                  - s3:ReplicateDelete
                Resource:
                  - !Sub "arn:aws:s3:::${DestinationBucket}/*"

  BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref CURS3Bucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          # CUR delivery is performed by the billing reports service principal.
          - Sid: AllowCURBucketChecks
            Effect: Allow
            Principal:
              Service: billingreports.amazonaws.com
            Action:
              - s3:GetBucketAcl
              - s3:GetBucketPolicy
            Resource: !Sub "arn:aws:s3:::${CURS3Bucket}"
          - Sid: AllowCURDelivery
            Effect: Allow
            Principal:
              Service: billingreports.amazonaws.com
            Action: s3:PutObject
            Resource: !Sub "arn:aws:s3:::${CURS3Bucket}/*"
            Condition:
              Bool:
                "aws:SecureTransport": "true"

Outputs:
  CURReportName:
    Description: Name of the Cost and Usage Report.
    Value: !Ref CURReportDefinition
```
What if you add a new management account? You just update the central bucket stack to add that account ID, deploy the CUR setup stack in the new account, and data begins flowing soon after. No heroics, no manual guesswork. It’s the same pattern repeated—scalable, predictable, and ready to support more chargebacks as your AWS usage expands.
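On the command line, that onboarding step might look something like the following. The stack names, template file names, and account IDs here are illustrative placeholders, not values from a real environment:

```shell
# 1) In the nominated account, extend the allowed replication sources
#    with the new payer account ID.
aws cloudformation deploy \
  --stack-name central-cur-bucket \
  --template-file central-bucket.yaml \
  --parameter-overrides \
      "ReplicationSourceAccounts=111111111111,222222222222"

# 2) In the new payer account, deploy the CUR and replication stack.
aws cloudformation deploy \
  --stack-name payer-cur-setup \
  --template-file payer-cur.yaml \
  --capabilities CAPABILITY_IAM \
  --parameter-overrides \
      "DestinationBucket=central-cur-bucket" \
      "DestinationAccountId=333333333333"
```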
Turning Data into Chargebacks
CUR data by itself is valuable, but you need to shape it so that each team sees its share of the cost. Tags and cost categories help turn raw data into something meaningful. By tagging resources with something like “Department: Marketing” or “Project: AthenaUpgrade,” you can filter CUR queries so only that team’s spend is surfaced. You might ask, “How much did that machine learning project really cost last month?” If you’ve been tagging well, a quick Athena query can answer that. And that’s the data you use to send the bill back to the right team.
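As a toy illustration of that tag-based filtering, here is a sketch over mock rows shaped like CUR line items. In practice these rows would come from the Parquet files in S3, usually via Athena; the tag values and costs here are invented:

```python
from collections import defaultdict

# Mock CUR-style rows: resource tags plus an unblended cost per line item.
rows = [
    {"resource_tags": {"Project": "AthenaUpgrade"}, "unblended_cost": 42.5},
    {"resource_tags": {"Project": "AthenaUpgrade"}, "unblended_cost": 7.5},
    {"resource_tags": {"Project": "MLPlatform"},    "unblended_cost": 90.0},
    {"resource_tags": {}, "unblended_cost": 3.0},  # untagged spend needs a home too
]

# Roll spend up by Project tag, with a bucket for untagged resources.
spend = defaultdict(float)
for row in rows:
    project = row["resource_tags"].get("Project", "Untagged")
    spend[project] += row["unblended_cost"]

print(dict(spend))
# → {'AthenaUpgrade': 50.0, 'MLPlatform': 90.0, 'Untagged': 3.0}
```

The "Untagged" bucket is worth watching: if it grows, your tagging discipline, and therefore your chargeback accuracy, is slipping.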
Queries in Action
```sql
SELECT
  cost_category['Business_Unit'] AS business_unit,
  SUM(line_item_unblended_cost) AS total_cost
FROM cur_table
WHERE billing_period = '2024-01'
GROUP BY cost_category['Business_Unit']
ORDER BY total_cost DESC;
```
Analysing Resource Usage
```sql
WITH hourly_usage AS (
  SELECT
    resource_tags['Project'] AS project,
    DATE_TRUNC('hour', line_item_usage_start_date) AS usage_hour,
    SUM(line_item_usage_amount) AS usage_amount,
    SUM(line_item_unblended_cost) AS cost
  FROM cur_table
  WHERE product['product_name'] = 'Amazon Elastic Compute Cloud'
  GROUP BY 1, 2
)
SELECT
  project,
  AVG(usage_amount) AS avg_hourly_usage,
  STDDEV(usage_amount) AS usage_variation,
  SUM(cost) AS total_cost
FROM hourly_usage
GROUP BY project
ORDER BY usage_variation DESC;
```
Savings Plans and Coverage
Note that Athena does not let you reference a column alias elsewhere in the same SELECT or in HAVING, so the aggregate expressions are repeated:

```sql
SELECT
  resource_tags['Project'] AS project,
  SUM(savings_plan_savings_plan_effective_cost) AS covered_spend,
  SUM(line_item_unblended_cost) AS total_spend,
  (SUM(savings_plan_savings_plan_effective_cost)
     / NULLIF(SUM(line_item_unblended_cost), 0)) * 100 AS coverage_percentage
FROM cur_table
WHERE line_item_line_item_type LIKE 'SavingsPlan%'
GROUP BY resource_tags['Project']
HAVING SUM(line_item_unblended_cost) > 0
ORDER BY coverage_percentage DESC;
```
CUR 2.0 Differences
```sql
SELECT
  product['product_name'] AS product_name,
  product['region'] AS region
FROM cur_table
WHERE billing_period = '2024-01';
```
This structured approach streamlines queries, making it easier to get just the detail you need.
By now, you have a sense of how CUR can help you shape chargeback models. It’s not some mysterious code to crack. It’s a straightforward ledger, waiting for you to slice, filter, and query it until you know exactly who owes what. The centralised pattern provides order, tagging and cost categories add meaning, and Athena queries let you carve out insights. And then, when you hand that internal bill to the right team, it’s no surprise to anyone. They see the numbers. They know they own them.
In the next post, I’ll go further. We’ll explore how you can use automation, dashboards, and more advanced analysis to keep an even closer eye on costs. Think anomaly detection, trend analysis, and ways to forecast spend with more confidence. That’s where things start to get genuinely interesting, as you gain the tools to act on your insights rather than just record them.