How EagleAI Cut Google Cloud Storage Costs by 50%

At EagleAI, we process several terabytes of data every day — both as inputs to our predictive AI models and as outputs to personalize the offers we recommend to our clients. A significant portion of that data is stored on Google Cloud Storage (GCS), and if left unchecked, storage costs can quietly become a major drain on our cloud budget.

In this article, I’ll show you how we identified a misconfigured bucket — replicated across multiple projects — that was responsible for the majority of our GCS costs. By making a few simple adjustments, we managed to reduce those costs by 80% for the affected buckets, and by 50% across all GCS usage!

Monitoring GCP Costs Shouldn't Be an Afterthought

If your business is running at scale, you probably have multiple GCP projects — different dev environments, clients, or teams — with similar resources replicated across them (BigQuery datasets, Cloud Run apps, GCS buckets, etc.).

Tracking costs in detail can be a real headache, especially with GCS where bucket names must be globally unique. In this context, how do you group similar resources across projects to understand their combined impact?

GCP provides several tools to help, if you follow a few best practices:

  • Add labels to your resources, using consistent keys and values across projects. For example, buckets used to store temporary files could be labeled gs_bucket: temporary (see the command sketch after this list).
  • Enable Cloud Billing export to BigQuery.
  • Create a Looker Studio dashboard (formerly Data Studio) to group costs by label, filter by project or resource type, and visualize trends. GCP even provides ready-to-use dashboards that you can deploy in just a few minutes.
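
To illustrate the first two practices, here is a minimal sketch of the corresponding commands. The bucket, project, dataset, and billing table names are placeholders (the standard billing export table follows the gcp_billing_export_v1_* naming pattern), and the gs_bucket label key is only an example:

# Label an existing bucket so its costs can be grouped with similar buckets in other projects.
$ gcloud storage buckets update gs://[BUCKET_NAME] --update-labels=gs_bucket=temporary

# Once billing export to BigQuery is enabled, sum Cloud Storage costs per label value.
$ bq query --use_legacy_sql=false '
  SELECT l.value AS bucket_label, ROUND(SUM(cost), 2) AS total_cost
  FROM `[PROJECT_ID].[DATASET].gcp_billing_export_v1_[BILLING_ACCOUNT_ID]`, UNNEST(labels) AS l
  WHERE service.description = "Cloud Storage" AND l.key = "gs_bucket"
  GROUP BY bucket_label
  ORDER BY total_cost DESC'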

Figure: Monitoring GCP costs.

Using these practices, we quickly discovered that a single bucket — replicated across multiple projects — was responsible for more than 60% of our total GCS costs.

How We Built an Efficient Bucket Cleanup Strategy

Step 1: Inventory Your Bucket Contents

Before changing anything, you need a more granular view of what’s stored in your bucket — so you can safely reconfigure things without accidentally deleting critical files.

Fortunately, GCP provides the gcloud storage CLI tool (which recently replaced gsutil) to retrieve file metadata and analyze storage usage. You can use it to:

  • Calculate data volume for each folder → to identify the costliest folders.
  • Check last modified or creation date for files → to locate legacy folders that haven’t been updated in months and may be candidates for deletion.

Here's a simplified example to estimate folder sizes:

$ gcloud storage du -a -r -s $(gcloud storage ls gs://[BUCKET_NAME]/)
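
For the second check, here is a similar sketch that lists objects with their size and creation time so the oldest folders stand out; the folder name is a placeholder, and the sort column assumes the long-listing output of the current gcloud storage CLI:

# List objects with size and creation time, oldest first, to spot legacy folders.
$ gcloud storage ls --long 'gs://[BUCKET_NAME]/[FOLDER_NAME]/**' | sort -k2 | head -n 20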

Then, gather the teams that use the bucket and determine the appropriate lifecycle for each folder. These guiding questions can help:

  • Which files must never be deleted (e.g. for legal compliance)?
  • How long should files remain available for reading? For archiving?
  • Do we need access to previous versions of the same file?
  • For partitioned files, how many partitions should you keep?

Step 2: Configure Bucket Policies

GCS offers a wide range of configuration options to manage file retention, versioning, and cost. Based on our experience, here are some best practices:

  • Move critical files to a dedicated bucket, and ensure all future reads and writes go directly there.
  • Enable object versioning only when you actually need access to historical versions of a file, not as a way to recover accidentally deleted files.
  • Enable soft delete with a retention period suited to your use case.
💡 Soft delete is a new GCP feature introduced in March 2024 that lets you recover deleted files during a defined retention window.
  • Use lifecycle rules to transition infrequently accessed files from Standard to cheaper storage classes like Nearline, Coldline, or Archive.

💡 Beware of minimum storage duration requirements! For example, files moved to Archive are subject to a 365-day minimum storage duration, so deleting them earlier still incurs charges for the full period. Moreover, operations on those files, such as listing or reading, are more expensive.

  • Automatically delete old files based on time since creation.
  • Apply prefix/suffix-based lifecycle rules to enforce stricter cleanup policies on the heaviest folders.

💡 Prefix- and suffix-based lifecycle rules were introduced in August 2022 and let you apply conditions to specific folders and file extensions.

Example configuration we use today (a command-line sketch of this setup follows the note below):

  • Lifecycle rules:
    • Transition to Coldline after 90 days.
    • Delete after 1 year.
  • Special rules for large folders (based on a predefined list of prefixes):
    • Delete after 90 days.
  • Soft delete enabled, with a 7-day retention window.
  • Object versioning disabled.

💡 In case of conflict, deletion rules take precedence over class transitions. So in this setup, a file in a high-volume folder will be deleted after 3 months, even if it's eligible for Coldline transition.
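
To make this configuration concrete, here is a hedged sketch of an equivalent setup from the command line. The bucket name, the large-folders/ prefix, and the lifecycle.json file name are illustrative placeholders rather than our production values:

# lifecycle.json: Coldline after 90 days, delete after 1 year,
# and delete objects under the heaviest (illustrative) prefix after 90 days.
$ cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
     "condition": {"age": 90}},
    {"action": {"type": "Delete"}, "condition": {"age": 365}},
    {"action": {"type": "Delete"}, "condition": {"age": 90, "matchesPrefix": ["large-folders/"]}}
  ]
}
EOF

# Apply the lifecycle rules, enable a 7-day soft-delete window, and disable object versioning.
$ gcloud storage buckets update gs://[BUCKET_NAME] --lifecycle-file=lifecycle.json
$ gcloud storage buckets update gs://[BUCKET_NAME] --soft-delete-duration=7d
$ gcloud storage buckets update gs://[BUCKET_NAME] --no-versioning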

Immediate Results

Figure: Evolution of GCS storage costs before and after applying lifecycle and soft-delete policies.

By applying these configurations to our most expensive buckets, we achieved an 80% reduction in their storage costs, representing a 50% drop in our overall GCS expenses!

Final Thoughts

While GCS storage might seem cheap compared to other GCP components (like BigQuery analysis or Compute Engine CPU and RAM usage), it can silently become a major line item over time, especially if left unmanaged.

The good news? With the right tooling and a few easy-to-implement practices, controlling GCS costs is both simple and impactful.
