
BMW Cloud Efficiency Analytics powered by Amazon QuickSight and Amazon Athena



This post is written in collaboration with Philipp Karg and Alex Gutfreund from BMW Group.

Bayerische Motoren Werke AG (BMW) is a motor vehicle manufacturer headquartered in Germany with 149,475 employees worldwide. In the financial year 2022, its profit before tax was € 23.5 billion on revenues amounting to € 142.6 billion. BMW Group is one of the world’s leading premium manufacturers of automobiles and motorcycles, also providing premium financial and mobility services.

BMW Group uses 4,500 AWS Cloud accounts across the entire organization but is faced with the challenge of reducing unnecessary costs, optimizing spend, and having a central place to monitor costs. BMW Cloud Efficiency Analytics (CLEA) is a homegrown tool developed within the BMW FinOps CoE (Center of Excellence) aiming to optimize and reduce costs across all these accounts.

In this post, we explore how the BMW Group FinOps CoE implemented their Cloud Efficiency Analytics tool (CLEA), powered by Amazon QuickSight and Amazon Athena. With this tool, they effectively reduced costs and optimized spend across all their AWS Cloud accounts, using a centralized cost monitoring system and key AWS services. The CLEA dashboards were built on the foundation of the Well-Architected Lab. For more information on this foundation, refer to A Detailed Overview of the Cost Intelligence Dashboard.

CLEA gives full transparency into cloud costs, usage, and efficiency from a high-level overview to granular service, resource, and operational levels. It seamlessly consolidates data from various data sources within AWS, including AWS Cost Explorer (and forecasting with Cost Explorer), AWS Trusted Advisor, and AWS Compute Optimizer. Additionally, it incorporates BMW Group’s internal system to integrate essential metadata, offering a comprehensive view of the data across various dimensions, such as group, department, product, and applications.

The ultimate goal is to raise awareness of cloud efficiency and optimize cloud utilization in a cost-effective and sustainable manner. The dashboards, which offer a holistic view together with a variety of cost and BMW Group-related dimensions, were successfully launched in May 2023 and became accessible to users within the BMW Group.

Overview of the BMW Cloud Data Hub

At the BMW Group, Cloud Data Hub (CDH) is the central platform for managing company-wide data and data solutions. It works as a bundle for resources that are bound to a specific staging environment and Region to store data on Amazon Simple Storage Service (Amazon S3), which is renowned for its industry-leading scalability, data availability, security, and performance. Additionally, it manages table definitions in the AWS Glue Data Catalog, containing references to data sources and targets of extract, transform, and load (ETL) jobs in AWS Glue.

Data providers and consumers are the two fundamental users of a CDH dataset. Providers create datasets within their assigned domain, and as the owner of a dataset, they’re responsible for the actual content and for providing appropriate metadata. They can use their own toolsets or rely on provided blueprints to ingest the data from source systems. Once released, consumers use datasets from different providers for analysis, machine learning (ML) workloads, and visualization.

Each CDH dataset has three processing layers: source (raw data), prepared (transformed data in Parquet), and semantic (combined datasets). It’s possible to define stages (DEV, INT, PROD) in each layer to allow structured release and test without affecting PROD. Within each stage, it’s possible to create resources for storing actual data. Two resource types are associated with each database in a layer:

  • File store – S3 buckets for data storage
  • Database – AWS Glue databases for metadata sharing

Overview of the CLEA Landscape

The following diagram is a high-level overview of some of the technologies used for the extract, load, and transform (ELT) stages, as well as the final visualization and analysis layer. You might notice that this differs slightly from traditional ETL. The difference lies in when and where data transformation takes place. In ETL, data is transformed before it’s loaded into the data warehouse. In ELT, raw data is loaded into the data warehouse first, then it’s transformed directly within the warehouse. The ELT process has gained popularity with the rise of cloud-based, high-performance data warehouses, where transformation can be done more efficiently after loading.

Regardless of the method used, the goal is to provide high-quality, reliable data that can be used to drive business decisions.
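To make the ELT ordering concrete, the following is a minimal sketch that loads raw records unchanged and only then transforms them with SQL inside the database. It uses an in-memory SQLite database purely as a stand-in for a warehouse; the table names, column names, and sample values are assumptions for illustration and are not taken from CLEA.

```python
import sqlite3

# Hypothetical raw cost records, kept as strings exactly as they arrive.
RAW_ROWS = [
    ("111111111111", "AmazonEC2", "12.50"),
    ("111111111111", "AmazonS3", "3.25"),
    ("222222222222", "AmazonEC2", "7.00"),
]


def load_then_transform(rows):
    """Load raw rows first, then aggregate them with SQL inside the database."""
    conn = sqlite3.connect(":memory:")
    # Load step: no cleaning or casting happens before the data is stored.
    conn.execute("CREATE TABLE raw_costs (account_id TEXT, service TEXT, cost TEXT)")
    conn.executemany("INSERT INTO raw_costs VALUES (?, ?, ?)", rows)
    # Transform step: casting and aggregation run where the data already lives.
    conn.execute(
        """
        CREATE TABLE cost_by_account AS
        SELECT account_id, SUM(CAST(cost AS REAL)) AS total_cost
        FROM raw_costs
        GROUP BY account_id
        """
    )
    result = conn.execute(
        "SELECT account_id, total_cost FROM cost_by_account ORDER BY account_id"
    ).fetchall()
    conn.close()
    return result


totals = load_then_transform(RAW_ROWS)
```

In an ETL variant, the casting and aggregation would instead run in application code before the first insert.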

CLEA Architecture

In this section, we take a closer look at the three essential stages mentioned previously: extract, load, and transform.


Extract

The extract stage plays a pivotal role in CLEA, serving as the initial step where data related to cost, usage, and optimization is collected from a diverse range of sources within AWS. These sources include the AWS Cost and Usage Reports, Cost Explorer (and forecasting with Cost Explorer), Trusted Advisor, and Compute Optimizer. Additionally, it fetches essential metadata from BMW Group’s internal system, offering a comprehensive view of the data across various dimensions, such as group, department, product, and applications in the later stages of data transformation.

The following diagram illustrates one of the data collection architectures that we use to collect Trusted Advisor data from nearly 4,500 AWS accounts and subsequently load that into Cloud Data Hub.

Let’s go through each numbered step as outlined in the architecture:

  1. A time-based rule in Amazon EventBridge triggers the CLEA Shared Workflow AWS Step Functions state machine.
  2. Based on the inputs, the Shared Workflow state machine invokes the Account Collector AWS Lambda function to retrieve AWS account details from AWS Organizations.
  3. The Account Collector Lambda function assumes an AWS Identity and Access Management (IAM) role to access linked account details via the Organizations API and writes them to Amazon Simple Queue Service (Amazon SQS) queues.
  4. The SQS queues trigger the Data Collector Lambda function using SQS Lambda triggers.
  5. The Data Collector Lambda function assumes an IAM role in each linked account to retrieve the relevant data and load it into the CDH source S3 bucket.
  6. When the data from all linked accounts is collected, the Shared Workflow state machine triggers an AWS Glue job for further data transformation.
  7. The AWS Glue job reads raw data from the CDH source bucket and transforms it into a compact Parquet format.
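The fan-out in step 3 can be sketched as follows. This is a minimal illustration, not the CLEA implementation: it only shows how a list of account IDs might be grouped into batches of at most 10 entries, which is the limit of the SQS SendMessageBatch operation. The function name, the message body shape, and the `check` field are assumptions.

```python
import json
from typing import Dict, Iterable, Iterator, List

SQS_MAX_BATCH_SIZE = 10  # SendMessageBatch accepts at most 10 entries per call


def build_sqs_batches(
    account_ids: Iterable[str], check: str
) -> Iterator[List[Dict[str, str]]]:
    """Group account IDs into SendMessageBatch-sized lists of entries."""
    batch: List[Dict[str, str]] = []
    for account_id in account_ids:
        batch.append(
            {
                "Id": account_id,  # entry IDs must be unique within one batch
                # Hypothetical payload for the downstream Data Collector.
                "MessageBody": json.dumps({"account_id": account_id, "check": check}),
            }
        )
        if len(batch) == SQS_MAX_BATCH_SIZE:
            yield batch
            batch = []
    if batch:  # emit the final, partially filled batch
        yield batch


# 25 made-up, zero-padded account IDs produce batches of 10, 10, and 5 entries.
accounts = [f"{n:012d}" for n in range(1, 26)]
batches = list(build_sqs_batches(accounts, check="trusted_advisor"))
```

Each batch could then be passed as the `Entries` argument of the boto3 SQS client’s `send_message_batch` call, together with the `QueueUrl` of the target queue.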

Load and transform

For the data transformations, we used an open-source data transformation tool called dbt (Data Build Tool), modifying and preprocessing the data through a number of abstract data layers:

  • Source – This layer contains the raw data the data source provides. The preferred data format is Parquet, but JSON, CSV, or plain text files are also allowed.
  • Prepared – The source layer is transformed and stored as the prepared layer in Parquet format for optimized columnar access. Initial cleaning, filtering, and basic transformations are performed in this layer.
  • Semantic – A semantic layer combines several prepared layer datasets into a single dataset that contains transformations, calculations, and business logic to deliver business-friendly insights.
  • QuickSight – QuickSight is the final presentation layer, which is directly ingested into QuickSight SPICE from Athena via incremental daily ingestion queries. These ingested datasets are used as a source in CLEA dashboards.

Overall, using dbt’s data modeling and the pay-as-you-go pricing of Athena, BMW Group can control costs by running efficient queries on demand. Furthermore, with the serverless architecture of Athena and dbt’s structured transformations, you can scale data processing without worrying about infrastructure management. CLEA currently has more than 120 dbt models implemented with complex transformations. The semantic layer is incrementally materialized and partially ingested into QuickSight with up to 4 TB of SPICE capacity. For dbt deployment and scheduling, we use GitHub Actions, which allows us to introduce new dbt models and changes easily with automated deployments and tests.
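As a rough illustration of what an incremental daily ingestion query could look like, the sketch below builds a query that only selects rows inside a short lookback window instead of rescanning the full table. The table name, the date column, and the window length are hypothetical, and the actual CLEA queries are not shown in this post.

```python
from datetime import date, timedelta


def incremental_query(table: str, date_column: str, run_date: date, lookback_days: int) -> str:
    """Build a query limited to the lookback window ending on run_date.

    The table and column names are assumed to be trusted constants,
    not user input, because they are interpolated into the SQL text.
    """
    start = run_date - timedelta(days=lookback_days)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= DATE '{start.isoformat()}' "
        f"AND {date_column} <= DATE '{run_date.isoformat()}'"
    )


# Hypothetical semantic-layer table and date column.
query = incremental_query(
    "semantic.cost_daily", "usage_date", date(2023, 5, 10), lookback_days=3
)
```

Restricting the scanned date range matters with Athena’s pay-as-you-go pricing, because the cost of a query depends on the amount of data it scans.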

CLEA Access control

In this section, we explain how we implemented access control using row-level security in QuickSight and QuickSight embedding for authentication and authorization.

RLS for QuickSight

Row-level security (RLS) is a key feature that governs data access and privacy, which we implemented for CLEA. RLS is a mechanism that allows us to control the visibility of data at the row level based on user attributes. In essence, it ensures that users can only access the data that they’re authorized to view, adding an additional layer of data security within the QuickSight environment.

Understanding the importance of RLS requires a broader view of the data landscape. In organizations where multiple users interact with the same datasets but require different access levels due to their roles, RLS becomes a pivotal tool. It ensures data security and compliance with privacy regulations, preventing unauthorized access to sensitive data. Additionally, it offers a tailored user experience by displaying only relevant data to the user, thereby enhancing the effectiveness of data analysis.

For CLEA, we collected BMW Group metadata such as department, application, and organization, which is essential for restricting users to the accounts within their own department, application, organization, and so on. This is achieved using both a user name and group name for access control. We use the user name for user-specific access control and the group name for adding some users to a specific group to extend their permissions for different use cases.

Lastly, because there are many dashboards created by CLEA, we also control which dashboards each user can see, as well as the data shown within each dashboard. This is done at the group level. By default, all users are assigned to CLEA-READER, which is granted access to core dashboards that we want to share with users, but there are different groups that allow users to see additional dashboards after they’re assigned to that group.

The RLS dataset is refreshed daily to catch recent changes regarding new user additions, group changes, or any other user access changes. This dataset is also ingested to SPICE daily, which automatically updates all datasets restricted via this RLS dataset.
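The effect of such a rules dataset can be illustrated with a small sketch. It mimics grant-style rules keyed by user name or group name, where an empty field value is treated as "all values". This is only an illustration of the semantics under those assumptions: QuickSight evaluates RLS internally, and the department values, user names, and the CLEA-ADMIN group below are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical rules dataset: one row per user or group, plus the restricted field.
RULES: List[Dict[str, str]] = [
    {"UserName": "alice", "GroupName": "", "department": "powertrain"},
    {"UserName": "", "GroupName": "CLEA-ADMIN", "department": ""},  # empty = all values
]

# Hypothetical dashboard data.
ROWS: List[Dict[str, object]] = [
    {"account_id": "111111111111", "department": "powertrain", "cost": 10.0},
    {"account_id": "222222222222", "department": "design", "cost": 20.0},
]


def visible_rows(user: str, groups: Set[str], rules, rows):
    """Return the rows a user may see under grant-style rules."""
    allowed: Set[str] = set()
    unrestricted = False
    for rule in rules:
        # A rule applies if it names the user or one of the user's groups.
        applies = rule["UserName"] == user or rule["GroupName"] in groups
        if not applies:
            continue
        if rule["department"] == "":
            unrestricted = True  # no value means no restriction on this field
        else:
            allowed.add(rule["department"])
    if unrestricted:
        return list(rows)
    # Users without any matching rule end up with an empty result.
    return [row for row in rows if row["department"] in allowed]


alice_rows = visible_rows("alice", set(), RULES, ROWS)
admin_rows = visible_rows("bob", {"CLEA-ADMIN"}, RULES, ROWS)
carol_rows = visible_rows("carol", set(), RULES, ROWS)
```

In this sketch, a group rule widens what a user can see, which mirrors the use of group names to extend permissions for specific use cases.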

QuickSight embedding

CLEA is a cross-platform application that provides secure access to QuickSight embedded content with custom-built authentication and authorization logic that sits on top of BMW Group identity and role management services (referred to as BMW IAM).

CLEA provides access to sensitive data to multiple users with different permissions, which is why it’s designed with fine-grained access control rules. It enforces access control using role-based access control (RBAC) and attribute-based access control (ABAC) models at two different levels:

  • At the dashboard level via QuickSight user groups (RBAC)
  • At the dashboard data level via QuickSight RLS (RBAC and ABAC)

Dashboard-level permissions define the list of dashboards users are able to view.

Dashboard data-level permissions define the subsets of dashboard data shown to the user and are applied using RLS with the user attributes mentioned earlier. Although the majority of roles defined in CLEA are used for dashboard-level permissions, some specific roles are strategically defined to grant permissions at the dashboard data level, taking priority over the ABAC model.

BMW has a defined set of guidelines suggesting the usage of their IAM services as the single source of truth for identity and access control, which the team took into careful consideration when designing the authentication and authorization processes for CLEA.

Upon their first login, users are automatically registered in CLEA and assigned a base role that grants them access to a basic set of dashboards.

The process of registering users in CLEA consists of mapping a user’s identity as retrieved from BMW’s identity provider (IdP) to a QuickSight user, then assigning the newly created user to the respective QuickSight user group.

For users that require more extensive permissions (at one of the levels mentioned before), it’s possible to order additional role assignments via BMW’s self-service portal for role management. Authorized reviewers will then review the request and either accept or reject the role assignments.

Role assignments take effect the next time the user logs in, at which time the user’s assigned roles in BMW Group IAM are synced to the user’s QuickSight groups, a process internally referred to as the identity and permissions sync. As shown in the following diagram, the sync groups step calculates which of the user’s group memberships should be kept, created, and deleted following the logic.
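The diagram’s exact logic is not reproduced in the text, but one plausible reading of the sync groups step is a comparison of two sets: the QuickSight groups the user currently belongs to and the groups implied by their assigned roles. The sketch below shows that set arithmetic; the function name and all group names other than CLEA-READER are hypothetical.

```python
from typing import Dict, List, Set


def plan_group_sync(current: Set[str], desired: Set[str]) -> Dict[str, List[str]]:
    """Split group memberships into those to keep, create, and delete."""
    return {
        "keep": sorted(current & desired),    # already in place and still wanted
        "create": sorted(desired - current),  # granted by a role but missing
        "delete": sorted(current - desired),  # present but no longer backed by a role
    }


plan = plan_group_sync(
    current={"CLEA-READER", "CLEA-OLD-ROLE"},
    desired={"CLEA-READER", "CLEA-FINOPS"},
)
```

Computing the differences first keeps the number of membership changes small, because memberships in the keep set need no API calls at all.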

Usage Insights

Amazon CloudWatch plays an indispensable role in enhancing the efficiency and usability of CLEA dashboards. Not only does CloudWatch offer real-time monitoring of AWS resources, but it also allows us to track user activity and dashboard utilization. By analyzing usage metrics and logs, we can see who has logged in to the CLEA dashboards, what features are most frequently accessed, and how long users interact with various elements. These insights are invaluable for making data-driven decisions on how to improve the dashboards for a better user experience. Through the intuitive interface of CloudWatch, it’s possible to set up alarms for alerting about abnormal activities or performance issues. Ultimately, employing CloudWatch for monitoring offers a comprehensive view of both system health and user engagement, helping us refine and enhance our dashboards continually.


Conclusion

BMW Group’s CLEA platform offers a comprehensive and effective solution to manage and optimize cloud resources. By providing full transparency into cloud costs, usage, and efficiency, CLEA offers insights from high-level overviews to granular details at the service, resource, and operational level.

CLEA aggregates data from various sources, enabling a detailed roadmap of the cloud operations, tracking footprints across primes, departments, products, applications, resources, and tags. This dynamic vision helps identify trends, anticipate future needs, and make strategic decisions.

Future plans for CLEA include enhancing capabilities with data consistency and accuracy, integrating additional sources like Amazon S3 Storage Lens for deeper insights, and introducing Amazon QuickSight Q for intelligent recommendations powered by machine learning, further streamlining cloud operations.

By following the practices here, you can unlock the potential of efficient cloud resource management by implementing Cloud Intelligence Dashboards, providing you with precise insights into costs, savings, and operational effectiveness.

About the Authors

Philipp Karg is Lead FinOps Engineer at BMW Group and founder of the CLEA platform. He focuses on boosting cloud efficiency initiatives and establishing a cost-aware culture within the company to ultimately leverage the cloud in a sustainable way.

Alex Gutfreund is Head of Product and Technology Integration at the BMW Group. He spearheads the digital transformation with a particular focus on platforms, ecosystems, and efficiencies. With extensive experience at the interface of business and IT, he drives change and makes an impact in various organizations. His industry knowledge spans automotive, semiconductor, public transportation, and renewable energies.

Cizer Pereira is a Senior DevOps Architect at AWS Professional Services. He works closely with AWS customers to accelerate their journey to the cloud. He has a deep passion for Cloud Native and DevOps, and in his free time, he also enjoys contributing to open-source projects.

Selman Ay is a Data Architect in the AWS Professional Services team. He has worked with customers from various industries such as e-commerce, pharma, automotive, and finance to build scalable data architectures and generate insights from the data. Outside of work, he enjoys playing tennis and engaging in outdoor activities.

Nick McCarthy is a Senior Machine Learning Engineer in the AWS Professional Services team. He has worked with AWS clients across various industries including healthcare, finance, sports, telecoms, and energy to accelerate their business outcomes through the use of AI/ML. Outside of work, Nick loves to travel, exploring new cuisines and cultures in the process.

Miguel Henriques is a Cloud Application Architect in the AWS Professional Services team with 4 years of experience in the automotive industry delivering cloud native solutions. In his free time, he’s constantly looking for advancements in the web development space and searching for the next great pastel de nata.


