The Data Operating Model (DOM) at BMT provides a comprehensive and structured framework that supports strategic decision-making and operational efficiency across the organisation. Developed to enhance data accessibility, quality, governance, and scalability, the DOM plays a crucial role in enabling data-driven decision-making and aligning data practices with BMT’s core business objectives.
The DOM is designed around three core elements: integrated standard operating procedures across the data lifecycle; modern DataOps, observability, and Data Mesh practices; and performance measurement against BMT's strategic pillars.
Integrated SOPs across all data lifecycle stages—from planning and acquisition to processing, publication, and archiving—ensure that data handling processes are standardised, efficient, and aligned with governance protocols. These SOPs provide clear guidelines on quality checks, data security, and user accessibility, forming the backbone of BMT’s data handling practices.
The DOM includes advanced DataOps and Data Observability tools to monitor and enhance data quality in real time. Using a CI/CD (Continuous Integration/Continuous Deployment) approach, automated testing and monitoring support consistent, reliable data processing. Decentralised data ownership through Data Mesh principles empowers departments to manage data while adhering to central governance, encouraging a collaborative data culture.
A balanced scorecard aligned with BMT’s strategic pillars monitors DOM performance, measuring data quality, operational efficiency, environmental impact, and innovation. These KPIs provide actionable insights that guide ongoing improvements in data management and governance.
The Data Operating Model (DOM) is designed to support BMT’s strategic objectives through a structured, high-quality data framework. It centralises, secures, and optimises data handling to enable agile, data-driven decision-making across BMT’s global operations. This model outlines how data is acquired, transformed, governed, and made accessible to provide a “single source of truth,” supporting consistent insights and fostering a culture of innovation and continuous improvement.
The DOM applies to data governance, architecture, integration, security, user accessibility, and data quality across BMT’s operational and strategic levels.
The DOM strategically integrates with BMT’s core business objectives, from customer insights and project tracking to continuous innovation and operational efficiency.
Enhancing Customer Insights: Consolidates customer data across touchpoints, enabling proactive service improvements.
Optimising Operational Efficiency: Real-time data access supports faster response times, streamlining workflows.
Driving Innovation: Supports advanced analytics for continuous product and service improvement.
Ensuring Security and Compliance: Embedded security protocols support data privacy, with audit logs ensuring transparency.
Supporting Environmental Goals: Cloud-based, resource-efficient architecture reduces energy consumption, supporting sustainability targets.
The DOM is designed to evolve alongside BMT’s growing data needs, ensuring it can scale regionally, adopt new sources, and support advanced analytics as demand rises.
Regional and Departmental Scalability: Cloud-based infrastructure enables seamless expansion across geographic regions and departments.
Integration of New Data Sources: New sources are incorporated efficiently, with standardised ingestion to maintain quality.
Flexible for Evolving Analytics Needs: Supports advanced analytics through modular medallion architecture, allowing for easy integration of new analytical tools and capabilities.
Sustainable Resource Optimisation: Optimises data storage and processing to align with BMT’s environmental targets.
The DOM includes a balanced scorecard that aligns with BMT’s strategic pillars, providing KPIs to measure success across quality, accessibility, and impact.
Data Quality Metrics: Consistency, accuracy, completeness, and timeliness checks.
Operational Efficiency Metrics: Data latency, accessibility rates, and system uptime.
Innovation & Growth Enablement: Number of advanced analytics models in use, speed of onboarding new data sources.
Environmental Impact Metrics: Reduction in redundant data processing, energy consumption in cloud storage.
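As a minimal sketch of how two of these scorecard measures might be computed in practice (the column names, thresholds, and sample data are illustrative assumptions, not BMT's scorecard definitions):

```python
import pandas as pd

def completeness(df: pd.DataFrame, required_columns: list[str]) -> float:
    """Share of required cells that are populated, as a percentage."""
    subset = df[required_columns]
    return 100.0 * subset.notna().sum().sum() / subset.size

def timeliness(df: pd.DataFrame, ts_column: str, max_age_hours: int = 24) -> float:
    """Share of records loaded within the agreed latency window."""
    age = pd.Timestamp.now(tz="UTC") - pd.to_datetime(df[ts_column], utc=True)
    return 100.0 * (age <= pd.Timedelta(hours=max_age_hours)).mean()

# Illustrative usage against a hypothetical project extract
projects = pd.DataFrame({
    "project_id": [101, 102, 103],
    "customer": ["A", None, "C"],                       # one missing value
    "loaded_at": ["2024-05-01T08:00Z", "2024-05-01T09:00Z", "2024-04-01T09:00Z"],
})
print(f"Completeness: {completeness(projects, ['project_id', 'customer']):.1f}%")
print(f"Timeliness:   {timeliness(projects, 'loaded_at'):.1f}%")
```

In practice such figures would be calculated against the warehouse tables and surfaced through the balanced scorecard reporting.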
For BMT to remain competitive in a rapidly evolving technological landscape, it must be able to adapt quickly to new insights and developments. The modern enterprise data warehouse model within the DOM provides access to reliable, timely, and structured data insights, empowering informed decision-making across the organisation.
To facilitate structured data processing and analysis, the DOM utilises a multi-layered medallion architecture, with each layer adding value to data through quality checks, governance protocols, and contextually relevant transformations.
Bronze Layer (Raw Data)
Purpose: Ingest raw data from internal systems, external sources, and customer interactions.
Processes: Standardised ingestion protocols capture data in its native form, preserving its integrity.
Quality Focus: Initial data profiling and validation to eliminate redundancies early.
Silver Layer (Standardised Data)
Purpose: Transform raw data into a standard format aligned with BMT's Common Data Model (CDM).
Processes: Apply cleansing, validation, and transformation processes to resolve discrepancies.
Quality Focus: Ensure accuracy, completeness, and consistency, making data fit for reporting.
Gold Layer (Curated Data Marts)
Purpose: Aggregate data into subject-specific views for departmental access (e.g., Finance, HR, Operations).
Processes: Develop data marts tailored to specific business functions, enhancing usability and accessibility.
Quality Focus: Refine data with deduplication and completeness checks to enable operational insights.
Platinum Layer (Advanced Analytics)
Purpose: Enable advanced analytics, predictive modelling, and machine learning applications.
Processes: Utilise enriched data from Gold to train models and provide predictive insights.
Quality Focus: Maintain high standards for timeliness and accuracy to support strategic foresight.
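The following is a minimal, pandas-based sketch of how a record might move through these layers; the table and column names are assumptions for illustration, not BMT's actual CDM entities.

```python
import pandas as pd

# Bronze: ingest raw data in its native form, preserving the source values
bronze = pd.DataFrame({
    "ProjRef": ["P-001", "P-001", "p-002"],
    "Hours":   ["7.5", "7.5", "8"],          # arrives as text from the source system
    "Date":    ["01/05/2024", "01/05/2024", "02/05/2024"],
})

# Silver: cleanse, validate, and align to a standard (CDM-style) schema
silver = (
    bronze
    .rename(columns={"ProjRef": "project_id", "Hours": "hours", "Date": "work_date"})
    .assign(
        project_id=lambda d: d["project_id"].str.upper(),
        hours=lambda d: pd.to_numeric(d["hours"], errors="coerce"),
        work_date=lambda d: pd.to_datetime(d["work_date"], dayfirst=True),
    )
    .drop_duplicates()                        # deduplicate for consistency
    .dropna(subset=["project_id", "hours"])   # completeness check
)

# Gold: aggregate into a subject-specific view for departmental reporting
gold = silver.groupby("project_id", as_index=False)["hours"].sum()
print(gold)
```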
A data warehouse and data mart operating model does not create quality and value on its own; it requires an intersection of process, people, and technology, conceived around the data model, to deliver its full value. Building the processes and organisational structures that deliver compelling data products to users is how the Data Operating Model realises this aim.
Action | Objective | Quality Measures |
Data Flow Implementation | Implement data flows to seamlessly connect operational systems with analytics and business intelligence (BI) systems. | Ensure data integrity throughout the flow. Validate data consistency and accuracy at each stage. |
Source-to-Target Mapping Documentation | Document clear source-to-target mappings for transparency and traceability. | Ensure mappings are comprehensive and up-to-date. Verify mappings against actual data transformation processes. |
Data Flow Re-engineering | Re-engineer manual data flows for scalability and repeatability. | Assess scalability potential. Test repeatability under various scenarios. |
ETL Script Optimisation | Write efficient ETL (Extract, Transform, Load) scripts and code for optimal performance. | Conduct performance testing on ETL processes. Optimise scripts for resource efficiency. |
Reusable Business Intelligence Reports | Develop business intelligence reports that are reusable and adaptable. | Test report generation under different conditions. Ensure reports meet stakeholder requirements effectively. |
Accessible Data for Analysis | Build accessible datasets to facilitate easy analysis. | Validate data accessibility across relevant platforms. Ensure data security and compliance with access controls. |
AI Analytics Readiness | Prepare data infrastructure and pipelines to support AI-driven analytics and machine learning models. | Assess compatibility with AI frameworks and libraries. Ensure data quality and format suitability for AI model training and inference. |
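A hedged sketch of how one of these actions, source-to-target mapping, might be documented and applied in code (the field names and transformations are illustrative assumptions):

```python
import pandas as pd

# Source-to-target mapping documented as data, so it can be reviewed and versioned
MAPPING = {
    "CustRef":  {"target": "customer_id",        "transform": str.strip},
    "CustName": {"target": "customer_name",      "transform": str.title},
    "Turnover": {"target": "annual_revenue_gbp", "transform": float},
}

def apply_mapping(source: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Apply a documented source-to-target mapping to a source extract."""
    target = pd.DataFrame()
    for source_col, rule in mapping.items():
        target[rule["target"]] = source[source_col].map(rule["transform"])
    return target

raw = pd.DataFrame({"CustRef": [" C001 "], "CustName": ["acme ltd"], "Turnover": ["1250000"]})
print(apply_mapping(raw, MAPPING))
```

Keeping the mapping as reviewable data rather than buried in transformation code supports the transparency and traceability objective above.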
Data-literate colleagues are key to delivering value to the business. Building data literacy requires common definitions and a shared understanding of the competencies needed to treat data as an enterprise asset. A competency framework provides a model to guide literacy efforts and involves all colleagues who work with data (Knowledge Workers and Data Consumers). The following skills form the basis of the data competency framework; the level to which colleagues should demonstrate these skills depends on their role and the corresponding skill level requirements (i.e. awareness, working, practitioner, expert).
BMT uses advanced technologies to streamline data management, from ingestion to analytics. Key tools include Microsoft Fabric, the Common Data Model, and the Power Platform (including Power BI).
Each technology platform is configured for optimal performance and accessibility, enabling seamless integration with Power BI and other data tools.
BMT has chosen Microsoft Fabric as the foundation for its data warehouse and data mart environments. This integrated set of data and analytical services, based on the Azure/Synapse Cloud, provides a robust platform for managing BMT’s data assets efficiently.
Key components of Microsoft Fabric include OneLake (unified data lake storage), Data Factory (data integration and pipelines), Synapse-based data engineering and data warehousing experiences, and Power BI for reporting and visualisation.
At the core of BMT’s data warehouse architecture is the Common Data Model, which serves as the foundation for representing the organisation’s core business processes and common form designs. The Common Data Model supports dimensional modelling, where data is structured into measurement facts and descriptive dimensions, enabling efficient querying and analysis.
Dimensional models, instantiated as star schemas or cubes in relational databases, provide a structured framework for organising and accessing data, facilitating reporting and analytics processes.
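A minimal illustration of a dimensional (star schema) structure in code: a descriptive dimension joined to a measurement fact via a surrogate key (all table, column, and business unit names are assumed for the example):

```python
import pandas as pd

# Dimension: descriptive attributes with a surrogate key
dim_project = pd.DataFrame({
    "project_key":   [1, 2],
    "project_code":  ["P-001", "P-002"],
    "business_unit": ["Maritime", "Environment"],
})

# Fact: measurements at a declared grain (one row per project per month)
fact_hours = pd.DataFrame({
    "project_key": [1, 1, 2],
    "month":       ["2024-04", "2024-05", "2024-05"],
    "hours":       [120.0, 135.5, 80.0],
})

# A typical dimensional query: slice facts by a dimension attribute
report = (
    fact_hours.merge(dim_project, on="project_key")
    .groupby(["business_unit", "month"], as_index=False)["hours"].sum()
)
print(report)
```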
BMT leverages the Power Platform to construct a broad range of business intelligence (BI) applications, empowering users to access, analyse, and visualise data effectively. This includes Power BI for reports and dashboards, Power Apps for custom business applications, and Power Automate for workflow automation.
Integrating the data warehouse or data marts with Power BI involves several key steps, from connecting to the warehouse and modelling the data through to publishing, securing, and refreshing reports.
By following these principles and leveraging Microsoft Fabric, the Common Data Model, and the Power Platform, BMT ensures a robust and integrated approach to data management, analytics, and reporting, driving informed decision-making and business success.
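As a hedged sketch of one such integration step, querying a warehouse SQL endpoint from Python before the data is modelled in Power BI (the server, database, and table names are placeholders, not BMT's actual endpoints):

```python
import pyodbc  # requires the Microsoft ODBC Driver for SQL Server to be installed

# Connection details are placeholders; in practice these come from secure configuration
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=your-warehouse-sql-endpoint.datawarehouse.fabric.microsoft.com;"
    "Database=FinanceDataMart;"
    "Authentication=ActiveDirectoryInteractive;"
    "Encrypt=yes;"
)

cursor = conn.cursor()
cursor.execute("SELECT TOP 5 * FROM dbo.fact_invoice")  # assumed table name
for row in cursor.fetchall():
    print(row)
conn.close()
```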
The DOM’s governance framework defines clear roles, policies, and responsibilities, ensuring data is trustworthy, compliant, and effectively managed across its lifecycle.
Data Governance Structure: Establishes an oversight committee and data stewards across departments to maintain quality.
Roles & Responsibilities: Assign data owners, stewards, and custodians to manage, validate, and secure data across departments.
Quality Control and Continuous Improvement: Routine audits and feedback loops ensure that quality standards are maintained and improved.
Responsible for setting strategic objectives, policies, and priorities related to data governance and quality management.
Name | Role (in this context) | RACI |
Julie Stone | Chair of Information Workstream | Accountable |
Sarah Martino | BIRA Manager | Responsible (Reporting) |
Julian Kellett | Senior Data Engineer | Responsible (Data Engineering) |
Simon Willmore | Head of Digital Strategy | Consulted |
Mathew Rowley | Head of Accounting UK | Consulted |
Sarah Long | Data Manager | Consulted |
Business stakeholders responsible for defining data requirements, priorities, and usage guidelines.
DataMart | Named Responsibility | Role |
Finance | Sarah Martino | Finance Manager |
Employee | Gudrun Neumann | Head of People |
Resource | Simon Mathieson | Technical Assurance and Capability Director |
Project | Mike Prince | Global PMO Director |
Customer | David Dring | Head of Future Business Operations |
Business Opportunities | ||
Contract | ||
Supplier | ||
Infrastructure |
Appointed individuals responsible for managing specific data domains, overseeing data quality, and ensuring compliance with governance policies.
Name | Finance | Employee | Resource | Project | Customer | BusOpp | Contract | Supplier | Infrastructure |
Martyn Cole | I | R | R | C | I | ||||
Sam Jepson-Whyte | R | C | |||||||
Soon Tan | C | C | R | ||||||
Emma O’Neill | R |
IT personnel responsible for implementing and maintaining data management infrastructure, including data pipelines, data lakes, and ETL processes.
Name | Role (in this context) | Domain |
Julian Kellett | Data Architect | Data Warehouse & DataMart |
Ali Ahmed | Data Engineer | Data Lake |
Chris Clark | Application of SharePoint | Enterprise Applications |
Lee Southam | IT & Data Asset Security | IT Infrastructure |
Steve Smith | Development Engineer | Power Apps |
Will Newham | Operational Data Management | IFS (UK) Data |
Chris Thomas | Operational Data Management | HubSpot Data |
There is an underlying commitment by all stakeholders to ongoing review and improvement of the data quality plan and associated processes. This requires feedback, lessons learned, and best practices to be incorporated to enhance data quality management efforts.
Feedback from data consumers and stakeholders will be solicited to identify areas for improvement in data governance and quality management.
Regular reviews and audits of data governance processes and outcomes will be conducted to identify opportunities for optimisation and enhancement.
Training programs will be provided to educate employees on data governance best practices, data quality management techniques, and compliance requirements.
A self-service repository/library will be provided to store easily retrievable information about DataMart as well as how-to guides, learning and other knowledge articles.
Data quality management within the DOM ensures that data remains accurate, consistent, timely, and reliable across the organisation.
How well does a piece of information reflect reality?
Accuracy refers to the degree to which the data correctly portrays the real-world situation it was originally designed to measure. Data must be meaningful and useful to allow for correct and accurate interpretation and analysis. For data to be accurate, it must also be valid, meaning it must conform to a defined format and adhere to specific business rules, which may be recorded in a metadata repository (a system or application where information about the data (metadata) is stored and managed).
Does it fulfil our expectations of what’s comprehensive?
This dimension reflects the ability to determine what data is missing, and whether omissions are acceptable (for example, optional data). Departments must determine and understand whether a data asset contains unacceptable gaps, as these may place limitations on the data, leading to an increased reliance on assumptions and estimations, or preclude the asset from use altogether. It is also useful to note the level of completeness, particularly as achieving 100% completeness may not be necessary to fulfil the dataset's intended purpose, and whether the dataset is considered complete as at a particular point in time, e.g. the beginning or end of a month.
Does information stored in one place match relevant data stored elsewhere?
Consistency of data means that the data is collected, grouped, structured, and stored in a consistent and standardised way. This requires standard concepts, definitions, and classifications to be implemented across departments, and agreed upon as to their meanings and interpretation.
Data must also be consistent in the context of its use. For example, data may appear similar but have different meanings or uses in different departments. Duplication, or different meanings for similar data, may result in confusion or misinterpretation of data and render such data unsuitable for comparison with related assets. Also, it may be unclear if trends are due to a true effect or due to problems with inconsistent data collection.
Is our information available when we need it?
Timeliness refers to how quickly data can be made available when required, and the delay between the reference period (the period to which data refers, such as a financial year) and the release of information. Factors that may impact this include collection method and processing. Data must be discoverable, available, and accessible throughout all stages of the data asset lifecycle, from creation to retirement, so that it is available for greater internal use, external use (external partners, other government departments, and researchers) and the public. If delays occur during the provision of data, currency and reliability may be impacted.
Can different data sets be joined correctly to reflect a larger picture?
There must be the capacity to make meaningful comparisons across multiple data assets. This is achieved through common data definitions and standards. Common data definitions should be agreed and shared across the department, and any inconsistencies should be managed.
Is our data structured and accessible for effective use?
Usability ensures that data is organised, accessible, and understandable for end-users, enabling them to efficiently obtain meaningful insights without unnecessary complexity. High usability means data is not only available but also intuitive and actionable, supporting quick decision-making and operational activities. This requires data to be accessible via clear interfaces and structured in formats that align with users’ knowledge and needs, helping them navigate data with minimal support.
Data usability also involves establishing appropriate metadata, user documentation, and visualisation aids to support seamless access and interpretation. By prioritising usability, BMT ensures that data is readily adoptable across departments and contributes effectively to business objectives.
Standard Operating Procedures (SOPs) guide the DOM’s processes, ensuring efficiency and consistency at each stage.
It is essential for BMT to take a holistic approach and adopt a combination of strategies, such as DataOps, Data Observability, Data Mesh, and data quality checks, to improve the quality of data. Despite their differences in focus, these strategies share a common need for a unified governance structure. This structure ensures data is used and managed consistently and competently across all teams and departments.
The DOM includes a series of initiatives aimed at continually elevating data quality standards across BMT’s data environment.
Continuous Integration and Continuous Deployment ensure that changes and updates to data pipelines are automatically tested, integrated, and deployed to production, facilitating consistent and reliable data processing and delivery.
In dynamic data environments where data sources, formats, and requirements evolve rapidly, CI/CD provides a framework for automating the testing, integration, and deployment of data pipelines. This ensures that changes and updates to data pipelines are rigorously tested and validated before being seamlessly deployed to production environments.
In Data Engineering, this involves automating testing new ETL code, validating data schema, monitoring data quality, detecting anomalies, deploying updated data models to production, and ensuring that databases or data warehouses have been correctly configured.
CI/CD Development Lifecycle
Principle | Activity | Description |
CI | Automated Testing | Automated tests check the integrity and quality of data transformations, ensuring that data is processed as expected and any error is spotted early. |
CI | Version Control | Data pipeline code (e.g., SQL scripts, Python transformations) is stored in repositories like Git, allowing tracking and managing changes. |
CI | Consistent Environment | CI tools can run tests in environments that mirror production, ensuring that differences in configuration or dependencies don't introduce errors. |
CI | Data Quality Checks | These might include checks for null values, data range violations, data type mismatches, or other custom quality rules. |
CD | Automated Deployment | Once code changes pass all CI checks, CD tools can automate their deployment to production, ensuring seamless data flow. |
CD | Monitoring and Alerts | Once deployed, monitoring tools keep track of the data pipeline's performance, data quality, and any potential issues. Automated alerts can notify on discrepancies. |
CD | Development Branch Management | In case an issue is identified post-deployment, CD processes allow for quick rollbacks to a previously stable state of the data pipeline. |
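A minimal sketch of the kind of automated check a CI stage might run against a transformation before deployment (the transformation, its expected schema, and the quality rules are assumptions for illustration):

```python
import pandas as pd

EXPECTED_SCHEMA = {"project_id": "object", "hours": "float64"}

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Example transformation under test: standardise types and drop bad rows."""
    out = raw.copy()
    out["hours"] = pd.to_numeric(out["hours"], errors="coerce")
    return out.dropna(subset=["project_id", "hours"])

def test_transform_schema_and_quality():
    raw = pd.DataFrame({"project_id": ["P-001", None], "hours": ["7.5", "bad"]})
    result = transform(raw)
    # Schema check: columns and dtypes match the agreed contract
    assert {c: str(t) for c, t in result.dtypes.items()} == EXPECTED_SCHEMA
    # Quality checks: no nulls, no negative hours
    assert result.notna().all().all()
    assert (result["hours"] >= 0).all()

# A CI tool would run this with a test runner, e.g.:  pytest -q tests/
```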
The overarching goal of Data Observability is proactive problem-solving, where any anomalies or discrepancies are swiftly identified and rectified before they escalate into issues. Through continuous monitoring and analysis, data observability helps maintain the reliability, accuracy, and accessibility of BMT's data assets, thereby fostering trust and confidence in data-driven decision-making.
Principle | Activity | So that … |
Freshness | Ensure that data is up-to-date and reflects the most recent state of the source systems | users can make decisions based on timely and accurate information, leading to more informed and effective actions. |
Distribution | Monitor how data is spread across systems and locations to ensure that it falls within acceptable ranges and thresholds | potential issues such as data skew or imbalance can be identified and addressed promptly, maintaining data quality and integrity across the distributed environment. |
Volume | Track the volume of data being ingested, processed, and stored | capacity planning and resource allocation can be optimised, preventing infrastructure overload or resource contention and maintaining efficient data processing. |
Schema | Validate data schema consistency and evolution over time | data compatibility and interoperability are maintained, preventing errors and inconsistencies that could disrupt downstream processes or analyses. |
Lineage | Capture and visualise the lineage of data, including its origins, transformations, and destinations | data provenance and impact analysis can be performed, enabling users to trace data back to its source and understand its journey through the data pipeline. |
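A hedged sketch of how freshness, volume, and schema checks might be expressed in code (the thresholds, column names, and sample batch are illustrative assumptions):

```python
import pandas as pd

def check_freshness(df: pd.DataFrame, ts_col: str, max_lag_hours: int = 6) -> bool:
    """Freshness: the newest record should be within the agreed lag."""
    lag = pd.Timestamp.now(tz="UTC") - pd.to_datetime(df[ts_col], utc=True).max()
    return lag <= pd.Timedelta(hours=max_lag_hours)

def check_volume(row_count: int, expected: int, tolerance: float = 0.2) -> bool:
    """Volume: today's row count should be within a tolerance of the recent average."""
    return abs(row_count - expected) <= tolerance * expected

def check_schema(df: pd.DataFrame, expected_cols: set[str]) -> bool:
    """Schema: no expected columns missing, no surprise columns appearing."""
    return set(df.columns) == expected_cols

batch = pd.DataFrame({"id": [1, 2], "loaded_at": ["2024-05-01T08:00Z", "2024-05-01T09:00Z"]})
alerts = {
    "freshness": check_freshness(batch, "loaded_at"),
    "volume": check_volume(len(batch), expected=2),
    "schema": check_schema(batch, {"id", "loaded_at"}),
}
print({name: "OK" if ok else "ALERT" for name, ok in alerts.items()})
```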
The Data Mesh decentralises data ownership by transferring the responsibility from the central data team to the business units that create and consume data.
By decentralising data ownership to domain teams, Data Mesh promotes agility, innovation, and accountability within BMT. It enables faster decision-making, facilitates collaboration across business units, and empowers domain experts to derive actionable insights from data more effectively.
It operates on the principles of domain-driven design, product thinking, and federated governance.
Principle | Activity | So that… |
Domain-oriented Decentralised Data Ownership and Architecture: | Implement data flows to seamlessly connect operational systems with analytics and business intelligence (BI) systems | domain teams can own and manage their data independently, fostering agility and innovation within their domains. |
Data as Product: | Document clear source-to-target mappings for transparency and traceability | data is treated as a valuable product, ensuring that it is well-understood, curated, and accessible for consumption by domain teams. |
Self-service Infrastructure as a Platform: | Provide a data developer portal (myBMT & Knowhow) | domain teams can autonomously access and utilise data infrastructure and tools, enabling them to build, deploy, and manage data pipelines and applications without the need for extensive support from centralised teams. |
Federated Computational Governance: | Provide support for the development and maintenance of data analysis/analytics systems | Best practice and computational learning can be distributed, allowing domain teams to govern their data processing and analytics workflows according to their specific needs and requirements. |
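As a light sketch of the "data as product" idea, a domain team might publish a machine-readable descriptor alongside its data so that consumers and federated governance tooling know what they are getting; the fields shown are assumptions, not a BMT standard:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class DataProduct:
    """Minimal descriptor a domain team could publish with its dataset."""
    name: str
    domain: str
    owner: str
    refresh_schedule: str
    quality_checks: list[str]
    schema: dict[str, str]

finance_invoices = DataProduct(
    name="finance_invoices",
    domain="Finance",
    owner="finance-data-team@example.org",   # placeholder contact
    refresh_schedule="daily 06:00 UTC",
    quality_checks=["no_null_invoice_id", "amount_gbp >= 0"],
    schema={"invoice_id": "string", "amount_gbp": "decimal", "issued_on": "date"},
)

# The descriptor can be registered with central governance tooling as JSON
print(json.dumps(asdict(finance_invoices), indent=2))
```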
Data quality rules and checks allow the Data Specialist to directly address and uphold the quality dimensions of accuracy, completeness, and consistency, ensuring that the data meets the desired standards and remains reliable for analysis and decision-making.
Principle | Activity | So that … |
Accuracy | Ensure that data is accurate and free from errors or inaccuracies | stakeholders can make reliable decisions based on trustworthy information, leading to improved business outcomes and performance. |
Completeness | Verify that all required data elements are present and accounted for | analyses and reports are comprehensive and representative of the entire dataset, reducing the risk of biased or incomplete insights. |
Consistency | Enforce consistency in data values and formats across systems and sources | data can be seamlessly integrated and aggregated, avoiding discrepancies and ensuring compatibility for downstream processes and analyses. |
Missing Data | Identify and flag instances where data is missing or incomplete | gaps in the dataset can be addressed promptly, preventing erroneous conclusions or decisions based on incomplete information. |
Duplicate Data | Detect and eliminate duplicate entries or records within the dataset | data integrity is maintained, preventing overcounting or inaccuracies in analyses and ensuring a single source of truth for reporting and decision-making. |
Format Validation | Validate data formats to ensure consistency and adherence to predefined standards | data can be accurately interpreted and processed by downstream systems or applications, minimising errors and compatibility issues. |
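A minimal sketch of how several of these checks might be implemented against an incoming extract (the column names, identifier format, and rules are illustrative assumptions):

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> dict[str, bool]:
    """Return a pass/fail result per rule for an incoming extract."""
    return {
        # Completeness / missing data: mandatory fields must be populated
        "no_missing_ids": df["invoice_id"].notna().all(),
        # Duplicate data: one row per invoice
        "no_duplicates": not df["invoice_id"].duplicated().any(),
        # Format validation: invoice numbers follow the agreed pattern
        "id_format_valid": df["invoice_id"].astype(str).str.fullmatch(r"INV-\d{5}").all(),
        # Accuracy / range: amounts must be non-negative
        "amounts_non_negative": (df["amount_gbp"] >= 0).all(),
    }

extract = pd.DataFrame({
    "invoice_id": ["INV-00001", "INV-00002", "INV-00002"],   # deliberate duplicate
    "amount_gbp": [1200.0, 450.0, 450.0],
})
results = run_quality_checks(extract)
failures = [rule for rule, passed in results.items() if not passed]
print("Failed rules:", failures or "none")
```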
Risk/Issue | Mitigation |
Dependency Failures: Failures in upstream systems or dependencies affecting data availability; unreliable third-party data sources or services; failure to handle dependency failures gracefully within the pipeline. | Dependency Isolation: Isolate dependencies within the data pipeline to minimise the impact of failures on other components. Use service boundaries, microservices architecture, and message queues to decouple dependencies and prevent cascading failures from propagating throughout the pipeline. |
Data Pipeline Configuration Errors: Incorrect configuration settings for data pipeline components; misconfigured data connections or permissions; changes to pipeline configurations without proper testing or validation. | Configuration Management System: Implement a robust configuration management system to centralise and manage configuration settings for data pipeline components. Utilise version control systems, such as Git or Subversion, to track changes to configuration files and ensure consistency across environments. |
Data Quality Issues: Missing values; incorrect data formats; inconsistent data across sources. | Data Quality Monitoring: Implement data quality monitoring processes to continuously monitor the quality of incoming data. Set up alerts or notifications to flag instances of missing values, incorrect formats, or inconsistencies in real-time, allowing for prompt remediation. |
Resource Exhaustion: Exhaustion of system resources (e.g., memory, CPU, storage) leading to pipeline failures; inefficient resource utilisation or allocation within the pipeline infrastructure; failure to scale resources dynamically based on workload demands. | Modularisation: Break down the pipeline into modular components to improve scalability, maintainability, and flexibility. Design modular components that perform specific tasks or functions, such as data ingestion, transformation, and loading, and orchestrate these components in a cohesive and efficient manner. |
Monitoring and Alerting Failures: Ineffective monitoring of pipeline health and performance; failure to detect and alert on anomalies or errors in a timely manner; lack of visibility into pipeline status and health metrics. | Proactive Health Checks: Conduct proactive health checks of the data pipeline at regular intervals to identify potential issues before they escalate. Use automated scripts or monitoring tools to perform health checks on data sources, processing components, and downstream systems. |
Data Security Breaches: Unauthorised access to sensitive data within the pipeline; data leaks or breaches due to inadequate security measures; insider threats or malicious activities compromising data integrity. | Role-Based Access Control (RBAC): Implement role-based access control (RBAC) mechanisms to manage data pipeline permissions and access rights. Define roles and permissions for different user groups or personas, and assign permissions based on job responsibilities and data access requirements to prevent unauthorised access or misuse of data. |
Data Integration Problems: Incompatibility between different data formats or schemas; issues with data synchronisation between systems or databases; data loss or corruption during integration processes. | Schema Standardisation: Establish standardised data schemas or formats to ensure compatibility between different systems or databases. Define and enforce data standards to facilitate seamless integration and minimise conflicts or inconsistencies in data structures. |
Data Transformation Errors: Logic errors in data transformation processes; inaccurate data aggregations or calculations; mismatched data types during transformation. | Continuous Improvement Practices: Foster a culture of continuous improvement by regularly reviewing and optimising data transformation processes. Encourage feedback from stakeholders and team members to identify areas for enhancement and implement iterative improvements to increase the efficiency and reliability of data transformations. |
Network Connectivity Issues: Network outages or latency affecting data transmission between components; packet loss or network congestion impacting data transfer reliability; inadequate network bandwidth for data pipeline requirements. | API Integration: Utilising APIs (Application Programming Interfaces) for data transmission between components can provide a standardised and reliable communication mechanism. APIs offer well-defined interfaces for data exchange, allowing you to establish robust connections and implement error handling mechanisms to handle network outages or latency effectively. |
Data Processing Bottlenecks: Slow or inefficient processing of large volumes of data; resource constraints leading to processing delays; inadequate scalability of processing infrastructure. | Reorganise Pipeline: By reorganising the pipeline for efficiency, you can optimise resource utilisation, reduce processing latency, and improve overall system performance, enabling faster and more scalable data processing workflows. |
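A hedged sketch of one of these mitigations, retrying a flaky upstream dependency with exponential backoff so transient failures do not cascade through the pipeline (the fetch function and its behaviour are assumed for illustration):

```python
import time
import random

def fetch_from_upstream() -> list[dict]:
    """Placeholder for a call to an upstream system or third-party service."""
    if random.random() < 0.5:                      # simulate intermittent failure
        raise ConnectionError("upstream temporarily unavailable")
    return [{"id": 1, "value": 42}]

def fetch_with_retries(retries: int = 4, base_delay: float = 1.0) -> list[dict]:
    """Retry with exponential backoff; fail loudly only after all attempts."""
    for attempt in range(retries):
        try:
            return fetch_from_upstream()
        except ConnectionError as exc:
            if attempt == retries - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)
    return []  # unreachable, but keeps type checkers satisfied

print(fetch_with_retries())
```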
Objective: Efficiently acquire data from various sources and ensure secure, accurate, and timely transport to the data lakehouse and data warehouse.
Steps:
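The detailed steps are maintained in the SOP itself; as a minimal illustration of the kind of acquisition-and-landing step they describe (the source path, landing location, and metadata columns are assumptions):

```python
import pandas as pd
from pathlib import Path

LANDING_ZONE = Path("landing/finance")            # assumed lakehouse landing path

def acquire_csv_extract(source_path: str, dataset: str) -> Path:
    """Pull a source extract, apply basic validation, and land it as Parquet."""
    df = pd.read_csv(source_path)
    if df.empty:
        raise ValueError(f"{dataset}: source extract is empty")
    # Record lineage-friendly metadata at the point of acquisition
    df["_ingested_at"] = pd.Timestamp.now(tz="UTC")
    df["_source"] = source_path
    LANDING_ZONE.mkdir(parents=True, exist_ok=True)
    target = LANDING_ZONE / f"{dataset}.parquet"
    df.to_parquet(target, index=False)            # requires pyarrow or fastparquet
    return target

# Example usage against a hypothetical export file
# acquire_csv_extract("exports/invoices_2024-05-01.csv", dataset="invoices")
```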
Objective: Design, develop, deploy, and maintain Datamarts with a focus on performance, security, and usability.
Steps:
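Again, the full steps live in the SOP; the sketch below only illustrates the kind of Datamart artefact they produce, a small reporting-ready view built from warehouse tables (all table and column names are assumed):

```python
import pandas as pd

def build_finance_datamart(invoices: pd.DataFrame, customers: pd.DataFrame) -> pd.DataFrame:
    """Assemble a small, reporting-ready mart view from warehouse tables."""
    mart = (
        invoices.merge(customers, on="customer_id", how="left", validate="many_to_one")
        .assign(invoice_month=lambda d: pd.to_datetime(d["issued_on"]).dt.to_period("M"))
        .groupby(["customer_name", "invoice_month"], as_index=False)["amount_gbp"].sum()
    )
    return mart

invoices = pd.DataFrame({
    "invoice_id": ["INV-00001", "INV-00002"],
    "customer_id": [1, 1],
    "issued_on": ["2024-05-01", "2024-05-14"],
    "amount_gbp": [1200.0, 450.0],
})
customers = pd.DataFrame({"customer_id": [1], "customer_name": ["Acme Ltd"]})
print(build_finance_datamart(invoices, customers))
```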