Data Acquisition and Transport Workflow

Objective: Efficiently acquire data from various sources and ensure secure, accurate, and timely transport to the data lakehouse and data warehouse.

Steps:

Data Source Identification and Mapping

  • Action: Identify relevant data sources, including databases, applications, and third-party sources.
  • Deliverable: Data source inventory, data mapping documents.
  • Control Measures: All data sources SHALL be described in myBMT; the EA solution SHALL map enterprise interactions.

Data Profiling and Classification

  • Action: Profile data sources to understand data quality and classify based on sensitivity.
  • Deliverable: Data profiling reports, classification tags.
  • Control Measures: Sensitivity levels defined by the DataMart Owner, data tagged for classification, metadata managed in myBMT.
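
The profiling and tagging step above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual tooling: the `SENSITIVITY_RULES` mapping, the level names, and the function names are hypothetical stand-ins for whatever the DataMart Owner defines, and the metrics shown (null rate, distinct count) are only two of many a profiling report might include.

```python
# Hypothetical sensitivity rules; in practice the DataMart Owner defines these.
SENSITIVITY_RULES = {
    "email": "confidential",
    "ssn": "restricted",
}
DEFAULT_SENSITIVITY = "internal"  # assumed fallback level


def profile_column(rows, column):
    """Return basic quality metrics and a classification tag for one column."""
    values = [row.get(column) for row in rows]
    total = len(values)
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]
    missing = total - len(present)
    return {
        "column": column,
        "row_count": total,
        "null_rate": missing / total if total else 0.0,
        "distinct_count": len(set(present)),
        "sensitivity": SENSITIVITY_RULES.get(column, DEFAULT_SENSITIVITY),
    }


def profile_table(rows):
    """Profile every column that appears in a list of dict rows."""
    columns = sorted({key for row in rows for key in row})
    return [profile_column(rows, column) for column in columns]
```

Calling `profile_table` on a sample of rows yields one record per column, which could serve as the raw material for a profiling report and for the classification tags stored as metadata.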

Data Acquisition and Integration

  • Action: Use ETL processes to acquire data and transport it to the data lakehouse.
  • Deliverable: ETL process documentation, transfer logs.
  • Control Measures: Data encryption for sensitive data, archiving of source data before processing, transfer logs written at each pipeline stage.
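
Two of the control measures above, archiving the source before processing and writing a transfer log at each stage, might look like the following sketch. It assumes file-based sources and a JSON-lines log; the function names, stage names, and directory layout are hypothetical. Encryption is deliberately left out because it depends on the organization's approved cryptography library and key management.

```python
import hashlib
import json
import shutil
import time
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file in 64 KiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_transfer(log_path, stage, path):
    """Append one JSON line describing the file as seen at this stage."""
    entry = {
        "stage": stage,
        "file": str(path),
        "bytes": Path(path).stat().st_size,
        "sha256": sha256_of(path),
        "timestamp": time.time(),
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def acquire(source, archive_dir, landing_dir, log_path):
    """Archive the source file first, then copy it to the landing area."""
    source = Path(source)
    archive_dir, landing_dir = Path(archive_dir), Path(landing_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    landing_dir.mkdir(parents=True, exist_ok=True)

    # Stage 1: keep an untouched copy of the source before any processing.
    archived = archive_dir / source.name
    shutil.copy2(source, archived)
    archive_entry = log_transfer(log_path, "archive", archived)

    # Stage 2: move the data toward the lakehouse landing area.
    landed = landing_dir / source.name
    shutil.copy2(archived, landed)
    landing_entry = log_transfer(log_path, "landing", landed)

    # Matching checksums show the file was not altered between stages.
    if archive_entry["sha256"] != landing_entry["sha256"]:
        raise IOError(f"checksum mismatch for {source.name}")
    return landed
```

Because each log entry carries a checksum and byte count, the log doubles as evidence that the data arrived intact, not only that a transfer was attempted.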

Transport to Data Warehouse

  • Action: Orchestrate the secure and accurate transfer of data from the lakehouse to the data warehouse.
  • Deliverable: Data warehouse integration reports.
  • Control Measures: Continuous monitoring, execution logs identifying external agents and calling users.
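
One way to produce execution logs that identify the external agent and the calling user is to wrap each transfer job in a logging decorator. The sketch below uses only Python's standard `logging` module; the decorator name, the `agent` label, the logger name, and the `load_to_warehouse` job are all hypothetical, and a real orchestrator would likely supply the agent and user from its own context.

```python
import functools
import getpass
import logging

logger = logging.getLogger("warehouse_transport")  # assumed logger name


def logged_execution(agent):
    """Log start, success, and failure of a job with its agent and user."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, user=None, **kwargs):
            # Fall back to the OS account when no calling user is supplied.
            caller = user or getpass.getuser()
            logger.info("start job=%s agent=%s user=%s", func.__name__, agent, caller)
            try:
                result = func(*args, **kwargs)
            except Exception:
                logger.exception("failed job=%s agent=%s user=%s", func.__name__, agent, caller)
                raise
            logger.info("done job=%s agent=%s user=%s", func.__name__, agent, caller)
            return result
        return wrapper
    return decorator


@logged_execution(agent="nightly-scheduler")  # hypothetical external agent
def load_to_warehouse(rows):
    """Placeholder for the real lakehouse-to-warehouse load; returns row count."""
    return len(rows)
```

A call such as `load_to_warehouse(rows, user="alice")` then emits a start and a done record naming both the agent and the user, which a monitoring system can consume.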

Testing and Validation

  • Action: Perform integration testing to ensure the ETL process works end-to-end from source to warehouse.
  • Deliverable: Test reports, validation results.
  • Control Measures: Data views in DataMarts checked for errors, data accuracy validated.
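
As an illustration of the accuracy validation above, an end-to-end test can reconcile rows read from the source against rows read from the warehouse. The sketch below is one possible approach under simple assumptions: both sides fit in memory as lists of dicts, share a primary-key column, and use identical column names and value formats. The function names are hypothetical.

```python
import hashlib


def fingerprint(rows, key):
    """Map each primary-key value to a hash of the full row."""
    result = {}
    for row in rows:
        # Sort column names so the hash does not depend on column order.
        canonical = "|".join(f"{name}={row[name]}" for name in sorted(row))
        result[row[key]] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return result


def reconcile(source_rows, warehouse_rows, key):
    """Report rows that are missing, unexpected, or different in the warehouse."""
    src = fingerprint(source_rows, key)
    dst = fingerprint(warehouse_rows, key)
    return {
        "missing_in_warehouse": sorted(src.keys() - dst.keys()),
        "unexpected_in_warehouse": sorted(dst.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

An integration test would pass when all three lists in the result are empty; any non-empty list points to the specific keys to investigate in the test report.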
