White Paper: Common Data Model (Datamarts)

Proposal for a white paper on Common Data Model (Datamarts)

Proposed Structure for the White Paper

1. Executive Summary

  • Briefly outline the purpose and benefits of standardising the datamart views.
  • Highlight the goals: consistency, clarity, scalability, and usability.

2. Introduction

  • Describe the current challenges and inconsistencies in the existing datamart views.
  • Provide examples of ad hoc naming and structural inconsistencies.

3. Objectives

  • Define the scope of the standardisation effort.
  • Key objectives:
    • Uniform naming conventions for views.
    • Standardised structures for data storage.
    • Defined roles and purpose for each view type.

4. Proposed Standardised Format

  • Core Views: Outline the key datamart views, their purposes, and proposed naming conventions.
    • Details (or Core): Foundational attributes, primary keys, and links.
    • Dates: Key lifecycle dates (e.g., created, updated, completed).
    • Values: Measures and calculated fields (e.g., revenue, cost, hours).
    • Codes: Dimensions and hierarchies for categorisation (e.g., PRU, regions).
    • Status: Lifecycle state, progress indicators, risk assessments.
    • Roles: Associated individuals or entities (e.g., project managers, line managers).

5. Naming Conventions

  • Recommend a clear, consistent naming pattern for all views:
    • [entity].[purpose] (e.g., project.dates, employee.values).
  • Examples for your naming series:
    • Details/Core: project.core, employee.core
    • Dates: project.dates, employee.dates
    • Values/Measures: project.values, employee.values
    • Codes: project.codes, employee.codes
    • Status: project.status, employee.status
    • Roles: project.roles, employee.roles

6. Standardised Structure for Views

  • Each view should follow a predictable format, such as:
    • Columns: [Entity], [Attribute], [MetaValue]
    • Examples:
      • project_id | start_date      | 2025-01-01
      • project_id | total_revenue  | 1,000,000
      • employee_id | role           | Project Manager

7. Metadata Format

  • Propose specific formats for meta_value, such as:
    • Dates: ISO 8601 (YYYY-MM-DD).
    • Numerical Values: Decimal precision and currency codes (e.g., 1000000.00 USD).
    • Codes and Statuses: Alphanumeric identifiers mapped to hierarchies.

8. Implementation Roadmap

  • Steps for rolling out the standardisation:
    • Audit current datamart views.
    • Transition existing views to the new structure.
    • Develop templates and automation scripts for creating standardised views.
    • Train teams on the new format and naming conventions.

9. Benefits

  • Highlight the advantages of this approach:
    • Improved discoverability and usability of views.
    • Easier integration with downstream applications.
    • Better alignment with modern data architecture practices.

10. Conclusion

  • Summarise the proposal and invite feedback from stakeholders.

Naming Suggestions for Views

Here are some options for naming conventions:

  1. Foundational Attributes:
    • entity.core
    • entity.info
  2. Lifecycle Dates:
    • entity.dates
    • entity.timeline
  3. Measures/Values:
    • entity.values
    • entity.metrics
  4. Codes/Hierarchies:
    • entity.codes
    • entity.dimensions
  5. Status:
    • entity.status
    • entity.lifecycle
  6. Roles:
    • entity.roles
    • entity.people

Potential Additional View Types

  1. Identifiers (Keys):
    • For views that primarily map relationships using unique identifiers (e.g., foreign keys, composite keys).
    • Examples: project.keys, employee.keys.
  2. Relationships:
    • To define parent-child or entity-to-entity relationships (e.g., hierarchies, dependencies).
    • Examples: project.relationships (e.g., parent projects), employee.reports_to.
  3. History (Audit):
    • For tracking changes over time, capturing historical states or changes in attributes.
    • Examples: project.history, employee.audit.
  4. Attributes (Properties):
    • For storing additional descriptive or categorisation data that doesn’t fit under Codes or Details.
    • Examples: project.attributes (e.g., industry, sector), employee.properties.
  5. Attachments (Metadata):
    • For references to supporting documents, URLs, or external resources linked to the entity.
    • Examples: project.attachments (e.g., contracts, plans), employee.documents.
  6. Aggregations (Summaries):
    • For pre-computed roll-ups, totals, or summary statistics.
    • Examples: project.aggregations (e.g., total revenue, hours), employee.summaries.
  7. Errors/Exceptions:
    • For monitoring or logging discrepancies, issues, or unusual states.
    • Examples: project.exceptions (e.g., over-budget items), employee.errors (e.g., missing data).
  8. Notes/Comments:
    • To capture freeform text or annotations linked to the entity.
    • Examples: project.notes, employee.comments.
  9. Tags/Categories:
    • For tagging or classifying entities for quick filtering or categorisation.
    • Examples: project.tags, employee.labels.
  10. Forecasts/Projections:
    • For storing predictive data or planned values.
    • Examples: project.forecasts (e.g., expected completion dates), employee.projections (e.g., planned FTE).
  11. Processes/Workflows:
    • For tracking steps, approvals, or status within workflows.
    • Examples: project.workflow (e.g., approvals, phases), employee.process.
  12. Costs/Expenses:
    • If financial details warrant a separate view distinct from general measures.
    • Examples: project.costs (e.g., budget, expenditures), employee.expenses (e.g., travel, training).
  13. Connections/Integrations:
    • For managing or defining external links, APIs, or interfaces.
    • Examples: project.connections, employee.integrations.

Leave a Comment