Proposal for a white paper on Common Data Model (Datamarts)
Proposed Structure for the White Paper
1. Executive Summary
- Briefly outline the purpose and benefits of standardising the datamart views.
- Highlight the goals: consistency, clarity, scalability, and usability.
2. Introduction
- Describe the current challenges and inconsistencies in the existing datamart views.
- Provide examples of ad hoc naming and structural inconsistencies.
3. Objectives
- Define the scope of the standardisation effort.
- Key objectives:
- Uniform naming conventions for views.
- Standardised structures for data storage.
- Defined roles and purpose for each view type.
4. Proposed Standardised Format
- Core Views: Outline the key datamart views, their purposes, and proposed naming conventions.
- Details (or Core): Foundational attributes, primary keys, and links.
- Dates: Key lifecycle dates (e.g., created, updated, completed).
- Values: Measures and calculated fields (e.g., revenue, cost, hours).
- Codes: Dimensions and hierarchies for categorisation (e.g., PRU, regions).
- Status: Lifecycle state, progress indicators, risk assessments.
- Roles: Associated individuals or entities (e.g., project managers, line managers).
5. Naming Conventions
- Recommend a clear, consistent naming pattern for all views:
- [entity].[purpose] (e.g., project.dates, employee.values).
- Examples for your naming series:
- Details/Core: project.core, employee.core
- Dates: project.dates, employee.dates
- Values/Measures: project.values, employee.values
- Codes: project.codes, employee.codes
- Status: project.status, employee.status
- Roles: project.roles, employee.roles
6. Standardised Structure for Views
- Each view should follow a predictable format, such as:
- Columns: [Entity], [Attribute], [MetaValue]
- Examples:
- project_id | start_date | 2025-01-01
- project_id | total_revenue | 1,000,000
- employee_id | role | Project Manager
7. Metadata Format
- Propose specific formats for meta_value, such as:
- Dates: ISO 8601 (YYYY-MM-DD).
- Numerical Values: Decimal precision and currency codes (e.g., 1000000.00 USD).
- Codes and Statuses: Alphanumeric identifiers mapped to hierarchies.
8. Implementation Roadmap
- Steps for rolling out the standardisation:
- Audit current datamart views.
- Transition existing views to the new structure.
- Develop templates and automation scripts for creating standardised views.
- Train teams on the new format and naming conventions.
9. Benefits
- Highlight the advantages of this approach:
- Improved discoverability and usability of views.
- Easier integration with downstream applications.
- Better alignment with modern data architecture practices.
10. Conclusion
- Summarise the proposal and invite feedback from stakeholders.
Naming Suggestions for Views
Here are some options for naming conventions:
- Foundational Attributes:
- entity.core
- entity.info
- Lifecycle Dates:
- entity.dates
- entity.timeline
- Measures/Values:
- entity.values
- entity.metrics
- Codes/Hierarchies:
- entity.codes
- entity.dimensions
- Status:
- entity.status
- entity.lifecycle
- Roles:
- entity.roles
- entity.people
Potential Additional View Types
- Identifiers (Keys):
- For views that primarily map relationships using unique identifiers (e.g., foreign keys, composite keys).
- Examples: project.keys, employee.keys.
- Relationships:
- To define parent-child or entity-to-entity relationships (e.g., hierarchies, dependencies).
- Examples: project.relationships (e.g., parent projects), employee.reports_to.
- History (Audit):
- For tracking changes over time, capturing historical states or changes in attributes.
- Examples: project.history, employee.audit.
- Attributes (Properties):
- For storing additional descriptive or categorisation data that doesn’t fit under Codes or Details.
- Examples: project.attributes (e.g., industry, sector), employee.properties.
- Attachments (Metadata):
- For references to supporting documents, URLs, or external resources linked to the entity.
- Examples: project.attachments (e.g., contracts, plans), employee.documents.
- Aggregations (Summaries):
- For pre-computed roll-ups, totals, or summary statistics.
- Examples: project.aggregations (e.g., total revenue, hours), employee.summaries.
- Errors/Exceptions:
- For monitoring or logging discrepancies, issues, or unusual states.
- Examples: project.exceptions (e.g., over-budget items), employee.errors (e.g., missing data).
- Notes/Comments:
- To capture freeform text or annotations linked to the entity.
- Examples: project.notes, employee.comments.
- Tags/Categories:
- For tagging or classifying entities for quick filtering or categorisation.
- Examples: project.tags, employee.labels.
- Forecasts/Projections:
- For storing predictive data or planned values.
- Examples: project.forecasts (e.g., expected completion dates), employee.projections (e.g., planned FTE).
- Processes/Workflows:
- For tracking steps, approvals, or status within workflows.
- Examples: project.workflow (e.g., approvals, phases), employee.process.
- Costs/Expenses:
- If financial details warrant a separate view distinct from general measures.
- Examples: project.costs (e.g., budget, expenditures), employee.expenses (e.g., travel, training).
- Connections/Integrations:
- For managing or defining external links, APIs, or interfaces.
- Examples: project.connections, employee.integrations.