Fabric-Aligned Data Layering Model

Our data architecture follows a structured, layered approach based on the Medallion model. It aligns with Microsoft Fabric principles and best practices, supporting efficient data ingestion, transformation, governance, and reporting.

Landing Layer
This layer receives raw data files (typically CSV format) from external systems into secure landing zones.

Bronze Layer
Bronze provides the initial structured …

Stripping Tags from Rich Text Imports Using Python

When importing rich text data into a Data Warehouse (DWH), it’s often necessary to remove unwanted HTML tags while preserving essential formatting. This guide outlines a method using Python’s BeautifulSoup library to clean and structure the data efficiently.

Why Strip HTML Tags?
Rich text from sources like web applications, CMS platforms, and APIs often includes …
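The excerpt stops before the post's own code, so the following is only a minimal sketch of the general idea (drop unwanted tags, keep a small set of formatting tags), not the author's method. The post uses BeautifulSoup. This sketch swaps in the standard library's `html.parser` so that it runs without extra packages. The `KEEP_TAGS` allow-list and the names `TagStripper` and `strip_tags` are hypothetical, chosen for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: tags treated as "essential formatting" in this sketch.
KEEP_TAGS = {"b", "i", "p", "br", "ul", "ol", "li"}


class TagStripper(HTMLParser):
    """Drops every tag not in KEEP_TAGS but keeps the text inside it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Kept tags are re-emitted without attributes (style, class, ...).
        if tag in KEEP_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        # <br> is a void element, so no closing tag is emitted for it.
        if tag in KEEP_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped; escape it again so the output stays valid HTML.
        self.parts.append(escape(data, quote=False))


def strip_tags(html: str) -> str:
    """Return html with all tags outside KEEP_TAGS removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.parts)


cleaned = strip_tags('<div class="x"><p>Hello <span>big</span> <b>world</b></p></div>')
print(cleaned)
```

Note that this sketch keeps the text content of every removed tag, including `<script>` and `<style>`. A real import would probably need to drop those elements entirely before loading the data.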

Use Case: API vs CSV

When comparing the integration of CSV files and APIs with the Data Warehouse, several factors impact data integrity, including data accuracy, consistency, security, and the ability to handle large datasets. Below are some key comparisons and potential risks associated with each method:

1. Data Accuracy and Consistency
CSV Integration:
API Integration:

2. Data Security
CSV Integration: …

SharePoint List to Staging Pipeline

Purpose: Transfer SharePoint List data through Azure Blob Storage and into the Data Warehouse

Requires:
Prerequisites:
Process Steps:
4. Data Pipeline
5. Create pipeline parameters. Create two pipeline parameters.
5. Activities Tab
6. Lookup Activity
Click the Lookup activity and go to Settings. From there, if one is not already set up, create a new connection …