Stripping Tags from Rich Text Imports Using Python

When importing rich text data into a data warehouse (DWH), it’s often necessary to remove unwanted HTML tags while preserving essential formatting. This guide outlines a method using Python’s BeautifulSoup library to clean and structure the data efficiently.

Why Strip HTML Tags?

Rich text from sources like web applications, CMS platforms, and APIs often includes …
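As a rough illustration of the kind of cleanup this guide describes, here is a minimal sketch. It is not the guide's own BeautifulSoup code: it swaps in the standard library's `html.parser` so that it runs without extra dependencies. The names `strip_tags`, `BLOCK_TAGS`, and `SKIP_TAGS` are hypothetical, and it assumes that "essential formatting" means only line breaks between paragraphs, list items, and `<br>` tags.

```python
from html.parser import HTMLParser

# Assumption: these tags mark line boundaries worth keeping in the DWH text.
BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Assumption: content inside these tags is never wanted in the import.
SKIP_TAGS = {"script", "style"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and turns block boundaries into newlines."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; and &nbsp;
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Ignore text that sits inside <script> or <style>
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(html):
    """Return the text of `html` with tags removed and line breaks kept."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any buffered trailing text
    raw_lines = "".join(parser.parts).split("\n")
    # Collapse runs of whitespace within each line and drop empty lines
    lines = (" ".join(line.split()) for line in raw_lines)
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<p>Hello <b>world</b></p><p>Line&nbsp;two<br>three</p>"
    print(strip_tags(sample))
```

Inline tags such as `<b>` are dropped while their text is kept. Whether that matches the formatting a particular import needs to preserve is a judgment call for each source.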

Release 58

Additions

Customer Details USD

✅ New Addition customer_details_USD – Part of the CAD/USD series of changes.
⚠️ Contains Created (Hash) Customer IDs – This will be corrected in the next release.

Considerations & Next Steps

🔹 Customer ID Fix Pending – Plan for a smooth transition when correcting the Customer ID field.
🔹 UNION-based “mega” list for …

Release 57

Alterations

15 Files Modified

Key Fixes & Enhancements

Notes

1. ⚠️ DIFFERENT: employee_absencelimit.sql
2. ⚠️ DIFFERENT: infrastructure_changerequest.sql
3. ⚠️ DIFFERENT: infrastructure_oshrecords.sql
4. ⚠️ DIFFERENT: marketing_Advertising.sql
5. ⚠️ DIFFERENT: marketing_campaignTraffic.sql
6. ⚠️ DIFFERENT: marketing_campaignTrafficFormSubmissions.sql
7. ⚠️ DIFFERENT: marketing_EmailCampaigns.sql
8. ⚠️ DIFFERENT: marketing_Events.sql
9. ⚠️ DIFFERENT: marketing_Forms.sql
10. ⚠️ DIFFERENT: marketing_HubSpotLandingPages.sql
11. ⚠️ DIFFERENT: marketing_Meltwater.sql
12. ⚠️ DIFFERENT: marketing_pagePerformance.sql
13. ⚠️ DIFFERENT: marketing_pagePerformanceFormSubmissions.sql
14. ⚠️ DIFFERENT: marketing_SocialMedia.sql
15. ⚠️ DIFFERENT: marketing_ThoughtLeadership.sql
…

Release 56

Summary

This minor release makes some 36 changes to ensure that all view definitions include an alias for the table in the FROM clause. These updates are cosmetic and do not affect the performance of the DataViews.

Key Updates

Impact

Next Steps

For any queries or issues, please reach out to the …

Release 55

Additions

customer_orderline_202501

This release introduces a new version of customer.orderline, evolving from the current beta version. The latest update incorporates a fully JSON-managed field structure, enhancing flexibility, data integrity, and future scalability.

Key Changes & Enhancements

Impact on Data Consumers

Alterations

project_details_plus_CAD

This update continues the ongoing improvements to mapping accuracy and data consistency within …