Assurance System Test, Regression Test, Unit Test activities, UA Tests, Peer Reviews (with estimated dates), code review methodology (Pull Requests/Pair Programming/other)
A new DataMart feature is more than a few lines of code. It is usually a complex, multilayer pipeline system incorporating separate functional components and third-party data integrations. Efficient development testing therefore goes far beyond finding errors in the source code. Typically, testing covers the following levels of data engineering:
- Component/Unit testing
- Integration testing
- System testing
- Acceptance testing
Each level ensures thorough validation of the feature’s functionality, interoperability, and alignment with user requirements.
Component/Unit testing
The smallest testable part of the Data Warehousing system is often referred to as a unit. Therefore, this testing level is aimed at examining every single unit of a system in order to make sure that it meets the original requirements and functions as expected. Unit testing is commonly performed early in the development process by the engineers themselves.
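As a minimal sketch of this level, the test below exercises a single transformation in isolation. The function `normalise_customer_name` is a hypothetical unit invented for illustration, not part of the actual DataMart codebase:

```python
# Minimal sketch of a unit test for one pipeline transformation.
# `normalise_customer_name` is a hypothetical unit used only to
# illustrate testing a single function against its requirements.
import unittest


def normalise_customer_name(raw: str) -> str:
    """Example unit: collapse whitespace and title-case a name."""
    return " ".join(raw.split()).title()


class NormaliseCustomerNameTest(unittest.TestCase):
    def test_trims_and_titlecases(self):
        self.assertEqual(normalise_customer_name("  jane   DOE "), "Jane Doe")

    def test_empty_input(self):
        self.assertEqual(normalise_customer_name(""), "")


if __name__ == "__main__":
    unittest.main()
```

Because such tests run in seconds, they fit naturally into the CI stage listed in the table below and can run on every commit.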
Integration testing
The objective of the next testing level is to verify whether the combined units work well together as a group. Integration testing is aimed at detecting the flaws in the interactions between the units within a pipeline module.
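The idea can be sketched as two hypothetical units that share a contract (the names `extract_rows` and `load_rows` are illustrative assumptions, not real pipeline components). The defect the test would catch lives in the interaction, not in either unit alone:

```python
# Hedged sketch of an integration test: two units that pass data
# between them. Both functions are invented for illustration.
def extract_rows(source):
    # Unit A: parse raw comma-separated lines into dicts.
    return [dict(zip(("id", "amount"), line.split(","))) for line in source]


def load_rows(rows):
    # Unit B: relies on the 'id' and 'amount' keys produced by Unit A.
    return {r["id"]: float(r["amount"]) for r in rows}


def test_extract_feeds_load():
    # A renamed column in extract_rows would pass its own unit tests
    # but break load_rows here -- the flaw is in the interaction.
    loaded = load_rows(extract_rows(["a1,10.5", "a2,3"]))
    assert loaded == {"a1": 10.5, "a2": 3.0}


test_extract_feeds_load()
```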
System testing
At this level, a complete DataMart system is tested as a whole. This stage serves to verify the product’s compliance with the functional and technical requirements and overall quality standards. System testing should be performed in a development context using an environment as close to the real business use scenario as possible.
DataMart regression testing against the last stable version ensures that the system's data retrieval processes remain consistent and accurate across feature updates or modifications. This testing phase involves executing a series of predefined queries against the DataMarts and validating that the expected data is returned per the defined criteria.
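The predefined-query approach can be sketched as follows. An in-memory SQLite database stands in for the real DataMart, and the table, query, and baseline values are illustrative assumptions; in practice the baselines would be captured from the last stable release:

```python
# Sketch of DataMart regression testing: run predefined queries and
# compare results against baselines from the last stable version.
# sqlite3 stands in for the real mart; all names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('EU', 100.0), ('EU', 50.0), ('US', 75.0);
""")

# Each entry pairs a query with its expected (baseline) result set;
# any mismatch after a feature update flags a regression.
REGRESSION_SUITE = {
    "total_by_region": (
        "SELECT region, SUM(amount) FROM sales "
        "GROUP BY region ORDER BY region",
        [("EU", 150.0), ("US", 75.0)],
    ),
}

for name, (query, expected) in REGRESSION_SUITE.items():
    actual = conn.execute(query).fetchall()
    assert actual == expected, f"regression in {name}: {actual}"
print("all regression checks passed")
```

The same pattern scales to the real environment by pointing the connection at the DEV or PRD mart and versioning the baseline result sets alongside the pipeline code.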
Acceptance testing
This is the last stage of the testing process, where the product is validated against the end-user requirements and for accuracy. This final step helps the team decide whether the data product is ready to be released. While small issues should be detected and resolved earlier in the process, this testing level focuses on overall system quality, from content and UI to performance. The acceptance stage might be followed by alpha and beta testing, allowing a small number of actual users to try out the software before it is officially released.
- Levels of Testing

| | Unit Testing | Integration | System | Acceptance |
|---|---|---|---|---|
| Why | To ensure features and code are developed correctly | To make sure the ties between system components function as required | To ensure the whole system works well when integrated | To ensure customer and end-user expectations are met |
| Who | Data Architect & Developer | Module/Pipeline Technical Architect | Data Engineer & Data Specialist | Data Specialist & Product Owner |
| What | All new code + refactoring of legacy code, plus SQL/JavaScript/Python unit testing | Azure web services, Synapse pipeline modules | User flows and typical user journeys, performance and security testing | Verifying acceptance tests on the stories, verification of features |
| When | As soon as new code is written | As soon as new components are added | When the data product is complete | When the data product is ready to be shipped |
| Where | Staging, Local & SQL DEV – Continuous Integration (CI) (Bronze & Silver) | Azure DEV environment | Azure DEV – Continuous Deployment (CD) (Gold) | PRD beta version |
| How (tools and methods) | Visual Studio, Azure Data Studio, SSMS, Git/DevOps | Synapse, ADF, Python Notebook | SQL DEV | SQL PRD, myBMT, KnowHow |
| Actions | Check Bronze processed file is available and create snapshot SQL | Check SQL for Spark and Python integration | Create MyView (mvw) in DEV | Create mvw in PRD (beta), update KnowHow |
In agile data engineering and feature development, testing is typically an iterative process. While the levels above generally refer to the complete product, they can also be applied to each added feature: every small unit of the new functionality is verified, then the Technical Architect checks the interconnections between these units, how the feature integrates with the rest of the system, and whether the update is ready to be shipped.