Automation of Data Processing from the Source System

Diagram: automated processing of raw data into optimized analytical structures in BigQuery (ALTEN Polska implementation)

Project title: Automation of Data Processing from the Source System

Client: Global Leader in Corporate Banking

Industry: Banking, Finance & Insurance

Expertise: Data Management & AI

Project scope: Data processing and transformation, automation of data structure creation

Tools: Google Cloud Platform, BigQuery, Google Cloud Storage, SQL, Dynamic SQL

Automating data processing from the source system streamlined the integration and transformation processes in BigQuery. The project automated the conversion of raw, generic tables into optimized analytical structures ready for analysis and reporting.

The work covered data transformation and the implementation of a new, more flexible and efficient processing workflow. A key aspect was moving the data extraction logic, previously handled by scripts in the source system, into a cloud environment. The new approach made automated data processing more accessible and scalable, opening up new reporting possibilities while reducing the load on the source platform.

Task of the ALTEN Polska Team

The ALTEN Polska team was tasked with designing and implementing an automated data processing system for files provided by the client’s platform. Key activities included creating dynamic SQL queries in BigQuery that automatically transformed raw, generic tables into optimized structures. Transferring the processing logic from a closed scripting environment to an open and scalable SQL environment was one of the project’s main goals.
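To illustrate what generating dynamic SQL for this kind of transformation can look like, here is a minimal sketch in Python. It is not the project's actual code. It assumes, purely for illustration, that the raw generic table stores one attribute per row in columns named entity_id, attribute_name and attribute_value; those column names, the table names, and the function build_pivot_sql are all hypothetical.

```python
import re

# Conservative patterns so that names can be interpolated into SQL text safely.
_COLUMN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_TABLE = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9_]+){1,2}")


def build_pivot_sql(source_table, target_table, key_column, attributes):
    """Return a BigQuery statement that pivots a generic table into a wide one.

    Assumption: the source table has the hypothetical columns
    attribute_name and attribute_value, one attribute per row.
    """
    if not attributes:
        raise ValueError("at least one attribute is required")
    for table in (source_table, target_table):
        if not _TABLE.fullmatch(table):
            raise ValueError(f"invalid table name: {table!r}")
    for column in (key_column, *attributes):
        if not _COLUMN.fullmatch(column):
            raise ValueError(f"invalid column name: {column!r}")

    # One output column per attribute; MAX picks the single non-NULL value.
    pivoted = ",\n".join(
        f"  MAX(IF(attribute_name = '{name}', attribute_value, NULL)) AS {name}"
        for name in attributes
    )
    return (
        f"CREATE OR REPLACE TABLE `{target_table}` AS\n"
        f"SELECT\n"
        f"  {key_column},\n"
        f"{pivoted}\n"
        f"FROM `{source_table}`\n"
        f"GROUP BY {key_column}"
    )
```

Calling build_pivot_sql("raw.generic_records", "analytics.customer_wide", "entity_id", ["country", "segment"]) returns a single CREATE OR REPLACE TABLE statement with one column per attribute. The same idea can be expressed entirely inside BigQuery by assembling the statement as a string and running it with EXECUTE IMMEDIATE; the Python version is used here only to keep the sketch easy to run and inspect.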

Project Phases

The project was carried out in stages:

  • Analysis of the existing data model: At the initial stage, the data model in the source system was analyzed, including relationships between objects, cardinality, attributes, and dependencies. The goal was to thoroughly understand the structure of the source data.
  • Identification of specific data structures: Elements of the data model requiring a dedicated approach were identified, and special structures were developed to ensure proper data processing in subsequent stages.
  • Design of the intermediate layer: A flexible, dedicated structure for the intermediate layer was created to connect raw data with target analytical structures.
  • Implementation of the data transformation algorithm: An algorithm was developed to automatically transform raw data into organized structures ready for further processing.
  • Integration of layers and testing: Data from the intermediate layer was integrated with the target layer, followed by detailed testing to ensure the solution met business requirements.
  • Implementation safeguards: Mechanisms for verifying input data consistency were added to avoid potential errors in the processing workflow.
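The last phase, verifying input data consistency before processing, can be sketched as follows. This is an illustrative example under stated assumptions rather than the checks actually implemented in the project: it assumes raw rows arrive as dictionaries with the hypothetical keys entity_id and attribute_name, and the function name check_input_consistency is invented for this example.

```python
from collections import Counter


def check_input_consistency(rows, expected_attributes):
    """Return a list of human-readable issues found in the raw rows.

    Assumption: each row is a dict with the hypothetical keys
    entity_id and attribute_name. An empty list means no issues.
    """
    issues = []
    seen = Counter()
    for index, row in enumerate(rows):
        entity_id = row.get("entity_id")
        attribute = row.get("attribute_name")
        if entity_id in (None, ""):
            issues.append(f"row {index}: missing entity_id")
            continue
        if attribute not in expected_attributes:
            issues.append(f"row {index}: unexpected attribute {attribute!r}")
            continue
        seen[(entity_id, attribute)] += 1

    # A repeated (entity, attribute) pair would make the pivot ambiguous.
    for (entity_id, attribute), count in seen.items():
        if count > 1:
            issues.append(
                f"entity {entity_id}: attribute {attribute!r} appears {count} times"
            )
    return issues
```

A workflow could run a check like this on each delivered file and stop before the transformation step whenever the returned list is not empty, so that inconsistent input never reaches the target structures.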

Final Outcome

The project delivered a fully automated data processing workflow within BigQuery. The client received optimized structures that increased data availability and made the data easier to use in reporting and business analysis. By moving data processing to the cloud, the solution made working with data more flexible, unlocked new reporting possibilities, and gave users broader access to information. Automation also reduced the complexity of data handling, supporting efficient management and the growth of business analytics.

Summary

Thanks to the collaboration with ALTEN Polska, the client gained a scalable, efficient, and easy-to-maintain data processing solution. The automation of processes and the adoption of more efficient technologies reduced costs and processing time while improving data quality. The project established a solid foundation for further development of business analytics and data integration in the cloud environment.