Build your Marketing Data Stack
In this HMA Academy training course, you will learn about the most important processes of the data stack: API queries, data orchestration and data modeling. Meaningful reporting is only possible with a high-performing and robust data pipeline. In the workshop, we will provide you with the necessary background knowledge and implement the individual steps in straightforward hands-on sessions. Each day covers one module of the ETL/ELT process. The modules build on each other, but can also be taken individually (days 1 and 2 and/or day 3). The training focuses on the use of the following technologies: Python, Apache Airflow and dbt.
3-day training / Modules can also be booked individually / In our training center at Schliersee, in Munich or in-house at your company (on request) / 3-8 participants / Suitable for all departments and industries
Day 1 (Module 1)
In the first module of this data stack training, you will learn how application programming interfaces (APIs) are structured and how you can view and query your data using appropriate tools. This is essential for getting an overview of your data and will help you write more targeted queries in the next step.
Master the key elements of RESTful API design, one of the most popular interface styles for securely exchanging information over the Internet. Learn how to use and work with RESTful services to build efficient data pipelines.
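As a small preview, a RESTful query in Python with the requests library can look like the following sketch; endpoint, parameters and token are placeholders for your own marketing API:

```python
import requests

# Hypothetical endpoint and token -- replace with your marketing API's details.
BASE_URL = "https://api.example.com/v1"
HEADERS = {"Authorization": "Bearer <YOUR_API_TOKEN>"}

# GET is the typical verb for reading data; query parameters narrow the result.
response = requests.get(
    f"{BASE_URL}/campaigns",
    headers=HEADERS,
    params={"start_date": "2024-01-01", "end_date": "2024-01-31"},
    timeout=30,
)
response.raise_for_status()  # fail loudly on 4xx/5xx status codes
print(response.json())       # most REST APIs answer with a JSON body
```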
Learn about JSON files: a popular, easily readable and widely supported text-based data format used to store and exchange structured information. JSON documents are built from key-value pairs (and arrays) and represent structured data efficiently in a human-readable form.
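A short example of such a document and how Python's built-in json module parses it (the campaign data is made up):

```python
import json

# A made-up campaign report as a JSON string: key-value pairs plus a nested array.
raw = """
{
  "campaign": "spring_sale",
  "active": true,
  "metrics": [
    {"date": "2024-01-01", "clicks": 120, "cost": 34.5},
    {"date": "2024-01-02", "clicks": 98, "cost": 29.1}
  ]
}
"""

data = json.loads(raw)                 # parse text into Python dicts and lists
print(data["campaign"])                # access values by key -> "spring_sale"
print(data["metrics"][0]["clicks"])    # navigate nested structures -> 120
print(json.dumps(data, indent=2))      # serialize back to readable JSON
```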
By the end of this module, you will have acquired skills in extracting data via API queries and efficiently handling data retrieved from multiple requests. In addition, you will learn how to seamlessly convert these extracts into pandas DataFrames and easily perform data transformations.
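The following sketch shows the idea: results from several paginated requests are collected and converted into a pandas DataFrame. The pagination fields (page, page_size, next_page, results) are assumptions that vary from API to API:

```python
import pandas as pd
import requests

BASE_URL = "https://api.example.com/v1"            # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <YOUR_API_TOKEN>"}

records, page = [], 1
while True:
    resp = requests.get(
        f"{BASE_URL}/campaigns",
        headers=HEADERS,
        params={"page": page, "page_size": 100},
        timeout=30,
    )
    resp.raise_for_status()
    payload = resp.json()
    records.extend(payload["results"])   # collect the rows of every page
    if not payload.get("next_page"):     # stop when the API reports no more pages
        break
    page += 1

# One flat DataFrame makes subsequent transformations straightforward.
df = pd.DataFrame.from_records(records)
df["cost_per_click"] = df["cost"] / df["clicks"]   # example transformation
print(df.head())
```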
Day 2 (Module 2)
You will learn how to build an efficient and robust ETL process using API data as a data source. We will look at how to store the data in an S3 bucket and then write it to a Snowflake data warehouse. You will also learn how to automate your workflow with the Python-based tool Apache Airflow. The user interface, architecture and configuration will be explained.
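As a preview of the storage step, here is a minimal sketch using boto3, the AWS SDK for Python; it assumes configured AWS credentials and uses a hypothetical bucket name:

```python
import io

import boto3
import pandas as pd

BUCKET = "my-marketing-raw-data"   # hypothetical bucket name

# A stand-in for the DataFrame extracted in module 1.
df = pd.DataFrame({"date": ["2024-01-01"], "clicks": [120], "cost": [34.5]})

# Serialize the extract to CSV in memory and upload it to S3; from there,
# Snowflake can ingest the file via an external stage and COPY INTO.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

s3 = boto3.client("s3")
s3.put_object(
    Bucket=BUCKET,
    Key="campaigns/2024-01-01.csv",
    Body=buffer.getvalue().encode("utf-8"),
)
```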
Apache Airflow is the tool of choice for workflow automation. You will receive a basic introduction to data orchestration and to the structure of Apache Airflow, then learn how to navigate the user interface and work with Airflow's core concepts.
To successfully deploy the setup, we need to configure the necessary permissions so that Snowflake can access the data in the external storage bucket. This is done via a Snowflake storage integration.
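For illustration, the integration can be created by executing the DDL through the Snowflake Python connector; the account details, IAM role ARN and bucket location below are placeholders:

```python
import snowflake.connector

# Placeholder credentials -- in practice these come from a secrets store.
conn = snowflake.connector.connect(
    account="<ACCOUNT>",
    user="<USER>",
    password="<PASSWORD>",
    role="ACCOUNTADMIN",   # creating integrations requires elevated privileges
)

# The storage integration lets Snowflake assume an AWS IAM role to read the bucket.
conn.cursor().execute("""
    CREATE STORAGE INTEGRATION IF NOT EXISTS s3_marketing_int
      TYPE = EXTERNAL_STAGE
      STORAGE_PROVIDER = 'S3'
      ENABLED = TRUE
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-access'
      STORAGE_ALLOWED_LOCATIONS = ('s3://my-marketing-raw-data/')
""")
```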
Now that we have a theoretical overview of data orchestration with Airflow, it is time to automate the code. To do this, the necessary building blocks are integrated into the Python code and tested.
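A sketch of what this can look like (Airflow 2 is assumed): the extraction and loading steps become PythonOperator tasks that are chained into a DAG, with the function bodies left as stubs:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


# Stubs standing in for the building blocks from the previous sessions.
def extract_from_api():
    ...   # API query from module 1


def upload_to_s3():
    ...   # boto3 upload sketched above


def load_into_snowflake():
    ...   # COPY INTO executed via the Snowflake connector


with DAG(
    dag_id="marketing_etl",
    start_date=datetime(2024, 1, 1),
    schedule=None,   # Airflow 2.4+; triggered manually for now
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_from_api)
    to_s3 = PythonOperator(task_id="to_s3", python_callable=upload_to_s3)
    to_dwh = PythonOperator(task_id="to_snowflake", python_callable=load_into_snowflake)

    extract >> to_s3 >> to_dwh   # the dependency chain of the pipeline

if __name__ == "__main__":
    dag.test()   # Airflow 2.5+: run the whole DAG once in-process for testing
```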
Scheduling and alerting are important Airflow components for achieving complete automation. We will show you how to define a schedule for the execution of your DAG and how to monitor runs with the help of alerting.
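A hedged example of both, assuming Airflow 2.4 or later (older versions use schedule_interval instead of schedule) and an SMTP connection configured in Airflow:

```python
from datetime import datetime, timedelta

from airflow import DAG

# Alerting settings are passed to every task via default_args; an SMTP
# connection must be configured in Airflow for the e-mails to be sent.
default_args = {
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
    "email": ["data-team@example.com"],   # hypothetical address
    "email_on_failure": True,
}

with DAG(
    dag_id="marketing_etl",
    start_date=datetime(2024, 1, 1),
    schedule="0 6 * * *",   # cron expression: run every day at 06:00
    catchup=False,
    default_args=default_args,
) as dag:
    ...   # the tasks from the sketch above
```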
In the final part, we put all the components together and carry out the complete implementation in the cloud environment.
Day 3 (Module 3)
dbt is an extremely useful tool and has become an integral part of the Modern Data Stack! In this seminar, we will cover the basics of dbt that everyone in the field of data engineering should know. We will explore the theoretical background of data modeling and create our own hands-on dbt project.
Why is dbt such a popular tool in analytics engineering? What functions can it perform? How does it interact with other tools? We will address these questions in the first part of the training.
We install dbt and set up an initial dbt project. You will then be given an overview of the project structure and the configurations the tool requires.
After the setup, we take the first steps with dbt to load the data into the data warehouse. The following topics will be covered: setting primary keys for tables, incremental vs. full loading, and tags.
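dbt models are usually written in SQL with Jinja; the same configuration options are also available in dbt's Python models, which Snowflake supports via Snowpark. A sketch with illustrative model and column names:

```python
# models/marts/fct_campaign_performance.py

def model(dbt, session):
    # Incremental materialization with a primary key (unique_key) used for
    # merging new rows, plus a tag for selective runs (dbt run --select tag:...).
    dbt.config(
        materialized="incremental",
        unique_key="campaign_date_id",   # hypothetical primary key column
        tags=["marketing"],
    )

    df = dbt.ref("stg_campaigns")   # upstream staging model (assumed name)

    if dbt.is_incremental:
        # Only process rows newer than what is already in the target table.
        max_date = session.sql(
            f"select max(report_date) from {dbt.this}"
        ).collect()[0][0]
        df = df.filter(df["REPORT_DATE"] > max_date)

    return df
```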
The first models have been written and your data is already in the data warehouse. Now it’s time to focus on data modeling. dbt is particularly well suited to this, as it tracks the dependencies between individual models with the ref function. You will learn the important details of how this works.
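A short sketch of a downstream model: the ref call is what declares the dependency and lets dbt draw the lineage graph (names again illustrative):

```python
# models/marts/agg_channel_performance.py
import snowflake.snowpark.functions as F


def model(dbt, session):
    dbt.config(materialized="table")

    # ref() declares the dependency: dbt now knows this model must run after
    # fct_campaign_performance and draws the edge in its lineage graph.
    fct = dbt.ref("fct_campaign_performance")

    return fct.group_by("CHANNEL").agg(
        F.sum("COST").alias("TOTAL_COST"),
        F.sum("CLICKS").alias("TOTAL_CLICKS"),
    )
```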
Of course, we also need a development environment for data modeling. We will show you how to set this up in dbt with little effort.
We leverage the functionality of Airflow to automate the processes; the dbt runs are integrated into DAGs for this purpose.
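A minimal sketch, assuming dbt is installed on the Airflow workers and the project lives at a hypothetical path; each dbt command becomes its own BashOperator task:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

DBT_DIR = "/opt/airflow/dbt/marketing_project"   # hypothetical project path

with DAG(
    dag_id="dbt_transformations",
    start_date=datetime(2024, 1, 1),
    schedule="0 7 * * *",   # run after the morning ETL DAG has finished
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command=f"cd {DBT_DIR} && dbt run",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command=f"cd {DBT_DIR} && dbt test",   # data quality checks
    )

    dbt_run >> dbt_test
```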
Tests are a valuable and indispensable dbt feature for ensuring that your data quality is right. They are easy to implement, and we will show you how.
Goal of the training
The goal of the training is to map the entire ETL process (Extract, Transform, Load) from an API query to a finished data warehouse. Step by step, you will learn how to build your own marketing data stack. Thanks to small training groups, we can cater to your individual needs – at a fair price. At the end of the training, you will not only know the most important processes of the data stack, but will also receive a certificate of participation.
Your trainer
Dr. Simon Hannemann is Senior Manager Data Engineering at Hopmann Marketing Analytics and a certified expert for various data engineering tools. In his day-to-day work as a consultant and team lead, he is responsible for the successful development of ETL processes for data integration and transformation. He is very keen to pass on his in-depth knowledge and extensive practical experience in his training courses.
Individual training for your company
Are you looking for in-house training on API queries, data orchestration and/or data modeling for your own team? We make it possible for you. We have already provided individual data engineering training for well-known customers and would be happy to create a specific offer for you that is tailored to your requirements.
Dates Schliersee
on request
Dates Munich
on request
Modules bookable individually
With this training course, you will take the first step towards your own marketing data stack. Our experienced and certified trainer will gradually introduce you to the basic processes and tools that are essential for this. Each day covers one module of the ETL/ELT process. The modules build on each other, but can also be booked individually (days 1 and 2 and/or day 3).
EXCLUSIVE TRAININGS ON MARKETING DATA
Cross-thematic Trainings
Hybrid Project Management | Munich, Schliersee or Inhouse on request | on request | Anyone who wants to deliver projects more effectively
Data Visualization Trainings
Professional data visualization with Power BI | Munich, Schliersee or Inhouse on request | on request | Beginners
Professional data visualization with Tableau | Munich, Schliersee or Inhouse on request | on request | Beginners
Digital Analytics Trainings
Google Analytics 4 and Google Tag Manager | Munich, Schliersee or Inhouse on request | on request | Beginners
Data Engineering Trainings
Mastering dbt: Intensive course on data modeling | Munich, Schliersee on request | on request | Data analysts, data engineers and anyone interested in modern data modeling
Build your Marketing Data Stack | Munich, Schliersee or Inhouse on request | on request | Data analysts and people with a general interest in the modern data stack