Building Batch Data Pipelines on Google Cloud
Course Overview
Data pipelines typically follow one of three paradigms: Extract and Load (EL), Extract, Load and Transform (ELT), or Extract, Transform and Load (ETL). This course describes which paradigm to use, and when, for batch data. It also covers several Google Cloud technologies for data transformation, including BigQuery, executing Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud using Qwiklabs.
Who Should Attend
Developers responsible for designing pipelines and architectures for data processing.
Course Objectives
- Review different methods of data loading (EL, ELT and ETL) and when to use each
- Run Hadoop on Dataproc, leverage Cloud Storage, and optimize Dataproc jobs
- Build your data processing pipelines using Dataflow
- Manage data pipelines with Data Fusion and Cloud Composer
Course Outline
- Introduction
- Introduction to Building Batch Data Pipelines
- Executing Spark on Dataproc
- Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
- Course Summary
Class times are listed in Eastern time
This is a 1-day class
When | Time | Where | How
06/02/2025 | 9:00AM - 5:00PM | Online | VILT
11/17/2025 | 9:00AM - 5:00PM | Online | VILT