2.2 - ETL Pipelines in Kestra: Detailed Walkthrough

This week, we will build ETL pipelines for Yellow and Green Taxi data from NYC's Taxi and Limousine Commission (TLC). You will:

Extract data from CSV files.
Load it into Postgres or Google Cloud (GCS + BigQuery).
Explore scheduling and backfilling workflows.

The introductory flow demonstrates a simple data pipeline: it extracts data from an HTTP REST API, transforms that data in Python, and then queries it using DuckDB. For the exercises in this stage, a new, separate Postgres database is created.
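To make the extract, transform, and query steps concrete, here is a minimal Python sketch of the same three-stage shape outside Kestra. It is an illustration under stated assumptions, not the flow used in the course: the record fields (`brand`, `price`, `quantity`) and the sample data are hypothetical, and the query step uses Python's built-in `sqlite3` module as a stand-in for DuckDB so that the example runs with the standard library alone.

```python
import json
import sqlite3
from urllib.request import urlopen


def extract(url: str) -> list[dict]:
    """Fetch a JSON array of records from an HTTP REST endpoint."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def transform(records: list[dict]) -> list[dict]:
    """Keep only the fields we need and compute a total per record.

    The field names are hypothetical and chosen only for illustration.
    """
    return [
        {"brand": r["brand"], "total": round(r["price"] * r["quantity"], 2)}
        for r in records
    ]


def query(rows: list[dict]) -> list[tuple]:
    """Aggregate totals per brand with SQL.

    SQLite is used here as a stand-in for DuckDB.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (brand TEXT, total REAL)")
        conn.executemany("INSERT INTO sales VALUES (:brand, :total)", rows)
        return conn.execute(
            "SELECT brand, ROUND(SUM(total), 2) FROM sales "
            "GROUP BY brand ORDER BY brand"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical sample records used instead of a live API call.
    sample = [
        {"brand": "Acme", "price": 10.0, "quantity": 2},
        {"brand": "Acme", "price": 5.0, "quantity": 1},
        {"brand": "Globex", "price": 2.5, "quantity": 3},
    ]
    for brand, total in query(transform(sample)):
        print(brand, total)
```

In a Kestra flow, each of these stages would typically be its own task, with the output of one task passed as input to the next; calling `extract` with a real endpoint URL replaces the sample records.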

