2.2 - ETL Pipelines in Kestra: Detailed Walkthrough
This week, we'll build ETL pipelines for the Yellow and Green Taxi data published by NYC’s Taxi and Limousine Commission (TLC). You will:
- Extract data from CSV files. 
- Load it into Postgres or Google Cloud (GCS + BigQuery). 
- Explore scheduling and backfilling workflows. 
The introductory flow is there only to demonstrate a simple data pipeline: it extracts data from an HTTP REST API, transforms that data in Python, and then queries it with DuckDB. For this stage of the exercises, a new, separate Postgres database is created.
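To make the three steps concrete, here is a minimal plain-Python sketch of the same extract, transform, query shape, outside of Kestra. It is not the course's flow definition: the function names, the `products` key, and the `brand` and `price` fields are hypothetical choices for illustration, and the query step assumes the `duckdb` Python package is installed.

```python
import json
from urllib.request import urlopen


def extract(url: str) -> list[dict]:
    """Extract: fetch a JSON document over HTTP and return its records."""
    with urlopen(url, timeout=30) as resp:
        payload = json.load(resp)
    # Assumption: the API wraps its records in a top-level "products" key.
    return payload["products"]


def transform(records: list[dict]) -> list[tuple[str, float]]:
    """Transform: keep only (brand, price), skipping incomplete records."""
    rows = []
    for rec in records:
        # Hypothetical field names; adjust to whatever the API returns.
        if rec.get("brand") is None or rec.get("price") is None:
            continue
        rows.append((str(rec["brand"]), float(rec["price"])))
    return rows


def query(rows: list[tuple[str, float]]) -> list[tuple]:
    """Query: load the rows into in-memory DuckDB and average price per brand."""
    # Imported here so extract/transform work without DuckDB installed.
    import duckdb

    con = duckdb.connect()  # no path given, so the database lives in memory
    con.execute("CREATE TABLE products (brand VARCHAR, price DOUBLE)")
    if rows:
        con.executemany("INSERT INTO products VALUES (?, ?)", rows)
    return con.execute(
        "SELECT brand, round(avg(price), 2) AS avg_price "
        "FROM products GROUP BY brand ORDER BY avg_price DESC"
    ).fetchall()
```

Chaining the steps would look like `query(transform(extract(url)))`, where `url` points at whichever REST endpoint you want to read. In Kestra, each of these functions would roughly correspond to one task in the flow, with the output of one task passed to the next.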