> For the complete documentation index, see [llms.txt](https://data-engineering-zoomcamp-2025-t.gitbook.io/tinker0425/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://data-engineering-zoomcamp-2025-t.gitbook.io/tinker0425/module-2/2.3-etl-pipelines-in-kestra-google-cloud-platform/2.3.1-create-an-etl-pipeline-with-gcs-and-bigquery-in-kestra.md).

# 2.3.1 - Create an ETL Pipeline with GCS and BigQuery in Kestra

Youtube Video | \~20 min

{% embed url="<https://youtu.be/nKqjjLJ7YXs>" %}
<https://youtu.be/nKqjjLJ7YXs>
{% endembed %}

Now that you've learned how to build ETL pipelines locally using Postgres, we are ready to move to the cloud. In this section, we'll load the same Yellow and Green Taxi data to Google Cloud Platform (GCP) using:

1. Google Cloud Storage (GCS) as a data lake
2. BigQuery as a data warehouse.

***

:globe\_with\_meridians: GCS site and we will do some work here setting up our service account, permissions, and grabbing a json key.&#x20;

:warning: Be sure that your json key information is kept private and off github

To connect GCP to <mark style="background-color:purple;">Kestra</mark>, we use modify and execute our flow 4. You can learn more about KVs here

{% embed url="<https://kestra.io/docs/concepts/kv-store>" %}

:bug: I tried to change my location in flow 4, but that caused an error. Maybe the region was too large?&#x20;

{% embed url="<https://cloud.google.com/compute/docs/regions-zones>" %}

and then we run our flow 5 which will also create our bucket and bigquery dataset areas on GCP.&#x20;

:white\_check\_mark: Your flows should now have connected you kestra to GCP and you should see a bucket created

:question: Our end goal is for our flow to send out data (csv in our case) to our GCS data lake bucket where we can pass it over to bigquery. This dataset is large and would crash our computer locally.

:eyes: Run flow 6 to see how it sends our data to GCP bucket and then to bigquery&#x20;


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://data-engineering-zoomcamp-2025-t.gitbook.io/tinker0425/module-2/2.3-etl-pipelines-in-kestra-google-cloud-platform/2.3.1-create-an-etl-pipeline-with-gcs-and-bigquery-in-kestra.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
