1.2.2 - Ingesting NY Taxi Data to Postgres
Last updated Jan 19, 2025
Youtube Video | ~29min
I recommend going through this entire video without pausing your workflow. You may run into several pgcli-related issues, so allow extra time here (possibly hours). If you get stuck, search Slack, search the FAQ, try a new virtual environment, or ask for help.
Using pgcli to connect to Postgres:
-h hostname
-p port
-u username
-d database name
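To see how the four flags fit together, here is a minimal Python sketch that assembles the pgcli command line. The values localhost, 5432, root and ny_taxi are assumptions for illustration (they are not stated in these notes), and pgcli_command is a hypothetical helper name. Use whatever values your own Postgres container was started with.

```python
import shlex


def pgcli_command(host, port, user, database):
    """Return the pgcli invocation as a list of arguments (hypothetical helper)."""
    return ["pgcli", "-h", host, "-p", str(port), "-u", user, "-d", database]


# Assumed example values; replace them with your own connection details.
cmd = pgcli_command("localhost", 5432, "root", "ny_taxi")

# Print the command so it can be pasted into a terminal.
print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch safe to try even if pgcli is not installed yet.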
This should open a Jupyter Notebook tab in your web browser. Follow along with the YouTube video to finalize your notebook.
My repo for this video can be found here
In this video we will learn how to configure and run Postgres in Docker. We will download the NY taxi dataset as a CSV file and read it into a Jupyter notebook. We will also look at the data using pgcli, but will use other options moving forward.
yellow_trip_data -
zones_data - and click:
(CSV)
Now you should have 2 .csv files locally
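A quick way to confirm both downloads look sane before loading them is to peek at the header and first few rows. This is only a sketch using Python's standard csv module. peek_csv and report are hypothetical helper names, and the file names in the usage note are placeholders, since the actual names depend on what you downloaded.

```python
import csv
from itertools import islice
from pathlib import Path


def peek_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) without reading the whole file."""
    with Path(path).open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))
    return header, rows


def report(paths):
    """Print a one-line summary per file, flagging any that are missing."""
    for p in map(Path, paths):
        if not p.exists():
            print(f"missing: {p}")
            continue
        header, rows = peek_csv(p)
        print(f"{p.name}: {len(header)} columns, {len(rows)} sample rows read")
```

For example, report(["yellow_trip_data.csv", "zones_data.csv"]) would summarize both files, assuming those are the names you saved them under.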
~minute 6
Terminal
After running, you should see postgres files in your ny_taxi_postgres_data directory
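If you would rather check this from Python than from a file browser, the sketch below reports whether the directory exists and has anything in it. data_dir_status is a hypothetical helper, and the "unreadable" branch rests on an assumption: on some systems the directory may end up owned by the container's user, so listing it from your own account can fail.

```python
from pathlib import Path


def data_dir_status(path="ny_taxi_postgres_data"):
    """Return 'missing', 'empty', 'populated', or 'unreadable' for the data directory."""
    d = Path(path)
    if not d.is_dir():
        return "missing"
    try:
        # any() stops at the first entry, so this stays cheap for large directories.
        return "populated" if any(d.iterdir()) else "empty"
    except PermissionError:
        # Assumption: the directory exists but is owned by another user.
        return "unreadable"


print(data_dir_status())
```

A result of "empty" or "missing" after starting the container suggests the volume mapping did not point at this directory.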
~minute 7
Terminal
You can now explore your dataset in the terminal window (once you have loaded some data).
If you run into issues, check out this video
Terminal
In future videos we use the zone CSV data as well. I'm unsure whether loading it was covered in a video, but I added the steps to the Jupyter notebook in my repo.
In 1.2.4 we convert our Python notebook into a Python script and test loading the data that way as well.