1.2.4 - Dockerizing the Ingestion Script
Last updated Jan 19, 2025
YouTube Video | ~18 min
In this video we will learn how to convert our Jupyter notebook into a Python script. Then we will learn a second way to ingest our data into Postgres, using our new Python script.
Recall from 1.2.2 that we loaded our taxi CSV data using a Python notebook. This is another way to achieve the same goal, and likely a more common practice.
Run the commands from your project environment, in the folder where the .ipynb notebook lives.
The goal is to let the user pass in values such as the URL or password as command-line arguments. You can read more about how to use argparse at https://docs.python.org/3/library/argparse.html. The final Python script, called ingest_data.py, can be found here. I recommend using this version because at ~11 min in the YouTube video Alexey mentions needing to add an exception handler to the code, which this version already has: the 'try-except' statement. There is a link in the resources if you are unsure what a 'try-except' statement is.
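As a rough illustration of the argparse idea described above, here is a minimal sketch of a parser for the ingestion script. Only the URL and password inputs are named in these notes; the other argument names (--user, --host, --port, --db, --table_name) and all example values are assumptions for illustration and may differ from the final ingest_data.py.

```python
import argparse


def build_parser():
    """Build a command-line parser for a hypothetical ingestion script."""
    parser = argparse.ArgumentParser(description="Ingest CSV data into Postgres")
    # Argument names other than --password and --url are assumptions.
    parser.add_argument("--user", required=True, help="Postgres user name")
    parser.add_argument("--password", required=True, help="Postgres password")
    parser.add_argument("--host", default="localhost", help="Postgres host")
    parser.add_argument("--port", type=int, default=5432, help="Postgres port")
    parser.add_argument("--db", required=True, help="database name")
    parser.add_argument("--table_name", required=True, help="table to write to")
    parser.add_argument("--url", required=True, help="URL of the CSV file")
    return parser


# Passing an explicit list stands in for what you would type in the terminal.
# Every value here is a placeholder, not a real credential or dataset URL.
example_args = build_parser().parse_args([
    "--user", "root",
    "--password", "root",
    "--db", "ny_taxi",
    "--table_name", "yellow_taxi_trips",
    "--url", "https://example.com/yellow_tripdata.csv",
])
print(example_args.user, example_args.host, example_args.port, example_args.table_name)
```

In a real script you would call parse_args() with no arguments so that the values come from the command line instead of a hard-coded list.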
This second way is still a 'manual' method. You need to manually drop your table here http://localhost:8080/ and then run a command.
This is 'Dockerizing' the python script. This method will automatically 'replace' the table according to our python script.
Terminal - Converting the jupyter notebook to a python script
Clean up your code as needed
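For reference, the conversion is usually done with the jupyter nbconvert command and its --to=script option. The sketch below wraps that command in Python so it can be checked before running. The notebook name upload-data.ipynb is a placeholder assumption; substitute the name of your own notebook.

```python
import shutil
import subprocess


def build_nbconvert_command(notebook_path):
    """Return the argument list that converts a notebook to a .py script."""
    return ["jupyter", "nbconvert", "--to=script", notebook_path]


def convert_notebook(notebook_path):
    """Run the conversion if the jupyter CLI is available; return True on success."""
    if shutil.which("jupyter") is None:
        print("jupyter not found on PATH; activate your project environment first")
        return False
    result = subprocess.run(build_nbconvert_command(notebook_path), check=False)
    return result.returncode == 0


# "upload-data.ipynb" is a placeholder name, not necessarily your notebook.
print(" ".join(build_nbconvert_command("upload-data.ipynb")))
```

The printed line is the same command you could type directly in the terminal. The generated script keeps notebook leftovers such as cell markers, which is why some clean-up is needed afterwards.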
Recall that our first way, in 1.2.2, used a Python notebook. To complete the second method you will need to drop your table, following the YouTube video.
Note that our URL will look different because the NYC taxi website no longer hosts the CSV files.
Terminal
This should print the progress of each loop iteration in your terminal window, and you should now be able to see the data at http://localhost:8080/ again (if you dropped your table).
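To show where the per-loop output and the 'try-except' statement fit together, here is a minimal sketch that uses only the standard library. It is not the course script: the iter_chunks helper is hypothetical, and counting rows stands in for the database insert that the real script performs on each chunk.

```python
import time
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows (hypothetical helper)."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def ingest(chunks):
    """Consume chunks until the iterator is exhausted; return the total row count."""
    total_rows = 0
    while True:
        try:
            t_start = time.time()
            chunk = next(chunks)  # raises StopIteration when no chunks remain
            total_rows += len(chunk)  # stand-in for writing the chunk to Postgres
            elapsed = time.time() - t_start
            print(f"inserted another chunk of {len(chunk)} rows, took {elapsed:.3f} seconds")
        except StopIteration:
            # Without this handler the loop would end with an unhandled exception.
            print("Finished ingesting data")
            break
    return total_rows


total = ingest(iter_chunks(range(250), 100))
print(f"{total} rows processed")
```

The point of the handler is that running out of chunks is the normal end of the loop, so the script should report completion and exit cleanly instead of crashing on the final call to next().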