1.2.3 - Connecting pgAdmin and Postgres

Last updated Jan 19, 2025


I recommend not pausing your workflow once you run docker run -it in this video.

Youtube Video | ~10 min
https://www.youtube.com/watch?v=hCAIVe9N0ow&list=PL3MmuxUbc_hJed7dXYoJw8DoCuVHhGEQb&index=7&pp=iAQB

In this video we will talk about using pgAdmin to manage our Postgres instance running in Docker. We will learn about creating a network to connect our containers. Then we get to take a look at the front end of our work and create a new server in pgAdmin to host our taxi dataset.

pgAdmin

"pgAdmin is an open-source, web-based graphical user interface (GUI) tool primarily used to manage and administer PostgreSQL databases, allowing users to perform tasks like creating databases, tables, users, and executing SQL queries through a visual interface rather than just command-line commands; essentially, it's the primary management tool for PostgreSQL databases." - AI

Running pgAdmin & Postgres together

"Container networking refers to the ability for containers to connect to and communicate with each other, or to non-Docker workloads."

docker network create pg-network
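
If you want to sanity-check the network (an optional extra, not from the video), the Docker CLI can list networks and show which containers are attached:

# List all Docker networks; pg-network should appear
docker network ls

# Inspect pg-network; once the containers below are running,
# they will show up under "Containers"
docker network inspect pg-network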

Then we add the network and a container name to the docker run command for both containers:

# Postgres 13: credentials and database name are set via environment
# variables, data is persisted to a local folder, and container port
# 5432 is published to the host
docker run -it \
  -e POSTGRES_USER="root" \
  -e POSTGRES_PASSWORD="root" \
  -e POSTGRES_DB="ny_taxi" \
  -v $(pwd)/ny_taxi_postgres_data:/var/lib/postgresql/data \
  -p 5432:5432 \
  --network=pg-network \
  --name pg-database \
  postgres:13
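
As a quick sanity check that Postgres is up (assuming you still have pgcli installed from 1.2.2), you can connect from the host exactly as before:

# Connect to the containerized Postgres from the host;
# the password is "root", as set above
pgcli -h localhost -p 5432 -u root -d ny_taxi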

# pgAdmin 4: login credentials are set via environment variables, and
# the web UI inside the container (port 80) is published on host port 8080
docker run -it \
  -e PGADMIN_DEFAULT_EMAIL="admin@admin.com" \
  -e PGADMIN_DEFAULT_PASSWORD="root" \
  -p 8080:80 \
  --network=pg-network \
  --name pgadmin-2 \
  dpage/pgadmin4

You should now be able to reach the pgAdmin front end and log in at http://localhost:8080/.

If you stop your containers at any point, go to Docker Desktop to restart them. You can also get to your port(s) by clicking on them there - such as 8080:80.
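
If you prefer the command line over Docker Desktop (an alternative, not shown in the video), stopped containers can be restarted by name:

# Restart the stopped containers created above
docker start pg-database pgadmin-2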


Create New Server

Make sure your containers are running and head to http://localhost:8080/. After entering the email and password as discussed above, we will want to add a 'Server'.

The UI for pgAdmin 4 has changed: right-click on 'Servers', click 'Register', then click 'Server'.

General Tab - Name = 'Docker Localhost'
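
The Connection tab comes next; going by the docker run flags above, the values should be: Host name = pg-database (the container name doubles as the hostname on pg-network), Port = 5432, Username = root, Password = root.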

Now we should have a new server in pgAdmin at localhost:8080 that is connected to our dataset in Postgres on port 5432.

Resources

View the above docker run commands in my github repo:
https://github.com/Tinker0425/de-zoomcamp-my-work/tree/master/module-01/docker/video_3

Note that in 1.2.5 we will switch to docker-compose, defining both containers in a single yaml file instead.

Docker networking | docker network create - Docker Documentation
