Git Collaboration: Pull Requests, Code Reviews, and Branching Strategies
Master team Git workflows: pull requests, effective code reviews, GitFlow vs. trunk-based development, branch protection rules, and the engineering culture patterns that ship reliable software.
Collaboration Is Where Git Gets Real
Most Git tutorials stop at git commit. On a team, though, most of your Git time goes to pull requests, code reviews, and keeping your branch in sync with teammates. This lesson covers the patterns that make teams fast and reliable.
1. Pull Request Workflow
A pull request (PR) is a request to merge your branch into the main branch. It's the central unit of team collaboration.
The full PR lifecycle
1. Create feature branch from main
2. Make commits
3. Push branch to remote
4. Open PR: describe changes, request reviewers
5. CI runs automatically (tests, lint, type check)
6. Code review: comments, requests for changes
7. Address feedback, push more commits
8. Approval from required reviewers
9. Squash + merge into main
10. Delete feature branch
11. CI deploys to staging/production
Writing a good PR description
## What changed
- Added incremental ingestion for the orders table
- Switched from full reload to MERGE-based upsert
- Added row count validation between bronze and silver layers
## Why
The daily full reload was taking 45min and causing warehouse credits overuse.
Incremental reduces this to ~3min on average.
## How to test
1. Run `dbt run --select stg_orders --full-refresh` to rebuild
2. Run `dbt test --select stg_orders`; all tests should pass
3. Check QUERY_HISTORY in Snowflake to verify the query time
## Checklist
- [x] dbt tests pass
- [x] Row counts validated
- [x] Docs updated in schema.yml
- [x] No PII in logs
2. Code Review: How to Give Good Feedback
As a reviewer
Look for:
- Correctness: does it do what it says?
- Tests: are the right things tested?
- Security: any sensitive data leaked? SQL injection?
- Performance: obvious inefficiencies?
- Readability: can someone else maintain this?
- Conventions: follows the team's standards?
Comment types:
# Blocking (must fix before merge)
# Use a descriptive prefix to signal severity
BLOCKING: This will cause a divide-by-zero if quantity is 0.
Please add a guard: `WHERE quantity > 0` or handle in Python.
# Suggestion (non-blocking, your call)
NIT: We typically use `LOWER(TRIM(...))` for email normalization
across all our staging models. Feel free to keep as-is.
# Question (asking, not demanding)
QUESTION: Why are we filtering status = 'active' here?
The requirements doc said we should include 'pending' too.
# Praise (signal what's good)
NICE: Clean use of the SCD Type 2 merge pattern here.
This will serve as a good reference for the team.
As an author
- Keep PRs small: under 400 changed lines when possible
- Respond to every comment, even if just "Done" or "Disagree, see explanation"
- Don't take feedback personally: the code is being reviewed, not you
- Add a test if a reviewer found a bug
- Self-review your PR before requesting others
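The 400-line guideline above can be checked before you request a review. The following is a minimal sketch, not a standard tool: the function names `pr_size` and `check_pr_size` are hypothetical, and it assumes the PR targets `origin/main`. It reads the output of `git diff --numstat` (added lines, deleted lines, path, separated by tabs) from stdin.

```shell
# pr_size: sum added + deleted lines from `git diff --numstat` output on stdin.
# Binary files show up as "-<TAB>-<TAB>path" in numstat output and are skipped.
pr_size() {
  awk -F '\t' '$1 != "-" { total += $1 + $2 } END { print total + 0 }'
}

# check_pr_size [LIMIT]: read numstat output on stdin, return 1 if over LIMIT.
# The default of 400 is this lesson's guideline, not a Git or GitHub setting.
check_pr_size() {
  limit="${1:-400}"
  size="$(pr_size)"
  if [ "$size" -gt "$limit" ]; then
    echo "PR changes $size lines (limit $limit); consider splitting it"
    return 1
  fi
  echo "PR changes $size lines (within limit $limit)"
}

# Typical use from a feature branch (assumes origin/main is the merge target):
#   git diff --numstat origin/main...HEAD | check_pr_size 400
```

Reading from stdin keeps the functions independent of any particular repository, so the same check can run locally or as a CI step.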
3. Branching Strategies
GitFlow
main ── always production-ready
develop ── integration branch
feature/* ── one branch per feature, branched from develop
release/* ── stabilization before release
hotfix/* ── urgent prod fixes, branched from main
# GitFlow workflow
git switch -c feature/orders-pipeline develop
# ... make commits ...
git switch develop
git merge --no-ff feature/orders-pipeline # --no-ff always creates a merge commit, even when a fast-forward is possible
# Release
git switch -c release/1.2.0 develop
# ... test, fix bugs ...
git switch main
git merge --no-ff release/1.2.0
git tag -a v1.2.0 -m "Release 1.2.0"
git switch develop
git merge --no-ff release/1.2.0
Pros: clear structure, good for versioned releases
Cons: complex, slow, PRs queue up on develop, frequent merge conflicts
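Part of GitFlow's complexity is remembering which branch each branch type starts from and merges back into. The sketch below encodes the rules from the list above as two lookup functions. The names `gitflow_base` and `gitflow_targets` are hypothetical, and merging hotfixes into both main and develop follows the usual GitFlow convention rather than anything stated in this lesson.

```shell
# gitflow_base BRANCH: print the branch that BRANCH should be created from.
# Returns 1 for branch names that do not match a GitFlow prefix.
gitflow_base() {
  case "$1" in
    feature/*) echo develop ;;
    release/*) echo develop ;;
    hotfix/*)  echo main ;;
    *) echo "unknown branch type: $1" >&2; return 1 ;;
  esac
}

# gitflow_targets BRANCH: print the branch(es) BRANCH is merged back into.
# Assumption: hotfixes go to both main and develop, like releases.
gitflow_targets() {
  case "$1" in
    feature/*)          echo develop ;;
    release/*|hotfix/*) echo "main develop" ;;
    *) echo "unknown branch type: $1" >&2; return 1 ;;
  esac
}

# Example: start a hotfix from the right base.
#   git switch -c hotfix/fix-nulls "$(gitflow_base hotfix/fix-nulls)"
```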
Trunk-Based Development (TBD): Modern Standard
main ── everyone commits here (or merges very short-lived feature branches)
feature/* ── live max 1-2 days, then merged
Rules:
- Feature branches live < 2 days. If longer, use feature flags
- Every commit to main must pass CI
- Small, frequent PRs instead of big long-lived branches
- Continuous deployment from main
# TBD workflow
git switch -c feature/add-validation main
# ... make focused commits in 1-2 days ...
git fetch origin && git rebase origin/main # keep in sync daily
git push -u origin feature/add-validation
# Open PR, get fast review, merge
Pros: fast, low merge conflicts, matches continuous delivery
Cons: requires feature flags for WIP, requires strong CI
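The two-day rule is easier to keep when something measures it. This is a rough sketch under stated assumptions: the function names are hypothetical, a branch's age is taken to be the time since the author date of its oldest commit not yet on `origin/main`, and the default limit of 2 days mirrors the rule above.

```shell
# branch_age_days NOW_EPOCH START_EPOCH: whole days between two Unix timestamps.
branch_age_days() {
  echo $(( ($1 - $2) / 86400 ))
}

# check_branch_age NOW_EPOCH START_EPOCH [MAX_DAYS]: return 1 once the branch
# has lived MAX_DAYS or longer (default 2, matching "live < 2 days").
check_branch_age() {
  max_days="${3:-2}"
  age="$(branch_age_days "$1" "$2")"
  if [ "$age" -ge "$max_days" ]; then
    echo "branch is $age days old; merge it or put unfinished work behind a feature flag"
    return 1
  fi
  echo "branch is $age days old (limit $max_days)"
}

# Typical use from a feature branch (assumes origin/main is the trunk):
#   start="$(git log --reverse --format=%at origin/main..HEAD | head -n 1)"
#   check_branch_age "$(date +%s)" "$start"
```

The author date (`%at`) is used rather than the committer date because a daily rebase rewrites committer dates, which would make every branch look new.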
Recommendation
| Team size | Strategy |
|-----------|---------|
| 1-5 engineers | Trunk-based (simplest) |
| 6-20 engineers | Trunk-based with short-lived feature branches |
| Large teams / versioned software | GitFlow |
| Data engineering teams | Trunk-based (dbt changes are small and safe to merge often) |
4. Branch Protection Rules (GitHub)
Configure in GitHub → Settings → Branches → Branch protection rules:
main branch protection:
✓ Require pull request reviews before merging
    Required approvals: 1-2
✓ Dismiss stale pull request approvals when new commits are pushed
✓ Require status checks to pass before merging
    Required status checks:
    - CI / test (ubuntu-latest)
    - dbt-ci / dbt-build
✓ Require branches to be up to date before merging
✓ Restrict who can push to matching branches
✓ Require signed commits (optional, high security)
✓ Include administrators
5. Resolving Common Team Issues
My PR has conflicts with main
git fetch origin
git rebase origin/main # or: git merge origin/main
# resolve conflicts in editor
git add conflicted_file.py
git rebase --continue # or: git commit (for merge)
git push --force-with-lease origin feature/my-branch # force is only needed after a rebase
--force-with-lease is safer than --force: it fails if someone else pushed to the branch since you last fetched.
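A common slip in the sequence above is running git add on a file that still contains conflict markers. The sketch below is one way to check a file first; `has_conflict_markers` is a hypothetical helper, and it deliberately looks only for the `<<<<<<<` and `>>>>>>>` lines, since a bare `=======` line can legitimately appear in Markdown or reStructuredText.

```shell
# has_conflict_markers FILE: exit 0 if FILE still contains a Git conflict
# marker ("<<<<<<<" or ">>>>>>>") at the start of a line, exit 1 otherwise.
has_conflict_markers() {
  grep -Eq '^(<<<<<<<|>>>>>>>)( |$)' "$1"
}

# Typical use while resolving a rebase or merge:
#   if has_conflict_markers conflicted_file.py; then
#     echo "conflicted_file.py still has conflict markers" >&2
#   else
#     git add conflicted_file.py
#   fi
```

Git's own `git diff --check` also reports leftover conflict markers, so this helper is mainly useful where you want a per-file yes/no answer in a script.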
I committed to main by mistake
git reset --soft HEAD~1 # undo the commit on main, keep changes staged (assumes it was not pushed yet)
git switch -c feature/proper-branch # the staged changes come along to the new branch
git commit -C ORIG_HEAD # re-create the commit, reusing the original message
git push -u origin feature/proper-branch
I need a commit from another branch
git cherry-pick a3f4c2d # apply specific commit to current branch
I accidentally deleted a branch
git reflog # find the SHA of the last commit on that branch
git switch -c recovered-branch <SHA>
6. Git Hooks for Quality Gates
Hooks run automatically at certain Git events:
#!/bin/bash
# .git/hooks/pre-commit (runs before each commit; the file must be executable)
ruff check . && # lint
mypy src/ && # type check
pytest tests/ -q -x # tests (fast)
Or use the pre-commit framework:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.0
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: detect-private-key # blocks commits that contain private keys
      - id: check-merge-conflict # catches unresolved <<<<<<< markers
7. GitHub CLI: Work from the Terminal
# Install
brew install gh # macOS
winget install GitHub.cli # Windows
gh auth login
# Create a PR
gh pr create \
--title "feat: add orders incremental pipeline" \
--body "See PR template" \
--reviewer alice,bob \
--label "data-engineering"
# Review PRs
gh pr list
gh pr view 42
gh pr checkout 42 # check out PR branch locally
gh pr review 42 --approve
gh pr merge 42 --squash --delete-branch
# Issues
gh issue create --title "Bug: null orders on weekends" --label bug
gh issue list --assignee "@me"
8. Monorepo vs Polyrepo for Data Teams
Polyrepo: separate repos for each service/pipeline
- Pro: independent deployments, clear ownership
- Con: hard to share code, multiple CI configurations
Monorepo: everything in one repo (Airflow DAGs, dbt models, Python pipelines)
- Pro: atomic cross-system changes, single CI, shared tooling
- Con: larger repo, need tools like Turborepo/Nx for selective builds
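Dedicated build tools are not the only route to selective builds. For a small data monorepo, a path-based filter in CI may be enough. The sketch below is illustrative only: `affected_components` is a hypothetical function, and the directory names (dbt/, airflow/, ingestion/, infrastructure/) assume the example layout used in this lesson. Any other path, such as a root Makefile or CI config, is treated as affecting everything.

```shell
# affected_components: read changed file paths on stdin (one per line) and
# print the unique top-level components that need a build/test run.
affected_components() {
  while IFS= read -r path; do
    case "$path" in
      dbt/*)            echo dbt ;;
      airflow/*)        echo airflow ;;
      ingestion/*)      echo ingestion ;;
      infrastructure/*) echo infrastructure ;;
      *)                echo all ;;   # shared files: rebuild everything
    esac
  done | sort -u
}

# Typical use in CI (assumes origin/main is the merge target):
#   git diff --name-only origin/main...HEAD | affected_components
```

A CI job can then loop over the printed names and run only the matching test suites, falling back to a full run when `all` appears.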
For data engineering, a monorepo is usually better: dbt models, Airflow DAGs, and Python pipelines are tightly coupled. A change to an ingestion pipeline often requires a dbt model change.
data-platform/
├── dbt/ # dbt project
│   ├── models/
│   └── dbt_project.yml
├── airflow/ # Airflow DAGs
│   ├── dags/
│   └── plugins/
├── ingestion/ # Python ingestion scripts
│   ├── src/
│   └── tests/
├── infrastructure/ # Terraform
│   └── main.tf
├── .github/
│   └── workflows/
└── Makefile
Summary
| Concept | Key Point |
|---------|----------|
| PR description | What, why, how to test, checklist |
| Review comments | BLOCKING / NIT / QUESTION / NICE prefixes |
| Keep PRs small | < 400 lines, so reviewers can actually read them |
| Trunk-based dev | Default for modern data engineering teams |
| Branch protection | Require CI pass + review before merging |
| --force-with-lease | Safer than --force when rebasing |
| pre-commit hooks | Automated quality gates before every commit |
| Monorepo | Prefer for data platforms (dbt + Airflow + Python together) |
Next: GitHub Actions CI/CD, automating testing and deployment on every commit.