
Git Collaboration: Pull Requests, Code Reviews, and Branching Strategies

Master team Git workflows — pull requests, effective code reviews, GitFlow vs trunk-based development, branch protection rules, and the engineering culture patterns that ship reliable software.

Learnixo · May 7, 2026 · 8 min read
Tags: Git, pull requests, code review, GitFlow, trunk-based, branching strategy, GitHub

Collaboration Is Where Git Gets Real

Most Git tutorials stop at git commit. On a team, though, most of your Git time goes to pull requests, code reviews, and keeping your branch in sync with teammates. This lesson covers the patterns that make teams fast and reliable.


1. Pull Request Workflow

A pull request (PR) is a request to merge your branch into the main branch. It's the central unit of team collaboration.

The full PR lifecycle

1. Create feature branch from main
2. Make commits
3. Push branch to remote
4. Open PR: describe changes, request reviewers
5. CI runs automatically (tests, lint, type check)
6. Code review: comments, requests for changes
7. Address feedback, push more commits
8. Approval from required reviewers
9. Squash + merge into main
10. Delete feature branch
11. CI deploys to staging/production
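
Steps 1 to 3 and 9 to 10 can be rehearsed locally without a hosting service. The following is a minimal sketch, assuming Git 2.23+ and a throwaway bare repository standing in for the remote; a local `git merge --squash` stands in for the host's squash-merge button, and the branch name, file names, and identity are all hypothetical. Steps 4 to 8 (review and CI) happen on the hosting platform and are not simulated.

```shell
# Rehearse the PR lifecycle in a throwaway sandbox.
# Assumptions: Git 2.23+ (for `git switch`); every name below is made up.
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/remote.git"      # stands in for the hosted remote
git init --quiet "$sandbox/work"
cd "$sandbox/work" || exit 1
git symbolic-ref HEAD refs/heads/main              # name the first branch "main"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"

echo "# data-platform" > README.md
git add README.md
git commit --quiet -m "chore: initial commit"
git push --quiet -u origin main

# Steps 1-3: create a feature branch from main, commit, push
git switch --quiet -c feature/add-validation main
echo "row_count_check: true" > validation.yml
git add validation.yml
git commit --quiet -m "wip: add validation config"
echo "null_check: true" >> validation.yml
git commit --quiet -am "wip: add null check"
git push --quiet -u origin feature/add-validation

# Step 9: squash + merge, so both feature commits land on main as one commit
git switch --quiet main
git merge --quiet --squash feature/add-validation
git commit --quiet -m "feat: add validation config"
git push --quiet origin main

# Step 10: delete the feature branch locally and on the remote.
# -D is needed because a squash merge leaves no merge ancestry for -d to detect.
git branch --quiet -D feature/add-validation
git push --quiet origin --delete feature/add-validation

git log --oneline main     # two commits: the initial one and the squashed feature
```

The squash keeps main's history to one commit per PR, at the cost of losing the individual work-in-progress commits.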

Writing a good PR description

MARKDOWN
## What changed
- Added incremental ingestion for the orders table
- Switched from full reload to MERGE-based upsert
- Added row count validation between bronze and silver layers

## Why
The daily full reload was taking 45min and causing warehouse credits overuse.
Incremental reduces this to ~3min on average.

## How to test
1. Run `dbt run --select stg_orders --full-refresh` to rebuild
2. Run `dbt test --select stg_orders` — all tests should pass
3. Check QUERY_HISTORY in Snowflake to verify the query time

## Checklist
- [x] dbt tests pass
- [x] Row counts validated
- [x] Docs updated in schema.yml
- [x] No PII in logs
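
A template like this is easy to enforce mechanically. As a rough sketch, the function below reads a PR description from stdin and reports any of the four headings that are missing. The function name `check_pr_body` is hypothetical, and the check is deliberately strict (each heading must appear on a line of its own).

```shell
# Check that a PR description read from stdin has every section of the template.
# The four headings come from the example above; the function name is hypothetical.
check_pr_body() {
    body="$(cat)"
    missing=0
    for heading in "## What changed" "## Why" "## How to test" "## Checklist"; do
        # -x: the whole line must match, -F: treat the heading as a fixed string
        if ! printf '%s\n' "$body" | grep -qxF "$heading"; then
            echo "missing section: $heading"
            missing=1
        fi
    done
    return "$missing"
}

# Example: a description that skips the testing and checklist sections
printf '%s\n' "## What changed" "- Added incremental ingestion" "" "## Why" "Full reload took 45min." \
    | check_pr_body || echo "PR description is incomplete"
```

With the GitHub CLI installed, the same check could be fed from a real PR with `gh pr view 42 --json body --jq .body | check_pr_body`.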

2. Code Review: How to Give Good Feedback

As a reviewer

Look for:

  • Correctness: does it do what it says?
  • Tests: are the right things tested?
  • Security: any sensitive data leaked? SQL injection?
  • Performance: obvious inefficiencies?
  • Readability: can someone else maintain this?
  • Conventions: follows the team's standards?

Comment types:

# Blocking (must fix before merge)
# Use a descriptive prefix to signal severity

BLOCKING: This will cause a divide-by-zero if quantity is 0.
Please add a guard: `WHERE quantity > 0` or handle in Python.

# Suggestion (non-blocking, your call)
NIT: We typically use `LOWER(TRIM(...))` for email normalization
across all our staging models. Feel free to keep as-is.

# Question (asking, not demanding)
QUESTION: Why are we filtering status = 'active' here?
The requirements doc said we should include 'pending' too.

# Praise (signal what's good)
NICE: Clean use of the SCD Type 2 merge pattern here.
This will serve as a good reference for the team.
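
Consistent prefixes also make review comments machine-countable. The sketch below assumes the comments have been exported one per line (how they are exported is left open) and tallies them by prefix; `summarize_review` is a hypothetical helper name.

```shell
# Tally review comments by severity prefix (one comment per line on stdin).
# Only BLOCKING should stop a merge; NIT, QUESTION and NICE are informational.
summarize_review() {
    awk -F: '
        /^(BLOCKING|NIT|QUESTION|NICE):/ { count[$1]++ }
        END {
            printf "BLOCKING=%d NIT=%d QUESTION=%d NICE=%d\n",
                count["BLOCKING"], count["NIT"], count["QUESTION"], count["NICE"]
        }'
}

printf '%s\n' \
    "BLOCKING: This will cause a divide-by-zero if quantity is 0." \
    "NIT: We typically use LOWER(TRIM(...)) for email normalization." \
    "NICE: Clean use of the SCD Type 2 merge pattern here." \
    | summarize_review
# prints: BLOCKING=1 NIT=1 QUESTION=0 NICE=1
```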

As an author

  • Keep PRs small — < 400 lines changed when possible
  • Respond to every comment, even if just "Done" or "Disagree — see explanation"
  • Don't take feedback personally — the code is being reviewed, not you
  • Add a test if a reviewer found a bug
  • Self-review your PR before requesting others
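
The 400-line guideline can be checked before a PR is opened. A minimal sketch, assuming the input is `git diff --numstat` output (tab-separated added count, deleted count, path); the helper names `pr_size` and `check_pr_size` are hypothetical.

```shell
# Sum added + deleted lines from `git diff --numstat` output on stdin.
# Each line is "<added>\t<deleted>\t<path>"; binary files show "-" and are skipped.
pr_size() {
    awk '$1 != "-" { total += $1 + $2 } END { print total + 0 }'
}

# Fail when the change exceeds a line budget (default 400, the guideline above).
check_pr_size() {
    limit="${1:-400}"
    size="$(pr_size)"
    if [ "$size" -gt "$limit" ]; then
        echo "PR changes $size lines (limit $limit): consider splitting it"
        return 1
    fi
    echo "PR changes $size lines (limit $limit): ok"
}

# Example with canned numstat output
printf '120\t30\tmodels/stg_orders.sql\n-\t-\tdocs/diagram.png\n300\t10\tmodels/fct_orders.sql\n' \
    | check_pr_size || true
# prints: PR changes 460 lines (limit 400): consider splitting it
```

On a real branch the input would come from `git diff --numstat origin/main...HEAD | check_pr_size`.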

3. Branching Strategies

GitFlow

main          ── always production-ready
develop       ── integration branch
feature/*     ── one branch per feature, branched from develop
release/*     ── stabilization before release
hotfix/*      ── urgent prod fixes, branched from main

Bash
# GitFlow workflow
git switch -c feature/orders-pipeline develop
# ... make commits ...
git switch develop
git merge --no-ff feature/orders-pipeline   # --no-ff always records a merge commit, even if a fast-forward is possible

# Release
git switch -c release/1.2.0 develop
# ... test, fix bugs ...
git switch main
git merge --no-ff release/1.2.0
git tag -a v1.2.0 -m "Release 1.2.0"
git switch develop
git merge --no-ff release/1.2.0

Pros: clear structure, good for versioned releases

Cons: complex, slow, PRs queue up on develop, many merge conflicts
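
The branch table above implies a rule for where each branch starts, which is easy to get wrong under pressure. The sketch below encodes that rule and prints the hotfix flow, the one path the workflow above does not show. `gitflow_base` is a hypothetical helper and the version number is made up; the commands are printed rather than executed.

```shell
# Map a GitFlow branch name to the branch it should be created from.
gitflow_base() {
    case "$1" in
        feature/*|release/*) echo "develop" ;;
        hotfix/*)            echo "main" ;;
        *) echo "unknown branch type: $1" >&2; return 1 ;;
    esac
}

# Hotfix flow: branch from main, then merge back into both main and develop
# so the fix is not lost in the next release.
branch="hotfix/1.2.1"
echo "git switch -c $branch $(gitflow_base "$branch")"
echo "git switch main && git merge --no-ff $branch"
echo "git switch develop && git merge --no-ff $branch"
```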

Trunk-Based Development (TBD) — Modern Standard

main          ── everyone commits here (or merges very short-lived feature branches)
feature/*     ── live max 1-2 days, then merged

Rules:

  • Feature branches live < 2 days. If longer, use feature flags
  • Every commit to main must pass CI
  • Small, frequent PRs instead of big long-lived branches
  • Continuous deployment from main

Bash
# TBD workflow
git switch -c feature/add-validation main
# ... make focused commits in 1-2 days ...
git fetch origin && git rebase origin/main  # keep in sync daily
git push -u origin feature/add-validation
# Open PR, get fast review, merge

Pros: fast, low merge conflicts, matches continuous delivery

Cons: requires feature flags for WIP, requires strong CI
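
The two-day rule can be checked with plain timestamp arithmetic. This is a sketch under the assumption that branch age is measured from the commit where the branch left main; `branch_age_days` and `is_stale_branch` are hypothetical helper names, and the example uses fixed timestamps so the result is reproducible.

```shell
# Age of a branch in whole days, given two Unix timestamps (seconds).
branch_age_days() {
    branched_at="$1"
    now="$2"
    echo $(( (now - branched_at) / 86400 ))
}

# Return 0 (true) when the branch has lived for at least max_days (default 2).
is_stale_branch() {
    max_days="${3:-2}"
    [ "$(branch_age_days "$1" "$2")" -ge "$max_days" ]
}

# Example with fixed timestamps three days apart
start=1700000000
now=$(( start + 3 * 86400 ))
if is_stale_branch "$start" "$now"; then
    echo "branch is $(branch_age_days "$start" "$now") days old: merge it or hide the work behind a feature flag"
fi
# prints: branch is 3 days old: merge it or hide the work behind a feature flag
```

In a real repository the first timestamp could come from `git log -1 --format=%ct "$(git merge-base origin/main HEAD)"` and the second from `date +%s`.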

Recommendation

| Team size | Strategy |
|-----------|----------|
| 1-5 engineers | Trunk-based (simplest) |
| 5-20 engineers | Trunk-based with short-lived feature branches |
| Large teams / versioned software | GitFlow |
| Data engineering teams | Trunk-based (dbt changes are small and safe to merge often) |


4. Branch Protection Rules (GitHub)

Configure in GitHub → Settings → Branches → Branch protection rules:

main branch protection:
  ✓ Require pull request reviews before merging
      Required approvals: 1-2
  ✓ Dismiss stale pull request approvals when new commits are pushed
  ✓ Require status checks to pass before merging
      Required status checks:
        - CI / test (ubuntu-latest)
        - dbt-ci / dbt-build
  ✓ Require branches to be up to date before merging
  ✓ Restrict who can push to matching branches
  ✓ Require signed commits (optional, high security)
  ✓ Include administrators
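
The same settings can be kept in version control and applied through GitHub's REST API instead of the web UI. The following is a sketch of the request body for the "update branch protection" endpoint, not a drop-in configuration: the status check names are copied from the list above and may not match the exact check names in your repository, `restrictions` is left `null` because the list does not say which users or teams may push, and signed commits are configured through a separate endpoint. `protection_payload` is a hypothetical helper name.

```shell
# Build the JSON body for GitHub's "update branch protection" REST endpoint.
protection_payload() {
    cat <<'EOF'
{
  "required_pull_request_reviews": {
    "required_approving_review_count": 1,
    "dismiss_stale_reviews": true
  },
  "required_status_checks": {
    "strict": true,
    "contexts": ["CI / test (ubuntu-latest)", "dbt-ci / dbt-build"]
  },
  "enforce_admins": true,
  "restrictions": null
}
EOF
}

protection_payload
# To apply it (needs admin rights on the repository; not run here):
#   protection_payload | gh api --method PUT \
#     "repos/{owner}/{repo}/branches/main/protection" --input -
```

Here `strict` corresponds to "Require branches to be up to date" and `enforce_admins` to "Include administrators".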

5. Resolving Common Team Issues

My PR has conflicts with main

Bash
git fetch origin
git rebase origin/main   # or: git merge origin/main
# resolve conflicts in editor
git add conflicted_file.py
git rebase --continue    # or: git commit (for merge)
git push --force-with-lease origin feature/my-branch

--force-with-lease is safer than --force — it fails if someone else pushed to the branch since you last fetched.
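
To see the lease check in action, the sketch below builds a throwaway remote with two working copies and lets one person rewrite history after the other has pushed. It assumes Git 2.23+; alice, bob, the file names, and all paths are hypothetical.

```shell
# Show --force-with-lease refusing to overwrite a teammate's push.
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/remote.git"

# Create a working copy for one person with a local identity.
new_clone() {
    git init --quiet "$sandbox/$1"
    git -C "$sandbox/$1" symbolic-ref HEAD refs/heads/main
    git -C "$sandbox/$1" config user.name "$1"
    git -C "$sandbox/$1" config user.email "$1@example.com"
    git -C "$sandbox/$1" remote add origin "$sandbox/remote.git"
}
new_clone alice
new_clone bob

# Alice pushes main and a feature branch
cd "$sandbox/alice" || exit 1
echo "v1" > pipeline.py
git add pipeline.py
git commit --quiet -m "feat: first draft"
git push --quiet origin main
git switch --quiet -c feature/my-branch
echo "v2" > pipeline.py
git commit --quiet -am "feat: alice's change"
git push --quiet -u origin feature/my-branch

# Bob adds a commit to the same branch and pushes it
cd "$sandbox/bob" || exit 1
git fetch --quiet origin
git switch --quiet -c feature/my-branch origin/feature/my-branch
echo "bob was here" > notes.txt
git add notes.txt
git commit --quiet -m "docs: bob's notes"
git push --quiet origin feature/my-branch

# Alice rewrites her commit without fetching, then tries to push
cd "$sandbox/alice" || exit 1
git commit --quiet --amend -m "feat: alice's change (reworded)"
if git push --quiet --force-with-lease origin feature/my-branch 2>/dev/null; then
    lease_result="overwrote remote"
else
    lease_result="rejected"     # Alice's view of origin/feature/my-branch is stale
fi
echo "force-with-lease: $lease_result"
# prints: force-with-lease: rejected
```

Note that the protection depends on the local remote-tracking ref being stale: if a tool fetches in the background, the lease sees the teammate's commit as known and the push goes through.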

I committed to main by mistake

Bash
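# Assumes the mistaken commit has not been pushed yet
git reset --soft HEAD~1           # undo commit, keep changes staged
git switch -c feature/proper-branch
git commit -C ORIG_HEAD           # re-create the commit, reusing its original message
git push -u origin feature/proper-branch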

I need a commit from another branch

Bash
git cherry-pick a3f4c2d   # apply specific commit to current branch

I accidentally deleted a branch

Bash
git reflog                        # find the SHA of the last commit on that branch
git switch -c recovered-branch <sha>   # <sha> is the commit found in the reflog

6. Git Hooks for Quality Gates

Hooks run automatically at certain Git events:

Bash
#!/bin/bash
# .git/hooks/pre-commit (runs before each commit; must be executable: chmod +x)
set -e                  # abort the commit on the first failing check
ruff check .            # lint
mypy src/               # type check
pytest tests/ -q -x     # tests (fast)

Or use the pre-commit framework:

YAML
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.0
    hooks:
      - id: ruff
      - id: ruff-format

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: detect-private-key       # blocks commits that contain private keys
      - id: check-merge-conflict     # catches unresolved <<<<<<< markers
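
Hooks are not limited to pre-commit. As an illustration, a pre-push hook can refuse direct pushes to main on the developer's machine. Git passes the hook one line per pushed ref on stdin, in the form `<local ref> <local sha> <remote ref> <remote sha>`. The function name `block_push_to_main` is hypothetical, and the SHAs in the example are placeholders.

```shell
# Sketch of .git/hooks/pre-push logic: refuse direct pushes to main.
# Git feeds the hook one line per ref on stdin:
#   <local ref> <local sha> <remote ref> <remote sha>
block_push_to_main() {
    while read -r local_ref local_sha remote_ref remote_sha; do
        if [ "$remote_ref" = "refs/heads/main" ]; then
            echo "direct push to main is blocked: open a pull request instead" >&2
            return 1
        fi
    done
    return 0
}

# In the real hook file the body would be:  block_push_to_main || exit 1
# Simulated input for a push of a feature branch:
echo "refs/heads/feature/x 1111111 refs/heads/feature/x 0000000" | block_push_to_main \
    && echo "push allowed"
# prints: push allowed
```

A local hook can be skipped with `git push --no-verify`, so it complements server-side branch protection rather than replacing it.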

7. GitHub CLI — Work from Terminal

Bash
# Install
brew install gh   # macOS
winget install GitHub.cli  # Windows

gh auth login

# Create a PR
gh pr create \
  --title "feat: add orders incremental pipeline" \
  --body "See PR template" \
  --reviewer alice,bob \
  --label "data-engineering"

# Review PRs
gh pr list
gh pr view 42
gh pr checkout 42    # check out PR branch locally
gh pr review 42 --approve
gh pr merge 42 --squash --delete-branch

# Issues
gh issue create --title "Bug: null orders on weekends" --label bug
gh issue list --assignee @me

8. Monorepo vs Polyrepo for Data Teams

Polyrepo: separate repos for each service/pipeline

  • Pro: independent deployments, clear ownership
  • Con: hard to share code, multiple CI configurations

Monorepo: everything in one repo (Airflow DAGs, dbt models, Python pipelines)

  • Pro: atomic cross-system changes, single CI, shared tooling
  • Con: larger repo, need tools like Turborepo/Nx for selective builds

For data engineering, a monorepo is usually better: dbt models, Airflow DAGs, and Python pipelines are tightly coupled, and a change to an ingestion pipeline often requires a matching dbt model change.

data-platform/
├── dbt/                  # dbt project
│   ├── models/
│   └── dbt_project.yml
├── airflow/              # Airflow DAGs
│   ├── dags/
│   └── plugins/
├── ingestion/            # Python ingestion scripts
│   ├── src/
│   └── tests/
├── infrastructure/       # Terraform
│   └── main.tf
├── .github/
│   └── workflows/
└── Makefile
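
Selective builds in a monorepo do not always need a dedicated tool. As a rough sketch, the changed paths of a PR can be mapped to the top-level components from the layout above, and CI can then run only the matching jobs. `affected_components` is a hypothetical helper, and the file names in the example are made up.

```shell
# Map changed file paths (one per line on stdin) to the components that need CI.
# Top-level directory names follow the layout above.
affected_components() {
    awk -F/ '
        $1 == "dbt" || $1 == "airflow" || $1 == "ingestion" || $1 == "infrastructure" {
            seen[$1] = 1
        }
        END {
            # Fixed order keeps the output stable
            n = split("dbt airflow ingestion infrastructure", order, " ")
            for (i = 1; i <= n; i++) if (order[i] in seen) print order[i]
        }'
}

printf '%s\n' "ingestion/src/orders.py" "dbt/models/stg_orders.sql" "dbt/dbt_project.yml" "Makefile" \
    | affected_components
# prints:
# dbt
# ingestion
```

On a real branch the input would come from `git diff --name-only origin/main...HEAD | affected_components`.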

Summary

| Concept | Key Point |
|---------|-----------|
| PR description | What, why, how to test, checklist |
| Review comments | BLOCKING / NIT / QUESTION / NICE prefixes |
| Keep PRs small | < 400 lines, so reviewers can actually read them |
| Trunk-based dev | Default for modern data engineering teams |
| Branch protection | Require CI pass + review before merging |
| --force-with-lease | Safer than --force when rebasing |
| pre-commit hooks | Automated quality gates before every commit |
| Monorepo | Prefer for data platforms (dbt + Airflow + Python together) |

Next: GitHub Actions CI/CD — automate testing and deployment on every commit.
