In the previous posts, we covered Git hooks for code quality and submodules for shared code. Now let's tackle the final challenge: repositories that have grown large enough that git clone becomes a coffee break.
Whether you're dealing with large binaries, years of history, or a monorepo with dozens of projects, Git has strategies to keep things fast.
The Problem
Large repos slow everything down:
- Large binaries (ISOs, archives, ML models) bloat the repo permanently
- Deep history means cloning downloads every commit ever made
- Monorepos force you to check out code you'll never touch
Let's fix each one.
Git LFS for Large Binaries
Git LFS (Large File Storage) replaces large files with small pointer files. The actual content lives on a separate server and downloads on demand.
Set up LFS
git lfs install
git lfs track "*.iso"
git lfs track "*.tar.gz"
git add .gitattributes
git commit -m "chore(lfs): track large files"
The .gitattributes file tells Git which patterns to handle with LFS.
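To see what a tracking rule does without installing anything extra, the sketch below writes by hand the line that git lfs track "*.iso" adds to .gitattributes, then asks Git which filter applies to two paths. The scratch repository and the file names images/base.iso and README.md are made up for illustration.

```shell
set -eu
# Scratch repository in a temporary directory (illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The rule that `git lfs track "*.iso"` appends to .gitattributes.
printf '%s\n' '*.iso filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ask Git which filter each path would go through; the files need not exist.
git check-attr filter -- images/base.iso README.md
```

Paths that match the pattern report the lfs filter. Everything else reports unspecified and is stored as a normal Git object.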
CI needs LFS too
- uses: actions/checkout@v6
  with:
    lfs: true
Or manually:
- run: git lfs install
- run: git lfs pull
When to use LFS
- Binary files over 1MB that change occasionally
- Assets that most developers don't need locally
- Files that would otherwise bloat clone times
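One way to find LFS candidates is to list every blob in history above a size threshold. The following is a rough sketch using only core Git commands. The scratch repository, the file names, and the 1 MiB threshold are assumptions chosen for the demo.

```shell
set -eu
# Demo identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir assets
# One 2 MiB binary and one small text file.
dd if=/dev/zero of=assets/big.bin bs=1024 count=2048 2>/dev/null
echo "hello" > notes.txt
git add .
git commit -q -m "add sample files"

# List blobs larger than 1 MiB anywhere in history as "<bytes> <path>".
# Note: paths containing spaces are truncated by this simple awk filter.
large=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk '$1 == "blob" && $2 > 1048576 { print $2, $3 }')
echo "$large"
```

Run the pipeline part in an existing repository to see which paths are worth moving to LFS.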
Partial Clone: Skip What You Don't Need
Partial clone downloads the commits and trees (the repository's history and directory structure) but skips file contents (blobs) until you actually need them.
git clone --filter=blob:none [email protected]:yourorg/giant-repo.git
Git fetches blobs on demand as you check out files. First checkout is slower, but the initial clone is much faster.
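The on-demand behavior can be observed locally. This sketch builds a throwaway origin with three versions of one file, clones it with --filter=blob:none over a file:// URL, and counts the objects the clone has not downloaded yet. The repository names and the two uploadpack settings on the scratch origin are assumptions for the demo; a hosted remote would need to support filtering on its side.

```shell
set -eu
# Demo identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Throwaway origin with three versions of the same file.
git init -q origin-repo
cd origin-repo
for i in 1 2 3; do
  echo "version $i" > data.txt
  git add data.txt
  git commit -q -m "data v$i"
done
# Let this local origin serve filtered clones and on-demand blob requests.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true
cd ..

# A plain path would ignore --filter, so use a file:// URL.
git clone -q --filter=blob:none "file://$work/origin-repo" partial
cd partial

# Objects known to the clone but not downloaded are printed with a leading "?".
missing=$(git rev-list --objects --all --missing=print | grep -c '^?' || true)
echo "blobs not yet downloaded: $missing"
```

Only the blob needed for the checked-out commit is present. Older versions of data.txt are fetched the first time a command such as git show or a checkout of an old commit reads them.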
Best for
- CI/CD pipelines that only touch specific paths
- Developers who don't need the contents of every historical file version locally
- Repos with large files that aren't tracked by LFS
Sparse Checkout: Work on a Subset
Sparse checkout limits which directories appear in your working tree. Combined with partial clone, you only download what you need.
git clone --filter=blob:none --sparse [email protected]:yourorg/monorepo.git
cd monorepo
git sparse-checkout init --cone
git sparse-checkout set platform/terraform modules/network
Your working directory now contains only those paths. Everything else exists in Git but isn't checked out.
Add more paths later
git sparse-checkout add services/api
See what's included
git sparse-checkout list
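To try these commands without a real monorepo, here is a small sketch that builds a local repository with three directories and checks out only two of them. The file contents are placeholders, and the services/api directory stands in for code you don't want in your working tree.

```shell
set -eu
# Demo identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Throwaway "monorepo" with three project directories.
git init -q mono
cd mono
mkdir -p platform/terraform modules/network services/api
echo "tf" > platform/terraform/main.tf
echo "net" > modules/network/main.tf
echo "api" > services/api/app.py
git add .
git commit -q -m "initial layout"
cd ..

# Clone with an initially minimal working tree, then pick directories.
git clone -q --sparse "file://$work/mono" checkout
cd checkout
git sparse-checkout init --cone
git sparse-checkout set platform/terraform modules/network

# Show the configured paths and what actually landed on disk.
git sparse-checkout list
ls
```

The services directory never appears in the working tree, although its history is still part of the repository.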
Shallow Clone for CI
When CI only needs recent history (not every commit ever made), use a shallow clone:
git clone --depth=20 [email protected]:yourorg/app.git
This downloads only the last 20 commits. Fast for pipelines that just need to build and test.
Limitations
- git log only shows shallow history
- Some operations (blame, bisect) may need to fetch more
- Pushing from a shallow clone can be rejected, particularly with Git versions older than 1.9
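When one of these limitations bites, you can check whether a clone is shallow and fetch the remaining history. The sketch below does this against a throwaway local origin with 30 commits; the repository names and commit count are assumptions for the demo.

```shell
set -eu
# Demo identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Throwaway origin with 30 commits.
git init -q origin-repo
cd origin-repo
i=1
while [ "$i" -le 30 ]; do
  echo "change $i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
  i=$((i + 1))
done
cd ..

# Shallow clone over file:// (a plain path would ignore --depth).
git clone -q --depth=20 "file://$work/origin-repo" shallow
cd shallow
is_shallow=$(git rev-parse --is-shallow-repository)
before=$(git rev-list --count HEAD)

# Fetch the rest of the history when blame or bisect needs it.
git fetch -q --unshallow
after=$(git rev-list --count HEAD)
echo "shallow=$is_shallow commits before=$before after=$after"
```

If you only need a little more history, git fetch --deepen=<n> extends the clone by n commits instead of downloading everything.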
Hands-On Lab: Configure LFS and CI
Building on the repo from previous articles:
Step 1: Configure LFS
cd git-hooks-lab
git lfs install
git lfs track "*.tar.gz"
git add .gitattributes
git commit -m "chore(lfs): track archives"
Step 2: Create CI workflow
.github/workflows/infra-ci.yaml:
name: infra-ci
on: [pull_request]
jobs:
  preflight:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          submodules: recursive
          lfs: true
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform validate || true
This workflow:
- Checks out code with submodules and LFS
- Validates Terraform formatting
- Runs validation (allowing it to fail for now)
Choosing the Right Strategy
| Situation | Solution |
|---|---|
| Large binaries (ISOs, models, archives) | Git LFS |
| Slow clones due to history size | Partial clone (--filter=blob:none) |
| Monorepo, only need some directories | Sparse checkout |
| CI just needs to build, not full history | Shallow clone (--depth=N) |
You can combine these. For a monorepo with large binaries:
git clone --filter=blob:none --sparse [email protected]:yourorg/monorepo.git
cd monorepo
git lfs install
git sparse-checkout set my-project/
Troubleshooting Guide
| Problem | Cause | Fix |
|---|---|---|
| CI fails to fetch LFS | Missing LFS setup | Add lfs: true to checkout action |
| Clone takes forever | Large files in history | Use partial clone or LFS |
| "Blob not found" errors | Partial clone could not fetch a missing blob on demand | Check that the remote is reachable, then access the file again; git fetch --unshallow only applies to shallow clones |
| Sparse checkout missing files | Path not in sparse set | git sparse-checkout add <path> |
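Before picking a fix from the table, it helps to know which strategies a given clone is actually using. Here is a small helper sketch for that; describe_clone is a hypothetical name, and it assumes the remote is called origin.

```shell
set -eu
# Report which clone-size strategies are active in the repository at $1.
# Assumes the remote is named "origin".
describe_clone() {
  target=$1
  shallow=$(git -C "$target" rev-parse --is-shallow-repository)
  filter=$(git -C "$target" config --get remote.origin.partialclonefilter || echo none)
  sparse=$(git -C "$target" config --get core.sparseCheckout || echo false)
  echo "shallow=$shallow partial-filter=$filter sparse=$sparse"
}

# Demo on a fresh, ordinary repository in a temporary directory.
repo=$(mktemp -d)
git init -q "$repo"
summary=$(describe_clone "$repo")
echo "$summary"
```

A clone made with --filter=blob:none reports that filter, a --depth clone reports shallow=true, and a repository with sparse checkout enabled reports sparse=true.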
Quick Reference
# LFS
git lfs install # Enable LFS for repo
git lfs track "*.iso" # Track file type with LFS
git lfs ls-files # List LFS-tracked files
# Partial clone
git clone --filter=blob:none <url> # Clone without blobs
# Sparse checkout
git sparse-checkout init --cone # Enable sparse checkout
git sparse-checkout set path/to/dir # Checkout only specific paths
git sparse-checkout list # Show current sparse paths
# Shallow clone
git clone --depth=20 <url> # Clone with limited history
What's Next
Next week: Ansible Vault - Securing Secrets in Playbooks. We'll build a workflow for managing credentials without committing them to Git, integrate with CI, and avoid the leaks that make security teams nervous.
Happy automating!