I would like to fetch only the commits of branchA not present in its base branchB.
For example, consider this history:
B1 - B2 - B3 - B4 - B5
 \
  A1 - A2 - A3
I would like to fetch only A1, A2 and A3.
It's important to note that I don't know up front which commit is A1, or how many commits I need to fetch.
My input is just the heads of the two branches; in this example, branchA=A3 and branchB=B5.
Based on such input I need to identify A1 and fetch everything between A1 and branchA, and ideally nothing more.
Alternatively, fetching a minimal set of commits that includes A1, A2 and A3, plus enough information to identify A1, would also be acceptable.
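To make the target concrete: in a full clone, the set I'm after is just `branchB..branchA`, and the fork point is the merge base of the two heads. Here is a minimal sketch that rebuilds the example history in a throwaway repository. The `commit` helper, the temporary directory and the demo identity are made up for illustration, and I'm assuming branchA forks from B1 as drawn above.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository reproducing the example history (illustrative only).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
# Name the initial (still unborn) branch branchB.
git symbolic-ref HEAD refs/heads/branchB

# Hypothetical helper: create an empty commit with the given subject.
commit() { git commit -q --allow-empty -m "$1"; }

commit B1
git branch branchA                 # branchA forks from B1
for m in B2 B3 B4 B5; do commit "$m"; done
git checkout -q branchA
for m in A1 A2 A3; do commit "$m"; done

# Commits reachable from branchA but not from branchB, oldest first.
only_in_a=$(git log --reverse --format=%s branchB..branchA | paste -sd' ' -)
# Fork point: the best common ancestor of the two heads.
fork_subject=$(git log -1 --format=%s "$(git merge-base branchA branchB)")

echo "only in branchA: $only_in_a"
echo "fork point: $fork_subject"
```

This only shows what the result should be once enough history is available locally; the actual problem is getting to that point without fetching the full history of both branches.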
Why? In a use case where I only need those commits ("what changed in branchA relative to branchB"), fetching more than the necessary commits slows down my process. Take for example a large repository with thousands of commits, and feature branches with only a few commits. Fetching the entire history of branchA and branchB fetches a lot of commits I don't need, and takes a lot of time and network bandwidth.
I came up with an ugly hack that avoids fetching the full history, by starting from shallow clones, and incrementally fetching more and more until a common commit is found:
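# Start with only the tip commit of branchA.
git clone --depth 1 "$repo" --branch "$branchA" shallow
cd shallow
# Double the depth each round, up to 1024.
for ((depth = 8; depth <= 1024; depth *= 2)); do
    echo "trying depth $depth ..."
    # Deepen branchA (the only branch the shallow clone tracks).
    git fetch --depth "$depth"
    # Fetch branchB at the same depth into a local branch of the same name.
    git fetch --depth "$depth" origin "$branchB:$branchB"
    # Oldest branchB commit currently available locally.
    lastrev=$(git rev-list --reverse "$branchB" | head -n1)
    # If it is an ancestor of branchA's head, a common commit has been reached.
    if git merge-base --is-ancestor "$lastrev" HEAD; then
        echo "found with depth=$depth"
        break
    fi
done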
This works for my use case: it fetches a large enough subset of commits to identify A1 and include commits until the head of branchA, and it's faster than fetching the complete history of the two branches.
Is there a better way than this? I'm looking for a pure Git solution, but if the GitHub API has something to make this faster and easier, that can be interesting too.