Will a "git-pull develop" fetch all the commits reachable from develop?

I have a question regarding how git pulls changes from a remote, and how much history it brings along.

I'm considering following the gitflow workflow on a project. There are 80 developers, and we integrate our changes from feature branches into the develop branch by means of pull requests, so that we can perform code review first.

We need to (locally) rebase our feature branches on (top of) develop, to have the latest develop changes integrated. Hence, we will be pulling develop often. Here, I don't want to fetch other teammates' feature branches, nor their commit history.

Now, if I pull develop, will this operation bring in the commit history that happened under the other feature branches, if it is reachable (through a merge commit) from develop?

Thanks in advance :-)

Edit: I might not have been clear enough:

We use rebase locally, so that the pull request on the develop branch is mergeable. We don't use merge, as it might "pollute" our feature branches when performing code review. If the pull request is accepted, then we merge with a non-fast-forward commit.
I know I can "git fetch origin develop". So here is the question: will git pull origin develop "fetch" the blue commits or the green ones? See figure git-pull-

I started on a complete answer, but it got way too long.

To answer a few specifics, your concerns are real but misguided (not your fault; the git documentation is terrible). The crucial issue is not what git fetch fetches,¹ it's what is in the commit graph of the commits you merge with git merge, and which commits get copied when you choose to run git rebase. Both depend, again, on the commit graph, and on the arguments you supply to git rebase.

The key concept is reachability. Names like origin/master (which git fetch updates) make commits reachable, and commits (which git fetch brings in) make other commits reachable. A reachable commit makes the entire chain of commits "before" that commit reachable. Merge commits, which list more than one parent commit ID, make two (or more) chains of commits reachable.
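To make reachability concrete, here is a minimal sketch in a throwaway repository. The branch names develop and feature, the commit messages, and the demo identity are all made up for illustration; the commits are empty because only the graph matters here.

```shell
#!/bin/sh
set -e
# Throwaway identity so that git commit works without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/develop    # name the initial branch develop

git commit -q --allow-empty -m "base"
git branch feature                          # feature starts at "base"
git commit -q --allow-empty -m "develop work"

git checkout -q feature
git commit -q --allow-empty -m "feature work"

git checkout -q develop
before=$(git rev-list --count develop)      # base + develop work
git merge -q --no-ff -m "merge feature" feature
after=$(git rev-list --count develop)       # adds feature work + the merge commit

echo "reachable from develop before the merge: $before"
echo "reachable from develop after the merge:  $after"
```

The merge commit has two parents, so after the merge the name develop reaches both chains, including the commit that was made on feature.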

¹Of course, whatever git fetch doesn't fetch can't possibly be reached (in your copy of the repo), since it does not exist (in your copy of the repo). I suspect that's what you're aiming for here, but it's difficult to achieve in general, and unnecessary anyway.

Remember that (1) each commit is identified by its SHA-1 hash ID, (2) each commit contains the hash ID(s) of its parent commit(s), and (3) a branch name just names one commit ID. A branch name gets a new ID stuffed into it frequently, to grow the branch (to add a regular or merge commit), or to point to commits copied by a rebase.
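You can see all three points in a scratch repository. This is only a sketch; the branch name develop and the commit messages are placeholders chosen for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity so that git commit works without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/develop

git commit -q --allow-empty -m "first"
first=$(git rev-parse develop)      # the ID the name develop holds now

git commit -q --allow-empty -m "second"
second=$(git rev-parse develop)     # same name, new ID stuffed into it
parent=$(git rev-parse develop^)    # parent ID recorded inside the new commit

git cat-file -p develop             # raw commit: tree, parent, author, committer, message
echo "develop moved from $first to $second"
echo "the new tip records $parent as its parent"
```

The parent line printed by git cat-file -p is the link that makes the earlier commit reachable from the newer one.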

Then, remember that git rebase works by copying commits. The copies have new, different IDs:

          a--b--c       [original mybranch, before rebase]
         /
...--o--o
         \
          o--o           <-- origin/theirbranch
              \
               a'-b'-c'   <-- mybranch [after rebase]
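Here is a small sketch of that copying, again in a scratch repository. For simplicity it rebases mybranch onto a local develop branch rather than onto origin/theirbranch, and uses a single commit a; the file names are invented for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity so that git commit works without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/develop

echo base > base.txt; git add base.txt; git commit -q -m "base"

git checkout -q -b mybranch
echo a > a.txt; git add a.txt; git commit -q -m "a"
old_tip=$(git rev-parse mybranch)           # the original commit a

git checkout -q develop
echo d > d.txt; git add d.txt; git commit -q -m "more develop work"

git checkout -q mybranch
git rebase -q develop                       # copies a on top of develop
new_tip=$(git rev-parse mybranch)           # the copy a'

echo "original a: $old_tip"
echo "copy a':    $new_tip"
echo "the original is still stored as a: $(git cat-file -t "$old_tip")"
```

The branch name now points to the copy, but the original commit has not been altered or deleted; it is simply no longer reachable from mybranch.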

This is guaranteed to be fine as long as no one else has names (branch or tag names) or commits that point to any of the original commits a, b, or c. If they do have such names, those existing names may (or may not) continue to point to the originals, not the new copies. Even that is fine as long as they don't use them now. If and when those names are updated to point to the new commits, the old ones become irrelevant as long as no still-reachable commits point to the old commits. If existing commits point to the "outdated" commits, though, those commits continue to point to them forever, since commits are permanent.²

²No git object can ever change. This is a fundamental guarantee that git makes. However, all git objects, including commits, that are completely unreachable are eventually removed. Git has a "garbage collector", git gc, that does this. It's a bit complicated, as there are numerous grace-period tricks to keep objects around: everything gets 14 days by default, and references (including branch, tag, and remote-tracking branch names) may have reflog entries, which make otherwise-unreachable commits reachable again. The reflog entries persist for either 30 days or 90 days by default, depending on yet another reachability computation, comparing the current hash value in the reference to the hash in the reflog entry. The garbage collector is invoked automatically whenever git thinks it might be a good idea.

On `fetch`

For instance, suppose your git fetch brings in, to your repository, origin/bobsbranch, which points to these commits:

          b1-b2-b3    <-- origin/bobsbranch
         /
...--o--o             <-- origin/develop
         \
          c1-c2-c3    <-- my_independent_work

You can rebase your work whenever you like. Meanwhile Bob can rebase his bobsbranch (though he may need to force-push the result to the server). Let's say he throws out his three commits entirely in favor of one new b4 commit. You run git fetch and pick up the new, different origin/bobsbranch; your repository now has:

          b4          <-- origin/bobsbranch
         /
        | b1-b2-b3    [a reflog entry for origin/bobsbranch]
        |/
...--o--o             <-- origin/develop
         \
          c1-c2-c3    <-- my_independent_work

The reflog-only commits won't show up in git log --all or gitk --all views, and as long as you never use any of these b* commits, they do not harm you in any way (well, they do take up a bit of space in your repository).

To avoid bringing them over even though they are harmless, you can run git fetch with instructions to avoid bringing them over. When you run the git pull convenience command with a branch name, git pull runs git fetch with instructions to bring over only that one origin/whatever branch's reachable commits, so this avoids bringing them over, unless, of course, they're reachable from what git does need, based on that one branch tip.
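This is easy to check for yourself. The sketch below builds a stand-in "server" repository in a temporary directory, with one feature branch that has been merged into develop and one that has not, and then fetches only develop into a fresh repository. The names feature_merged and feature_unmerged, and the helper function have, are invented for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity so that git commit works without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the central server.
git init -q "$work/server"
cd "$work/server"
git symbolic-ref HEAD refs/heads/develop
git commit -q --allow-empty -m "base"

git checkout -q -b feature_merged
git commit -q --allow-empty -m "merged feature work"
merged=$(git rev-parse HEAD)
git checkout -q develop
git merge -q --no-ff -m "merge feature_merged" feature_merged

git checkout -q -b feature_unmerged
git commit -q --allow-empty -m "unmerged feature work"
unmerged=$(git rev-parse HEAD)
git checkout -q develop

# A fresh local repository that asks for develop only.
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/server"
git fetch -q origin develop

# Hypothetical helper: report whether a commit exists in this repository.
have() { git cat-file -e "$1^{commit}" 2>/dev/null && echo yes || echo no; }
has_merged=$(have "$merged")
has_unmerged=$(have "$unmerged")

echo "commit from the merged feature branch present:   $has_merged"
echo "commit from the unmerged feature branch present: $has_unmerged"
```

So the history of a teammate's feature branch does come along once it has been merged into develop, because the merge commit makes it reachable; unmerged feature branches stay on the server.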

On `merge`

A "bad" case occurs when you merge in a commit that "reaches" a commit that is later copied by a rebase. For instance, suppose you have this:

...--o--o--a--b   <-- origin/feature_x
         \
          c--d    <-- feature_y

Now you decide it's time to merge origin/feature_x's commits (a and b) into your feature_y, so you make a merge commit:

...--o--o--a--b   <-- origin/feature_x
         \     \
          c--d--o   <-- feature_y

If someone else (upstream) decides to rebase and force-push feature_x, so that your origin/feature_x points to the new copies, you end up with this:

          o--a'-b'  <-- origin/feature_x
         /
...--o--o--a--b
         \     \
          c--d--o   <-- feature_y
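The following sketch reproduces that situation in a single scratch repository. As a simplification it uses a local feature_x branch in place of origin/feature_x, with one commit a instead of a and b; the file names are invented for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity so that git commit works without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/develop
echo base > base.txt; git add base.txt; git commit -q -m "base"

git checkout -q -b feature_x
echo a > a.txt; git add a.txt; git commit -q -m "a"
old_x=$(git rev-parse feature_x)            # the original commit a

git checkout -q -b feature_y develop
echo c > c.txt; git add c.txt; git commit -q -m "c"
git merge -q --no-ff -m "merge feature_x" feature_x

# develop moves on, and feature_x is rebased onto it.
git checkout -q develop
echo n > n.txt; git add n.txt; git commit -q -m "new develop work"
git checkout -q feature_x
git rebase -q develop
new_x=$(git rev-parse feature_x)            # the copy a'

if git merge-base --is-ancestor "$old_x" feature_y; then old_in_y=yes; else old_in_y=no; fi
if git merge-base --is-ancestor "$new_x" feature_y; then new_in_y=yes; else new_in_y=no; fi

echo "original a reachable from feature_y: $old_in_y"
echo "copy a' reachable from feature_y:    $new_in_y"
```

The merge commit on feature_y keeps pointing to the original, so the rebase of feature_x does nothing to remove it from your history.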

That can happen even if there is no name attached to the rebase-copied commits, if you picked them up from someone else under some other name. For instance, suppose someone else pushed feature_f and promised it was done:

       a----b
      /      \
...--o--o--e--f   <-- origin/feature_f
         \
          c--d    <-- feature_y

And you merge it, giving this:

       a----b
      /      \
...--o--o--e--f   <-- origin/feature_f
         \     \
          c--d--o   <-- feature_y

Now suppose they, or some third person, rebase a branch they have that points to b, without realizing or remembering that commit f also points to b. That is, they start with this (note that they do not have your feature_y):

       a----b     <-- myhacks
      /      \
...--o--o--e--f   <-- feature_f, origin/feature_f

Then they decide it's better to rebase myhacks onto commit e, so they run:

$ git checkout myhacks
$ git rebase 123e4567    # <id-of-e>

which produces:

       a----b
      /      \
...--o--o--e--f      <-- feature_f, origin/feature_f
            \
             a'-b'   <-- myhacks

Eventually, when you fetch (perhaps via git pull) and get the final version of myhacks (whether or not it still has that name at the time, as long as it has commits a' and b'), you will have (and retain) the original a--b commits, through commit f, and add the a'-b' chain, even though you may never have seen the branch name myhacks.

Conclusion

The "bad" case we saw above happened when your git fetch brought in commit f, via the name feature_f (in the repository you're fetching from, presumably the one stored on the central server). (You and your git renamed it origin/feature_f.) The problem is not feature_f (or origin/feature_f) itself, though, but rather myhacks: a name that neither you, nor the central server, ever saw! The person who did have that name (or maybe made it up after the fact) used it to copy commits a and b, without thinking about who had the originals. They then pushed the copies, maybe under yet another name.

The names matter at fetch and push time because git fetch and git push transfer commits using refspecs (mostly pairs of reference names, plus some ancillary stuff). Before and after that point, though, the names are mostly distractions: it's the set of commits, named by their IDs, and their reachability status, that matters.
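As an illustration of a refspec as a pair of names, the sketch below fetches straight from a path with an explicit source:destination pair, so that only one remote-tracking name is created locally. The stand-in server, the branch feature_other, and the destination name are assumptions made up for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity so that git commit works without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the central server, with two branch names.
git init -q "$work/server"
cd "$work/server"
git symbolic-ref HEAD refs/heads/develop
git commit -q --allow-empty -m "base"
git branch feature_other

# Fetch with an explicit refspec: <name on their side>:<name on our side>.
git init -q "$work/local"
cd "$work/local"
git fetch -q "$work/server" "+refs/heads/develop:refs/remotes/origin/develop"

refs=$(git for-each-ref --format='%(refname)')
echo "$refs"
```

Their refs/heads/develop becomes your refs/remotes/origin/develop, and the name feature_other never appears in your repository at all.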
