On Thu, Jan 27, 2011 at 12:57 PM, Pauli Virtanen <pa...@ik...> wrote:
> Thu, 27 Jan 2011 12:39:48 -0500, Darren Dale wrote:
> [clip]
>> Me too. I just posted the latest version of the repository to
>> github.com/darrendale/matplotlib.git . It's ~42MB, but it has a bunch of
>> unreachable objects. As soon as we figure out how to git rid of them, I
>> think we will be ready to freeze the svn repo and wrap this up.
>
> Unreachable from where? How do you know there are unreachable
> objects?
>
> Note that the snippet
>
>     git fsck --unreachable HEAD $(git for-each-ref --format="%(objectname)" refs/heads)
>
> only checks for objects unreachable from branches (by definition,
> stuff under refs/heads). However, there's also other stuff under refs/:
> tags and hidden branches. Especially the postprocess.sh script hides
> some branches.
>
> To see all that is there, check the output from
>
>     git for-each-ref

Oh, I didn't understand what I was doing with the git fsck command.

Still, even after removing the largest blob in the repo with

    git filter-branch --index-filter \
        'git rm --cached --ignore-unmatch release/osx/matplotlib-0.98.5.tar.gz' \
        -- 750059aa09340^..

the blob still exists, but is not associated with a commit according to

    git log --pretty=oneline -- release/osx/matplotlib-0.98.5.tar.gz

That blob accounts for 1/4 of the total size of the repo. It would be
nice to get rid of it, if possible.

Darren

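For locating large blobs such as the tarball mentioned above, a sketch along
these lines may help. It assumes a git recent enough that
`git cat-file --batch-check` accepts a format string; the function name
`largest_blobs` is made up for illustration and does not come from the thread.

```shell
# List the N largest blobs reachable from any ref (default N=5).
# Output columns: size in bytes, object id, path.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        grep '^blob ' |
        cut -d' ' -f2- |
        sort -k1,1nr |
        head -n "${1:-5}"
}
```

Because it walks `--all`, this also reports blobs that are only reachable from
tags or from refs outside refs/heads, which is the case discussed above.
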
On Thu, 2011-01-27 at 13:44 -0500, Darren Dale wrote:
[clip]
> Still, even after removing the largest blob in the repo with
>
>     git filter-branch --index-filter \
>         'git rm --cached --ignore-unmatch release/osx/matplotlib-0.98.5.tar.gz' \
>         -- 750059aa09340^..
>
> the blob still exists, but is not associated with a commit according to
>
>     git log --pretty=oneline -- release/osx/matplotlib-0.98.5.tar.gz
>
> That blob accounts for 1/4 of the total size of the repo. It would be
> nice to get rid of it, if possible.

I think "git log" will show you only the current branch by default. Do

    git log --pretty=oneline --all -- release/osx/matplotlib-0.98.5.tar.gz

to get all branches, and do

    for branch in `git for-each-ref --format='%(refname)'`; do
        S=`git log --pretty=oneline $branch -- release/osx/matplotlib-0.98.5.tar.gz`
        if test -n "$S"; then echo "$branch"; echo "$S"; fi
    done

to see which refs have the commits containing it.

Similarly, git-filter-branch rewrites only the current branch unless
told otherwise. To filter everything, it's best to do

    git filter-branch --index-filter \
        'git rm --cached --ignore-unmatch release/osx/matplotlib-0.98.5.tar.gz' \
        -- `git for-each-ref --format="750059aa09340^..%(refname)"`

Note that all branches and tags should be filtered in the same way:
since rewriting changes the hashes of all following commits, you end up
with incompatible histories otherwise.

After that, I get down to 34 MB.

-- 
Pauli Virtanen

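Even once every ref has been filtered, the old blob can linger in the local
repository: filter-branch saves the pre-rewrite refs under refs/original/, and
the reflogs keep the old commits reachable. A minimal sketch of the cleanup,
following the shrinking checklist in the git-filter-branch documentation. The
function name is hypothetical, and this is not necessarily how the 34 MB
figure above was obtained.

```shell
# Drop what still pins the pre-rewrite history, then prune it.
# Destructive: run only once the rewritten history has been checked.
prune_after_filter_branch() {
    # filter-branch keeps backups of the rewritten refs under refs/original/
    git for-each-ref --format='%(refname)' refs/original |
        while read -r ref; do
            git update-ref -d "$ref"
        done
    # reflog entries would otherwise keep the old commits alive
    git reflog expire --expire=now --all
    # repack and delete objects that are now unreachable
    git gc --quiet --prune=now
}
```

Adding `--aggressive` to the `git gc` call may shrink the pack further, at the
cost of a longer run. Cloning the filtered repository into a fresh directory
is another way to leave the unreachable objects behind.
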
On Thu, Jan 27, 2011 at 4:18 PM, Pauli Virtanen <pa...@ik...> wrote:
> On Thu, 2011-01-27 at 13:44 -0500, Darren Dale wrote:
> [clip]
>> Still, even after removing the largest blob in the repo with
>>
>>     git filter-branch --index-filter \
>>         'git rm --cached --ignore-unmatch release/osx/matplotlib-0.98.5.tar.gz' \
>>         -- 750059aa09340^..
>>
>> the blob still exists, but is not associated with a commit according to
>>
>>     git log --pretty=oneline -- release/osx/matplotlib-0.98.5.tar.gz
>>
>> That blob accounts for 1/4 of the total size of the repo. It would be
>> nice to get rid of it, if possible.
>
> I think "git log" will show you only the current branch by default. Do
>
>     git log --pretty=oneline --all -- release/osx/matplotlib-0.98.5.tar.gz
>
> to get all branches, and do
>
>     for branch in `git for-each-ref --format='%(refname)'`; do
>         S=`git log --pretty=oneline $branch -- release/osx/matplotlib-0.98.5.tar.gz`
>         if test -n "$S"; then echo "$branch"; echo "$S"; fi
>     done
>
> to see which refs have the commits containing it.
>
> Similarly, git-filter-branch rewrites only the current branch unless
> told otherwise. To filter everything, it's best to do
>
>     git filter-branch --index-filter \
>         'git rm --cached --ignore-unmatch release/osx/matplotlib-0.98.5.tar.gz' \
>         -- `git for-each-ref --format="750059aa09340^..%(refname)"`
>
> Note that all branches and tags should be filtered in the same way:
> since rewriting changes the hashes of all following commits, you end up
> with incompatible histories otherwise.
>
> After that, I get down to 34 MB.

You are brilliant. If you send me your address off-list, I'll send you
a bottle of scotch, or tequila, or a doughnut, or whatever you want.