git filter-branch command

A few notes about rewriting commit history: changing user and email for older commits.

lnxbox:~/repo$ git filter-branch --commit-filter '
  if [ "$GIT_COMMITTER_EMAIL" = "<old-email>" ];
  then
      GIT_COMMITTER_NAME="Sarang Baheti";
      GIT_AUTHOR_NAME="Sarang Baheti";
      GIT_COMMITTER_EMAIL="<new-email>";
      GIT_AUTHOR_EMAIL="<new-email>";
      git commit-tree "$@";
  else
      git commit-tree "$@";
  fi' HEAD

The script above changes the name and email on every matching commit, not just for the author but for the committer too.
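Before running this on a real repository, it may help to try it in a throwaway one. The sketch below is only an illustration: the names, the emails, and the scratch repository are made up, and `FILTER_BRANCH_SQUELCH_WARNING` is set only to skip the startup warning that newer git versions print.

```shell
#!/bin/sh
# Sketch: try the identity rewrite in a scratch repository first.
# All names and emails here are made-up placeholders.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Two commits made under the "old" identity.
export GIT_AUTHOR_NAME="Old Name" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Old Name" GIT_COMMITTER_EMAIL="old@example.com"
echo one > file.txt; git add file.txt; git commit -q -m "first"
echo two >> file.txt; git commit -q -am "second"

# Rewrite author and committer on every commit made with the old email.
git filter-branch --commit-filter '
  if [ "$GIT_COMMITTER_EMAIL" = "old@example.com" ];
  then
      GIT_COMMITTER_NAME="New Name";
      GIT_AUTHOR_NAME="New Name";
      GIT_COMMITTER_EMAIL="new@example.com";
      GIT_AUTHOR_EMAIL="new@example.com";
      git commit-tree "$@";
  else
      git commit-tree "$@";
  fi' HEAD >/dev/null

# One line per commit: author identity, then committer identity.
identities=$(git log --format='%an <%ae> | %cn <%ce>')
echo "$identities"
```

If the rewrite worked, every line of the log output should show the new name and email on both sides of the `|`.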

I tried a few things before this:

What the above does not fix is signing the older commits. I haven’t had a chance to fiddle with it yet. Some links along those lines:

This turned out to be one of the most useful links on the topic, so I am quoting a few details here:

So what is a git commit:

Ultimately, arguing about the “real” mental model is mostly pedantry. There are multiple ways of looking at a commit. The documentation tends to implicitly think of them as “full copies of the entire file tree”, which is where most of the confusion about filter-branch comes from. But often it’s important to picture them as diffs, too.

git rebase:

Rebase does a whole bunch of things. Its core task is, given the current branch and a branch that you want to “rebase onto”, it will take all commits unique to your branch, and apply them in order to the new one. Here, “apply” means “apply the diff of the commit, attempting to resolve any conflicts”. At times, it may ask you to manually resolve the conflicts, using the same tooling you use for conflicts during git merge.
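As a rough illustration of that description, here is a small sketch that replays two commits from a branch onto a newer commit. The branch names (`main`, `feature`), file names, and identity are all invented for the demo, and the files are chosen so that no conflicts come up.

```shell
#!/bin/sh
# Sketch: replay the commits unique to "feature" on top of "main".
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the initial branch explicitly

echo base > base.txt; git add base.txt; git commit -q -m "base"

# Two commits that exist only on "feature".
git checkout -q -b feature
echo f1 > feature.txt; git add feature.txt; git commit -q -m "feature 1"
echo f2 >> feature.txt; git commit -q -am "feature 2"

# Meanwhile "main" moves ahead by one commit.
git checkout -q main
echo m1 > main.txt; git add main.txt; git commit -q -m "main 1"

# Apply the diffs of "feature 1" and "feature 2", in order, on top of "main".
git checkout -q feature
git rebase -q main

history=$(git log --format=%s)
echo "$history"
```

After the rebase, the two feature commits sit on top of "main 1", and the tip of `main` is an ancestor of `feature`.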

git filter-branch:

What git filter-branch will do is for each commit in the specified branch, apply filters to the snapshot, and create a new commit. The new commit’s parent will be the filtered version of the old commit’s parent. So it creates a parallel commit DAG.
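The "parallel DAG" part can be seen in a scratch repository. This is a sketch under made-up assumptions (a file called `secret.txt`, a branch called `main`, a demo identity): it filters every snapshot with `--index-filter`, then compares the old and new tips. filter-branch keeps the pre-rewrite tip under `refs/original/`.

```shell
#!/bin/sh
# Sketch: filter-branch builds new commits next to the old ones.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main

echo a > a.txt; echo hunter2 > secret.txt
git add a.txt secret.txt; git commit -q -m "one"
echo b > b.txt; git add b.txt; git commit -q -m "two"
old_head=$(git rev-parse HEAD)

# Apply the same filter to every snapshot: drop secret.txt from the index.
git filter-branch --index-filter \
  'git rm -q --cached --ignore-unmatch secret.txt' HEAD >/dev/null

new_head=$(git rev-parse HEAD)
backup=$(git rev-parse refs/original/refs/heads/main)
files=$(git ls-tree -r --name-only HEAD)

echo "old tip: $old_head"
echo "new tip: $new_head"
echo "backup : $backup"
echo "files at new tip: $files"
```

The branch now points at a new commit, while the old chain is still reachable through the backup ref until that ref is deleted and the objects are garbage collected.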

Because the filters apply on the snapshots instead of the diffs, there’s no chance for this to cause conflicts like in git rebase. In git rebase, if I have one commit that makes changes to a file, and I change the previous commit to just remove the area of the file that was changed, I’d have a conflict and git would ask me to figure out how the changes are supposed to be applied.

In git-filter-branch, if I do this, it will just power through. Unless you explicitly write your filters to refer to previous commits, the new commit is created in isolation, so it doesn’t worry about changes to the previous commits. If you had indeed edited the previous commit, the new commit will appear to undo those changes and apply its own on top of that.

filter-branch is generally for operations you want to apply pervasively to a repository. If you just want to tweak a few commits, it won’t work, since future commits will appear to undo your changes. git rebase is for when you want to tweak a few commits.

There are many other filters, like --commit-filter (lets you discard a commit entirely), --msg-filter (rewriting commit messages), and --env-filter (changing things like author metadata or other env vars). You can see a complete list with examples in the docs.
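A minimal sketch of two of those filters used together, again in a scratch repository with made-up emails and a made-up `wip: ` message prefix. `--env-filter` adjusts the identity variables before each commit is recreated, and `--msg-filter` pipes each commit message through a command (here `sed`).

```shell
#!/bin/sh
# Sketch: combine --env-filter and --msg-filter in one pass.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="old@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo one > file.txt; git add file.txt; git commit -q -m "wip: first"
echo two >> file.txt; git commit -q -am "wip: second"

git filter-branch \
  --env-filter '
    if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]; then
        export GIT_AUTHOR_EMAIL="new@example.com"
    fi
    if [ "$GIT_COMMITTER_EMAIL" = "old@example.com" ]; then
        export GIT_COMMITTER_EMAIL="new@example.com"
    fi' \
  --msg-filter 'sed "s/^wip: //"' \
  HEAD >/dev/null

# Author email, committer email, and subject for each rewritten commit.
summary=$(git log --format='%ae %ce %s')
echo "$summary"
```

For a pure identity change this does the same job as the --commit-filter version, without having to call `git commit-tree` by hand.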


Additional Notes

There is a tool on GitHub named git-filter-repo that achieves some of this. I haven’t tried it yet, but it seems very similar to what I was doing.

A few links to follow up on:
- Another detailed reply on Stack Overflow about this topic using git filter-branch
- A similar trick using git rebase with interactivity. I haven’t explored this one yet; perhaps some other time.