git commit archeology: digging in with grep and pickaxe

2020-02-25

In his talk “A Branch In Time” (which I wrote about previously), Tekin Süleyman discussed how an engineer can explore the git history to understand the context of changes (assuming her peers wrote meaningful commit messages).

Two tools that left me slack-jawed when I saw them in action were grep and the pickaxe.

I’ve been playing with both over the past few weeks, as well as the --patch option, and wanted to jot down the how so that I wouldn’t forget.

The rest of this post will provide an overview of git log’s:

  • --grep option
  • pickaxe options (-S and -G)
  • --patch option

Without further ado…

Grep

If the team is writing useful commit messages, then grep’ing them will be a valuable exercise.

The --grep option of git log searches the messages of every commit in the history.

From the manual:

--grep=<pattern>
   Limit the commits output to ones with log message that matches the specified pattern (regular expression). With more than one --grep=<pattern>, commits whose message matches any of the given patterns are chosen (but see --all-match).
   When --show-notes is in effect, the message from the notes is matched as if it were part of the log message.

For example, I can search the commit history of this site for feat: with:

git log --grep "feat:"

This gives me a full list of the commits which include feat: in the subject or body.

This is particularly useful if used in conjunction with Conventional Commits or some other standard commit template.

As the manual suggests, you can also stack patterns to match.

git log --grep "feat:" --grep "major"

By default, stacked patterns are combined with OR: a commit is listed if its message matches any of them. To list only the commits that match all of the patterns, add the --all-match option:

git log --grep "feat:" --grep "major" --all-match
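To convince myself of the any-versus-all behavior, here is a minimal sketch that builds a throwaway repository and runs both searches. The commit messages, the `commit` helper, and the inline `Example` identity are all made up for illustration; only `--grep` and `--all-match` come from the manual.

```shell
#!/bin/sh
# Build a disposable repository with three hypothetical commits.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# Hypothetical helper: an empty commit with an inline identity, so the
# sketch does not depend on any global git configuration.
commit() {
  git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet --allow-empty -m "$1"
}

commit "feat: add search box"
commit "feat: major redesign of the homepage"
commit "fix: major bug in pagination"

# Default: a commit is listed if its message matches ANY of the patterns.
any_match=$(git log --format=%s --grep="feat:" --grep="major")

# --all-match: a commit is listed only if its message matches ALL patterns.
all_match=$(git log --format=%s --grep="feat:" --grep="major" --all-match)

printf 'any pattern:\n%s\n\nall patterns:\n%s\n' "$any_match" "$all_match"
```

In this made-up history the first search lists all three commits, while the second lists only the homepage redesign, the one message that contains both feat: and major.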

Pickaxe

I think of the pickaxe as grep, but for the diff itself. The pickaxe actually comes from diffcore (the set of transformations git's diff commands can apply to the differences they find), but it is available for use within git log.

Specifically, the two standard ways to use the pickaxe are with the -S and the -G flags.

  • -S takes a string as an argument
  • -G takes a regular expression

While this sounds intimidating, in practice it’s not.
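A small sketch of the string-versus-regex distinction, assuming a throwaway repository with a hypothetical parser.txt file and two made-up commits (the `save` helper and the `Example` identity are illustrative only):

```shell
#!/bin/sh
# Disposable repository with a hypothetical file that changes once.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# Hypothetical helper: stage parser.txt and commit with an inline identity.
save() {
  git add parser.txt
  git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "$1"
}

echo 'result = parse_v1(input)' > parser.txt
save "add parser"
echo 'result = parse_v2(input)' > parser.txt
save "switch parser"

# -S with a literal string: lists the commit that introduced "parse_v1("
# and the commit that removed it.
literal=$(git log --format=%s -S"parse_v1(")

# -S reads its argument as plain text, so the brackets are taken literally
# and nothing in this history contains that exact text.
literal_brackets=$(git log --format=%s -S"parse_v[0-9]")

# -G reads the same argument as a regular expression, so it matches the
# added and removed lines in both commits.
regex=$(git log --format=%s -G"parse_v[0-9]")

printf -- '-S literal:\n%s\n\n-S brackets:\n%s\n\n-G regex:\n%s\n' \
  "$literal" "$literal_brackets" "$regex"
```

Here the first and third searches list both commits, and the second comes back empty.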

One key difference between the two options, noted in the manual, is what each one counts as a match. Whereas -G lists any commit whose diff adds or removes a line matching the pattern, -S lists only the commits that change the number of occurrences of the string.

From the manual:

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

  +      return !regexec(regexp, two->ptr, 1, &regmatch, 0);
  ...
  -      hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);

While git log -G"regexec\(regexp" will show this commit, git log -S"regexec\(regexp" --pickaxe-regex will not (because the number of occurrences of that string did not change).
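The same behavior can be reproduced on a small scale. The sketch below assumes a throwaway repository with a hypothetical settings.txt file: one commit adds a line and a second commit only re-indents it, so the number of occurrences never changes after the first commit.

```shell
#!/bin/sh
# Disposable repository: one commit adds a line, the next only re-indents it.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# Hypothetical helper: stage settings.txt and commit with an inline identity.
save() {
  git add settings.txt
  git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "$1"
}

printf 'options:\nretry_limit = 3\n' > settings.txt
save "add retry limit"
printf 'options:\n    retry_limit = 3\n' > settings.txt
save "indent settings"

# -S: only commits where the number of occurrences of the string changed.
count_changed=$(git log --format=%s -S"retry_limit")

# -G: every commit whose diff has an added or removed line matching the pattern.
line_touched=$(git log --format=%s -G"retry_limit")

printf -- '-S:\n%s\n\n-G:\n%s\n' "$count_changed" "$line_touched"
```

With this history, -S lists only "add retry limit", while -G also lists "indent settings", because the re-indented line shows up as a removed and an added line in that commit's diff.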

Secret Weapon: Patch

While grep and the pickaxe are useful tools, they’re made even more useful with the --patch option.

By default, the results returned by a search with git log are… just that: the git log.

However, you can see the full diff if you include the --patch option.
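As a quick sketch of what --patch adds, the following builds a throwaway repository with a single made-up commit to a hypothetical site.conf file and runs the same pickaxe search with and without the option:

```shell
#!/bin/sh
# Disposable repository with a single hypothetical commit.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

echo 'cache = on' > site.conf
git add site.conf
git -c user.name="Example" -c user.email="example@example.com" \
  commit --quiet -m "enable cache"

# Without --patch: only the log entry (trimmed here to the subject line).
summary_only=$(git log --format=%s -S"cache = on")

# With --patch: the same entry, followed by the diff the commit introduced.
with_patch=$(git log --format=%s -S"cache = on" --patch)

printf 'without --patch:\n%s\n\nwith --patch:\n%s\n' "$summary_only" "$with_patch"
```

The first result is just the subject line. The second also carries the diff for site.conf, including the added `+cache = on` line.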

This is another good opportunity to see the differences between --grep, -S and -G.

git log --grep="cachePublic: true" --patch

This returns nothing. There are no git commit messages (subject or body) with this string included in them.

What about the -S option though?

git log -S"cachePublic: true," --patch

This returns the full commit message (in this case, there wasn’t much of one) and the changed file:

commit e01fcf867905f4bcb3eb5ad8c961524547a782eb (origin/285/caching-netlify-builds, 285/caching-netlify-builds)
Author: Stephen <stephencweiss@gmail.com>
Date:   Wed Dec 18 13:20:07 2019 -0600

    add netlify cache

diff --git a/gatsby-config.js b/gatsby-config.js
index 93d491e..c29b253 100644
--- a/gatsby-config.js
+++ b/gatsby-config.js
@@ -12,6 +12,12 @@ module.exports = {
   },
   plugins: [
     'gatsby-plugin-styled-components',
+      {
+        resolve: 'gatsby-plugin-netlify-cache',
+        options: {
+            cachePublic: true,
+        }
+      },
     {
       resolve: `gatsby-source-filesystem`,
       options: {

Finally, there’s the -G option:

git log -G"cachePublic: true," --patch

Like the -S option, this returns the full commit message and the changed file(s). However, in this case it also pulled back a second commit, one where the number of times cachePublic: true, was present didn’t change, but the line was reformatted:

commit ee201712e254bfd300b77057344994c1c1bd7663
Author: Stephen <stephencweiss@gmail.com>
Date:   Thu Dec 19 09:35:37 2019 -0600

    adding global styles with styled-components

diff --git a/gatsby-config.js b/gatsby-config.js
index bacedc4..96f74db 100644
--- a/gatsby-config.js
+++ b/gatsby-config.js
@@ -15,8 +15,8 @@ module.exports = {
     {
       resolve: 'gatsby-plugin-netlify-cache',
       options: {
-            cachePublic: true,
-        }
+          cachePublic: true,
+        },
     },
     {
       resolve: `gatsby-source-filesystem`,
@@ -100,21 +100,21 @@ module.exports = {
       resolve: `gatsby-plugin-feed`,
       options: {
         feeds: [
-              {
-                  serialize: ({ query: { site, allMarkdownRemark } }) => {
-                    return allMarkdownRemark.edges.map(edge => {
-                      return Object.assign({}, edge.node.frontmatter, {
-                        date: edge.node.frontmatter.date,
-                        publish: edge.node.frontmatter.publish,
-                        updated: edge.node.frontmatter.updated,
-                        draft: edge.node.frontmatter.draft,
-                        url: site.siteMetadata.siteUrl + edge.node.fields.slug,
-                        guid: site.siteMetadata.siteUrl + edge.node.fields.slug,
-                        custom_elements: [{ 'content:encoded': edge.node.html }],
-                      })
-                    })
-                  },
-                  query: `
+            {
+              serialize: ({ query: { site, allMarkdownRemark } }) => {
+                return allMarkdownRemark.edges.map(edge => {
+                  return Object.assign({}, edge.node.frontmatter, {
+                    date: edge.node.frontmatter.date,
+                    publish: edge.node.frontmatter.publish,
+                    updated: edge.node.frontmatter.updated,
+                    draft: edge.node.frontmatter.draft,
+                    url: site.siteMetadata.siteUrl + edge.node.fields.slug,
+                    guid: site.siteMetadata.siteUrl + edge.node.fields.slug,
+                    custom_elements: [{ 'content:encoded': edge.node.html }],
+                  })
+                })
+              },
+              query: `
                   {
                     allMarkdownRemark(
                       limit: 1000,
@@ -137,9 +137,9 @@ module.exports = {
                     }
                   }
                 `,
-                  output: '/rss.xml',
-                  title: 'Code-Comments RSS Feed',
-                },
+              output: '/rss.xml',
+              title: 'Code-Comments RSS Feed',
+            },
         ],
       },
     },
@@ -210,14 +210,14 @@ module.exports = {
         ],
       },
     },
-      `gatsby-plugin-offline`,
-      `gatsby-plugin-react-helmet`,
     {
-        resolve: `gatsby-plugin-typography`,
+        resolve: `gatsby-plugin-styled-components`,
       options: {
-          pathToConfigModule: `src/utils/typography`,
+          displayName: true,
       },
     },
+      `gatsby-plugin-offline`,
+      `gatsby-plugin-react-helmet`,
     {
       resolve: `@gatsby-contrib/gatsby-plugin-elasticlunr-search`,
       options: {

commit e01fcf867905f4bcb3eb5ad8c961524547a782eb (origin/285/caching-netlify-builds, 285/caching-netlify-builds)
Author: Stephen <stephencweiss@gmail.com>
Date:   Wed Dec 18 13:20:07 2019 -0600

    add netlify cache

diff --git a/gatsby-config.js b/gatsby-config.js
index 93d491e..c29b253 100644
--- a/gatsby-config.js
+++ b/gatsby-config.js
@@ -12,6 +12,12 @@ module.exports = {
   },
   plugins: [
     'gatsby-plugin-styled-components',
+      {
+        resolve: 'gatsby-plugin-netlify-cache',
+        options: {
+            cachePublic: true,
+        }
+      },
     {
       resolve: `gatsby-source-filesystem`,
       options: {

Conclusion

Writing good commit messages is hard(er than not). It requires time and effort. However, the total cost is not that high. Practices like Conventional Commits and templates can make the writing easier. Best of all, useful commit messages help you and your team nearly immediately. Even more so if you learn about the tools that make digging through them easy!


Hi there and thanks for reading! My name's Stephen. I live in Chicago with my wife, Kate, and dog, Finn.