This is a pretty simple question: as a Git newbie I was wondering if there's a way for me to output my git log to a file, preferably in some kind of serialized format like XML, JSON, or YAML. Any suggestions?
To output to a file:
git log > filename.log
To specify a format, for example to put each commit on one line:
git log --pretty=oneline > filename.log
Or, to produce a format that can be emailed via a program like sendmail:
git log --pretty=email | email-sending-script.sh
To generate JSON, YAML or XML, it looks like you need to do something like:
git log --pretty=format:"%h%x09%an%x09%ad%x09%s"
This gist (not mine) perfectly formats output in JSON: https://gist.github.com/1306223
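As a rough sketch of how that tab-separated output could be turned into JSON, the Python below splits each line on the tab that %x09 emits and leaves the escaping to the json module. The names parse_tab_log and git_log_json and the key names are illustrative assumptions, not part of git, and the sketch assumes author names and dates contain no tab characters.

```python
import json
import subprocess

# Keys are illustrative; they mirror the order of %h, %an, %ad, %s.
FIELDS = ["hash", "author", "date", "subject"]
FORMAT = "%h%x09%an%x09%ad%x09%s"


def parse_tab_log(text):
    """Turn tab-separated `git log` output into a list of dicts."""
    commits = []
    for line in text.split("\n"):
        if not line:
            continue
        # maxsplit keeps any tabs inside the subject (the last field) intact.
        values = line.split("\t", len(FIELDS) - 1)
        commits.append(dict(zip(FIELDS, values)))
    return commits


def git_log_json(repo="."):
    """Run git in `repo` and return the log as a JSON string."""
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--pretty=format:{FORMAT}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.dumps(parse_tab_log(out), indent=2)
```

Calling git_log_json() inside a repository and writing the result to a file gives valid JSON even when subjects contain quotes or backslashes, because json.dumps escapes them.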
See also:
- This worked like a charm, thanks! For future readers, here's a link to the shortcodes used by "format": kernel.org/pub/software/scm/git/docs/git-log.html – Andrew, Jan 5, 2011 at 4:47
- Hmm. Well, all the references I can find point to that same broken page, so shame on whoever took down the git documentation without a redirect. Boo. – Andrew, Jan 23, 2012 at 21:08
- More direct link to format codes: kernel.org/pub/software/scm/git/docs/… – Vladimir Panteleev, Sep 11, 2012 at 4:54
- @Heiko Rupp: To print the full commit message use %b or %B, and you could also use %N to get any commit notes. You'll also have to figure out how to escape the line breaks. FWIW, for my purposes the first line of the commit message is always sufficient. If you really want to do full-text search on commit messages (and patches, for that matter), maybe you should look at indexing your Git log with Solr as described here: garysieling.com/blog/… – Noah Sussman, Nov 3, 2013 at 20:44
- Hey, I'm the author of the referenced gist, dropping by to say that I've added another shell script which supports JSON output for "files changed" data from git log --numstat. Plus a couple of other notes: first, if you are worried about escaping special characters in commit messages, then use %f instead of %s in the format string. Secondly, the "format codes" are shown whenever you type git help log. They are listed under the heading "placeholders" (if you don't know how to search the Git help page, it's easy: just press the / key, type "placeholders", then hit the RETURN key). – Noah Sussman, Nov 3, 2013 at 20:52
I did something like this to create a minimal web api / javascript widget that would show the last 5 commits in any repository.
If you are doing this from any sort of scripting language, you really want to generate your JSON with something other than " for your quote character, so that you can escape real quotes in commit messages. (You will have them sooner or later, and it's not nice for that to break things.)
So I ended up with the terrifying but unlikely delimiter ^@^ and this command line:
var cmd = 'git log -n5 --branches=* --pretty=format:\'{%n^@^hash^@^:^@^%h^@^,%n^@^author^@^:^@^%an^@^,%n^@^date^@^:^@^%ad^@^,%n^@^email^@^:^@^%aE^@^,%n^@^message^@^:^@^%s^@^,%n^@^commitDate^@^:^@^%ai^@^,%n^@^age^@^:^@^%cr^@^},\'';
Then (in node.js) my HTTP response body is constructed from the stdout of the call to git log like this:
var out = ("" + stdout).replace(/"/gm, '\\"').replace(/\^@\^/gm, '"');
if (out[out.length - 1] == ',') {
out = out.substring (0, out.length - 1);
}
and the result is nice JSON that doesn't break with quotes.
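A variation on the same idea, sketched here in Python under stated assumptions: instead of a printable delimiter, the format string emits the ASCII unit separator (%x1f) between fields and the record separator (%x1e) after each commit, and a JSON encoder does the quoting. This is a swapped-in technique, not the code from the answer above; the names KEYS, PRETTY and log_to_json are hypothetical, and the sketch assumes those two control characters never occur in commit data.

```python
import json

FIELD_SEP = "\x1f"   # emitted by %x1f in the format string
RECORD_SEP = "\x1e"  # emitted by %x1e at the end of each commit
KEYS = ["hash", "author", "date", "email", "message"]
# Pass this to git as: git log -n5 --pretty=format:<PRETTY>
PRETTY = "%h%x1f%an%x1f%ad%x1f%aE%x1f%s%x1e"


def log_to_json(raw):
    """Convert delimiter-separated `git log` output to a JSON array string."""
    commits = []
    for record in raw.split(RECORD_SEP):
        # git puts a newline between commits; drop it.
        record = record.strip("\n")
        if not record:
            continue
        commits.append(dict(zip(KEYS, record.split(FIELD_SEP))))
    # json.dumps escapes quotes, backslashes and newlines.
    return json.dumps(commits, indent=2)
```

Because the encoder handles escaping, no string replacement pass is needed afterwards and there is no trailing comma to trim.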
- A quick workaround for escaping special characters in commit messages would be to use %f instead of %s in the format string. %f: sanitized subject line, suitable for a filename – Noah Sussman, Nov 3, 2013 at 16:02
- FWIW, a project using this approach is here – Tim Boudreau, Nov 8, 2013 at 10:47
- Worth mentioning you can now do this using ES6 template strings, negating the need to use the ^@^ delimiter and node string replacement. – Gary, Dec 18, 2017 at 15:24
- @Gary Could you provide a short example of how to accomplish that? – qwelyt, Feb 15, 2019 at 12:54
I wrote this in PowerShell to get git log data and save it as JSON or another format:
$header = @("commit","tree","parent","refs","subject","body","author","commiter")
[string] $prettyGitLogDump= (git log MyCoolSite.Web-1.4.0.002..HEAD --pretty=format:'%H|%T|%P|%D|%s|%b|%an|%cn;')
$gldata = foreach ($commit in $prettyGitLogDump.Replace("; ",';') -split ";", 0, "multiline") {
$prop = $commit -split "\|"
$hash = [ordered]@{}
for ($i=0;$i -lt $header.count;$i++) {$hash.add($header[$i],$prop[$i])}
[pscustomobject]$hash
}
$gldata | ConvertTo-Json | Set-Content -Path "GitLog.json"
The header names:
"commit","tree","parent","refs","subject","body","author","commiter"
have to be in sync with the data fields:
--pretty=format:'%H|%T|%P|%D|%s|%b|%an|%cn;'
See the pretty-format docs.
I chose the pipe | as the field separator, taking the risk that it is not used in any commit message. I used the semicolon ; as the separator between commits; I should, of course, have chosen something less common. You could write a regular expression to check whether your separators occur in the commit messages, write a more complex regular expression to match the split points, or write a PowerShell script block to define the split.
The hardest line in the code to figure out was:
$prettyGitLogDump.Replace("; ",';') -split ";", 0, "multiline"
You have to set the multiline option because there can be CR/LF in the messages, and then the split stops. You can only set multiline if the number of splits is given, hence the second parameter value 0, which means all.
(The Replace("; ",';') is just a hack: I get a space after the first commit, so I remove the space after the commit separator. There is probably a better solution.)
Anyway, I think this could be a workable solution for Windows users or PowerShell fans who want the log from git to see who made the commit and why.
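One way to avoid keeping the header names and the format string in sync by hand is to derive both from a single mapping. The following is only a sketch of that idea in Python, not a port of the PowerShell above: the names FIELDS, build_pretty_format and parse_log are hypothetical, the separators are the ASCII unit and record separator control characters rather than | and ;, and a field-count check reports the case where a separator leaks into the data.

```python
# One mapping is the single source of truth for both names and placeholders.
FIELDS = {
    "commit": "%H", "tree": "%T", "parent": "%P", "refs": "%D",
    "subject": "%s", "body": "%b", "author": "%an", "committer": "%cn",
}
FIELD_SEP = "\x1f"   # produced by %x1f
COMMIT_SEP = "\x1e"  # produced by %x1e


def build_pretty_format():
    """Return the value for --pretty=format:, derived from FIELDS."""
    return "%x1f".join(FIELDS.values()) + "%x1e"


def parse_log(raw):
    """Split raw output into dicts, failing loudly on a field-count mismatch."""
    commits = []
    for chunk in raw.split(COMMIT_SEP):
        # git separates commits with a newline; remove it from the chunk start.
        chunk = chunk.lstrip("\n")
        if not chunk:
            continue
        values = chunk.split(FIELD_SEP)
        if len(values) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
        commits.append(dict(zip(FIELDS, values)))
    return commits
```

Adding or removing an entry in FIELDS then changes the format string and the output keys together, and multi-line bodies need no special handling because the split is not line-based.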
git log has no provisions to escape quotes, backslashes, newlines etc., which makes robust direct JSON output impossible (unless you limit yourself to a subset of fields with predictable content, e.g. hash & dates).
However, the %w([width[,indent1[,indent2]]]) specifier can indent lines, making it possible to emit robust YAML with arbitrary fields! The parameters are the same as for git shortlog -w:
Linewrap the output by wrapping each line at width. The first line of each entry is indented by indent1 spaces, and the second and subsequent lines are indented by indent2 spaces.
width, indent1, and indent2 default to 76, 6 and 9 respectively.
If width is 0 (zero) then indent the lines of the output without wrapping them.
The YAML syntax that guarantees predictable parsing of indented text (with no other quoting needed!) is the YAML block literal (not folded) with an explicit number of spaces to strip, e.g. |-2.
- The - option strips trailing newlines, which technically is lossy: it loses the distinction between 1 vs. 4 trailing newlines, which are possible to produce with git commit --cleanup=verbatim, so in theory you might want |+2 for multiline fields. Personally, I prefer treating one-line commit messages as strings without newline characters.
Example:
git log --pretty=format:'- hash: %H
  author_date: %aI
  committer_date: %cI
  author: |-2
    %w(0,0,4)%an <%ae>%w(0,0,0)
  committer: |-2
    %w(0,0,4)%cn <%ce>%w(0,0,0)
  message: |-2
    %w(0,0,4)%B%w(0,0,0)'
%w affects later % placeholders but also literal lines, so resetting the indent with %w(0,0,0) is necessary; otherwise text like committer: will also get indented.
Output fragment (on the reconstructed Unix history):
- hash: 0d54a08ec260f3d554db334092068680cdaff26a
  author_date: 1972-11-21T14:35:16-05:00
  committer_date: 1972-11-21T14:35:16-05:00
  author: |-2
    Ken Thompson <[email protected]>
  committer: |-2
    Ken Thompson <[email protected]>
  message: |-2
    Research V2 development
    Work on file cmd/df.s
    Co-Authored-By: Dennis Ritchie <[email protected]>
    Synthesized-from: v2
- hash: 4bc99f57d668fda3158d955c972b09c89c3725bd
  author_date: 1972-07-22T17:42:11-05:00
  committer_date: 1972-07-22T17:42:11-05:00
  author: |-2
    Dennis Ritchie <[email protected]>
  committer: |-2
    Dennis Ritchie <[email protected]>
  message: |-2
    Research V2 development
    Work on file c/nc0/c00.c
    Synthesized-from: v2
Note I didn't put quotes around the dates, to show off parsing as native YAML timestamp type. Support may vary (especially since YAML v1.2 spec delegated things like timestamps and booleans to app-dependent schema?), so it may be pragmatic to make them "strings"...
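To make the indentation rule concrete, here is a small Python sketch that builds the same kind of entry from data already held in a dict: every non-empty line of a value is indented by four spaces under a |-2 block literal, which is roughly the effect the format string gets from its literal spaces plus %w(0,0,4). The function names and dict keys are illustrative assumptions, and the sketch does not remove control characters that YAML forbids.

```python
def indent_block(text, spaces=4):
    """Indent every non-empty line by a fixed number of spaces."""
    pad = " " * spaces
    return "\n".join(pad + line if line else "" for line in text.split("\n"))


def commit_to_yaml(commit):
    """Render one commit dict as a YAML sequence entry with block literals."""
    lines = [f"- hash: {commit['hash']}"]
    for key in ("author", "committer", "message"):
        # |-2: literal block, strip trailing newlines, content is indented
        # two spaces deeper than the mapping keys (which sit at column 2).
        lines.append(f"  {key}: |-2")
        lines.append(indent_block(commit[key].rstrip("\n")))
    return "\n".join(lines) + "\n"
```

Quotes, colons and leading spaces inside a value pass through unchanged, since a block literal with an explicit indentation indicator needs no further quoting.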
To convert YAML to JSON, you can pipe through yq or similar tools.
Or a short in-place script e.g.
| ruby -e '
require "yaml"
require "json"
# to parse timestamps
require "date"
data = YAML.safe_load(STDIN.read(), permitted_classes: [Date, Time])
puts(JSON.pretty_generate(data))
'
- You might also want to delete / transform control characters, which are allowed in subject and body, but not in YAML: git log --pretty=format... | tr -d '\000-\010\013\014\016-\037' – arve0, Jan 6, 2025 at 19:24
- You might also want to split into documents, which makes parsing in yq faster: git log --pretty=format:'---%nhash: %H%nauthor_date: %aI%ncommitter_date: %cI%nauthor: |-2%n %w(0,0,4)%an <%ae>%w(0,0,0)%ncommitter: |-2%n %w(0,0,4)%cn <%ce>%w(0,0,0)%nmessage: |-2%n %w(0,0,4)%B%w(0,0,0)' – arve0, Jan 6, 2025 at 19:25
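If the log text is already being handled in a script, the control-character cleanup from the first comment above (octal 000-010, 013, 014 and 016-037) can be done without tr. The following is a minimal Python sketch; the function name strip_control_chars is an assumption, and it removes only that same set, keeping tab, line feed and carriage return.

```python
# Code points 0-8, 11, 12 and 14-31: the same set as the octal ranges
# \000-\010, \013, \014 and \016-\037. Tab, LF and CR are kept.
_CONTROL = dict.fromkeys([*range(0x00, 0x09), 0x0B, 0x0C, *range(0x0E, 0x20)])


def strip_control_chars(text):
    """Delete the control characters listed above from `text`."""
    # A translation table that maps a code point to None deletes it.
    return text.translate(_CONTROL)
```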
There is also jc, which can convert the output of many Unix tools into JSON. To get a git log in JSON and then filter it with jq, you would simply do:
jc git log | jq '.[] | ...'
It seems that, for JSON, you need to escape quotes. This script works quite well:
git log --pretty=format:'{%n \"commit\": \"%H\",%n \"author\": \"%an\",%n \"date\": \"%ad\",%n \"message\": \"%f\"%n},'
The result is like this:
{
"commit": "e31231231239a67e42175e9201823e70",
"author": "jack",
"date": "Thu Oct 19 11:20:39 2023 +0200",
"message": "commit message 1"
},
{
"commit": "14594ad80d4949b0f1c10b180740083qe",
"author": "jill",
"date": "Wed Oct 18 11:07:02 2023 +0200",
"message": "commit message 2"
},
{
"commit": "123asd1231g2f3gh12f3h12f3",
"author": "john",
"date": "Wed Oct 18 10:56:51 2023 +0200",
"message": "commit message 3"
},
You need to add [ at the start and ] at the end, and remove the trailing comma after the last object, to get a valid JSON array.
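That final wrapping step could be sketched in Python as follows. The function name wrap_as_array is hypothetical, and the sketch assumes the objects themselves are already valid JSON, which holds only while fields such as the author name contain no quotes or backslashes.

```python
import json


def wrap_as_array(raw):
    """Strip the trailing comma and wrap the objects in [ and ]."""
    body = raw.strip()
    if body.endswith(","):
        body = body[:-1]
    return "[" + body + "]"
```

Passing the result through json.loads is a quick way to confirm the output really parses before saving it.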
Behold https://github.com/dreamyguy/gitlogg, the last git-log => JSON parser you will ever need!
Some of Gitlogg's features are:
- Parse the git log of multiple repositories into one JSON file.
- Introduced repository key/value.
- Introduced files changed, insertions and deletions keys/values.
- Introduced impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
- Sanitise double quotes " by converting them to single quotes ' on all values that allow or are created by user input, like subject.
- Nearly all the pretty=format: placeholders are available.
- Easily include / exclude which keys/values will be parsed to JSON by commenting out/uncommenting the available ones.
- Easy to read code that's thoroughly commented.
- Script execution feedback on console.
- Error handling (since path to repositories needs to be set correctly).
Success: the JSON was parsed and saved.
Error 001: path to repositories does not exist.
Error 002: path to repositories exists, but is empty.
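To illustrate what the files changed, insertions, deletions and impact values mean, here is a small Python sketch that derives them from the per-file lines git log --numstat prints (added count, deleted count and path, separated by tabs). It is not Gitlogg's implementation; the function name and the key spellings are assumptions made for the example.

```python
def summarize_numstat(numstat_text):
    """Sum per-file counts from `git log --numstat` lines for one commit."""
    files_changed = insertions = deletions = 0
    for line in numstat_text.split("\n"):
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a numstat line
        added, removed, _path = parts
        files_changed += 1
        # Binary files report "-" instead of numbers; count them as 0 lines.
        insertions += int(added) if added.isdigit() else 0
        deletions += int(removed) if removed.isdigit() else 0
    return {
        "files_changed": files_changed,
        "insertions": insertions,
        "deletions": deletions,
        "impact": insertions - deletions,
    }
```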
This will format commits as YAML, including filenames edited in the commits:
#!/usr/bin/env bash
# usage:
# ./commits-as-yaml.sh [rev-list options]
#
# examples, show last 3 commits as yaml:
# ./commits-as-yaml.sh HEAD~3..HEAD
# show all commits in branch as yaml:
# ./commits-as-yaml.sh
set -eo pipefail
# Define yaml_format outside main function, so lines are not indented.
#
# Explicitly set the text indentation to 2 spaces with |2, so that
# subjects, filenames or bodies that start with spaces do not trip YAML parsers.
# YAML parsers typically assume the indentation of a text block to be the same
# number of spaces as the first line of the text block.
yaml_format='---
commit_hash: %H
commit_date: %cI
committer_name: %cn
committer_email: %ce
author_date: %aI
author_name: %an
author_email: %ae
subject: |2
%s
body: |2
%b
files: |2'
main() {
# filename from --name-only as utf-8
git config core.quotepath off
# prefix that does not exist in subjects, bodies or filenames,
# such that we can indent subjects, bodies and filenames without indenting keys
prefix="💈"
format=$(add_prefix_to_keys "$yaml_format")
git log --name-only --pretty=tformat:"$format" "$@" \
| remote_empty_lines \
| indent_values \
| remove_prefix_from_keys \
| remove_invalid_yaml_chars \
| files_to_array
}
remote_empty_lines() {
sed '/^$/d'
}
add_prefix_to_keys() {
# all lines in yaml_format that do not start with % are keys
echo "$1" | sed -E "s/^([^%].*)/$prefix\1/"
}
indent_values() {
# indent all lines that do not start with the prefix
sed -E "s/^([^$prefix].*)/  \1/"
}
remove_prefix_from_keys() {
sed "s/^$prefix//"
}
remove_invalid_yaml_chars() {
# removes control characters that are invalid in yaml
tr -d '\000-\010\013\014\016-\037'
}
files_to_array() {
yq '.files |= [(split("\n")[] | select(length > 0))]'
# cat -
}
main "$@"
Convert to JSON or XML:
./commits-as-yaml.sh | yq --output-format json
./commits-as-yaml.sh | yq --output-format xml # might want to adjust YAML-format for better XML
Inspired by answers:
One nice approach for JSON output is described on the https://til.simonwillison.net/jq/git-log-json blog.
It uses the jq command-line processor to do the job.
git log --pretty=format:'%H%x00%an <%ae>%x00%ad%x00%s%x00' | \
jq -R -s '[split("\n")[:-1] | map(split("\u0000")) | .[] | {
"commit": .[0],
"author": .[1],
"date": .[2],
"message": .[3]
}]'
The output looks like this:
[
{
"commit": "3feed1f66e2b746f349ee56970a62246a18bb164",
"author": "Simon Willison <[email protected]>",
"date": "Wed Mar 22 15:54:35 2023 -0700",
"message": "Re-applied Black"
},
{
"commit": "d97e82df3c8a3f2e97038d7080167be9bb74a68d",
"author": "Simon Willison <[email protected]>",
"date": "Wed Mar 22 15:49:39 2023 -0700",
"message": "?_extra= support and TableView refactor to table_view"
},
{
"commit": "56b0758a5fbf85d01ff80a40c9b028469d7bb65f",
"author": "Simon Willison <[email protected]>",
"date": "Wed Mar 8 12:52:25 2023 -0800",
"message": "0.64 release notes, refs #2036"
},
{
"commit": "25fdbe6b27888b7ccf1284c0304a8eb282dbb428",
"author": "Simon Willison <[email protected]>",
"date": "Wed Mar 8 12:33:23 2023 -0800",
"message": "use tmpdir instead of isolated_filesystem, refs #2037"
}
]