When I run
git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'
in a Linux terminal, the output is correct:
added lines: 23322, removed lines: 8536, total lines: 14786
Since I don't want to remember such a complex command, I wrote a Python script to do the same thing:
import os
GitCommand = 'git log --pretty=tformat: --numstat | awk "{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }"'
report = os.system(GitCommand)
But when I run it, awk reports a syntax error:
awk: cmd. line:1: { add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: { add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s
awk: cmd. line:1: ^ syntax error
I have also tried using subprocess, and the output is similar. The problem probably lies in how the command string is encoded, especially the quotation marks, but I don't know how to fix it.
3 Answers
Python script
(This section answers the question directly by providing a Python script that does what the asker wants. A shell script may be more appropriate, however; see the "Shell script" section for more on this.)
I made a Python script that accomplishes what you want by taking your script and making two modifications:
- I changed the intended double quotes around awk's only parameter to escaped single quotes, \'.
- I changed the literal newline \n to an escaped version of "\n", namely \\n.
Here is an example of output showing the modified script working:
$ git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'
added lines: 5, removed lines: 1, total lines: 4
$ cat script.py
import os
GitCommand = 'git log --pretty=tformat: --numstat | awk \'{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\\n", add, subs, loc }\''
report = os.system(GitCommand)
$ python3 script.py
added lines: 5, removed lines: 1, total lines: 4
The following shows a diff of the Python script, from the question's version to this answer's working version:
$ git diff head~ head --word-diff-regex=. script.py
diff --git a/script.py b/script.py
[...]
--- a/script.py
+++ b/script.py
@@ -1,3 +1,3 @@
import os
GitCommand = 'git log --pretty=tformat: --numstat | awk [-"-]{+\'+}{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\{+\+}n", add, subs, loc }[-"-]{+\'+}'
report = os.system(GitCommand)
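Both modifications exist only to get the right characters through Python's string literal. As a sketch of an alternative (not part of the script above), a raw triple-quoted string lets the command be written exactly as it is typed in the terminal. The variable names escaped and raw are purely illustrative:

```python
# The escaped form used in the working script above.
escaped = 'git log --pretty=tformat: --numstat | awk \'{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\\n", add, subs, loc }\''

# A raw triple-quoted string keeps backslashes and both quote styles verbatim.
raw = r"""git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'"""

# Both literals produce the same command text.
print(escaped == raw)  # True

# "\\n" is two characters (backslash, n) that awk later turns into a newline;
# "\n" is one real newline character, which splits awk's string in two.
print(len("\\n"), len("\n"))  # 2 1
```

Either string can be passed to os.system unchanged; the raw form is simply easier to compare against the original shell command.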
Shell script
As mentioned elsewhere on this page, a shell script file may be the most appropriate way to invoke a given shell command simply and repeatedly when, for whatever reason, you don't want to store it in something like ~/.bashrc, ~/.bash_profile, or similar.
Specifically for this question, here's an example:
$ git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'
added lines: 5, removed lines: 1, total lines: 4
$ cat ./total-lines.sh
git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'
$ ./total-lines.sh
added lines: 5, removed lines: 1, total lines: 4
My Environment
$ systeminfo | grep --extended-regexp --regexp="^OS (Name|Version)"
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.19043 N/A Build 19043
$ bash --version | head --lines=1
GNU bash, version 4.4.23(1)-release (x86_64-pc-msys)
$ git --version
git version 2.33.0.windows.2
$ python3 --version
Python 3.9.7
$ awk --version | head --lines=1
GNU Awk 5.0.0, API: 2.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)
2 Comments
The reason for "\\n" is that there are two levels of interpretation happening: first Python interprets it as a backslash character and an n (two separate characters) to pass to awk, then awk interprets that as a newline character.
Using Python here adds an unnecessary layer of complexity. The easiest solution is to create a file my_fancy_git_command.sh and copy the bash code into it. Now you can run the entire command by using the name of the script.
If you want to be able to run this script from multiple directories, I suggest creating a bin directory in your user folder. Then add $HOME/bin to PATH in .bashrc. Be sure to close your terminals and open a new one to see the change to PATH reflected in the current environment.
3 Comments
[...] alias... your comment reminded me this is a thing
[...] awk, because you generally want single quotes around the alias definition, and also around the awk program, so it leads to a similar nesting-quotes/weird-escaping-rules/etc. mess. Function syntax is much cleaner.
Doing the same with Python's subprocess module, which was intended to replace os.system.
The first way uses shell=True which, according to the Python documentation, can be vulnerable to shell injection:
import subprocess
results = subprocess.check_output("git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2} END { print \"added lines: \"add\", removed lines: \"subs\", total lines: \"loc }'", stdin=subprocess.PIPE, shell=True)
print(results.decode('utf-8'))
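The injection risk only arises when text from outside the program is spliced into the command string. As a rough sketch of how that could be guarded against, assuming a hypothetical author filter supplied by a user, shlex.quote can wrap each interpolated piece. build_command and its author parameter are illustrative and not part of the original answer:

```python
import shlex

# The awk program from the question, kept verbatim in a raw string.
AWK_PROGRAM = r'{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'


def build_command(author):
    """Build the shell pipeline, limited to commits by `author` (hypothetical parameter)."""
    return (
        f"git log --author={shlex.quote(author)} --pretty=tformat: --numstat"
        f" | awk {shlex.quote(AWK_PROGRAM)}"
    )


# A hostile value stays inside one quoted word instead of starting a new command.
print(build_command("me; rm -rf /"))
```

The resulting string could then be handed to subprocess.check_output(..., shell=True) as in the first way above.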
The second way, with shell=False:
cmd1 = ['git', 'log', '--pretty=tformat:', '--numstat']
cmd2 = ['awk', '{ add += $1; subs += $2; loc += $1 - $2} END { print \"added lines: \"add\", removed lines: \"subs\", total lines: \"loc }']
p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE, shell=False)
p2 = subprocess.Popen(cmd2, stdin=p1.stdout, stdout=subprocess.PIPE, shell=False)
p1.stdout.close()  # lets git receive SIGPIPE if awk exits first
results = p2.communicate()[0].decode()
print(results)
The documentation, though, recommends using subprocess.run wherever possible, so here is another option:
cmd1 = ['git', 'log', '--pretty=tformat:', '--numstat']
cmd2 = ['awk', '{ add += $1; subs += $2; loc += $1 - $2} END { print \"added lines: \"add\", removed lines: \"subs\", total lines: \"loc }']
p1 = subprocess.run(cmd1, stdout=subprocess.PIPE, shell=False)
p2 = subprocess.run(cmd2, input=p1.stdout, stdout=subprocess.PIPE, shell=False)
print(p2.stdout.decode('utf-8'))
shell=False is the default, so it can be dropped. Now I can run git commands from the Django web framework at the press of a button and output the results back.
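A further option, sketched here on the assumption that the awk step only sums two columns, is to drop awk and do the arithmetic in Python, which removes the nested quoting altogether. The helper names summarize_numstat and git_line_totals are made up for illustration, and the sample text stands in for real git output:

```python
import subprocess


def summarize_numstat(text):
    """Sum the added/removed columns of `git log --numstat` output."""
    added = removed = 0
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # blank separator lines between commits
        # Binary files are reported as "-"; count them as 0, as awk would.
        added += int(fields[0]) if fields[0].isdigit() else 0
        removed += int(fields[1]) if fields[1].isdigit() else 0
    return added, removed, added - removed


def git_line_totals():
    """Run git in the current directory and return (added, removed, total)."""
    log = subprocess.run(
        ["git", "log", "--pretty=tformat:", "--numstat"],
        capture_output=True, text=True, check=True,
    )
    return summarize_numstat(log.stdout)


# Hand-written sample in numstat format: added<TAB>removed<TAB>path.
sample = "3\t1\ta.py\n\n-\t-\tlogo.png\n2\t0\tb.py\n"
print("added lines: %s, removed lines: %s, total lines: %s" % summarize_numstat(sample))
# prints: added lines: 5, removed lines: 1, total lines: 4
```

Calling git_line_totals() inside a repository returns the same three numbers for the real history.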
[...] .sh suffix and call it a day. There's no reason to bring Python into this as it only causes further complications.
[...] printf commands, the strings aren't separated as you hope they are. Switch back to single quotes, or escape the double quotes around your printf commands.
[...] awk program, which contains double-quoted strings. But double quotes don't nest like that. And I don't know why you're not also having trouble with the $1 and $2 being prematurely expanded by the shell. I agree with @Code-Apprentice: having Python involved is making this way more complicated than it needs to be.