10
\$\begingroup\$

I have ack (the searching tool) "installed" as a single file at C:\ack\bin\ack.pl on a Windows 10 machine and was wondering how to make it executable from the powershell window and other terminals.

I wanted to do it ideally without touching the PATHEXT variable and associating an interpreter to the extension because that has system-wide consequences. I'm only really after a shebang workalike.

First I wrote a batch script wrapper around it (ack.bat).

@echo off
perl.exe %~dp0\ack.pl %*

But, that has the annoying property of handling interruptions via ^C with an are you sure? prompt.

Terminate batch job (Y/N)? y

I was curious how to make something interruptible without prompting and tried looking for an equivalent of the exec* system calls on Windows. Then I came across this answer on ServerFault which complains about the same problem and suggests generating a C program as a potential solution.

I wrote a C++ wrapper that does the trick, but it turned out quite a bit uglier than I would have liked.


// The strings in this file are UTF-16LE for compat with the win32 api
// The source code itself is in UTF-8.
#include <windows.h> // GetCommandLineW
#include <iostream> // wcout
#include <string> // wstring
#include <cassert> // assert
#include <utility> // pair
#include <deque> // deque, size_type
// always debug. Assert is only used when we actually want to
// crash the wrapper process.
#undef NDEBUG
// unsigned index type, probably good enough for traversing
// a vector or deque
typedef std::deque<char>::size_type uidx;
// interpreter_name must be absolute path
std::wstring interpreter_name = %%%%INTERPETER_NAME%%%% ;
std::wstring script_name = %%%%SCRIPT_NAME%%%% ;
class Reaper {
public:
 HANDLE job_handle;
 JOBOBJECT_EXTENDED_LIMIT_INFORMATION limit_info;
 
 Reaper() {
 job_handle = CreateJobObject(NULL, NULL);
 assert(job_handle != NULL);
 limit_info = { 0 };
 limit_info.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE;
 DWORD set_success = SetInformationJobObject(
 job_handle,
 JobObjectExtendedLimitInformation,
 &limit_info,
 sizeof(limit_info));
 assert(set_success);
 }
};
Reaper& get_reaper(void) {
 static Reaper r;
 return r;
}
// the leading int is the error code.
std::pair<DWORD, std::deque<std::wstring>> argvw_of_cmdline(std::wstring command_line) {
 LPCWSTR cmd_line = command_line.c_str();
 int count = 0;
 LPWSTR *the_processed_args = CommandLineToArgvW(
 cmd_line, &count
 );
 // first we handle the error case
 if (the_processed_args == nullptr) {
 return {GetLastError(), std::deque<std::wstring>()};
 } else {
 std::deque<std::wstring> s;
 for (int i = 0; i < count; ++i) {
 s.push_back(the_processed_args[i]);
 }
 return {0, s};
 }
}
std::wstring escape_string(std::wstring ws) {
 bool contains_suspect_char = (std::wstring::npos != ws.find_first_of(L"\"" L"\\"));
 if (contains_suspect_char) {
 std::wstring out(L"\"");
 for (uidx i = 0; i < ws.size(); ++i) {
 if (ws[i] == L'"' || ws[i] == L'\\') {
 out += L'\\';
 out += ws[i];
 } else {
 out += ws[i];
 }
 }
 out += L'"';
 return out;
 } else {
 return ws;
 }
}
std::wstring cmdline_of_argvw(const std::deque<std::wstring> &argvw) {
 std::wstring the_line(L"");
 // this is okay even if the deque is empty
 // because the loop will be traversed zero times.
 uidx last_index = argvw.size() - 1;
 for (uidx i = 0; i < argvw.size() ; i++) {
 the_line += escape_string(argvw[i]);
 if (i != last_index) {
 the_line += L' ';
 }
 }
 return the_line;
}
struct RawWinProcessCreatorW {
 LPCWSTR app_name = NULL;
 LPWSTR command_line = NULL;
 LPSECURITY_ATTRIBUTES process_attributes = NULL;
 LPSECURITY_ATTRIBUTES thread_attributes = NULL;
 BOOL inherit_handles = false;
 DWORD creation_flags = 0;
 LPVOID environment = NULL;
 LPCWSTR current_directory = NULL;
 LPSTARTUPINFOW startup_info = NULL;
 LPPROCESS_INFORMATION process_information = NULL;
 bool run() {
 return CreateProcessW(
 app_name,
 command_line,
 process_attributes,
 thread_attributes,
 inherit_handles,
 creation_flags,
 environment,
 current_directory,
 startup_info,
 process_information
 );
 }
};
std::wstring current_exe_directory(void) {
 HMODULE h_module = GetModuleHandleW(nullptr);
 WCHAR path[MAX_PATH];
 memset(path, 0, sizeof(path)); 
 GetModuleFileNameW(h_module, path, MAX_PATH);
 std::wstring w_path(path);
 // if the last character is a path separator
 // remove it.
 if (w_path.back() == L'\\') {
 w_path.pop_back();
 }
 // keep popping until the last character is a \ -- thwart line continuation
 while (!w_path.empty()) {
 if (w_path.back() == L'\\') {
 w_path.pop_back();
 return w_path;
 } else {
 w_path.pop_back();
 }
 }
 return w_path;
}
int main(int argc, char **argv)
{
 std::wstring exe_dir(current_exe_directory());
 std::wstring fullpath;
 fullpath += exe_dir;
 fullpath += std::wstring(L"\\");
 fullpath += script_name;
 std::wstring old_command_line(GetCommandLineW());
 std::pair<DWORD, std::deque<std::wstring>> p = argvw_of_cmdline(old_command_line);
 DWORD err = p.first;
 assert(err == 0);
 std::deque<std::wstring> split_cl = p.second;
 // remove old executable (it's the current one)
 split_cl.pop_front();
 // need to push interpreter_name and script_name.
 // but the order is reversed.
 split_cl.push_front(fullpath);
 split_cl.push_front(interpreter_name);
 std::wstring command_line = cmdline_of_argvw(split_cl);
 // make sure to zero-initialize these things.
 STARTUPINFOW si = { 0 };
 PROCESS_INFORMATION pi = { 0 };
 RawWinProcessCreatorW r;
 r.app_name = (interpreter_name.c_str());
 r.command_line = const_cast<LPWSTR>(command_line.c_str());
 r.inherit_handles = true;
 r.startup_info = &si;
 r.process_information = &pi;
 r.creation_flags |= CREATE_SUSPENDED;
 bool success = r.run();
 assert(success);
 // DWORD last_error = GetLastError();
 // assign to the job object whatever.
 DWORD assign_status = AssignProcessToJobObject(
 get_reaper().job_handle,
 pi.hProcess
 );
 assert(assign_status);
 // resume the process.
 DWORD resume_status = ResumeThread(pi.hThread);
 
 // wait for the process we spawned.
 DWORD wait_res = WaitForSingleObject(pi.hProcess, INFINITE);
 assert(wait_res != WAIT_ABANDONED);
 assert(wait_res != WAIT_TIMEOUT);
 assert(wait_res != WAIT_FAILED);
 // after the process is gone, try to figure out whether it succeeded
 // and use that information when deciding how to exit yourself.
 // we're using 10, bad environment, as a sentinel.
 DWORD child_exit_status = 10;
 bool recover_exit_status_success = GetExitCodeProcess(
 pi.hProcess,
 &child_exit_status
 );
 assert(recover_exit_status_success);
 assert(child_exit_status != 10);
 return child_exit_status;
}

And here's a Perl script, make_wrapper.pl, that generates and compiles a wrapper script.

The template is concatenated underneath the end of the __DATA__ token with %%%%INTERPETER_NAME%%%% appearing where the interpreter name is and %%%SCRIPT_NAME%%%% appearing where the script name is.

#! /usr/bin/env perl
use strict;
use warnings;
use utf8;
use Getopt::Long;
use File::Spec;
my $interpreter;
my $script;
my $cxx_compiler;
my $output;
GetOptions(
 "int|i=s" => \$interpreter,
 "script|s=s" => \$script,
 "cxx|c=s" => \$cxx_compiler,
 "output|o=s" => \$output,
);
sub escape_wide_string_literal {
 my ($contents) = @_;
 my $out = q[];
 $out .= 'L"';
 $out .= ($contents =~ s/([\"\\])/\\1ドル/gr);
 $out .= '"';
 return $out;
}
sub defined_and_nonempty {
 my ($x) = @_;
 return (defined $x) && ($x ne q[]);
}
die "need interpreter (--int|-i)" unless defined_and_nonempty($interpreter);
die "need script (--script|-s)" unless defined_and_nonempty($script);
die "need C++ compiler (--cxx|-c)" unless defined_and_nonempty($cxx_compiler);
die "need output file (--output|-o)" unless defined_and_nonempty($output);
die "interpreter must exist (--int|-i)" unless (-f $interpreter);
die "script must exist (--script|-s)" unless (-f $script);
die "C++ compiler must exist (--cxx|-c)" unless (-f $cxx_compiler);
die "intepreter must be absolute path (--int|-i)" unless (File::Spec->file_name_is_absolute($interpreter));
die "script should be relative path with no separators (.\\ is okay) (--script|-s)" if ($script =~ /\\/ and not $script =~ /\A[.][\\][^\\]*\z/);
my $cxx_template;
do {
 local $/;
 $cxx_template = <DATA>;
};
close(DATA);
die "internal error" unless defined $cxx_template;
my $interpreter_literal = escape_wide_string_literal($interpreter);
$cxx_template =~ s/%%%%INTERPETER_NAME%%%%/$interpreter_literal/g;
my $script_literal = escape_wide_string_literal($script);
$cxx_template =~ s/%%%%SCRIPT_NAME%%%%/$script_literal/g;
open my $fh, '>', "temp.cpp";
print $fh $cxx_template;
close($fh);
system($cxx_compiler, "-o", $output, "temp.cpp");
die "did not create file" unless (-f $output);
__DATA__
toolic
15.2k5 gold badges29 silver badges213 bronze badges
asked Oct 26, 2018 at 0:41
\$\endgroup\$
0

4 Answers 4

5
+150
\$\begingroup\$

C++

Globally, your C++ seems good enough.

In argvw_of_cmdline you don't have to wrap your second part in a else since you return in the if.

You can remove a lot of things in current_exe_directory:

std::wstring current_exe_directory(void) {
 HMODULE h_module = GetModuleHandleW(nullptr);
 WCHAR path[MAX_PATH];
 memset(path, 0, sizeof(path)); 
 GetModuleFileNameW(h_module, path, MAX_PATH);
 std::wstring_view w_path(path);
 return std::wstring{ws.begin(), ws.find_last_not_of(L'\\')};
}

(note: you can improve it, but it's already kinda simplified)

You can simplify your escape_string(...) function:

std::wstring escape_string(std::wstring ws) {
 const auto esc = std::wstring{L"\\\""};
 for (size_t index = 0; (index = ws.find_first_of(esc, index)) != std::wstring::npos; index += 2) {
 ws.insert(index, 1, L'\\');
 }
 return ws;
}

Note that I didn't tested these functions, but they should work :)

Toby Speight
87.9k14 gold badges104 silver badges325 bronze badges
answered Nov 2, 2018 at 20:13
\$\endgroup\$
5
\$\begingroup\$

Disclaimer

It isn't clear to me what part you want reviews to focus on. I'm not a C++ on Windows person so I can't say much about that part of things. Your C++ is nicely formatted. The main() function could be broken down more, but it is commented well enough that it wouldn't be a burden to maintain in the current state.

Perl

Perl - even on Windows - is something I can talk about...... and your Perl is in good shape too.

  • Absolutely have strict and warning pragmas, all the time, every day, for 99% of Perl, yes yes.
  • Good formatting.
  • Idiomatic use of die X unless Y. Your die messages are excellent as well.
  • Variables are localized! (There's nothing worse than 50k lines of perl and no localization)
  • Variables and subroutines are named in ways that are easy to follow. Thanks.
  • I like using scalars for filehandles also. I believe this is a Perl Best Practice now.

But - like everything - it could be better:

  • It is funny to me that you commented the C++ and not the Perl at all. Part of me takes this as a compliment to Perl's natural readability, but it still seems like you could comment a bit and not distract from the code.
  • The primary instance where missing comments slowed me down in reading this was in the variables. Your variables names are fine, but having a comment that uses slightly different words can help avoid ambiguities and confusion.
  • The next instance of where a comment would be great is explaining the "hanging" __DATA__ section. This looks really out of place, but your explanation above the code was perfect. Why not paste that into the code above the "hanging" __DATA__ section?

Beyond commentary on comments:

  • The way you slurp data in from the DATA filehandle definitely works I would encapsulate it into a sub or use File::Slurp itself. Hiding the implementation would make it slightly easier to follow the code.
  • my $out = q[]; isn't idiomatic Perl. I would recommend my $out = ''; which uses two single quotes instead of q[].
  • Your defined_and_nonempty is pretty similar to what Perl would give you for scalar($x). Do you care if $x == 0 that it evaluates false? Again the use of q[] feels a bit odd.
answered Oct 30, 2018 at 11:18
\$\endgroup\$
3
\$\begingroup\$

#undef NDEBUG needs to be done before the point at which you #include <cassert>.

answered Nov 2, 2018 at 20:44
\$\endgroup\$
1
\$\begingroup\$

Here are some additional suggestions not addressed in the excellent review of the Perl code in this previous answer.

Documentation

The code needs usage documentation, and it would be helpful to provide a -help option to show the usage information.

The standard way to document code in Perl is to use plain old documentation (POD) in conjunction with the Pod::Usage module.

The POD should include:

  • Summary of the code's purpose
  • A description of the required input
  • A description for each command line option

It also gives you manpage-like help with perldoc:

perldoc make_wrapper.pl

Options

There is absolutely nothing wrong with how you are processing the command line options. However, I have found a different approach to be slightly more scalable in terms of adding or removing options.

The approach is to store options in a hash variable as opposed to individual scalar variables.

I have always found the GetOptions requirement of passing a reference to a scalar variable to be a bit ugly. In general, it is much more common to pass a reference to an array or a hash than it is to a scalar.

Here is what it would look like:

my %opt;
GetOptions(\%opt, qw(
 int|i=s 
 script|s=s
 cxx|c=s 
 output|o=s
)) or die "Option parsing failure\n";

Instead of 4 scalar variables, there is a single variable, %opt, which would be used as follows, for example:

die "script must exist (--script|-s)" unless (-f $opt{script});

I sometimes find it helpful to know that the variable was a command line option, and this makes that clear.

Another thing to notice is that in this format, GetOptions accepts the hash ref as the 1st argument, followed by a list of option specifications. The specs are just a list of text strings:

'int|i=s', 'script|s=s', etc.

We can take advantage of the "quote words" operator, qw, and omit all the quotes and commas, leaving a clean list of specs.

I typically have this code inside a sub, which is preferable as the number of options grows. It is also good being in a sub if there is further option value checking.

Namespace

It is best to import only what is needed to avoid namespace pollution. For example, change:

use Getopt::Long;
use File::Spec;

to:

use Getopt::Long qw(GetOptions);
use File::Spec qw();

Warnings

The previous answer suggested:

use warnings;

This is fine. My preference is to use a very strict version of warnings:

use warnings FATAL => 'all';

In my experience, the warnings have always pointed to a bug in my code. The issue is that, in some common usage scenarios, it is too easy to miss the warning messages unless you are looking for them. They can be hard to spot even if your code generates a small amount of output, not to mention anything that scrolls off the screen. This option will kill your program dead so that there is no way to miss the warnings.

Layout

The subs are in the middle of the code, which interrupts the natural flow of the code (from a human readability standpoint). It is common to move the functions to the bottom of the code, just above the __DATA__ line in your code.

There is some inconsistent use of parentheses around built-in functions. It is cleaner to omit the parens:

close $fh;

There's also inconsistency with quote usage:

open my $fh, '>', "temp.cpp";

In this case, it is more consistent to only use single quotes:

open my $fh, '>', 'temp.cpp';

Since it is hard for most people to remember all of Perl's special variables, like $/, it is a good practice to use the English module:

use English;

$/ becomes $INPUT_RECORD_SEPARATOR

answered Oct 1 at 10:14
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.