Is it possible to use a script to handle dependencies in C++?

Question 1

I wanted to write a script that parses a main file (the one with int main()), looks at its local #include "..." headers and, for any header that is not in the current directory, finds that header and its source file and passes the source file to g++ as the implementation. In other words, I wanted a helper script that takes care of dependencies. I think I have managed it, using Perl, and I would like a review:

 #!/usr/bin/perl 
use autodie;
use Cwd qw[getcwd abs_path];
use Getopt::Long qw[GetOptions];
use File::Find qw[find];
#global arrays
@src; #source files -> .cpp
@hed; #headers files -> .hpp
@dep; #dependencies -> .hpp + .cpp
$command;
GetOptions(
    "s" => \$opt_s, #headers the same as source files
    "h" => \$opt_h, #help message
    "o=s" => \$opt_o, #output filename
    "i=s" => \%opt_i, #dependencies
    "debug" => \$opt_debug #output the command
) or die "command options\n";
if($opt_h){
    print "usage: exe [-h][--debug][-s][-o output_file][-i dir=directory target=source]... sources...\n";
    exit 1;
}
die "no args" if !($out=$ARGV[0]);
$out = $opt_o if $opt_o;
#-------------------------------------------------
sub diff {
    my $file = shift;
    $file = "$file.cpp";
    open MAIN, $file;
    opendir CWD, getcwd;
    my @file_dep = map { /#include "([^"]+)"/ ? abs_path($1) : () } <MAIN>;
    my %local = map { abs_path($_) => 1 } grep { !/^\./ } readdir CWD;
    #headers found in the main file
    my @tmp;
    for(@file_dep){
        push @tmp, $_ if ! $local{$_};
    }
    @tmp = map {/.+\/(.+)/} @tmp;

    #finding absolute path for those files
    my @ret;
    for my $i (@tmp){
        find( sub {
            return unless -f;
            return unless /$i/;
            push @ret, $File::Find::name;
        }, '/home/shepherd/Desktop');
    }
    @ret = map { "$_.cpp" } map {/(.+)\./} @ret;
    return \@ret;
}
sub dependencies{
    my $dir=shift; my $target=shift;
    my @ar, my %local;
    #get full names of target files
    find( sub {
        return unless -f;
        push @ar, $File::Find::name;
    }, $dir);
    %local = map { $_ => 1 } @ar;
    #and compare them against the file from MAIN
    for(@{diff($target)}){
        push @dep, $_ if $local{$_};
    }
}
sub debug{
    print "final output:\n$command\n\nDependencies:\n";
    print "$_\n" for @dep;
    exit 1;
}
#------------------------------------------------------
#providing source and headers
if($opt_s){
    @src = map { "$_.cpp" } @ARGV;
    @hed = map { !/$out/ and "$_.hpp" } @ARGV;
} else {
    @src = map { !/_h/ and "$_.cpp"} @ARGV;
    @hed = map { /_h/ and s/^(.+)_.+/$1/ and "$_.hpp" } @ARGV;
}
if(%opt_i){
    my @dirs; my @targets;
    for(keys %opt_i){
        push @dirs, $opt_i{$_} if $_ eq "dir";
        push @targets, $opt_i{$_} if $_ eq "target";
    }
    if(@dirs!=@targets){
        print "you have to specify both target and directory. Not all targets have their directories\n";
        exit -1;
    }
    my %h;
    @h{@dirs} = @targets;
    dependencies($_, $h{$_}) for keys %h;

    $command = "g++ ";
    $command .= "-I $_ " for keys %h;
    $command .= "-o $out.out @hed @dep @src";
    debug if $opt_debug;
    system $command;
    exec "./$out.out";
} else {
    $command = "g++ -o $out.out @hed @src";
    debug() if $opt_debug;
    system $command;
    exec "./$out.out";
}

Now an example:

$pwd
/home/user/Desktop/bin/2
$ls
main.cpp student2.cpp student2.hpp

student2.cpp has some dependencies (it uses a struct defined in student.cpp and a function defined in grade.cpp). With the script you can see the command it would generate (the script is installed as /usr/local/bin/exe):

$exe -h
usage: exe [-h][--debug][-s][-o output_file][-i dir=directory target=source]... sources...
$exe --debug -i target=student2 -i dir=/home/user/Desktop/bin/1 main student2
final output:
g++ -I /home/user/Desktop/bin/1 -o main.out /home/user/Desktop/bin/1/grade.cpp /home/user/Desktop/bin/1/student.cpp main.cpp student2.cpp
Dependencies:
/home/user/Desktop/bin/1/grade.cpp
/home/user/Desktop/bin/1/student.cpp

As you can see, the script found the dependencies of student2.cpp, which live in another directory, and added them to the final command. You only have to give the source files without extension (just the base names). In short: for each target file (whose source may pull in dependencies through #include "dependency.hpp"), I provide the directory where the dependency (dependency = header + source, i.e. the implementation) lives. The script does the rest.

Question 2

"Could it always work" that's a dangerous question, but I assume you've tested it at least on your own system and that it works there?

Question 3

Please provide a description of how the different command line options should work.

Question 4

Also specify why you did not use cmake or make instead of rolling your own solution.

Question 5

Modern compilers can generate the dependencies for you. I would rather trust this job to the compiler.

Question 6

g++ -MMD -MP -MF <FileName>.d <FileName>.cpp; cat <FileName>.d

Normally you do this by adding appropriate definitions to your makefile.
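If the script should stay in Perl, one possible middle ground is to let g++ write the .d file and only parse it. The sketch below is an assumption about how that could look: parse_depfile_text is a made-up helper name, the sample text only imitates the make-style rule format of such a file, and paths containing spaces or escaped characters are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: split make-style dependency rules ("target: prereq ...")
# into a flat list of prerequisites. Assumes paths contain no spaces or colons.
sub parse_depfile_text {
    my ($text) = @_;
    $text =~ s/\\\n//g;    # join backslash-continued lines
    my @prereqs;
    for my $line ( split /\n/, $text ) {
        next unless $line =~ /^[^:\s]+\s*:\s*(.*)$/;    # "target: prerequisites"
        my $rest = $1;
        next unless $rest =~ /\S/;    # skip the empty phony rules that -MP adds
        push @prereqs, split ' ', $rest;
    }
    return @prereqs;
}

# Sample text in the shape of a .d file; the file names are made up.
my $sample = "main.o: main.cpp student2.hpp \\\n ../1/grade.hpp\nstudent2.hpp:\n";
print "$_\n" for parse_depfile_text($sample);
```

The script would then no longer need its own #include regex, since the preprocessor has already resolved the include paths.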

Question 7

It is not so easy to get a clear picture of what the program is doing and why. I think adding more documentation and comments would help, and so would coding in a way that is easy to read. That means choosing function and variable names carefully to enhance readability. Avoid compact or clever constructs if they are hard to read; prefer more verbose code where it improves readability and maintainability.

It is not clear why you did not want to use make or cmake to handle dependencies in a more efficient way. Another issue is the purpose of the command line switches. It would help to provide more documentation and background for their usage.

Automatic compilation of dependencies is usually done with make or cmake, but this requires you to write a Makefile or a CMakeLists.txt file that specifies the dependencies. Another option that avoids this is g++ -MMD -MP -MF, as mentioned by @MartinYork in the comments. Also note that make and cmake have the added benefit of recompiling only the source files that have changed (i.e. those that are newer than the target file). This can markedly speed up compilation of a large project. The Perl script, on the other hand, recompiles every dependency into a single executable each time, whether or not any of the dependencies have changed.
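To make the "newer than the target" idea concrete, here is a minimal sketch of such a timestamp check in Perl. The name needs_rebuild is hypothetical and not part of the script under review, and make's real logic works per object file and is more involved than this.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: true if $target is missing or older than any source.
sub needs_rebuild {
    my ( $target, @sources ) = @_;
    return 1 unless -e $target;
    my $target_mtime = ( stat $target )[9];
    for my $src (@sources) {
        my $src_mtime = ( stat $src )[9];
        die "Cannot stat '$src': $!\n" unless defined $src_mtime;
        return 1 if $src_mtime > $target_mtime;
    }
    return 0;
}

# Demo with empty files in a temporary directory and faked timestamps.
my $dir = tempdir( CLEANUP => 1 );
my ( $src, $exe ) = ( "$dir/main.cpp", "$dir/main.out" );
for my $f ( $src, $exe ) {
    open( my $fh, '>', $f ) or die "Cannot create '$f': $!";
    close $fh;
}
utime( 1000, 1000, $src );    # pretend the source is old
utime( 2000, 2000, $exe );    # and the executable is newer
print needs_rebuild( $exe, $src ) ? "rebuild\n" : "up to date\n";
```

A script using such a check could skip the g++ call entirely when nothing has changed and go straight to running the executable.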

On the other hand, an advantage of using the Perl script can be to avoid writing the Makefile (though I would recommend learning to write a Makefile or a CMakeLists.txt, as it is the common way of doing it). The script also automatically runs the executable after compilation, though it does not check whether the compilation failed (if the compilation fails, it does not make sense to run the executable). Another advantage can be that it does not generate multiple .o files (as make and cmake do to enable recompilation of only the changed files).

The Perl script, which you named exe (I will call it exe.pl for clarity), can be used in several ways. From reading the source code, here is what I found:

Firstly, it can be used to compile specific files in the current directory (and then run the generated executable). For example:

$ exe.pl main student2

This will run g++ -o main.out main.cpp student2.cpp. The -o option can be used to specify another name for the exe (but the suffix will always be .out):

$ exe.pl -o prog main student2

runs g++ -o prog.out main.cpp student2.cpp. The -s option can be used to add headers to the compilation (though I could not see why this is useful, as headers are commonly included from within a .cpp file, and therefore should be included automatically by the g++ preprocessor):

$ exe.pl -s main student2

runs g++ -o main.out main.cpp student2.cpp student2.hpp. Note that main.hpp is not added. The script treats the first filename on the command line (here main) as the "main" file, and the -s option does not add a header for it. (Please consider clarifying why this is done!) Headers can still be added without the -s option by supplying names that match "_h":

$ exe.pl main student2 student2_h

runs g++ -o main.out main.cpp student2.cpp student2.hpp. Next, the -i switch is used to handle dependencies. A dependency is a .cpp file in a directory, let's call it DD, other than the main directory, DM, where the script is run from. If the dependency includes header files, the script checks whether those header files are located in DM; if so, they are excluded from the later compilation (please consider clarifying why this is done).

For example, consider DM=/home/user/Desktop/bin/2. We see that DM is located below a parent directory DT=/home/user/Desktop, which the script will use as the top of the source tree. Then if, for example, the dependency directory is DD=/home/user/Desktop/bin/1 and the dependency file is student.cpp, which contains the include statement #include "grade.hpp", the script first checks whether grade.hpp already exists in DM. If it does, it is excluded from the later g++ compilation command (please consider explaining why this is done). Next, the script tries to find student.cpp in DT or any of its subdirectories recursively, using File::Find. If it finds the file (or more than one file) and it turns out that the file is in DD (and not some other directory in DT), it is assumed that there also exists a .cpp file with the same name in DD, and the absolute path of this .cpp file is included in the later g++ compilation command. Also, the absolute path of DD is added as an include search path (-I option) to the g++ command.

I would recommend that the motivation behind the above logic (which is not at all clear to me) be explained carefully in the source code as comments.

To summarize, the above example corresponds to the following command line:

$ exe.pl -i target=student -i dir=/home/user/Desktop/bin/1 main student2

and the script will then produce the following g++ command:

g++ -I /home/user/Desktop/bin/1 -o main.out /home/user/Desktop/bin/1/student.cpp main.cpp student2.cpp

Logical issues

The -i option does not work with more than one pair of (target, dir)

Currently, the -i option does not work for more than one target. For example, for the command line:

$ exe.pl -i target=student2 -i dir=/home/user/Desktop/bin/1 -i target=student3 -i dir=/home/user/Desktop/bin/3

GetOptions() fills the hash %opt_i (bound through "i=s" => \%opt_i) as follows:

%opt_i = (target => "student3", dir => "/home/user/Desktop/bin/3")

Notice that the first target, student2, is missing. This is because both targets use the same hash key, target, so the second value overwrites the first. To fix this, you can try using arrays instead of hashes as parameters to GetOptions(). For example:

"target=s" => \@opt_t,
"dir=s" => \@opt_d,

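A small sketch of how the array-based options could be paired up afterwards. It uses GetOptionsFromArray from Getopt::Long so that it can be tried without touching @ARGV; parse_dependency_options is a hypothetical name, and pairing the n-th target with the n-th dir is my assumption about the intended behaviour.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse repeated --target/--dir options into parallel arrays and pair them up.
# Returns a list of [target, dir] pairs; dies if the counts differ.
sub parse_dependency_options {
    my (@args) = @_;
    my ( @targets, @dirs );
    GetOptionsFromArray(
        \@args,
        'target=s' => \@targets,
        'dir=s'    => \@dirs,
    ) or die "Failed to parse options\n";
    die "Each --target needs a matching --dir\n" if @targets != @dirs;
    return map { [ $targets[$_], $dirs[$_] ] } 0 .. $#targets;
}

my @pairs = parse_dependency_options(
    '--target', 'student2', '--dir', '/home/user/Desktop/bin/1',
    '--target', 'student3', '--dir', '/home/user/Desktop/bin/3',
);
print "$_->[0] => $_->[1]\n" for @pairs;
```

Because each option pushes onto its own array, no value overwrites another, and the order on the command line is preserved.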
Dependencies in subdirectories are not checked for

As mentioned above, the code tries to exclude dependencies that are present in the main directory. But if a dependency is in a subdirectory of that directory, the code will not find it. This is due to the use of readdir():

my %local = map { abs_path($_) => 1 } grep { !/^\./ } readdir CWD;

Here, readdir() only returns the entries in CWD itself, not those in any subdirectory below it.
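To illustrate the difference, the following sketch builds a small temporary tree and lists it both ways. files_below is a hypothetical helper, and whether the script should look into subdirectories at all is something only you can decide.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Collect every regular file below $top, including those in subdirectories.
sub files_below {
    my ($top) = @_;
    my @files;
    # With no_chdir, $_ holds the same full path as $File::Find::name.
    find( { wanted => sub { push @files, $File::Find::name if -f }, no_chdir => 1 }, $top );
    my @sorted = sort @files;
    return @sorted;
}

# Temporary tree: main.cpp at the top, grade.hpp one level down.
my $top = tempdir( CLEANUP => 1 );
make_path("$top/sub");
for my $f ( "$top/main.cpp", "$top/sub/grade.hpp" ) {
    open( my $fh, '>', $f ) or die "Cannot create '$f': $!";
    close $fh;
}

opendir( my $dh, $top ) or die "Cannot open '$top': $!";
my @top_level = grep { -f "$top/$_" } readdir $dh;
closedir $dh;

print "readdir sees: @top_level\n";
print "find sees: $_\n" for files_below($top);
```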

Account for multiple versions of the same dependency file

Currently the code uses the file in the main directory if there are multiple versions of the same file name.

Let's say the dependency file /home/user/Desktop/bin/1/student.hpp contains:

#include "grade.hpp"

and there exist two versions of the corresponding .cpp file. One in the dependency directory /home/user/Desktop/bin/1/

/home/user/Desktop/bin/1/grade.cpp

and one in the CWD (where the script is run from)

/home/user/Desktop/bin/2/grade.cpp

What is the correct file? The script should at least give a warning.
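One way such a warning could be produced is to group the candidate paths by file name before choosing one. This is only a sketch; ambiguous_candidates is a hypothetical helper and the paths are the ones from the example above.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Group candidate paths by file name and return only the ambiguous names,
# i.e. those found in more than one directory, as name => [paths].
sub ambiguous_candidates {
    my (@paths) = @_;
    my %by_name;
    push @{ $by_name{ basename($_) } }, $_ for @paths;
    my %ambiguous;
    for my $name ( keys %by_name ) {
        $ambiguous{$name} = $by_name{$name} if @{ $by_name{$name} } > 1;
    }
    return %ambiguous;
}

my %dups = ambiguous_candidates(
    '/home/user/Desktop/bin/1/grade.cpp',
    '/home/user/Desktop/bin/2/grade.cpp',
    '/home/user/Desktop/bin/1/student.cpp',
);
warn "Warning: '$_' found in several places: @{ $dups{$_} }\n" for sort keys %dups;
```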

Not checking recursively for dependencies

Let's say student.hpp contains #include "grade.hpp" and grade.hpp in turn contains #include "calc.hpp". Then the script will not find and compile calc.cpp.
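A possible shape for a recursive search is a work queue with a "seen" set, so that include cycles terminate. In this sketch transitive_includes is a hypothetical name, file contents come from a callback so it can be tried without real files, and include search paths are ignored (names are used exactly as written in the #include line).

```perl
use strict;
use warnings;

# Return every header reachable from $start through quoted #include lines.
# $read_file is a callback returning the text of a file, or undef if missing.
sub transitive_includes {
    my ( $start, $read_file ) = @_;
    my %seen  = ( $start => 1 );
    my @queue = ($start);
    my @order;
    while ( defined( my $file = shift @queue ) ) {
        my $text = $read_file->($file);
        next unless defined $text;
        while ( $text =~ /^[ \t]*#[ \t]*include[ \t]*"([^"]+)"/mg ) {
            my $included = $1;
            next if $seen{$included}++;    # already handled, avoids cycles
            push @order, $included;
            push @queue, $included;
        }
    }
    return @order;
}

# Made-up file contents mirroring the example, plus a cycle back to grade.hpp.
my %fake_files = (
    'student.hpp' => qq{#include "grade.hpp"\n},
    'grade.hpp'   => qq{#include <vector>\n#  include "calc.hpp"\n},
    'calc.hpp'    => qq{#include "grade.hpp"\n},
);
my @headers = transitive_includes( 'student.hpp', sub { $fake_files{ $_[0] } } );
print "@headers\n";
```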

The `_h` command line trick does not work correctly

The following code is used to check for header files on the command line:

@hed = map { /_h/ and s/^(.+)_.+/$1/ and "$_.hpp" } @ARGV;

Notice that the first regex /_h/ matches any file with a _h anywhere in the filename, for example sah_handler. I think you need to add an end-of-string anchor to the regex: /_h$/.
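A minimal sketch of the anchored version, written as an explicit loop. split_sources_and_headers is a hypothetical name, and I am assuming that a name ending in _h is meant to contribute only a header.

```perl
use strict;
use warnings;

# Split command-line names into sources and headers using an anchored "_h" suffix.
sub split_sources_and_headers {
    my (@names) = @_;
    my ( @src, @hed );
    for my $name (@names) {
        if ( $name =~ /^(.+)_h$/ ) {
            push @hed, "$1.hpp";
        }
        else {
            push @src, "$name.cpp";
        }
    }
    return ( \@src, \@hed );
}

my ( $src, $hed ) = split_sources_and_headers(qw(main sah_handler student2_h));
print "sources: @$src\nheaders: @$hed\n";
```

With the anchor, sah_handler is treated as an ordinary source name, while student2_h still becomes student2.hpp.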

Matching of #include file names in a dependency file

The code uses

my @file_dep = map { /#include "([^"]+)"/ ? abs_path($1) : () } <MAIN>;

to extract the dependencies from a dependency file. Note that this requires that there is no space between # and include. That assumption is not correct: spaces are in fact allowed there, for example

# include "student.hpp"

is a legal C++ include statement.
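A more tolerant pattern could look like the sketch below. quoted_includes is a hypothetical helper; it still ignores block comments and conditional compilation, which only the preprocessor handles properly.

```perl
use strict;
use warnings;

# Extract the file names from quoted #include directives in a list of lines,
# allowing whitespace before '#', after '#', and before the opening quote.
sub quoted_includes {
    return map { /^\s*#\s*include\s*"([^"]+)"/ ? $1 : () } @_;
}

my @found = quoted_includes(
    qq{#include "student.hpp"\n},
    qq{  #   include "grade.hpp"\n},
    qq{#include <iostream>\n},
    qq{// #include "commented_out.hpp"\n},
);
print "@found\n";
```

The leading anchor also keeps the pattern from matching an #include that appears inside a line comment.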

Language related issues

Use strict, warnings

It is recommended to put use strict; use warnings; at the top of your program. This will help you catch errors at an early stage.

Try to limit the use of global variables

Extensive use of global variables makes it harder to reason about a program. It is crucial that a program is easy to read (and understand) in order to maintain and extend it effectively (at a later point). It also makes it easier to track down bugs.

Note that if you add use strict at the top of the program, global variables need to be declared, just like lexical variables. You declare a global variable with our.

Old style open() and opendir()

Modern Perl uses the three-argument form of open and avoids global bareword filehandle names in favor of lexical filehandles. So instead of this:

open MAIN, $file;

do this (assuming no autodie):

open (my $MAIN, '<', $file) or die "could not open $file: $!";

See Three-arg open() from the book "Modern Perl" for more information.

Shebang

Consider replacing #!/usr/bin/perl with #!/usr/bin/env perl. Most systems have /usr/bin/env, and this form also lets your script run with the intended interpreter if you have multiple perls on your system, for example if you are using perlbrew. See this blog for more information.

Clever use of map()

The code uses map to produce very concise code, but such code can be difficult to understand and makes it harder to maintain your code in the future.

Also note that returning false from the map {} code block like in

@src = map { !/_h/ and "$_.cpp"} @ARGV;

produces an empty string element in @src. If you do not want to produce an element, you must return an empty list () instead of false:

@src = map { !/_h/ ? "$_.cpp" : () } @ARGV;
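The difference is easy to check by counting elements. The names below are just sample input, and I use the anchored /_h$/ form here.

```perl
use strict;
use warnings;

my @names = qw(main student2 student2_h);

# Returning false from the block still yields one (empty) element per input.
my @with_blanks = map { !/_h$/ and "$_.cpp" } @names;

# Returning the empty list drops the element altogether.
my @clean = map { !/_h$/ ? "$_.cpp" : () } @names;

printf "%d elements vs %d elements\n", scalar @with_blanks, scalar @clean;
```

In the original script the empty element ends up interpolated into the g++ command string, where it only shows up as an extra space, which is probably why it went unnoticed.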

Use good descriptive names for the subs.

The sub diff() is supposed to find dependency files that are not present in the current directory. But the name diff() does not clarify what the sub is doing. On the other hand, the following name might be too verbose:

find_abs_path_of_dep_files_that_does_not_exist_in_curdir()

but it is at least easier to understand.

Use positive return values with `exit`

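The script calls exit -1 on one of its error paths. The exit code of a Linux process is usually an integer between zero (indicating success) and 125; a negative value is truncated to 8 bits, so exit -1 is reported to the caller as 255. See this answer for more information.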
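You can observe what the parent process actually sees with a small experiment. The sketch assumes a POSIX-like system; observed_exit_status is a hypothetical helper that starts a child perl through $^X.

```perl
use strict;
use warnings;

# Run a child perl that calls exit() with the given value and return the
# exit status the parent observes (the high byte of $?).
sub observed_exit_status {
    my ($code) = @_;
    system( $^X, '-e', "exit($code)" );
    return $? >> 8;
}

printf "exit(-1) is seen as %d\n", observed_exit_status(-1);
printf "exit(1) is seen as %d\n",  observed_exit_status(1);
```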

Check the return value of `system $command`

You should check the return value from the system() call for g++. The compilation may fail, and then the exit code will be nonzero. In that case, there is no point in running the executable after the compilation has finished.
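A compact way to do this is sketched below. run_ok is a hypothetical helper; it uses the list form of system, which bypasses the shell, and the child perl commands merely stand in for a succeeding and a failing g++ call.

```perl
use strict;
use warnings;

# Run a command (list form, no shell) and return 1 only on exit status 0.
sub run_ok {
    my (@cmd) = @_;
    my $res = system @cmd;
    return 0 if $res == -1;    # the command could not be started
    return 0 if $res & 127;    # the command was killed by a signal
    my $status = $res >> 8;
    return $status == 0 ? 1 : 0;
}

# Stand-ins for a successful and a failed compilation.
print run_ok( $^X, '-e', 'exit 0' ) ? "would run the program\n" : "skipping run\n";
print run_ok( $^X, '-e', 'exit 2' ) ? "would run the program\n" : "skipping run\n";
```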

Use `say` instead of `print`

You can avoid typing a final newline character in print statements by using say instead of print. The say function was introduced in Perl 5.10, and is made available by adding use v5.10 or use feature qw(say) to the top of your script.

Example code

Here is an example of how you can write the code, following some of the principles I discussed above. I use an object oriented approach to avoid passing too many variables around in the parameter lists of the subs. It also avoids using global variables.

#! /usr/bin/env perl
package Main;
use feature qw(say);
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use Getopt::Long ();
use POSIX ();
{ # <--- Introduce scope so lexical variables do not "leak" into the subs below..
    my $self = Main->new( rundir => getcwd() );
    $self->parse_command_line_options();
    $self->parse_command_line_arguments();
    $self->find_dependencies();
    $self->compile();
    $self->run();
}
# ---------------------------------------
# Methods, alphabetically
# ---------------------------------------
sub check_run_cmd_result {
    my ( $self, $res ) = @_;
    my $signal = $res & 0x7F;
    if ( $res == -1 ) {
        die "Failed to execute command: $!";
    }
    elsif ( $signal ) {
        if ( $signal == POSIX::SIGINT() ) {
            die "Aborted by user.";
        }
        else {
            die sprintf(
                "Command died with signal %d, %s coredump.",
                $signal, ( $res & 128 ) ? 'with' : 'without'
            );
        }
    }
    else {
        $res >>= 8;
        die "Compilation failed.\n" if $res != 0;
    }
}
sub compile {
    my ( $self ) = @_;
    my @command = ('g++');
    push @command, ("-I", $_) for @{$self->{inc}};
    push @command, "-o", "$self->{out}.out";
    push @command, @{$self->{hed}}, @{$self->{deps}}, @{$self->{src}};
    $self->debug( "@command" ) if $self->{opt_debug};
    my $res = system @command;
    $self->check_run_cmd_result( $res );
}
sub debug{
    my ( $self, $cmd ) = @_;
    say "final output:\n$cmd\n\nDependencies:";
    # The dependency list is stored under the key "deps" (see find_dependencies)
    say for @{$self->{deps}};
    exit 1;
}
sub find_dependency {
    my ( $self, $target, $dir ) = @_;
    $target .= '.cpp';
    my $fn = File::Spec->catfile($dir, $target);
    open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
    my @include_args = map { /^#\s*include\s*"([^"]+)"/ ? $1 : () } <$fh>;
    close $fh;
    my @deps;
    for (@include_args) {
        my $fn = File::Spec->catfile( $dir, $_ );
        # TODO: In your program you checked if file also existed in
        # $self->{rundir}, and excluded it if so. Do you really need to check that?
        if (-e $fn) { # the file exists in target dir
            my ($temp_fn, $ext) = remove_file_extension( $fn );
            if (defined $ext) {
                check_valid_header_file_extension( $ext, $fn );
                push @deps, "$temp_fn.cpp";
                # TODO: Here you could call $self->find_dependency() recursively
                # on basename($temp_fn)
            }
        }
    }
    if (@deps) {
        push @{$self->{deps}}, @deps;
        push @{$self->{inc}}, $dir;
    }
}
sub find_dependencies {
    my ( $self ) = @_;
    $self->{deps} = [];
    $self->{inc} = [];
    my $targets = $self->{opt_t};
    my $dirs = $self->{opt_d};
    for my $i (0..$#$targets) {
        my $target = $targets->[$i];
        my $dir = $dirs->[$i];
        $self->find_dependency( $target, $dir );
    }
}
sub parse_command_line_arguments {
    my ( $self ) = @_;
    check_that_name_does_not_contain_suffix($_) for @ARGV;
    # TODO: Describe the purpose of -s option here!!
    if($self->{opt_s}){
        $self->{src} = [ map { "$_.cpp" } @ARGV ];
        # NOTE: exclude header file for main program name ($self->{out})
        # So if main program name is "main", we include main.cpp, but not main.hpp
        # TODO: describe why it is excluded
        $self->{hed} = [ map { !/^$self->{out}$/ ? "$_.hpp" : () } @ARGV];
    }
    else {
        # TODO: Describe what is the purpose of "_h" here!!
        $self->{src} = [ map { !/_h$/ ? "$_.cpp" : () } @ARGV ];
        $self->{hed} = [ map { /^(.+)_h$/ ? "$1.hpp" : () } @ARGV ];
    }
}
sub parse_command_line_options {
    my ( $self ) = @_;
    Getopt::Long::GetOptions(
        "s" => \$self->{opt_s}, # headers the same as source files
        "h" => \$self->{opt_h}, # help message
        "o=s" => \$self->{opt_o}, # output filename
        "target=s" => \@{$self->{opt_t}}, # target name for dependency
        "dir=s" => \@{$self->{opt_d}}, # target dir for dependency
        "debug" => \$self->{opt_debug} # output the generated command
    ) or die "Failed to parse options\n";
    usage() if $self->{opt_h};
    usage("Bad arguments") if @ARGV==0;
    $self->{out} = $self->{opt_o} // $ARGV[0];
    check_that_name_does_not_contain_suffix( $self->{out} );
    $self->validate_target_and_dir_arrays();
}
sub run {
    my ( $self ) = @_;
    exec "./$self->{out}.out";
}
sub validate_target_and_dir_arrays {
    my ( $self ) = @_;
    my $target_len = scalar @{$self->{opt_t}};
    my $dir_len = scalar @{$self->{opt_d}};
    die "Number of targets is different from number of target dirs!\n"
        if $target_len != $dir_len;
    $_ = make_include_dir_name_absolute($_) for @{$self->{opt_d}};
}
#-----------------------------------------------
# Helper routines not dependent on $self
#-----------------------------------------------
sub check_that_name_does_not_contain_suffix {
    my ($name) = @_;
    if ($name =~ /\.(?:hpp|cpp)$/ ) {
        die "Argument $name not accepted: Arguments should be without extension\n";
    }
}
sub check_valid_header_file_extension {
    my ( $ext, $fn ) = @_;
    # Anchored at both ends so that only "hpp" and "h" are accepted
    warn "Unknown header file extension '$ext' for file '$fn'"
        if $ext !~ /^(?:hpp|h)$/;
}
sub make_include_dir_name_absolute {
    my ($path ) = @_;
    if ( !File::Spec->file_name_is_absolute( $path )) {
        warn "Warning: Converting include path '$path' to absolute path: \n";
        $path = Cwd::abs_path( $path );
        warn " $path\n";
    }
    return $path;
}
sub new {
    my ( $class, %args ) = @_;
    return bless \%args, $class;
}
sub remove_file_extension {
    my ( $fn ) = @_;
    if ( $fn =~ s/\.([^.]*)$//) {
        return ($fn, $1);
    }
    else {
        warn "Missing file extension for file '$fn'";
        return ($fn, undef);
    }
}
sub usage {
    say $_[0] if defined $_[0];
    say "usage: exe.pl [-h][--debug][-s][-o output_file][[-dir=directory -target=source]] <main source> <other sources>...";
    # TODO: Please add more explanation of the options here!!
    exit 0;
}

Question 8

Thank you very much for the reply. There are many idioms I did not even know Perl was capable of. It helped a lot.

Question 9

I think this is the kind of script where Perl is still better than Python: finding files on the system, calling system commands, regexes, etc.

score 3 · Accepted Answer · 2020-08-10 18:39:04Z

It is not so easy to get a clear picture of what the program is doing and why it is doing what it is doing. I think adding more documentation and comments would help, and also trying to code in a way that is easy to read. That means using function and variable names carefully to enhance readability. Avoid using compact/clever constructs if they are not easy to read, instead prefer more verbose code if it can improve readability and maintainability.

It is not clear why you did not want to use make or cmake to handle dependencies in a more efficient way. Another issue is the purpose of the command line switches. It would help to provide more documentation and background for their usage.

Automatic compilation of dependencies is usually done with make or cmake. But this requires you to write a Makefile or a CMakeLists.txt file that specify dependencies. Another option that avoids this is to use g++ -MMD -MP -MF as mentioned by @MartinYork in the comments. Also note that make and cmake has the added benefit of only recompiling the source files that have changed (i.e. those that are newer than the target file). This can markedly speed up compilation times for a large project. The Perl script on the other hand, will recompile every dependency into a single object each time whether some of the dependencies has changed or not.

On the other hand, an advantage of using the Perl script can be to avoid writing the Makefile (though I would recommend learning to write a Makefile or a CMakeLists.txt as it is the common way of doing it). The script also automatically runs the executable file after compilation, though it does not check if the compilation failed or not (if the compilation fails it does not make sense to run the executable). Another advantage can be that it does not generate multiple .o files (as make and cmake does to to enable recompilation only of changed files).

The Perl script as you named exe (I will rename it to exe.pl for clarity) can be used in many ways. From reading the source code, here is what I found:

Firstly, it can be used to compile specific files in the current directory (and then run the generated executable). For example:

$ exe.pl main student2

This will run g++ -o main.out main.cpp student2.cpp. The -o option can be used to specify another name for the exe (but the suffix will always be .out):

$ exe.pl -o prog main student2

runs g++ -o prog.out main.cpp student2.cpp. The -s option can be used to add headers to the compilation (though I could not see why this is useful, as headers are commonly included from within a .cpp file, and therefore should be included automatically by the g++ preprocessor):

$ exe.pl -s main student2

runs g++ -o main.exe main.cpp student2.cpp student2.hpp. Note that main.hpp is not added. The script considers the first filename on the command line (here main) as the "main" script, and the -s option will not add a header file for the main script. (Please consider clarify why this is done!) Headers can still be added without using the -s option by supplying names that matches "_h":

$ exe.pl main student2 student2_h

runs g++ -o main.exe main.cpp student2.cpp student2.hpp. Next, the the -i switch is used to handle dependencies. A dependency is a .cpp file in another directory, let's call it DD, from the main directory, DM, where the script is run from. If the dependency includes header files, the script checks if the header files are located in DM, if so they are excluded from the later compilation (please consider clarify why this is done).

For example, consider DM=/home/user/Desktop/bin/2. We see that DM is located in a parent directory DT=/home/user/Desktop which the script will use as the top of the source tree. Then if for example the dependency directory is DD=/home/user/Desktop/bin/1 and the dependency file is student.cpp which contains an include statement #include "grade.hpp", the script first checks if grade.hpp already exists in DM. If it does, it is excluded from the later g++ compilation command (please consider explaining why it is done). Next, the script tries to find student.cpp in DT or any of it sub directories recursivly using File:Find. If it finds the file (or more than one file) and it turns out that the file is in DD (and not some other directory in DT), it is assumed that there also exists a .cpp file with the same name in DD and the absolute path of this .cpp file is included in the later g++ compilation command. Also, the absolute path of DD is added as an include search path (-I option) to the g++ command.

I would recommend that the motivation behind the above logic (which is not at all clear to me) be explained carefully in the source code as comments.

To summarize, the above example corresponds to the following command line:

$ exe.pl -i target=student -i dir=/home/user/Desktop/bin/1 main student2

and the script will then produce the following g++ command:

g++ -I /home/user/Desktop/bin/1 -o main.exe /home/user/Desktop/bin/1/student.cpp main.cpp student2.cpp

Logical issues

The -i option does not work with more than one pair of (target, dir)

Currently, the -i option does not work for more than one target. For example, for the command line:

$ exe.pl -i target=student2 -i dir=/home/user/Desktop/bin/1 -i target=student3 -i dir=/home/user/Desktop/bin/3

GetOptions() will return for the hash %opt_i corresponding to the input parameters "i=s" => \%opt_i the following hash

%opt_i = (target => "student3", dir => "/home/user/Desktop/bin/3")

Notice that the first target student2 is missing, this is because both targets use the same hash key target. To fix this, you can try use arrays instead of hashes as parameters to GetOptions(). For example:

"target=s" => \@opt_t,
"dir=s" => \@opt_d,

Dependencies in sub directories are not checked for

As mentioned above, the code tries to exclude dependencies that are present in the main directory. But if a dependency is in a sub directory of that directory it will not find it. This is due to the usage of readdir() :

my %local = map { abs_path($_) => 1 } grep { !/^\./ } readdir CWD;

Here, readdir() will only return the files in CWD, not those in any sub directory below it.

Account for multiple versions of the same dependency file

Currently the code uses the file in the main directory if there are multiple versions of the same file name.

Let's say the dependency file /home/user/Desktop/bin/1/student.hpp contains:

#include "grade.hpp"

and there exists two versions of the corresponding .cpp file. One in the dependency directory /home/user/Desktop/bin/1/

/home/user/Desktop/bin/1/grade.cpp

and one in the CWD (where the script is run from)

/home/user/Desktop/bin/2/grade.cpp

What is the correct file? The script should at least give a warning.

Not checking recursivly for dependencies

Let's say student.hpp has a #include "grade.hpp" and grade.hpp has an include #include "calc.hpp". Then, it will not find and compile calc.cpp.

The `_h` command line trick does not work correctly

The following code is used to check for header files on the command line:

@hed = map { /_h/ and s/^(.+)_.+/1ドル/ and "$_.hpp" } @ARGV;

Notice that the first regex /_h/ matches any file with a _h anywhere in the filename, for example sah_handler. I think you need to add an end-of-string anchor to the regex: /_h$/.

Matching of #include files name in a dependency file

The code uses

my @file_dep = map { /#include "([^"]+)"/ ? abs_path(1ドル) : () } <MAIN>;

to extract the dependencies from a dependency file. Note that this requires that there is no space between # and include. But the assumption is not correct, it is in fact allowed to have spaces there, for example

# include "student.hpp"

is a legal C++ include statement.

Language related issues

Use strict, warnings

It is recommended to include use strict; use warnings at the top of your program. This will help you catch errors at an early stage.

Try to limit the use of global variables

Extensive use of global variables makes it harder to reason about a program. It is crucial that a program is easy to read (and understand) in order to maintain and extend it effectively (at a later point). It also makes it easier to track down bugs.

Note that if you add use strict at the top of the program, global variable needs to be declared similar to lexical variables. You declare a global variable with our.

Old style open() and opendir()

Modern perl uses the three-argument form of open and avoids global bareword filehandle names. Instead use lexical filehandles. So instead of this:

open MAIN, $file;

do this (assuming no autodie):

open (my $MAIN, '<', $file) or die "could not open $file: $!";

See Three-arg open() from the book "Modern Perl" for more information.

Shebang

See this blog for more information. Consider replacing #!/usr/bin/perl with #!/usr/bin/env perl Most systems have /usr/bin/env. It will also allow your script to run if you e.g.have multiple perls on your system. For example if you are using perlbrew.

Clever use of map()

The code uses map to produce very concise code, but such code can be difficult to understand and make it harder to maintain your code in the future.

Also note that returning false from the map {} code block like in

@src = map { !/_h/ and "$_.cpp"} @ARGV;

produces an empty string element in @src, if you want to not produce an element you must return an empty list () instead of false:

@src = map { !/_h/ ? "$_.cpp" : () } @ARGV;

Use good descriptive names for the subs.

The sub diff() is supposed to find dependency files that are not present in the current directory. But the name diff() does not clarify what the sub is doing. On the other hand, the following name might be too verbose:

find_abs_path_of_dep_files_that_does_not_exist_in_curdir()

but it is at least easier to understand.

Use positive return values with `exit`

The exit code from a linux process is usually an integer between zero (indicating success) and 125, see this answer for more information.

Check the return value of `system $command`

You should check the return value from the system() call for g++. The compilation may fail, and then the exit code will be nonzero. In that case, there is no point in running the executable after the compilation has finished.

Use `say` instead of `print`

You can avoid typing a final newline character for print statements by using say instead of print. The say function was introduced in perl 5.10, and is mad available by adding use v5.10 or use use feature qw(say) to the top of your script.

Example code

Here is an example of how you can write the code, following some of the principles I discussed above. I use an object oriented approach to avoid passing too many variables around in the parameter lists of the subs. It also avoids using global variables.

#! /usr/bin/env perl
package Main;
use feature qw(say);
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use Getopt::Long ();
use POSIX ();
{ # <--- Introduce scope so lexical variables do not "leak" into the subs below..
 my $self = Main->new( rundir => getcwd() );
 $self->parse_command_line_options();
 $self->parse_command_line_arguments();
 $self->find_dependencies();
 $self->compile();
 $self->run();
}
# ---------------------------------------
# Methods, alphabetically
# ---------------------------------------
sub check_run_cmd_result {
 my ( $self, $res ) = @_;
 my $signal = $res & 0x7F;
 if ( $res == -1 ) {
 die "Failed to execute command: $!";
 }
 elsif ( $signal ) {
 my $str;
 if ( $signal == POSIX::SIGINT ) {
 die "Aborted by user.";
 }
 else {
 die sprintf(
 "Command died with signal %d, %s coredump.",
 $signal, ( $res & 128 ) ? 'with' : 'without'
 );
 }
 }
 else {
 $res >>= 8;
 die "Compilation failed.\n" if $res != 0;
 }
}
sub compile {
 my ( $self ) = @_;
 my @command = ('g++');
 push @command, ("-I", $_) for @{$self->{inc}};
 push @command, "-o", "$self->{out}.out";
 push @command, @{$self->{hed}}, @{$self->{deps}}, @{$self->{src}};
 $self->debug( "@command" ) if $self->{opt_debug};
 my $res = system @command;
 $self->check_run_cmd_result( $res );
}
sub debug{
 my ( $self, $cmd ) = @_;
 say "final output:\n$cmd\n\nDependencies:";
 say for @{$self->{deps}};
 exit 1;
}
sub find_dependency {
 my ( $self, $target, $dir ) = @_;
 $target .= '.cpp';
 my $fn = File::Spec->catfile($dir, $target);
 open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
 my @include_args = map { /^#\s*include\s*"([^"]+)"/ ? $1 : () } <$fh>;
 close $fh;
 my @deps;
 for (@include_args) {
 my $fn = File::Spec->catfile( $dir, $_ );
 # TODO: In your program you checked if file also existed in
 # $self->{rundir}, and excluded it if so. Do you really need to check that?
 if (-e $fn) { # the file exists in target dir
 my ($temp_fn, $ext) = remove_file_extension( $fn );
 if (defined $ext) {
 check_valid_header_file_extension( $ext, $fn );
 push @deps, "$temp_fn.cpp";
 # TODO: Here you could call $self->find_dependency() recursively
 # on basename($temp_fn)
 }
 }
 }
 if (@deps) {
 push @{$self->{deps}}, @deps;
 push @{$self->{inc}}, $dir;
 }
}
sub find_dependencies {
 my ( $self ) = @_;
 $self->{deps} = [];
 $self->{inc} = [];
 my $targets = $self->{opt_t};
 my $dirs = $self->{opt_d};
 for my $i (0..$#$targets) {
 my $target = $targets->[$i];
 my $dir = $dirs->[$i];
 $self->find_dependency( $target, $dir );
 }
}
sub parse_command_line_arguments {
 my ( $self ) = @_;
 check_that_name_does_not_contain_suffix($_) for @ARGV;
 # TODO: Describe the purpose of -s option here!!
 if($self->{opt_s}){
 $self->{src} = [ map { "$_.cpp" } @ARGV ];
 # NOTE: exclude header file for main program name ($self->{out})
 # So if main program name is "main", we include main.cpp, but not main.hpp
 # TODO: describe why it is excluded
 $self->{hed} = [ map { !/^$self->{out}$/ ? "$_.hpp" : () } @ARGV];
 }
 else {
 # TODO: Describe what is the purpose of "_h" here!!
 $self->{src} = [ map { !/_h$/ ? "$_.cpp" : () } @ARGV ];
 $self->{hed} = [ map { /^(.+)_h$/ ? "$1.hpp" : () } @ARGV ];
 }
}
sub parse_command_line_options {
 my ( $self ) = @_;
 Getopt::Long::GetOptions(
 "s" => \$self->{opt_s}, # headers the same as source files
 "h" => \$self->{opt_h}, # help message
 "o=s" => \$self->{opt_o}, # output filename
 "target=s" => \@{$self->{opt_t}}, # target name for dependency
 "dir=s" => \@{$self->{opt_d}}, # target dir for dependency
 "debug" => \$self->{opt_debug} # output the generated command
 ) or die "Failed to parse options\n";
 usage() if $self->{opt_h};
 usage("Bad arguments") if @ARGV==0;
 $self->{out} = $self->{opt_o} // $ARGV[0];
 check_that_name_does_not_contain_suffix( $self->{out} );
 $self->validate_target_and_dir_arrays();
}
sub run {
 my ( $self ) = @_;
 exec "./$self->{out}.out";
}
sub validate_target_and_dir_arrays {
 my ( $self ) = @_;
 my $target_len = scalar @{$self->{opt_t}};
 my $dir_len = scalar @{$self->{opt_d}};
 die "Number of targets is different from number of target dirs!\n"
 if $target_len != $dir_len;
 $_ = make_include_dir_name_absolute($_) for @{$self->{opt_d}};
}
#-----------------------------------------------
# Helper routines not dependent on $self
#-----------------------------------------------
sub check_that_name_does_not_contain_suffix {
 my ($name) = @_;
 if ($name =~ /\.(?:hpp|cpp)$/ ) {
 die "Argument $name not accepted: Arguments should be without extension\n";
 }
}
sub check_valid_header_file_extension {
 my ( $ext, $fn ) = @_;
 warn "Unknown header file extension '$ext' for file '$fn'"
 if $ext !~ /^(?:hpp|h)$/;
}
sub make_include_dir_name_absolute {
 my ($path ) = @_;
 if ( !File::Spec->file_name_is_absolute( $path )) {
 warn "Warning: Converting include path '$path' to absolute path: \n";
 $path = Cwd::abs_path( $path );
 warn " $path\n";
 }
 return $path;
}
sub new {
 my ( $class, %args ) = @_;
 return bless \%args, $class;
}
sub remove_file_extension {
 my ( $fn ) = @_;
 if ( $fn =~ s/\.([^.]*)$//) {
 return ($fn, $1);
 }
 else {
 warn "Missing file extension for file '$fn'";
 return ($fn, undef);
 }
}
sub usage {
 say $_[0] if defined $_[0];
 say "usage: exe.pl [-h][--debug][-s][-o output_file][[-dir=directory -target=source]] <main source> <other sources>...";
 # TODO: Please add more explanation of the options here!!
 exit 0;
}

Thank you very much for the reply. There are many idioms here that I did not even know Perl was capable of. It helped a lot.
I think this is the kind of script where Perl is still better than Python: finding files on the system, calling system commands, regexes, etc.
