Friday, January 9, 2009
Questions Regarding Perl References In Linux And Unix
Hey There,
Today's post is a follow-up to our post, from earlier this week, regarding understanding Perl variable references. The feedback was generally positive, but one particular question was asked by quite a few people. And that question was:
If there's nothing new under the sun, then how can there be more things in heaven and earth than are dreamt of in anyone's philosophy?
To be quite honest, I have no idea. I quit pondering the meaning of "it all" when I was introduced to nihilism at the age of 18, which, consequently, fit in perfectly with my intensive practical application of Onanism. My philosophy was both fun "and" meaningless ;)
But, that wasn't the real question. I just went off on a tangent there for a moment, as it's my wont to do. Apologies for any trauma caused by the high-brow self-abuse joke. It has nothing to do with Linux, Unix or Perl unless you, like Sammy Hagar, consider it all mental masturbation ;)
The real question on people's minds was this:
What purpose, at all, do references in Perl serve, and why would I ever want to complicate things by using them?
This is a valid question; especially if you aren't hard-core into Perl or only use it as a work aid. Quite frankly, I, myself, rarely have any use for references. But here's a quick rundown of a few reasons why it's good to know them and understand how they work (Note that, in Perl, references and pointers - as they're more commonly known in programming languages like C - are terms that often get used interchangeably. For our purposes, they both mean the same thing).
1. For folks who've been using Perl since version 4 and earlier, passing anything more complex than a scalar variable to a subroutine was a real chore. Everything you pass to a subroutine gets flattened into the single list @_, and Perl 4 had no references at all, so keeping an array or hash intact on its way into a subroutine meant resorting to typeglob tricks. Perl 5 introduced references and, since the Perl reference to an array or hash was (and is) always a scalar variable, it became simple to hand a whole array or hash to your subroutine as a single argument. I think this is the one issue that is really being addressed when folks question the usefulness of Perl references these days. As the scriptlet below shows, you can pass scalars and arrays directly to your subroutines as arguments, without having to make use of scalar references to them (more on that in point 2). Check out the following for a demonstration of how Perl handles these types of arguments (a few of you actually already have this in your mail, as I put it together to demonstrate this principle earlier this week in response to several emails - I aim to please :):
host # cat shell.pl
#!/usr/bin/perl
$bfile = "what the heck";
@bfile = qw(what the heck);
scalarsub($bfile);
arraysub(@bfile);
sub scalarsub {
    my $file=shift;
    print "SCALAR: F $file\n";
}
sub arraysub {
    my @file=@_;
    foreach $item (@file) {
        print "ARRAY: F $item\n";
    }
}
--- test run of script ---------
host # ./shell.pl
SCALAR: F what the heck
ARRAY: F what
ARRAY: F the
ARRAY: F heck
As you can see, above, today's Perl doesn't require you to reference non-scalar variables if you want to pass them as arguments to a subroutine.
2. References still do have a place in Perl, even when it comes to subroutines. If it sounds like I'm contradicting what I just wrote, please allow me to dig myself further into a hole... I mean, explain ;)
Consider, if you will, the following situation in which making use of Perl references would actually save you considerable time and, probably, a gray hair or two. If you have a simple subroutine (like in our scriptlet above), it is very easy to pass it several scalar arguments (assuming no restrictions on the number of arguments you can pass) and, logically, any number, and/or combination, of different types of variables. But what happens when you call your subroutine with arguments like this?
@array1 = qw(1 2 3 4 5);
@array2 = qw(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19);
@array3 = qw(1 2 3);
mynewsubroutine(@array1, @array2, @array3);
@_ = 1 2 3 4 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 1 2 3; <-- within your subroutine.
my (@array1,@array2,@array3) = @_;
Now, while there are more than several clever ways you can get around parsing this input (which is one of the great things about Perl), it can be difficult to process this sort of input within the subroutine; especially if the arrays, themselves, are of variable and/or undetermined size (perhaps @array1 contains 5 scalar values, @array2 contains 19 and @array3 contains an additional 3 - we'll, again, stay away from passing arrays of hashes of arrays for now ;). It can take a lot of work on the programmer's part to craft the subroutine in such a way that it handles all of the input and keeps it separated, correctly, into the three groups. The totality of the arguments passed to a subroutine lands in @_ and, if three separate arrays are passed, @_ will contain the members of all 3 of them (27 scalar values). You can't pick the arrays back out by index ($_[0], etc), since the input to your subroutine is, in effect, one large array, rather than a collection of 3 smaller arrays. In effect, and in reality, the assignment above would put all of the members of all 3 arrays into @array1 and leave @array2 and @array3 empty.
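To see the flattening for yourself, here is a minimal sketch. The subroutine name count_flattened is made up for illustration, and the three arrays simply mirror the sizes used above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sub for illustration: tries to receive three arrays directly.
sub count_flattened {
    my (@first, @second, @third) = @_;   # @first greedily takes everything in @_
    return (scalar(@first), scalar(@second), scalar(@third));
}

my @array1 = (1 .. 5);
my @array2 = (1 .. 19);
my @array3 = (1 .. 3);

my @sizes = count_flattened(@array1, @array2, @array3);
print "Sizes inside the sub: @sizes\n";   # prints "Sizes inside the sub: 27 0 0"
```

The first array in the list assignment soaks up all 27 values, which is exactly the headache described above.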
Hopefully, I'm not drifting too far off of the reservation, here, as I fear I'm wandering into the land of overly convoluted explanation again ;) The advantage to you, the programmer, in this instance, would be that you could use Perl references to pass scalar references to each array to your subroutine. As difficult as it would be to manage the above three arrays using straight passing to your subroutine (you would have to determine their size before passing them in and then keep track of that by passing even more information, not to mention re-allocating those members back into the subroutine's local arrays), it would be a bit easier to pass your subroutine three references to your three arrays, like so:
mynewsubroutine(\@array1, \@array2, \@array3);
Now, the reason this makes life easier for you is that you're only passing your subroutine 3 scalar variables, rather than an indeterminate amount of any other kind. Inside the subroutine, you can easily work with the variables using the dereferencing techniques we went over in our understanding Perl references post. For instance, you could do "foreach" on all 3 arrays separately, and process the internal array variables that way, effectively making the size of any of those arrays inconsequential (less work for you), like so:
my $array1 = shift;
foreach $thingy (@{$array1}) {
    print "Thingy $thingy\n";
}
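Putting the call and the subroutine together, a complete version might look something like the sketch below. The name sizes_by_reference is hypothetical (it stands in for mynewsubroutine), and returning the sizes is just one easy way to show that the three arrays stay separate:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sub: takes three array references and returns each array's size.
sub sizes_by_reference {
    my ($ref1, $ref2, $ref3) = @_;       # exactly three scalars arrive in @_
    my @sizes;
    foreach my $ref ($ref1, $ref2, $ref3) {
        push @sizes, scalar @{$ref};     # dereference, then count the members
    }
    return @sizes;
}

my @array1 = (1 .. 5);
my @array2 = (1 .. 19);
my @array3 = (1 .. 3);

my @sizes = sizes_by_reference(\@array1, \@array2, \@array3);
print "Sizes inside the sub: @sizes\n";  # prints "Sizes inside the sub: 5 19 3"
```

Since each reference is a single scalar, the subroutine never needs to know in advance how big any of the arrays are.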
I'm oversimplifying to a certain degree, but this is getting out of hand just swimmingly in spite of my best efforts ;)
3. Although I promised I wouldn't go here, I must (but I'll keep it theoretical so we can dispense with it post-haste). If you ever do any super-complex programming, like building matrices, doing crazy manipulation of filehandles (remember that "any" Perl data type can be referenced) or things along the nature of manipulating arrays of hashes of multiple nested arrays of hashes, using references can make it a lot simpler for you to keep your head from exploding (which is a lousy way to end the day ;)
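To keep it from being entirely theoretical, here is a small sketch of one such nested structure: an array of hashes, each holding an array. The host names, port numbers and the describe_hosts helper are all invented for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample data: an array of hash references, each holding an array reference.
my @hosts = (
    { name => "alpha", ports => [22, 80] },
    { name => "beta",  ports => [22, 443, 8080] },
);

# Hypothetical helper: build one summary line per record.
sub describe_hosts {
    my ($records) = @_;                  # a single scalar: a reference to the array
    my @lines;
    foreach my $host (@{$records}) {     # each $host is a hash reference
        my $count = scalar @{ $host->{ports} };
        push @lines, "$host->{name} has $count ports";
    }
    return @lines;
}

print "$_\n" for describe_hosts(\@hosts);
```

Without references, there would be no way to nest the port lists inside the per-host records at all, since arrays and hashes can only hold scalars.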
Hopefully, these answers have been somewhat helpful, and/or illustrative regarding the question at hand (why use references at all, and are they even really necessary?) If you're interested in reading more on the subject, check out Chapter 18 of bjnet.edu.cn's introduction to Perl . It's written very well and is probably simpler to follow than my meandering over-explanations ;) Plus, all the other chapters are available online, so you could download them all and have yourself a very nice free Perl reference in however long it takes you to download the HTML.
Cheers to all, and thank you for your response to the post that spawned this one. There's nothing quite so rewarding as knowing that I'm actually writing something at least a few people are really interested in reading :)
, Mike
Please note that this blog accepts comments via email only. See our Mission And Policy Statement for further details.
Wednesday, January 7, 2009
Understanding Perl Variable References On Linux And Unix.
Hey there,
Today we're going to take a look at a part of Perl that a lot of folks shy away from; mostly because (from my experience) they feel it's too abstract a notion or too complicated to understand. For today, I'm referring to Perl references ;) And here's the thing; nothing could be farther from the truth. It's just about as simple as the sentence preceding the last. When I referred to Perl references I was, for the most part, laying the foundation for easily understanding the entire concept. If you attack the problem semantically, and try not to think of it as a bunch of backslashes and arrows and symbols, it makes perfect sense :) If I'm wrong, and this post leaves you reeling in confusion and pain, please let me know so I'll stop being so cavalier with my prose :)
So, let's take a look at Perl references and how they can be used, most basically, in a step-by-step fashion; from the simplest of beginnings to the not-so-complex middle (We'll leave references to hashes of arrays of references to other hashes for some other day ;)
For all of these examples, we'll use command line examples, so you can cut and paste them to try them out, rather than pretend that we're inside a Perl script.
1. A simple way to look at Perl References: The basis of any Perl reference is the variable, or value that you're referring to. At the most basic level, any variable assignment is a reference. For instance, look at these basic statements:
host # perl -e '$a = "bob";'
host # perl -e '@a = qw(bob joe);'
host # perl -e '%a = (bob => joe);'
These are all just simple variable assignments (with $ indicating a scalar variable, @ representing an array and % representing a hash). However, you can think of them as references (which will make the transition to understanding textbook references much smoother). The variable $a, for example, has an assigned value of the string "bob." So, if you look at that in a different way, the $a variable refers to the string "bob," or (another way) the variable $a is NOT the string "bob," but a reference to that string (or scalar) value.
BTW, if this part of the post is beyond where you're at with Perl, take a look back at some of our older posts on simple arithmetic and simple variables in Perl that deal with these more basic principles. There should be enough links on those two pages to connect you to all the other ones on this site. If not, the blog search feature (although it's very generous in its interpretations -- search for the letter "a" to see what I mean ;) should help you find what you need.
2. Looking at actual Perl References: A textbook Perl reference is the same thing as we discussed in point 1, except taken up (or out) one meta-level. So instead of having the relationship of reference ($a) and referent ("bob") that we had before, we're going to assign one scalar variable a reference to any of the three variables from before, rather than from the variables directly to the values. So, to reference any of these three we could do the following (note that for this basic lesson, the Perl reference will always be a scalar since, at its core, it always is; even if that scalar value is a part of a larger array or hash). The symbol that denotes that you're setting your variable's value to a reference is the backslash (\) character:
host # perl -e '$a_ref1 = \$a;'
host # perl -e '$a_ref2 = \@a;'
host # perl -e '$a_ref3 = \%a;'
So now we have three very simple Perl references. $a_ref1 has the value of a reference to the $a scalar variable, $a_ref2 has the value of a reference to the @a array and $a_ref3 has the value of a reference to the %a hash. (Note that you can have a Perl variable refer to itself, although the uses for this are somewhat limited and generally not necessary for basic Perl scripting. Ex: $a_ref4 = \$a_ref4 <-- $a_ref4 has the value of a reference to itself.)
3. Extracting values from Perl References: This is just as easy as extracting values from regular variables, except, as before, you have to think one more hop. Whereas, with a regular variable, you would extract the value of that variable directly, with a Perl reference, you need to extract the value of the variable that is being referenced by your reference. It sounds worse than it is ;) For instance, if we accept that the scalar variable $a is equal to "bob," we know that we can extract the value of $a by doing the following (as before):
host # perl -e '$a = "bob";print "$a\n";'
Whereas, if we create a reference (another scalar variable) to the variable $a, and call that $a_ref1, we need to extract the value from the variable that we are referencing. A simple and comfortable approach to extracting this value would be the following:
host # perl -e '$a = "bob"; $a_ref1 = \$a; print "${$a_ref1}\n";'
In this instance we've simply peeled the onion, so to speak (insert your favorite peelable vegetable or fruit here ;). In order to extract the value of $a from the Perl reference variable $a_ref1, we just stripped it layer by layer. To deconstruct the print statement above, we'll go backward from the statement we used to print the value of the $a_ref1 Perl reference:
a. ${$a_ref1} is what we call to print the value of the variable $a.
b. ${$a_ref1} is actually equal to ${\$a} since $a_ref1's value is a reference to $a (as denoted by "$a_ref1 = \$a;")
c. ${\$a} is equal to ${a} since we're dealing directly with the referent. $a_ref1 (the variable holding the reference) actually points to a hex address in memory (printed alongside the type of data being referenced). You can see the difference in the output of the two commands below:
host # perl -e '$a = "bob"; $a_ref1 = \$a; print "$a_ref1\n";'
SCALAR(0x2e250) <-- This is the hexadecimal memory space that the $a_ref1 reference refers to. Your results may vary :)
host # perl -e '$a = "bob"; $a_ref1 = \$a; print "${\$a}\n";'
bob <-- This is the value of the referenced variable $a, which we know (from before) is equal to "bob" ($a = "bob" from what seems like so far up the page ;)
d. And, even though we don't need to tell you this, just for completeness' sake: ${a} (or $a - same thing) equals "bob".
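The same peeling can be tried in a short script. This is only a sketch: the variable names $name and $name_ref are stand-ins for the $a and $a_ref1 used above, and the last two lines add one extra point, namely that assigning through the reference changes the original variable:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name     = "bob";
my $name_ref = \$name;          # a reference to the scalar $name

print "${$name_ref}\n";         # full form, prints "bob"
print "$$name_ref\n";           # shorthand for the same dereference, prints "bob"

${$name_ref} = "joe";           # assigning through the reference...
print "$name\n";                # ...changes the original variable: prints "joe"
```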
4. Extracting values from Perl References that aren't scalar: Finally, some good news :) The principles above apply to all sorts of variable dereferencing. So, for instance, if you wanted to extract the value of the array reference $a_ref2, you could get it by doing:
host # perl -e '@a = qw(bob joe); $a_ref2 = \@a; print "@{$a_ref2}\n";'
bob joe <-- The whole thing
host # perl -e '@a = qw(bob joe); $a_ref2 = \@a; print "${$a_ref2}[0]\n";'
bob <-- array index 0 (a single element is a scalar, so it gets the $ prefix; $a_ref2->[0] means the same thing)
host # perl -e '@a = qw(bob joe); $a_ref2 = \@a; print "${$a_ref2}[1]\n";'
joe <-- array index 1
and the same basic principle applies to hashes (%{$a_ref3} would get you all those values). Basically, all you need to do to extract the value of a one-level-deep Perl Reference is to wrap the reference-variable in curly brackets and preface that with the appropriate symbol ($ for scalar, @ for array, % for hash, etc).
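For the hash case, a quick sketch might look like this. The %pairs hash and its contents are made up; the arrow form on the last line is standard Perl shorthand for reaching a single element through a reference:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %pairs     = (bob => "joe", amy => "sue");
my $pairs_ref = \%pairs;

# Whole hash: wrap the reference in curly brackets and prefix it with %.
foreach my $key (sort keys %{$pairs_ref}) {
    print "$key => ${$pairs_ref}{$key}\n";   # single value, curly-bracket form
}

# The arrow form reads the same element with less punctuation.
print "bob maps to $pairs_ref->{bob}\n";
```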
5. What to do if you have no idea what kind of Perl Reference you're dealing with: Fortunately, there exists - in the very heart of Perl - a function to deal with just this sort of predicament. It's called, for some strange reason, "ref" ;) On many systems, doing something like this:
host # perl -e '@a = qw(bob joe); $ref_type = ref(\@a); print "$ref_type\n";'
ARRAY
is all you need to do to get back the type of reference you're dealing with (Obviously, we knew it was an array since we're doing these self-contained command line scripts, but you could use the ref function against any Perl Reference and get its type back). One thing to note about the ref function is that it doesn't always behave the way you might expect. For instance, if you call the function ref on a straight-up scalar, array or hash variable, it returns an empty string. This is normal, since those straight-up variables are "not" references. However, sometimes, even when you believe you are dealing with a reference, you won't get any feedback on your command line. This isn't to say that ref doesn't know what kind of reference you're working with; just that it's not in the mood to tell you ;)
You can get around this little hassle pretty simply by just writing a simple type-check. So if you run the following:
host # perl -e '%a = (bob => joe); $ref_type = ref(\%a); print "$ref_type\n";'
and you don't get the return of
HASH
as you would expect to, you can figure out what the return from ref was anyway. The two most basic ways to do this range from cowboy to academic ;)
a. Cowboy: Just print the variable that points to the reference, like we did above, to get the hexadecimal address (instead of the value of the referenced variable), since this is accompanied by the reference type:
host # perl -e '%a = (bob => joe); $ref_type = \%a; print "$ref_type\n";'
HASH(0x2e26c)
b. Academic: Use a simple if-condition to test and see what kind of output ref returns
host # perl -e '%a = (bob => joe); $ref_type = ref(\%a); if ($ref_type eq "HASH") {print "HaSH FOUND!\n"};'
HaSH FOUND!
Of course, you could check just to see if the value is an empty string, since, if it is, you're not dealing with a reference. 99% of the time, Perl will do the right thing and tell you what the ref function returns. For all I know, the 1% of the time it doesn't work for me is because I completely screwed up ;)
Perl also deals with references to a lot of different file and object types to which these same basic principles apply. So, if you're dealing with a pipe or another type of file or variable, you can still use the principles above to help you out. And, for simplicity's sake (until you get used to being utterly confused while in a state of mostly-understanding ;), for every level of referencing that gets added on, you just need to dereference that many times backward (as shown above) to make your way back to the original value of the original variable(s)!
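The academic approach stretches naturally into a little dispatcher. What follows is a sketch only; the describe subroutine and its messages are invented for the example, and it leans on the fact that ref returns an empty string for anything that isn't a reference:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: describe whatever it is handed.
sub describe {
    my ($thing) = @_;
    my $type = ref($thing);              # "" (empty string) for a non-reference
    return "not a reference"                                   if $type eq "";
    return "scalar ref to ${$thing}"                           if $type eq "SCALAR";
    return "array ref with " . scalar(@{$thing}) . " items"    if $type eq "ARRAY";
    return "hash ref with " . scalar(keys %{$thing}) . " keys" if $type eq "HASH";
    return "some other reference ($type)";
}

my $name = "bob";
print describe($_), "\n" for ("plain string", \$name, [1, 2, 3], { bob => "joe" });
```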
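As a sketch of that "one dereference per level" rule, here is a reference to a reference. The variable names are made up for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $value   = "bob";
my $ref     = \$value;          # one level of referencing
my $ref_ref = \$ref;            # two levels: a reference to a reference

print ref($ref_ref), "\n";      # prints "REF", since it points at another reference
print ${${$ref_ref}}, "\n";     # two dereferences to get back to "bob"
```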
And that wraps up that :)
Hope that helps shed some light on basic Perl References and, again, I'd love to hear what you think about this post; especially with regards to how you felt about it (Was it too simplistic? Too Complicated? Hard to understand? Easy Peasy? ;)
Cheers,
, Mike
Sunday, May 4, 2008
Reversing All Lines In A File On Linux Or Unix Using Perl
Good evening-day-morning-afternoon :)
In response to some positive feedback, and some additional questions, prompted by our earlier post on using Perl to mirror lines in a file on Linux or Unix, today we're dispensing with yet another Perl script to do almost the same thing, but with an extra twist. While our original mirror file script reversed each line, this one will attempt to do the same thing while also reversing the order of the lines. If you have a nervous condition you should probably quit reading this, as it is only going to get more intense ;)
The good news is that I already know it works. The bad news is available at your local newsstand or in that meeting you should really try to blow off ;) Sorry... but only because the bad jokes are intentional.
So, when we're done crunching a file with today's script it will come out with all the lines reading from right to left and with the first line being the last, the second line being the second to last, and so on until the last line, which will be the first. End of story... or is it? ;)
Again, we're going to take a quick look at another useful function in this script. Since the rest of them are the same as in the script that prompted this one (split, join, undef and push), I should refer you to that post on mirroring file lines so that we don't waste too much space with duplicate content. It's interesting to look at the two side-by-side to really see the difference.
In today's post we'll check out the use of this one function (Note that, this time, we'll assume that the array @array consists of no members at all):
unshift - This function is the mirror image of the "push" function we used in our last post: where "push" adds a variable on to the "right side" (the end) of an array, "unshift" adds it on to the "left side" (the front). The name isn't as intuitive as "push," but, if it helps, when you "shift" a variable from an array, you're pulling it out of the "left side." So, naturally, "unshifting" is adding to the "left side." ...No matter how I explain it, it will never make sense. It's just one of those words you have to accept on faith. Like "defenestration." <--- Apologies for the cheap-shot at Microsoft. That word is officially defined as "the act of throwing someone or something out of a window." It's only a few small steps to the jab I was going for ;)
Ex:
unshift(@array, "a"); <--- @array now equals "a"
unshift(@array, "b"); <--- @array now equals "b" "a"
unshift(@array, "c"); <--- @array now equals "c" "b" "a"
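If you'd like to watch the two functions side by side, here is a minimal sketch. The array names @pushed and @unshifted are made up for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my (@pushed, @unshifted);
foreach my $letter (qw(a b c)) {
    push    @pushed,    $letter;   # appends to the end (right side)
    unshift @unshifted, $letter;   # prepends to the front (left side)
}
print "push:    @pushed\n";        # prints "push:    a b c"
print "unshift: @unshifted\n";     # prints "unshift: c b a"
```

That front-loading is what lets the script put the last line of the file first.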
Again, enjoy, and I hope this helps you out with whatever you're doing that it might help you out with ;)
Cheers,
SAMPLE RUN:
host # cat words|head -3;cat words|tail -3 <--- The beginning and end of the file we're going to "reverse"
Aarhus
Aaron
Ababa
Zulu
Zulus
Zurich
host # ./reverse.pl words
host # ls
. .. reverse.pl words words.reverse <--- Our newly created file is called "words.reverse"
host # cat words.reverse|head -3;cat words.reverse|tail -3 <--- And, once again (since I tested this a few times already), the reversal of the file seems to have worked!
hciruZ
suluZ
uluZ
ababA
noraA
suhraA
host # wc -l words* <--- Just double checking here to make sure that the number of lines in our original file and the reverse file are exactly the same, which they should be.
45378 words
45378 words.reverse
90756 total
Creative Commons License
This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#!/usr/bin/perl
# reverse.pl - reverse each line, and its position in a file
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#
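if ( $#ARGV != 0 ) {
    print "Usage: $0 text_file\n";
    exit(1);
}
$text_file=$ARGV[0];
# Read the whole file into memory, one array member per line
open(TXT, "<$text_file") or die "Cannot open $text_file: $!\n";
@txt = <TXT>;
close(TXT);
foreach $ln (@txt) {
    chomp($ln);
    @ln = split(//, $ln);
    $ln_len = @ln;
    undef(@rev);
    # Walk the line from its last character to its first
    while ( $ln_len > 0 ) {
        push(@rev, $ln[${ln_len}-1]);
        $ln_len--;
    }
    $rev = join("", @rev);
    # unshift puts each new line in front of the earlier ones
    unshift(@reverse, "$rev");
}
open(RTXT, ">${text_file}.reverse") or die "Cannot write ${text_file}.reverse: $!\n";
foreach $backward (@reverse) {
    print RTXT "$backward\n";
}
close(RTXT);
exit(0);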
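For comparison, the same result can be sketched far more compactly with Perl's built-in reverse function, which reverses a string in scalar context and a list in list context. This is an alternative to the script above, not a replacement for the lesson in it, and the reverse_lines helper is a made-up name:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes a list of lines (no trailing newlines) and
# returns them mirrored and in reverse order. Call it in list context.
sub reverse_lines {
    my @lines = @_;
    # Inner reverse: scalar context reverses the characters of each line.
    # Outer reverse: list context reverses the order of the lines.
    return reverse map { scalar reverse $_ } @lines;
}

my @out = reverse_lines("Aarhus", "Aaron", "Ababa");
print "$_\n" for @out;
```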
, Mike
Saturday, May 3, 2008
Perl Script To Mirror Lines In A File On Linux Or Unix
Good afternoon-morning-day-evening :)
In much the same vein as our previously posted script to do weak encryption with Octal Dump , today we're throwing out another Perl script to do something possibly equally worthless, but still somewhat entertaining ;) Since it's written in Perl, and uses that language's specific constructs (i.e. No "system" calls), it should run equally well on Linux or Unix. The logic in the script is simple enough that it should probably work going back a few major versions.
As much as this may not seem to be of any use to you now, knowing how to mirror (or reverse, since many people, including myself, don't have any mirror-friendly fonts ;) each line in a file can be beneficial. Although no one in the office is likely to come up to you and say "Mr. sysadmin, sir (of course, I'm exaggerating. People are much more formal than that normally ;), Can you print every single line of this report reading from right-to-left instead of the standard left-to-right?", it's even more unlikely that, if this were to happen, any business/war jargon would be used. I can't imagine something like this ever being "mission critical" or required for any sort of "code red" situation. If it ever is, you'll be really glad you know how to do this...
But, at a fundamental level, it's good to understand the basic functionality of Perl and how to deal with scalar variables (or string variables), arrays and how to muck around with them (or make them work for you ;) In this particular script, the building blocks of some very useful functions of Perl are employed (to a dubious end, I'll admit) and, hopefully, presented in an easy to understand fashion. Over time, we'll dig into every little thing there is to know (If that can be cranked out in one lifetime ;) with regards to each function. In the meantime, check out the use of these four functions (Note that, for all, we'll assume that the variable $variable is defined as "abcd" and the array @array consists of four members: a, b, c and d):
split - This function will take a scalar variable and "split" it, on a delimiter, into an array:
Ex: @array = split(//, $variable); <--- Now @array has four members (a, b, c and d) that it got from $variable
join - This function will take an array and "join" it into one scalar variable:
Ex: $variable = join("", @array); <--- Now $variable equals "abcd" since it contains all 4 members of @array joined together by an empty delimiter "" (that is, nothing at all gets placed between the members)
undef - This function will "undefine" our array, in this case. It can be used on scalar, array and hash variables as well.
Ex: undef(@array); <--- Now @array is not just empty, it isn't even defined. It may as well not exist.
push - This function will "push" a variable (scalar, array, hash, or references to same, and more -- getting way off-topic ;) onto the right side (the end) of an array. Subsequent pushes of extra variables are added after it, so if you push three variables into an array (a, b and c, for instance) in one order, they'll end up in the array in that same order. In the script below, we get the reversal by walking each line from its last character to its first and pushing as we go. For this example, assume @array starts out empty.
Ex:
push(@array, "a"); <--- @array now equals "a"
push(@array, "b"); <--- @array now equals "a" "b"
push(@array, "c"); <--- @array now equals "a" "b" "c"
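Strung together, split, push and join are enough to mirror a single string. Here is a minimal sketch of that idea, with mirror_string being a made-up name for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: mirror one string using split, push and join.
sub mirror_string {
    my ($text) = @_;
    my @chars = split(//, $text);      # "abcd" becomes ("a", "b", "c", "d")
    my @rev;
    for (my $i = $#chars; $i >= 0; $i--) {
        push(@rev, $chars[$i]);        # push appends, so we walk backward
    }
    return join("", @rev);             # empty-string delimiter glues them together
}

print mirror_string("abcd"), "\n";     # prints "dcba"
```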
In any event, enjoy, and I hope this helps you out if you're beginning to learn the Perl scripting language!
Best wishes,
SAMPLE RUN:
host # cat words|head -3;cat words|tail -3 <--- The beginning and end of the file we're going to "mirror"
Aarhus
Aaron
Ababa
Zulu
Zulus
Zurich
host # ./mirror.pl words
host # ls
. .. mirror.pl words words.mirror <--- Our newly created file is called "words.mirror"
host # cat words.mirror|head -3;cat words.mirror|tail -3 <--- And (good deal), the mirroring seems to have worked!
suhraA
noraA
ababA
uluZ
suluZ
hciruZ
host # wc -l words* <--- Just double checking here to make sure that the number of lines in our original file and the mirror file are exactly the same, which they should be.
45378 words
45378 words.mirror
90756 total
Creative Commons License
This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#!/usr/bin/perl
#
# mirror.pl - print an entire file backward, line by line
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#
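if ( $#ARGV != 0 ) {
    print "Usage: $0 text_file\n";
    exit(1);
}
$text_file=$ARGV[0];
# Read the whole file into memory, one array member per line
open(TXT, "<$text_file") or die "Cannot open $text_file: $!\n";
@txt = <TXT>;
close(TXT);
open(RTXT, ">${text_file}.mirror") or die "Cannot write ${text_file}.mirror: $!\n";
foreach $ln (@txt) {
    chomp($ln);
    @ln = split(//, $ln);
    $ln_len = @ln;
    undef(@rev);
    # Walk the line from its last character to its first
    while ( $ln_len > 0 ) {
        push(@rev, $ln[${ln_len}-1]);
        $ln_len--;
    }
    $rev = join("", @rev);
    print RTXT "$rev\n";
}
close(RTXT);
exit(0);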
, Mike