Given a string, find the first non-repeated character in it.
E.g., "yellow" should return "y"
There are several solutions for this in other languages, but I haven't seen one written in Perl.
Can this be done in a more Perl-ish way or generally in an even shorter way?
use strict;
use warnings "all";
while (<DATA>)
{
    while (m,(.),g)
    {
        my $c = $1;
        if (s,$1,,g < 2)
        {
            print "$c\n";
            last;
        }
    }
}
__DATA__
yellow
tooth
2 Answers
Don't reinvent the wheel. The List::MoreUtils package has a function, singleton, that does the job:
#!/usr/bin/perl
use Modern::Perl;
use List::MoreUtils qw(singleton);
while (<DATA>) {
    chomp; # don't forget this; it removes the trailing newline
    # split breaks the string into individual characters
    # singleton keeps the characters that appear only once
    # ($first) receives the first character that appears only once
    my ($first) = singleton split //, $_;
    say $first;
}
__DATA__
yellow
tooth
Output:
y
h
- This is a nice solution but what if you work in an environment where you can't install extra dependencies? – yuri, May 24, 2017 at 10:31
- @yuri: Inspect the module and copy the code of the function. – Toto, May 24, 2017 at 10:33
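For the no-extra-dependencies case raised in the comments, here is a minimal core-Perl sketch. It is not the List::MoreUtils implementation, just a plain hash count; the sub name first_unique_char is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from List::MoreUtils): returns the first
# character that occurs exactly once, or nothing if every character repeats.
sub first_unique_char {
    my ($string) = @_;
    my @chars = split //, $string;

    # First pass: count how often each character occurs.
    my %count;
    $count{$_}++ for @chars;

    # Second pass: return the first character seen exactly once.
    for my $char (@chars) {
        return $char if $count{$char} == 1;
    }
    return;
}

for my $word (qw(yellow tooth)) {
    my $first = first_unique_char($word);
    print defined $first ? "$first\n" : "(none)\n";
}
```

This makes two passes over the characters instead of rewriting the string, so the input is left unmodified.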
The previous answer is great because it offers an alternative solution that leverages existing code. The advantages of using CPAN modules are that the code tends to be:
- well-documented
- well-tested
- of known coverage
But, there is also value in reviewing the code you posted.
Overview
The code is already quite "Perl-ish". I like how you break out of the loop early, as soon as you find no repeated characters; that is efficient.
Fatal
It is great that you used strict and warnings.
My preference is to use a very strict version of warnings:
use warnings FATAL => 'all';
In my experience, the warnings have always pointed to a bug in my code. The issue is that, in some common usage scenarios, it is too easy to miss the warning messages unless you are looking for them. They can be hard to spot even if your code generates a small amount of output, not to mention anything that scrolls off the screen. This option will kill your program dead so that there is no way to miss the warnings.
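As a small illustration of what FATAL changes (a sketch, with a deliberately undefined variable as the assumed trigger), the warning below becomes an exception that eval can catch:

```perl
use strict;
use warnings FATAL => 'all';

my $undefined;

# Under FATAL warnings, the "uninitialized value" warning is raised as an
# exception, so the eval block aborts before it reaches the final 1.
my $ok = eval {
    my $text = "value: " . $undefined;
    1;
};

print $ok ? "no warning raised\n" : "died: $@";
```

Without FATAL, the same concatenation would only print a warning to STDERR and carry on.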
chomp
It would be good to use chomp in the outer while loop to remove the newline character. While this will not alter the behavior of the code, it more explicitly conveys the intent because you don't really want the newline character to be part of your analysis.
Regex
It is much more common to use the // regular expression delimiters than the ,, delimiters that you used. I think most people would find the code easier to understand with //. There is nothing wrong with your code; this is merely a style issue.
Naming
It is great that you immediately gave a name to the special regex match variable $1:
while (m,(.),g)
{
    my $c = $1;
This is widely considered a good coding practice. However, the variable name $c is not very descriptive in this context. $char would be better:
my $char = $1;
Once you set this variable, you should no longer use $1. Change:
if (s,$1,,g < 2)
to:
if (s,$char,,g < 2)
quotemeta
If your string can contain any character, not just letters, then the code does not work if the character is a regular expression metacharacter, such as the period (.). For example, try the code with this input string:
.point
The code prints nothing for that, but it should print a period (.).
quotemeta can be used to support metacharacters:
my $char = quotemeta $1;
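One side effect of storing the quotemeta result in $char is that the escaped form (\.) is what gets printed. A possible variation, shown here only as a sketch, keeps the raw character and escapes it inside the pattern with \Q...\E:

```perl
use strict;
use warnings;

my $string = '.point';
my ($char) = $string =~ /(.)/;    # raw first character: "."

# \Q...\E applies quotemeta inside the pattern only, so $char stays unescaped.
# The "= () =" idiom counts the matches without modifying $string.
my $count = () = $string =~ /\Q$char\E/g;

printf "escaped form: %s\n", quotemeta $char;
printf "occurrences of %s: %d\n", $char, $count;
```

Here the period is counted once, and $char can still be printed as a plain period.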
Documentation
You should either add a comment near the top of your code to describe its purpose, or use plain old documentation (POD) and get manpage-like help with perldoc.
Function
The DATA block is great for creating self-contained code like this. However, you really want to create a sub, which makes your code easier to reuse and test. And adding tests gives you greater confidence that the code works as intended.
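A rough sketch of what that could look like, using the core Test::More module. The sub name first_non_repeated is just a placeholder, and the body is your regex approach applied to a copy of the string:

```perl
use strict;
use warnings FATAL => 'all';
use Test::More tests => 4;

# Hypothetical name; works on a copy so the caller's string is untouched.
sub first_non_repeated {
    my ($string) = @_;
    while ($string =~ /(.)/g) {
        my $char = $1;
        # s///g returns the number of substitutions, i.e. the occurrence count.
        return $char if $string =~ s/\Q$char\E//g < 2;
    }
    return;
}

is(first_non_repeated('yellow'), 'y', 'first character is unique');
is(first_non_repeated('tooth'),  'h', 'unique character at the end');
is(first_non_repeated('.point'), '.', 'regex metacharacter');
ok(!defined(first_non_repeated('aabb')), 'no unique character');
```

The last test uses defined rather than is, because a bare return yields an empty list in list context.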
Layout
I prefer "cuddled" braces, where the opening brace is on the same line as the code, instead of on its own line. This saves on valuable vertical space. For example:
while (<DATA>) {
Here is new code with many of the suggestions above:
use strict;
use warnings FATAL => 'all';
while (<DATA>) {
    chomp;
    while (/(.)/g) {
        my $char = quotemeta $1;
        if (s/$char//g < 2) {
            print "$char\n";
            last;
        }
    }
}
__DATA__
yellow
tooth
.point
llama
Outputs:
y
h
\.
m
- aaabbc ?
- (.)\1* and get the first character of the resulting string.
- \1 must be used in the regex part, but in the replacement part you have to use $1.
- perl -E '$_ = "aabbbcd"; s/(.)\1+//g; /(.)/ && say $1'