7
\$\begingroup\$

Given a string, find the first non-repeated character in it.

E.g., "yellow" should return "y"

There are several solutions for this in other languages, but I haven't seen one written in Perl.

Can this be done in a more Perl-ish way or generally in an even shorter way?

use strict;
use warnings "all";
while (<DATA>)
{
    while (m,(.),g)
    {
        my $c = $1;
        if (s,$1,,g < 2)
        {
            print "$c\n";
            last;
        }
    }
}
__DATA__
yellow
tooth
toolic
asked May 23, 2017 at 17:49
\$\endgroup\$
6
  • \$\begingroup\$ What should be output for string aaabbc? \$\endgroup\$ Commented May 24, 2017 at 10:08
  • 1
    \$\begingroup\$ @Tushar the output would be "c". \$\endgroup\$ Commented May 24, 2017 at 10:33
  • \$\begingroup\$ One way is to replace all repeating characters using regex (.)\1* and get the first character of the resulting string. \$\endgroup\$ Commented May 24, 2017 at 11:01
  • 1
    \$\begingroup\$ @yuri: \1 must be used in the regex part, but in the replacement part you have to use $1. \$\endgroup\$ Commented May 25, 2017 at 9:58
  • 1
    \$\begingroup\$ perl -E '$_ = "aabbbcd"; s/(.)\1+//g; /(.)/ && say $1' \$\endgroup\$ Commented May 25, 2017 at 11:11

2 Answers

4
\$\begingroup\$

Don't reinvent the wheel. The List::MoreUtils package has a function, singleton, that does the job:

#!/usr/bin/perl
use Modern::Perl;
use List::MoreUtils qw(singleton);
while (<DATA>) {
    chomp; # remove the trailing newline; easy to forget
    # split breaks the string into individual characters
    # singleton keeps only the characters that appear exactly once
    # ($first) receives the first of those characters
    my ($first) = singleton split //, $_;
    say $first;
}
__DATA__
yellow
tooth

Output:

y
h
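Where List::MoreUtils is not available, roughly the same result can be obtained in core Perl with a hash of character counts. The following is only a sketch of that idea, not the module's implementation; the name first_singleton_char is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of List::MoreUtils): returns the first
# character that occurs exactly once in $string, or undef if there is none.
sub first_singleton_char {
    my ($string) = @_;
    my @chars = split //, $string;
    my %count;
    $count{$_}++ for @chars;                 # tally every character
    for my $char (@chars) {                  # scan in original order
        return $char if $count{$char} == 1;
    }
    return;                                  # every character repeats
}

for my $word (qw(yellow tooth aabb)) {
    my $first = first_singleton_char($word);
    print defined $first ? "$first\n" : "(none)\n";
}
```

This makes two passes over the characters, one to count and one to find the first count of 1, so the original order is preserved without relying on hash ordering.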
toolic
answered May 24, 2017 at 10:00
\$\endgroup\$
2
  • \$\begingroup\$ This is a nice solution but what if you work in an environment where you can't install extra dependencies? \$\endgroup\$ Commented May 24, 2017 at 10:31
  • 1
    \$\begingroup\$ @yuri: Inspect the module and copy the code of the function. \$\endgroup\$ Commented May 24, 2017 at 10:33
1
\$\begingroup\$

The previous answer is great because it proposes an alternative solution that leverages existing code. The advantages of using CPAN modules are that the code tends to be:

  • well-documented
  • well-tested
  • of known coverage

But, there is also value in reviewing the code you posted.

Overview

The code is already quite "Perl-ish". I like how you break out of the loop early, as soon as you find a non-repeated character; that is efficient.

Fatal

It is great that you used strict and warnings.

My preference is to use a very strict version of warnings:

use warnings FATAL => 'all';

In my experience, the warnings have always pointed to a bug in my code. The issue is that, in some common usage scenarios, it is too easy to miss the warning messages unless you are looking for them. They can be hard to spot even if your code generates a small amount of output, not to mention anything that scrolls off the screen. This option will kill your program dead so that there is no way to miss the warnings.
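As a small illustration of the effect, the sketch below deliberately triggers an "uninitialized value" warning. The variable $name is hypothetical and left undefined on purpose; the eval block is there only so the script can report the failure instead of stopping.

```perl
use strict;
use warnings FATAL => 'all';

# Hypothetical demonstration: $name is deliberately left undefined.
my $name;
my $ok = eval {
    my $greeting = "Hello, " . $name;   # the warning is raised as an exception here
    1;
};
print "died: $@" unless $ok;
```

With plain "use warnings" the concatenation would only print a message to STDERR and carry on; with FATAL the block is aborted and the message ends up in $@.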

chomp

It would be good to use chomp in the outer while loop to remove the newline character. While this will not alter the behavior of the code, it more explicitly conveys the intent because you don't really want the newline character to be part of your analysis.

Regex

It is much more common to use the // regular expression delimiters than the ,, delimiters that you used. I think most people would find the code easier to understand with //. There is nothing wrong with your code; this is merely a style issue.

Naming

It is great that you immediately gave a name to the special regex match variable $1:

while (m,(.),g)
{
    my $c = $1;

This is widely considered a good coding practice. However, the variable name $c is not very descriptive in this context. $char would be better:

    my $char = $1;

Once you set this variable, you should no longer use $1. Change:

    if (s,$1,,g < 2)

to:

    if (s,$char,,g < 2)

quotemeta

If your string can contain any character, not just letters, then the code does not work if the character is a regular expression metacharacter, such as the period (.). For example, try the code with this input string:

.point

The code prints nothing for that, but it should print a period (.).

quotemeta can be used to support metacharacters:

    my $char = quotemeta $1;
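To make the effect of the escaping concrete, here is a small sketch that counts matches in the example string with and without it. The variable names are illustrative only. It also shows \Q...\E, which applies the same escaping inside the pattern itself.

```perl
use strict;
use warnings;

my $char    = '.';
my $escaped = quotemeta $char;     # the two-character string: backslash, period
my $string  = '.point';

# "my $n = () = ..." counts the matches of a /g pattern.
my $unescaped_hits = () = $string =~ /$char/g;       # '.' matches every character
my $escaped_hits   = () = $string =~ /$escaped/g;    # literal period only
my $inline_hits    = () = $string =~ /\Q$char\E/g;   # same, escaped inside the pattern

print "$escaped $unescaped_hits $escaped_hits $inline_hits\n";   # prints: \. 6 1 1
```

Note that quotemeta returns the escaped text, so printing a variable that holds its result shows the backslash as well. Escaping inside the pattern with \Q...\E leaves the variable holding the original character.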

Documentation

You should either add a comment near the top of your code to describe its purpose, or use plain old documentation (POD) and get manpage-like help with perldoc.

Function

The DATA block is great for creating self-contained code like this. However, you really want to create a sub, which makes your code easier to reuse and test. And adding tests gives you greater confidence that the code works as intended.
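One possible shape for such a sub, with a few tests written using the core Test::More module, is sketched below. The name first_non_repeated is an assumption made for illustration, and the sub works on a copy of the string so that the caller's data is left alone.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical name. Returns the first character that occurs exactly once,
# or undef (empty list in list context) when every character repeats.
sub first_non_repeated {
    my ($string) = @_;               # a copy; the caller's string is untouched
    while ($string =~ /(.)/) {       # first character still left in the copy
        my $char = $1;
        # s///g returns the number of occurrences it removed
        return $char if $string =~ s/\Q$char\E//g == 1;
    }
    return;
}

is( first_non_repeated('yellow'), 'y', 'yellow' );
is( first_non_repeated('tooth'),  'h', 'tooth' );
is( first_non_repeated('.point'), '.', 'regex metacharacter' );
is( first_non_repeated('aaabbc'), 'c', 'aaabbc' );
ok( !defined first_non_repeated('aabb'), 'no non-repeated character' );
done_testing();
```

The last test uses defined rather than is because a bare return yields an empty list in list context, which would shift the remaining arguments passed to is.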

Layout

I prefer "cuddled" braces, where the opening brace is on the same line as the code, instead of on its own line. This saves on valuable vertical space. For example:

while (<DATA>) {

Here is new code with many of the suggestions above:

use strict;
use warnings FATAL => 'all';
while (<DATA>) {
    chomp;
    while (/(.)/g) {
        my $char = quotemeta $1;
        if (s/$char//g < 2) {
            print "$char\n";
            last;
        }
    }
}
__DATA__
yellow
tooth
.point
llama

Outputs:

y
h
\.
m
answered Jan 29 at 17:10
\$\endgroup\$
