Given this text
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
write the shortest program that produces the same text justified at 80 characters. The above text must look exactly like this:
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
Rules:
- words must not be cut
- extra spaces must be added
  - after a dot.
  - after a comma
  - after the shortest word (from left to right)
- the result must not have more than 2 consecutive spaces
- last line is not justified.
- lines must not begin with comma or dot.
- provide the output of your program
winner: The shortest program.
note: The input string is provided on STDIN as one line (no line feed or carriage return)
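For reference, here is an ungolfed Python sketch of one reading of the rules above. It is not taken from any answer. It assumes the priority order is: every gap after a dot first, then every gap after a comma, then gaps after words in order of increasing length, each scanned left to right. The names `wrap`, `justify_line` and `justify` are made up for illustration, and a line that would need more than two spaces per gap is simply left short.

```python
def wrap(text, width=80):
    """Greedy wrap: pack whole words into lines of at most `width` chars."""
    lines, current = [], ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > width:
            lines.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        lines.append(current)
    return lines


def justify_line(line, width=80):
    """Widen gaps to two spaces: dots first, then commas, then shortest words."""
    words = line.split(" ")
    need = width - len(line)

    def priority(i):
        # Assumed ordering; ties are broken left to right by the index i.
        word = words[i]
        if word.endswith("."):
            return (0, 0, i)
        if word.endswith(","):
            return (1, 0, i)
        return (2, len(word), i)

    gaps = sorted(range(len(words) - 1), key=priority)
    # Never more than two spaces per gap: at most one extra space each.
    doubled = set(gaps[:max(need, 0)])
    return "".join(
        word + ("  " if i in doubled else " ")
        for i, word in enumerate(words)
    ).rstrip()


def justify(text, width=80):
    """Justify every line except the last one."""
    lines = wrap(text, width)
    return "\n".join([justify_line(l, width) for l in lines[:-1]] + lines[-1:])
```

Calling `justify` on the sample paragraph should give six lines, the first five exactly 80 characters wide.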
update:
The input string can be any text with reasonable word lengths (i.e. no more than 20~25 chars), such as:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non risus. Suspendisse lectus tortor, dignissim sit amet, adipiscing nec, ultricies sed, dolor. Cras elementum ultrices diam. Maecenas ligula massa, varius a, semper congue, euismod non, mi. Proin porttitor, orci nec nonummy molestie, enim est eleifend mi, non fermentum diam nisl sit amet erat. Duis semper. Duis arcu massa, scelerisque vitae, consequat in, pretium a, enim. Pellentesque congue. Ut in risus volutpat libero pharetra tempor. Cras vestibulum bibendum augue. Praesent egestas leo in pede. Praesent blandit odio eu enim. Pellentesque sed dui ut augue blandit sodales. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Aliquam nibh. Mauris ac mauris sed pede pellentesque fermentum. Maecenas adipiscing ante non diam sodales hendrerit. Ut velit mauris, egestas sed, gravida nec, ornare ut, mi. Aenean ut orci vel massa suscipit pulvinar. Nulla sollicitudin. Fusce varius, ligula non tempus aliquam, nunc turpis ullamcorper nibh, in tempus sapien eros vitae ligula. Pellentesque rhoncus nunc et augue. Integer id felis. Curabitur aliquet pellentesque diam. Integer quis metus vitae elit lobortis egestas. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Morbi vel erat non mauris convallis vehicula. Nulla et sapien. Integer tortor tellus, aliquam faucibus, convallis id, congue eu, quam. Mauris ullamcorper felis vitae erat. Proin feugiat, augue non elementum posuere, metus purus iaculis lectus, et tristique ligula justo vitae magna. Aliquam convallis sollicitudin purus. Praesent aliquam, enim at fermentum mollis, ligula massa adipiscing nisl, ac euismod nibh nisl eu lectus. Fusce vulputate sem at sapien. Vivamus leo. Aliquam euismod libero eu enim. Nulla nec felis sed leo placerat imperdiet. Aenean suscipit nulla in justo. Suspendisse cursus rutrum augue. Nulla tincidunt tincidunt mi. 
Curabitur iaculis, lorem vel rhoncus faucibus, felis magna fermentum augue, et ultricies lacus lorem varius purus. Curabitur eu amet.
- Why ask people to provide the output of their program? Are you that worried about people failing to check their results before posting? – Peter Taylor, Dec 10, 2011 at 14:05
- I'm tempted to provide a php program which consists of the output text. ;-) Seriously though, the spaces on the second line of the output text seem to have been added to the spaces at random? Is there some pattern to it that I'm not seeing, and if not, how can we be expected to produce exactly that output for the given input? – Gareth, Dec 10, 2011 at 14:23
- @Gareth: Sorry, my bad. I made a mistake, is after the comma, not after incididunt. Question edited. – Toto, Dec 10, 2011 at 15:36
- Does the program have to work also for inputs other than that one paragraph of Lipsum? – Ilmari Karonen, Dec 10, 2011 at 17:03
- @Ilmari Karonen: Yes, the input string can be anything. – Toto, Dec 10, 2011 at 17:06
3 Answers
Perl, 94 chars
for(/(.{0,80}\s)/g){$i=1;$i+=!s/^(.*?\.|.*?,|(.*? )??\S{$i}) \b/$1  /until/
|.{81}/;chop;say}
Run with perl -nM5.01. (The n is included in the character count.)
The code above is the shortest I could make that could handle any curveballs I threw at it (such as one-letter words at the beginning of a line, input lines exactly 80 chars long, etc.) exactly according to spec:
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
I'm tempted to provide a php program which consists of the output text. ;-)
Seriously though, the spaces on the second line of the output text seem to have
been added to the spaces at random? Is there some pattern to it that I'm not
seeing, and if not, how can we be expected to produce exactly that output for
the given input?
(With apologies to Gareth for using his comment as additional test input.)
The following 75-char version works well enough to produce the sample output from the sample input, but can fail for other inputs. Also, it leaves an extra space character at the end of each output line.
for(/(.{0,80}\s)/g){s/(.*?\.|.*?,|.*? ..) \b/$1  /until/.{81}/||s/
//;say}
Both versions will loop forever if they encounter input that they can't justify correctly. (In the longer version, replacing "until" with "until$i>80||" would fix that at the cost of seven extra chars.)
- Ah, I should have started with a perl solution ;-) This language is of course really good for such a task. – Howard, Dec 10, 2011 at 19:03
- I got "Quantifier in {,} bigger than 32766 in regex; marked by <-- HERE in m/^(.*?\.|.*?,|(.*? )??\S{ <-- HERE 32767}) \b/" for the second text. – Toto, Dec 11, 2011 at 13:43
- @M42: That's because the second example text cannot be justified according to the rules. If I add in the $i>80 check, it expands the 11th line to "pede  pellentesque  fermentum.  Maecenas  adipiscing  ante  non  diam  sodales", which is only 78 chars long, and then gives up since each word (except the last) is followed by two spaces. – Ilmari Karonen, Dec 13, 2011 at 17:50
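The arithmetic behind that comment can be checked with a few lines of Python. This is only an illustrative sketch, and the helper name `max_justified_width` is made up here. It computes the widest a line can get when every gap is limited to two spaces.

```python
def max_justified_width(line):
    """Widest possible result when each gap holds at most two spaces."""
    words = line.split()
    return sum(len(w) for w in words) + 2 * (len(words) - 1)


line = "pede pellentesque fermentum. Maecenas adipiscing ante non diam sodales"
print(max_justified_width(line))  # 62 letters + 8 gaps * 2 spaces = 78
```

Since 78 is less than 80, no placement of extra spaces can justify that line without breaking the two-space limit.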
Ruby, 146 characters
$><<gets.gsub(/(.{,80})( |$)/){$2>""?(s=$1+$/;(['\.',?,]+(1..80).map{|l|"\\b\\w{#{l}}"}).any?{|x|s.sub! /#{x} (?=\w)/,'\& '}while s.size<81;s):$1}
It prints exactly the desired output (see below) if the given text is fed into STDIN.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
Edit: Just after submitting my first solution I saw in the comments that it is required that any input string can be processed. The previous answer was only 95 characters but did not fulfill this requirement:
r=gets.split;l=0;'49231227217b6'.chars{|s|r[l+=s.hex]+=' '};(r*' ').gsub(/(.{,80}) ?/){puts $1}
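To see what the string '49231227217b6' encodes, here is a small Python sketch (not part of the answer; the function names are hypothetical). Each hex digit is read as the distance to the next word that receives an extra space, so the running sum of the digits gives the zero-based word indices.

```python
from itertools import accumulate


def decode_offsets(code):
    """Turn a string of hex digits into cumulative word indices."""
    return list(accumulate(int(digit, 16) for digit in code))


def padded_words(text, code="49231227217b6"):
    """Return the words of `text` that would get an extra trailing space."""
    words = text.split()
    return [words[i] for i in decode_offsets(code)]
```

For the sample paragraph, `decode_offsets("49231227217b6")` gives `[4, 13, 15, 18, 19, 21, 23, 30, 32, 33, 40, 51, 57]`, and the first of those indices points at "amet,", the only word on the first output line followed by two spaces.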
-
\$\begingroup\$ If I'm not mistaken, you're using the same cheat as I thought of (encoding the locations of the double-spaced words in the example output). Note that M42 has clarified that the programs should cope with other inputs too. \$\endgroup\$Ilmari Karonen– Ilmari Karonen2011年12月10日 17:36:46 +00:00Commented Dec 10, 2011 at 17:36
-
\$\begingroup\$ @Ilmari Karonen Yes, I saw that after submitting. See my edit and comments above. Going back to the golf course... \$\endgroup\$Howard– Howard2011年12月10日 17:38:05 +00:00Commented Dec 10, 2011 at 17:38
there are a lot of rules here that add many characters without changing the output much, so I'm providing options. also, 80 seems like such an arbitrary number, so I'll also provide TIO links and byte counts for a more general version that takes the length as its first argument and does the same thing.
sed 4.2.2 -r, 80 bytes (previously 93)
s/.{,80}\>.? /&\n/g
s/(.*)\n/sed -r ':;$q;\\|^.{,80}$|s|\\> \\<|\& |;t'<<<'\1'/e
first wraps words onto a new line if a line would be longer than 80 chars; then justifies to 80 chars per line, adding at most one space between each word from right to left. if there aren't enough spaces on one line it just gives up after adding 2 spaces everywhere.
general case, 131 bytes (previously 145)
sed 4.2.2 -r, 136 bytes (previously 147)
s/.{,80}\>.? /&\n/g
s/(.*)\n/sed -r ':;$q;\\@^.{,80}$@s@.*@sed -r "s%(,|\\\\.) \\\\<%\\\& %;t;s%\\\\> \\\\<%\\\& %"<<<"\&"@e;t'<<<'\1'/e
this one prefers adding spaces after periods or commas first, then does the same as the last one.
general case, 219 bytes (previously 227)
sed 4.2.2, 211 bytes (previously 214)
s/.{,80}\>.? /&\n/g
s/(.*)\n/sed -r ':;$q;G;\\@^.{,80}\\n@s@(.*)\\n(.*)@sed -r "s%(,|\\\\.) \\\\<%\\\& %;t;s%( [^ ]\\2 )([^ ])%\\\\1 \\\\2%;t;i#"<<<"\\1"@e;\\@#@{s@..@@;x;s@^@[^ ]@;x};t;s@\\n.*@@;x;z;x'<<<'\1'/e
finally, this one prefers commas and periods, then adds spaces after the shortest word from left to right. if a line needs more than two spaces after a word, it gets stuck in a loop.
general case, 300 bytes (previously 304)