4
\$\begingroup\$

Replace all c / C / v / V in stdin with lower case, upper case consonants / vowels respectively. All other input is passed through.

Anything that can be improved? Looking at maintainability first, performance second.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
static const char lower_consonants[] = "bcdfghjklmnpqrstvwxyz";
static const char upper_consonants[] = "BCDFGHJKLMNPQRSTVWXYZ";
static const char lower_vowels[] = "aeiou";
static const char upper_vowels[] = "AEIOU";
#define RND_CHAR(str) ((str)[rand() % (sizeof(str) - 1)])
char to_char(char c)
{
 switch (c) {
 case 'c': return RND_CHAR(lower_consonants); break;
 case 'C': return RND_CHAR(upper_consonants); break;
 case 'v': return RND_CHAR(lower_vowels); break;
 case 'V': return RND_CHAR(upper_vowels); break;
 default: return c;
 }
}
int main(void)
{
 srand(time(NULL));
 for (int c; (c = getchar()) != EOF;)
 putchar(to_char(c));
 fflush(stdout);
}

Makefile:

CFLAGS:=-std=c11 -W -Wall -pedantic -O2 -g
.PHONY: all
all: wordify
.PHONY: clean
clean::
 $(RM) wordify *.[adios] *.gcda *.gcno *.gcov
asked Feb 10, 2016 at 21:49
\$\endgroup\$

1 Answer 1

4
\$\begingroup\$

This is generally quite pleasant. Only a few observations:

Character input loop

The construct

for (int c; (c = getchar()) != EOF;)

is a little easy to misread.

I'm always very tempted to use a for loop for this kind of thing (also various kinds of pointer manipulation of a similar form) but it tends to end up being rather awkward.

while ((c = getchar()) != EOF)

Is an overwhelmingly common C idiom, and I'd be inclined to use that form instead (even if you do end up with a slightly lonely-looking variable declaration at the top).

Redundant reference data

As another answer pointed out, it seems a bit superfluous to maintain completely separate sets of reference data for upper- and lower- case letters. Using toupper in to_char would be one approach; personally, I would be tempted to wrap to_char in another function which checked the case of its argument and called toupper as appropriate---but that might feel like overkill for only a couple of character classes.

The explicit list of consonants also feels a little funny to me. I think of consonants as "those letters which are not vowels", and there enough of them that I'd have to think far too hard if I were to type them all out. I'd be certainly tempted to generate the consonant set at runtime, iterating over the range 'a' to 'z' and excluding the vowels (which I would explicitly enumerate).

Naming: to_char

The signature char to_char(char c) looks a little odd to me. Converting characters to characters? I'd probably instinctively reach for something like

char gen_char(char kind)

(for generate character), but that's open to debate.

Redundant breaks

The break statements in to_char don't hurt, but they are dead code, so they take up mental space. I would be inclined to drop them---and to align the return from the default case with the others, to emphasise that all of the cases return directly.

No return from main

I know that the C standard allows it, but it will still always look wrong to me. Depending on the context for this program, other people might look at the source and have mental space occupied by the sense that the lack of a return from main looks wrong. But that is essentially entirely a matter of opinion, and arguably I'm the one that's wrong there.

As I said though, overall it's really rather pleasant.

answered Feb 10, 2016 at 22:35
\$\endgroup\$
5
  • \$\begingroup\$ Ha! The breaks and no return from main really is good findings! TY! \$\endgroup\$ Commented Feb 10, 2016 at 22:39
  • \$\begingroup\$ I should have run splint. Unlike CLion and gcc -W -Wall -pedantic, it found both of these issues. \$\endgroup\$ Commented Feb 10, 2016 at 23:06
  • \$\begingroup\$ Checking the case of the input to eliminate the extra upper-case characters is often false economy. islower (or isupper) frequently use a table of 256 values (i.e., all possible values of char) with bits set for the characteristics of each character (e.g. bit 0 = lower case, bit 1 = upper case, bit 2 = digit, bit 3 = white-space, etc.) so you're using 256 characters (and sometimes 512) to eliminate 26. \$\endgroup\$ Commented Feb 10, 2016 at 23:21
  • \$\begingroup\$ @JerryCoffin I was thinking in terms of maintainability rather than code size---just saying "match the case of the input" rather than enumerating all the possibilities. Although with only four cases you could argue it both ways. You're right about the constant size, but I figure a few hundred bytes isn't a big deal on most systems (particularly when different optimisation flags will change the size by a few kilobytes). If you really cared about code size, you could have fun exploiting the structure of ASCII, and you could even just have a bitmap for vowels, but that way lies madness. \$\endgroup\$ Commented Feb 11, 2016 at 10:13
  • \$\begingroup\$ @ChristianHujer the "no return from main" is legal post-C99, so gcc isn't warning about it because of the -std=c11 compiler flag you have. \$\endgroup\$ Commented Feb 11, 2016 at 10:18

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.