Generate String with Random Consonants and Vowels in C

Question 1

Replace all c / C / v / V in stdin with lower case, upper case consonants / vowels respectively. All other input is passed through.

Anything that can be improved? Looking at maintainability first, performance second.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
static const char lower_consonants[] = "bcdfghjklmnpqrstvwxyz";
static const char upper_consonants[] = "BCDFGHJKLMNPQRSTVWXYZ";
static const char lower_vowels[] = "aeiou";
static const char upper_vowels[] = "AEIOU";
#define RND_CHAR(str) ((str)[rand() % (sizeof(str) - 1)])
char to_char(char c)
{
 switch (c) {
 case 'c': return RND_CHAR(lower_consonants); break;
 case 'C': return RND_CHAR(upper_consonants); break;
 case 'v': return RND_CHAR(lower_vowels); break;
 case 'V': return RND_CHAR(upper_vowels); break;
 default: return c;
 }
}
int main(void)
{
 srand(time(NULL));
 for (int c; (c = getchar()) != EOF;)
 putchar(to_char(c));
 fflush(stdout);
}

Makefile:

CFLAGS:=-std=c11 -W -Wall -pedantic -O2 -g
.PHONY: all
all: wordify
.PHONY: clean
clean::
 $(RM) wordify *.[adios] *.gcda *.gcno *.gcov

Question 2

This is generally quite pleasant. Only a few observations:

Character input loop

The construct

for (int c; (c = getchar()) != EOF;)

is a little easy to misread.

I'm always very tempted to use a for loop for this kind of thing (also various kinds of pointer manipulation of a similar form) but it tends to end up being rather awkward.

while ((c = getchar()) != EOF)

Is an overwhelmingly common C idiom, and I'd be inclined to use that form instead (even if you do end up with a slightly lonely-looking variable declaration at the top).

Redundant reference data

As another answer pointed out, it seems a bit superfluous to maintain completely separate sets of reference data for upper- and lower- case letters. Using toupper in to_char would be one approach; personally, I would be tempted to wrap to_char in another function which checked the case of its argument and called toupper as appropriate---but that might feel like overkill for only a couple of character classes.

The explicit list of consonants also feels a little funny to me. I think of consonants as "those letters which are not vowels", and there enough of them that I'd have to think far too hard if I were to type them all out. I'd be certainly tempted to generate the consonant set at runtime, iterating over the range 'a' to 'z' and excluding the vowels (which I would explicitly enumerate).

Naming: `to_char`

The signature char to_char(char c) looks a little odd to me. Converting characters to characters? I'd probably instinctively reach for something like

char gen_char(char kind)

(for generate character), but that's open to debate.

Redundant `break`s

The break statements in to_char don't hurt, but they are dead code, so they take up mental space. I would be inclined to drop them---and to align the return from the default case with the others, to emphasise that all of the cases return directly.

No `return` from main

I know that the C standard allows it, but it will still always look wrong to me. Depending on the context for this program, other people might look at the source and have mental space occupied by the sense that the lack of a return from main looks wrong. But that is essentially entirely a matter of opinion, and arguably I'm the one that's wrong there.

As I said though, overall it's really rather pleasant.

Question 3

Ha! The breaks and no return from main really is good findings! TY!

Question 4

I should have run splint. Unlike CLion and gcc -W -Wall -pedantic, it found both of these issues.

Question 5

Checking the case of the input to eliminate the extra upper-case characters is often false economy. islower (or isupper) frequently use a table of 256 values (i.e., all possible values of char) with bits set for the characteristics of each character (e.g. bit 0 = lower case, bit 1 = upper case, bit 2 = digit, bit 3 = white-space, etc.) so you're using 256 characters (and sometimes 512) to eliminate 26.

Question 6

@JerryCoffin I was thinking in terms of maintainability rather than code size---just saying "match the case of the input" rather than enumerating all the possibilities. Although with only four cases you could argue it both ways. You're right about the constant size, but I figure a few hundred bytes isn't a big deal on most systems (particularly when different optimisation flags will change the size by a few kilobytes). If you really cared about code size, you could have fun exploiting the structure of ASCII, and you could even just have a bitmap for vowels, but that way lies madness.

Question 7

@ChristianHujer the "no return from main" is legal post-C99, so gcc isn't warning about it because of the -std=c11 compiler flag you have.

David Morris David Morris 1,0287 silver badges12 bronze badges · Answer 1 · 2016-02-10 22:35:22Z

This is generally quite pleasant. Only a few observations:

Character input loop

The construct

for (int c; (c = getchar()) != EOF;)

is a little easy to misread.

I'm always very tempted to use a for loop for this kind of thing (also various kinds of pointer manipulation of a similar form) but it tends to end up being rather awkward.

while ((c = getchar()) != EOF)

Is an overwhelmingly common C idiom, and I'd be inclined to use that form instead (even if you do end up with a slightly lonely-looking variable declaration at the top).

Redundant reference data

As another answer pointed out, it seems a bit superfluous to maintain completely separate sets of reference data for upper- and lower- case letters. Using toupper in to_char would be one approach; personally, I would be tempted to wrap to_char in another function which checked the case of its argument and called toupper as appropriate---but that might feel like overkill for only a couple of character classes.

The explicit list of consonants also feels a little funny to me. I think of consonants as "those letters which are not vowels", and there enough of them that I'd have to think far too hard if I were to type them all out. I'd be certainly tempted to generate the consonant set at runtime, iterating over the range 'a' to 'z' and excluding the vowels (which I would explicitly enumerate).

Naming: `to_char`

The signature char to_char(char c) looks a little odd to me. Converting characters to characters? I'd probably instinctively reach for something like

char gen_char(char kind)

(for generate character), but that's open to debate.

Redundant `break`s

The break statements in to_char don't hurt, but they are dead code, so they take up mental space. I would be inclined to drop them---and to align the return from the default case with the others, to emphasise that all of the cases return directly.

No `return` from main

I know that the C standard allows it, but it will still always look wrong to me. Depending on the context for this program, other people might look at the source and have mental space occupied by the sense that the lack of a return from main looks wrong. But that is essentially entirely a matter of opinion, and arguably I'm the one that's wrong there.

As I said though, overall it's really rather pleasant.

Ha! The breaks and no return from main really is good findings! TY!
I should have run splint. Unlike CLion and gcc -W -Wall -pedantic, it found both of these issues.
Checking the case of the input to eliminate the extra upper-case characters is often false economy. islower (or isupper) frequently use a table of 256 values (i.e., all possible values of char) with bits set for the characteristics of each character (e.g. bit 0 = lower case, bit 1 = upper case, bit 2 = digit, bit 3 = white-space, etc.) so you're using 256 characters (and sometimes 512) to eliminate 26.
@JerryCoffin I was thinking in terms of maintainability rather than code size---just saying "match the case of the input" rather than enumerating all the possibilities. Although with only four cases you could argue it both ways. You're right about the constant size, but I figure a few hundred bytes isn't a big deal on most systems (particularly when different optimisation flags will change the size by a few kilobytes). If you really cared about code size, you could have fun exploiting the structure of ASCII, and you could even just have a bitmap for vowels, but that way lies madness.
@ChristianHujer the "no return from main" is legal post-C99, so gcc isn't warning about it because of the -std=c11 compiler flag you have.

Stack Exchange Network

Generate String with Random Consonants and Vowels in C

1 Answer 1

Character input loop

Redundant reference data

Naming: `to_char`

Redundant `break`s

No `return` from main

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Generate String with Random Consonants and Vowels in C

1 Answer 1

Character input loop

Redundant reference data

Naming: to_char

Redundant breaks

No return from main

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Naming: `to_char`

Redundant `break`s

No `return` from main