Encode message by alphabets - Follow Up

Question 1

This code is a revised version of implementation which asked for a advice. Original question is here: Encode message by alphabets

#include <stdio.h>
#include <stdlib.h>
#define MIN_ALPH 1
#define MAX_ALPH 26
unsigned int my_decode(unsigned int input)
{
 unsigned int count = 0;
 unsigned int ddigit;
 int i;
 //check double digit decoding
 //TODO: make macro for (num >= MIN_ALPH && num <= MAX_ALPH)
 if (input % 100 >= MIN_ALPH && input % 100 <= MAX_ALPH)
 count++;
 if (input / 10 >= MIN_ALPH && input / 10 <= MAX_ALPH)
 {
 if (input % 10 > 0)
 count++;
 }
 //check single digit decoding
 for (i=1; i <= 100; i*=10)
 {
 if (input % (i *10) / i == 0)
 break;
 }
 if (i == 1000)
 count++;
 return count;
}
int main(void)
{
 /*Given the mapping a = 1, b = 2, ... z = 26, and an encoded message,
 count the number of ways it can be decoded.
 For example, the message '111' would give 3,
 since it could be decoded as 'aaa', 'ka', and 'ak'.
 You can assume that the messages are decodable.
 For example, '001' is not allowed.*/
 printf("result: %u\n", my_decode(512));
 printf("result: %u\n", my_decode(542));
 printf("result: %u\n", my_decode(112));
}

Question 2

Can you explain why you have discarded most of the suggestions given in the original review? At least mention them, so you don't just get a lot of answers raising the same issues. (That's the main reason I'm not answering this one).

Question 3

@TobySpeight My fault. Will consider it in future questions.

Question 4

Test clarity

A sample such as with my_decode(512) deserves a break down of how, as letters, it might be encoded.

Add as a comment, or integrate in test, the expected outputs.

Posting input as well as output is useful.

printf("%u --> result: %u\n", 512, my_decode(512));

Format

Other code and below hints OP is not using an auto formatter as break is not indented. 1) Recommend to use an auto-formatter 2) Prefer { }

 if (input % (i *10) / i == 0)
 break;
 // vs.
 if (input % (i *10) / i == 0) {
 break;
 }

Macro vs. code

Consider a helper function

bool alph_in_range(unsigned num) {
 return num >= MIN_ALPH && num <= MAX_ALPH;
}

Function

my_decode(102) is 2 and my_decode(1002) is 1. I'd expect 2 for the first is allowed, 10,2 and 1,02 then 10,02 and 1,002 allowed for the 2nd.

Not much else to say.

Minor: _MAX

...._MAX more common in C such as INT_MAX

// #define MIN_ALPH 1
// #define MAX_ALPH 26
#define ALPH_MIN 1
#define ALPH_MAX 26

Minor: unsigned vs. unsigned int

Either works. unsigned is shorter.

As with such style issues, code to your group's coding standard.

Minor: Mixed types

Some coding standards dislike unsigned % int.

Could use input % 100u vs. input % 100.

Question 5

There should be an empty line between the #include lines and the macro definitions. Sure, these lines all start with #, which makes them look similar, but their purpose is completely different. Therefore each of these groups should get its own paragraph.

#include <stdio.h>
#define MIN_ALPH 1
#define MAX_ALPH 26

Since your program only uses features from stdio.h, you don't need to include stdlib.h. That's why I omitted it from the above code.

Now to the interesting part of your code, the function my_decode. This function should rather be called possible_encodings since that matches better what the function actually does. This suggestion already appeared in a review of the original question, therefore in a follow-up request for review, you should at least write some text about the original reviews, what you liked about them and what you didn't like, and why you wrote your code in the way you did. You did nothing of all this.

The function my_decode should take its argument as a string of characters. This way it will be easy to test it with large digit sequences, not only 9 or 10 digits. Since this is C and not Python, the data type int is quite limited in which numbers it can represent. Typically it's from -2147483648 to 2147483647.

The function my_decode is completely undocumented. Each function should have at least a one-liner comment that describes its purpose. Instead, you have a really good comment in main, but that comment doesn't belong there. It belongs right above the function my_decode.

In my_decode, there is no need for a macro. Don't use macros, use static functions instead. Macros are for textual replacement, functions are for computation. Here's an example function:

#include <stdbool.h>
static bool is_in_range(int n)
{
 return MIN_ALPH <= n && n <= MAX_ALPH;
}

The C programming language doesn't have a between operator. This operator can be approximated using the above form, which has the benefit of using only a single kind of comparison operator, thus reducing any confusion.

Usually comparisons are written as subject <=> object, and the subject in this case would be n. Only in the case of the between operator should this guideline be violated.

Still in my_decode, the % 100 looks suspicious, as if your code would only work for 3-digit numbers. To prove this assumption wrong, your test data should include a few test cases for longer digit sequences as well.

Stylistically, your code looks completely inconsistent. Sometimes you write count = 0, at other times you write i=1 without any spaces around the =. Don't do this formatting yourself, as it is boring. Let your editor or IDE do this work for you. Search for "auto format code", and you will find instructions for doing this.

The special case i == 1000 is wrong. Why did you even write this additional if statement? Since the whole function my_decode is a tricky little piece of code, you should explain to the reader of the code why you added each statement. Imagine that you needed to explain this code to someone who can program but who knows only the problem description and the code. Everything else that you want to explain should go into the comments.

As others already said, don't use printf-only tests. Make the tests check their results themselves. For example, during this review I solved the same problem in Go, another programming language, and I came up with this simple list of tests:

 tests := []struct {
 input string
 want uint64
 }{
 {"", 1},
 {"1", 1},
 {"11", 2},
 {"111", 3},
 {"1111", 5},
 {"11111", 8},
 {"10", 1},
 {"201", 1},
 {"11111011111", 40}, // == 5 * 8
 {"1000", 0},
 }

This list is easy to extend, and that's how you should write your tests. Of course, in C this looks a bit different, but the basic rule is to have one test per line, plus any additional comments needed.

Question 6

I come back to the algorithm itself. I feel there is a lot of combinatorics in this. I took a step back and did the grouping into "permutation groups".

The "ABC" as input illustrates these groups or regions:

$ ./a.out
1234567891011121314151617181920212223242526 [Code]
(ABC)DEFGHIJ(AAABAC)(AD)(AE)(AF)(AG)(AH)(AI)T(BABBBC)(BD)(BE)(BF) [Decoded: single, grouped]

"123" can be "ABC", but also "LC" and "AW". Just like "111" can be "AAA", "AK" or "KA" in OP.v1.

A longer group is "212223", which here is singlified to "BABBBC". It ia also "UVW", plus "BLBW", and many more.

#include <stdio.h>
void parse_msg(char *msg) {
 char c, cprev, cnext;
 int i;
 /* Start in a state like after a high digit 3..9 */
 cprev = '9';
 for (i = 0; msg[i] != '0円'; i++) {
 
 c = msg[i];
 cnext = msg[i+1];
 /* "10" and "20" are special cases, get rid of them */ 
 if (cnext == '0') {
 if (cprev <= '2')
 printf(")"); 
 if (c == '1')
 printf("J"); 
 if (c == '2')
 printf("T"); 
 if (c >= '3') {
 printf("******* Error: zero 30-90\n");
 return; 
 }
 cprev = '9'; // reset 
 i++; // extra skip in msg
 continue;
 }
 /* 1: No matter what cnext is (1-9), open a "(" group */
 /* But don't open if next is the null byte */ 
 /* Problem: makes "(" even if "10" follows */
 if (c == '1') {
 if (cprev >= '3') 
 if (cnext == '0円') 
 cprev = '9'; 
 else {
 printf("("); 
 cprev = c; 
 }
 printf("A");
 continue;
 }
 /* 2: Open before or close after */
 if (c == '2') {
 /* new group only if 321-326 */
 if (cprev >= '3' && cnext <= '6')
 if (cnext == '0円') {
 cprev = '9'; 
 printf("B"); 
 continue;
 }
 else
 printf("("); 
 /* "2" is "B" in any case */ 
 printf("B");
 /* "127", "229": was open, must close */ 
 if (cprev <= '2' && cnext >= '7') {
 printf(")"); 
 cprev = '9'; 
 continue;
 }
 cprev = c;
 continue;
 }
 
 /* c == 3 or higher are left */
 /* if open, then close group ")" after printing */
 if (cprev == '1' || 
 c <= '6' && cprev == '2') {
 printf("%c", c + 0x10);
 printf(")");
 cprev = c; 
 continue; 
 }
 printf("%c", c + 0x10);
 cprev = c;
 }
 /* Finish: maybe group is opened */ 
 if (cprev <= '2')
 printf(")");
 printf(" [Decoded: single, grouped] \n"); 
 return;
}
int main(void) {
 
 char *msg = "1234567891011121314151617181920212223242526";
 printf("%s [Code]\n", msg);
 parse_msg(msg);
 msg = "2102102";
 printf("\n%s [Code]\n", msg);
 parse_msg(msg);
 msg = "1181";
 printf("\n%s [Code]\n", msg);
 parse_msg(msg);
 return 0;
}

This gives three test decodings:

1234567891011121314151617181920212223242526 [Code]
(ABC)DEFGHIJ(AAABAC)(AD)(AE)(AF)(AG)(AH)(AI)T(BABBBC)(BD)(BE)(BF) [Decoded: single, grouped] 
2102102 [Code]
(B)J(B)JB [Decoded: single, grouped] 
1181 [Code]
(AAH)A [Decoded: single, grouped]

Maybe the code works now, except for the wrong parens in front of "J" and "T". The "10" and "20 really should be filtered out first, otherwise you need a 2-character look-ahead.

chux chux 36.4k2 gold badges43 silver badges96 bronze badges · Answer 1 · 2020-11-17 18:18:32Z

Test clarity

A sample such as with my_decode(512) deserves a break down of how, as letters, it might be encoded.

Add as a comment, or integrate in test, the expected outputs.

Posting input as well as output is useful.

printf("%u --> result: %u\n", 512, my_decode(512));

Format

Other code and below hints OP is not using an auto formatter as break is not indented. 1) Recommend to use an auto-formatter 2) Prefer { }

 if (input % (i *10) / i == 0)
 break;
 // vs.
 if (input % (i *10) / i == 0) {
 break;
 }

Macro vs. code

Consider a helper function

bool alph_in_range(unsigned num) {
 return num >= MIN_ALPH && num <= MAX_ALPH;
}

Function

my_decode(102) is 2 and my_decode(1002) is 1. I'd expect 2 for the first is allowed, 10,2 and 1,02 then 10,02 and 1,002 allowed for the 2nd.

Not much else to say.

Minor: _MAX

...._MAX more common in C such as INT_MAX

// #define MIN_ALPH 1
// #define MAX_ALPH 26
#define ALPH_MIN 1
#define ALPH_MAX 26

Minor: unsigned vs. unsigned int

Either works. unsigned is shorter.

As with such style issues, code to your group's coding standard.

Minor: Mixed types

Some coding standards dislike unsigned % int.

Could use input % 100u vs. input % 100.

Roland Illig Roland Illig 21.8k2 gold badges36 silver badges83 bronze badges · Answer 2 · 2020-11-17 23:32:58Z

There should be an empty line between the #include lines and the macro definitions. Sure, these lines all start with #, which makes them look similar, but their purpose is completely different. Therefore each of these groups should get its own paragraph.

#include <stdio.h>
#define MIN_ALPH 1
#define MAX_ALPH 26

Since your program only uses features from stdio.h, you don't need to include stdlib.h. That's why I omitted it from the above code.

Now to the interesting part of your code, the function my_decode. This function should rather be called possible_encodings since that matches better what the function actually does. This suggestion already appeared in a review of the original question, therefore in a follow-up request for review, you should at least write some text about the original reviews, what you liked about them and what you didn't like, and why you wrote your code in the way you did. You did nothing of all this.

The function my_decode should take its argument as a string of characters. This way it will be easy to test it with large digit sequences, not only 9 or 10 digits. Since this is C and not Python, the data type int is quite limited in which numbers it can represent. Typically it's from -2147483648 to 2147483647.

The function my_decode is completely undocumented. Each function should have at least a one-liner comment that describes its purpose. Instead, you have a really good comment in main, but that comment doesn't belong there. It belongs right above the function my_decode.

In my_decode, there is no need for a macro. Don't use macros, use static functions instead. Macros are for textual replacement, functions are for computation. Here's an example function:

#include <stdbool.h>
static bool is_in_range(int n)
{
 return MIN_ALPH <= n && n <= MAX_ALPH;
}

The C programming language doesn't have a between operator. This operator can be approximated using the above form, which has the benefit of using only a single kind of comparison operator, thus reducing any confusion.

Usually comparisons are written as subject <=> object, and the subject in this case would be n. Only in the case of the between operator should this guideline be violated.

Still in my_decode, the % 100 looks suspicious, as if your code would only work for 3-digit numbers. To prove this assumption wrong, your test data should include a few test cases for longer digit sequences as well.

Stylistically, your code looks completely inconsistent. Sometimes you write count = 0, at other times you write i=1 without any spaces around the =. Don't do this formatting yourself, as it is boring. Let your editor or IDE do this work for you. Search for "auto format code", and you will find instructions for doing this.

The special case i == 1000 is wrong. Why did you even write this additional if statement? Since the whole function my_decode is a tricky little piece of code, you should explain to the reader of the code why you added each statement. Imagine that you needed to explain this code to someone who can program but who knows only the problem description and the code. Everything else that you want to explain should go into the comments.

As others already said, don't use printf-only tests. Make the tests check their results themselves. For example, during this review I solved the same problem in Go, another programming language, and I came up with this simple list of tests:

 tests := []struct {
 input string
 want uint64
 }{
 {"", 1},
 {"1", 1},
 {"11", 2},
 {"111", 3},
 {"1111", 5},
 {"11111", 8},
 {"10", 1},
 {"201", 1},
 {"11111011111", 40}, // == 5 * 8
 {"1000", 0},
 }

This list is easy to extend, and that's how you should write your tests. Of course, in C this looks a bit different, but the basic rule is to have one test per line, plus any additional comments needed.

user232636user232636 · Answer 3 · 2020-11-18 13:21:48Z

I come back to the algorithm itself. I feel there is a lot of combinatorics in this. I took a step back and did the grouping into "permutation groups".

The "ABC" as input illustrates these groups or regions:

$ ./a.out
1234567891011121314151617181920212223242526 [Code]
(ABC)DEFGHIJ(AAABAC)(AD)(AE)(AF)(AG)(AH)(AI)T(BABBBC)(BD)(BE)(BF) [Decoded: single, grouped]

"123" can be "ABC", but also "LC" and "AW". Just like "111" can be "AAA", "AK" or "KA" in OP.v1.

A longer group is "212223", which here is singlified to "BABBBC". It ia also "UVW", plus "BLBW", and many more.

#include <stdio.h>
void parse_msg(char *msg) {
 char c, cprev, cnext;
 int i;
 /* Start in a state like after a high digit 3..9 */
 cprev = '9';
 for (i = 0; msg[i] != '0円'; i++) {
 
 c = msg[i];
 cnext = msg[i+1];
 /* "10" and "20" are special cases, get rid of them */ 
 if (cnext == '0') {
 if (cprev <= '2')
 printf(")"); 
 if (c == '1')
 printf("J"); 
 if (c == '2')
 printf("T"); 
 if (c >= '3') {
 printf("******* Error: zero 30-90\n");
 return; 
 }
 cprev = '9'; // reset 
 i++; // extra skip in msg
 continue;
 }
 /* 1: No matter what cnext is (1-9), open a "(" group */
 /* But don't open if next is the null byte */ 
 /* Problem: makes "(" even if "10" follows */
 if (c == '1') {
 if (cprev >= '3') 
 if (cnext == '0円') 
 cprev = '9'; 
 else {
 printf("("); 
 cprev = c; 
 }
 printf("A");
 continue;
 }
 /* 2: Open before or close after */
 if (c == '2') {
 /* new group only if 321-326 */
 if (cprev >= '3' && cnext <= '6')
 if (cnext == '0円') {
 cprev = '9'; 
 printf("B"); 
 continue;
 }
 else
 printf("("); 
 /* "2" is "B" in any case */ 
 printf("B");
 /* "127", "229": was open, must close */ 
 if (cprev <= '2' && cnext >= '7') {
 printf(")"); 
 cprev = '9'; 
 continue;
 }
 cprev = c;
 continue;
 }
 
 /* c == 3 or higher are left */
 /* if open, then close group ")" after printing */
 if (cprev == '1' || 
 c <= '6' && cprev == '2') {
 printf("%c", c + 0x10);
 printf(")");
 cprev = c; 
 continue; 
 }
 printf("%c", c + 0x10);
 cprev = c;
 }
 /* Finish: maybe group is opened */ 
 if (cprev <= '2')
 printf(")");
 printf(" [Decoded: single, grouped] \n"); 
 return;
}
int main(void) {
 
 char *msg = "1234567891011121314151617181920212223242526";
 printf("%s [Code]\n", msg);
 parse_msg(msg);
 msg = "2102102";
 printf("\n%s [Code]\n", msg);
 parse_msg(msg);
 msg = "1181";
 printf("\n%s [Code]\n", msg);
 parse_msg(msg);
 return 0;
}

This gives three test decodings:

1234567891011121314151617181920212223242526 [Code]
(ABC)DEFGHIJ(AAABAC)(AD)(AE)(AF)(AG)(AH)(AI)T(BABBBC)(BD)(BE)(BF) [Decoded: single, grouped] 
2102102 [Code]
(B)J(B)JB [Decoded: single, grouped] 
1181 [Code]
(AAH)A [Decoded: single, grouped]

Maybe the code works now, except for the wrong parens in front of "J" and "T". The "10" and "20 really should be filtered out first, otherwise you need a 2-character look-ahead.

Stack Exchange Network

Encode message by alphabets - Follow Up

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Encode message by alphabets - Follow Up

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions