Given a 40 character string representing outcomes of 40 coin tosses, find the frequency destribution of all the possible outcome triplets. So, for string like : HHHH....40 Hs ,the result is HHH : 38 and the rest 0. The exact input output format is given in the link.
My approach, as is clear from the code, is to iterate through the string and match all consecutive sub-strings of length 3, while increasing the frequency count.
My solution got accepted, but I am not really happy with what I have done here. As you can see, there is a lot of redundant code I could not remove.
For instance, if string "TTT" could be directly mapped to the macro TTT, somehow, and so on, a large number of lines of code could have been reduced. I know technically strings and macros are two different things, but is there some trick that I am missing?
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<limits.h>
#define TTT 0
#define TTH 1
#define THT 2
#define THH 3
#define HTT 4
#define HTH 5
#define HHT 6
#define HHH 7
void getCount(char str[] , int count[]){
int i;
for(i=0;i<8;++i)
count[i]=0;
int l = strlen(str);
for(i=2;i<l;++i){
if(strncmp(str+i-2 , "TTT" , 3)==0)
++count[TTT];
if(strncmp(str+i-2 , "TTH" , 3)==0)
++count[TTH];
if(strncmp(str+i-2 , "THT" , 3)==0)
++count[THT];
if(strncmp(str+i-2 , "THH" , 3)==0)
++count[THH];
if(strncmp(str+i-2 , "HTT" , 3)==0)
++count[HTT];
if(strncmp(str+i-2 , "HTH" , 3)==0)
++count[HTH];
if(strncmp(str+i-2 , "HHT" , 3)==0)
++count[HHT];
if(strncmp(str+i-2 , "HHH" , 3)==0)
++count[HHH];
}
}
int main()
{
char str[41];
int t,n,count[8],i;
scanf("%d",&t);
while(t--){
scanf("%d",&n);
printf("%d ",n);
scanf("%s",str);
getCount(str,count);
for(i=0;i<8;++i){
printf("%d ",count[i]);
}
printf("\n");
}
return 0;
}
-
\$\begingroup\$ Please give details about the problem rather than posting where's described. We would also appreciate if you told us how you decided to solve it. \$\endgroup\$edmz– edmz2014年05月26日 14:30:55 +00:00Commented May 26, 2014 at 14:30
-
\$\begingroup\$ The problem is: given a 40 character string representing outcomes of 40 coin tosses, find the frequency destribution of all the possible outcome triplets. So, for string like : HHHH....40 Hs ,the result is HHH : 38 and the rest 0. The exact input output format is given in the link. \$\endgroup\$user43115– user431152014年05月26日 14:34:33 +00:00Commented May 26, 2014 at 14:34
-
\$\begingroup\$ My approach : as is clear from the code, iterate through the string and match all consecutive sub-strings of length 3, while increasing the frequency count \$\endgroup\$user43115– user431152014年05月26日 14:36:45 +00:00Commented May 26, 2014 at 14:36
-
\$\begingroup\$ In the main post, so that there's no need to go through comments to find out it. \$\endgroup\$edmz– edmz2014年05月26日 14:40:38 +00:00Commented May 26, 2014 at 14:40
3 Answers 3
Associating a set of constants with corresponding strings is difficult in C. One approach is to define a structure, such as
struct patterns {
const char combination[4];
int count;
};
Here we have space for a string and an integer to count occurrences. Creating an array of these and initializing it gives us this:
struct patterns patterns[N_PATTERNS] = {
{"HHH", 0}, {"HHT", 0}, {"HTH", 0}, {"HTT", 0},
{"THH", 0}, {"THT", 0}, {"TTH", 0}, {"TTT", 0},
};
Then you can loop through the strings by accessing each array member and incrementing the counts. The count and string are kept together so that indexing errors cannot happen.
Also, your main
is too complicated. Just reading the string from the
command line is easier.
Putting it together, gives me this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define N_PATTERNS 8
struct patterns {
const char combination[4];
int count;
};
static int increment_occurance(struct patterns *p, const char *s)
{
for (int i = 0; i < N_PATTERNS; ++i) {
if (!memcmp(s, p[i].combination, 3)) {
p[i].count += 1;
return 1;
}
}
return 0;
}
static void count_occurances(struct patterns *p, const char *s)
{
const char *end = s + strlen(s) - 2;
for (; s < end; ++s) {
increment_occurance(p, s);
}
}
int main(int argc, char **argv)
{
struct patterns patterns[N_PATTERNS] = {
{"HHH", 0}, {"HHT", 0}, {"HTH", 0}, {"HTT", 0},
{"THH", 0}, {"THT", 0}, {"TTH", 0}, {"TTT", 0},
};
if (argc < 2) {
fprintf(stderr, "Usage: %s string\n", argv[0]);
return EXIT_FAILURE;
}
count_occurances(patterns, argv[1]);
for (int i = 0; i < N_PATTERNS; ++i) {
printf("%s %d\n", patterns[i].combination, patterns[i].count);
}
return EXIT_SUCCESS;
}
EDIT: note that we can use the more efficient memcmp
instead of strncmp
in increment_occurance
.
-
\$\begingroup\$ Could you cite a source that points to
memcmp()
being more efficient thanstrncmp()
? \$\endgroup\$syb0rg– syb0rg2014年07月27日 20:14:28 +00:00Commented Jul 27, 2014 at 20:14 -
1\$\begingroup\$ @syb0rg
memcmp
is going to check char pairs only for equivalence, whereasstrncmp
must also check each char against0円
. Having said that, I don't really like the embedded length (3) so I'm not sure that was good advice - and the efficiency difference will be trivial. \$\endgroup\$William Morris– William Morris2014年07月29日 13:43:11 +00:00Commented Jul 29, 2014 at 13:43
Style
- Variable names: it's ok naming variables with a meaningless one character if and only if that program will not be read from another human.
Otherwise just don't: use proper names, that will help both you and future readers. - Take out headers you don't use: I haven't seen any function which belongs to
stdlib and limit
being used. - Spaces: Do not underestimate them. They are very useful to make the code cleaner. Moreover, the have no drawback: compilers ignore them.
- Messages: Warn the user about what kind of input you are actually waiting for.
Code
I have noticed an attempt to avoid unsafe functions with
strncmp
, but the issue is located before you compare them. That is, when you acquire them.
Althoughscanf
can be used for strings, and in this case it is the right choice, it can easily overflow the buffer that's filling. Format the string so:%(character_to_read)s
in order to prevent buffer overflows.
I'd use it rather thanfgets
becasefgets
reads spaces as well, which we would have to get rid from later.You can declare a variable inside the
for
loop since C99.
Optimization
An associative container would be the best option. The website doesn't have any constraints about the language: C does not have any associative container (or hash table) in its stdlib, while C++ does.
A lookup table (string -> array index) is likely faster than strncmp
ing, but I wouldn't recommend it unless needed, especially if you have to implement it from scratch. There might be other dark tricks though.
You could start by replacing the #define
s with an enum
:
typedef enum { TTT, TTH, THT, THH, HTT, HTH, HHT, HHH } Combination;
As for mapping strings to macros, I suppose this could work with a toString()
function. It could have a switch
statement that receives an enum
value and returns its associated string. This may not be too concise, but C doesn't give you very many concise options.
-
\$\begingroup\$ I was thinking of a more general solution. So, that I could keep adding macros in a header file and without changing the c code, access the macros through strings. \$\endgroup\$user43115– user431152014年05月26日 15:04:45 +00:00Commented May 26, 2014 at 15:04
-
\$\begingroup\$ @user43115: That may become messy, and I'm not sure if online challenges involve having multiple files. You may just need to try for something within one file. \$\endgroup\$Jamal– Jamal2014年05月26日 15:08:12 +00:00Commented May 26, 2014 at 15:08
-
\$\begingroup\$ I was just curious as I remember having to write over a 100 lines of code in a compilers course assignment in a similar scenario. I did not have the exact statement of that so, I used this problem statement instead. \$\endgroup\$user43115– user431152014年05月26日 15:10:43 +00:00Commented May 26, 2014 at 15:10
-
\$\begingroup\$ Also, how does using enum guarantee that the actual int values of TTT ,..,HHH will be in the range [0,8). I could have other enum types in the program as well. \$\endgroup\$user43115– user431152014年05月26日 16:10:33 +00:00Commented May 26, 2014 at 16:10
-
1\$\begingroup\$ @user43115: The numerical range may be specified somewhere in the C standard (there may be some compilers that use a different range, but
enum
s almost always start at 0). It has also been stated thatenum
s are preferred over#define
s. \$\endgroup\$Jamal– Jamal2014年05月26日 16:26:22 +00:00Commented May 26, 2014 at 16:26
Explore related questions
See similar questions with these tags.