Iterate over a file multiple times

Question 1

The aim of the code is to print every line of one file that also appears in another file. The names of both files are given as command-line arguments.

Code:

#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char *argv[]){
    ifstream answers(argv[1]);
    ifstream candidates(argv[2]);
    for (string s; getline(answers, s);){
        // compare the current answer against every candidate line
        for (string h; getline(candidates, h);){
            if (!s.compare(h)){
                cout << h << ":" << s << endl;
            }
        }
        candidates.close(); // I know there's better than this
        candidates.open(argv[2], ios::in);
    }
}

It works, but I feel like reloading the file into memory on every iteration is redundant. Is there anything that could be improved?

Question 2

I removed the c tag. Either you compile it as C++, or as C, never both.

Question 3

@John, could you please add a problem statement? It is really hard to see what you're trying to accomplish, especially with the very confusing if (!h.compare(h)), which compares h with itself and is therefore always true.

Question 4

@Incomputable there you go, sorry

Question 5

@John, please add what you want to accomplish as well. It seems like you want to print the mismatches between the files. I don't want to edit, since I'm not 100% sure.

Question 6

@Incomputable I want to print the strings that match. The compare() method returns 0 if two strings are equal, so !0 is true: if they are equal, print.


Incomputable · Accepted Answer · 2017-03-02 14:51:26Z

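Slightly better way:

 candidates.close(); // I know there's better than this

If the improvement you are after is based on this statement, then something like std::stringstream would be a good fit: read the candidates file into it once, then rewind it for each pass instead of closing and reopening the file.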
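To make the stringstream idea concrete, here is a minimal sketch. It assumes the whole candidates file fits in memory; the function name matching_pairs and the choice to return the matches instead of printing them are mine, not part of the original code.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns "candidate:answer" for every matching pair of lines.
// The candidates stream is copied into memory once and rewound for each answer.
std::vector<std::string> matching_pairs(std::istream& answers, std::istream& candidates)
{
    std::stringstream buffer;
    buffer << candidates.rdbuf();   // copy the whole candidates stream into memory

    std::vector<std::string> matches;
    for (std::string s; std::getline(answers, s);)
    {
        buffer.clear();             // reset eofbit/failbit left by the previous pass
        buffer.seekg(0);            // rewind to the start of the in-memory copy
        for (std::string h; std::getline(buffer, h);)
        {
            if (s == h)
            {
                matches.push_back(h + ":" + s);
            }
        }
    }
    return matches;
}
```

The two ifstream objects from the question can be passed in directly. This is still a nested loop, so it only removes the repeated file opening, not the quadratic number of comparisons.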

Best way:

A std::map<std::string, std::size_t> will fit the job nicely: count how many times each line appears across both files. Once both files are read, just iterate through the map, see which strings have a counter equal to 2 or higher, and print those.

Roughly this:

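// answers and candidates are the two std::ifstream objects from the question
std::map<std::string, std::size_t> appearance_count;
std::string s;

// count every line of the first file
while (std::getline(answers, s))
{
    ++appearance_count[s];
}
// count every line of the second file into the same map
while (std::getline(candidates, s))
{
    ++appearance_count[s];
}
// a count above 1 means the line was seen more than once overall
for (const auto& reading: appearance_count)
{
    if (reading.second > 1)
    {
        std::cout << reading.first << '\n';
    }
}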

Some edge cases:

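There might be duplicates in the first file, and a line that appears twice in one file reaches a count of 2 without ever appearing in the other. To avoid that, add the lines of the first file to a std::set first, then add the set's contents to the map. If the second file has no duplicates, it needs no such treatment, since any string whose count then exceeds 1 is a wanted match. If the second file can contain duplicates as well, you'll need two sets, one per file.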
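A rough sketch of the two-set variant follows. The helper names unique_lines and common_lines are made up for illustration, and the result is returned as a vector rather than printed, so treat it as one possible arrangement rather than the definitive one.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <set>
#include <sstream>   // std::istringstream, handy for trying this without files
#include <string>
#include <vector>

// Reads every line of `in` into a set, so each distinct line is kept once.
std::set<std::string> unique_lines(std::istream& in)
{
    std::set<std::string> lines;
    for (std::string s; std::getline(in, s);)
    {
        lines.insert(s);
    }
    return lines;
}

// Returns, in sorted order, the lines that occur in both streams.
std::vector<std::string> common_lines(std::istream& answers, std::istream& candidates)
{
    std::map<std::string, std::size_t> appearance_count;
    for (const auto& line : unique_lines(answers))
    {
        ++appearance_count[line];
    }
    for (const auto& line : unique_lines(candidates))
    {
        ++appearance_count[line];
    }

    std::vector<std::string> result;
    for (const auto& reading : appearance_count)
    {
        if (reading.second == 2)   // counted at most once per file, so 2 means "in both"
        {
            result.push_back(reading.first);
        }
    }
    return result;
}
```

Because each file contributes at most 1 to a counter, duplicates inside a single file can no longer produce a false match. Note that, unlike the original nested loop, each matching line is reported once, however often it occurs.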

Hey! This looks promising, I will try it right now. Sorry again, the bug was introduced when I moved the code here.

@John, don't expect it to work like a charm :) I wrote it right into the answer box, so it might not even compile. In the future, just copy-paste the code, highlight it, and hit Ctrl-K. It will format it for you.

Thanks for the Ctrl-K tip! By the way, this thing is super fast, it's great! Now I am trying to build a hash lookup table, so what I am comparing is md5(candidates) against answers. Is there a way I could store md5(candidates):candidates, so as to know what the plaintext was?

@John Sorry, but what you say doesn't make any sense to me. I believe there is std::unordered_map, which is a hash map. So you hash every string, not a file.

Let's say I have a file full of MD5 hashes and a file full of words. I want to know for which word hash = md5(word).
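For the hash lookup table asked about in the comments, one possible sketch is to key a std::unordered_map by the hash and store the plaintext as the value. The C++ standard library has no MD5 routine, so the digest is passed in as a callback here; the Digest alias and the names build_lookup and resolve_hashes are hypothetical and only serve the illustration.

```cpp
#include <functional>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without files
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical digest callback: takes a word, returns its hash as text.
// In real use this would wrap an MD5 routine from a crypto library.
using Digest = std::function<std::string(const std::string&)>;

// Builds a table mapping digest(word) -> word for every line of `candidates`.
std::unordered_map<std::string, std::string>
build_lookup(std::istream& candidates, const Digest& digest)
{
    std::unordered_map<std::string, std::string> plaintext_by_hash;
    for (std::string word; std::getline(candidates, word);)
    {
        // emplace keeps the first word if two words produce the same hash
        plaintext_by_hash.emplace(digest(word), word);
    }
    return plaintext_by_hash;
}

// Returns "hash:word" for each line of `answers` whose plaintext is known.
std::vector<std::string> resolve_hashes(
    std::istream& answers,
    const std::unordered_map<std::string, std::string>& plaintext_by_hash)
{
    std::vector<std::string> found;
    for (std::string hash; std::getline(answers, hash);)
    {
        const auto it = plaintext_by_hash.find(hash);
        if (it != plaintext_by_hash.end())
        {
            found.push_back(hash + ":" + it->second);
        }
    }
    return found;
}
```

Each word is hashed exactly once, and every hash from the answers file is then resolved with a single average constant-time lookup.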
