$\begingroup$
$\endgroup$
5
This is in relation to this post I made. I eventually solved this by the following approach:
- Take the un-ordered file with all the purchasing data and use the UNIX
sort
utility to order the file on the customer identifier. Since the file can be 500 million lines (and grows over time) I used the-T
option and directed the utility to use my hard drive to carry out it's sorting computation. Thesort
utility uses external sort to do the sorting. - I created a program which took this ordered file and in one pass computed the required values (a single for-loop, no nested for-loops)
Step 2 is O(n) clearly. But what about Step 1? What is the complexity here and more to the point of the question, what is the overall complexity of this two-stage solution?
sort
uses for "external sorting", whatever that means. $\endgroup$Ω((N/M)log(N/M)/log(R))
. But how do I use this to compute the overall complexity? $\endgroup$