###General
General
###General
General
After using poor man profiling (using Stopwatch) I figured that the provided StripHighBitCharacters()
method using linq took around 39 seconds.
Using just a loop and an if
like so
public void StripHighBitCharacters(string value)
{
foreach (var c in value)
{
if (c > 31 && c < 176)
{
sb.Append(c);
}
}
sb.AppendLine();
}
the measurements went better. It took only 22 seconds.
Both tests had been done using a file with 1.3 GB.
After using poor man profiling (using Stopwatch) I figured that the provided StripHighBitCharacters()
method using linq took around 39 seconds.
Using just a loop and an if
like so
public void StripHighBitCharacters(string value)
{
foreach (var c in value)
{
if (c > 31 && c < 176)
{
sb.Append(c);
}
}
sb.AppendLine();
}
the measurements went better. It took only 22 seconds.
Both tests had been done using a file with 1.3 GB.
###General
- You are using
using
statements which is always good. - You have small and well named methods which is good as well but the
StripHighBitCharacters()
method doesn't do what the name implies. TheBuildClearString()
method is doing whatStripHighBitCharacters()
should do based on its name. - The method parameter of
StripHighBitCharacters
is poorly named. Why don't you name itfileName
? - You should be consistent with the usage of the
var
type. Why didn't you use it e.g for thestring writeFile
?
@1201ProgramAlarm mentioned in his/her answer reusing the StringBuilder
which is the way to go for a performance boost but I would take this further.
I would initialize the
StringBuilder
with a starting capacity of at least 4 kb, because usually your filesystem is storing its data in 4 kb blocks. But because you expect to get real big files you should increase the capacity to e.g 4mb.Instead of creating a new file with a filename of
fn + "B"
you should usePath.GetTempFileName()
and after the content is written delete the original and move the temp file to the original destination.
Implementing the mentioned points will lead to
private const int maxCapacity = 4096 * 1024;
private StringBuilder sb = new StringBuilder(maxCapacity);
public void CleanFile(string fileName)
{
var tempFileName = Path.GetTempFileName();
using (var reader = new StreamReader(fileName))
using (var writer = new StreamWriter(tempFileName))
{
sb.Length = 0;
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line.Length + sb.Length > maxCapacity)
{
writer.Write(sb.ToString());
sb.Length = 0;
}
StripHighBitCharacters(line);
}
}
File.Delete(fileName);
File.Move(tempFileName, fileName);
}
public void StripHighBitCharacters(string value)
{
foreach (var c in value.Where(c => c > 31 && c < 176))
{
sb.Append(c);
}
sb.AppendLine();
}