Skip to main content
Code Review

Return to Answer

Commonmark migration
Source Link

###General

General

###General

General

added 550 characters in body
Source Link
Heslacher
  • 50.9k
  • 5
  • 83
  • 177

After using poor man profiling (using Stopwatch) I figured that the provided StripHighBitCharacters() method using linq took around 39 seconds.

Using just a loop and an if like so

public void StripHighBitCharacters(string value)
{
 foreach (var c in value)
 {
 if (c > 31 && c < 176)
 {
 sb.Append(c);
 }
 }
 sb.AppendLine();
} 

the measurements went better. It took only 22 seconds.

Both tests had been done using a file with 1.3 GB.

After using poor man profiling (using Stopwatch) I figured that the provided StripHighBitCharacters() method using linq took around 39 seconds.

Using just a loop and an if like so

public void StripHighBitCharacters(string value)
{
 foreach (var c in value)
 {
 if (c > 31 && c < 176)
 {
 sb.Append(c);
 }
 }
 sb.AppendLine();
} 

the measurements went better. It took only 22 seconds.

Both tests had been done using a file with 1.3 GB.

Source Link
Heslacher
  • 50.9k
  • 5
  • 83
  • 177

###General

  • You are using using statements which is always good.
  • You have small and well named methods which is good as well but the StripHighBitCharacters() method doesn't do what the name implies. The BuildClearString() method is doing what StripHighBitCharacters() should do based on its name.
  • The method parameter of StripHighBitCharacters is poorly named. Why don't you name it fileName ?
  • You should be consistent with the usage of the var type. Why didn't you use it e.g for the string writeFile ?

@1201ProgramAlarm mentioned in his/her answer reusing the StringBuilder which is the way to go for a performance boost but I would take this further.

  • I would initialize the StringBuilder with a starting capacity of at least 4 kb, because usually your filesystem is storing its data in 4 kb blocks. But because you expect to get real big files you should increase the capacity to e.g 4mb.

  • Instead of creating a new file with a filename of fn + "B" you should use Path.GetTempFileName() and after the content is written delete the original and move the temp file to the original destination.

Implementing the mentioned points will lead to

private const int maxCapacity = 4096 * 1024;
private StringBuilder sb = new StringBuilder(maxCapacity);
public void CleanFile(string fileName)
{
 var tempFileName = Path.GetTempFileName();
 using (var reader = new StreamReader(fileName))
 using (var writer = new StreamWriter(tempFileName))
 {
 sb.Length = 0;
 while (!reader.EndOfStream)
 {
 var line = reader.ReadLine();
 if (line.Length + sb.Length > maxCapacity)
 {
 writer.Write(sb.ToString());
 sb.Length = 0;
 }
 StripHighBitCharacters(line);
 }
 }
 
 File.Delete(fileName);
 File.Move(tempFileName, fileName);
}
public void StripHighBitCharacters(string value)
{
 foreach (var c in value.Where(c => c > 31 && c < 176))
 {
 sb.Append(c);
 }
 sb.AppendLine();
}
lang-cs

AltStyle によって変換されたページ (->オリジナル) /