Restucture code to wrap optional resources in a using block

Question 1

I have some C# I/O code that can take one or two input files. I would like to wrap the Stream objects in a using block, but cannot find a way to succinctly express this in code.

Currently, I have two large files (>1GB), that contain concatenated TIFF files and PDF files respectively that need to be extracted into individual files. The metadata of the individual files (TIFF Tags and PDF Keywords) cross-reference one another, so the processing rules are different if I receive both files at once, i.e. there is more information and verification logic.

I have a FileProcessor class than implements an IEnumerable<Range> to return the byte ranges of each individual file within the archive. My current implementation just wraps the main processing loop in a try/finally block and calls Dispose manually.

// Struct to represent the byte range of a file in the archive
struct Range
{
 public long start;
 public long end;
}
// Custom class that takes a Stream object in the constructor and implements IEnumerable<Range> to return the individual files in sequence
public class FileProcessor : IDisposable, IEnumerable, IEnumerable<Range>
{
 private Stream _stream;
 public FileProcessor(Stream stream) { this._stream = stream; }
 public virtual void Dispose()
 {
 if (_stream != null) { _stream.Dispose(); _stream = null; }
 }
}
public class TiffProcessor : FileProcessor { ... }
public class PdfProcessor : FileProcessor { ... }
// Snippet of the dispatching/processing logic
TiffProcessor infile1 = null,
PdfProcessor infile2 = null;
try
{
 if (HasFirstInputFile)
 {
 infile1 = new TiffProcessor(File.OpenRead(FirstInputFileName));
 }
 if (HasSecondInputFile)
 {
 infile2 = new PdfProcessor(File.OpenRead(SecondInputFileName));
 }
 if (infile1 != null && infile2 == null) 
 {
 foreach (var range in infile1) { ... }
 }
 if (infile1 == null && infile2 != null) 
 {
 foreach (var range in infile2) { ... }
 }
 if (infile1 != null && infile2 != null) 
 {
 foreach (var ranges in infile1.Zip(infile2, (tiff, pdf) => new Range[] { tiff, pdf })) { ... }
 }
}
finally
{
 if (infile1 != null) { infile1.Dispose(); }
 if (infile2 != null) { infile2.Dispose(); }
}

Is there any construct that can help clean up and organize this code structure. It's not too bad now, but could become exponentially more complex if additional input streams are required in the future.

Edit

Based on the comments received, perhaps creating an intermediate Strategy object that managed resources would work?

using (var strategy = StrategyFactory.Create(CommandLineArgs))
{
 strategy.Process();
}
public static class StrategyFactory
{
 public static IStrategy Create(CommandLineArguments args)
 {
 if (args.FirstInputFile != null && args.SecondInputFile == null)
 { 
 return new FirstStrategy(args.FirstInputFile);
 }
 if (args.FirstInputFile == null && args.SecondInputFile != null)
 { 
 return new SecondStrategy(args.SecondInputFile);
 }
 if (args.FirstInputFile != null && args.SecondInputFile != null)
 { 
 return new ThirdStrategy(args.FirstInputFile, args.SecondInputFile);
 }
 }
}

Question 2

If you expect additional input streams in the future, then you should treat the input as a collection of files, not something like "maybe one, maybe two, maybe three, possibly four files".

Question 3

Unfortunately, not possible. Each input file is in a different, semi-structured text format. The processing needs are different based on combination of the files. Ideally, I'd like to do some sort of dynamic dispatch or Strategy pattern based on the file format combination.

Question 4

How about having an abstract superclass that implements the shared code and have three subclasses for the first, second or both files?

Question 5

Well, after some thought I think my solution will end up being four-fold: 1) have an abstract superclass that implements IDisposable. 2) Create a set of small classes to use as stongly-typed input parameters, 3) Have each strategy inherit from the superclass and implement a constructor using the classes from (2), and 4) Implement the Factory.Create method to use reflection to iterate over all the inherited class and their constructors and then instantiate the class that can consume the most non-null input parameter types via Activator.CreateInstance.

Question 6

If each input file is in a different format, how come you use the same FileProcessor class to parse them? And parsed items are also of the same type, since you Zip them, right? It would be helpful if you would add some background on what you are trying to do.

Question 7

The only think that I would do different is that I wouldn't dispose streams inside your FileProcessor classes. Disposing of resources should be done at the same level (or scope) where they were initialized. Unfortunately, even BCL classes like StreamWriter and BinaryWriter dispose underlying streams when disposed, so I cannot claim that this is unexpected behavior either.

IMO, there is nothing wrong with your code. Going out of your way simply to save two lines of code might make your code less readable, so your first version might easily be the simplest one.

Having said that, if you really want to wrap it in a single using block, one idea (and I don't actually find it "better" than yours in any way) might be to wrap all your disposable resources into a single class:

public class InputFiles : IDisposable
{
 private readonly Dictionary<string, Stream> _files 
 = new Dictionary<string, Stream>();
 public Stream GetStream(string type)
 {
 Stream input = null;
 _files.TryGetValue(type, out input);
 return input;
 }
 public InputFiles(string[] args)
 {
 // this is just an idea, you probably don't
 // use the extension to determine the type,
 // but it still seems a bit more general than
 // having strongly typed properties
 foreach (var path in args)
 {
 var ext = Path.GetExtension(path); // pdf or tif?
 _files[ext] = File.OpenRead(path);
 }
 }
 public void Dispose()
 {
 foreach (var stream in _files)
 stream.Value.Dispose();
 }
}

And then dispose the entire object when done:

static void Main(string[] args)
{
 using (var input = new InputFiles(args))
 {
 var pdf = input.GetStream("pdf");
 var tif = input.GetStream("tif");
 // use the appropriate strategy
 // (parsing strategy should not be concerned with
 // disposing of resources)
 Process(pdf, tif);
 }
}

Question 8

I like this since it's a totally different approach. I had not considered moving the management of the files into a stand-alone class. I might give this a try....Also, your comment nails the core problem on the head -- the parsing strategy should not have to manage the resources.

Question 9

I ended up using this approach and am pretty happy with it. It's nice that there's no resource management in the different strategy implementations. I did go with a strongly-typed implementation for stream lookups, i.e. var pdf = input.GetStream(typeof(PdfArchiveFile));

vgru vgru 1,39811 silver badges18 bronze badges · Accepted Answer · 2012-07-12 15:17:45Z

The only think that I would do different is that I wouldn't dispose streams inside your FileProcessor classes. Disposing of resources should be done at the same level (or scope) where they were initialized. Unfortunately, even BCL classes like StreamWriter and BinaryWriter dispose underlying streams when disposed, so I cannot claim that this is unexpected behavior either.

IMO, there is nothing wrong with your code. Going out of your way simply to save two lines of code might make your code less readable, so your first version might easily be the simplest one.

Having said that, if you really want to wrap it in a single using block, one idea (and I don't actually find it "better" than yours in any way) might be to wrap all your disposable resources into a single class:

public class InputFiles : IDisposable
{
 private readonly Dictionary<string, Stream> _files 
 = new Dictionary<string, Stream>();
 public Stream GetStream(string type)
 {
 Stream input = null;
 _files.TryGetValue(type, out input);
 return input;
 }
 public InputFiles(string[] args)
 {
 // this is just an idea, you probably don't
 // use the extension to determine the type,
 // but it still seems a bit more general than
 // having strongly typed properties
 foreach (var path in args)
 {
 var ext = Path.GetExtension(path); // pdf or tif?
 _files[ext] = File.OpenRead(path);
 }
 }
 public void Dispose()
 {
 foreach (var stream in _files)
 stream.Value.Dispose();
 }
}

And then dispose the entire object when done:

static void Main(string[] args)
{
 using (var input = new InputFiles(args))
 {
 var pdf = input.GetStream("pdf");
 var tif = input.GetStream("tif");
 // use the appropriate strategy
 // (parsing strategy should not be concerned with
 // disposing of resources)
 Process(pdf, tif);
 }
}

I like this since it's a totally different approach. I had not considered moving the management of the files into a stand-alone class. I might give this a try....Also, your comment nails the core problem on the head -- the parsing strategy should not have to manage the resources.
I ended up using this approach and am pretty happy with it. It's nice that there's no resource management in the different strategy implementations. I did go with a strongly-typed implementation for stream lookups, i.e. var pdf = input.GetStream(typeof(PdfArchiveFile));

Stack Exchange Network

Restucture code to wrap optional resources in a using block

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Restucture code to wrap optional resources in a using block

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions