I have an object that represents a PDF file. In the constructor, I pull out various information about the file name and make it available via properties:
public class Invoice
{
//public properties
public string FullPath {get { return this.fullPath;} }
public string FileNameWithoutExtension { get { return this.fileNameWithoutExtension;} }
public string FileName { get { return this.fileName; } }
public string BatchSequenceNumber { get { return this.batchSequenceNumber; } }
//private fields
private string fullPath;
private string fileNameWithoutExtension;
private string fileName;
private string batchSequenceNumber;
//Default construtor
public Invoice(string filePath)
{
this.fullPath = filePath;
this.fileName = Path.GetFileName(filePath);
this.fileNameWithoutExtension = Path.GetFileNameWithoutExtension(filePath);
this.batchSequenceNumber = Path.GetFileNameWithoutExtension(filePath).Split('_').LastOrDefault();
}
}
One of the attributes I would like to get is the number of pages in the PDF.
I am using iTextSharp
and I have a PageCount
method in the above Invoice
class that looks like this:
public int PageCount()
{
PdfReader reader = null;
int pageCount;
try
{
reader = new PdfReader(this.FullPath);
pageCount = reader.NumberOfPages;
}
catch (Exception ex)
{
throw new Exception("Error reading pdf! " + ex.Message);
}
finally
{
if (reader != null) { reader.Close(); }
}
return pageCount;
}
This works fine, except now I need to get the count of pages at several times during the object lifetime. I do not want to keep opening the PDF each time to get the count so these are my ideas:
Rename method
PageCount
toCountPages
and store the result in field/propertyPageCount
.Count the pages in the constructor and store the result in a field/propertry
PageCount
and don't expose aCountPages
method.
Number 1 seems preferable to me, but what if someone tries to read PageCount
before running method CountPages
? How best to handle this situation, set PageCount
to 0 in the constructor?
Number 2 seems bad because it feels wrong to open the PDF file in the constructor, is it? (I have been reading this)
-
1\$\begingroup\$ Some others beat me to answering, but I wanted to mention that this is some of the cleaner code we see around here. Good names and simple code. I like it. \$\endgroup\$RubberDuck– RubberDuck2014年09月10日 14:37:04 +00:00Commented Sep 10, 2014 at 14:37
3 Answers 3
Your intuition is correct. A constructor shouldn't be doing much, other than constructing an object.
I also agree that PageCount
sounds much more like a property than a method... and CountPages()
would be more appropriate for a method that actually counts pages.
Now, the problem would be that the PageCount
getter would return 0
until CountPages()
is called - setting it to 0
in the constructor would only be redundant, since PageCount
would be an int
and an int
gets initialized to default(int)
, which is 0
.
I think your problem stems from the class doing too many things. I'd introduce a InvoicePdfLoader
class exposing some Load(string)
method that returns an immutable struct
:
public class InvoicePdfLoader
{
public InvoiceInfo Load(string path)
{
int pageCount;
using (var reader = new PdfReader(path)) // assuming PdfReader : IDisposable
{
pageCount = reader.NumberOfPages
}
return new InvoiceInfo(path, pageCount);
}
}
Notice the using
block around the reader
instance: if PdfReader
implements the IDisposable
interface, you need to properly dispose it. If it doesn't, the way you have it (manually closing it in a finally
clause) is perfect.
And then InvoiceInfo
is just a simple, lightweight value type:
public struct InvoiceInfo
{
private readonly string _fullPath;
private readonly string _fileNameWithoutExtension;
private readonly string _fileName;
private readonly string _batchSequenceNumber;
private readonly int _pageCount;
public InvoiceInfo(string path, int pageCount)
{
_fullPath = path;
_fileNameWithoutExtension = Path.GetFileNameWithoutExtension(path);
_fileName = Path.GetFileName(path);
_batchSequenceNumber = Path.GetFileNameWithoutExtension(path)
.Split('_')
.LastOrDefault();
_pageCount = pageCount;
}
public string FullPath { get { return _fullPath; } }
public string FileNameWithoutExtension { get { return _fileNameWithoutExtension; } }
public string FileName { get { return _fileName; } }
public string BatchSequenceNumber { get { return _batchSequenceNumber; } }
public int PageCount { get { return _pageCount; } }
}
I find the name InvoiceInfo
better conveys the essence of what you're having here - it's not really an Invoice
, rather just some metadata about an invoice.
-
\$\begingroup\$ Ah yes - I have a habit of trying to squeeze as much as possible into a class! Would it make sense for
InvoicePdfLoader
to become a static class? \$\endgroup\$chazjn– chazjn2014年09月10日 14:51:41 +00:00Commented Sep 10, 2014 at 14:51 -
\$\begingroup\$ As is, it would, but I personally don't like
static
classes much. Being non-static makes classes that depend on it easier to test. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年09月10日 14:53:22 +00:00Commented Sep 10, 2014 at 14:53 -
1\$\begingroup\$ Also weirdly
PdfReader
does not implementIDisposable
. I may do try though:class CustomPDFReader : PdfReader, IDisposable { public void Dispose() { this.Close(); } }
But that's for another question... \$\endgroup\$chazjn– chazjn2014年09月10日 15:58:17 +00:00Commented Sep 10, 2014 at 15:58 -
1\$\begingroup\$ I wouldn't do that, unless the
PdfReader
instance was a private field (which it doesn't need to be) - there's nothing wrong with manually closing it in afinally
block ;) \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年09月10日 16:01:45 +00:00Commented Sep 10, 2014 at 16:01
Assuming you want to avoid opening the file unless necessary (ie until someone requests the PageCount), I would use a nullable private field, as follows:
private int? _pageCount;
public int PageCount
{
get
{
if (!_pageCount.HasValue) {
//existing code to determine page count
_pageCount = /* result */
}
return _pageCount;
}
}
The first time the property is accessed, the private member will be null, so the code to update the value will be run. Subsequent accesses will use the cached value.
-
1\$\begingroup\$ +1 - the key here is assuming you want to avoid opening the file unless necessary. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年09月10日 14:50:09 +00:00Commented Sep 10, 2014 at 14:50
-
1\$\begingroup\$ Lazy properties is where I'd go with it as well. However, I would join all of these properties together (page count, word count, file size, whatever) in a single initializer to be called when one of the properties is accessed. Otherwise reading n closely related properties will still open/close the document n times. \$\endgroup\$Naltharial– Naltharial2014年09月10日 17:36:27 +00:00Commented Sep 10, 2014 at 17:36
-
2
Wouldn't it be better to replace the private field/public property structure with a public property that has a private setter, like this:
public string FullPath { get; private set; }
Makes the code more concise.
-
1\$\begingroup\$ Good catch! Auto-properties FTW! \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年09月10日 15:12:06 +00:00Commented Sep 10, 2014 at 15:12
-
1\$\begingroup\$ @chazjn do note, however, that a
private readonly
private field conveys intent in a more unambiguous way; I wouldn't use such auto-properties in an immutablestruct
. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年09月10日 15:44:56 +00:00Commented Sep 10, 2014 at 15:44