The primary goal of this code is to convert a PDF file to images. I also create a directory under C:/Media
based on a combination of the PDF name and current date and insert the uploaded PDF to the same directory as the images for safe keeping.
I am primarily concerned that I am over-complicating the process by initializing too many variables and the process by which I move the PDF to the newly created directory. I am looking for guidance or advice on how to improve this code.
public class PdfService {
public static void main(String[] args) {
System.out.println("Test of Convert to PNG");
convertPDFtoImage(args[0]);
}
public static void convertPDFtoImage(String PDFFileName) {
try {
// load PDF document
PDFDocument document = new PDFDocument();
File baseFile = new File(PDFFileName);
document.load(baseFile);
//Get todays date as a String so that you can append it to the directory
String df = new SimpleDateFormat("MM-dd-yy").format(new Date());
//Get the filename without extension so that you can create a directory based on the name
Path path = Paths.get(PDFFileName);
String filenameWithExtension = path.getFileName().toString();
String filename = FilenameUtils.removeExtension(filenameWithExtension);
//add the current date to the filename to get the directory name
String dirname = filename + "-" + df;
//Create the directory
Path dir = Paths.get("C:\\Media\\" + dirname);
Files.createDirectory(dir);
//Move the pdf to the newly created directory
movePDFToDirectory(baseFile, dirname, filenameWithExtension);
// create renderer
SimpleRenderer renderer = new SimpleRenderer();
// set resolution (in DPI)
renderer.setResolution(150);
// render
List<Image> images = renderer.render(document);
// write images to files to disk as JPEG
try {
System.out.println("Begin converting PDF...");
for (int i = 0; i < images.size(); i++) {
ImageIO.write((RenderedImage) images.get(i), "jpeg", new File("C:\\Media\\" + dirname + "\\" + filename + "_" + (i + 1) + ".jpeg"));
}
System.out.println("PDF converted to image(s) successfully.");
} catch (IOException e) {
System.out.println("ERROR: " + e.getMessage());
}
} catch (Exception e) {
System.out.println("ERROR: " + e.getMessage());
}
}
public static void movePDFToDirectory(File originalFile, String directory, String filename) {
InputStream inStream = null;
OutputStream outStream = null;
try{
File newFile = new File("C:\\Media\\" + directory + "\\" + filename);
inStream = new FileInputStream(originalFile);
outStream = new FileOutputStream(newFile);
byte[] buffer = new byte[1024];
int length;
//copy the file content in bytes
while ((length = inStream.read(buffer)) > 0){
outStream.write(buffer, 0, length);
}
inStream.close();
outStream.close();
System.out.println("Original PDF has been copied successfully!");
}catch(IOException e){
e.printStackTrace();
}
}
}
Notes:
- Eventually I plan on moving this bit of code to a web application that accepts the image as an upload. As it stands I'm just passing the file location as an argument in the
main
method. - I plan on making the base directory (
C:/Media
) a dynamic variable in the event we move directories.
2 Answers 2
+1 @Marc-Andre and some other notes:
movePDFToDirectory
: There is a similar method in Apache Commons IO:FileUtils.copyFile(java.io.File, java.io.File)
. Fortunately it's a dependency of Ghost4J, so it's on the classpath already, you just need to use it and deletemovePDFToDirectory
.Commons IO is open-source. It's interesting to see the corner cases which it handles, like comparing the
destFile.length()
andsrcFile.length()
to make sure that everything was copied properly.Effective Java, 2nd edition, Item 47: Know and use the libraries (The author mentions only the JDK's built-in libraries but I think the reasoning could be true for other libraries too.)
If somebody calls the program without any argument it throws an
ArrayIndexOutOfBoundsException
. A proper error message with the expected parameters would be more user-friendly and professional.I'd move the
main
method to a separate class. (For example,PdfConverterMain
.) It's usually a good idea to separate a class from its clients.The following line is really hard to follow:
ImageIO.write((RenderedImage) images.get(i), "jpeg", new File("C:\\Media\\" + dirname + "\\" + filename + "_" + (i + 1) + ".jpeg"));
A few explainatory local variable would make it more readable:
RenderedImage renderedImage = (RenderedImage) images.get(i); String formatName = "jpeg"; String outputFilePath = "C:\\Media\\" + dirname + "\\" + filename + "_" + (i + 1) + ".jpeg"; File outputFile = new File(outputFilePath); ImageIO.write(renderedImage, formatName, outputFile);
If you use them readers see immediately what kind of arguments
ImageIO.write
has, they don't have to be familiar with it and they don't have to check the API to understand the intent of the developer.Reference: Chapter 6. Composing Methods, Introduce Explaining Variable in Refactoring: Improving the Design of Existing Code by Martin Fowler:
Put the result of the expression, or parts of the expression, in a temporary variable with a name that explains the purpose.
And Clean Code by Robert C. Martin, G19: Use Explanatory Variables.
Actually, I'd extract out the former snippet to a named method (maybe called
writePage
) whose name summarizes the intent of that block for better readability and easier maintenance.I agree with Marc-Andre,
PDF
in variable and method names should be camelCase, it's easier to read. There is a good paragraph about it in Effective Java, 2nd edition, Item 56: Adhere to generally accepted naming conventions:While uppercase may be more common, a strong argument can made in favor of capitalizing only the first letter: even if multiple acronyms occur back-to-back, you can still tell where one word starts and the next word ends. Which class name would you rather see, HTTPURL or HttpUrl?
-
1\$\begingroup\$ Thank you for the advice, I have implemented your suggestions and my code is definitely cleaner. @Marc-Andre I have accepted this answer because the
FileUtils.copyFile
eliminates the need for my second method althoughtry-with-resource
was definitely great advice. I've upvoted both of your posts as well. Thank you again. \$\endgroup\$Dan– Dan2014年02月24日 01:38:40 +00:00Commented Feb 24, 2014 at 1:38
I see that you're using Java 7, you could use the try-with-resource for all you're IO operation. It will close the stream you're using in your application.
try{
File newFile = new File("C:\\Media\\" + directory + "\\" + filename);
inStream = new FileInputStream(originalFile);
outStream = new FileOutputStream(newFile);
byte[] buffer = new byte[1024];
int length;
//copy the file content in bytes
while ((length = inStream.read(buffer)) > 0){
outStream.write(buffer, 0, length);
}
inStream.close();
outStream.close();
System.out.println("Original PDF has been copied successfully!");
}catch(IOException e){
e.printStackTrace();
}
You're not closing your streams if there is an IOException
. This could lead to some potential problem in a future application. To get ride of the problem you could use the try-with-resource syntax. Your code could look like :
File newFile = new File("C:\\Media\\");
try ( InputStream inStream = new FileInputStream(newFile);
OutputStream outStream = new FileOutputStream(newFile);) {
byte[] buffer = new byte[1024];
int length;
// copy the file content in bytes
while ((length = inStream.read(buffer)) > 0) {
outStream.write(buffer, 0, length);
}
System.out.println("Original PDF has been copied successfully!");
} catch (IOException e) {
e.printStackTrace();
}
Argument name should not start with an uppercase letter : public static void convertPDFtoPNG(String PDFFileName)
. Do you need to capitalize pdf
?
I would revise some of you comments. Comments should be helpful, you don't need to repeat the code. Example :
// load PDF document
PDFDocument document = new PDFDocument();
File baseFile = new File(PDFFileName);
document.load(baseFile);
Does //load PDF document
is really needed ? You're already using document.load
. One liner comment are most of time just repeating what you are doing in code. Other readers don't need to know what you are doing, but rather why are you doing it. I am always wondering myself when I am adding a comment : Does my code is clear enough by itself ? Do I really to add a comment or could I change the method name or variable to better express myself. Comments are hard to maintain when the code is changing.
I know you're not in a production environment yet, but don't forget to change all the e.printStackTrace
for a logging of some sort.
It is worth to know that you can use /
for all of you path, instead of escaping \
. Java will manage the one you need for the OS you're using.
You're saying that your converting a PDF to an image, but it's not quite true. In fact, you are converting a PDF to a jpeg, which is quite different IMHO. You are hard coding the creation of the image as an jpeg, you will probably want to change in the future to support more formats. You will probably extract the writing code of the image into his own method or interface so that you have multiple implementation for each format (if needed, I am not good in image at all).
-
1\$\begingroup\$ Thank you, your suggestions were very helpful as I had no idea about the
try-with-resource
option in Java 7. \$\endgroup\$Dan– Dan2014年02月21日 17:26:36 +00:00Commented Feb 21, 2014 at 17:26 -
1\$\begingroup\$ Try-with-resources is good advice, but
File
is notCloseable
. \$\endgroup\$ntoskrnl– ntoskrnl2014年02月21日 21:03:49 +00:00Commented Feb 21, 2014 at 21:03 -
\$\begingroup\$ @ntoskrnl Thanks, I've overlooked this while typing the answer. \$\endgroup\$Marc-Andre– Marc-Andre2014年02月21日 21:55:16 +00:00Commented Feb 21, 2014 at 21:55