I have a method which extracts file data and converts it into a String array:
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;
import org.apache.james.mime4j.message.BodyPart;
import org.apache.james.mime4j.message.Message;
import org.apache.james.mime4j.message.Multipart;
import org.apache.james.mime4j.message.TextBody;
protected String[] extractLedesText(byte[] fileData) {
// Remove the BOM if present
byte[] array = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
byte[] data = { fileData[0], fileData[1], fileData[2] };
if (fileData.length > 3 && Arrays.equals(data, array)) {
fileData = ArrayUtils.subarray(fileData, 3, fileData.length-1);
}
String ledes = new String(fileData);
if (ledes.startsWith("MIME")) {
try {
ledes = null;
Message signed = new Message(new ByteArrayInputStream(fileData));
for (BodyPart part : ((Multipart) signed.getBody()).getBodyParts()) {
if (part.getMimeType().equalsIgnoreCase("text/plain")) {
TextBody tb = (TextBody) part.getBody();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
tb.writeTo(baos);
return extractLedesText(baos.toByteArray());
}
}
throw new BaseApplicationException(
"No MIME part found with MIME type of 'text/plain' while parsing submitted invoice file.");
} catch (IOException ioe) {
throw new BaseApplicationException(ioe);
}
} else {
return ledes.split("\\[]");
}
}
For example, below is the LEDES file:
LEDES98BI V2[] INVOICE_DATE|INVOICE_NUMBER|CLIENT_ID|LAW_FIRM_MATTER_ID|INVOICE_TOTAL[] 20150301|INV-Error_Test1|160|LF_MAT_1221|22[] 20150301|INV-Error_Test1|160|LF_MAT_1221|22[] 20150301|INV-Error_Test1|160|LF_MAT_1221|22[]
The extractLedesText method converts the above file data to a String array of lines.
We recently upgraded to Java 8 and I am wondering whether this method can be optimized further.
1 Answer
It looks like extractLedesText() is trying to do three things:
- Strip away any BOM from fileData.
- If this is a MIME message, extract only the TextBody part and pass that recursively into this method again.
- Else just do a split("\\[]") to get our desired String[] array.
So...
Data massaging
You can use a helper method to achieve this:
private static byte[] filterBOM(byte[] fileData) {
if (fileData.length < 3) {
return fileData;
}
final byte[] array = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
final byte[] data = { fileData[0], fileData[1], fileData[2] };
return fileData.length > 3 && Arrays.equals(data, array) ?
Arrays.copyOfRange(fileData, 3, fileData.length) : fileData;
}
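I think it may be better to do a fileData.length < 3 check before you construct your data array, since indexing fileData[0] to fileData[2] on a shorter array throws an ArrayIndexOutOfBoundsException. Also, I am using the Arrays utility class instead of ArrayUtils to copy part of the array. Note that the end index is exclusive in both utilities, so the original call with fileData.length-1 drops the last byte of the file; passing fileData.length keeps it.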
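As a quick check of the exclusive end index, here is a minimal, self-contained sketch. The class name BomFilterSketch and the method name stripBom are hypothetical, chosen only for this illustration; the logic mirrors the helper above using nothing but java.util.Arrays.

```java
import java.util.Arrays;

final class BomFilterSketch {
    // UTF-8 byte order mark: EF BB BF
    private static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    // Returns the input without a leading UTF-8 BOM, or the input itself if no BOM is present.
    static byte[] stripBom(byte[] fileData) {
        if (fileData.length < UTF8_BOM.length) {
            return fileData; // too short to contain a BOM, so nothing is indexed
        }
        byte[] prefix = Arrays.copyOfRange(fileData, 0, UTF8_BOM.length);
        // The end index of copyOfRange is exclusive, so fileData.length keeps the last byte.
        return Arrays.equals(prefix, UTF8_BOM)
                ? Arrays.copyOfRange(fileData, UTF8_BOM.length, fileData.length)
                : fileData;
    }

    public static void main(String[] args) {
        byte[] withBom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'A', 'B' };
        System.out.println(Arrays.toString(stripBom(withBom)));            // [65, 66]
        System.out.println(Arrays.toString(stripBom(new byte[] { 'A' }))); // [65]
    }
}
```

Both bytes after the BOM survive, and a one-byte input is returned untouched rather than causing an exception.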
Extracting the TextBody from a MIME message
You can make use of try-with-resources for both your ByteArrayInputStream and ByteArrayOutputStream instances, and I suppose this is where we can employ a bit of Stream trickery...
private static byte[] getTextBody(final String ledes) {
final TextBody tb;
try (final InputStream input = new ByteArrayInputStream(ledes.getBytes(StandardCharsets.UTF_8))) {
tb = (TextBody) ((Multipart) new Message(input).getBody()).getBodyParts().stream()
.filter(part -> part.getMimeType().equalsIgnoreCase("text/plain"))
.findFirst().orElseThrow(() -> new BaseApplicationException(
"No MIME part found with MIME type of 'text/plain' while parsing submitted invoice file."))
.getBody();
} catch (IOException e) {
throw new BaseApplicationException(e);
}
try (final ByteArrayOutputStream output = new ByteArrayOutputStream()) {
tb.writeTo(output);
return output.toByteArray();
} catch (IOException e) {
throw new BaseApplicationException(e);
}
}
We stream on the List from getBodyParts(), filtering for those whose MIME type is "text/plain", and then look for the first matching BodyPart. If there isn't any, orElseThrow() throws a new BaseApplicationException(...); otherwise we call getBody() and writeTo() on a ByteArrayOutputStream. The method returns a byte[] array unless an Exception is thrown.
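The filter / findFirst / orElseThrow chain can be tried in isolation, without mime4j on the classpath. The sketch below is only an illustration under stated assumptions: Part is a hypothetical stand-in for a MIME body part (it is not the mime4j BodyPart class), and IllegalStateException takes the place of BaseApplicationException.

```java
import java.util.Arrays;
import java.util.List;

final class FirstTextPartSketch {
    // Hypothetical stand-in for a MIME body part; NOT the mime4j BodyPart class.
    static final class Part {
        final String mimeType;
        final String content;

        Part(String mimeType, String content) {
            this.mimeType = mimeType;
            this.content = content;
        }
    }

    // Returns the content of the first part whose MIME type is text/plain (case-insensitive),
    // or throws if no such part exists.
    static String firstPlainText(List<Part> parts) {
        return parts.stream()
                .filter(p -> "text/plain".equalsIgnoreCase(p.mimeType))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("No 'text/plain' part found"))
                .content;
    }

    public static void main(String[] args) {
        List<Part> parts = Arrays.asList(
                new Part("text/html", "<p>hi</p>"),
                new Part("TEXT/PLAIN", "hello"));
        System.out.println(firstPlainText(parts)); // hello
    }
}
```

Because findFirst() short-circuits, the stream stops at the first match, which is the same behaviour as the early return inside the original for loop.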
Putting it all together
protected String[] extractLedesText(byte[] fileData) {
final String ledes = new String(filterBOM(fileData));
return ledes.startsWith("MIME") ? extractLedesText(getTextBody(ledes)) : ledes.split("\\[]");
}
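If the entries should come back without the whitespace or line breaks that follow each [] terminator, one optional Java 8 variation is to stream over the split result. This is a sketch only and changes behaviour relative to the plain split("\\[]"): the class name LedesSplitSketch and the method splitLines are hypothetical, and trimming and dropping blank entries are assumptions about what the caller wants.

```java
import java.util.Arrays;

final class LedesSplitSketch {
    // Splits on the literal "[]" terminator, trims each entry and drops blank ones.
    static String[] splitLines(String ledes) {
        return Arrays.stream(ledes.split("\\[]"))
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String sample = "LEDES98BI V2[] INVOICE_DATE|INVOICE_NUMBER|CLIENT_ID|LAW_FIRM_MATTER_ID|INVOICE_TOTAL[] "
                + "20150301|INV-Error_Test1|160|LF_MAT_1221|22[]";
        for (String line : splitLines(sample)) {
            System.out.println(line);
        }
    }
}
```

For the sample above this yields three entries: the LEDES98BI V2 marker, the header row, and one invoice row, each without leading whitespace.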