I have these four different ways by which I am inserting some file data into an Oracle database using JDBC and PreparedStatements.
Approach A
ps.setBytes(1, fileDataInByteArray);
Approach B
Blob blob = ps.getConnection().createBlob(); blob.setBytes(1, fileDataInByteArray); ps.setBlob(1, blob);
Approach C
ps.setBinaryStream(1, fileDataAsInputStream);
Approach D
ps.setBlob(1, fileDataAsInputStream);
In general, Approach C & D seem to be very useful for large fileData. A & B can run out of memory, but obviously C & D are more reliable.
However, when the file size is smaller the danger of OOM is not there. I find that A & B take less time for the database insert. I suppose that’s because when the data is a byteArray, it’s much faster.
What can I do to speed up the insert when using C&D?
I have tried wrapping the input stream with a BufferedInputSteam. I played around with the buffer size. But that did not help.
Are we at the mercy of how fast Oracle can read the data? Can that be speeded up?
-
3This should not have been migrated, it's primarily a coding question.Paul W– Paul W2023年07月06日 12:29:53 +00:00Commented Jul 6, 2023 at 12:29
-
To understand if anything can be improved, you'll need to understand where your performance bottleneck is: is it CPU on the app server or the DB? Is it network latency? Is it disk I/O on the app server or the DB? Performance is highly dependent on your specific architecture and available infrastructure and I doubt we can tell you definitively what will work best. You'll likely need the assistance of a sysadmin and DBA for your app to do complete troubleshooting.pmdba– pmdba2023年07月06日 22:17:34 +00:00Commented Jul 6, 2023 at 22:17
-
In Java, presumably?Peter Mortensen– Peter Mortensen2023年07月18日 21:23:12 +00:00Commented Jul 18, 2023 at 21:23
-
I bet A and B are not faster: you probably don't measure the time to load the fileDataInByteArray, all 4 ops are I/O bounded if you consider the full path from file system to db.p3consulting– p3consulting2023年07月21日 12:54:21 +00:00Commented Jul 21, 2023 at 12:54
-
@p3consulting I dont think that was the caseRaster R– Raster R2023年07月22日 05:27:46 +00:00Commented Jul 22, 2023 at 5:27
1 Answer 1
I think Wireshark can nowadays interpret Oracle's SQL*Net protocol. Just capture traffic between App and DB server and you will see what's going on on the wire. Like how many round trips each approach requires. Also you can trace database session and look what database is doing. IMHO some of your approaches might require store LOB data in temporary tablespace(if size exceeds certain threshold)
In case of Approach B data have to be copied from TEMP into datafile after statement is executed.
Oracle's approach to LOB is slightly different from other databases. Oracle's LOB is a object backed in (temp)datafile - you can not deal with LOBs in Java directly. LOB can be much larger than your Java HEAP. Certain operations create so called "LOB Locator" - something like file handle. This locator can point either to persistent datafile or TEMP tablespace, what you have JVM is just proxy object for this locator.
Over the time Oracle's JDBC drivers did evolve in the way how they try to map JDBC standard interface onto Oracle's LOB internal implementation. That's why is such a mess and there are to many ways to store LOB in Oracle database.
PS: I'm surprised that Approach C,D are slower, they are closest to the way how Oracle treats LOBs internally. Approach B calls DBMS_LOB.CREATE_TEMPORARY in background and copies data from TEMP after statement is executed. Also it is possible that when data size is smaller than 4000B then JDBC driver might not use LOBs at all, they can rely in implicit datatype conversion.
-
Thanks for sharing. Will work on it further and get back if I find something moreRaster R– Raster R2023年07月13日 08:19:06 +00:00Commented Jul 13, 2023 at 8:19