Can JDBC Oracle file stream blob inserts be speeded up?

Question 1

I have these four different ways by which I am inserting some file data into an Oracle database using JDBC and PreparedStatements.

Approach A
```
 ps.setBytes(1, fileDataInByteArray);
```

Approach B

Blob blob = ps.getConnection().createBlob();
blob.setBytes(1, fileDataInByteArray);
ps.setBlob(1, blob);

Approach C

ps.setBinaryStream(1, fileDataAsInputStream);

Approach D
```
ps.setBlob(1, fileDataAsInputStream);
```

In general, Approach C & D seem to be very useful for large fileData. A & B can run out of memory, but obviously C & D are more reliable.

However, when the file size is smaller the danger of OOM is not there. I find that A & B take less time for the database insert. I suppose that’s because when the data is a byteArray, it’s much faster.

What can I do to speed up the insert when using C&D?

I have tried wrapping the input stream with a BufferedInputSteam. I played around with the buffer size. But that did not help.

Are we at the mercy of how fast Oracle can read the data? Can that be speeded up?

Question 2

This should not have been migrated, it's primarily a coding question.

Question 3

To understand if anything can be improved, you'll need to understand where your performance bottleneck is: is it CPU on the app server or the DB? Is it network latency? Is it disk I/O on the app server or the DB? Performance is highly dependent on your specific architecture and available infrastructure and I doubt we can tell you definitively what will work best. You'll likely need the assistance of a sysadmin and DBA for your app to do complete troubleshooting.

Question 4

In Java, presumably?

Question 5

I bet A and B are not faster: you probably don't measure the time to load the fileDataInByteArray, all 4 ops are I/O bounded if you consider the full path from file system to db.

Question 6

@p3consulting I dont think that was the case

Question 7

I think Wireshark can nowadays interpret Oracle's SQL*Net protocol. Just capture traffic between App and DB server and you will see what's going on on the wire. Like how many round trips each approach requires. Also you can trace database session and look what database is doing. IMHO some of your approaches might require store LOB data in temporary tablespace(if size exceeds certain threshold)

In case of Approach B data have to be copied from TEMP into datafile after statement is executed.

Oracle's approach to LOB is slightly different from other databases. Oracle's LOB is a object backed in (temp)datafile - you can not deal with LOBs in Java directly. LOB can be much larger than your Java HEAP. Certain operations create so called "LOB Locator" - something like file handle. This locator can point either to persistent datafile or TEMP tablespace, what you have JVM is just proxy object for this locator.

Over the time Oracle's JDBC drivers did evolve in the way how they try to map JDBC standard interface onto Oracle's LOB internal implementation. That's why is such a mess and there are to many ways to store LOB in Oracle database.

PS: I'm surprised that Approach C,D are slower, they are closest to the way how Oracle treats LOBs internally. Approach B calls DBMS_LOB.CREATE_TEMPORARY in background and copies data from TEMP after statement is executed. Also it is possible that when data size is smaller than 4000B then JDBC driver might not use LOBs at all, they can rely in implicit datatype conversion.

Question 8

Thanks for sharing. Will work on it further and get back if I find something more

ibre5041 ibre5041 1,5848 silver badges14 bronze badges · Answer 1 · 2023-07-13 08:02:24Z

I think Wireshark can nowadays interpret Oracle's SQL*Net protocol. Just capture traffic between App and DB server and you will see what's going on on the wire. Like how many round trips each approach requires. Also you can trace database session and look what database is doing. IMHO some of your approaches might require store LOB data in temporary tablespace(if size exceeds certain threshold)

In case of Approach B data have to be copied from TEMP into datafile after statement is executed.

Oracle's approach to LOB is slightly different from other databases. Oracle's LOB is a object backed in (temp)datafile - you can not deal with LOBs in Java directly. LOB can be much larger than your Java HEAP. Certain operations create so called "LOB Locator" - something like file handle. This locator can point either to persistent datafile or TEMP tablespace, what you have JVM is just proxy object for this locator.

Over the time Oracle's JDBC drivers did evolve in the way how they try to map JDBC standard interface onto Oracle's LOB internal implementation. That's why is such a mess and there are to many ways to store LOB in Oracle database.

PS: I'm surprised that Approach C,D are slower, they are closest to the way how Oracle treats LOBs internally. Approach B calls DBMS_LOB.CREATE_TEMPORARY in background and copies data from TEMP after statement is executed. Also it is possible that when data size is smaller than 4000B then JDBC driver might not use LOBs at all, they can rely in implicit datatype conversion.

Thanks for sharing. Will work on it further and get back if I find something more

Stack Exchange Network

Can JDBC Oracle file stream blob inserts be speeded up?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Can JDBC Oracle file stream blob inserts be speeded up?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions