Categories
Coding

Blob Hell

JDBC 4.0 Blob improvements are welcome

I was happy to see that the JDBC 4.0 spec will contain support for improved BLOB and CLOB handling. Blob handling is still one of the most awkward areas of cross-database implementation work. It would be nice to have a consistent interface for blob creation, update, and streaming across databases. Ostensibly, that’s what we have in JDBC 3 . However, in practise, things are not so simple.

One of the fundamental issues is how the RDBMS handles LOB data. To perform an INSERT or UPDATE of blob data, the RDBMS hands the client back a pointer or “locator” to the blob data. The manner in which they do this varies across implementations and vendors. This means in practise that you have different limitations depending on which DBMS you are using. Oracle has always been a culprit (until version 10) with its infamous 4k limit on streaming data, which has resulted in hundreds of applications and frameworks (Hibernate among them) having to write Oracle-specific JDBC-level code to work around this issue.

We are currently working on a system where we have written a server component which accepts standard FTP connections from clients. The clients send large binary files via FTP, which are streamed and stored into the database within a single transaction. Various consistency checks are performed along the way, and if any of the checks fail, an error is returned to the client and the transaction is rolled back. This means that if the client receives a successful return, they can be sure that the data is stored and ready to be processed.

The main problem has been with trying to successfully stream the blob data in a clean and portable way. Our production database is Oracle, but it’s nice to be able to test on say, MySQL or HSQL. If you want to go straight to JDBC, you can use Oracle’s empty_blob() function, grab a blob locator from that, and then use the oracle.sql.BLOB type’s getBinaryOutputStream() to get an OutputStream to write to. You can then chunk the data from a socket directly to the database server. However, achieving this at a higher level is altogether more challenging. Thankfully, Spring has some streaming LOB support, which is what we have based our current solution on.

Leave a Reply