DevX: “Use Stored Procedures for data access” – my comments

DexX have an an article on their site titled Use Stored Procedures for Java Persistence”. The article proposes the use of Stored Procedures for data access in Java applications to address some of the issues that Java applications commonly have with accessing data.

The problem with the article is that I don’t think it is addressing the main problems, but rather ignoring the problems that are most significant. The main issues the author identifies are:

  1. “Client/server data shuffling” – moving large quantities of unnecessary data between the database and the application. Unnecessary? Then surely the solution to this is to examine your queries and make them more specific. If you are retrieving too much data then you are simply asking for too much data. The database does not arbitrarily return data; it returns what you ask for.
  2. “Transactions opened from outside the database” – I am not sure why this is a problem. If I am developing an application that is accessing multiple transactional sources, for example more than one database and JMS message destinations, I want my application to be in control of the transaction, not the database because that will be at too fine a level, and will not take into account my other transactional sources, which the database will a) not have visibility to, and b) not be able to control the transaction involving them.
  3. “Tight coupling of database schema and Java code: The database structure is exposed to the Java layer. Hence, future changes to database structure may require changes to the Java code. As with Java class structure and member data, the database structure should be abstracted from other layers in the software” – this is one point that I do agree with. However, assuming that you have modelled your business domain in Java Classes so you have a representation of your domain in Java, then this is always going to be tied to the same relationships which presumably you are also going to want to store and express in your relational tables. I haven’t seen any significant enterprise system yet where I have not implemented relationships in data as relationships in my Java code in one form or another – if in my domain model a car has wheels, then I am likely to have a ‘Car’ class and a ‘Wheel’ class, with the Car class containing a collection of ‘Wheel’s. I could ignore this and chose to not implement it this way in Java, but if this is the foundation of my business model then it would seem futile to ignore the fact. So my point is, even if you retrieve data via a Stored Procedure, you still cannot ignore the attributes and relationships of your data, so I do not see how using a Stored Procedure will help you with decoupling (although it will make it less obvious to the developer calling the Stored Procedure where the data is coming from).
  4. “Software release coordination: Because the Java code depends on the database structure, database releases that contain schema changes must be coordinated with releases of the dependant Java code. It’s often difficult to coordinate such releases. Furthermore, if one system must be rolled-back due to a problem, dependant systems must be rolled back also”. I agree with this point also. I think this is unavoidable with the current approach to modelling and storing data, and the representation of data in a Java application. If you change the attribution and/or relationships of the data then there is likely going to have to be code chages that need to be synchronized. But Stored Procedures are not going to help you here either. If you change the inputs and outputs of your Stored Procedure because the database tables have changed, then you still have to make code changes, so you don’t gain anything.

As a whole I think this article is flawed in the understanding of the issues involved in Java application persistence. Statements such as this:

“As a Java developer, you have many choices for your application’s persistence layer. You can use entity beans with either container-managed persistence (CMP) or bean-managed persistence (BMP), or you can execute SQL queries from within your Java code through the Java Database Connectivity (JDBC) API.

Typically, those not familiar enough with SQL will choose CMP entity beans or frameworks such as TopLink and Hibernate, which generate queries and help to isolate them from the Java code. Regardless of which you choose, the resulting code will still contain Java-based queries. This article advises against this approach and proposes an alternative form of persistence that uses database stored procedures.”

The reason why frameworks such as Toplink and Hibernate are used is to address the issue of the ‘object/relational mismatch’ problem – we code Java applictions as Object Oriented applications using OO concepts, using an OO language. Why do we do this? Because an Object Oriented language allows us to model and implement a system with the main domain entities that are central to our business as Classes, which have attributes and behavior, typical to how we observe them in real life. Is this the best approach to implement a system? Maybe not, but it the currently accepted best approach and most natural approach, instead of using procedural languages which a centered more on the implementation logic as sequential steps rather than the data enties and their attributes, relationships and behavior as in an OO language. The problem is that current Relational Databases are relational, and do not use OO concepts. True, you can attempt to represent class inheritance using some workaround concepts expressed as relationships bewteen tables, but RDMSs today do not have good (I say ‘good’, because some database do have primitive object support) support for OO concepts which we take advantage of in our OO languages. This is the core problem – how do you work with Objects on one side and relational data on the other side? This is the problem that frameworks such as Hibernate and Toplink address (I won’t even mention Entity Beans here because up until 3.0 the Entity Bean approach was so flawed that it should not even be considered as a viable solution).

The given reason “those not familiar enough with SQL will choose CMP entity beans or frameworks such as TopLink and Hibernate” is a complete and utter misunderstanding of what the core issues are, and what the goals of these technologies are. Stored Procedures have their place – if I need to do some data manipulation and complex retrieval that involves several queries and iteration over the results, it makes sense to execute this close to the database and take advantage of the power of the database and what it does best. However, to say the use of Stored Procedures to solve the problems of data access in Java applications is a complete misunderstanding of what the problems actually are.

Object/Relational Mapping tools such as Toplink andf Hibernate address the issues that we have today. We use OO languages and OO concepts to build our application, and we use relational databases to store our data. Both approaches are the current best approach to solving particular issues. The main problem is that they do not work well together, like trying to force a square peg into a round hole. The translation of OO concepts to relational is the issue that Toplink and Hibernate (and others) address. This does not mean of course that this will always be the ideal solution. For example, assuming the OO databases mature to a point where they are comparable to the performance and reliability of traditional relational databases, then this would seem to be the ideal solution using current OO concepts – implement the system using OO concepts and technologies, and store data in an OO datastore.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.