Copyright © 2005 Alan P. Sexton
| Revision History | ||
|---|---|---|
| Revision 2.0 | 10 January 2005 | aps |
| I assume the use of the Standard Edition JDK 1.3.1 or better. | ||
Table of Contents
JDBC (the Java Database Connection) is the standard method of accessing databases from Java. Sun developed the JDBC library after considering Microsoft's ODBC. Their aims were to get something similar but easier to learn and use: ODBC is complex because it has a few very complex calls. JDBC has split up this complexity into many more calls, but with each of them being relatively simple.
Accessing a database using JDBC involves a number of steps:
Get a Connection object connected to a database
Get a Statement object from an open Connection object
Get a ResultSet from a Statement's query execution
Process the rows from the ResultSet
While the above is the standard pattern, there are variations: Instead of a Statement, you can get a PreparedStatement (which allows a query to be pre-compiled for extra performance when executed repetitively). An update query may only return a count of the number of rows updated, inserted or deleted instead of a ResultSet. There are calls to get information about the database and about ResultSet objects (e.g. the number, names and types of columns). There are calls to support transactional updates to the database.
The main package used is
import java.sql.* ;
This is in the Standard Edition JDK. There is an extension package from the Enterprise Edition JDK. However, this is rarely necessary and even then only for the use of some advanced features such as connection pooling:
import javax.sql.* ;
Individual database management systems require a JDBC driver to be accessed via JDBC. Sun provides, as part of the Standard Edition JDK, a JDBC-ODBC bridge driver, which lets you connect to any ODBC database. This driver is intended for test and study purposes and is not recommended for commercial use (it is slow, missing some features and is somewhat buggy). Most DBMS vendors supply their own drivers with their products and there are many third party drivers available (both commercial and free)
The first step in accessing the database is to load a driver. First your driver should be installed on the system (usually this requires having the driver jar file available and its path name in your classpath. If you are using the JDBC-ODBC bridge driver, then you have to register your database with ODBC (on windows, use the "ODBC data sources" tool in the control panel). The class name of the JDBC-ODBC bridge driver is sun.jdbc.odbc.JdbcOdbcDriver. When a driver is loaded, it registers itself with Drivermanager which is then used to get the Connection.
There are a number of alternative ways to do the actual loading:
Use new to explicitly load the Driver class. This hard codes the driver and (indirectly) the name of the database into your program and is not recommended as changing the driver or the database or even the name or location of the database will usually require recompiling the program.
Class.forName takes a string class name and loads the necessary class dynamically at runtime. This is a safe method that works well in all Java environments although it still requires extra coding to avoid hard coding the class name into the program.
The System class has a static Property list. If this has a Property jdbc.drivers set to a ':' separated list of driver class names, then all of these drivers will be loaded and registered automatically. Since there is support for loading property lists from files easily in Java, this is a convenient mechanism to set up a whole set of drivers. When a connection is requested, all loaded drivers are checked to see which one can handle the request and an appropriate one is chosen. Unfortunately, support for using this approach in servlet servers is patchy so we will stay with method 2 above but use the properties file method to load the database url and the driver name at runtime:
Properties props = new Properties() ;
FileInputStream in = new FileInputStream("Database.Properties") ;
props.load(in) ;
String drivers = props.getProperty("jdbc.drivers") ;
Class.forName(drivers) ;The Database.Properties file contents look like this:
# Default JDBC driver and database specification jdbc.drivers = sun.jdbc.odbc.JdbcOdbcDriver database.Shop = jdbc:odbc:Shop
To get a connection, we need to specify a url for the actual database we wish to use. The form of this url is specific to the driver we are using. With the driver loaded, we can use the properties file above to get the database url. Using the Sun JDBC-ODBC bridge driver, the url of the database is jdb:odbc:xxx where xxx is the ODBC data source name registered for your database. (The name of the property we use is unimportant)
String database = props.getProperty("database.Shop") ;
Connection con = DriverManager.getConnection(database) ;![]() | Warning |
|---|---|
Microsoft Access, accessed through the JDBC-ODBC bridge driver, is not thread safe. This means that multiple different concurrent connections to Access using the JDBC-ODBC bridge driver will not necessarily work in a correct manner (although it might appear to until some critical situation occurs). In consequence, no application using the JDBC-ODBC bridge driver should have multiple Connection objects open at the same time and, therefore, use of Microsoft Access using this drive for web database work is severely limited (although it is still fine for educational purposes). Note that this is just a problem with this particular driver: there are high quality third party JDBC drivers available to connect to Access or generally to any ODBC enabled database that do not have these problems. | |
A Statement is obtained from a Connection:
Statement stmt = con.createStatement() ;
Once you have a Statement, you can use it to execute, and control the execution of, various kinds of SQL queries.
Use stmt.executeUpdate with a string argument containing the text of an SQL update query (INSERT, DELETE or UPDATE). This returns an integer count of the number of rows updated.
Use stmt.executeQuery with a string argument containing the text of an SQL SELECT query. This returns a ResultSet object which is used to access the rows of the query results.
You can use stmt.execute to execute an arbitrary SQL statement which may be of any type. However, extracting the results, whether an integer or a ResultSet, is less convenient. This is usually only used where you want a generalized access to the database that allows programmatic generation of queries.
int count = stmt.executeUpdate("INSERT INTO Customers " +
"(CustomerFirstName, CustomerLastName, CustomerAddress) "
"VALUES ('Tony', 'Blair', '10 Downing Street, London')") ;
ResultSet rs = stmt.executeQuery("SELECT * FROM Customers") ;
// do something with count and RS![]() | Note |
|---|---|
The syntax of the SQL string passed as an argument must match the syntax of the database being used. In particular, appropriate quoting of special characters must be used. For example, if a name, O'Neill, is to be inserted, it has to be entered as
ResultSet rs = stmt.executeQuery("SELECT * FROM Customers" +
"WHERE CustomerLastName = 'O''Neill'") ; | |
If you do not know exactly the table structure (the schema) of the ResultSet, you can obtain it via a ResultSetMetaData object.
ResultSetMetaData rsmd = rs.getMetaData() ;
int colCount = rsmd.getColumnCount() ;
for (int i = 1 ; i <= colCount ; i++)
{
if (i > 1)
out.print(", ");
out.print(rsmd.getColumnLabel(i)) ;
}
out.println() ;Once a ResultSet has been obtained, you can step through it to obtain its rows, or, more specifically, the fields of its rows:
while (rs.next())
{
for (int i = 1 ; i <= colCount ; i++)
{
if (i > 1)
out.print(", ");
out.print(rs.getObject(i)) ;
}
out.println() ;
}Note that the column numbers start at 1, not 0 as in Java arrays. More conveniently, if slightly less efficiently, there is a getObject method for ResultSet which takes a String argument containing the column name. There are also getxxx methods that take the String name of the column instead of the column number. Thus the above code could have been written:
while (rs.next())
{
out.println(rs.getObject("CustomerID") + ", " +
rs.getObject("CustomerFirstName") + ", " +
rs.getObject("CustomerLastName") + ", " +
rs.getObject("CustomerAddress") ) ;
}Instead of getObject, you can use type specific methods, getInt, getString, etc. However, these have a major disadvantage: if the field is of primitive type such as int, float etc., then if the field is actually null in the database, then there is no value that can be returned that is indistinguishable from some valid value. There is a mechanism for finding out whether the last value obtained was really null or not: wasNull, but this must be called immediately after the getXxx method and before the next such call. If you use getObject, then if the field was null then the object value returned will be null so you can pass this value around and check for it at your convenience. Note also that printing is the most common thing to do with retrieved values, and passing a null to print will print the string "null". Thus for many cases no extra processing of nulls will be necessary.
Rather than Statement objects, PreparedStatement objects can be used. This have the advantages over plain Statement objects of:
For repetitive queries that are very similar except for some parameter values, they are considerably more efficient because the SQL is compiled once and then executed many times, with the parameter values substituted in each execution
The mechanism for inserting parameter values takes care of all necessary special character quoting in the correct manner for the connected database
The PreparedStatement has its SQL text set when it is constructed. The parameters are specified as '?' characters. After creation, the parameters can be cleared using clearParameters and set using setInt, setString, etc. methods (parameter positions start at 1) and the statement can then be executed using execute, executeUpdate or executeQuery methods as for Statement and with the same return types but with no arguments (as the SQL text has already been set when the statement was created):
PreparedStatement pstmt = con.prepareStatement(
"INSERT INTO Customers " +
"(CustomerFirstName, CustomerLastName, CustomerAddress) "+
"VALUES (?, ?, ?)") ;
pstmt.clearParameters() ;
pstmt.setString(1, "Joan") ;
pstmt.setString(2, "D'Arc") ;
pstmt.setString(3, "Tower of London") ;
count = pstmt.executeUpdate() ;
System.out.println ("\nInserted " + count + " record successfully\n") ;
pstmt.clearParameters() ;
pstmt.setString(1, "John") ;
pstmt.setString(2, "D'Orc") ;
pstmt.setString(3, "Houses of Parliament, London") ;
count = pstmt.executeUpdate() ;
System.out.println ("\nInserted " + count + " record successfully\n") ;Transactions are a mechanism to group operations together so that either all of them complete together successfully or none of them do. This avoids database consistency problems that can occur if some groups of operations are only partly completed. Think of a bank transfer that requires withdrawing money from one account to deposit in another. If the withdraw is completed but the deposit fails then the customer is likely to be very unhappy. If the deposit succeeds but the withdraw fails then the bank is likely to be very unhappy. Actually, transactions handle other aspects of consistency as well. For example, ensuring that a second transaction sees the database as if either the first transaction has completely finished or as if it has not started yet but not as if some of the first transaction's operations have completed but not others - even if both transactions are running simultaneously.
When a Connection is obtained, by default its AutoCommit property is set to true. This means that every query execution is committed immediately after it is executed and before the next one is executed. To enable grouping of operations in transactions, you have to switch the AutoCommit property off:
con.setAutoCommit(false) ;
Now you have to obtain new statement objects from the connection (the old ones won't work), and query or update as usual. When all operations that you want to group together have completed, you must commit the updates to the database:
con.commit() ;
At this point you can continue with more operations which will be grouped into a new transaction or you can switch AutoCommit back on:
con.setAutoCommit(true) ;
If anything goes wrong during a transaction (e.g. an Exception is thrown or an error means that you cannot complete your group of operations) then you have to undo all operations in your transaction so far:
con.rollBack() ;
![]() | Note |
|---|---|
If the database or the machine crashes, rollBack will (essentially) be called for you automatically to clean up uncommitted transactions when the database is restarted. | |
You should make every effort to minimise the length of time that you have open transactions running (i.e. the time after the first operation in a transaction until the call of the commit or rollback that terminates it) as they hold expensive resources and, in particular, locks in the database system which may stop any other competing transactions from proceeding.
Getting all this working correctly requires careful attention to your Exception handling. You must embed a transaction in a try clause so that any exception will trigger a rollback. If you do have to handle an exception (and therefore rollback) in a method that normally closes open connections before returning, make sure that this does not create a loop hole that allows the method to return without closing the connection. You can use a finally block to ensure that this is handled correctly.
Finally, note that when you modify the AutoCommit status of a connection, all operations by any thread using that connection object are run in the same transaction. Therefore you have to be very careful about sharing connection objects between different threads (particularly important in servlet and JSP code). The simple rule is that you can share without problems a connection which has been set to auto commit. Do not share non auto committing connections unless you use some other mechanism to make sure that you don't end up merging different transactions into one - with consequences for committing and roll backs. An appropriate mechanism would be connection pooling: an advanced topic that is not discussed in this tutorial.
SQLExceptions, thrown when something goes wrong in JDBC processing, has a slight peculiarity compared to other types of exceptions. When an SQLException is caught, it may have a chain of further SQLExceptions attached to it. This is because a number of problems may occur as part of one overall problem and you may need information about all of them to handle it correctly. If you wish to inspect the full chain of exceptions, you should have code similar to the following when catching SQLExceptions:
catch (SQLException e)
{
do
{
// do something with each SQLException in the chain:
System.out.println(e.getMessage()) ;
}
while ((e = e.getNextException()) != null) ;
}In practice, programmers rarely want to know the full chain of exceptions, but only whether an exception occurred or not. In this case, the following code provides a general framework for all simple transaction processing code:
Connection conn = null ;
Statement stmt = null ;
ResultSet rs = null ;
try
{
conn = getConnection();
conn.setAutoCommit(false);
// do some JDBC calls using conn, stmt and rs
conn.commit();
// if you get to here then everything has succeeded
// and the transaction is complete.
}
catch (Exception e)
{ // note that other exceptions might occur than SQLExceptions: if
// they have, we still have to rollback because the transaction
// has still only partially completed.
// Here something has gone wrong: rollback any updates to fix
// the database and do any other application specific
// corrections necessary
try
{
conn.rollback();
// application specific handling of exception
}
catch (SQLException e)
{ // in case something goes wrong with the rollback()
do
{
// do something with each SQLException in the chain.
// possibly you want to log it rather than print it as
// these messages are intended more for a system
// administrator than an end user.
System.out.println(e.getMessage()) ;
}
while ((e = e.getNextException()) != null) ;
}
}
finally
{ // ALWAYS close the JDBC objects in a finally block
// to ensure that they are correctly closed even if
// exceptions occur.
if (rs != null)
try {rs.close(); } catch(SQLException e) {;}
if (stmt != null)
try { stmt.close(); } catch(SQLException e) {;}
if (conn != null)
try { conn.close(); } catch(SQLException e) {;}
}![]() | Warning |
|---|---|
Note that you must not maintain a connection active longer than necessary: otherwise you are locking (parts of) the database against other users. You should certainly not start a transaction and then wait for user input before finishing it. Also you must not try to hold on to a ResultSet or a Statement object after the transaction has completed: if you need the data after the transaction has finished, copy it out into your own objects (not a ResultSet object) before committing the transaction. | |
There is another object related to exceptions that can give information about various kinds of problems: this is an SQLWarning. They are not exceptions and do not interrupt the normal flow of control but a Connection, Statement or ResultSet object can be queried for warnings after they have been used. The most common type of warning is a Data Truncation error. Otherwise many programmers don't query them at all.
The database we use for the sample program and the exercise is a (seriously) simplified version of a Microsoft Access database suitable for handling on-line book orders. We omit details that are necessary in practice but do not add anything qualitative to the programming exercises. Thus, for example, there is no information kept about authors, ISBN, delivery charges or options, etc.
The following is a picture of the relationship structure showing the tables, columns and attributes:

By way of explanation:
A Customer can have many Orders.
An Order can have many OrderDetail records.
Each OrderDetail record refers to precisely one Book.
All the primary keys ID fields (in each case the first column name in the corresponding table in the figure above) are AutoNum generated integers and therefore the primary key of each table does not need to be included when inserting new records.
Referential integrity rules are maintained on all foreign keys (ID fields which not in the first position in the tables of the figure above). Hence, for example, it will cause an error to try to insert an Order record with an Orders.CustomerID for which there is no corresponding Customers record.
OrderDetail.Quantity is an integer.
Books.BookPrice is a Currency object which is handled in Java via the java.math.BigDecimal class.
CustomerFirstName, CustomerLastName, CustomerAddress, and BookName are Strings.
Orders. OrderDate is a Date/Time object in MSAccess or a Timestamp object in Postgresql which is handled in Java as a java.sql.Timestamp object.
Timestamp objects can be set to the current time as follows:
long millisecs = System.currentTimeMillis() ; Timestamp ts = new java.sql.Timestamp(millisecs) ;
They can be set to a specific value using the valueOf method with a string argument:
ts = Timestamp.valueOf("2001-07-06 14:25:29.9") ;As should be expected, ResultSet has the usual getTimestamp, putTimestamp and updateTimestamp methods and PreparedStatement has a setTimestamp method.
BigDecimal works rather similarly, see the online API documentation for details.
Most databases provide a mechanism to automatically generate unique field values for records in a table. This mechanism is often used to obtain ID values for a primary key column. MSAccess has a field type called Autonum, PostgreSQL has a general construct called a sequence. In general, when selecting rows from tables which have such fields in them, these fields are read as if they were long integers in Java. When inserting records, simply omit the fields which are autonum/sequence fields and a new unique value will be generated for those fields and added when the record is inserted.
However, there is a remaining problem: when inserting multiple records that refer to each other, how does one obtain the automatically generated key of the record referenced, so that you can use it as the foreign key value to include in the referring record? To make this clearer, consider the following example. One wishes to, as part of a single transaction, create a new order and create a new order detail record associated with that order. Note that the OrderDetail record will have a foreign key into the Order record: i.e. the OrderDetail record will have one field called OrderID that has to contain the ID of the Order record that this OrderDetail record is associated with. However, since the OrderID field in the Order table is a sequence field, the new Order record only has an OrderId set for it as part of the SQL insert call. That means that you do not know what ID has been assigned without doing a select for the record you just inserted. Aside from the overhead of such an extra select, and depending on your data design, there may not even be a way to always find exactly and only the record that you just inserted: after all that is precisely why you have a OrderID field in the first place.
Fortunately, database management systems provide a way to work around this problem. Unfortunately, these workarounds tend to be different from one database management system to another. I will discuss the PostgreSQL mechanism here.
Assume, as in the case of the example shop database below, that the sequence associated with the OrderID field of the Order table is called orderID_seq. Then the OrderID field is, in fact, just a normal postgresql bigint field except that any record insert to this table which does not specify the OrderId value is defined to take the value of the function nextval('orderID_seq') by default for this field. This function atomically advances the sequence and returns the new value. The "atomically" here means that, even if multiple other different transactions are calling the same function on the same sequence simultaneously, the operation works correctly as if nothing else was operating on the sequence. This means that you can execute the SQL query:
SELECT nextval('orderID_seq') AS ID(using the usual jdbc calls), get a ResultSet which will contain a single row with a single column, and get the value (of Java type long) from that to find an ID value. You can then create the Order record using this ID value (you need to set the OrderID field in this case to the long value you have just obtained - do not leave it undefined) and you still have that value available to use as a foreign key value in any OrderDetail records you wish to create.
Note that this approach has a major disadvantage: one must execute an extra query to obtain the identifier before inserting the record required.Hence this approach, while simple and safe, is not terribly efficient. For more sophisticated approaches, see [Johnson03], [Ambler03] or the latter author's online essay
The JDBC3 standard allows a simple and effective solution: the method to create a Statement or a PreparedStatement object from a Connection can now take an extra integer parameter. If this has the value
Statement.RETURN_GENERATED_KEYS
then, after the query has been executed (and only if it is an insert statement) you can call a
getGeneratedKeys()
method on the Statement or PreparedStatement object to obtain a ResultSet object that contains the keys that have just been generated. Thus generating the keys, inserting the records and returning the key values are now all done in one efficient operation. Unfortunately, many of the JDBC3 drivers currently available are still very incomplete and, in particular, the JDBC3 driver for PostgreSQL does not support this feature at the time of writing of this tutorial (an attempt to invoke the new Statement creation method results in an exception being thrown with the message "This method is not yet implemented").
![]() | Warning |
|---|---|
The MSAccess files are currently out of date with respect to this text. Please use the Postgres files until I update them. | |
The following program shows a full program using each of the features discussed above and an MSAccess database: Customer.java You also need the Database.Properties file and the Shop.mdb database in the same directory. Finally, you need to create an entry in your ODBC data sources (from the control panel) for Shop.mdb
The following program shows a full program using each of the features discussed above and a Postgresql database: Customer.java You also need the Database.Properties file. This file must be placed in the same directory that the Customers.class file is in when the program is executed (Customers uses the getResourcesAsStream method to use the class loader to find the Database.Properties file). Note that you must modify the database.name and the default database.user properties in this file to match your database and username. Note that the Customers.java file does not contain the password for the database nor is it stored in the properties file. Instead there is a routine that creates a Swing dialog that asks you to enter the password. This means that the application will not run on a commandline only interface. This restriction is better than requiring you enter your password as a command line argument (which is very easy for other users on the same machine to snoop) or to enter it in your files. Unfortunately, Java does not provide a proper portable way to input passwords securely from a command line interface.
The shop_create.sql script sets up the shop database in Postgresql (use the '\i' command in psql to import the script). There is also a script, shop_drop.sql to delete all the tables and sequences created by the creation script so that you can easily clear out your data and start again. Note that the Customer.java program does require the database to be set up and it ends up deleting all records from the Customer table.
List books application
Write a console application that takes a string from the command line and lists book details for books whose titles start with this string. Hint: use the LIKE predicate in the SQL query.
Add books to an order application
Write an application that takes a customer email address and a book ISBN from the command line.
It should check that the customer exists in the database and that a book with the given ISBN exists in the database and exit with an error message otherwise.
If there is no open order for that customer then a new open order is created and the book that matches the book ISBN should be added to the open order with a quantity of 1 and a NULL charge (charge is only set when the order is closed).
If there is an open order for that customer but the order does not contain any copies of this book, then the book that matches the book ISBN should be added to the open order with a quantity of 1 and a NULL charge.
If there was an open order for that customer which already contains any copies of this book, then the quantity for this book should be incremented by 1.
Print to System.out a listing of the customer's details and the customer's open order.