Copyright © 2005, 2006 Alan P. Sexton
| Revision History | ||
|---|---|---|
| Revision 1.4 | 23 January 2006 | aps |
Hibernate is an Object Relational Mapping (ORM) tool. It manages the persistence of Java objects in a relational database. The idea is that a programmer should be able to design business objects as standard Java objects with very little interference from the problems of making these objects persist in a database. With a little help from the programmer, Hibernate saves the objects into the database, retrieves them when needed and supports queries on the database written in a form similar to SQL but which refers to objects and object properties instead of tables and column names. The end result is that the code that needs to be written to interact with the database is considerably shorter and simpler.
This document is intended to cover only the basics of Hibernate: i.e., do things the way they are described here until you have outgrown it. Hence many features are only referred to in passing or not mentioned at all. The full reference documentation and a great deal of other important information are available online in the Hibernate On Line Documentation (http://www.hibernate.org/5.html). However, for serious users of Hibernate, a thorough study of [BK05] is highly recommended.
| Important |
|---|
The books currently in print all discuss a version of Hibernate earlier than 3.0. However, Hibernate has moved on and the latest version, as of 23 January 2006, is 3.1.1. There were incompatible changes between versions 2 and 3, so when reading the textbooks, you should have the Hibernate Migration Guide (http://www.hibernate.org/250.html) beside you.
The differences are, generally, small, but enough to stop even
simple hibernate examples from working. Critically, all the
hibernate classes (and log4j logger names) are now
The set of cascade options has changed to bring Hibernate into alignment with the EJB3 standard.
Here we provide a fully working, but very simple, example application using Hibernate. The example is based on some fragments that appear in chapter 2 of [BK05]. The files necessary are:
Main.java: the main class that manipulates Message objects, storing them into and retrieving them from the database. Note that only Commons Logging classes are used, as a thin wrapper around log4j; this is common practice to provide independence from specific logging systems when an application may use multiple libraries, each of which might otherwise make different choices for logging.
Message.java: the class defining the objects that will be persisted to the database.
Message.hbm.xml: the mapping file that describes how the properties of a Message object should be mapped to columns of tables in the database (along with other necessary information such as how keys are generated, what database constraints and indexes should be maintained, etc.).
hibernate.properties: the Hibernate configuration file that specifies the database that is to be used, the database connection pooling system (if any) and other configuration parameters for the system.
log4j.properties: the log4j configuration file that sets many parameters of the logging system.
ehcache.xml: the ehcache configuration file that sets parameters for the second-level cache in Hibernate. In fact, this won't really be used without further work, but without it you will get warning messages when you start your application.
The basic pattern of database interactions via Hibernate is
visible in the Main.java file above.
Create a Configuration object, load the
configuration parameters from the
hibernate.properties file and adjust them as required.
Create a SessionFactory object from the
Configuration object. The
SessionFactory object is a heavyweight,
thread safe object. You would normally share one such object
between all your threads in a web application.
For each unit of work (normally one use case) use the
SessionFactory object to obtain a
Session object. This is an extremely
lightweight, non-thread safe object. It will be associated
with a database connection but it only obtains that
connection lazily, i.e., only when (and if) it is required.
Session objects must not be shared between different
threads. Recall that every request to the web server
usually runs in its own thread.
Inside a try block, get a Transaction object by calling
beginTransaction() on the
Session object.
Interact with the database:
explicitly by calling methods
of Session to associate objects
to the database (i.e., map them to the database),
execute queries, load, save, delete mapped objects
etc.
implicitly by calling property mutators on mapped objects that will lead to the database being updated.
implicitly by referencing non-mapped objects from mapped objects which (in certain circumstances) can cause the non-mapped objects to be added to the database.
implicitly by unreferencing mapped objects from other mapped objects which (in certain circumstances) can cause the unreferenced objects to be deleted from the database.
Call commit() on the
Transaction object and close the try
block, handling exceptions and closing the
Session object in the usual way.
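The shape of the steps above can be sketched as follows. To keep the sketch runnable without Hibernate on the classpath, it uses two hypothetical stand-in interfaces, DemoSession and DemoTransaction, which mimic only the beginTransaction(), commit(), rollback() and close() calls named in the text. They are not Hibernate's own types, and the in-memory implementation and the save(String) signature are assumptions made purely for illustration. The point is the try/catch/finally structure, not the persistence itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT Hibernate types: they model only the calls named in the text.
interface DemoTransaction {
    void commit();
    void rollback();
}

interface DemoSession {
    DemoTransaction beginTransaction();
    void save(String message);
    void close();
}

class UnitOfWorkSketch {
    // Records what happened so that the control flow can be inspected.
    static final List<String> log = new ArrayList<>();

    // Minimal in-memory session, used only for this illustration.
    static DemoSession openSession() {
        return new DemoSession() {
            public DemoTransaction beginTransaction() {
                log.add("begin");
                return new DemoTransaction() {
                    public void commit() { log.add("commit"); }
                    public void rollback() { log.add("rollback"); }
                };
            }
            public void save(String message) {
                if (message == null) {
                    throw new IllegalArgumentException("nothing to save");
                }
                log.add("save " + message);
            }
            public void close() { log.add("close"); }
        };
    }

    // One unit of work: begin, interact, commit; roll back on failure; always close.
    static boolean storeMessage(String message) {
        DemoSession session = openSession();
        DemoTransaction tx = null;
        try {
            tx = session.beginTransaction();
            session.save(message);
            tx.commit();
            return true;
        } catch (RuntimeException e) {
            if (tx != null) {
                tx.rollback();
            }
            return false;
        } finally {
            session.close();
        }
    }

    public static void main(String[] args) {
        storeMessage("Hello World"); // succeeds: begin, save, commit, close
        storeMessage(null);          // fails: begin, rollback, close
        System.out.println(log);
    }
}
```

Whatever happens inside the try block, the session is closed in the finally clause, and a failed unit of work is rolled back rather than left half done.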
Transient objects do not (yet) have any
association with the database. They act like any normal Java
object and are not saved to the database. When the last
reference to a transient object is lost, the object itself
is lost and is (eventually) garbage collected. There is no
connection between transactions and such objects:
commits and
rollbacks have no effects on them. They
can be turned into persistent objects
via one of the save method calls of the
Session object, or by adding a reference
from a persistent object to this object.
Persistent objects do have an
association with the database. They are always associated
with a persistence manager, i.e., a
Session object and they always
participate in a transaction. Actual updates of a database
from the persistent object may occur at any time between
when the object is updated until the end of the transaction: it
does not necessarily happen immediately. However, this
feature, which allows important optimizations in database
interactions, is essentially invisible to the programmer.
For example, one place where one might expect to notice the
difference between the in-memory persistent object and the
database version is at the point of executing a query. In
such a case, Hibernate will, if necessary, synchronise any
dirty objects with the database (i.e., save them) in order
to ensure that the query returns the correct results.
A persistent object has a primary key value set, whether or not it has been actually saved to the database yet.
Calling the delete method of the
Session object on a persistent object
will cause its removal from the database and will make it
transient (it will still be available as a normal,
non-persistent Java object).
Aside from making a persistent object out of a transient one
as described above, one can also create a new persistent
object, with its values obtained from the database, by
executing the load or
get methods of Session
if you know the object's database identifier. You can also
get persistent objects by creating a query
(createQuery) and extracting the results
from it.
Detached objects are objects that were
persistent but no longer have a connection to a
Session object (usually because you
have closed the session). Such an object contains data that
was synchronised with the database at the time that the
session was closed, but, since then, the database may have
changed; with the result that this object is now
stale.
| Important |
|---|
A detached object may be re-attached
later to another

The Hibernate Object Life Cycle
Given a pair of (persistent) objects of the same class, we now have three concepts of identity to consider.
a==b: Java Identity
a.equals(b): Java Equality
a.getId().equals(b.getId()): Database Identity
The rule for Hibernate is that if, within a single
Session, you request two objects which have the
same database identifier, then you will get references to the same
actual object. It accomplishes this by using a cache for
persistent objects. Any time you request, in any way, a persistent
object, Hibernate first checks the cache for it and, only if it
can't find it there, will Hibernate actually execute a query to
get the data from the database and create an object for it.
Incidentally, this means that, within a single Session, you can use
Java Identity (i.e. ==) to test persistent objects for identity,
even where you would normally need to use
equals().
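The per-session cache behaviour described here can be illustrated with a plain Java sketch. The DemoSession class below is a hypothetical stand-in, not Hibernate's Session: it only models a map from identifier to already loaded object, and the generated email address is made-up data. It is a simplified picture under those assumptions, not a description of how Hibernate is implemented.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityMapSketch {
    static class Customer {
        final Long id;
        final String email;

        Customer(Long id, String email) {
            this.id = id;
            this.email = email;
        }
    }

    // Hypothetical stand-in for a session: each instance has its own cache of loaded objects.
    static class DemoSession {
        private final Map<Long, Customer> cache = new HashMap<>();

        Customer get(Long id) {
            Customer cached = cache.get(id);
            if (cached == null) {
                // Stands in for the database query that would build the object.
                cached = new Customer(id, "customer" + id + "@example.org");
                cache.put(id, cached);
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        DemoSession s1 = new DemoSession();
        DemoSession s2 = new DemoSession();
        Customer a = s1.get(1L);
        Customer b = s1.get(1L);
        Customer c = s2.get(1L);
        System.out.println(a == b);            // same session: Java Identity holds
        System.out.println(a == c);            // different sessions: distinct instances
        System.out.println(a.id.equals(c.id)); // Database Identity still holds
    }
}
```

Objects obtained through different sessions share only their database identity, which is why == is a safe test only inside a single session.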
Since the programmer can define equals(),
it is important not to use the
id field in that definition if the
id field is a surrogate key. This is because,
if the object uses a generated identifier value for its
id, or if at least part of its
id is a reference to another object (i.e. a
foreign key), then Hibernate only sets the field the first time it
saves the object to the database. Hence, for example, if you add
the object to some set or map collection, then saving the object will
result in its identity changing, and part of the rules about using
the set collection class is that the contained object's identity
must not change while it is in the collection.
In fact, this situation is almost certain to occur because of the
frequent use of collection classes to represent the
many side of one-to-many
or many-to-many relationships. Therefore we use
the Java Equality concept to define when two objects should really
be the same database object.
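The collection problem can be reproduced in plain Java. In the sketch below, the Order class and its field are hypothetical, and assigning the id by hand merely simulates Hibernate generating it on save. The class deliberately bases equals and hashCode on the surrogate id, which is exactly what the text advises against.

```java
import java.util.HashSet;
import java.util.Set;

class SurrogateKeyPitfall {
    // Deliberately flawed: equality is based on a generated id that starts out null.
    static class Order {
        Long id; // assigned only when the object is first saved

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Order)) return false;
            Order o = (Order) other;
            return id == null ? o.id == null : id.equals(o.id);
        }

        @Override
        public int hashCode() {
            return id == null ? 0 : id.hashCode();
        }
    }

    // Reports whether the set still finds the object after its id has been assigned.
    static boolean foundAfterSave() {
        Set<Order> orders = new HashSet<>();
        Order order = new Order();
        orders.add(order);   // stored under the hash code computed from the null id
        order.id = 42L;      // simulates the id being generated when the object is saved
        return orders.contains(order);
    }

    public static void main(String[] args) {
        System.out.println("still found in set: " + foundAfterSave());
    }
}
```

The object is still physically in the set, but the set looks for it under its new hash code and no longer finds it.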
However, there are other problems with using all the non-id values
of an object in the equality test: you really want the test to
return true if the objects map to the same row of the same
table in the database (i.e. they represent the same real world
concept). But two objects may represent the same real world object
and have some different values. For example, two
Customer objects may differ in the value of a
password property (because the two objects date from different
instants in time between which the customer has changed her
password). But they still refer to the same real world concept:
i.e., the same customer.
The solution is to decide on a Business Key
for a class. This is like a database key, but involves no
generated surrogate keys. Instead it consists of those
"real-world" properties of the class that the programmer considers
to uniquely identify a particular record. It is not a requirement
that the business key absolutely never changes, merely that it
will not change within the period in which it might be
stored in memory in a collection class. For the
Customer class on a web application, an
appropriate business key might be the customer's email address.
This, of course, can change, in which case the customer will be
treated as a new different customer. However, this is rarely a
significant problem, and if it is, one can always provide a
mechanism to reconnect the old data about the customer to the new
customer record. More importantly, from our point of view, a
change of customer email address is extremely unlikely to affect
any reattachment of a detached Customer object to a new session.
Note that Hibernate does not know or care anything about your business
keys. As far as it is concerned, reattaching an object works by
checking the id property of the object. If it
is null, then the object is a new one that
could be added to the database but certainly cannot be reattached.
Otherwise, the object can be matched up with a record in the
database and, on reattachment, the contents of the object are used
to update the contents of the corresponding database record(s).
In writing an equals method, there are two important considerations to bear in mind:
If you write an equals method, you must
write a hashCode method which always
returns the same value for two objects which
equals decides are equal.
When referring to instance variables of the argument object, always use the accessor method rather than directly referring to the raw instance variable: this is because, in an environment such as a web application or service, you may actually be dealing with a proxy object rather than the actual object you expect for reasons of, for example, distributed load balancing or scalability to very large service loads.
Given that, the equals and
hashCode methods should be
written as follows:
public class Customer
{
    …
    public boolean equals(Object other)
    {
        if (this==other)
            return true;
        if (other==null)
            return false;
        if (!(other instanceof Customer))
            return false;
        final Customer o = (Customer) other;
        return this.emailAddress.equals(o.getEmailAddress());
    }
    public int hashCode()
    {
        return emailAddress.hashCode();
    }
}
A Hibernate object, suitable for mapping into a database, is a normal Java bean with a number of extra requirements.
There must be a default constructor for the class.
There must be accessors and mutators for all the instance variables of the class. Actually this is overstating the requirement but is a good base rule: read the Hibernate documentation for the full details.
The class should implement
Serializable. Strictly speaking, this
is not a requirement. However, in practice you will normally
want your Hibernate objects to be serializable so that they
can be (potentially) migrated around a multiprocessor
cluster or saved and restored across a web server reboot etc.
The class should have an id instance
variable, usually of type Long. Again
this is not a true requirement but it is recommended to use
automatically generated surrogate keys and, if so, to use an
instance variable called id to hold it.
Certainly, alternatives are possible.
The mutator for the id property should
be private, not public. Again not a requirement but good
practice. You should never update the
id property directly but rather rely on
Hibernate updating it for you. In practice, it is the value
of this field that Hibernate uses to decide if an object has
been mapped to a database record or not. Change the property
yourself and you could seriously confuse Hibernate.
You should decide on a business key for the object and
implement the equals and
hashCode methods for it.
You can add any extra type specific constructors (which
should leave the id field
null) and business rule methods you like.
You should not make the class final if
you want to be able to use lazy loading
for objects of the class (which you normally do).
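Put together, a class following these guidelines might look like the sketch below. It is only one possible shape: the name property, the constructor arguments and the null handling in equals and hashCode are assumptions added for illustration, with the email address as business key as in the Customer example above.

```java
import java.io.Serializable;

// Illustrative mapped class; not final, so that lazy loading remains possible.
class Customer implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long id;             // surrogate key; stays null until the object is saved
    private String emailAddress; // business key used by equals() and hashCode()
    private String name;         // assumed extra property, for illustration only

    // Default constructor, as required by the list above.
    public Customer() {
    }

    // Type specific constructor; note that it leaves id null.
    public Customer(String emailAddress, String name) {
        this.emailAddress = emailAddress;
        this.name = name;
    }

    public Long getId() { return id; }

    // Private mutator: application code should never assign the id itself.
    @SuppressWarnings("unused")
    private void setId(Long id) { this.id = id; }

    public String getEmailAddress() { return emailAddress; }
    public void setEmailAddress(String emailAddress) { this.emailAddress = emailAddress; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Customer)) return false;
        Customer o = (Customer) other;
        // Use the accessor on the argument, since it may be a proxy.
        return emailAddress != null && emailAddress.equals(o.getEmailAddress());
    }

    @Override
    public int hashCode() {
        return emailAddress == null ? 0 : emailAddress.hashCode();
    }

    public static void main(String[] args) {
        Customer a = new Customer("ann@example.org", "Ann");
        Customer b = new Customer("ann@example.org", "Ann Smith");
        System.out.println(a.equals(b) + " " + (a.getId() == null));
    }
}
```

Two objects with the same email address compare equal even though their other properties differ, and neither has an id until it has been saved.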
Once you have a Session object, and are
executing inside a Transaction, there are a
number of ways you can interact with the database.
session.get is used to create a new
persistent object by id from the database. It
returns null if there was no such
object in the database. session.load is
similar except that if there was no such object in the
database it throws an exception.
| Important |
|---|---|
Conceptually, these methods do not just get the object requested but also all objects that it refers to through its properties, and, transitively, all objects that they refer to as well, and so on. If the database is large, and there is a path of associations from every object to every other object, then fetching one object could try to load in the entire database. There are two issues to consider:
session.delete will cause the database
row corresponding to a persistent (or even a detached!)
object to be deleted and the object will become transient.
What happens to the objects it refers to depends on
the cascade properties of the mapping configuration for that
reference.
session.save on a transient item will
assign it an id and make the object
persistent: i.e., ensure it, and any
other objects it refers to, get saved to the database. This
operation essentially causes an SQL
INSERT to be executed. Any further calls to the
mutators of the object within the transaction will cause an
SQL UPDATE to be invoked.
session.lock and
session.update are both intended for
reattaching a detached object. Normally you should use
session.update which triggers an
SQL UPDATE to the database row with
id equal to that of the object. Thus
if the database and the object disagreed on the values
contained, then the object overrides the database.
session.lock simply reattaches the
object (to the session) without checking or updating the
database on the assumption that the database is still fully
in synch with the object. Generally, do not use this method
unless you are absolutely sure that nothing has changed the
database state of the object since it was detached.
session.saveOrUpdate is a convenience
method that checks whether the object is transient, in which case
it acts like session.save, or detached,
in which case it acts like session.update.
session.merge checks for a persistent
object with the same identifier in the session. If it finds
one, it copies the data from the detached object onto the
persistent one. Otherwise it creates a new persistent object
from the data of the detached one. Either way, the detached
object stays unchanged and detached and you are now
guaranteed that there is a persistent object in the session
(which is returned by the method) which exactly matches the
detached one.
So far, our discussions above show how to fetch an object from the
database if we know its id. Obviously we need
more powerful querying facilities.
As described above, one of the simplest ways of querying is simply by invoking a chain of accessor methods on a persistent object:
X x = z.getY().getX();
The most critical type of access that is not covered above is
finding an object (or collection of objects) when you have some
information about them but not the identifier and you do not have
a persistent object available that refers to them. For example, we
might want to find the customers whose email address matches a given
one. Here we can use the session.createQuery
method which returns a Query object. The
Query object can be executed by invoking
list() or iterate() to
return, respectively, a list of results or an iterator over the
results. Each result, i.e. each element of the list or each element
returned by the iterator, will either be an object of one of your
persistent classes, or an object array containing several such
objects, depending on whether your query asked for one or for a
number of objects on each row of the result.
The query language is not SQL but HQL. HQL is very similar to SQL but, instead of names of tables and columns, the query uses names of Java classes and properties. Of course there is a great deal more to it than that, and full details can be found in the Hibernate On Line Documentation (http://www.hibernate.org/5.html) or [BK05]. The following paragraphs give a little taster of HQL.
The statement
session.createQuery("from Customer cust where cust.city = :cityName")
       .setString("cityName", "Birmingham")
       .list();
would return a list (java.util.List) of
Customer objects whose city property was
"Birmingham". Naturally, one does not need to
chain the method calls but can introduce variables and execute
the whole operation in separate steps if desired. Note that instead
of the "?" mechanism for JDBC's
PreparedStatement, HQL provides a named
parameter mechanism which is somewhat more readable and less
error-prone. HQL does provide a "?" parameter
numbering mechanism, but HQL's numbering starts at 0, whereas
JDBC's starts at 1.
In the above query, there was no select clause.
The select clause is optional in HQL, but if
used, it should contain a list of one or more objects, rather than
one or more column names:
Query query = session.createQuery("select cust, sa " +
                                  "from Customer cust, SalesAgent sa " +
                                  "where cust.city = sa.city");
This query would find pairs of customers and sales agents in the same city. Thus one might print the results by executing the following:
// iterate() returns a java.util.Iterator, not a list
Iterator results = query.iterate();
while ( results.hasNext() )
{
    // each row holds the two objects named in the select clause, in order
    Object[] row = (Object[]) results.next();
    Customer cust = (Customer) row[0];
    SalesAgent sa = (SalesAgent) row[1];
    System.out.format("Customer: %20s, Sales Agent: %20s\n", cust.getName(), sa.getName());
}
Often, a web request may cause a query to be executed which would
require a large number of rows to be returned. In such a case, one
usually limits the rows to be returned to some limit (say 10 or
20) and allows the user to see them and to request the next block
of rows: for example, search engines like Google return only one
page of matches at a time. This is called
Pagination and Hibernate's queries support
this in a very simple way. Given a Query
object, which should contain a query with a specific
ordering, you can use the (chainable) methods
setFirstResult() and
setMaxResults(), each of which takes a single
integer argument, to, respectively, choose which row to return
first (counting starts at 0) and how many rows to return from
there on. The code to return the third page of results (i.e. page
2), where each page holds 10 rows, might thus look like:
int pageSize = 10;
int pageNo = 2;
// note the trailing space before "order by" and the closing quote
Query query = session.createQuery("select cust, sa " +
                                  "from Customer cust, SalesAgent sa " +
                                  "where cust.city = sa.city " +
                                  "order by cust.name asc, sa.name asc");
query.setFirstResult(pageNo * pageSize);
query.setMaxResults(pageSize);
List customerSalesagentList = query.list();
We have said, a number of times, that when an object is made
persistent, the objects it refers to are also made
persistent. This was an oversimplification. In the mapping files
for the classes, there is an attribute,
cascade, of the various mapping elements (e.g.
one-to-one, one-to-many
etc.) that lets us control how much, or how little, of a reference
graph gets automatically persisted, deleted, updated etc. For a
full discussion, see the section
on transitive persistence in the Hibernate reference
manual. The values that it can be set to, and their meanings when
specified on a relationship from a referencing object to one or
more referenced objects, are given in the following list. Note
that you can have the union of a number of cascade behaviours by
writing the behaviours in a comma separated list.
none: no automatic action on the
referenced object takes place. This is the default if no
cascade behaviour is set.
persist: Cascade any
persist() operation across
this relationship. Note that there is an error in the
reference manual where this is called create.
merge: Cascade any
merge() operation across
this relationship.
lock: Cascade any
lock() operation across
this relationship.
evict: Cascade any
evict() operation across
this relationship.
replicate: Cascade any
replicate() operation across
this relationship.
refresh: Cascade any
refresh() operation across
this relationship.
save-update: If
save(), update() or
saveOrUpdate(), is called on the
referencing object, automatically call
saveOrUpdate() on all referenced objects.
delete: automatically delete the
referenced object(s) when delete() is
called on the referencing object. Note that, if the
referencing object is not deleted but merely removes its
reference to the referenced object, then this option will
not do anything and, potentially, a garbage (or orphan)
object will be left in the database.
delete-orphan: automatically delete any
object for whom the reference has been removed from the
referencing object. This option is only available for
one-to-many and one-to-one relationships.
all: cascade all operations, but do not
take the action of delete-orphan.
all-delete-orphan: cascade all
operations, and take the action of
delete-orphan as well.
It is not normally appropriate to specify any cascade behaviour on a
many-to-one or a
many-to-many relationship.
In the situation where you have a one-to-one, or a one-to-many
relationship, where the referencing object
owns the referenced object(s), the
appropriate cascade behaviour is all,delete-orphan.
In any other case, if you want some cascade behaviour, but there
is no ownership relationship involved (so that, for example, if the
referencing object is deleted, the objects it refers to can
continue to exist in the database) then the appropriate
behaviour is persist,merge,save-update.
As we have seen, the standard pattern for executing a use case is
to get a Session from the
SessionFactory, get a
Transaction from the
Session, interact with the database via the
Session, commit the
Transaction and close the
Session. Details of the exception handling have
been given above. This is fine for normal, fully serialised
database transactions but there are two situations when it is not
fine:
When using an isolation level other than fully serializable. There are 4 standard transaction isolation levels: Read Uncommitted, Read Committed, Repeatable Read and Serializable. They indicate different levels of locking strategies used within transactions and affect just how isolated one transaction really is from other transactions running simultaneously. Thus Serializable means that two transactions run as if one had completely finished (committed or rolled back) before the other had started. In practice, providing this level of isolation requires considerable resources and causes problems with scalability of applications. For this reason, most applications use a weaker form of isolation and use other strategies to overcome the consequent problems that can arise.
The most common isolation level used, and the default obtained with a PostgreSQL JDBC connection, is Read Committed. This ensures that one transaction can not see any value which has been written by another transaction if that other transaction has not yet committed. Therefore we don't have to worry about the other transaction rolling back with the result that we would have to roll back this transaction. With this level, the isolation problems that can occur are:
Lost Updates: Transaction (tx) 1 reads a row, tx 2 reads the same row, tx 1 writes the row and commits, tx 2 writes the row and commits; the value written by tx 1 is lost. This can be dealt with using versioning (see below).
Unrepeatable Reads: tx 1 reads a row, tx 2 writes the row and commits, tx 1 reads the same row and gets a different result from last time. The fact that Hibernate uses a cache, and, adding to that, the use of versioning, will handle this situation.
Phantom Read: tx 1 executes a
Select query. tx 2 inserts new rows into, or
deletes existing rows from, the database and commits, tx 1
executes the same query again and finds a different
set of rows from what was there last time. There is
very little that you can do about this except to be
aware, when you design your transactions, of the
issue. The most common place where this arises is in
pagination: where the results are too large to fit on
a screen, you return only a page full of results to
the user and allow him to select the next when ready,
then you execute the same query again but return the
next page full of records. A phantom read problem
might mean that the first time you return records
0–19, another transaction then deletes a record
in the range of that first page, say record 5, then
you are asked to display the second page of records
and dutifully return records 20–39. However,
record 20 after the delete is actually record 21 from
before the delete. The end result is that you never
show the user the old record 20 at all. If this is an
important problem, then you have to specifically check
that the new record 20 is still the old record 20 and
fix your offsets if it is not.
When long transactions are required. Here the problem is usually that some user interaction, which could take a considerable period of time, is required in a use case with database accesses taking place both before and after the user interaction. The issue is that we should not keep a database transaction open for a long period of time (i.e., for more than a fraction of a second), whereas a user interaction could take from minutes to hours. The solution is to break the long transaction up into two (or more) database transactions, and to use detached objects from the first transaction to carry the necessary information to the presentation layer. These objects get modified, outside any transaction, as part of the user interaction and then reattached to the second transaction to cause the necessary updates.
If we left it at that, then all the isolation problems
described above could occur, as well as the nastier one of a
Dirty Read: Consider a long transaction
and a normal transaction: the first contains two database
transactions tx1a and
tx1b (perhaps the first is to book a
flight, the second to book a hotel). The other transaction,
tx2 might be to order meals for the
passengers on the flight (okay; not very realistic but you
get the idea). Now tx1a could update a
row and commit and tx2 could read it. Now
tx1b decides to roll back (perhaps there
were no hotels available). This only rolls back
tx1b itself, because
tx1a is committed and cannot be rolled
back, but since this is part of a long transaction, the
programmer has written explicit code to undo the effects of
tx1a when tx1b rolls
back (such an undo operation is normally called a
compensating transaction: it may not
even really undo the original transaction, it might only make
up for it in some way such as authorizing a reimbursement or
a voucher if a promised booking cannot be honoured). Now
tx2 has executed, read data from a long
transaction, which has since been undone, and has committed.
To be correct, the effects of this second transaction should
be undone as well but there is no way of knowing this.
The safe solution here is to only read from the database in the early transactions, collecting the values that must be written, but only to allow database writes in the last database transaction of a long transaction, and to use versioning there to ensure that no other transaction has modified the bits of the database that should not have been changed for your writes to still make sense. Under certain application specific circumstances, more relaxed strategies can be taken but the above rule of thumb is safe and simple and should only be ignored with very careful analysis.
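Returning briefly to the pagination example given under Phantom Read above, the skipped record can be reproduced with a plain Java list standing in for the query result. The page helper below is hypothetical and only mimics the arithmetic of a first-row offset and a maximum row count. No database is involved, and the table size of 50 is an arbitrary choice for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PaginationShiftDemo {
    // Returns one page of a result list, mimicking first-result / max-results paging.
    static List<Integer> page(List<Integer> rows, int pageNo, int pageSize) {
        int from = Math.min(pageNo * pageSize, rows.size());
        int to = Math.min(from + pageSize, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }

    // Records 0..49, paged 20 at a time; record 5 is deleted between the two page requests.
    static List<Integer> shownToUser() {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            table.add(i);
        }
        List<Integer> shown = new ArrayList<>(page(table, 0, 20)); // first page: records 0 to 19
        table.remove(Integer.valueOf(5));   // another transaction deletes record 5
        shown.addAll(page(table, 1, 20));   // second page now starts at the old record 21
        return shown;
    }

    public static void main(String[] args) {
        List<Integer> shown = shownToUser();
        System.out.println("record 20 shown: " + shown.contains(20));
        System.out.println("record 21 shown: " + shown.contains(21));
    }
}
```

The old record 20 slides back into the first page after the delete, so it is never displayed unless the application checks for the shift and adjusts its offsets.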
The idea of versioning is very simple and particularly easy to handle with the support Hibernate provides:
Add a version instance variable, and corresponding
accessor and mutator methods, to your objects. An
int or long is
recommended although some people prefer a
Timestamp or
Calendar. The latter two have slightly
worse performance and are not absolutely guaranteed to
work correctly (one might end up with two updates made so
close together in time that they have the same
Timestamp value, although some
operating systems ensure that this cannot be the case), but
they have the advantage that you can easily see exactly
when the update was made.
Declare the version instance variable in your mapping file
for the class. This requires the following element added to
the mapping file immediately after your
id element (assuming your instance
variable is called "version" and you want it in a column
in the database called "version"):
<version name="version" column="version"/>
Now whenever you make an object dirty in memory, Hibernate will
update its version (in memory). Whenever the object gets flushed
to disk (e.g., at the end of a transaction or because you call
session.update or
session.saveOrUpdate to re-attach a detached
object), Hibernate will throw a
StaleObjectStateException if the version
number of the object on disk is not the same as it was when the
object was loaded. By catching that exception, the programmer
can then decide what to do about the conflict (e.g., report back
to the user that the choice he/she has just made is, in fact, no longer
available and ask for another one).
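The version check itself is simple enough to simulate in plain Java. The sketch below does not use Hibernate: the in-memory map stands in for a database table, Seat is a made-up entity, and StaleVersionException is a hypothetical stand-in for the StaleObjectStateException mentioned above. It shows the idea under those assumptions, not Hibernate's implementation.

```java
import java.util.HashMap;
import java.util.Map;

class VersionCheckSketch {
    // Hypothetical stand-in for the exception raised on a version conflict.
    static class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    static class Seat {
        final long id;
        String passenger;
        int version;

        Seat(long id, String passenger, int version) {
            this.id = id;
            this.passenger = passenger;
            this.version = version;
        }

        Seat copy() {
            return new Seat(id, passenger, version);
        }
    }

    // In-memory "table": id -> stored row.
    private final Map<Long, Seat> rows = new HashMap<>();

    void insert(Seat seat) {
        rows.put(seat.id, seat.copy());
    }

    // Returns a copy, playing the role of a detached object.
    Seat load(long id) {
        return rows.get(id).copy();
    }

    // The write succeeds only if nobody changed the row since this copy was loaded.
    void update(Seat detached) {
        Seat stored = rows.get(detached.id);
        if (stored.version != detached.version) {
            throw new StaleVersionException("row " + detached.id + " was modified concurrently");
        }
        detached.version++;                 // bump the version on every successful write
        rows.put(detached.id, detached.copy());
    }

    // Two users load the same seat; the second writer must be rejected.
    static boolean secondWriterRejected() {
        VersionCheckSketch db = new VersionCheckSketch();
        db.insert(new Seat(1L, null, 0));
        Seat forAlice = db.load(1L);
        Seat forBob = db.load(1L);
        forAlice.passenger = "Alice";
        db.update(forAlice);                // stored version goes from 0 to 1
        forBob.passenger = "Bob";
        try {
            db.update(forBob);              // still carries version 0
            return false;
        } catch (StaleVersionException e) {
            return "Alice".equals(db.load(1L).passenger);
        }
    }

    public static void main(String[] args) {
        System.out.println("second writer rejected: " + secondWriterRejected());
    }
}
```

The first update wins and the second is refused instead of silently overwriting it, which is how versioning prevents the Lost Updates problem.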
Hibernate objects, i.e., objects whose persistence Hibernate will manage, can be divided into two types.
Entity beans are objects which have a persistent identity: i.e., usually an identifier field which is managed by Hibernate. These are typically the central business objects in an application such as User, Customer, Order etc.
Value beans are objects which only exist in relationship to an entity bean. These are typically support objects for the entity objects such as Address, CreditCardDetails etc.
The connection between entity beans and database
tables and columns is described in a mapping file: usually
named X.hbm.xml for class X
and stored in the same directory as the compiled file X.class.
The connection between value beans and the database is usually described in the mapping file for the corresponding entity bean.
In the following sections, we will look at the details of how to specify mappings for different kinds of classes and associations. We only cover the basic situations. Many variations are possible, with great flexibility in all the options; for more complex situations, consult the Hibernate reference documentation.
A basic mapping file is as follows:
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping package="a.b.c">
  <class name="User" table="user" lazy="true">
    <id name="id" column="id" type="long">
      <generator class="sequence"/>
    </id>
    <version name="version" column="version"/>
    <property name="dateOfBirth" column="dob" type="date"/>
    <property name="username" not-null="true" unique="true"/>
    <property name="gender"/>
  </class>
</hibernate-mapping>
Types for columns (as for dateOfBirth above) do not need to be specified if they are simple and can be deduced by inspecting the class properties. However, sometimes you want to override the default or specify something a bit more sophisticated.
We can specify whether a column should be unique (which will create a database constraint) and/or whether it can be null, as for username above.
Finally, the simplest case (gender above) is where we specify nothing but the property name: everything else is taken care of by the defaults.
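For orientation, here is a sketch of a Java class that a mapping like the one above could describe. The source shows only the mapping, so the field types chosen here (Long, int, java.util.Date, String) are assumptions, and the package declaration (a.b.c in the mapping) is left out to keep the example in a single file.

```java
import java.util.Date;

// A plain Java bean: one field per mapped property, with accessors and mutators.
class User {
    private Long id;           // assumed wrapper type: null means "not yet saved"
    private int version;       // maintained by the persistence layer, not by the application
    private Date dateOfBirth;  // stored in the "dob" column
    private String username;   // declared not-null and unique in the mapping
    private String gender;     // assumed to be a String; the mapping leaves the type to defaults

    // Hibernate needs a no-argument constructor to instantiate the class.
    User() {
    }

    Long getId() { return id; }
    private void setId(Long id) { this.id = id; }  // the identifier is assigned by the generator

    int getVersion() { return version; }
    private void setVersion(int version) { this.version = version; }

    Date getDateOfBirth() { return dateOfBirth; }
    void setDateOfBirth(Date dateOfBirth) { this.dateOfBirth = dateOfBirth; }

    String getUsername() { return username; }
    void setUsername(String username) { this.username = username; }

    String getGender() { return gender; }
    void setGender(String gender) { this.gender = gender; }

    public static void main(String[] args) {
        User user = new User();
        user.setUsername("alice");
        user.setDateOfBirth(new Date(0L));
        System.out.println(user.getUsername() + " has id " + user.getId());
    }
}
```

Note that nothing in the class refers to Hibernate: the mapping file alone carries the persistence details.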
Sometimes one has properties of entity beans which are more
complex than simple base types but which are totally owned by
the entity. The standard example is that of an
Address object stored as a property of a
Person object. In this case there is only one
Address object for the Person object.

A Value Component
There are a number of ways this can be handled, but the
simplest, which Hibernate calls components,
is to store both the parent object (Person)
and the child (Address) in the same database
row, to construct and connect the two objects on reading this
row from the database and to coalesce the two objects and write
them together when saving or updating either of them. To specify
this, you use a component sub-element for the
child object in the class element for the parent object instead
of the usual property element. Thus, instead
of something like
<property name="address" type="string"/>
one would enter:
<component name="address" class="Address">
  <property name="street" column="user_street"/>
  <property name="postcode" column="user_postcode"/>
</component>
Some details to be aware of are:
These child objects are wholly owned by their parents: you cannot have two different parents.
A null child property is represented in the database by
setting to null all the fields
corresponding to the child object. Thus loading such a row
will result in a parent object with a null child property,
not a parent object with a child object whose properties
are all null.
Not only can you have multiple components in a class, but
you can also have multiple components of the same (child) class
in a class: simply make sure that the component names are
different and that the database field names (the
column attributes) are different for the
different components.
Normally, the child object has no way to reference its parent object. However, if you want a property of the child object to refer to its parent, add a parent element of the form
<parent name="person"/>
to make a person property of the child
object refer to its parent object.
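The behaviour described above (parent and child sharing one row, a null child becoming all-null columns, and the optional back-reference to the parent) can be imitated in plain Java. This is a rough sketch only: ComponentRowMapper and the use of a Map as a stand-in for a database row are inventions for this illustration, and the name column is an assumption. The user_street and user_postcode column names are taken from the mapping fragment above.

```java
import java.util.HashMap;
import java.util.Map;

// The child (value) object; it has no identity of its own.
class Address {
    private String street;
    private String postcode;
    private Person person;  // back-reference, as set up by <parent name="person"/>

    String getStreet() { return street; }
    void setStreet(String street) { this.street = street; }
    String getPostcode() { return postcode; }
    void setPostcode(String postcode) { this.postcode = postcode; }
    Person getPerson() { return person; }
    void setPerson(Person person) { this.person = person; }
}

// The parent (entity) object that wholly owns its Address.
class Person {
    private String name;
    private Address address;  // may be null

    String getName() { return name; }
    void setName(String name) { this.name = name; }
    Address getAddress() { return address; }
    void setAddress(Address address) { this.address = address; }
}

// Hypothetical helper imitating how parent and component share one row.
class ComponentRowMapper {
    // Flattens the parent and its component into a single row.
    static Map<String, Object> toRow(Person p) {
        Map<String, Object> row = new HashMap<>();
        Address a = p.getAddress();
        row.put("name", p.getName());
        row.put("user_street", a == null ? null : a.getStreet());
        row.put("user_postcode", a == null ? null : a.getPostcode());
        return row;
    }

    // Rebuilds both objects; all-null component columns give a null child.
    static Person fromRow(Map<String, Object> row) {
        Person p = new Person();
        p.setName((String) row.get("name"));
        String street = (String) row.get("user_street");
        String postcode = (String) row.get("user_postcode");
        if (street != null || postcode != null) {
            Address a = new Address();
            a.setStreet(street);
            a.setPostcode(postcode);
            a.setPerson(p);  // connect the child back to its parent
            p.setAddress(a);
        }
        return p;
    }
}
```

Round-tripping a Person with no address through toRow and fromRow yields a Person whose address is null, not an Address with null fields.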
Again, there are a number of ways Hibernate can handle inheritance. These are based on the standard techniques for reducing generalisation hierarchies in entity-relationship diagrams [BCN91].
The simplest is to use one table for the whole hierarchy. With
this design, each row of the table can hold an object of any
type from the hierarchy. There is one column for each of the
properties in the union of the sets of properties of all the
classes in the hierarchy and there is one
discriminator column which contains a value
(usually of type string, character or integer)
used to tell which actual type of object is stored in this
particular row. One normally does not make this discriminator a
property of the class: it is used only by Hibernate to record
and detect the type of the object that a row represents.

An Inheritance Class Hierarchy
<class name="Person" table="people" discriminator-value="P">
  <id name="id" column="id" type="long">
    <generator class="sequence"/>
  </id>
  <discriminator column="subclass" type="character"/>
  <version name="version" column="version"/>
  <property name="dateOfBirth" column="dob" type="date"/>
  <property name="name"/>
  <property name="gender"/>
  <subclass name="Lecturer" discriminator-value="L">
    <property name="office" type="string"/>
    <property name="telephone" type="string"/>
  </subclass>
  <subclass name="Student" discriminator-value="D">
    <property name="studentID" type="integer"/>
  </subclass>
</class>
Here one may not specify any of the subclass fields as not-null, because the corresponding column will be null in every row that holds an object which is not of the subclass containing that property.
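To make the role of the discriminator concrete, the following plain-Java sketch rebuilds an object of the right class from one row of the shared table. It is an illustration only: SingleTableReader and the Map used as a row are assumptions of this example, and Hibernate does this work internally. The discriminator values P, L and D and the column name subclass come from the mapping above.

```java
import java.util.HashMap;
import java.util.Map;

class Person {
    private String name;
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

class Lecturer extends Person {
    private String office;
    private String telephone;
    String getOffice() { return office; }
    void setOffice(String office) { this.office = office; }
    String getTelephone() { return telephone; }
    void setTelephone(String telephone) { this.telephone = telephone; }
}

class Student extends Person {
    private Integer studentID;
    Integer getStudentID() { return studentID; }
    void setStudentID(Integer studentID) { this.studentID = studentID; }
}

// Hypothetical helper: picks the class to instantiate from the discriminator column.
class SingleTableReader {
    static Person fromRow(Map<String, Object> row) {
        char type = (Character) row.get("subclass");
        Person result;
        if (type == 'L') {
            Lecturer lecturer = new Lecturer();
            lecturer.setOffice((String) row.get("office"));
            lecturer.setTelephone((String) row.get("telephone"));
            result = lecturer;
        } else if (type == 'D') {
            Student student = new Student();
            student.setStudentID((Integer) row.get("studentID"));
            result = student;
        } else if (type == 'P') {
            result = new Person();
        } else {
            throw new IllegalArgumentException("unknown discriminator: " + type);
        }
        // Properties of the base class are present in every row.
        result.setName((String) row.get("name"));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("subclass", 'D');
        row.put("name", "Tony Blair");
        row.put("studentID", 42);
        // The office and telephone columns are simply absent (null) for a student row.
        Person p = fromRow(row);
        System.out.println(p.getClass().getSimpleName() + ": " + p.getName());
    }
}
```

The student row leaves the lecturer-only columns null, which is exactly why those columns cannot carry a not-null constraint.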
This corresponds to the standard Java reference to one object from another.

A Many-to-One, Unidirectional Association
In the diagram above, we represent the relationship between
students and their thesis supervisors. In this design, a student
can have no more than one supervisor but may not (yet) have any.
However, a lecturer may have any number, including zero, of
students to supervise. Furthermore, we only allow a one
directional link: Student has a property (say
getSupervisor/setSupervisor) but there is no
direct way, starting with a Lecturer object,
to find the students that the lecturer supervises.
If we start with the simple, non-related, base entity mapping
files for Student and
Lecturer, we add this association by adding
the following, as a sub-element of the class element, to the mapping file for
Student:
<many-to-one name="supervisor" column="supervisor"/>
This element acts very much like a normal property element in
that it defines the mapping between the
supervisor property of
Student and the column in the
students table. However, it also sets up the
relationship so that, after getting a Student
object from the database, if we use the
supervisor accessor of that
Student object, we will get the corresponding
Lecturer object (or a proxy thereof if we
have enabled lazy loading of the Lecturer
objects). Finally, it ensures that the underlying database is
created with a foreign key constraint that the
supervisor column is a foreign key into the
lecturers table.
As things stand, there is now a question of what you want the
cascade behaviour of the relationship to be
(see the section on cascade
above). Unless you add the optional cascade
attribute to the many-to-one element,
the Lecturer object on the other end of the
association is ignored when the Student
object is saved, updated or deleted, or when its
supervisor property is reset away from it.
Certainly we would not want the lecturer to be removed from the
database when the student is deleted or when the student no
longer has that lecturer as his or her supervisor, so none of
the delete or all options
are appropriate. But what about save-update?
There are two scenarios under which this might have an effect:
If you create a new (transient) lecturer and make a
persistent student refer to it. In fact, for this
particular object design, one would never do such a thing:
the obvious semantics of the situation dictate that you
cannot just invent new lecturers on demand: you would
always have to have the lecturer as a currently existing
object in the database before setting the student's
supervisor property to that lecturer. Since the scenario
will never arise, this is a vote neither for nor against
using the save-update option.
If the Student, and associated
Lecturer objects were detached, and now
you reattach the Student object, then
you need the save-update option if you
want the Lecturer object to be
reattached automatically. Without that, you need to
reattach it directly yourself — an easy task to
overlook and therefore a source of bugs. This, therefore,
is a vote for setting the
cascade="save-update" option.
Note that you can specify unique="true" as an
attribute of the many-to-one element. This
has the effect of disallowing the possibility of having two
student rows with the same supervisor values,
i.e., turning the "*" on the Student side of
the class diagram into a "0..1" or limiting each lecturer to
having at most one supervisee. Similarly, specifying
not-null="true" adds the requirement that
every student must have a valid supervisor, i.e., it changes the
"0..1" on the Lecturer side of the diagram to a "1".
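On the Java side, the unidirectional many-to-one association is nothing more than an ordinary reference. The sketch below shows plain versions of the two classes. The studentsOf helper is hypothetical and is included only to show that, without a collection on Lecturer, finding a lecturer's students means searching through the students.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Lecturer {
    private String name;

    Lecturer() {
    }

    Lecturer(String name) {
        this.name = name;
    }

    String getName() { return name; }
    // Note: no property here refers to students.
}

class Student {
    private String name;
    private Lecturer supervisor;  // may be null: the "0..1" end of the association

    Student() {
    }

    Student(String name) {
        this.name = name;
    }

    String getName() { return name; }
    Lecturer getSupervisor() { return supervisor; }
    void setSupervisor(Lecturer supervisor) { this.supervisor = supervisor; }
}

// Hypothetical helper: the only way back from a lecturer is to scan the students.
class SupervisionQueries {
    static List<Student> studentsOf(Lecturer lecturer, List<Student> allStudents) {
        List<Student> result = new ArrayList<>();
        for (Student s : allStudents) {
            if (s.getSupervisor() == lecturer) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Lecturer lecturer = new Lecturer("Gordon Brown");
        Student supervised = new Student("Tony Blair");
        Student unsupervised = new Student("Michael Howard");
        supervised.setSupervisor(lecturer);
        List<Student> all = Arrays.asList(supervised, unsupervised);
        System.out.println(studentsOf(lecturer, all).size() + " student(s) supervised");
    }
}
```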
This relationship is essentially the same as the last one, but now
we choose the opposite direction for navigating the connection.
Thus our Lecturer object now has a property
which is a collection of Student objects while the
Student objects have no properties which
refer to their supervising Lecturer.
![]() | Warning |
|---|---|
For reasons described below, we would (almost) never use such an association: it is inefficient and there is almost no overhead in converting it to a much more efficient bidirectional association. Nonetheless, it is useful to discuss this case as a first step towards the bidirectional version of the association. | |

A One-to-Many, Unidirectional Association
As far as the database is concerned, there is no difference
between this and the unidirectional
many-to-one association: there will still be
a single column in the table holding the
Student objects that contains a foreign key
into the table holding the Lecturer objects.
Now that entities are being stored in collections, it becomes
critical that you have appropriately implemented
equals and hashCode
methods for those entities. In particular, you should ensure that
these methods are independent of the generated surrogate keys
and that trivial changes to the object do not affect the
methods while the objects are in the collections.
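One common way to meet these requirements is to base equals and hashCode on a business key instead of the generated identifier. The sketch below assumes that a student's registration number (regNo) is such a key and never changes once set; that choice is an assumption made for this example, not something the mapping dictates.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class Student {
    private Long id;       // generated surrogate key: null until the object is saved
    private String regNo;  // assumed business key, treated as immutable once set
    private String name;   // may change freely without affecting equality

    Student() {
    }

    Student(String regNo, String name) {
        this.regNo = regNo;
        this.name = name;
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getRegNo() { return regNo; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }

    // Equality depends only on the business key, never on id or name.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Student)) {
            return false;
        }
        return Objects.equals(regNo, ((Student) other).regNo);
    }

    // Must agree with equals: uses the same single field.
    @Override
    public int hashCode() {
        return Objects.hashCode(regNo);
    }

    public static void main(String[] args) {
        Set<Student> advisees = new HashSet<>();
        Student s = new Student("R001", "Tony Blair");
        advisees.add(s);
        s.setId(42L);          // simulates the key being generated on save
        s.setName("T. Blair"); // a trivial change
        System.out.println("still found in set: " + advisees.contains(s));
    }
}
```

Because the hash code does not change when the identifier is assigned, an object added to a Set before being saved can still be found in it afterwards.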
The simplest collection, for our purposes, is a
Set. To create the association, we add a
Set valued property to
Lecturer:
<set name="advisees" cascade="save-update" lazy="true">
  <key column="lecturer_id"/>
  <one-to-many class="Student"/>
</set>
Here, we define a property of Lecturer which
is Set valued. The name of the property is
advisees. This property is to capture a
one-to-many association to the
Student class, and this association is to be implemented in the
database as a foreign key to the table holding
Lecturer objects stored in the column
lecturer_id in the table holding
Student objects.
There are a number of constraints imposed by the use of this
one-to-many association which arise from the
fact that it is represented by this "reverse" link from the
contained object side of the association:
From a Java point of view, we could potentially have two
different Lecturer objects, both of
which have the same Student object in
their container. However, this is not possible for a
one-to-many association because, in the
database, each Student row refers to
the single Lecturer row which contains
it. If you want the true Java semantics, you have to
represent the association as a
many-to-many one.
You cannot have the same object multiple times in the same
collection. This is obvious when the collection property
is of type Set, but one could use other
types, such as List. However, the implementation of
association by the reverse foreign key makes this
impossible. Again, a many-to-many
association can provide the appropriate semantics.
Finally, there is the question of why
one-to-many associations between entities cause
problems. Consider the following code:
tx = session.beginTransaction();
Lecturer lect = new Lecturer("Gordon Brown") ;
lect.getAdvisees().add(new Student("Tony Blair")) ;
lect.getAdvisees().add(new Student("Michael Howard")) ;
session.save(lect);
tx.commit();
Note that the association belongs to the
Lecturer class (as it is defined in
Lecturer's mapping file). This means that
adding a student to a lecturer's advisees is considered an
operation on a lecturer, not on a student. Thus the
SQL statements that would be generated for
the above statements would include an insert for the lecturer
object, together with an insert each for the two connected
student objects (because the students are referred to by the
lecturer and we have put the cascade="save-update"
declaration in the Lecturer's mapping file).
But because the association does not belong to the students, the
saving of the students would not set the foreign key value to the
advising lecturer. Thus there would be two extra update statements for adding
the lecturer's foreign key value into the student records. These
extra two update statements are not just an efficiency problem.
If every student should have a supervisor, then we would like to
add the not-null="true" attribute to the
key element in the mapping file for the
association. However this would cause errors as the above
sequence of inserts and updates does insert nulls (if only to
immediately update them) where they should never occur.
The solution is to only create such
one-to-many associations as the inverse end
of a bidirectional many-to-one association.
This gives ownership of the association to the
Student end and, as we see below, leads to
the foreign key being created as part of the initial insert of
the Student record instead of after it as a
consequence of the Lecturer insert.
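The difference in statement order can be seen in a toy model. The sketch below is not Hibernate code and does not claim to reproduce the exact SQL Hibernate issues: StudentsTable and SaveStrategies are invented for this illustration, and they simply replay the sequence of inserts and updates described above against an in-memory table that can enforce a not-null foreign key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A toy "students" table with an optional not-null constraint on the foreign key.
class StudentsTable {
    private final boolean lecturerNotNull;
    private final Map<String, Long> lecturerIdByStudent = new HashMap<>();
    final List<String> log = new ArrayList<>();  // statements executed, in order

    StudentsTable(boolean lecturerNotNull) {
        this.lecturerNotNull = lecturerNotNull;
    }

    void insert(String student, Long lecturerId) {
        if (lecturerNotNull && lecturerId == null) {
            throw new IllegalStateException("not-null constraint violated for " + student);
        }
        lecturerIdByStudent.put(student, lecturerId);
        log.add("insert " + student);
    }

    void updateLecturer(String student, Long lecturerId) {
        lecturerIdByStudent.put(student, lecturerId);
        log.add("update " + student);
    }
}

class SaveStrategies {
    // Association owned by the collection: rows go in without the key, then get updated.
    static void saveCollectionOwned(StudentsTable table, long lecturerId, List<String> students) {
        for (String s : students) {
            table.insert(s, null);
        }
        for (String s : students) {
            table.updateLecturer(s, lecturerId);
        }
    }

    // Association owned by the student: the key is part of the initial insert.
    static void saveStudentOwned(StudentsTable table, long lecturerId, List<String> students) {
        for (String s : students) {
            table.insert(s, lecturerId);
        }
    }

    public static void main(String[] args) {
        List<String> students = Arrays.asList("Tony Blair", "Michael Howard");
        StudentsTable a = new StudentsTable(false);
        saveCollectionOwned(a, 1L, students);
        StudentsTable b = new StudentsTable(true);
        saveStudentOwned(b, 1L, students);
        System.out.println("collection-owned: " + a.log);
        System.out.println("student-owned:    " + b.log);
    }
}
```

With the not-null constraint switched on, the collection-owned strategy fails on its first insert, while the student-owned strategy needs only the two inserts.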
In this case we allow navigation in both directions between the
two classes. On the Many side it is a
standard Java reference; on the One side it
is a collection. However, the two associations are not
independent of each other: one is the inverse of the
other, and the same foreign key column is used for both associations.
Thus our Lecturer object now has a property
which is a collection of Student objects, while the
Student objects have a property which
refers to the Student's supervising Lecturer.

A Many-to-One, Bidirectional Association
To achieve this, we start by using the
many-to-one element as before in the mapping
file for the Student class, and the
Set element as before in the mapping file for
the Lecturer class, ensuring that both
associations use the same column in Student's
table to encode the association. Then we add a new attribute,
inverse="true" to the set
element in Lecturer's mapping file. Without
this, adding a new Student as an advisee to a
Lecturer would trigger Hibernate to set the
foreign key column of the Student table
twice: once for each association that has been changed. The
inverse attribute tells Hibernate that
Student owns the association and that
Hibernate should not trigger updates of the foreign key column
when it changes on the Lecturer side.
Thus the mapping file for Lecturer looks like this:
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
  <class name="Lecturer" table="lecturers">
    <id name="id" column="lecturer_id">
      <generator class="sequence"/>
    </id>
    <version name="version" column="version"/>
    <property name="name" column="name"/>
    <set name="advisees" inverse="true" cascade="save-update" lazy="true">
      <key column="lecturer_id"/>
      <one-to-many class="Student"/>
    </set>
  </class>
</hibernate-mapping>
The mapping file for student is as follows:
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
  <class name="Student" table="students">
    <id name="id" column="student_id">
      <generator class="sequence"/>
    </id>
    <version name="version" column="version"/>
    <property name="name" column="name"/>
    <property name="regNo" column="reg_no"/>
    <many-to-one name="advisor" column="lecturer_id" cascade="save-update"/>
  </class>
</hibernate-mapping>
Now all the programmer has to do is to ensure that, when the
Lecturer's advisees
property is changed, the corresponding changes
are made to the appropriate Student's
advisor property. So long as both are done
together, the Java object graph will be correct and the correct
update on disk will be made as well. Furthermore, since the
association belongs to the Student, there
will never be an insert of a Student record
with a null Lecturer foreign key if the
Student has an advisor, thus avoiding
not-null constraint breaking. To ensure that these updates are
made together, it is usual to add some convenience methods: in
Lecturer, change the getAdvisees() and
setAdvisees() methods to
private and add a convenience method to
update the object graph correctly when adding a new Student
advisee to a Lecturer:
public void addAdvisee(Student st)
{
    Lecturer oldAdvisor = st.getAdvisor();
    if (oldAdvisor != this)
    {
        // Detach the student from any previous advisor first
        if (oldAdvisor != null)
            oldAdvisor.getAdvisees().remove(st);
        // Update both ends of the association together
        st.setAdvisor(this);
        advisees.add(st);
    }
}
Note how we are careful to correctly handle removal of a
Student from a previous advising
Lecturer before adding it to this one.
Whether you need to do something similar for your code will
depend on your detailed design.
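To see the convenience method in context, here is a self-contained sketch of the two classes with Hibernate left out. The advises and adviseeCount methods are additions made for this demonstration, and Student relies on the default identity-based equals purely to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

class Lecturer {
    private final String name;
    private final Set<Student> advisees = new HashSet<>();

    Lecturer(String name) {
        this.name = name;
    }

    String getName() { return name; }

    // Private, so callers cannot change one end of the association on its own.
    private Set<Student> getAdvisees() { return advisees; }

    // Demonstration helpers (not in the original text).
    boolean advises(Student st) { return advisees.contains(st); }
    int adviseeCount() { return advisees.size(); }

    // Keeps both ends of the association consistent.
    public void addAdvisee(Student st) {
        Lecturer oldAdvisor = st.getAdvisor();
        if (oldAdvisor != this) {
            if (oldAdvisor != null) {
                oldAdvisor.getAdvisees().remove(st);  // leave the previous advisor
            }
            st.setAdvisor(this);
            advisees.add(st);
        }
    }
}

class Student {
    private final String name;
    private Lecturer advisor;

    Student(String name) {
        this.name = name;
    }

    String getName() { return name; }
    Lecturer getAdvisor() { return advisor; }
    void setAdvisor(Lecturer advisor) { this.advisor = advisor; }
}

class BidirectionalDemo {
    public static void main(String[] args) {
        Lecturer first = new Lecturer("Gordon Brown");
        Lecturer second = new Lecturer("Michael Howard");
        Student student = new Student("Tony Blair");
        first.addAdvisee(student);
        second.addAdvisee(student);  // moves the student to the second lecturer
        System.out.println(student.getName() + " is advised by "
            + student.getAdvisor().getName());
    }
}
```

After the second call the student appears only in the second lecturer's collection, so the object graph matches what will be written to the single foreign key column.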
If we have a true composition relationship, i.e., a parent-child
relationship where if the parent gets deleted then the child
should also be deleted etc., then we should change the
cascade attribute on the
set element in the
Lecturer mapping file to be all-delete-orphan.
There is still plenty more to learn about Hibernate. There are one-to-one and many-to-many associations, value (as opposed to entity) collections, outer-join and batch fetching, Iterate queries, Criteria queries and the whole of the HQL query language, not to mention explicit SQL queries. There are alternative strategies for inheritance hierarchy mapping and polymorphism handling. There are user-defined data types and mappings, Interceptors, caching and all the Hibernate related tools. All of this and more are discussed on the Hibernate web site and in the book.