Getting Started with Hibernate

Revision History
Revision 1.36 February 2005aps

Table of Contents

1. Introduction to Hibernate
2. Example Hibernate Application
3. Database Interaction Pattern
4. The Hibernate Object Life Cycle
5. Hibernate Objects
6. The Session
7. Querying
8. Cascading Persistence
9. Transactions
9.1. Versioning
10. Mapping Classes to the Database
10.1. Mapping Simple Entity Classes without Relationships
10.2. Mapping Value Objects within Entities
10.3. Mapping Entities with Inheritance
10.4. Many-to-One, Unidirectional Associations
10.5. One-to-Many, Unidirectional Associations
10.6. Many-to-One, Bidirectional Associations
11. Patterns
12. Going Further
References

1. Introduction to Hibernate

Hibernate is an Object Relational Mapping (ORM) tool. It manages the persistence of Java objects in a relational database. The idea is that a programmer should be able to design his business objects as standard Java objects with very little interference from the problems of making these objects persist in a database. With a little help from the programmer, Hibernate saves the objects into the database, retrieves them when needed and supports queries on the database written in a form similar to SQL but which refers to objects and object properties instead of tables and column names. The end result is that the code that needs to be written to interact with the database is considerably shorter and simpler.

This document is intended to cover only the basics of Hibernate: i.e., do things the way they are described here until you have outgrown them. Hence many features are only referred to in passing or not mentioned at all. The full reference documentation and a great deal of other important information are available online in the Hibernate On Line Documentation (http://www.hibernate.org/5.html). However, for serious users of Hibernate, a thorough study of [BK05] is highly recommended.

2.  Example Hibernate Application

Here we provide a fully working, but very simple, example application using Hibernate. The example is based on some fragments that appear in chapter 2 of [BK05]. The files necessary are:

  • Main.java: the main class that manipulates Message objects, storing them into and retrieving them from the database.

  • Message.java: the class defining the objects that will be persisted to the database.

  • Message.hbm.xml: the mapping file that describes how the properties of a Message object should be mapped to columns of tables in the database (along with other necessary information such as how keys are generated, what database constraints and indexes should be maintained etc.).

  • hibernate.properties: the Hibernate configuration file that specifies the database that is to be used, the database connection pooling system (if any) and other configuration parameters for the system.

  • log4j.properties: the log4j configuration file that sets many parameters of the logging system.

3.  Database Interaction Pattern

The basic pattern of database interactions via Hibernate is visible in the Main.java file above.

  1. Create a Configuration object, load the configuration parameters from the hibernate.properties file and adjust them as required.

  2. Create a SessionFactory object from the Configuration object. The SessionFactory object is a heavyweight, thread safe object. You would normally share one such object between all your threads in a web application.

  3. For each unit of work (normally one use case) use the SessionFactory object to obtain a Session object. This is an extremely lightweight, non-thread safe object. It will be associated with a database connection but it only obtains that connection lazily, i.e., only when (and if) it is required. Session objects must not be shared between different threads.

  4. Inside a try block, get a Transaction object by calling beginTransaction() on the Session object.

  5. Interact with the database:

    • explicitly by calling methods of Session to associate objects to the database (i.e., map them to the database), execute queries, load, save, delete mapped objects etc.

    • implicitly by calling property mutators on mapped objects that will lead to the database being updated.

    • implicitly by referencing non-mapped objects from mapped objects which (in certain circumstances) can cause the non-mapped objects to be added to the database.

    • implicitly by unreferencing mapped objects from other mapped objects which (in certain circumstances) can cause the unreferenced objects to be deleted from the database.

  6. Call commit() on the Transaction object and close the try block, handling exceptions and closing the Session object in the usual way.
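
Steps 4 to 6 can be sketched as follows. So that the fragment compiles without the Hibernate library, UnitSession and UnitTransaction are hypothetical stand-ins that only mimic the shape of the Session and Transaction objects named in the steps, and the unchecked exception handling is a simplification. With Hibernate itself the session would come from a SessionFactory built from a Configuration (steps 1 to 3), which this sketch does not show.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the Hibernate API: they only mimic the shape of
// the Session and Transaction objects described in the steps above.
interface UnitTransaction {
    void commit();
    void rollback();
}

interface UnitSession {
    UnitTransaction beginTransaction();
    void close();
}

// Step 5 (the actual database work) is supplied by the caller.
interface UnitBody {
    void run(UnitSession session);
}

final class UnitOfWork {
    // Steps 4 to 6: begin, do the work, commit; roll back on failure and
    // always close the session.
    static void execute(UnitSession session, UnitBody body) {
        UnitTransaction tx = null;
        try {
            tx = session.beginTransaction();
            body.run(session);
            tx.commit();
        } catch (RuntimeException e) {
            if (tx != null) {
                tx.rollback();
            }
            throw e;
        } finally {
            session.close();
        }
    }
}

// A recording fake, used only to show the order of the calls.
final class RecordingSession implements UnitSession {
    final List<String> calls = new ArrayList<String>();

    public UnitTransaction beginTransaction() {
        calls.add("begin");
        return new UnitTransaction() {
            public void commit() { calls.add("commit"); }
            public void rollback() { calls.add("rollback"); }
        };
    }

    public void close() { calls.add("close"); }
}

final class UnitOfWorkDemo {
    public static void main(String[] args) {
        RecordingSession ok = new RecordingSession();
        UnitOfWork.execute(ok, s -> { });
        System.out.println(ok.calls);          // [begin, commit, close]

        RecordingSession failing = new RecordingSession();
        try {
            UnitOfWork.execute(failing, s -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            System.out.println(failing.calls); // [begin, rollback, close]
        }
    }
}
```

Closing the session in the finally clause means it is released whether the unit of work commits or rolls back.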

4. The Hibernate Object Life Cycle

  • Transient objects do not (yet) have any association with the database. They act like any normal Java object and are not saved to the database. When the last reference to a transient object is lost, the object itself is lost and is (eventually) garbage collected. There is no connection between transactions and such objects: commits and rollbacks have no effects on them. They can be turned into persistent objects via one of the save method calls of the Session object or by adding a reference from a persistent object to this object.

  • Persistent objects do have an association with the database. They are always associated with a persistence manager, i.e., a Session object, and they always participate in a transaction. Actual updates of a database from the persistent object may occur at any time between the moment the object is updated and the end of the transaction: they do not necessarily happen immediately. However, this feature, which allows important optimizations in database interactions, is essentially invisible to the programmer. For example, one place where one might expect to notice the difference between the in-memory persistent object and the database version is at the point of executing a query. In such a case, Hibernate will, if necessary, synchronise any dirty objects with the database (i.e., save them) in order to ensure that the query returns the correct results.

    A persistent object has a primary key value set, whether or not it has been actually saved to the database yet.

    Calling the delete method of the Session object on a persistent object will cause its removal from the database and will make it transient.

  • Detached objects are objects that were persistent but no longer have a connection to a Session object (usually because you have closed the session). Such an object contains data that was synchronised with the database at the time that the session was closed, but, since then, the database may have changed; with the result that this object is now stale.

    Important

    A detached object may be re-attached later to another Session object to become persistent again. Thus, in essence, these objects can happily exist, and be used, without concern for being inside a transaction. This mechanism, in fact, is the basis for letting business objects, which are stored persistently in the database, escape up to higher levels in the system without having to add extra value beans (also known as Data Transfer Objects (DTOs)), which exist to copy the data of objects tied to one layer in the system to objects tied to another layer. Without this mechanism, one typically has to create a number of classes for each business object, where the instance variables are all basically the same, but which differ in the layer specific details.

Figure: The Hibernate Object Life Cycle

Given a pair of (persistent) objects of the same class, we now have three concepts of identity to consider.

  • a==b  Java Identity
  • a.equals(b)  Java Equality
  • a.getId().equals(b.getId())  Database Identity

The rule for Hibernate is that if, within a single session, you request two objects which have the same database identifier, then you will get references to the same actual object. However, if you reattach an object to a session, you have a potential source of confusion in that you could end up with two different persistent objects (different as defined by Java Equality), which should be stored in the same database row.
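
The three notions of identity, and the single-session rule, can be illustrated with a toy identity map. This is only a sketch of the idea: ToySession and Row are hypothetical names, the map merely stands in for whatever Hibernate does internally, and Row deliberately inherits equals from Object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical persistent class; it inherits equals() from Object on purpose.
final class Row {
    private final Long id;

    Row(Long id) { this.id = id; }

    Long getId() { return id; }
}

// Toy stand-in for a session: at most one in-memory object per database id.
final class ToySession {
    private final Map<Long, Row> loaded = new HashMap<Long, Row>();

    Row get(Long id) {
        Row row = loaded.get(id);
        if (row == null) {
            row = new Row(id);      // pretend this was read from the database
            loaded.put(id, row);
        }
        return row;
    }
}

final class IdentityDemo {
    public static void main(String[] args) {
        ToySession first = new ToySession();
        Row a = first.get(42L);
        Row b = first.get(42L);               // same session, same id
        Row c = new ToySession().get(42L);    // same id, different session

        System.out.println(a == b);                        // true: Java identity within one session
        System.out.println(a == c);                        // false: different sessions, different objects
        System.out.println(a.equals(c));                   // false: default equals is identity
        System.out.println(a.getId().equals(c.getId()));   // true: database identity
    }
}
```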

Since the programmer can define the meaning of Java Equality, it is important not to use the id field in that definition if the id field is a surrogate key. This is because Hibernate only sets the field when saving the object. Hence, for example, if you add the object to some set collection, then saving the object will result in its identity changing, and part of the rules about using the set collection class is that the contained object's identity must not change while it is in the collection.
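
The hazard can be reproduced without a database. In this sketch the hypothetical class IdKeyed bases equals and hashCode on its surrogate id, and the assignId method plays the part of Hibernate setting the id when the object is saved.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class that (unwisely) uses the surrogate id for equality.
final class IdKeyed {
    private Long id;   // null until the object is "saved"

    // Stands in for Hibernate assigning the id on save.
    void assignId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object other) {
        return other instanceof IdKeyed && Objects.equals(id, ((IdKeyed) other).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

final class SetHazardDemo {
    public static void main(String[] args) {
        Set<IdKeyed> items = new HashSet<IdKeyed>();
        IdKeyed item = new IdKeyed();
        items.add(item);                            // hashed while the id is still null
        System.out.println(items.contains(item));   // true

        item.assignId(17L);                         // hash code changes while inside the set
        System.out.println(items.contains(item));   // false: the set can no longer find it
    }
}
```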

In fact, this situation is almost certain to occur because of the frequent use of collection classes to represent the many side of one-to-many or many-to-many relationships. Therefore we use the Java Equality concept to define when two objects should really be the same database object.

However, there are other problems with using all the non-id values of an object in the equality test: you really want the test to return true if the objects map to the same row of the same table in the database (i.e. they represent the same real world concept). But two objects may represent the same real world object and have some different values. For example, two Customer objects may differ in the value of a password property (because the two objects date from different instants in time between which the customer has changed her password). But they still refer to the same real world concept: i.e., the same customer.

The solution is to decide on a Business Key for a class. This is like a database key, but involves no generated surrogate keys. Instead it consists of those "real-world" properties of the class that the programmer considers to uniquely identify a particular record. It is not a requirement that the business key absolutely never changes, merely that it will not change within the period in which it might be stored in memory in a collection class. For the Customer class on a web application, an appropriate business key might be the customer's email address. This, of course, can change, in which case the customer will be treated as a new, different customer. However, this is rarely a significant problem, and if it is, one can always provide a mechanism to reconnect the old data about the customer to the new customer record. More importantly, from our point of view, a change of customer email address is extremely unlikely to affect any reattachment of a detached Customer object to a new session.

Note that Hibernate does not know or care anything about your business keys. As far as it is concerned, reattaching an object works by checking the id property of the object. If it is null, then the object is a new one that could be added to the database but certainly cannot be reattached. Otherwise, the object can be matched up with a record in the database and, on reattachment, the contents of the object are used to update the contents of the corresponding database record(s).

In writing an equals method, there are two important considerations to bear in mind:

  • If you write an equals method, you must write a hashCode method which always returns the same value for two objects which equals decides are equal.

  • When referring to instance variables of the argument object, always use the accessor method rather than the raw instance variable: this is because, in an environment such as a web application or service, you may actually be dealing with a proxy object rather than the actual object you expect for reasons of, for example, distributed load balancing or scalability to very large service loads.

Given that, the equals and hashCode methods should be written as follows:

public class Customer
{
    …
    public boolean equals(Object other)
    {
        if (this==other)
            return true;
        if (other==null)
            return false;
        if (!(other instanceof Customer))
            return false;
        final Customer o = (Customer) other;
        return this.emailAddress.equals(o.getEmailAddress());
    }

    public int hashCode()
    {
        return emailAddress.hashCode();
    }
}

5.  Hibernate Objects

A Hibernate object, suitable for mapping into a database, is a normal JavaBean with a number of extra requirements.

  • There must be a default constructor for the class.

  • There must be accessors and mutators for all the instance variables of the class. Actually this is overstating the requirement but is a good base rule: read the Hibernate documentation for the full details.

  • The class should implement Serializable. Strictly speaking, this is not a requirement. However, in practice you will normally want your Hibernate objects to be serializable so that they can be (potentially) migrated around a multiprocessor cluster or saved and restored across a web server reboot etc.

  • The class should have an id instance variable, usually of type Long. Again this is not a true requirement but it is recommended to use automatically generated surrogate keys and, if so, to use an instance variable called id to hold it. Certainly, alternatives are possible.

  • The mutator for the id property should be private, not public. Again not a requirement but good practice. You should never update the id property directly but rather rely on Hibernate updating it for you. In practice, it is the value of this field that Hibernate uses to decide if an object has been mapped to a database record or not. Change the property yourself and you could seriously confuse Hibernate.

  • You should decide on a business key for the object and implement the equals and hashCode methods for it.

  • You should add any extra type specific constructors (which should leave the id field null) and business rule methods you like.

  • You should not make the class final if you want to be able to use lazy loading for objects of the class.
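
A class that follows this checklist might look like the sketch below. It is not a prescribed template: the emailAddress business key and the password property are taken from the Customer discussion earlier, while the null handling in equals and hashCode is an added assumption.

```java
import java.io.Serializable;

// Sketch of a mappable class following the checklist above.
class Customer implements Serializable {             // not final, so that it can be proxied
    private static final long serialVersionUID = 1L;

    private Long id;                // surrogate key: stays null until Hibernate assigns it
    private String emailAddress;    // business key
    private String password;

    public Customer() { }           // default constructor

    public Customer(String emailAddress, String password) {   // leaves id null
        this.emailAddress = emailAddress;
        this.password = password;
    }

    public Long getId() { return id; }
    private void setId(Long id) { this.id = id; }    // private: application code must not call it

    public String getEmailAddress() { return emailAddress; }
    public void setEmailAddress(String emailAddress) { this.emailAddress = emailAddress; }

    public String getPassword() { return password; }
    public void setPassword(String password) { this.password = password; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Customer)) return false;
        Customer o = (Customer) other;
        // Use the accessor on the argument: it may be a proxy.
        return emailAddress != null && emailAddress.equals(o.getEmailAddress());
    }

    @Override
    public int hashCode() {
        return emailAddress == null ? 0 : emailAddress.hashCode();
    }
}

final class CustomerDemo {
    public static void main(String[] args) {
        Customer before = new Customer("ann@example.org", "old-secret");
        Customer after = new Customer("ann@example.org", "new-secret");
        System.out.println(before.equals(after));                    // true: same business key
        System.out.println(before.hashCode() == after.hashCode());   // true
        System.out.println(before.getId());                          // null: not yet saved
    }
}
```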

6. The Session

Once you have a Session object, and are executing inside a Transaction, there are a number of ways you can interact with the database.

  • session.get is used to retrieve a persistent object from the database by id. It returns null if there was no such object in the database. session.load is similar except that if there was no such object in the database it throws an exception.

    Important

    Conceptually, these methods do not just get the object requested but also all objects that it refers to through its properties, and, transitively, all objects that they refer to as well, and so on. If the database is large, and there is a path of associations from every object to every other object, then fetching one object could try to load in the entire database. There are two issues to consider:

    1. Controlling what really gets loaded while still making everything work as if everything referred to has been loaded. This is done by a technique of lazy loading using proxies. To specify the use of proxies for a class, the attribute lazy="true" must be specified for that class in the mapping file.

      Using lazy fetching, however, now means that such lazy associations can only be turned into eager associations (by a process called initialisation) within a session. Within a session, simply accessing a non-id property of a persistent (although possibly lazy) object initialises it. But once the object is detached, any request for a lazy and uninitialised property will throw an exception. There is a static method, Hibernate.initialize, which can be used to ensure that a lazy or proxy object is materialised before closing the session, and another, Hibernate.isInitialized, to test its initialisation state. However, these methods do not work recursively over the whole object graph. Therefore, the programmer must do one of the following:

      • Recursively walk over the object graph, initialising objects on the way.

      • Re-attach the object to a session before de-referencing potentially uninitialised objects.

      • Load the objects in a query in which the fetch strategy has been changed to an eager one. This runtime mechanism can override the lazy setting for the objects in the mapping file.

      • Keep the session open until any possible chain of dereferences of the properties of the object has been completed.

      The standard advice ([BK05]) is to make all associations lazy by default in the mapping files and override this at runtime, where necessary, with queries that force eager fetching.

    2. Loading parts of the object graph efficiently. The naïve approach would be for Hibernate to simply load the object requested, get the ids of the objects referred to, load them and so on, with each load being a separate SQL query. Hibernate provides that strategy as an option, but a possibly significantly more efficient strategy is also available: that of executing an outer join to get the first object and the objects it refers to in one query. This is specified in the mapping file with the attribute outer-join="true" on the association elements, i.e., one-to-one, many-to-one, one-to-many and many-to-many. For detailed semantics and other parameters of this feature, see the Hibernate On Line Documentation (http://www.hibernate.org/5.html).

  • session.delete will cause the database row corresponding to a persistent (or even a detached!) object to be deleted and the object will become transient. What happens to persistent objects it refers to depends on the cascade properties of the mapping configuration for that reference.

  • session.save on a transient item will assign it an id and make the object persistent: i.e., ensure it, and any other objects it refers to, get saved to the database. This operation essentially causes an SQL INSERT to be executed. Any further calls to the mutators of the object within the transaction will cause an SQL UPDATE to be invoked.

  • session.lock and session.update are both intended for reattaching a detached object. Normally you should use session.update which triggers an SQL UPDATE to the database row with id equal to that of the object. Thus if the database and the object disagreed on the values contained, then the object overrides the database. session.lock simply reattaches the object (to the session) without checking or updating the database on the assumption that the database is still fully in synch with the object. Generally, do not use this method unless you are absolutely sure that nothing has changed the database state of the object since it was detached or if it does not matter because you will be overwriting all columns that could have changed anyway later on in the transaction.

  • session.saveOrUpdate is a convenience method that checks whether the object is transient, in which case it acts like session.save, or detached, in which case it acts like session.update.

7.  Querying

So far, our discussions above show how to fetch an object from the database if we know its id. Obviously we need more powerful querying facilities.

As described above, one of the simplest ways of querying is simply by invoking a chain of accessor methods on a persistent object:

X x = z.getY().getX();

The most critical type of access that is not covered above is finding an object (or collection of objects) when you have some information about them but not the identifier and you do not have a persistent object available that refers to them. For example, we might want to find the customers whose email address matches a given one. Here we can use the session.find method. This method always returns a (possibly empty) list of the appropriate result objects. There are three forms of find:

  1. session.find(queryString) where queryString is a String containing no "?" parameter placeholders. In this case the query takes no parameters and can be executed directly.

  2. session.find(queryString, pObject, pType) where queryString is a String containing one "?" parameter placeholder, pObject is an object of the required type to use as the indicated parameter and pType is the corresponding Hibernate type constant (e.g. Hibernate.STRING, Hibernate.DATE, etc.). In this case the object is substituted into the query in much the same way that JDBC PreparedStatement queries have their parameters substituted in.

  3. session.find(queryString, pObjectArray, pTypeArray) where queryString is a String containing one or more "?" parameter placeholders, pObjectArray is an array of objects of the required size and types to use as the indicated parameters and pTypeArray is the corresponding array of Hibernate type constants, which again must match. In this case the objects from the object array are substituted into the query in much the same way that JDBC PreparedStatement queries have their parameters substituted in.

Finally, the query string format is not SQL but HQL. HQL is Hibernate's query language, which looks very similar to SQL but where, instead of names of tables and columns, the query uses names of Java classes and properties. Of course there is a great deal more to it than that, and full details can be found in the Hibernate On Line Documentation (http://www.hibernate.org/5.html) or [BK05].

8.  Cascading Persistence

We have said, a number of times, that when an object is made persistent, the objects it refers to are also made persistent. This was an oversimplification. In the mapping files for the classes, there is an attribute, cascade, that lets us control how much, or how little, of a reference graph gets automatically persisted, deleted or updated. The values that it can be set to, and their meanings, are as follows:

  • none: no automatic action on the referenced object takes place.

  • save-update: automatically save or update the referenced object when the referencing object is saved or the transaction commits.

  • delete: automatically delete the referenced object when delete() is called on the referencing object. Note that, if the referencing object is not deleted but merely removes its reference to the referenced object, then this option will not do anything and, potentially, a garbage (or orphan) object will be left in the database.

  • delete-orphan: automatically delete any object for which the reference has been removed from the referencing object.

  • all: take the same actions as save-update and delete but not that of delete-orphan.

  • all-delete-orphan: take the same action as save-update, delete and delete-orphan.

9. Transactions

As we have seen, the standard pattern for executing a use case is to get a Session from the SessionFactory, get a Transaction from the Session, interact with the database via the Session, commit the Transaction and close the Session. Details of the exception handling have been given above. This is fine for normal, fully serialised database transactions but there are two situations when it is not fine:

  1. When using an isolation level other than fully serializable. There are 4 standard transaction isolation levels: Read Uncommitted, Read Committed, Repeatable Read and Serializable. They indicate different levels of locking strategies used within transactions and affect just how isolated one transaction really is from other transactions running simultaneously. Thus Serializable means that two transactions run as if one had completely finished (committed or rolled back) before the other had started. In practice, providing this level of isolation requires considerable resources and causes problems with scalability of applications. For this reason, most applications use a weaker form of isolation and use other strategies to overcome the consequent problems that can arise.

    The most common isolation level used, and the default obtained with a PostgreSQL JDBC connection, is Read Committed. This ensures that one transaction can not see any value which has been written by another transaction if that other transaction has not yet committed. Therefore we don't have to worry about the other transaction rolling back with the result that we would have to roll back this transaction. With this level, the isolation problems that can occur are:

    • Lost Updates: Transaction (tx) 1 reads a row, tx 2 reads the same row, tx 1 writes the row and commits, tx 2 writes the row and commits, and the value written by tx 1 is lost. This can be dealt with using versioning (see below).

    • Unrepeatable Reads: tx 1 reads a row, tx 2 writes the row and commits, tx 1 reads the same row and gets a different result from last time. The fact that Hibernate uses a cache, and, adding to that, the use of versioning, will handle this situation.

    • Phantom Read: tx 1 executes a Select query, tx 2 inserts or deletes rows in the database and commits, tx 1 executes the same query again and finds a different set of rows from what was there last time. There is very little that you can do about this except to be aware of the issue when you design your transactions.

  2. When long transactions are required. Here the problem is usually that some user interaction, which could take a considerable period of time, is required in a use case with database accesses taking place both before and after the user interaction. The issue is that we should not keep a database transaction open for a long period of time (i.e., for more than a fraction of a second), whereas a user interaction could take from minutes to hours. The solution is to break the long transaction up into two (or more) database transactions, and to use detached objects from the first transaction to carry the necessary information to the presentation layer. These objects get modified, outside any transaction, as part of the user interaction and then reattached to the second transaction to cause the necessary updates.

    If we left it at that, then all the isolation problems described above could occur, as well as the nastier one of a Dirty Read: Consider a long transaction and a normal transaction: the first contains two database transactions tx1a and tx1b (perhaps the first is to book a flight, the second to book a hotel). The other transaction, tx2 might be to order meals for the passengers on the flight (okay; not very realistic but you get the idea). Now tx1a could update a row and commit and tx2 could read it. Now tx1b decides to roll back (perhaps there were no hotels available). This only rolls back tx1b itself, because tx1a is committed and cannot be rolled back, but since this is part of a long transaction, the programmer has written explicit code to undo the effects of tx1a when tx1b rolls back (such an undo operation is normally called a compensating transaction: it may not even really undo the original transaction, it might only make up for it in some way such as authorizing a reimbursement or a voucher if a promised booking cannot be honoured). Now tx2 has executed, read data from a long transaction, which has since been undone, and has committed. To be correct, the effects of this second transaction should be undone as well but there is no way of knowing this.

    The safe solution here is only to allow database writes in the last database transaction of a long transaction, and to use versioning there. Under certain application specific circumstances, more relaxed strategies can be taken but the above rule of thumb is safe and simple and should only be ignored with very careful analysis.

9.1. Versioning

The idea of versioning is very simple and particularly easy to handle with the support Hibernate provides:

  • Add a version instance variable, and corresponding accessor and mutator methods, to your objects. An int or long is recommended although some people prefer a Timestamp or Calendar. The latter two have slightly worse performance and are not absolutely guaranteed to work correctly (one might end up with two updates made so close together in time that they have the same Timestamp value, although some operating systems ensure that this cannot be the case), but they have the advantage that you can easily see exactly when the update was made.

  • Declare the version instance variable in your mapping file for the class. This requires the following element to be added to the mapping file immediately after your id element (assuming your instance variable is called "version" and you want it in a column in the database called "version"):

    <version name="version" column="version"/>

Now whenever you make an object dirty in memory, Hibernate will update its version (in memory). Whenever the object gets flushed to disk (e.g., at the end of a transaction or because you call session.update or session.saveOrUpdate to re-attach a detached object), Hibernate will throw a StaleObjectStateException if the version number of the object on disk is not the same as it was when the object was loaded. By catching that exception, the programmer can then decide what to do about the conflict (e.g., report back to the user that the choice he/she has just made is, in fact, no longer available and ask them to make another one).
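
The check that versioning performs can be imitated in plain Java. The sketch below is a simulation of the idea only, not Hibernate's implementation: VersionedStore, Seat and StaleVersionException are hypothetical, the map stands in for a database table with a version column, and the exception plays the role of StaleObjectStateException. For simplicity the version is incremented when the row is stored.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical versioned object; each instance represents a detached copy.
final class Seat {
    final long id;
    final int version;
    final String holder;

    Seat(long id, int version, String holder) {
        this.id = id;
        this.version = version;
        this.holder = holder;
    }

    Seat withHolder(String newHolder) {
        return new Seat(id, version, newHolder);   // version unchanged until stored
    }
}

final class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) { super(message); }
}

// In-memory stand-in for a table with a version column.
final class VersionedStore {
    private final Map<Long, Seat> rows = new HashMap<Long, Seat>();

    void insert(long id, String holder) {
        rows.put(id, new Seat(id, 0, holder));
    }

    Seat load(long id) {
        return rows.get(id);
    }

    // Accept the update only if nobody has written the row since it was loaded.
    Seat update(Seat changed) {
        Seat current = rows.get(changed.id);
        if (current == null || current.version != changed.version) {
            throw new StaleVersionException("row " + changed.id + " was modified concurrently");
        }
        Seat stored = new Seat(changed.id, changed.version + 1, changed.holder);
        rows.put(changed.id, stored);
        return stored;
    }
}

final class VersioningDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert(1L, "nobody");

        Seat seenByTx1 = store.load(1L);
        Seat seenByTx2 = store.load(1L);                  // both read version 0

        store.update(seenByTx1.withHolder("alice"));      // succeeds, row is now version 1
        try {
            store.update(seenByTx2.withHolder("bob"));    // would have been a lost update
        } catch (StaleVersionException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(store.load(1L).holder);        // alice
    }
}
```

The second writer is rejected instead of silently overwriting the first, which is the lost update case described under Transactions.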

10. Mapping Classes to the Database

Hibernate objects, i.e., objects whose persistence Hibernate will manage, can be divided into two types.

  1. Entity beans are objects which have a persistent identity: i.e., usually an identifier field which is managed by Hibernate. These are typically the central business objects in an application such as User, Customer, Order etc.

  2. Value beans are objects which only exist in relationship to an entity bean. These are typically support objects for the entity objects such as Address.

The connection between entity beans and database tables and columns is described in a mapping file: usually named X.hbm.xml for class X and stored in the same directory as the compiled file X.class.

The connection between value beans and the database is usually described in the mapping file for the corresponding entity bean.

In the following sections, we will look at the details of how to specify mappings for different types of classes and associations. We only cover the basic situations. There are many variations possible and great flexibility; for all the options and more complex situations, consult the full reference documentation.

10.1.  Mapping Simple Entity Classes without Relationships

A basic mapping file is as follows:

<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
        "-//Hibernate/Hibernate Mapping DTD 2.0//EN"
        "http://hibernate.sourceforge.net/hibernate-mapping-2.0.dtd">
<hibernate-mapping package="a.b.c">                                            (1)
        <class name="User" table="user" lazy="true">                           (2)
                <id name="id" column="id" type="long">                         (3)
                        <generator class="sequence"/>
                </id>
                <version  name="version"     column="version"/>                (4)
                <property name="dateOfBirth" column="dob" type="date"/>        (5)
                <property name="username"    not-null="true" unique="true"/>   (6)
                <property name="gender"/>                                      (7)
        </class>
</hibernate-mapping>
(1) The package attribute is optional, but using it defines a default package prefix for all classes mentioned in this mapping specification.

(2) The class element specifies the class that will be persisted. The name attribute is required. The table attribute is optional (it defaults to a suitable SQL name based on the class name). The lazy attribute is optional (it defaults to false) and specifies that lazy loading of this class via proxies should take place as described previously.

(3) The id element defines what the primary key of the table should be (again, the column attribute is optional). There are a number of options for the type and for the identifier generator algorithm, but leaving it as shown here (i.e., using the database-supplied sequence or automatic number generator) is a safe option.

(4) The use of the optional version element to specify optimistic concurrency control was discussed above. Without this element, no check for lost updates or unrepeatable reads is made, so such problems may occur when reattaching detached objects (irrespective of the transaction isolation level) or when updating the database from a modified persistent object (when the isolation level is read committed or less).

(5) Types for columns do not need to be specified if they are simple and can be deduced by inspecting the class properties. However, sometimes you want to override the default or specify something a bit more sophisticated.

(6) We can specify whether a column should be unique (which will create a database constraint) and/or whether it can be null.

(7) Finally, the simplest case is where we specify nothing but the property name: everything else is taken care of by the defaults.
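For orientation, here is a minimal sketch of a Java class that the mapping above could describe. Only the types of id (long) and dateOfBirth (date) are stated in the mapping; the types chosen for version, username and gender are assumptions, and the package declaration (a.b.c in the mapping) is left out so that the sketch fits in a single file.

```java
import java.util.Date;

// Hypothetical entity class for the User mapping above. Only the types of
// id (long) and dateOfBirth (date) come from the mapping; the rest are assumed.
class User {
    private Long id;           // surrogate key, assigned when the object is saved
    private int version;       // optimistic-locking counter, maintained by Hibernate
    private Date dateOfBirth;  // stored in column "dob"
    private String username;   // not-null and unique in the database
    private String gender;

    // Hibernate instantiates the class through a no-argument constructor.
    User() {}

    User(String username) {
        this.username = username;
    }

    public Long getId() { return id; }
    private void setId(Long id) { this.id = id; }   // not meant for application code

    public int getVersion() { return version; }
    private void setVersion(int version) { this.version = version; }

    public Date getDateOfBirth() { return dateOfBirth; }
    public void setDateOfBirth(Date dateOfBirth) { this.dateOfBirth = dateOfBirth; }

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }

    public String getGender() { return gender; }
    public void setGender(String gender) { this.gender = gender; }
}

class UserDemo {
    public static void main(String[] args) {
        User u = new User("alice");
        u.setGender("F");
        // A newly created (transient) object has no identifier yet.
        System.out.println(u.getUsername() + " id=" + u.getId());
    }
}
```

The setters for id and version are private in this sketch because application code should not normally change them.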

10.2.  Mapping Value Objects within Entities

Sometimes one has properties of entity beans that are more complex than simple base types but are totally owned by the entity. The standard example is that of an Address object stored as a property of a Person object. In this case there is only one Address object for the Person object.

Figure: A Value Component

There are a number of ways this can be handled, but the simplest, which Hibernate calls components, is to store both the parent object (Person) and the child (Address) in the same database row, to construct and connect the two objects on reading this row from the database, and to coalesce the two objects and write them together when saving or updating either of them. To specify this, you use a component sub-element for the child object in the class element for the parent object instead of the usual property element. Thus, instead of something like

<property name="address" type="string"/>

one would enter:

<component name="address" class="Address">
    <property name="street"   column="user_street"/>
    <property name="postcode" column="user_postcode"/>
</component>

Some details to be aware of are:

  1. These child objects are wholly owned by their parents: you cannot have two different parents.

  2. A null child property is represented in the database by setting to null all the fields corresponding to the child object. Thus loading such a row will result in a parent object with a null child property, not a parent object with a child object whose properties are all null.

  3. Not only can you have multiple components in a class, but you can also have multiple components of the same (child) class in a class: simply make sure that the component names are different and that the database field names (the column attributes) are different for the different components.

    Normally, the child object has no way to reference its parent object. However, if you want a property of the child object to refer to its parent, add, inside the component element, an element of the form

    <parent name="person"/>

    to make a person property of the child object refer to its parent object.
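On the Java side, a component is just an ordinary class without an identifier of its own. The following is a minimal sketch of what Person and Address might look like; the field types, and the decision to maintain the back-reference inside setAddress, are assumptions made for illustration.

```java
// Hypothetical value class: no identifier and no table of its own.
class Address {
    private String street;    // column "user_street"
    private String postcode;  // column "user_postcode"
    private Person person;    // back-reference, as declared by <parent name="person"/>

    Address() {}

    Address(String street, String postcode) {
        this.street = street;
        this.postcode = postcode;
    }

    public String getStreet() { return street; }
    public void setStreet(String street) { this.street = street; }

    public String getPostcode() { return postcode; }
    public void setPostcode(String postcode) { this.postcode = postcode; }

    public Person getPerson() { return person; }
    void setPerson(Person person) { this.person = person; }
}

// Hypothetical entity class that owns the component.
class Person {
    private Long id;
    private Address address;  // null when all of the component's columns are null

    public Address getAddress() { return address; }

    // Assumed design choice: keep the back-reference in step in application code.
    public void setAddress(Address address) {
        this.address = address;
        if (address != null) {
            address.setPerson(this);
        }
    }
}

class ComponentDemo {
    public static void main(String[] args) {
        Person p = new Person();
        p.setAddress(new Address("1 High Street", "AB1 2CD"));
        System.out.println(p.getAddress().getStreet());
    }
}
```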

10.3.  Mapping Entities with Inheritance

Again, there are a number of ways Hibernate can handle inheritance. These are based on the standard techniques for reducing generalisation hierarchies in entity-relationship diagrams [BCN91].

The simplest is to use one table for the whole hierarchy. With this design, each row of the table can hold an object of any type from the hierarchy. There is one column for each of the properties in the union of the sets of properties of all the classes in the hierarchy and there is one discriminator column which contains a value (usually of type string, character or integer) used to tell which actual type of object is stored in this particular row. One normally does not make this discriminator a property of the class: it is used only by Hibernate to record and detect the type of the object that a row represents.

Figure: An Inheritance Class Hierarchy

<class name="Person" table="people" discriminator-value="P">
    <id name="id" column="id" type="long">
        <generator class="sequence"/>
    </id>
    <discriminator column="subclass" type="character"/>
    <version  name="version"         column="version"/>
    <property name="dateOfBirth"     column="dob" type="date"/>
    <property name="name"/>
    <property name="gender"/>
    <subclass name="Lecturer" discriminator-value="L">
        <property name="office" type="string"/>
        <property name="telephone" type="string"/>
    </subclass>
    <subclass name="Student" discriminator-value="D">
        <property name="studentID" type="integer"/>
    </subclass>
</class>

Here one may not specify any of the subclass fields as not null because the corresponding column will be null in the table for any object of the hierarchy which is not of the subclass that contains the relevant property for that column.
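To make the role of the discriminator column concrete, here is a hand-written sketch (not Hibernate's own code) of how a single row could be turned into an object of the right class. The row is modelled as a plain Map, the class fields are package-visible for brevity, and the helper name fromRow is hypothetical; the discriminator values "P", "L" and "D" are the ones from the mapping above.

```java
import java.util.HashMap;
import java.util.Map;

class Person { String name; }
class Lecturer extends Person { String office; String telephone; }
class Student extends Person { Integer studentID; }

class SingleTableSketch {
    // Illustration only: choose the concrete class from the discriminator
    // value, then copy the columns that apply to that class.
    static Person fromRow(Map<String, Object> row) {
        char kind = ((Character) row.get("subclass")).charValue();
        Person p;
        switch (kind) {
            case 'L': {
                Lecturer l = new Lecturer();
                l.office = (String) row.get("office");
                l.telephone = (String) row.get("telephone");
                p = l;
                break;
            }
            case 'D': {
                Student s = new Student();
                s.studentID = (Integer) row.get("studentID");
                p = s;
                break;
            }
            case 'P':
                p = new Person();
                break;
            default:
                throw new IllegalArgumentException("unknown discriminator: " + kind);
        }
        p.name = (String) row.get("name");
        return p;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<String, Object>();
        row.put("subclass", Character.valueOf('D'));
        row.put("name", "Ann");
        row.put("studentID", Integer.valueOf(42));
        // office and telephone are absent (null) because this row holds a Student.
        Person p = fromRow(row);
        System.out.println(p.getClass().getSimpleName() + " " + p.name);
    }
}
```

The columns that belong to other subclasses are simply null in such a row, which is why they cannot be declared not-null.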

10.4.  Many-to-One, Unidirectional Associations

This corresponds to the standard Java reference to one object from another.

Figure: A Many-to-One, Unidirectional Association

In the diagram above, we represent the relationship between students and their thesis supervisors. In this design, a student can have no more than one supervisor but may not (yet) have any. However, a lecturer may have any number, including zero, of students to supervise. Furthermore, we only allow a one directional link: Student has a property (say getSupervisor/setSupervisor) but there is no direct way, starting with a Lecturer object, to find the students that the lecturer supervises.

If we start with the simple, non-related, base entity mapping files for Student and Lecturer, we add this association by adding the following, as a sub-element of the class element, to the mapping file for Student:

<many-to-one name="supervisor" column="supervisor"/>

This element acts very much like a normal property element in that it defines the mapping between the supervisor property of Student and the column in the students table. However, it also sets up the relationship so that, after getting a Student object from the database, if we use the supervisor accessor of that Student object, we will get the corresponding Lecturer object (or a proxy thereof if we have enabled lazy loading of the Lecturer objects). Finally, it ensures that the underlying database is created with a foreign key constraint that the supervisor column is a foreign key into the lecturers table.

As things stand, there is now a question of what you want the cascade behaviour of the relationship to be (see the section on cascade above). Without the optional cascade attribute on the many-to-one element, the Lecturer object on the other end of the association is ignored when the Student object is saved, updated or deleted, or when its supervisor property is reset away from it. Certainly we would not want the lecturer to be removed from the database when the student is deleted or when the student no longer has that lecturer as his or her supervisor, so none of the delete or all options are appropriate. But what about save-update? There are two scenarios under which this might have an effect:

  1. You create a new (transient) lecturer and make a persistent student refer to it. In fact, for this particular object design, one would never do such a thing: the obvious semantics of the situation dictate that you cannot just invent new lecturers on demand, so the lecturer would always have to exist in the database before the student's supervisor property is set to that lecturer. Since the scenario will never arise, this is a vote neither for nor against using the save-update option.

  2. The Student and associated Lecturer objects were detached, and you now reattach the Student object. You then need the save-update option if you want the Lecturer object to be reattached automatically. Without it, you need to reattach the Lecturer yourself, which is an easy task to overlook and therefore a source of bugs. This, therefore, is a vote for setting cascade="save-update".

Note that you can specify unique="true" as an attribute of the many-to-one element. This has the effect of disallowing the possibility of having two student rows with the same supervisor values, i.e., turning the "*" on the Student side of the class diagram into a "0..1" or limiting each lecturer to having at most one supervisee. Similarly, specifying not-null="true" adds the requirement that every student must have a valid supervisor, i.e., it changes the "0..1" on the Lecturer side of the diagram to a "1".
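On the Java side, the unidirectional association is nothing more than a reference held by Student. A minimal sketch follows; the constructors and the name field are assumptions added for illustration.

```java
// Hypothetical entity on the "one" side: it knows nothing about its students.
class Lecturer {
    private Long id;
    private String name;

    Lecturer() {}
    Lecturer(String name) { this.name = name; }

    public String getName() { return name; }
}

// Hypothetical entity on the "many" side: it holds the reference.
class Student {
    private Long id;
    private String name;
    private Lecturer supervisor;  // may be null: a student need not have a supervisor yet

    Student() {}
    Student(String name) { this.name = name; }

    public String getName() { return name; }
    public Lecturer getSupervisor() { return supervisor; }
    public void setSupervisor(Lecturer supervisor) { this.supervisor = supervisor; }
}

class ManyToOneDemo {
    public static void main(String[] args) {
        Lecturer lect = new Lecturer("Gordon Brown");
        Student a = new Student("Tony Blair");
        Student b = new Student("Michael Howard");
        // Several students may share one supervisor.
        a.setSupervisor(lect);
        b.setSupervisor(lect);
        System.out.println(a.getSupervisor().getName());
    }
}
```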

10.5.  One-to-Many, Unidirectional Associations

This relationship is essentially the same as the last one, but now we choose the opposite direction for navigating the connection. Thus our Lecturer object now has a property which is a collection of Student objects, while the Student objects have no properties which refer to their supervising Lecturer.

Warning

For reasons described below, we would (almost) never use such an association: it is inefficient and there is almost no overhead in converting it to a much more efficient bidirectional association. Nonetheless, it is useful to discuss this case as a first step towards the bidirectional version of the association.

Figure: A One-to-Many, Unidirectional Association

As far as the database is concerned, there is no difference between this and the unidirectional many-to-one association: there will still be a single column in the table holding the Student objects that contains a foreign key into the table holding the Lecturer objects.

Now that entities are being stored in collections, it becomes critical that you have appropriately implemented equals and hashCode methods for those entities. In particular, you should ensure that these methods are independent of the generated surrogate keys and that trivial changes to the object do not affect the methods while the objects are in the collections.
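One common way to satisfy these requirements is to base equals and hashCode on a business key rather than on the generated identifier. The sketch below assumes that a student's registration number (regNo, taken here to be a String) is such a key and that it does not change while the object is in a collection; both points are assumptions about the domain.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical Student entity whose identity is its registration number.
class Student {
    private Long id;       // generated surrogate key: deliberately not used below
    private String name;
    private String regNo;  // assumed business key, fixed once assigned

    Student() {}

    Student(String name, String regNo) {
        this.name = name;
        this.regNo = regNo;
    }

    public Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    public String getRegNo() { return regNo; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        // instanceof, not getClass(), so that a subclass instance can still match
        if (!(other instanceof Student)) return false;
        Student that = (Student) other;
        return regNo != null && regNo.equals(that.getRegNo());
    }

    @Override
    public int hashCode() {
        return regNo == null ? 0 : regNo.hashCode();
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        Set<Student> advisees = new HashSet<Student>();
        Student s = new Student("Tony Blair", "R001");
        advisees.add(s);
        // Simulate the key being assigned on save: the hash code is unaffected.
        s.setId(Long.valueOf(7L));
        System.out.println(advisees.contains(s));
    }
}
```

Had hashCode depended on id, the object would have been filed under one hash value before saving and looked up under another afterwards.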

The simplest collection, for our purposes, is a Set. To create the association, we add a Set valued property to Lecturer:

<set name="advisees" cascade="save-update" lazy="true">
    <key column="lecturer_id"/>
    <one-to-many class="Student"/>
</set>

Here, we define a property of Lecturer which is Set valued. The name of the property is advisees. This property captures a one-to-many association to the Student class, and this association is implemented in the database as a foreign key to the table holding Lecturer objects, stored in the column lecturer_id of the table holding Student objects.

There are a number of constraints imposed by the use of this one-to-many association, which arise from the fact that it is represented by this "reverse" link from the contained object side of the association:

  1. From a Java point of view, we could potentially have two different Lecturer objects, both of which have the same Student object in their container. However, this is not possible for a one-to-many association because, in the database, each Student row refers to the single Lecturer row which contains it. If you want the true Java semantics, you have to represent the association as a many-to-many one.

  2. You cannot have the same object multiple times in the same collection. This is obvious when the collection property is of type Set, but one could use other types, such as List. However, the implementation of the association by the reverse foreign key makes this impossible. Again, a many-to-many association can provide the appropriate semantics.

Finally, there is the question of why one-to-many associations between entities cause problems. Consider the following code:

tx = session.beginTransaction();
Lecturer lect = new Lecturer("Gordon Brown") ;
lect.getAdvisees().add(new Student("Tony Blair")) ;
lect.getAdvisees().add(new Student("Michael Howard")) ;
session.save(lect);
tx.commit();

Note that the association belongs to the Lecturer class (as it is defined in Lecturer's mapping file). This means that adding a student to a lecturer's advisees is considered an operation on a lecturer, not on a student. Thus the SQL statements generated for the above code would include an insert for the lecturer object, together with an insert for each of the two connected student objects (because the students are referred to by the lecturer and we have put the cascade="save-update" declaration in Lecturer's mapping file). But because the association does not belong to the students, saving the students would not set the foreign key value to the advising lecturer. Thus there would be two extra update statements to add the lecturer's foreign key value to the student records. These two extra update statements are not just an efficiency problem: if every student should have a supervisor, then we would like to add the not-null="true" attribute to the key element in the mapping file for the association. However, this would cause errors, as the above sequence of inserts and updates does insert nulls (if only to update them immediately) where they should never occur.

The solution is to only create such one-to-many associations as the inverse end of a bidirectional many-to-one association. This gives ownership of the association to the Student end and, as we see below, leads to the foreign key being created as part of the initial insert of the Student record instead of after it as a consequence of the Lecturer insert.

10.6.  Many-to-one, bidirectional Associations

In this case we allow navigation in both directions between the two classes. On the Many side it is a standard Java reference; on the One side it is a collection. However, the two associations are not independent of each other: one is the inverse of the other, and the same foreign key column is used for both. Thus our Lecturer object now has a property which is a collection of Student objects, while each Student object has a property which refers to the Student's supervising Lecturer.

Figure: A Many-to-One, bidirectional Association

To achieve this, we start by using the many-to-one element as before in the mapping file for the Student class, and the set element as before in the mapping file for the Lecturer class, ensuring that both associations use the same column in Student's table to encode the association. Then we add a new attribute, inverse="true", to the set element in Lecturer's mapping file. Without this, adding a new Student as an advisee to a Lecturer would trigger Hibernate to set the foreign key column of the Student table twice: once for each association that has been changed. The inverse attribute tells Hibernate that Student owns the association and that Hibernate should not trigger updates of the foreign key column when it changes on the Lecturer side.

Thus the mapping file for Lecturer looks like this:

<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
          "-//Hibernate/Hibernate Mapping DTD 2.0//EN"
          "http://hibernate.sourceforge.net/hibernate-mapping-2.0.dtd">
<hibernate-mapping>
    <class name="Lecturer" table="lecturers">
        <id name="id" column="lecturer_id">
            <generator class="sequence"/>
        </id>
        <version name="version" column="version"/>
        <property name="name" column="name"/>
        <set name="advisees" inverse="true" cascade="save-update" lazy="true">
            <key column="lecturer_id"/>
            <one-to-many class="Student"/>
        </set>
    </class>
</hibernate-mapping>

The mapping file for Student is as follows:

<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
          "-//Hibernate/Hibernate Mapping DTD 2.0//EN"
          "http://hibernate.sourceforge.net/hibernate-mapping-2.0.dtd">
<hibernate-mapping>
    <class name="Student" table="students">
        <id name="id" column="student_id">
            <generator class="sequence"/>
        </id>
        <version name="version" column="version"/> 
        <property name="name" column="name"/>
        <property name="regNo" column="reg_no"/>
        <many-to-one name="advisor" column="lecturer_id" cascade="save-update"/>
    </class>
</hibernate-mapping>

Now all the programmer has to do is to ensure that, when the Lecturer's advisees property is changed, the corresponding changes are made to the appropriate Student's advisor property. So long as both are done together, the Java object graph will be correct and the correct update on disk will be made as well. Furthermore, since the association belongs to the Student, there will never be an insert of a Student record with a null Lecturer foreign key if the Student has an advisor, thus avoiding violations of the not-null constraint. To ensure that these updates are made together, it is usual to add some convenience methods: in Lecturer, change the getAdvisees() and setAdvisees() methods to private and add a convenience method to update the object graph correctly when adding a new Student advisee to a Lecturer:

    public void addAdvisee(Student st)
    {
        Lecturer oldAdvisor = st.getAdvisor() ;
        if (oldAdvisor != this)
        {
            if (oldAdvisor != null)
                oldAdvisor.getAdvisees().remove(st) ;
            st.setAdvisor(this);
            advisees.add(st) ;
        }
    }

Note how we are careful to correctly handle removal of a Student from a previous advising Lecturer before adding it to this one. Whether you need to do something similar for your code will depend on your detailed design.
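For completeness, here is a self-contained sketch of the two classes with the convenience method in place, so that the behaviour can be tried without a database. The removeAdvisee method and the read-only getAdviseesView accessor are hypothetical additions, and Student relies on default object identity here purely to keep the example short.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Student {
    private String name;
    private Lecturer advisor;

    Student(String name) { this.name = name; }

    public Lecturer getAdvisor() { return advisor; }
    // Package-visible: callers are expected to go through Lecturer.addAdvisee.
    void setAdvisor(Lecturer advisor) { this.advisor = advisor; }
}

class Lecturer {
    private String name;
    private Set<Student> advisees = new HashSet<Student>();

    Lecturer(String name) { this.name = name; }

    private Set<Student> getAdvisees() { return advisees; }

    // Hypothetical read-only view, so callers cannot bypass the helpers.
    public Set<Student> getAdviseesView() {
        return Collections.unmodifiableSet(advisees);
    }

    // Keeps both ends of the association consistent.
    public void addAdvisee(Student st) {
        Lecturer oldAdvisor = st.getAdvisor();
        if (oldAdvisor != this) {
            if (oldAdvisor != null) {
                oldAdvisor.getAdvisees().remove(st);
            }
            st.setAdvisor(this);
            advisees.add(st);
        }
    }

    // Hypothetical counterpart: detaches the student from this lecturer.
    public void removeAdvisee(Student st) {
        if (advisees.remove(st)) {
            st.setAdvisor(null);
        }
    }
}

class BidirectionalDemo {
    public static void main(String[] args) {
        Lecturer a = new Lecturer("Gordon Brown");
        Lecturer b = new Lecturer("Michael Howard");
        Student s = new Student("Tony Blair");
        a.addAdvisee(s);
        b.addAdvisee(s);  // moves s from a to b
        System.out.println(a.getAdviseesView().size() + " " + b.getAdviseesView().size());
    }
}
```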

If we have a true composition relationship, i.e., a parent-child relationship where if the parent gets deleted then the child should also be deleted etc., then we should change the cascade attribute on the set element in the Lecturer mapping file to be all-delete-orphan.

11. Patterns

12. Going Further

There is still plenty more to learn about Hibernate. There are one-to-one and many-to-many associations, value (as opposed to entity) collections, outer-join and batch fetching, Iterate queries, Criteria queries and the whole of the HQL query language, not to mention explicit SQL queries. There are alternative strategies for inheritance hierarchy mapping and polymorphism handling. There are user-defined data types and mappings, Interceptors, caching and all the Hibernate related tools. All of this and more are discussed on the Hibernate web site and in the book.

References

[BCN91] Carlo Batini, Stefano Ceri, and Shamkant Navathe. Conceptual Database Design. Benjamin Cummings. 1991. 0805302441.

[BK05] Christian Bauer and Gavin King. Hibernate in Action. Manning Publications Co. 2005. 1932394-15-X.