Software System Components 2 - Year 2, Semester 2, Week 9
Concurrency - Dr Steve Vickers
3: Thread examples
This week we look at some particular aspects and examples of threads.
At the end of this week's lectures you should know -
- the idea of thread safety;
- how to use collections safely with threads;
- how to use Swing safely with threads;
- the existence of some useful thread algorithms in the Java
library.
Examples from the library
Now you have seen the basic tools - threads, runnables, locks and
conditions - for dealing with concurrency. It is possible to use these
in many different kinds of thread interaction. In older courses you
would have been taught how to implement some standard examples, but in
fact in the Java 1.5 library this is already done for you. In Java, it is much better to use those library
utilities.
- It saves you work.
- They are standardized, which makes your code easier to understand
for other programmers.
- They have been very carefully written to be safe and efficient;
and any remaining errors can be corrected in future releases.
If you move on to a language other than Java you may have to write your
own implementations; but you should not find this so hard if you first
get used to the standard patterns. Also, the Java library source code
is readable, so you can see what algorithms are used there.
When I describe these utilities, I shall also mention some new features
of Java 1.5 that you may not have seen before.
Producer-consumer buffer
This is a classic pattern which you can find described in many texts
about concurrency.
Imagine a program where a number of threads (the producers) are
producing items, and a number of other threads (the consumers) are
taking those items and processing them in some way. Each consumer can
process any item - it doesn't matter which thread produced it.
When an item has been produced, it is stored temporarily in a buffer,
until a consumer is available to process it.
All these threads share a common resource: the buffer. When a producer
stores an item in the buffer, or when a consumer removes it, they must
have exclusive access to the buffer. This is because the buffer must
store various pieces of information to stay, for instance, how many
items it holds, where the next one to read is, etc. The code that
stores or removes an item is a critical region, since if it interfered
with part way through, the buffer will probably be in an inconsistent
state (the buffer has invariants
that must be maintained). Hence the buffer must be protected with a
lock.
However, there are also two conditions
involved. Even if a producer gets the lock, it cannot store an item if
the buffer is full - it must wait for an item to be consumed. And if a
consumer gets the lock, it cannot remove an item if the buffer is empty
- it must wait for an item to be produced.
Hence the standard pattern here is to have some kind of collection for
the buffer, protected by a lock with two conditions. (Exercise: try
this. It is not too hard. One solution is to use an array for the
buffer, with extra fields to say how many items it has and where the
next one is to be stored. You must imagine the array arranged in a
circle.)
In Java this can be done using the interface java.util.concurrent.BlockingQueue<E>.
This is a new part of the collections framework.
What does <E> mean? In older versions of Java,
the elements of a collection were treated as of type Object. That meant that when
you read an element you had to cast it to the type it actually was. In
Java 1.5 you can tell the compiler what type to use for the elements of
a collection, and it can then check that you only store elements of the
right type. The interface BlockingQueue<E>
is a generic type, which
means it covers a range of more specific possibilities specified by
replacing E (which is
like a type parameter) by an actual type. For example, BlockingQueue<String> is
the type of blocking queue whose elements are all strings.
Other collection types, such as ArrayList,
have also been made generic. This is a very useful feature of Java 1.5,
and you should use it if you can. However, if you are unsure about
generic types you can still use collection types in the old way, so
that they are collections of Objects.
API for BlockingQueue
The API says -
Interface BlockingQueue<E>
- Type Parameters:
E - the type of elements held in this
collection
|
The type parameter E will
also be used in the method declarations. The most useful are put and take.
void put (E o) - adds the
specified element o
(which must be of type E)
to the queue, waiting if necessary for space to become available. o must not be null. You cannot
store null elements in the queue.
Notice how the type parameter is used. For example, if your blocking
queue has specific type BlockingQueue<String>,
then put is really void put
(String o).
E take() - gets the head
of the queue, removing it from the queue. Waits if necessary for an
element to be available.
Note - Both put and take can throw InterruptedException, for the
same reason as sleep and await.
BlockingQueue itself is an
interface, and so you cannot instantiate it. There are various classes
implementing it. The one that most closely matches the classic
producer-consumer buffer is java.util.concurrent.ArrayBlockingQueue<E>
(generic, again!). Its constructor can have an integer parameter to say
how many elements the buffer can hold.
Example: producer-consumer
The example is in four classes NamedInt, IntProducer,
IntConsumer
and ProdConTest. NamedInt is for objects that
store a string and an int. These pairs are the items that are produced
and consumed. An IntProducer
is a run object that repeatedly reads integers from an input frame,
then pairs them with its own ID and stores them in a buffer. An IntConsumer is a run object
that repeatedly takes NamedInts
from the buffer, and then displays messages showing the producer and
consumer IDs, the integer n, and the nth Fibonacci number. The
Fibonacci algorithm used is very inefficient, so the time taken by the
consumers can vary a lot. The main program (in ProdConTest) creates a buffer,
of type BlockingQueue<NamedInt>,
and starts two producer threads (1 and 2) and two consumer threads (A
and B) using it.
When you run this, you get two
input frames, for the two producers. Actually, that's fairly pointless,
since you cannot possibly type in more than one at a time. However, you
will see that they both work. They are both sending items to the same
buffer, so the choice of consumer is independent of the choice of
producer.
Normally the consumer threads alternate, but if you give one of them a
large n (say 42 or more) the other consumer will take a succession of
small n's. If you give two large n's then you can fill up the buffer
because both consumers are busy.
The example uses an array blocking queue of capacity 5, created in the
main method by
BlockingQueue<NamedInt>
queue = new ArrayBlockingQueue<NamedInt> (5);
Algorithm for BlockingQueue
As I said, at present you don't actually need to know how to implement
the
algorithm for these buffers, since the BlockingQueue classes do that
for you. However, it is a very standard algorithm and you may one day
have need to implement it yourself. If you are interested you can find
a short
description on the API page for java.util.concurrent.locks.Condition.
It uses one lock and two conditions, created as
final Lock lock = new ReentrantLock();
final Condition notFull = lock.newCondition();
final Condition notEmpty = lock.newCondition();
For a more detailed description, you can look at the actual code for ArrayBlockingQueue. Find that
in the projects pane of netBeans under Libraries>>JDK 1.5
(Default)>>rt.jar>>java.util.concurrent. You will find the
basic operations are in the private methods extract and insert. The public methods such
as take and put have been carefully
optimised. For example, they use a method signal instead of signalAll. The difference is
that signal only wakes up one thread waiting on a condition, instead of
all of them. This is a little bit more efficient but requires more
careful coding in case the thread woken up has been interrupted and so
no longer wants to use the lock - so it has to signal the next thread.
Other examples from the library
Other useful classes can be found in the java.util.concurrent package.
You do not need to know them all now, but be aware of their existence
and try to find out more about them as you develop your programming
skills.
The class java.util.concurrent.Semaphore
is a classic construct (invented by E. Dijkstra) described in lots of
text books. The best way to think of a semaphore is that it protects a
shared resource, in the same way as a lock does, but that there may be
several units of that resource available. The semaphore controls
permissions to access a unit of that resource.
The class
java.util.concurrent.CyclicBarrier is for situations where a
number of threads, at a certain point, have to wait for all the others
to catch up before they carry on.
Thread safety
Last week we saw different versions of a bank account class. The later
versions were thread safe.
The instances could safely be used by different threads running
concurrently, because access to the balance field was protected with
locks by all the methods.
The first version was not thread safe. If different threads, running
concurrently, accessed a bank account instance then they might
interfere with each other because of race conditions.
You might conclude that every class should be thread safe and protect
its fields with locks. However, this is not always necessary. The
problem arises only when more than one thread may share access to
something. If you know for sure that only one thread needs access to
that resource, then it is simpler and more efficient not to enforce
thread safety.
Many classes in the Java library are not thread
safe (though the API does not always make the situation clear). We
shall
look at two particular examples, collections and Swing.
Note that the problem arises only when updates are possible for a
class. If a class is immutable,
then it is automatically thread safe. Immutable means "cannot be
changed". So the field values cannot be changed. What is more, if the
field values are references to other objects, then those objects cannot
be changed in any way either. String
is an example of an immutable class. So is NamedInt, from an earlier
example.
Collections
In general, the collection classes are not thread safe although they
are thread compatible - you
can make them thread safe by using them with locks. Hence if you don't
need the thread safety, you can use the simpler implementation that is
not thread safe. We shall look at ArrayList
as an example.
There are three levels of problem for thread interaction on array
lists. Each has a different kind of critical region.
- When using the methods get
and set to read and write
elements, the array list behaves like an array. There are the usual
interference problems for threads reading and writing the same
elements. The critical region, just as for the bank accounts, is likely
to be a get-calculate-set sequence. If you have a get on its own, it is best to
make that a critical region because of the memory model rules.
- The methods add and
remove change the
structure of the array list. These can interfere in a worse way, since
potentially the interference can leave the array list in an
inconsistent state. The critical region here is just the add or remove.
- If one thread gets the size of an array list, and another thread
calls add or remove, the size will no longer
be correct. Similarly, indexes into the array list, or iterators over
it, will be interfered with by other threads calling add or remove. The critical region is
the piece of code where you call size()
and then use your knowledge of the result; or where you use your
knowledge of which elements have which indexes; or where you get the
iterator and then use it.
In general, to make a collection object thread safe you must protect it
with a lock, using lock; try{critical region}finally{unlock}.
Wrappers
A "wrapper" class for array lists (or other kinds of collections) is
one that has the array list as one of its fields, but also has the
necessary locks and redefines the methods to be thread safe, as far as
possible. For example, its add method would first get the lock and then
call add on the array list.
The class java.util.Collections
provides some useful factory methods for creating instances of wrapper
classes.
Collections.synchronizedList
takes as parameter any object o
(such as an array list) that compatible with the List interface, and returns
another object that is compatible with List, and stores its data in o, but whose methods are thread
safe.
An example given in the API for synchronizedList
is
List
list = Collections.synchronizedList(new ArrayList());
You can also do this using generic types, e.g.
List<String>
list = Collections.synchronizedList<String>(new
ArrayList<String>(...));
Each method on list now works the same way as it would have done for
the unwrapped array list, but in a thread safe way, taking ownership of
a lock.
Getting the lock on a wrapper object
That deals with simple method calls, but you also need to know how to
get the lock for larger critical regions such as using an iterator.
Again, there is an example in the API.
synchronized(list) {
Iterator i = list.iterator(); // Must be in synchronized block
while (i.hasNext())
foo(i.next());
}
To understand this, you must know that the lock used to protect the
synchronized list is of the old kind, before the new java.util.concurrent.locks
features of Java 1.5. Every object has an old-style lock associated
with it. You never call lock() or unlock() with those, but instead say
synchronized( object ) {
critical
region
}
and this means
get old-style lock on object;
try {
critical
region
} finally {
release
old-style lock on object
}
Example
The example is in class ArrayListTest.
It creates an array list of Integers, and starts three threads on it.
One adds elements, one removes them, and one prints all the elements.
Because ArrayList is not thread safe, there are various exceptions that
can arise.
IndexOutOfBoundsException
arises in the thread that adds elements, because that thread also calls
set on the last element. If there is a remove between the add and the
set, then that last element no longer exists.
ConcurrentModificationException
arises in the thread that prints elements, which uses an iterator.
Iterators can often detect if other threads have added or removed
elements, and then the iterator throws an immediate exception. (This is
not reliable for production versions of your software, but can be
useful for debugging.)
The thread safe version is ArrayTestList1.
It uses a wrapped array list, and the run methods have synchronized
blocks.
Java 1.5 notes
The example uses some new features of Java 1.5.
1. Integer is (and has
always been) a class for objects with a single int field, and so converts the
primitive type int into a
reference type Integer.
It is useful in collections, because the elements of a collection have
to be of a reference type. Java 1.5 now has automatic conversion
between Integer and int - this is called "automatic
boxing and unboxing" -, and you can see this used in the example.
2. Java 1.5 has a new form of for loop that makes it much easier to use
iterators. In the example, the printing thread Lister has a for loop
for
(Integer n: list) {
System.out.print(n + ", ");
}
Read the colon ":" here as in,
so "for Integer n in list". The new for loop is equivalent to
Iterator<Integer>
it = list.iterator();
while (it.hasNext()) {
System.out.print(it.next() + "");
}
Thread safe collections
You have already seen BlockingQueue,
which is thread safe. There are some others in java.util.concurrent, such as ConcurrentLinkedQueue and ConcurrentHashMap.
Swing
Swing is not thread safe. This
is
deliberate. If Swing components used locks then the event dispatch
thread might sometimes be blocked, and that risks freezing the GUI. In
fact when people have designed thread safe GUIs, they have often found
that it is easy to deadlock them.
Instead, the general rule is to manipulate Swing components only on the
event dispatch thread.
The repaint-paint pattern is a good example of this. Repaint can be
called on any thread, but it doesn't directly do the repainting.
Instead a paint is put on the queue of jobs for the event dispatch
thread to execute.
The method javax.swing.JComponent.revalidate()
is similar. It puts on the event dispatch queue a job to layout this
component and its containers.
There are a few exceptions to this rule. setText can normally be called
from any thread. Also, usually, if a component has not yet been set
visible or
added to a visible container, then any thread (typically the main
thread when it creates objects) can work safely on it. [Though Sun now
say even this is not completely reliable. Their tutorial page at http://java.sun.com/docs/books/tutorial/uiswing/misc/threads.html
recommends creating Swing components on the event dispatch thread.]
There are thus two opposing considerations.
- Manipulations of Swing components should be on the event dispatch
thread.
- Tasks that take a long time should
not be on the event dispatch thread (because otherwise they
delay the GUI).
There are many situations when a lengthy computation involves Swing
components. The thread that is performing the lengthy computation
therefore needs to be able to request the event dispatch thread to do
the Swing manipulations. This is normally done using a static method invokeLater in class javax.swing.SwingUtilities. To
use this, first create a
run-object (its class implements Runnable)
whose run method does the manipulation, and then use it as the
parameter for a
call of invokeLater. This
will put the task
into the queue of jobs for the event dispatch thread.
An alternative method to use is invokeAndWait.
This is similar to invokeLater,
but the thread calling it waits until the event dispatch thread has
done the task.
Some examples of code can be found on the Sun tutorial page mentioned
above. Here are some typical situations.
- When setting up the GUI you may need to load big graphics files
from memory or even over the network. This could take a long time and
so should be done on a separate thread. But when the files have been
loaded they need to be displayed in the GUI, and that must be done on
the event dispatch thread. For this your separate loading thread would
use invokeLater.
- While a big file is loading (whether for GUI graphics or for any
other reason) you will often want to display a progress bar on the
screen to show how much remains to be done. The progress bar is a GUI
component and should be updated in the event dispatch thread. invokeLater is one way to do
that. Another is to have a timer object (javax.swing.Timer) that
periodically finds out the amount loaded, and then updates the progress
bar. A timer object does not need to use invokeLater, because its actionPerformed method is
executed on the event dispatch thread.
SwingWorker
A common pattern is for a lengthy computation on one thread to be
followed up with a short computation on the event dispatch thread. This
has been made into an abstract class SwingWorker. It is not in
the libraries, but is available from Sun. To use it you extend it,
overriding two methods construct
and finished.
public Object construct()
is executed on a thread specially created for each SwingWorker instance. The
object it returns can be accessed by a method get().
public void finished() is
executed on the event dispatch thread after construct has finished.
When you have created a SwingWorker
instance, you must then call start()
on it to start its thread executing construct().
Summary
- Producer-consumer buffers can be implemented with locks and
conditions, but in Java it is easiest to use BlockingQueue.
- It is possible to see the code of BlockingQueue.
- There are various other convenient classes in java.util.concurrent.
- Most library classes are not thread
safe - they do not allow for concurrent access by different
threads.
- Collection classes are most not thread safe, but can be made
thread safe by using wrappers and synchronized
blocks.
- Swing is not thread safe. Swing components should all be
manipulated on the event dispatch thread. invokeLater and invokeAndWait help with this.
- Java 1.5 has generic types, for loops using iterators, and
automatic boxing and unboxing.