Recommended reading list for Java Developers/Software Engineers (December 2004)

There are some books you come across that are destined to become all-time classics, well worth reading and recommending to others. This is a list of books that I’ve read and that I recommend to all software developers. The list has a strong Java bias since, after all, I’m a Java developer myself, but it is a mix of technical and non-technical titles.

For the past several years of my career I neglected to read project management/software development process type books (after I finished my Computing degree anyway), but there is so much to be learnt from others’ successes and mistakes, regardless of whether or not you intend to become a team lead or expect to find yourself in a less technical role. After all, writing the code is only part of the process; a good understanding of tried and true practices, theories and software development processes leads (in my opinion) to a much stronger, well-rounded developer.

As Steve Maguire points out in his book, Debugging the Development Process, why spend years of your career learning by trial and error when you can pick up a good book and learn from the insights of others who have already ‘been there, done that’?

In no particular order, here’s my recommended book list:

The Mythical Man Month – Fred Brooks

Written over 20 years ago, this book is a collection of essays on typical problems encountered on software development projects, and what is scary is that the same problems still happen today. Although some of the examples refer to IBM mainframe development, the lessons learnt still apply to software development today.

Design Patterns – Gamma et al (GOF – Gang of Four)

The Bible of Design Patterns. Why reinvent the wheel when there is already a solution/approach to a commonly occurring problem?

Refactoring – Martin Fowler

Encourages continual review of existing code, including during development itself, to identify areas for improvement. Identifies typical problem areas (‘smells’..!) that occur in code, and suggests ways of reworking the code to make it cleaner and easier to read and maintain.

This is one book I’ve known about for a while but never had time to sit down and read from cover to cover, and wish I had done much earlier…

UML Distilled – Martin Fowler

If you’re going to read one book on using UML notation for analysis and design then make it this one. Short and to the point, it covers the main aspects of each of the diagram types and enough notation to get you started.

Thinking in Java – Bruce Eckel

This is the best written and most comprehensive book on Java and its language features that I have read so far. It goes into far more detail explaining concepts and the rationale for language features than I have come across anywhere else. If you are just starting out with Java, or even if you’ve been developing with Java for years, this is an excellent book.

Java in a Nutshell – David Flanagan

This was one of my most referenced books when I first started out as a Java developer, and there were several very well-worn copies in the office at one time. As a reference it really is no more than the API in book form, which is great if you’d rather flick through pages than read API info on the screen. Even more useful are the very short and condensed introductory chapters, which give a very quick high-level intro to Java and most of the language’s core features. This is a great reference for language-specific details like operator precedence and the ranges of the primitive data types, info that is not too easily accessible elsewhere.

This book doesn’t seem to be as popular as it was a few years back, and most new developers nowadays have not come across it. I suspect it was more useful when the core API was smaller; now that Java encompasses so many areas there is no longer a single source of info you can go to, and you need other books on the more specific areas that you are working in (J2EE features etc).

Mastering Enterprise Java Beans – Ed Roman

For learning EJBs, this is a very well written book. The examples are somewhat lengthy, but I think this is a very good book. Again, it covers more background info than most (transaction support, isolation levels, best practices etc).

Effective Java – Joshua Bloch

For more advanced Java developers looking to broaden their knowledge of the language, this book is a series of tips on how to use Java more effectively in commonly encountered scenarios. This is one book I refer back to often.

Bitter Java – Bruce Tate

I’m not sure if this was the first book to introduce the concept of the ‘Anti-Pattern’, but this book is a great collection of war stories and lessons to be learnt. It collects together commonly made bad design decisions and inappropriate uses of Java technology, and then makes suggestions for how to use Java more effectively to solve the problem. Again – why learn from your own mistakes by trial and error when you can learn from others who have already made the mistakes?

Debugging the Development Process – Steve Maguire

This is an older book (published 1994) and the author uses examples from his experiences working on projects at Microsoft, but still, there are some good lessons to be learnt from this book that are still applicable to any software development project. This book is an easy read and makes several key points that are valuable to both developers and team leads to become more effective and focused at achieving goals and delivering quality products.

Hibernate in Action – Gavin King and Christian Bauer

This is the bible of all things Hibernate. If you are looking to implement a persistence solution other than writing your own JDBC or using Entity Beans then Hibernate is a must. If you are considering Entity Beans or JDBC then you should be considering Hibernate instead. If you’re already using Hibernate then this is still a good book for explaining concepts and features that may be harder to grasp from just the online docs.

Herding Cats – A Primer for Programmers who Lead Programmers – Rainwater

This is a useful read for developers finding their career path moving in the direction of team lead type roles. Not all technically minded people make good managers, just as good people-type managers do not necessarily make good technical managers. I think the key takeaway from this book is to recognize the skills and strengths of your team and use those to more effectively achieve your team’s goals.

I also have a must read list of books that I haven’t got to yet – some of these are already on my bookshelf but I just haven’t had time to read yet:

  • Code Complete – Steve McConnell
  • Professional Software Development – Steve McConnell
  • Extreme Programming Explained – Kent Beck
  • Expert One-on-One J2EE Development without EJB – Rod Johnson
  • Joel on Software – a collection of articles from his website joelonsoftware.com. I’ve read some of his articles online on his general thoughts and experiences from being involved in software development – his website is a good read so this book should be worthwhile.

Transaction Isolation and Database concurrency issues

To allow for concurrent access to a database and at the same time maintain ‘isolation’ between different concurrent transactions, most databases offer varying levels of Transaction Isolation.

The common isolation problems that can occur are:

  • The Dirty Read:

    This occurs when Transaction A reads data written by Transaction B that has not yet been committed. Transaction A performs some action based on this uncommitted data and then commits the new processed values. Transaction B now rolls back its changes. Transaction A’s processing, although now complete, used data that was never committed, so its results are in an undetermined state.

  • The Unrepeatable Read:

    Transaction A reads data from the database to perform some processing. Transaction B then modifies the same data and commits. When Transaction A re-reads the same data within the same transaction, it gets different results from when it first read the data.

  • The Phantom Read:

    Transaction A reads a set of rows matching some condition to perform some processing. Transaction B then inserts (or deletes) rows that match the same condition and commits. When Transaction A re-runs the same query it gets a different result set: rows have appeared (or disappeared) that were not in the first result.

Isolation Levels

To control which of the problems discussed above can be encountered, most databases offer isolation levels, which may include the following (these are the same levels used by Java JDBC and EJB Transaction Isolation on Entity Beans):

  • READ UNCOMMITTED – offers no isolation; Dirty Reads, Unrepeatable Reads and Phantom Reads can all occur.
  • READ COMMITTED – ensures that only committed data is read, which avoids Dirty Reads. Unrepeatable Reads and Phantom Reads can still occur.
  • REPEATABLE READ – ensures that subsequent reads of the same data return the same results. Avoids Dirty Reads and Unrepeatable Reads, but Phantom Reads can still occur.
  • SERIALIZABLE – ensures that Dirty, Unrepeatable and Phantom Reads do not occur.
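
As a rough illustration of how these levels surface in JDBC, here is a minimal sketch that maps the java.sql.Connection isolation constants to the anomalies each level still permits. The IsolationLevels class and its method names are made up for this example; only the Connection constants and setTransactionIsolation() come from JDBC itself.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of JDBC: records which read anomalies
// each standard isolation level still allows.
class IsolationLevels {
    private static final Map<Integer, String> PERMITTED = new LinkedHashMap<>();
    static {
        PERMITTED.put(Connection.TRANSACTION_READ_UNCOMMITTED, "dirty, unrepeatable, phantom");
        PERMITTED.put(Connection.TRANSACTION_READ_COMMITTED, "unrepeatable, phantom");
        PERMITTED.put(Connection.TRANSACTION_REPEATABLE_READ, "phantom");
        PERMITTED.put(Connection.TRANSACTION_SERIALIZABLE, "none");
    }

    // Returns the anomalies that can still occur at the given JDBC level.
    static String permittedAnomalies(int level) {
        String anomalies = PERMITTED.get(level);
        if (anomalies == null) {
            throw new IllegalArgumentException("Unknown isolation level: " + level);
        }
        return anomalies;
    }

    // With a real connection, the level is chosen before the transaction starts.
    static void apply(Connection conn, int level) throws SQLException {
        conn.setTransactionIsolation(level);
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, String> e : PERMITTED.entrySet()) {
            System.out.println(e.getKey() + " permits: " + e.getValue());
        }
    }
}
```

Not every database supports every level, so in practice it is worth checking what the driver actually honours before relying on a particular setting.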

Unit Testing: Limit to 1 Assertion per test

In order to keep Unit Tests as simple as possible and their intent easy to understand, a good rule of thumb is to limit the number of assertions to 1 per test method.

This article by Dave Astels describes this approach in more detail.

The main benefit of this approach is that each Unit Test method tests exactly one aspect of the system. If it fails then you (or someone else in the future who must debug the code) should know exactly the purpose of the method and therefore have a better idea of what has failed and where to start looking to resolve the issue.
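
To make the idea concrete, here is a small sketch in plain Java rather than JUnit, so that it runs without any extra libraries. The StackPushTest class and its check() helper are invented for illustration; in a real project each method would be an ordinary JUnit test method with a single assert call.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Plain-Java sketch of "one assertion per test". Each test method sets up
// the same fixture and then checks exactly one thing about it.
class StackPushTest {
    // Hypothetical stand-in for a JUnit assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Shared fixture: an empty stack with a single element pushed onto it.
    static Deque<String> stackWithOneElement() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("a");
        return stack;
    }

    static void testPushIncreasesSize() {
        check(stackWithOneElement().size() == 1, "size should be 1 after one push");
    }

    static void testPushedElementIsOnTop() {
        check("a".equals(stackWithOneElement().peek()), "pushed element should be on top");
    }

    static void testStackIsNotEmptyAfterPush() {
        check(!stackWithOneElement().isEmpty(), "stack should not be empty after push");
    }

    public static void main(String[] args) {
        testPushIncreasesSize();
        testPushedElementIsOnTop();
        testStackIsNotEmptyAfterPush();
        System.out.println("all tests passed");
    }
}
```

If testPushedElementIsOnTop() fails, its name and single check already say what went wrong, which is the point of the rule.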

Singletons, Static Methods, and Double-checked Locking (and why not to use Double-checked Locking)

Overview

This is a brief discussion of the Singleton design pattern (refer to the GOF book,
Design Patterns), how Singleton classes differ
from classes with static methods, and why the ‘double-checked locking’ technique should not be used to implement Singletons.

Why Singletons?

Certain conditions may exist in a design that dictate that there should only ever be one instance of a particular
class (or sometimes a limited number of controlled instances).

Usually these classes represent or allow access to a limited resource,
such as a printer, or a property file. In the case of the property file,
if the properties in this file are to be read by many components in the
system, then it would be inefficient if the file is loaded/read each time
a component requests one of the properties from the file – a more efficient
solution would be to read it once and allow access to this shared instance.

Two common solutions to implementing this shared class would be to implement:

  • a class with static methods
  • a singleton class

Static methods

Static methods (or Class methods) are associated with a particular Class, rather than an instance of a class.
You can invoke a static method without having an instance of the Class. For example, given this class:

public class ClassA
	{
	public static int returnValue()
		{
		return 1;
		}
	}

you can invoke the method like this:

ClassA.returnValue();

In some cases a static method may provide a good solution – they are useful for utility methods that provide
some common operation on some passed arguments that do not change the state or attributes of its own Class.

However there are drawbacks of static methods:

  • they are not a good fit in an Object Oriented world (ie no object instance has to exist to invoke the method)
  • they wouldn’t provide a good solution if the design changed and we needed a limited number of instances, say 2,
    rather than 1.

  • static methods can not be overridden by subclasses (explained in the
    next section), meaning that it would be more difficult to reuse a class
    with static methods in subclasses.

Overriding versus hiding

Static methods cannot be overridden in subclasses, which you could argue
makes them less flexible than inherited methods. It also means that the
functionality of the static method is still accessible in the parent/superclass,
which could lead to the wrong method being invoked in subclasses, and possibly
future maintenance/debugging headaches.

If you create a static method in a subclass with the same name and signature
as a static method in the superclass, you are merely creating another
method with the same name – you have not overridden the method in the
superclass, it is just hidden by the static method with the same name
(but possibly a different implementation).

Also, it is not valid to declare an instance method with the same signature
as a static method in the superclass – this gives a compile time error.
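
The difference between hiding and overriding can be seen in a short sketch. The Parent, Child and HidingDemo classes are hypothetical names chosen for this illustration.

```java
class Parent {
    // Static method: a subclass can only hide it, never override it.
    static String describe() { return "Parent.describe"; }

    // Instance method: a subclass can override it.
    String name() { return "Parent.name"; }
}

class Child extends Parent {
    // Hides Parent.describe(); both versions remain callable.
    static String describe() { return "Child.describe"; }

    // Overrides Parent.name(); dispatch depends on the runtime type.
    @Override
    String name() { return "Child.name"; }

    // Declaring a non-static describe() here instead of the static one
    // would be a compile-time error.
}

class HidingDemo {
    public static void main(String[] args) {
        Parent p = new Child();
        // Overridden instance method: chosen by the object's runtime type.
        System.out.println(p.name());          // Child.name
        // Hidden static method: chosen by the declared type of the reference.
        System.out.println(p.describe());      // Parent.describe
        System.out.println(Child.describe());  // Child.describe
    }
}
```

Calling a static method through an instance reference, as p.describe() does, compiles (usually with a warning) and is shown only to make the compile-time resolution visible.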

Overriding and Hiding are discussed in further detail in the Java Language Specification.

An example of this concept is here

Implementing a Singleton (1)

Assuming we have decided that the singleton pattern is a good solution for our problem, and that using static
methods would not give us the future flexibility that we may need in our solution, how do we implement one? How can we
implement the singleton class so that it is also ‘lazy initialized’ – ie the singleton is only initialized when it
is first requested?

Here’s the first example:

public class SingletonA
  {
  private static SingletonA instance;

  /**
  * Constructor is hidden so uncontrolled instances can not be created
  */
  private SingletonA() {}

  public static SingletonA getInstance()
    {
    if(instance == null)
      {
      //create the singleton instance
      instance = new SingletonA();
      }
    return instance;
    }
  }

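This appears to achieve what we need, however it is not threadsafe: two threads calling getInstance() at the same time can both see instance == null and each create their own instance.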

Implementing a Singleton – threadsafe attempt 1

If we need the instantiation of the single instance to be threadsafe then we need to introduce synchronization.
Look at this next example:

public class SingletonB
  {
  private static SingletonB instance;

  /**
  * Constructor is hidden so uncontrolled instances can not be created
  */
  private SingletonB() {}

  public static synchronized SingletonB getInstance()
    {
    if(instance == null)
      {
      //create the singleton instance
      instance = new SingletonB();
      }
    return instance;
    }
  }

If we synchronize the whole getInstance() method we solve
the multithreading issue, but now we have created a bottleneck in our
code – requests to obtain the SingletonB instance will execute sequentially
through this method. If there are to be many concurrent requests to this
method to get the Singleton instance, then this may be a performance problem.

The next refinement is to synchronize as little code as possible (which is always good practice).

Implementing a Singleton – threadsafe attempt 2, with Double-checked Locking

For this refinement we synchronize only the lines of code that we want to be
executed sequentially, in particular the actual instantiation of the Singleton.

public class SingletonC
  {
  private static SingletonC instance;

  /**
  * Constructor is hidden so uncontrolled instances can not be created
  */
  private SingletonC() {}

  public static SingletonC getInstance()
    {
    if(instance == null)
      {
      synchronized(SingletonC.class)
        {
        if(instance == null)
          {
          //create the singleton instance
          instance = new SingletonC();
          }
        }
      }
    return instance;
    }
  }

Here’s where the ‘double-checked lock’ concept is introduced. The intent
of this is as follows:

  • if the singleton instance has not yet been created, attempt to enter the synchronized block
  • if another thread is already in the synchronized block then we will be blocked until they leave
  • once inside the synchronized block, check whether the instance was created
    by the previous holder of the lock (ie while thread ‘A’ was blocked on
    the synchronized block by thread ‘B’, thread ‘B’ could have already
    executed the instantiation). We therefore need to check a second time
    to make sure we are not performing the instantiation a second (unnecessary)
    time when we enter the synchronized block.

  • if a thread calls getInstance() and the instance has already been created then it is returned without
    entering the synchronized block.

Logically this looks correct and appears to fulfill all our requirements. Unfortunately it is not guaranteed to work. Why?

Double-checked Locking does not work

Although the last solution looks good, it has been proven to not work in all
situations [1]. Some reasons behind this are as follows:

  • although instance = new SingletonC() is only one line of code in the source, as
    generated Java bytecode this one statement becomes several instructions. In other words, creating and
    assigning a new instance is not an atomic operation – there is a chance that the variable instance is
    non-null before the new object has been fully constructed. A second thread making the unsynchronized check
    if(instance == null) will then find it false, and so instance will be returned in an undetermined state.
    This code is therefore unreliable in its current form.

  • related to the above, some Just-In-Time compilers such as the Symantec
    JIT (and possibly Hotspot?) reorder instructions as part of
    their optimization. Again this means that instance can
    be non-null before the object is initialized, and so the code will return it in an
    undetermined/unexpected state.

  • multiprocessor systems can also reorder memory writes as they are executed. For this same reason, it is
    possible that another processor sees instance as non-null before it sees the object’s initialized fields,
    and so the code will return an undetermined/unexpected value.

The reordering of bytecode statements is referred to as ‘out of order writes’
and is defined in the Virtual Machine Specification.

Another problem with the double-checked locking is that in some circumstances
the code may work, but in other cases it will fail. The failures will
be sporadic. This is another reason to avoid using double-checked locking,
because it can not be guaranteed to work.

Solutions

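According to David Bacon et al in their analysis of double-checked locking
(see [1]), there is no way to make this approach work reliably under the
original Java memory model, no matter how elaborate the solution. The same
declaration does note that under the revised memory model introduced with
Java 5, declaring the instance field volatile makes the idiom valid, but
unless you can rely on that, the safest course is to avoid using
double-checked locking completely.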

So how do you implement a threadsafe Singleton class in Java? Sometimes the
best solutions are the simplest:

  • initialize the instance in a static field (see example below)
  • or, accept the (potential) performance drawback and synchronize the
    whole method

public class Singleton
  {
  private static Singleton instance = new Singleton();

  /**
  * Constructor is hidden so uncontrolled instances can not be created
  */
  private Singleton() {}


  public static Singleton getInstance()
    {
    return instance;
    }
  }

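This avoids the need for explicit synchronization, because the JVM guarantees that
a class’s static initializers run exactly once and in a threadsafe way. The singleton
instance is created when the class is initialized, which happens on first use of the class.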
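
If creation really must be deferred until getInstance() is first called (for example because the class has other static members that may be used first), one widely used variation is the ‘initialization-on-demand holder’ idiom. This is a different technique from the examples above, sketched here with hypothetical names (LazySingleton, Holder); it relies on the same class-initialization guarantee rather than on any locking in our own code.

```java
class LazySingleton {
    // Constructor is hidden so uncontrolled instances cannot be created.
    private LazySingleton() {}

    // The nested class is not initialized until getInstance() first
    // refers to it, so the instance is created lazily. Class
    // initialization is performed once and is threadsafe.
    private static class Holder {
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    static LazySingleton getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Both calls return the same object.
        System.out.println(getInstance() == getInstance());
    }
}
```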

References
[1] The ‘Double-checked Locking is Broken’ Declaration – David Bacon (IBM
Research), Joshua Bloch (Javasoft), Jeff Bogda, Cliff Click (Hotspot
JVM project), Paul Haahr, Doug Lea, Tom May, Jan-Willem Maessen, John
D. Mitchell (jGuru), Kelvin Nilsen, Bill Pugh, Emin Gun Sirer

Other Related Articles