To Layer, or not to Layer? That is the Architectural Question

It’s a commonly accepted practice when designing large-scale enterprise applications (or even smaller applications, for that matter) to layer your architecture code as well as your application code. ‘Separation of Concerns’, for example separating your business logic from your data access logic, or your presentation logic from your business logic, is a common technique for ensuring that your application code is well defined in terms of its responsibilities. This approach gives you code that is easier to maintain and easier to debug – if you have a problem with the data access then you know to look at the code in your data access layer.

In Java technology based applications it seems like we have become the masters of taking this approach to the nth degree. A typical J2EE web-based system might consist of:

  • Presentation tier:
    • coded using JSP pages
    • an MVC framework like Struts, to loosely couple the pages, the control of page navigation and the interaction with adjacent layers: ActionForms as containers for submitted data, and Actions for controlling the navigation between pages (configured in struts-config.xml) and for interfacing with the next adjacent layer
  • Business layer facade: an additional layer to decouple the presentation tier from the business layer technology, to avoid coupling the Struts Actions directly to Session Beans
  • Business component layer: implemented using Stateless and Stateful Session beans
  • Business layer: the actual business logic itself. Not directly coded inside the Session beans so it can be reused elsewhere where Session Beans are not deployable
  • Data Access Layer: code to interact with the database, providing data retrieval and storage facilities.

At its simplest, maintaining a separation between presentation, business and data access seems to be the minimum degree of logical layering you should require in your system.

So why in a typical J2EE application have we ended up with so many more layers? It seems like we’ve become obsessed with decoupling the base technologies that we’re using to build our J2EE systems. This increases the amount of code we have to develop and, rather than simplifying application development, has done more to complicate our architectures.

My reason for thinking about this right now is that I have started to use PHP to put together some simple database-driven web pages that interact with my weather monitoring station (http://www.kevinhooke.com/weather). In all the PHP books that I’ve looked at so far, I haven’t seen any mention of layering my system, or separating my presentation logic from my data access logic. Instead, the general approach for PHP seems to encourage data access from your presentation pages. Coming from a frame of mind where I am encouraged to ‘separate, separate’, ‘build more layers!’, this is a refreshing change – I can develop pages in a fraction of the time it would take me with a typical J2EE layered approach! Why? Because I don’t have to develop additional plumbing code to interact between my many layers.

In the majority of cases in heavily layered applications, simple data access requires just ‘call throughs’ from one layer to the next in order to reach the database and bring back the data you need. The additional layers you must call through don’t add any functionality of their own (of course this is not always true, but it is for simple data access).
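To make the ‘call through’ point concrete, here is a minimal sketch in plain Java of the layers listed earlier. Every class and method name is a hypothetical stand-in (these are not real Struts or EJB types), and a HashMap stands in for the database. For a simple lookup, only the DAO does any work; each layer above it just delegates.

```java
import java.util.HashMap;
import java.util.Map;

// All class and method names here are hypothetical stand-ins, not real Struts or EJB types.
public class CallThroughDemo {

    /** Data access layer: the only layer doing real work for a simple lookup. */
    static class ReadingDao {
        private final Map<Integer, String> rows = new HashMap<>();

        ReadingDao() {
            rows.put(1, "12.5C"); // stands in for a database row
        }

        String findById(int id) {
            return rows.get(id);
        }
    }

    /** Business layer: plain Java business logic, here only a call-through. */
    static class ReadingLogic {
        private final ReadingDao dao;
        ReadingLogic(ReadingDao dao) { this.dao = dao; }
        String getReading(int id) { return dao.findById(id); }
    }

    /** Business component layer: stands in for a Session Bean wrapping the logic. */
    static class ReadingComponent {
        private final ReadingLogic logic;
        ReadingComponent(ReadingLogic logic) { this.logic = logic; }
        String getReading(int id) { return logic.getReading(id); }
    }

    /** Business layer facade: hides the component technology from the presentation tier. */
    static class ReadingFacade {
        private final ReadingComponent component;
        ReadingFacade(ReadingComponent component) { this.component = component; }
        String getReading(int id) { return component.getReading(id); }
    }

    /** Presentation tier: stands in for a Struts Action. */
    static class ShowReadingAction {
        private final ReadingFacade facade;
        ShowReadingAction(ReadingFacade facade) { this.facade = facade; }
        String execute(int id) { return facade.getReading(id); }
    }

    /** Fetches a reading through every layer. */
    static String viaAllLayers(int id) {
        ReadingDao dao = new ReadingDao();
        ShowReadingAction action = new ShowReadingAction(
                new ReadingFacade(new ReadingComponent(new ReadingLogic(dao))));
        return action.execute(id);
    }

    /** Fetches the same reading straight from the data access layer. */
    static String viaDaoOnly(int id) {
        return new ReadingDao().findById(id);
    }

    public static void main(String[] args) {
        System.out.println(viaAllLayers(1));
        System.out.println(viaDaoOnly(1));
    }
}
```

Both calls return the same value. The four classes between the Action and the DAO are the plumbing code that has to be written and maintained even though, in this simple case, they add no behaviour.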

So when should you layer? I still believe in the benefits of layering applications, but I think the benefits of easier to maintain, well layered code come at the cost of development time and the effort of writing the additional code required in each layer. Small web-based applications, such as a forum application, may not benefit from this additional overhead of layering the application. However, I cannot imagine working on a large development effort with a medium to large development team (> 50 developers) and with hundreds of front end pages, without having well defined layers between logical responsibilities in the system – it just wouldn’t work.

As a development community though, we need to spend time thinking about how we can maximise the benefits gained from approaches such as architectural and application layering while reducing their overhead, somehow avoiding having to write the additional code that calls through from layer to layer. I haven’t spent much time with Ruby on Rails, for example, but from what I can see of their approach they have kept the design patterns typical of J2EE applications while removing the need for the developer to spend so much time writing plumbing code – this is handled pretty much by the framework itself. This is where I believe we need to be heading.

Avoiding excessive overtime in Software Development

News.com have an article on their site today stating that software developers are starting to work shorter, more normal work weeks (40 hours a week), instead of the 50 – 80 or even more that was not uncommon in the late 90s, early 2000s.

This is a great trend, not because I don’t love developing software, because I do, but I have plenty of other things I would rather be doing outside of work, like spending time at home with my wife and pets, and on my hobbies. After all, the main motivation to work for the majority of people is probably a) to pay the bills, and b) to earn money to spend on having a good time, whatever that may be – travelling, hobbies, eating out, going to movies etc.

The problem is, if you spend 12 hours at work every day, where is the free time for the activities you enjoy outside of work? If you are working to earn money for those activities but the work leaves you no time to take part in them, then in extreme circumstances you might as well not be working at all, because you end up with zero free time.

Software development is not a production line, or some machine that produces lines of code. If you can produce 40 lines of well-written, tested, working code in an 8 hour day, increasing the working hours to 16 hours a day is not going to get you 80 lines of code from each developer – software development just does not work like that. Software development is a cerebral exercise, one that takes a lot of thought, experiment, and dare I say it, creativity. Spending more time in one day on a task is not going to result in the task being completed sooner. Why? Because people get tired. The average human attention span is something like 20 minutes. After that point the mind starts to wander, and people need to take frequent breaks (tea, coffee or smoking breaks) before starting again on the task feeling more focused.

This is a point that project management and some project leadership just do not understand, and I cannot understand why, since most software development management were themselves programmers at some point.

The classic example, which I am sure has happened to almost all of us, is the late night spent debugging a problem that you just can’t work out. Eventually you call it a night after having spent several hours on the one task. The next morning you come in, look at the code, and instantly find the problem. Why? Because after working 10 hours plus you are tired, you are probably restless from sitting still for such a long time, you’re late getting home for dinner, and there are countless other chores nagging in the back of your head that you know you need to get home to do – in essence you are not focused on the problem. You go home, have a good rest, and the next morning the problem seems obvious. It’s not that it is now obvious, it’s just that you are now addressing the problem when you are at your freshest and most focused.

One of the other side-effects of working longer hours is that when people get tired they tend to make more mistakes, or make judgements or decisions that they wouldn’t normally make if they were more alert and focused. So what happens after a late night programming session? You come in the next morning and the first thing you do is spend a couple of hours fixing the bugs from the previous night, or even reworking the code because what you wrote when you were tired is not the most effective solution to the problem.

So from spending an extra 4 hours at the end of the day writing some poorly written code and introducing a few bugs here and there, you waste a few hours the next morning correcting the mess from the previous night. This does not sound like an effective use of time to me. Over the longer term, if your team is continually working long hours for 2 weeks or more, the problems become more severe – your team becomes demoralized and just generally worn out. People are not machines; to work effectively they need time to rest and recuperate.

One of the rules of Extreme Programming is that no-one shall work more than 40 hours a week. This is not a dream – it’s realistic. If you demand more from your staff you often end up with less.

In software development overtime is never the solution to a problem – overtime is always a symptom of a wider problem that needs to be addressed. If you don’t look for the wider problem, and identify and address it, then the root cause of having to ask your team to work overtime will not go away by itself. You’ll find yourself instead demanding more and more from your team, which inevitably will lower the team’s morale. Prolonged periods of overtime (i.e. more than 1 week in a row) will always have a far greater negative effect on the team, and on productivity as a whole, than any short term positive gains.

Recommended reading list for Java Developers/Software Engineers (December 2004)

There are some books you come across that are destined to become all time classics, well worth reading and recommending to others. This is a list of books that I’ve read and that I think all software developers must read. The list has a strong Java bias as, after all, I’m a Java developer myself, but it has a mix of both technical and non-technical books.

For the past several years of my career I neglected to read project management/software development process type books (after I finished my Computing degree anyway), but there is so much to be learnt from others’ successes and mistakes, regardless of whether or not you intend to become a team lead or expect to find yourself in a less technical role. After all, writing the code is only part of the process; a good understanding of tried and true practices, theories and software development processes leads (in my opinion) to a much stronger, well-rounded developer.

As Steve Maguire points out in his book, Debugging the Development Process, why spend years of your career learning by trial and error when you can pick up a good book and learn from the insights of others who have already ‘been there, done that’?

In no particular order, here’s my recommended book list:

The Mythical Man Month – Fred Brooks

Written over 20 years ago, this book is a collection of essays on typical problems encountered on software development projects, and what is scary is that the same problems still happen today. Although some of the examples refer to IBM mainframe development, the lessons learnt still apply to software development today.

Design Patterns – Gamma et al (GOF – Gang of Four)

The Bible of Design Patterns. Why reinvent the wheel when there is already a solution/approach to a commonly occurring problem?

Refactoring – Martin Fowler

Encourages continual review of existing code, and of code during actual development, to identify areas for improvement. Identifies typical problem areas (‘smells’..!) that occur in code, and suggests solutions for reworking code to become cleaner, easier to read and maintain.

This is one book I’ve known about for a while but never had time to sit down and read from cover to cover, and wish I had done much earlier…

UML Distilled – Martin Fowler

If you’re going to read one book on using UML notation for analysis and design then make it this one. Short and to the point, it covers the main aspects of each of the diagram types and enough notation to get you started.

Thinking in Java – Bruce Eckel

This is the best written and most comprehensive book on Java and its language features that I have read so far. It goes into far more detail explaining concepts and the rationale for language features than I have come across anywhere else. If you are just starting out with Java, or even if you’ve been developing with Java for years, this is an excellent book.

Java in a Nutshell – David Flanagan

This was one of my most referenced books when I first started out as a Java developer, and there were several very well-worn copies in the office at one time. As a reference it really is no more than the API in book form, which is great if you’d rather flick through pages than read API info on the screen. More useful are the very short and condensed introductory chapters, which give a very quick high-level intro to Java and most of the language’s core features. This is a great reference for language specific features like operator precedence and ranges of the primitive data types, info that is not too easily accessible elsewhere.

This book doesn’t seem to be as popular as it was a few years back, and most new developers nowadays have not come across it. I suspect it was more useful when the core API was smaller; now that Java encompasses so many areas there is no longer a single source of info you can go to, and you need other books on the more specific areas that you are working in (J2EE features etc).

Mastering Enterprise Java Beans – Ed Roman

For learning EJBs, this is a very well written book. The examples are somewhat lengthy, but I think this is a very good book. Again, it covers more background info than most (transaction support, isolation levels, best practices etc).

Effective Java – Joshua Bloch

For more advanced Java developers looking to broaden their knowledge of using the language, this book is a series of tips for how to more effectively use the Java language for commonly encountered scenarios. This is one book I refer back to often.

Bitter Java – Bruce Tate

I’m not sure if this was the first book to introduce the concept of the ‘Anti-Pattern’, but this book is a great collection of war stories and lessons to be learnt. This book collects together commonly made bad design decisions and inappropriate uses of Java technology, and then makes suggestions for how to more effectively use Java to solve the problem. Again – why learn from your own mistakes by trial and error when you can learn from others who have already made the mistakes?

Debugging the Development Process – Steve Maguire

This is an older book (published 1994) and the author uses examples from his experiences working on projects at Microsoft, but still, there are some good lessons to be learnt from this book that are still applicable to any software development project. This book is an easy read and makes several key points that are valuable to both developers and team leads to become more effective and focused at achieving goals and delivering quality products.

Hibernate in Action – Gavin King and Christian Bauer

This is the bible of all things Hibernate. If you are looking to implement a persistence solution other than writing your own JDBC or using Entity Beans then Hibernate is a must. If you are considering Entity Beans or JDBC then you should be considering Hibernate instead. If you’re already using Hibernate then this is still a good book for explaining concepts and features that may be harder to grasp from just the online docs.

Herding Cats – A Primer for Programmers who Lead Programmers – Rainwater

This is a useful read for developers finding their career path moving in the direction of team lead type roles. Not all technically minded people make good managers, just as good people-type managers do not necessarily make good technical managers. I think the key takeaway from this book is to recognize the skills and strengths of your team and use those to more effectively achieve your team’s goals.

I also have a must read list of books that I haven’t got to yet – some of these are already on my bookshelf but I just haven’t had time to read yet:

  • Code Complete – Steve McConnell
  • Professional Software Development – Steve McConnell
  • Extreme Programming Explained – Kent Beck
  • Expert One-on-One J2EE Development without EJB – Rod Johnson
  • Joel on Software – a collection of articles from his website joelonsoftware.com. I’ve read some of his articles online on his general thoughts and experiences from being involved in software development – his website is a good read so this book should be worthwhile.

Transaction Isolation and Database concurrency issues

To allow for concurrent access to a database and at the same time maintain ‘isolation’ between different concurrent transactions, most databases offer varying levels of Transaction Isolation.

The common isolation problems that can occur are:

  • The Dirty Read:

    This occurs when Transaction A reads data written by Transaction B that has not yet been committed. Transaction A performs some action based on this uncommitted data and then commits the new processed values. Transaction B now rolls back its data. Transaction A’s processing, although now complete, used data that was never committed, so its results are now in an undetermined state.

  • The Unrepeatable Read:

    Transaction A reads data from the database to perform some processing. Transaction B then modifies the same data and commits. Transaction A now re-reads the same data and gets different results from when it first read the data.

  • The Phantom Read:

    Transaction A reads a set of rows from the database to perform some processing. Transaction B then inserts new rows that match Transaction A’s query (or deletes matching rows) and commits. Transaction A now re-runs the same query and gets a different result set, containing rows that were not there previously or missing rows that were.
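As a rough illustration of the first of these problems, the Dirty Read, here is a toy single-threaded model in Java. It is only a sketch: the Row class and its methods are invented for this example and say nothing about how a real database stores or locks data. Transaction B writes a value without committing, Transaction A reads, and then Transaction B rolls back.

```java
// A toy, single-threaded model of one database row; not how a real database works.
public class DirtyReadDemo {

    /** One row with a committed value and, optionally, an uncommitted pending write. */
    static class Row {
        private int committed;
        private Integer pending; // null when no transaction has an uncommitted write

        Row(int initial) {
            this.committed = initial;
        }

        /** A transaction writes a new value but has not committed it yet. */
        void write(int value) {
            pending = value;
        }

        /** The writing transaction abandons its uncommitted change. */
        void rollback() {
            pending = null;
        }

        /** Reads the row; allowDirty mimics an isolation level that permits dirty reads. */
        int read(boolean allowDirty) {
            return (allowDirty && pending != null) ? pending : committed;
        }
    }

    /** Runs the scenario and returns {value A saw, value the row holds after B's rollback}. */
    static int[] run(boolean allowDirty) {
        Row balance = new Row(100);
        balance.write(50);                      // Transaction B updates but does not commit
        int seenByA = balance.read(allowDirty); // Transaction A reads while B is in flight
        balance.rollback();                     // Transaction B rolls back
        return new int[] { seenByA, balance.read(false) };
    }

    public static void main(String[] args) {
        int[] dirty = run(true);
        int[] clean = run(false);
        System.out.println("dirty reads allowed: A saw " + dirty[0] + ", row holds " + dirty[1]);
        System.out.println("dirty reads blocked: A saw " + clean[0] + ", row holds " + clean[1]);
    }
}
```

When dirty reads are allowed, Transaction A works with a value (50) that the row never officially held. When they are blocked, A only ever sees the committed value (100).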

Isolation Levels

To ensure that the problems discussed above are not encountered, most databases offer a choice of isolation levels, which may include the following (these are the same levels used by Java JDBC and by EJB Transaction Isolation on Entity Beans):

  • READ UNCOMMITTED – offers no isolation; Dirty Reads, Unrepeatable Reads and Phantom Reads can all occur.
  • READ COMMITTED – ensures that only committed data is read, and avoids Dirty Reads. Unrepeatable Reads and Phantom Reads can still occur.
  • REPEATABLE READ – ensures that subsequent queries of the same data return the same results. Avoids Dirty Reads and Unrepeatable Reads, but Phantom Reads can still occur.
  • SERIALIZABLE – ensures that Dirty, Unrepeatable and Phantom Reads do not occur.
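
In JDBC these levels are exposed as int constants on java.sql.Connection and are applied with Connection.setTransactionIsolation(int). The sketch below simply maps each constant to the problems that can still occur at that level, following the list above. The Anomaly enum and the possibleAnomalies method are names made up for this example, and no database connection is opened. What a given database actually does at each level can differ from this summary.

```java
import java.sql.Connection;
import java.util.EnumSet;
import java.util.Set;

public class IsolationLevels {

    /** The three problems described above (this enum is illustrative, not part of JDBC). */
    enum Anomaly { DIRTY_READ, UNREPEATABLE_READ, PHANTOM_READ }

    /** Returns the problems that can still occur at the given JDBC isolation level. */
    static Set<Anomaly> possibleAnomalies(int jdbcLevel) {
        switch (jdbcLevel) {
            case Connection.TRANSACTION_READ_UNCOMMITTED:
                return EnumSet.allOf(Anomaly.class);
            case Connection.TRANSACTION_READ_COMMITTED:
                return EnumSet.of(Anomaly.UNREPEATABLE_READ, Anomaly.PHANTOM_READ);
            case Connection.TRANSACTION_REPEATABLE_READ:
                return EnumSet.of(Anomaly.PHANTOM_READ);
            case Connection.TRANSACTION_SERIALIZABLE:
                return EnumSet.noneOf(Anomaly.class);
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + jdbcLevel);
        }
    }

    public static void main(String[] args) {
        // With a real connection you would call conn.setTransactionIsolation(level)
        // before starting the transaction; this sketch only prints the mapping.
        int[] levels = {
            Connection.TRANSACTION_READ_UNCOMMITTED,
            Connection.TRANSACTION_READ_COMMITTED,
            Connection.TRANSACTION_REPEATABLE_READ,
            Connection.TRANSACTION_SERIALIZABLE
        };
        for (int level : levels) {
            System.out.println(level + " -> " + possibleAnomalies(level));
        }
    }
}
```

Not every database supports all four levels; DatabaseMetaData.supportsTransactionIsolationLevel(int) can be used to check a particular driver.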