Monday, June 25, 2007

Java Strings and Hash Codes

The question often arises as to how Java can use String values in switch/case statements; e.g.:
switch(s) {
    case "Foo": doSomething(); break;
    case "Bar": doSomethingElse(); break;
}
The short answer is that one can't. In Java versions before 5, the expression evaluated in the switch statement must be a char, byte, short, or int. With auto-unboxing and the enum support introduced in Java 5, the switch expression may also be a Character, Byte, Short, Integer, or enum type. Furthermore, every case expression must be assignable to the type of the switch expression. This means that if your switch expression is an int, your case constants can be int, char, byte, or short values. If, on the other hand, your switch expression is a byte, the compiler will bark at any case constant that does not fit in a byte (case 200, for example). In short, switch statements on String values (and on any type not mentioned above) are simply not available in Java as of this writing.
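
To make the rules above concrete, here is a minimal sketch of the switch forms that do compile. The class, enum, and method names are made up for illustration and are not from any library.

```java
// Hypothetical demo class; every name here is illustrative only.
class SwitchTypesDemo {
    enum Color { RED, GREEN }

    // Switching on an int: case constants may be any constant assignable to int.
    static String describeInt(int code) {
        switch (code) {
            case 'A': return "char constant"; // the char constant 'A' widens to 65
            case 1:   return "int constant";
            default:  return "other";
        }
    }

    // Switching on a byte: every case constant must fit in a byte.
    static String describeByte(byte b) {
        switch (b) {
            case 100: return "fits in a byte";
            // case 200: ...  would not compile, because 200 is not assignable to byte
            default:  return "other";
        }
    }

    // Java 5 and later: an Integer switch expression is auto-unboxed to int.
    static String describeBoxed(Integer boxed) {
        switch (boxed) {
            case 2:  return "two";
            default: return "other";
        }
    }

    // Java 5 and later: enum switch; case labels are the unqualified constant names.
    static String describeColor(Color c) {
        switch (c) {
            case RED:   return "red";
            case GREEN: return "green";
            default:    return "unknown";
        }
    }
}
```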

Back to the problem of using String values... Every so often, someone will suggest using the string's hash code (an int value) as the value to switch on. There are a couple of problems with this idea. First, case expressions must be compile-time constants, and the compiler isn't going to accept the result of a hashCode() call as a valid case. Second, and perhaps more fundamentally, string hash codes aren't guaranteed to be unique. Far from it, in fact. The only guarantee is that two strings with different hash codes are not equal; two strings with the same hash code may or may not be equal.
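
For the skeptical, a small sketch shows two unequal strings that share a hash code. The class and method names are hypothetical. The strings "Aa" and "BB" collide under the hash formula documented for java.lang.String: 65*31 + 97 and 66*31 + 66 both come to 2112.

```java
// Hypothetical demo class showing that equal hash codes do not imply equal strings.
class HashCollisionDemo {
    // True when the two strings have the same hash code, equal or not.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());      // prints 2112
        System.out.println("BB".hashCode());      // prints 2112
        System.out.println(sameHash("Aa", "BB")); // prints true
        System.out.println("Aa".equals("BB"));    // prints false

        // The first problem is visible at compile time:
        // switch (s.hashCode()) { case "Foo".hashCode(): ... }
        // is rejected, because a method call is not a constant expression.
    }
}
```

A switch on the hash code alone would therefore happily treat "BB" as if it were "Aa".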

So, what is an appropriate solution? It all depends on the design, of course. Most likely, using enums would be a fair approach. In his seminal book Refactoring, Martin Fowler recommends looking at polymorphism for a more object-oriented design. And if one must have switchy strings in a Java-ish environment, there is always Groovy...
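
As a rough sketch of the enum approach, the incoming string can be mapped to an enum constant first, and the switch done on the enum. The Command enum, the dispatch method, and the return strings are all invented for this example, and the sketch assumes the input is non-null and that the enum constant names are the upper-cased forms of the expected strings.

```java
import java.util.Locale;

// Hypothetical demo class; Command and dispatch are illustrative names only.
class CommandDemo {
    enum Command { FOO, BAR }

    // Maps a string such as "Foo" to a Command, then switches on the enum.
    // Assumes s is non-null; unrecognized strings fall through to "unknown".
    static String dispatch(String s) {
        Command command;
        try {
            // valueOf throws IllegalArgumentException when no constant matches.
            command = Command.valueOf(s.toUpperCase(Locale.ENGLISH));
        } catch (IllegalArgumentException e) {
            return "unknown";
        }
        switch (command) {
            case FOO: return "did something";
            case BAR: return "did something else";
            default:  return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("Foo")); // prints did something
        System.out.println(dispatch("Baz")); // prints unknown
    }
}
```

Moving the behavior into the enum constants themselves, each with its own method body, would remove the switch entirely and come closer to the polymorphic design Fowler describes.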


Thursday, June 21, 2007

Final Variables and Anonymous Inner Classes

Anonymous inner classes require the local variables they use to be final because of the way they are implemented in Java. An anonymous inner class (AIC) uses a local variable by creating a private instance field that holds a copy of the variable's value, taken when the AIC instance is created. The inner class isn't actually using the local variable, but a copy. It should be fairly obvious at this point that a "Bad Thing"™ can happen if either the original value or the copied value changes: the two would silently fall out of sync, and the code would no longer do what it appears to do. To prevent this kind of problem, Java requires you to mark local variables that will be used by the AIC as final (i.e., unchangeable). This guarantees that the inner class's copies of local variables will always match the actual values.
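
The copying can be sketched by putting an AIC next to a hand-written class that does roughly the same job. The IntSource interface and all other names below are hypothetical, and the hand-written class is only an approximation of what the compiler generates, not its exact output.

```java
// Hypothetical demo class; all names are illustrative only.
class CaptureDemo {
    // Made-up single-method interface, used only for this illustration.
    interface IntSource {
        int get();
    }

    // The anonymous inner class reads the final local variable 'value'.
    static IntSource viaAnonymousClass(final int value) {
        return new IntSource() {
            public int get() {
                // value++;  // would not compile: 'value' is final
                return value;
            }
        };
    }

    // Approximation of the generated class: the local's value is copied
    // into a private field when the instance is constructed.
    static class HandWritten implements IntSource {
        private final int valueCopy;

        HandWritten(int value) {
            this.valueCopy = value;
        }

        public int get() {
            return valueCopy;
        }
    }

    static IntSource viaNamedClass(int value) {
        return new HandWritten(value);
    }

    public static void main(String[] args) {
        // Both objects outlive the method call whose local they captured.
        System.out.println(viaAnonymousClass(42).get()); // prints 42
        System.out.println(viaNamedClass(42).get());     // prints 42
    }
}
```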