Monday, June 25, 2007

Java Strings and Hash Codes

The question often arises as to how Java can use String values in switch/case statements; e.g.:
switch(s) {
case "Foo": doSomething(); break;
case "Bar": doSomethingElse(); break;
}
The short answer is that one can't. In Java versions before 5, the expression evaluated in the switch statement must be a char, byte, short, or int. With auto-unboxing and the new enum capabilities of Java 5 and later versions, the switch expression may also be a Character, Byte, Short, Integer, or enum type. Furthermore, every case expression must be a compile-time constant that is assignable to the type of the switch expression. This means that if your switch expression is an int, your case constants can be int, byte, short, or char values. If, on the other hand, your switch expression is a byte, the compiler will bark at any case constant that doesn't fit in a byte (case 200, say). In short, switch statements on String values (and on any type not mentioned above) are simply not available in Java as of this writing.
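To make the assignability rule concrete, here is a small sketch. The class and method names (SwitchTypes, describeInt, describeByte) are made up for illustration; only the switch rules themselves come from the language.

```java
public class SwitchTypes {
    // Switching on an int: any constant that fits in an int is a legal case label.
    static String describeInt(int code) {
        switch (code) {
            case 1:   return "one";
            case 'A': return "sixty-five"; // char constant, widened to the int 65
            default:  return "other";
        }
    }

    // Switching on a byte: case constants must fit in the range -128..127.
    static String describeByte(byte code) {
        switch (code) {
            case 1:   return "one"; // an int literal, but it fits in a byte
            case 127: return "max";
            // case 200: return "too big"; // would not compile: 200 does not fit in a byte
            default:  return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeInt(65));
        System.out.println(describeByte((byte) 127));
    }
}
```

Uncommenting the case 200 line should be enough to see the compiler complain.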

Back to the problem of using String values... Every so often, someone will suggest switching on the string's hash code (an int value) instead. There are a couple of problems with this idea. First, case expressions must be compile-time constants, so the compiler isn't going to accept a call to hashCode() as a case label. Second, and perhaps more fundamentally, string hash codes aren't guaranteed to be unique. Far from it, in fact. The only guarantee is that equal strings have equal hash codes, which means that strings with different hash codes cannot be equal; two different strings may well share the same hash code.
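A collision is easy to demonstrate. The sketch below uses the strings "Aa" and "BB", which produce the same value under the documented String.hashCode() formula; the class and helper names (HashCollision, collide) are just illustrative.

```java
public class HashCollision {
    // True when two strings differ but still share a hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // String.hashCode() is s[0]*31^(n-1) + ... + s[n-1], so:
        // "Aa" -> 65*31 + 97 = 2112
        // "BB" -> 66*31 + 66 = 2112
        System.out.println("Aa".hashCode());
        System.out.println("BB".hashCode());
        System.out.println(collide("Aa", "BB")); // true
    }
}
```

A switch keyed on hash codes would send both strings down the same branch, so each case would still need an equals() check to be safe.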

So, what is an appropriate solution? It all depends on the design, of course. Most likely, using enums would be a fair approach. In his seminal book Refactoring, Martin Fowler recommends looking at polymorphism for a more object-oriented design. And if one must have switchy strings in a Java-ish environment, there is always Groovy...
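As a rough sketch of the enum route, assuming the incoming strings map cleanly onto a small fixed set of names: the Command enum, the dispatch method, and the returned labels below are all hypothetical stand-ins for the doSomething() and doSomethingElse() calls in the opening example.

```java
import java.util.Locale;

public class EnumSwitch {
    // Hypothetical enum covering the strings we care about.
    enum Command { FOO, BAR }

    // Converts the string to an enum constant, then switches on the enum.
    static String dispatch(String s) {
        Command command;
        try {
            command = Command.valueOf(s.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return "unknown"; // valueOf rejects names that match no constant
        }
        switch (command) {
            case FOO: return "doSomething";
            case BAR: return "doSomethingElse";
            default:  return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("Foo"));
        System.out.println(dispatch("Baz"));
    }
}
```

The string lookup happens once, in valueOf, and everything after that works with a type the compiler understands. Note that a null argument would still throw a NullPointerException here; a real design would have to decide how to treat it.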

~
