In the Java programming language, every class must contain a hashCode() method. The output of this method must be a 32-bit signed integer and the results are intended to be evenly distributed for varied inputs so as to avoid clustering when used in hash maps and other data structures that store objects in buckets based on their returned hashCode() values.

The algorithm involved in computing the hash code of an object will vary depending on the class of the latter. As such, while a default hashCode() method is inherited by every class, many programmers will choose to override this method in favor of an algorithms custom built for their respective classes. Many of the built in object types within Java do a similar override. An instance s of the java.lang.String class, for example, would have a hash code h(s) defined by

h(s)=\sum_{i=0}^{n-1}s[i] \cdot 31^{n-1-i},

where terms are summed using Java 32-bit int addition, s[i] denotes the ith character of the string, and n is the length of s.

Contents

Usage in Java

The following example shows how a custom object (here named Employee) might implement the public int hashCode() method:

public class Employee{
    private int employeeId;
    private String firstName;
    private String lastName;
    private Department dept;
 
    /* other methods would be in here somewhere */
 
    public int hashCode() {
        int hash = 1;
        hash = hash * 31 + lastName.hashCode();
        hash = hash * 29 + firstName.hashCode();
        hash = hash * 17 + employeeId;
        hash = hash * 13 + (dept == null ? 0 : dept.hashCode());
        return hash;
    }
}

As with any hash, collisions are possible. For example, the following code:

int hash1 = "Ea".hashCode();
int hash2 = "FB".hashCode();
System.out.println(hash1==hash2);

Will result in an output of 'true' since both Strings are hashed into the same value, because the hashcode implementation of String is using the prime number 31 and the difference between 'a' and 'B' is just 31, so the formula is 70 x 31 + 66 = 69 x 31 + 97.

Sample Implementations of the Algorithm in Various Languages

In Java

NOTE: This is not the implementation of hashCode() in java.lang.String, which is overly complicated for this example. This is an example of how one might implement the algorithm as a static method.

public static int hashCode(String input) {
    int h = 0;
    int len = input.length();
    for (int i = 0; i < len; i++) {
        h = 31 * h + input.charAt(i);
    }
    return h;
}

In Visual Basic for Applications (VBA, the MS office scripting language)

Function hashCode(Value)
 
Const maxInt = 4294967295#
Const maxPostInt = 2147483647
 
Dim h As Currency
Dim div As Long
 
h = 0
 
For i = 1 To Len(Value)
    h = h * 31 + Asc(Mid(Value, i, 1))
 
    If (h > maxInt) Then
        div = Int(h / (maxInt + 1))
        h = h - (div * (maxInt + 1))
    End If
Next i
 
If h > maxPostInt Then
    h = h - maxInt - 1
End If
 
hashCode = h
 
End Function

In ActionScript (Flash)

function hashCode(string) {
    var h = 0;
    for (var i = 0; i < string.length; i++) {
        h = (((31 * h) >> 0) + string.charCodeAt(i)) >> 0;
    }
    return h;
}

NOTE: The >> operator behaves as described in the ActionScript documentation, which includes conversion of the inputs to 32-bit integers:
"Operator (bitwise); converts expression1 and expression2 to 32-bit integers, and shifts all of the bits in expression1 to the right by the number of places specified by the integer resulting from the conversion of expression2 . Bits that are shifted to the right are discarded. To preserve the sign of the original expression , the bits on the left are filled in with 0 if the most significant bit (the bit farthest to the left) of expression1 is 0, and filled in with 1 if the most significant bit is 1."

References


No comments have been added.



Your name:

City:

Country:

Your comments:

Security check *
(Please enter the number into adjoining box)