<< Chapter < Page Chapter >> Page >

linear probing 

in which the interval between probes is fixed--often at 1.

quadratic probing 

in which the interval between probes increases linearly (hence, the indices are described by a quadratic function).

double hashing 

in which the interval between probes is fixed for each record but is computed by another hash function.

The main tradeoffs between these methods are that linear probing has the best cache performance but is most sensitive to clustering, while double hashing has poor cache performance but exhibits virtually no clustering; quadratic probing falls in-between in both areas. Double hashing can also require more computation than other forms of probing. Some open addressing methods, such as last-come-first-served hashing and cuckoo hashing move existing keys around in the array to make room for the new key. This gives better maximum search times than the methods based on probing.

A critical influence on performance of an open addressing hash table is the load factor; that is, the proportion of the slots in the array that are used. As the load factor increases towards 100%, the number of probes that may be required to find or insert a given key rises dramatically. Once the table becomes full, probing algorithms may even fail to terminate. Even with good hash functions, load factors are normally limited to 80%. A poor hash function can exhibit poor performance even at very low load factors by generating significant clustering. What causes hash functions to cluster is not well understood, and it is easy to unintentionally write a hash function which causes severe clustering.

Example pseudocode

The following pseudocode is an implementation of an open addressing hash table with linear probing and single-slot stepping, a common approach that is effective if the hash function is good. Each of the lookup, set and remove functions use a common internal function findSlot to locate the array slot that either does or should contain a given key.

record pair { key, value }

var pair array slot[0..num_slots-1]

function find_slot(key)

i := hash(key) modulo num_slots

// search until we either find the key, or find an empty slot.

while ( (slot[i] is occupied) and ( slot[i].key ≠ key ) ) do

i := (i + 1) modulo num_slots

repeat

return i

function lookup(key)

i := find_slot(key)

if slot[i] is occupied // key is in table

return slot[i].value

else // key is not in table

return not found

function set(key, value)

i := find_slot(key)

if slot[i] is occupied

slot[i].value := value

else

if the table is almost full

rebuild the table larger (note 1)

i := find_slot(key)

slot[i].key := key

slot[i].value := value

Another example showing open addressing technique. Presented function is converting each part(4) of an internet protocol address, where NOT is bitwise NOT, XOR is bitwise XOR, OR is bitwise OR, AND is bitwise AND and<<and>>are shift-left and shift-right:

// key_1,key_2,key_3,key_4 are following 3-digit numbers - parts of ip address xxx.xxx.xxx.xxx

function ip(key parts)

j := 1

do

key := (key_2<<2)

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Data structures and algorithms. OpenStax CNX. Jul 29, 2009 Download for free at http://cnx.org/content/col10765/1.1
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Data structures and algorithms' conversation and receive update notifications?

Ask