RED-BLACK TREE SEARCH THE FOLLOWING METHOD IS IN TreeMap.java:
IMPLEMENTATION OF THE HashMap CLASS
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException ("Illegal Initial Capacity: " + initialCapacity);
    if (loadFactor <= 0)
        throw new IllegalArgumentException ("Illegal Load factor: " + loadFactor);
    if (initialCapacity == 0)
        initialCapacity = 1;
    this.loadFactor = loadFactor;
    table = new Entry[initialCapacity];
    threshold = (int)(initialCapacity * loadFactor);
} // constructor
FOR THE containsKey, get, put, AND remove METHODS, THE INITIAL STRATEGY IS THE SAME: HASH key TO index ; SEARCH LINKED LIST AT table [index].
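AS A MINIMAL SKETCH OF THIS SHARED STRATEGY (NOT THE ACTUAL HashMap SOURCE), THE CLASS BELOW HASHES key TO index AND THEN SEARCHES THE LINKED LIST AT table [index]. THE NAMES ChainedTableSketch AND indexFor ARE HYPOTHETICAL, AND null KEYS AND RE-HASHING ARE LEFT OUT.

```java
// Hypothetical chained hash table for illustration only; not the HashMap source.
class ChainedTableSketch {
    static class Entry {
        final int hash;
        final Object key;
        Object value;
        Entry next;

        Entry(int hash, Object key, Object value, Entry next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    Entry[] table = new Entry[11];   // assumed small fixed capacity, never resized
    int count = 0;

    // Hash key to index: clear the sign bit, then reduce modulo table.length.
    int indexFor(Object key) {
        return (key.hashCode() & 0x7FFFFFFF) % table.length;
    }

    // Search the linked list at table[index].
    boolean containsKey(Object key) {
        int hash = key.hashCode();
        for (Entry e = table[indexFor(key)]; e != null; e = e.next)
            if (e.hash == hash && key.equals(e.key))
                return true;
        return false;
    }

    // Same initial strategy; a new entry goes at the front of its list.
    Object put(Object key, Object value) {
        int hash = key.hashCode();
        int index = indexFor(key);
        for (Entry e = table[index]; e != null; e = e.next)
            if (e.hash == hash && key.equals(e.key)) {
                Object oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        table[index] = new Entry(hash, key, value, table[index]);
        count++;
        return null;
    }
}
```

IN THIS SKETCH, THE Integer KEYS 15 AND 26 BOTH HASH TO INDEX 4, SO THEY SHARE ONE LIST.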
TIME ESTIMATES: LET n = count, LET m = table.length. ASSUME THE UNIFORM HASHING ASSUMPTION HOLDS.
THE AVERAGE SIZE OF EACH LIST IS n / m
FOR THE containsKey METHOD, averageTime_S (n, m) ≈ n / (2m) ITERATIONS. BUT n / m <= loadFactor, A CONSTANT (ASSIGNED IN THE CONSTRUCTOR), SO averageTime_S (n, m) < A CONSTANT. THAT IS, averageTime_S (n, m) IS CONSTANT.
FIELDS IN THE HashIterator CLASS:
    Entry[ ] table = HashMap.this.table;
    int index = table.length;     // START AT BACK
    Entry entry = null;
    Entry lastReturned = null;    // where the iterator is
                                  // currently positioned
    int type;                     // 0 for keys, 1 for values, 2 for entries
    private int expectedModCount = modCount;
public boolean hasNext() {
    while (entry == null && index > 0)
        entry = table[--index];
    return entry != null;
} // method hasNext
public Object next( ) {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException( );
    while (entry == null && index > 0)
        entry = table[--index];
    if (entry != null) {
        Entry e = lastReturned = entry;
        entry = e.next;
        return type == KEYS ? e.key : (type == VALUES ? e.value : e);
    } // non-null entry
    throw new NoSuchElementException( );
} // method next
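THE modCount CHECK IN next CAN BE OBSERVED FROM THE OUTSIDE. THE SKETCH BELOW USES THE STANDARD java.util.HashMap; THE CLASS NAME FailFastDemo AND THE METHOD modificationDetected ARE HYPOTHETICAL NAMES CHOSEN FOR THIS EXAMPLE.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; only the java.util calls are real API.
class FailFastDemo {
    // Returns true if next() throws after the map is changed
    // without going through the iterator.
    static boolean modificationDetected() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        Iterator<String> itr = map.keySet().iterator();
        itr.next();          // the map is unchanged since the iterator was created
        map.put("d", 4);     // adding a new key is a structural modification
        try {
            itr.next();      // the iterator notices the modification
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modificationDetected());
    }
}
```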
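[FIGURE: KEYS IN PART OF table; EMPTY LOCATIONS HOLD null]
SUPPOSE remove (new Integer ( )) SETS THE LOCATION AT INDEX 261 TO null. THEN containsKey (new Integer ( )), CALLED FOR A KEY STORED BEYOND INDEX 261, WILL RETURN false BECAUSE THE SEARCH STOPS AT INDEX 261, WHICH HAS null.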
SOLUTION: boolean markedForRemoval; put SETS markedForRemoval TO false; remove SETS markedForRemoval TO true.
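AFTER INITIAL PUTS:
    INDEX   ENTRY
    260     null
    261     markedForRemoval = false
    262     markedForRemoval = false
    263     markedForRemoval = false
    264     markedForRemoval = false
    265     null
AFTER remove ( new Integer ( ));
    INDEX   ENTRY
    260     null
    261     markedForRemoval = true
    262     markedForRemoval = false
    263     markedForRemoval = false
    264     markedForRemoval = false
    265     null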
HERE IS PART OF remove:
    Entry e = table [index];
    if (!e.markedForRemoval && e.hash == hash && key.equals(e.key)) {
        count--;
        Object oldValue = e.value;
        e.markedForRemoval = true;
        return oldValue;
    } // if match
    index = (index + 1) % table.length;    // OFFSET OF 1
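TO SEE THE markedForRemoval IDEA END TO END, HERE IS A MINIMAL SKETCH OF AN OPEN-ADDRESS TABLE WITH AN OFFSET OF 1. THE CLASS NAME TombstoneTableSketch AND THE HELPER find ARE HYPOTHETICAL. THE SKETCH ASSUMES THE TABLE IS NEVER FULL, DOES NOT RE-HASH, AND DOES NOT REUSE MARKED LOCATIONS.

```java
// Hypothetical open-address table for illustration; not the course's source.
class TombstoneTableSketch {
    static class Entry {
        final int hash;
        final Object key;
        Object value;
        boolean markedForRemoval;   // false after put, true after remove

        Entry(int hash, Object key, Object value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    Entry[] table = new Entry[11];  // assumed fixed capacity
    int count = 0;

    int indexFor(int hash) {
        return (hash & 0x7FFFFFFF) % table.length;
    }

    // Index of the live entry for key, or -1. The search stops at null
    // but keeps going past entries that are marked for removal.
    int find(Object key) {
        int hash = key.hashCode();
        int index = indexFor(hash);
        for (int probes = 0; probes < table.length && table[index] != null; probes++) {
            Entry e = table[index];
            if (!e.markedForRemoval && e.hash == hash && key.equals(e.key))
                return index;
            index = (index + 1) % table.length;   // offset of 1
        }
        return -1;
    }

    boolean containsKey(Object key) {
        return find(key) >= 0;
    }

    Object put(Object key, Object value) {
        int i = find(key);
        if (i >= 0) {
            Object oldValue = table[i].value;
            table[i].value = value;
            return oldValue;
        }
        int hash = key.hashCode();
        int index = indexFor(hash);
        while (table[index] != null)              // assumes a null location remains
            index = (index + 1) % table.length;
        table[index] = new Entry(hash, key, value);
        count++;
        return null;
    }

    Object remove(Object key) {
        int i = find(key);
        if (i < 0)
            return null;
        count--;
        table[i].markedForRemoval = true;         // the location stays non-null
        return table[i].value;
    }
}
```

WITH Integer KEYS 15, 26 AND 37 (ALL HASH TO INDEX 4), REMOVING 15 LEAVES 26 AND 37 REACHABLE, BECAUSE INDEX 4 IS MARKED RATHER THAN SET TO null.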
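THIS SOLUTION LEADS TO ANOTHER PROBLEM: SUPPOSE table.length = 1000
    put
    remove
    (put AND remove A TOTAL OF 950 TIMES)
    …
    put (REPEATED 40 TIMES)
count = 40, SO NO NEED TO RE-HASH. BUT EVERY REMOVED ENTRY STILL OCCUPIES ITS LOCATION, SO MOST OF THE 1000 LOCATIONS MAY BE NON-NULL, AND A SEARCH MAY TAKE A LARGE NUMBER OF ITERATIONS. TOO MANY!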
SOLUTION: KEEP TRACK OF REMOVALS:
    private transient int countPlus;    // the value of count + the
                                        // number of removals since
                                        // table.length was last changed

    public Object put(Object key, Object value) {
        ...
        if (countPlus >= threshold)
            rehash( );
        ...
private void rehash( ) {
    …
    // DON'T REHASH AN ENTRY IF IT IS MARKED FOR REMOVAL
    for (int i = 0; i < oldCapacity ; i++)
        if (oldMap [i] != null && !oldMap [i].markedForRemoval)
            put (oldMap [i].key, oldMap [i].value);
    …
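THE INTERPLAY OF count, countPlus AND rehash CAN BE SKETCHED AS FOLLOWS. THIS IS AN ILLUSTRATION UNDER ASSUMPTIONS, NOT THE ELIDED CODE ABOVE: THE CLASS NAME CountPlusTableSketch, THE HELPER find, THE INITIAL CAPACITY OF 11, THE LOAD FACTOR OF 0.75 AND THE NEW CAPACITY OF 2 * oldCapacity + 1 ARE ALL CHOSEN FOR THE EXAMPLE.

```java
// Hypothetical sketch; capacity, load factor and growth rule are assumptions.
class CountPlusTableSketch {
    static class Entry {
        final Object key;
        Object value;
        boolean markedForRemoval;

        Entry(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    Entry[] table = new Entry[11];
    float loadFactor = 0.75f;
    int threshold = (int) (table.length * loadFactor);
    int count = 0;        // live entries
    int countPlus = 0;    // count + removals since table.length last changed

    // Index of the live entry for key, or -1 (offset of 1).
    int find(Object key) {
        int index = (key.hashCode() & 0x7FFFFFFF) % table.length;
        while (table[index] != null) {
            Entry e = table[index];
            if (!e.markedForRemoval && key.equals(e.key))
                return index;
            index = (index + 1) % table.length;
        }
        return -1;
    }

    Object put(Object key, Object value) {
        int i = find(key);
        if (i >= 0) {
            Object oldValue = table[i].value;
            table[i].value = value;
            return oldValue;
        }
        if (countPlus >= threshold)   // counts marked locations too
            rehash();
        int index = (key.hashCode() & 0x7FFFFFFF) % table.length;
        while (table[index] != null)
            index = (index + 1) % table.length;
        table[index] = new Entry(key, value);
        count++;
        countPlus++;
        return null;
    }

    Object remove(Object key) {
        int i = find(key);
        if (i < 0)
            return null;
        table[i].markedForRemoval = true;
        count--;   // countPlus is unchanged: the location is still non-null
        return table[i].value;
    }

    private void rehash() {
        Entry[] oldMap = table;
        int oldCapacity = oldMap.length;
        table = new Entry[oldCapacity * 2 + 1];   // assumed growth rule
        threshold = (int) (table.length * loadFactor);
        count = 0;
        countPlus = 0;
        for (int i = 0; i < oldCapacity; i++)
            if (oldMap[i] != null && !oldMap[i].markedForRemoval)
                put(oldMap[i].key, oldMap[i].value);
    }
}
```

IN THIS SKETCH, EIGHT put/remove PAIRS LEAVE count AT 0 BUT countPlus AT 8, SO THE NEXT put OF A NEW KEY TRIGGERS rehash AND THE MARKED ENTRIES ARE DROPPED.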
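CLUSTER: A SEQUENCE OF NON-EMPTY LOCATIONS
    INDEX   ENTRY
    260     null
    261     markedForRemoval = false
    262     markedForRemoval = false
    263     markedForRemoval = false
    264     markedForRemoval = false
    265     null
KEYS THAT HASH TO 261 FOLLOW THE SAME PATH AS KEYS THAT HASH TO 262, …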
SOLUTION: DOUBLE HASHING, THAT IS, OBTAIN BOTH INDICES AND OFFSETS BY HASHING:
    index = (hash & 0x7FFFFFFF) % table.length;
    offset = (hash & 0x7FFFFFFF) / table.length;
THE PARENTHESES ARE NEEDED BECAUSE, IN JAVA, % AND / HAVE HIGHER PRECEDENCE THAN &.
NOW THE OFFSET DEPENDS ON THE KEY, SO DIFFERENT KEYS WILL USUALLY HAVE DIFFERENT OFFSETS, SO NO MORE PRIMARY CLUSTERING!
TO GET A NEW INDEX: index = (index + offset) % table.length;
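THE THREE FORMULAS CAN BE PACKAGED AS SMALL HELPERS. THIS IS ONLY A SKETCH: THE CLASS NAME DoubleHashFormulas AND THE METHOD NAMES ARE HYPOTHETICAL, AND length STANDS FOR table.length.

```java
// Hypothetical helpers for illustration; length stands for table.length.
class DoubleHashFormulas {
    // Initial index: the mask is parenthesized so that the sign bit
    // is cleared before the remainder is taken.
    static int index(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    // Offset: the quotient of the same non-negative value.
    static int offset(int hash, int length) {
        return (hash & 0x7FFFFFFF) / length;
    }

    // New index after a collision.
    static int nextIndex(int index, int offset, int length) {
        return (index + offset) % length;
    }
}
```

FOR EXAMPLE, WITH length = 11, A hash OF 15 GIVES index 4 AND offset 1, SO THE NEXT INDEX TRIED IS 5.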
EXAMPLE: table.length = 11
[TABLE: key, index, offset]
WHERE WOULD THESE KEYS GO IN table ?
[TABLE: index, key]
PROBLEM: WHAT IF THE OFFSET IS A MULTIPLE OF table.length ?
EXAMPLE: table.length = 11
KEY 15 IS AT INDEX 4 (15 % 11 = 4).
FOR KEY 246, index = 246 % 11 = 4 AND offset = 246 / 11 = 22, SO NEW INDEX = (4 + 22) % 11 = 4. OOPS!
SOLUTION : if (offset % table.length == 0) offset = 1; ON AVERAGE, offset % table.length WILL EQUAL 0 ONLY ONCE IN EVERY table.length TIMES.
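HERE IS A SMALL SKETCH OF INSERTION WITH DOUBLE HASHING AND THE OFFSET FIX. THE CLASS NAME DoubleHashingSketch IS HYPOTHETICAL, THE TABLE STORES int KEYS ONLY, AND THE SKETCH ASSUMES table.length IS PRIME AND THE TABLE IS NEVER FULL.

```java
// Hypothetical keys-only table; assumes a prime length and a free location.
class DoubleHashingSketch {
    Integer[] table;

    DoubleHashingSketch(int length) {
        table = new Integer[length];
    }

    // Stores key (if absent) and returns the index where it resides.
    int put(int key) {
        int hash = Integer.valueOf(key).hashCode();
        int index = (hash & 0x7FFFFFFF) % table.length;
        int offset = (hash & 0x7FFFFFFF) / table.length;
        if (offset % table.length == 0)
            offset = 1;                 // otherwise the index would never change
        while (table[index] != null) {
            if (table[index] == key)    // unboxed comparison: key already present
                return index;
            index = (index + offset) % table.length;
        }
        table[index] = key;
        return index;
    }
}
```

WITH table.length = 11, KEY 15 GOES TO INDEX 4. KEY 246 ALSO STARTS AT INDEX 4 WITH AN OFFSET OF 22, WHICH THE FIX CHANGES TO 1, SO 246 GOES TO INDEX 5.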
PROBLEM: WHAT IF table.length HAS SEVERAL FACTORS?
EXAMPLE: table.length = 20
KEY 30 IS AT INDEX 10 (30 % 20 = 10).
FOR KEY 110, index = 110 % 20 = 10 AND offset = 110 / 20 = 5, SO NEW INDEX = (10 + 5) % 20 = 15, WHICH IS OCCUPIED, SO NEW INDEX = (15 + 5) % 20 = 0, WHICH IS OCCUPIED, SO NEW INDEX = ...
SOLUTION: MAKE table.length A PRIME.
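TO SEE WHY A PRIME LENGTH HELPS, THE SKETCH BELOW COUNTS HOW MANY DISTINCT INDICES A PROBE SEQUENCE VISITS BEFORE IT REPEATS. THE CLASS NAME ProbeCoverage AND THE METHOD distinctIndices ARE HYPOTHETICAL, AND THE LENGTH 23 IS JUST ONE PRIME CHOSEN FOR COMPARISON.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only.
class ProbeCoverage {
    // Number of distinct indices visited by index = (index + offset) % length,
    // starting at start, before an index repeats.
    static int distinctIndices(int start, int offset, int length) {
        Set<Integer> seen = new HashSet<>();
        int index = start;
        while (seen.add(index))          // add returns false on a repeat
            index = (index + offset) % length;
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctIndices(10, 5, 20));   // length with several factors
        System.out.println(distinctIndices(10, 5, 23));   // prime length
    }
}
```

WITH LENGTH 20 AND OFFSET 5, ONLY THE 4 INDICES 10, 15, 0 AND 5 ARE EVER TRIED. WITH THE PRIME LENGTH 23, THE SAME OFFSET REACHES ALL 23 INDICES.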