Problem
I’ve got a task to implement custom hash map with only get, put, size functions without remove and implementing base Map interface in Java. I’ve made one but not sure if it is good one.
Can You check it out please and maybe give some tips? https://github.com/AlistratenkoNikita/CustomHashMap
package com.alistratenko.customMap;
/**
* An interface for custom hashmap with open addressing
* An object that maps int keys to long values.
*
* @author Nikita
* @version 1.0
* @since 26.08.2017
*/
public interface CustomMap {
/**
* Returns the value to which the specified key is mapped, or 0 if this map contains no mapping for the key.
*
* @param key the key whose associated value is to be returned
* @return the value to which the specified key is mapped, or null if this map contains no mapping for the key
*/
long get(int key);
/**
* Associates the specified value with the specified key in this map. If the map previously contained a
* mapping for the key, the new value will be put in free spot in the array which will be found with linear probing.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
*/
void put(int key, long value);
/**
* Returns the number of key-value mappings in this map.
*
* @return the number of key-value mappings in this map
*/
int size();
}
package com.alistratenko.customMap;
/**
* Base implementation of CustomMap interface
*
* @author Nikita
* @version 1.0
* @since 26.08.2017
*/
public class CustomMapOpenAddressing implements CustomMap {
private Pair[] array;
private static final int DEFAULT_CAPACITY = 8;
private int size;
public CustomMapOpenAddressing() {
array = new Pair[DEFAULT_CAPACITY];
size = 0;
}
/**
* {@inheritDoc}
*/
public long get(int key) {
return array[hash(key)] != null ? array[hash(key)].getValue() : 0;
}
/**
* {@inheritDoc}
*/
public void put(int key, long value) {
//the index of the array cell to store the value
int index = hash(key);
//if the standard index of the array cell is busy linear probing takes place
if (array[index] != null) {
int tryNumber = 1;
do {
index = linearProbing(index, tryNumber);
} while (array[index] != null);
}
array[index] = new Pair(key, value);
size++;
//rehashes the map if the number of elements crosses 0.75 times the capacity
if (array.length * 0.75 < size) {
rehashing();
}
}
/**
* {@inheritDoc}
*/
public int size() {
return size;
}
/**
* Returns the hashcode(array index) for a key in this array
*
* @param key a value to get hashcode for
* @return the hashcode(array index) for a key in this array
*/
private int hash(int key) {
return Integer.hashCode(key) % array.length;
}
/**
* Rehashes the {@link CustomMap}, increases its capacity in 1.5 times and copies elements from old array to a new one
*/
private void rehashing() {
int newCapacity = (int) (this.array.length * 1.5);
Pair[] oldArr = this.array;
this.array = new Pair[newCapacity];
//setting number of the elements in the new array to zero
//the value will be restored after copying all elements from old array to the new one
size = 0;
for (Pair pair : oldArr) {
if (pair != null) {
put(pair.getKey(), pair.getValue());
}
}
}
/**
* Looks up array cell one by one
*
* @param hash the hash of the value
* @param i the number of the linear probing element
* @return the index of the array cell
*/
private int linearProbing(int hash, int i) {
return (hash + i) % array.length;
}
/**
* Represents a pair of key-value set for a hashmap
*/
protected class Pair {
private final int key;
private long value;
Pair(int key, long value) {
this.key = key;
this.value = value;
}
int getKey() {
return key;
}
long getValue() {
return value;
}
@Override
public String toString() {
return "Pair{" +
"key=" + key +
", value=" + value +
'}';
}
}
}
public class MapTest {
private CustomMap map;
@Before
public void init() {
map = new CustomMapOpenAddressing();
}
@Test
public void testMapForCasualInput() {
map.put(0, 0);
map.put(1, 1);
map.put(2, 2);
assertEquals(3, map.size());
}
@Test
public void testMapForDuplicates1() {
map.put(0, 0);
map.put(8, 8);
assertEquals(map.size(), 2);
}
@Test
public void testMapForDuplicates2() {
map.put(0, 0);
map.put(1, 1);
map.put(8, 8);
assertEquals(map.size(), 3);
assertEquals(map.get(0), 0);
assertEquals(map.get(1), 1);
assertEquals(map.get(8), 0);
}
@Test
public void testMapForEmptyValues() {
map.put(0, 0);
assertTrue(map.get(1) == 0);
}
@Test
public void testMapForRehashing() {
for (int i = 0; i < 100_000; i++) {
map.put(i, i);
}
assertEquals(100_000, map.size());
}
}
Solution
Code review
Advice 1
package com.alistratenko.customMap;
: According to general Java conventions, the package names should not use uppercase characters. Consider renaming to, say, package com.alistratenko.custom_map;
.
Advice 2
public long get(int key) {
return array[hash(key)] != null ? array[hash(key)].getValue() : 0;
}
Here, you return a long
0 whenever the key
is not present in the map. Basically, that arrangement may confuse the possible user since if they get 0, they are not quite sure which is true: 1) the key was explicitly mapped to zero, 2) the key is not present. In order to circumvent, I would throw an exception whenever the key is not in the map, and would not rely on returning a sentinel value.
Advice 3
class CustomMapOpenAddressing
; I would rephrase the class name a bit, how about OpenAddressingCustomMap
?
Advice 4
private int hash(int key) {
return Integer.hashCode(key) % array.length;
}
You can write:
private int hash(int key) {
return key % array.length;
}
since Integer.hashCode
returns its argument unmodified.
Speaking of which, your implementation will throw ArrayIndexOutOfBoundsException
on most negative keys. Just put the key into Math.abs
and you are ready to go. However, as Roland Illig pointed out, the Math.abs
of Integer.MIN_VALUE
is negative, so we have to deal with this corner case:
private int hash(int key) {
// Math.abs of Integer.MIN_VALUE < 0.
if (key == Integer.MIN_VALUE) {
return 0;
}
return Math.abs(key) % array.length;
}
Advice 5
protected class Pair
: redeclare as private static final class Pair
. The problem here is the fact that each Pair
will contain an explicity reference to the outer object. static
removes those unnecessary references and helps to save memory a little bit. Also, I believe Entry
is a more conventional name.
First of all, it’s good to see some documentation and tests! I don’t use Java a lot so I won’t say much about that specifically, but let’s have a look:
- The documentation of an interface should describe what implementations need to do, not how they need to do it. Things like ‘open addressing’ and ‘linear probing’ are implementation details that belong to the
CustomMapOpenAddressing
class, they should not be mentioned in theCustomMap
documentation. - It would be useful to see the reasoning behind certain decisions, like when to resize (you’ve chosen 75% capacity) or how to resolve collisions (you’ve chosen open addressing and linear probing).
- Some method names aren’t very clear:
- the main purpose of
rehashing
is to resize the hash map, so I’d call itresize
. - Likewise, I’d rename
testMapForRehashing
totestMapForResizing
. hash
is used to determine an entry’s index, so I’d probably call thatgetIndex
instead.testMapForDuplicates1
is testing for hash collisions, not duplicate keys or values, so I’d call ittestMapForHashCollisions
.
- the main purpose of
- The
linearProbing
method doesn’t actually do any probing, the loop that calls it does. I’d refactor this into agetFirstEmptyIndex(int desiredIndex)
method. - coderodde already said this, but returning 0 if a key doesn’t exist is probably not a good idea: it’s a valid value, so calling code cannot distinguish between ‘key does not exist’ and ‘key is mapped to 0’. Throwing an exception and/or providing a
containsKey
method would solve that problem. - There’s a bug in
get
: it doesn’t take the open addressing/probing into account. This means that hash collisions cause wrong results. ThetestMapForDuplicates2
test method actually triggers this bug, but the test is also broken, because it expectsmap.get(8)
to return0
, while it clearly calledmap.put(8, 8)
. - Besides a broken test, some of the other tests can also be improved:
- The documentation of
put
specifically mentions that calling it again for the same key will overwrite the existing value, but there is no test for that. - The resize test does not verify that
get
still returns the correct value for each key – just checking the size is not enough. - All tests always use the same value for both key and value. This means that a bug that mixes up keys and values won’t be detected.
- I would remove the
testMapForDuplicates1
test, becausetestMapForDuplicates2
does the same thing, and more.
- The documentation of