In regular arrays, we access an element through its index. The index, however, is always an integer. We want to store some details about a student, for example, name, age, and GPA. We can keep this information in an array, i.e., index 0 will have the name, age at index 1, and GPA at index 2. While this implementation fulfills the purpose of storing the data, it does not have good readability; we need to remember at which indices we have placed a specific value.
Since there are only three attributes in the scenario above, it is still manageable. But, what if we store additional information, such as the current semester, major, and term GPA? It will become very inconvenient to manage and work with such an array now.
What is an associative array?
A better approach would be to change the regular index to a named index. For example, array[name]
to store the name, array[age]
to store the age, and so on. These types of arrays are known as associative arrays. Associative arrays are arrays whose index can be of any data type, such as integer, float, or string, etc. While regular arrays only contain values, associative arrays have (key, value) pairs, where the key is the named index, and it is unique.
Implementation of an associative array
There are many ways to implement associative arrays, such as using hash tables and search trees. In this article, we will use hash tables. The key will be passed to a hash function that will generate a hash code, which will provide us an index into an array of buckets.
To handle collisions, i.e., when multiple keys get mapped to the same index, we use the separate chaining method. Each item in an array points to a linked list that contains (key, value) pairs. These pairs have the same hash code. Consider the following figure for illustration.
Functionalities
An associative array contains the following functionalities:
- Insert: It takes a key and a value. If the key already exists, then it updates its value. Otherwise, it inserts a new pair.
- Delete: It takes a key to remove the corresponding (key, value) pair. It also returns the value of a deleted item. If the key does not exist, It will throw an error.
- Lookup: It takes a key and returns its value. If no key is found, then it throws an error.
- Traverse: prints all the (key, value) pairs
Code
The implementation of associative arrays is given below.
class Node: def __init__(self, key=None, value=None): self.key = key self.value = value self.next = None # implementation of linked list for the hashtable class LinkedList: def __init__(self): self.head = None def insert(self, key, value): node = Node(key, value) temp = self.head if temp == None: # if the list is empty self.head = node else: while temp.next: # iterate through the list till last node is found temp = temp.next temp.next = node def search(self, key): temp = self.head while temp: if temp.key == key: return temp.value else: temp = temp.next return None def update(self, key, value): temp = self.head while temp: if temp.key == key: temp.value = value return else: temp = temp.next raise KeyError(f"Key {key} does not exist") def delete(self, key): if self.head: # check if the list is not empty if self.head.key == key: # if the node to be deleted is the first node value = self.head.value self.head = self.head.next return value else: temp = self.head while temp.next: if temp.next.key == key: value = temp.next.value temp.next = temp.next.next return value else: temp = temp.next raise KeyError(f"Key {key} does not exist") def traverse(self): # traverse through the list and print (key, value) pairs temp = self.head while temp: print(temp.key, temp.value) temp = temp.next class AssociativeArray(object): def __init__(self): self.capacity = 10 # capacity of the bucket self.length = 0 # capacity of the bucket self.buckets = [LinkedList() for i in range(0, self.capacity)] def getHashValue(self, key): # get the bucket index return hash(key) % self.capacity def len(self): return self.length # if the element exist then update it, else insert a new item def insert(self, key, value): bucket_index = self.getHashValue(key) # get the index bucket = self.buckets[bucket_index] exists = bucket.search(key) if exists: bucket.update(key, value) else: bucket.insert(key, value) self.length = 1 def lookup(self, key): bucket_index = self.getHashValue(key) bucket = self.buckets[bucket_index] value = bucket.search(key) if value: return value else: raise KeyError(f"Key {key} does not exist") def delete(self, key): bucket_index = self.getHashValue(key) bucket = self.buckets[bucket_index] value = bucket.delete(key) self.length -= 1 return value def getAll(self): for bucket in self.buckets: bucket.traverse() dictt = AssociativeArray() dictt.insert("name", "ashton") # add a new item dictt.insert("age", 20) # add a new item dictt.insert("gpa", 3.74) # add a new item dictt.insert("name", "agar") # update print("Total Length:", dictt.len()) value = dictt.lookup("gpa") # searching value of gpa print("gpa:", value) value = dictt.delete("gpa") print("Deleted item value:", value) print("Print all the (key, value) pairs") dictt.getAll()
Output
Total Length: 3
gpa: 3.74
Deleted item value: 3.74
Print all the (key, value) pairs
name agar
age 20