A Sparse Array is an array that contains mostly zero values. On the contrary, the majority of the elements are non-zero in the dense array. One question that might come to your mind that why sparse arrays are discussed separately and not treated like regular arrays? The thing is, most of the elements are zero in a sparse array, and they won’t be of any use when we work with them or perform operations on them.
Moreover, storing zero-values will waste memory space as well. For example, if we have a matrix of 20 by 20, and only 50 values are non-zero; storing the complete 2D array will waste space and consume more time as well. Therefore, some other strategy needs to be employed that represents such data efficiently. Many methods exist for the implementation of sparse arrays. We will talk about three such ways in this article.
Representation using Arrays.
The main catch here is that we only need to store the non-zero values. Therefore, one of the implementations is to create a 2D array, in which each internal array contains three columns. The first column has the row index, the second column contains the column index, and the third one stores the value. The size of this 2D would be a number of non-zero values×3.
Taking the above example, we will have a 50 by three array now, instead of a 20 by 20. You can see here that this method is relatively more efficient.
Consider the code given below.
sparseArray = [ [1, 0, 3, 0, 0], [0, 0, 1, 0, 0], [2, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 5, 0, 0, 9], ] rows = len(sparseArray) cols = len(sparseArray[0]) #count number of non-zero values count = 0 compactArray = [] for i in range(0, rows): for j in range(0, cols): if sparseArray[i][j]!=0: compactArray.append([i, j, sparseArray[i][j]]) count += 1 #traverse the array print("row", "col", "val") for i in range(0, len(compactArray)): print(compactArray[i][0], "\t", compactArray[i][1], "\t", compactArray[i][2]) #sparsity size = rows*cols sparsity = (size-count)/size print("Sparsity", sparsity)
Output
row col val
0 0 1
0 2 3
1 2 1
2 0 2
4 1 5
4 4 9
Sparsity 0.76
Representation using List of Lists
This is another representation of sparse Arrays. As the name suggests, it includes a list of lists (Linked Lists). A node in the outer linked list contains the index of a row and an inner list. It also points to the next row node. Moreover, a node in a sublist has a column index, a value at that row and column, and a pointer to the next node. Consider the figure below that illustrates the above concept.
Let’s go ahead and see its implementation.
class RowNode(object): def __init__(self, index=None, nextRowNode=None, colNode=None): self.index=index self.nextRowNode=nextRowNode self.colNode=colNode class ColumnNode(object): def __init__(self, index=None, value=None, nextNode=None ): self.index=index self.value=value self.nextNode=nextNode def createColNodes(sparseArray, j, curRowNode): curColNode = ColumnNode(j, sparseArray[curRowNode.index][j]) #create the current inner node rightHead = curRowNode.colNode if (rightHead==None): #if it is the first node in the innerList curRowNode.colNode = curColNode else: while (rightHead.nextNode): rightHead = rightHead.nextNode rightHead.nextNode=curColNode def createRowNodes(sparseArray, m, n): #m rows and n columns in the sparse Array global head for i in range(0, m): curRowNode = RowNode(i) #creating a row node with the current row index if i==0: #first row node head = curRowNode #set the head else: tempRowNode = head while (tempRowNode.nextRowNode): tempRowNode = tempRowNode.nextRowNode tempRowNode.nextRowNode = curRowNode #add the value to the inner node for j in range(0, n): if sparseArray[i][j]!=0: createColNodes(sparseArray, j, curRowNode) rows = len(sparseArray) cols = len(sparseArray[0]) head = RowNode() #initial start node for row index 0 createRowNodes(sparseArray, rows,cols)
In the above code, class RowNode
is responsible for creating a node in the outer linked list, and class ColumnNode
creates a node for the inner linked list. The head variable refers to the first node in the outer rows list, i.e., its index value will be 0
. As you can observe in the createRowNodes()
method, we iterate through the sparseArray
row by row. First, we create a row node and add it at the appropriate place of the outer linked list, i.e., iterate through the list until we find the last node. Then, we iterate through columns in that row and find non-zero values. The non-zero value, column index, and row node gets passed to the createColNodes()
method, where we create the inner node and add it to the sub list.
Let’s now traverse through this list of lists.
def traverse(): global head if (head.index==None): print("Empty Sparse Array") return; st = head print("row", "col", "val") while (st): col = st.colNode while(col): print(st.index, "\t", col.index, "\t", col.value) col = col.nextNode st = st.nextRowNode
Output
row col val
0 0 1
0 2 3
1 2 1
2 0 2
4 1 5
4 4 9
Representation using Dictionary of Keys
In this representation, we use a dictionary to store the sparse array. The key is a row and column pair, i.e., (row, column), and the value is the data stored at the position.
Consider the following code below.
sparseArray = [ [1, 0, 3, 0, 0], [0, 0, 1, 0, 0], [2, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 5, 0, 0, 9], ] rows = len(sparseArray) cols = len(sparseArray[0]) dok = {} #dictionay of keys for i in range(0, rows): for j in range(0, cols): if sparseArray[i][j]!=0: dok[(i, j)] = sparseArray[i][j] #add the value to the dictionary for key, value in dok.items(): #iterate through the dictionary print("Key:", key,"Value:", value)
Output
Key: (0, 0) Value: 1
Key: (0, 2) Value: 3
Key: (1, 2) Value: 1
Key: (2, 0) Value: 2
Key: (4, 1) Value: 5
Key: (4, 4) Value: 9