Implementing a Hash Map Data Structure in Python - Data Structures and Algorithms (DSA) Guide

Introduction

Hash maps, also known as hash tables or dictionaries, are data structures that provide fast access, insertion, and deletion operations. The basic idea behind a hash map is to use a hash function to map keys to indices in an array, where the corresponding values can be stored. In this blog post, we’ll walk through implementing a simple hash map from scratch in Python.

Hash Map Structure

A hash map is made up of two main components:

Keys: Unique identifiers used to locate data.
Values: The actual data associated with keys.

The key-value pairs are stored in an array, known as “buckets,” and a hash function maps keys to appropriate indices in this array.

Example:

Hash Map:
  ---------------
  | 0 | -> | a, 1 |
  ---------------
  | 1 | -> | b, 2 | -> | x, 5 |
  ---------------
  | 2 | -> | c, 3 |
  ---------------
  | 3 | -> | d, 4 |
  ---------------
  | 4 | -> | e, 5 |
  ---------------

In the above representation, the hash map has 5 buckets. Each bucket can contain one or more key-value pairs, connected as linked lists.

Python Implementation

Let’s implement a basic hash map in Python. We’ll use Python lists and custom linked list nodes to implement our hash map.

Node Class

First, we’ll implement a Node class to hold key-value pairs.

class Node:
    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.next = None

HashMap Class

Now, let’s build the HashMap class with essential functionalities.

class HashMap:
    def __init__(self, size):
        self.size = size
        self.map = [None] * self.size

    def _hash(self, key):
        return hash(key) % self.size

    def set(self, key, value):
        key_hash = self._hash(key)
        key_value = [key, value]
        if self.map[key_hash] is None:
            self.map[key_hash] = Node(key, value)
        else:
            cur_node = self.map[key_hash]
            while cur_node:
                if cur_node.key == key:
                    cur_node.val = value
                    return
                if cur_node.next is None:
                    cur_node.next = Node(key, value)
                    return
                cur_node = cur_node.next

    def get(self, key):
        key_hash = self._hash(key)
        if self.map[key_hash] is not None:
            cur_node = self.map[key_hash]
            while cur_node:
                if cur_node.key == key:
                    return cur_node.val
                cur_node = cur_node.next
        return None

    def delete(self, key):
        key_hash = self._hash(key)
        if self.map[key_hash] is None:
            return False
        cur_node = prev_node = self.map[key_hash]
        while cur_node:
            if cur_node.key == key:
                if cur_node == prev_node:
                    self.map[key_hash] = cur_node.next
                else:
                    prev_node.next = cur_node.next
                return True
            prev_node = cur_node
            cur_node = cur_node.next
        return False

Method Details:

set: Adds a new key-value pair to the hash map or updates the value if the key already exists.
get: Retrieves the value for a given key.
delete: Removes a key-value pair from the hash map.
_hash: A simple hash function to calculate the bucket index.

Putting It All Together

Here’s an example of how to use our HashMap class.

# Create a new HashMap
h = HashMap(5)

# Set key-value pairs
h.set("a", 1)
h.set("b", 2)
h.set("c", 3)

# Get values
print(h.get("a"))  # Output: 1
print(h.get("b"))  # Output: 2

# Delete a key-value pair
print(h.delete("a"))  # Output: True
print(h.get("a"))  # Output: None

Using Python’s In-Built Hash Map

Python comes with a built-in hash map implementation in the form of dictionaries. The syntax is extremely user-friendly and the underlying operations are optimized. You can perform all standard operations like inserting a key-value pair, retrieving a value by its key, and deleting a key-value pair.

Here’s a short example:

# Initialize a dictionary
my_dict = {}

# Insert key-value pairs
my_dict['a'] = 1
my_dict['b'] = 2
my_dict['c'] = 3

# Retrieve values
print(my_dict.get('a'))  # Output: 1

# Delete a key-value pair
del my_dict['a']

# Check if key exists
print('a' in my_dict)  # Output: False

If you find this guide useful, please share with others, and spread the knowledge!