Introduction
A Hash Set is a data structure that stores unique elements in no particular order. Just like a hash map, a hash set uses a hash table for storage, but it only stores keys, not key-value pairs. In this blog post, we’ll discuss the hash set data structure and implement one from scratch in Python.
Hash Set Structure
A hash set stores unique elements in an array, usually referred to as “buckets”. A hash function is used to map each element to an index in this array. Since it only stores keys, you can think of it as a simplified hash map.
Example:
Hash Set:
---------------
| 0 | -> | 1 |
---------------
| 1 | -> | 3 | -> | 9 |
---------------
| 2 | -> | 2 |
---------------
| 3 | -> | 4 |
---------------
| 4 | -> | 8 |
---------------
In the above example, the hash set has 5 buckets. Each bucket can contain zero or more elements, stored as linked lists.
Python Implementation
Let’s implement a basic hash set in Python. Similar to our hash map implementation, we’ll use Python lists and custom linked list nodes.
Node Class
First, we’ll implement a Node class to hold elements.
class Node:
def __init__(self, val):
self.val = val
self.next = None
HashSet Class
Now, let’s build the HashSet
class with essential functionalities.
class HashSet:
def __init__(self, size):
self.size = size
self.set = [None] * self.size
def _hash(self, val):
return hash(val) % self.size
def add(self, val):
val_hash = self._hash(val)
if self.set[val_hash] is None:
self.set[val_hash] = Node(val)
else:
cur_node = self.set[val_hash]
while cur_node:
if cur_node.val == val:
return
if cur_node.next is None:
cur_node.next = Node(val)
return
cur_node = cur_node.next
def contains(self, val):
val_hash = self._hash(val)
if self.set[val_hash] is not None:
cur_node = self.set[val_hash]
while cur_node:
if cur_node.val == val:
return True
cur_node = cur_node.next
return False
def remove(self, val):
val_hash = self._hash(val)
if self.set[val_hash] is None:
return
cur_node = prev_node = self.set[val_hash]
while cur_node:
if cur_node.val == val:
if cur_node == prev_node:
self.set[val_hash] = cur_node.next
else:
prev_node.next = cur_node.next
return
prev_node = cur_node
cur_node = cur_node.next
Method Details:
add
: Adds a new element to the hash set if it doesn’t already exist.contains
: Checks if an element exists in the hash set.remove
: Removes an element from the hash set.
Putting It All Together
Here’s an example demonstrating how to use our HashSet class.
# Create a new HashSet
h = HashSet(5)
# Add elements
h.add(1)
h.add(3)
h.add(2)
# Check existence
print(h.contains(1)) # Output: True
print(h.contains(4)) # Output: False
# Remove an element
h.remove(1)
print(h.contains(1)) # Output: False
Using Python’s In-Built HashSet
Python has a built-in hash set data structure called set
. It’s extremely simple to use and optimized for performance:
# Create a new set
my_set = set()
# Add elements
my_set.add(1)
my_set.add(2)
my_set.add(3)
# Check existence
print(1 in my_set) # Output: True
# Remove an element
my_set.remove(1)
print(1 in my_set) # Output: False
If you find this guide useful, please share with others, and spread the knowledge!