Implementing Linked-Based Data Structures

Gregory M. Kapfhammer

March 11, 2024

What is a linked list?

  • Stores a sequential collection of elements
  • Uses individual nodes to store data values
  • Each node has a reference to the next node
  • The first node is called the head and the last is the tail
class ListNode:
    def __init__(self, data, link = None):
        self.data = data
        self.link = link

Implementing a LinkedList

class LinkedList:
    def __init__(self):
        self._head = None

    def addfirst(self, item):
        self._head = ListNode(item, self._head)

    def removefirst(self):
        item = self._head.data
        self._head = self._head.link
        return item
  • A LinkedList contains an instance of ListNode in _head
  • addfirst creates a new ListNode and updates _head
  • removefirst updates _head and returns the data

Making a Queue with a LinkedList

class LinkedList:
    def __init__(self):
        self._head = None

    def addfirst(self, item):
        self._head = ListNode(item, self._head)

    def addlast(self, item):
        if self._head is None:
            self.addfirst(item)
        else:
            currentnode = self._head
            while currentnode.link is not None:
                currentnode = currentnode.link
            currentnode.link = ListNode(item)

    def removefirst(self):
        item = self._head.data
        self._head = self._head.link
        return item

    def removelast(self):
        if self._head.link is None:
            return self.removefirst()
        else:
            currentnode = self._head
            while currentnode.link.link is not None:
                currentnode = currentnode.link
            item = currentnode.link.data
            currentnode.link = None
            return item

Using the LinkedList

LL = LinkedList()
LL.addfirst(3)
LL.addfirst(5)
print(LL.removefirst() == 5)
LL.addlast(9)
LL.addlast(13)
print(LL.removefirst() == 3)
print(LL.removefirst() == 9)
print(LL.removelast() == 13)
True
True
True
True
  • This LinkedList can be used to build a Queue
  • However, two of these methods have while loops!
  • This means it is not always better than the ListQueue

Concerns about this LinkedList?

def addlast(self, item):
    if self._head is None:
        self.addfirst(item)
    else:
        currentnode = self._head
        while currentnode.link is not None:
            currentnode = currentnode.link
        currentnode.link = ListNode(item)
  • The while currentnode.link is not None is not efficient
  • Access to the final node requires traversing the entire list
  • This is a \(O(n)\) (or linear-time) operation!
def removelast(self):
    if self._head.link is None:
        return self.removefirst()
    else:
        currentnode = self._head
        while currentnode.link.link is not None:
            currentnode = currentnode.link
        item = currentnode.link.data
        currentnode.link = None
        return item
  • while currentnode.link.link is not None not efficient
  • removelast traverses all nodes to find second-to-last one

Better Queue with a LinkedList

class LinkedListNew:
    def __init__(self):
        self._head = None
        self._tail = None

    def addfirst(self, item):
        self._head = ListNode(item, self._head)
        if self._tail is None: self._tail = self._head

    def addlast(self, item):
        if self._head is None:
            self.addfirst(item)
        else:
            self._tail.link = ListNode(item)
            self._tail = self._tail.link

    def removefirst(self):
        item = self._head.data
        self._head = self._head.link
        if self._head is None: self._tail = None
        return item

    def removelast(self):
        if self._head is self._tail:
            return self.removefirst()
        else:
            currentnode = self._head
            while currentnode.link is not self._tail:
                currentnode = currentnode.link
            item = self._tail.data
            self._tail = currentnode
            self._tail.link = None
            return item

Using the new LinkedList

LL = LinkedListNew()
LL.addfirst(3)
LL.addfirst(5)
print(LL.removefirst() == 5)
LL.addlast(9)
LL.addlast(13)
print(LL.removefirst() == 3)
print(LL.removefirst() == 9)
print(LL.removelast() == 13)
True
True
True
True
  • This LinkedList can be used to build a Queue
  • However, one of these methods still has a while loop!
  • This means it is not always better than the ListQueue

Concerns about new LinkedList?

def addlast(self, item):
    if self._head is None:
        self.addfirst(item)
    else:
        self._tail.link = ListNode(item)
        self._tail = self._tail.link
  • There is no longer a while loop in addlast
  • When there is no data, addfirst is called
  • Otherwise, self._tail is updated and new node is added
  • Adding a node to the end of the list is now \(O(1)\)!
def removelast(self):
    if self._head is self._tail:
        return self.removefirst()
    else:
        currentnode = self._head
        while currentnode.link is not self._tail:
            currentnode = currentnode.link
        item = self._tail.data
        self._tail = currentnode
        self._tail.link = None
        return item
  • while currentnode.link is not self._tail not efficient
  • removelast still traverses nodes to perform a removal!

Tracking the size of a LinkedList

class LinkedListPrime:
    def __init__(self):
        self._head = None
        self._tail = None
        self._length = 0

    def addfirst(self, item):
        self._head = ListNode(item, self._head)
        if self._tail is None: self._tail = self._head
        self._length += 1

    def addlast(self, item):
        if self._head is None:
            self.addfirst(item)
        else:
            self._tail.link = ListNode(item)
            self._tail = self._tail.link
            self._length += 1

    def removefirst(self):
        item = self._head.data
        self._head = self._head.link
        if self._head is None: self._tail = None
        self._length -= 1
        return item

    def removelast(self):
        if self._head is self._tail:
            return self.removefirst()
        else:
            currentnode = self._head
            while currentnode.link is not self._tail:
                currentnode = currentnode.link
            item = self._tail.data
            self._tail = currentnode
            self._tail.link = None
            self._length -= 1
            return item

    def __len__(self):
        return self._length

How do methods influence size?

__init__

  • Initializes the size to zero
  • self._length = 0 is a \(O(1)\) operation

addfirst and addlast

  • Both methods increment the size by one
  • self._length += 1 is a \(O(1)\) operation

removefirst and removelast

  • Both methods decrement the size by one
  • self._length -= 1 is a \(O(1)\) operation

Using improved LinkedList

LL = LinkedListPrime()
LL.addfirst(3)
LL.addfirst(5)
print(LL.removefirst() == 5)
LL.addlast(9)
LL.addlast(13)
print(LL.removefirst() == 3)
print(LL.removefirst() == 9)
print(LL.removelast() == 13)
print(len(LL) == 0)
True
True
True
True
True
  • LinkedListPrime has features equivalent to LinkedListNew
  • Represents an improved building block for a Queue

LinkedQueue with a LinkedList

class LinkedQueue:
    def __init__(self):
        self._L = LinkedListPrime()

    def enqueue(self, item):
        self._L.addlast(item)

    def dequeue(self):
        return self._L.removefirst()

    def peek(self):
        item = self._L.removefirst()
        self._L.addfirst(item)
        return item

    def display(self):
        current = self._L._head
        while current is not None:
            print(current.data, end=" ")
            current = current.link

    def __len__(self):
        return len(self._L)

    def isempty(self):
        return len(self) == 0

Using the LinkedQueue

def manipulate_queue():
    queue = LinkedQueue()
    queue.enqueue('umbrella')
    queue.enqueue('backpack')
    queue.enqueue('sandals')
    print("Queue contents after enqueue operations:", end=" ")
    queue.display()
    queue.dequeue()
    print("\nQueue contents after dequeue operation:", end=" ")
    queue.display()

manipulate_queue()
Queue contents after enqueue operations: umbrella backpack sandals 
Queue contents after dequeue operation: backpack sandals 
  • LinkedQueue is functionally equivalent to the ListQueue
  • The LinkedQueue uses display to reveal the contents

To achieve performance improvements, we need a doubly linked list!

  • Stores a sequential collection of elements
  • Uses individual nodes to store data values
  • Each node has a reference to the next and previous node
  • Enable list traversal in forward and backward directions

Creating a new ListNode

class ListNode:
    def __init__(self, data, prev = None, link = None):
        self.data = data
        self.prev = prev
        self.link = link
        if prev is not None:
            self.prev.link = self
        if link is not None:
            self.link.prev = self
  • self.data stores the data value inside of the ListNode
  • self.prev stores a reference to the previous ListNode
  • self.link stores a reference to the next ListNode
  • Key invariant for the ListNode: for any two nodes a and b, it must be true that b == a.link if and only if a = b.prev

Attempting the DoublyLinkedList

class DoublyLinkedList:
    def __init__(self):
        self._head = None
        self._tail = None
        self._length = 0

    def addfirst(self, item):
        if len(self) == 0:
            self._head = self._tail = ListNode(item, None, None)
        else:
            newnode = ListNode(item, None, self._head)
            self._head.prev = newnode
            self._head = newnode
        self._length += 1

    def addlast(self, item):
        if len(self) == 0:
            self._head = self._tail = ListNode(item, None, None)
        else:
            newnode = ListNode(item, self._tail, None)
            self._tail.link = newnode
            self._tail = newnode
        self._length += 1

    def __len__(self):
        return self._length

Wait, code redundancy suggests an opportunity to refactor the source code in addfirst and addlast!

def _addbetween(self, item, before, after):
    node = ListNode(item, before, after)
    if after is self._head:
        self._head = node
    if before is self._tail:
        self._tail = node
    self._length += 1
  • Define an internal method called _addbetween
  • The addfirst and addlast methods call _addbetween
  • Importantly, this method is an efficient \(O(1)\) operation!

Refactoring the DoublyLinkedList

class DoublyLinkedList:
    def __init__(self):
        self._head = None
        self._tail = None
        self._length = 0

    def __len__(self):
        return self._length

    def _addbetween(self, item, before, after):
        node = ListNode(item, before, after)
        if after is self._head:
            self._head = node
        if before is self._tail:
            self._tail = node
        self._length += 1

    def addfirst(self, item):
        self._addbetween(item, None, self._head)

    def addlast(self, item):
        self._addbetween(item, self._tail, None)

    def _remove(self, node):
        before, after = node.prev, node.link
        if node is self._head:
            self._head = after
        else:
            before.link = after
        if node is self._tail:
            self._tail = before
        else:
            after.prev = before
        self._length -= 1
        return node.data

    def removefirst(self):
        return self._remove(self._head)

    def removelast(self):
        return self._remove(self._tail)

    def display(self):
        current = self._head
        while current is not None:
            print(current.data)
            current = current.next

Using the DoublyLinkedList

def manipulate_doubly_linked_list():
    DLL = DoublyLinkedList()
    print(len(DLL))
    DLL.addfirst(3)
    DLL.addfirst(5)
    print(DLL.removefirst() == 5)
    DLL.addlast(9)
    DLL.addlast(13)
    print(DLL.removefirst() == 3)
    print(DLL.removefirst() == 9)
    print(DLL.removefirst() == 13)
    print(len(DLL))

manipulate_doubly_linked_list()
0
True
True
True
True
0

How do we join together structures like List or DoublyLinkedList?

A = [1,2,3]
B = [4,5,6]
C = A + B
print(A)
print(B)
print(C)
[1, 2, 3]
[4, 5, 6]
[1, 2, 3, 4, 5, 6]
  • The line C = A + B concatenates the two lists together
  • Python’s implementation of + creates a new list
  • Takes time proportional length of C without modifying A or B

If modification of DoublyLinkedList permitted, efficient concatenation!

def __iadd__(self, other):
    if other._head is not None:
        if self._head is None:
            self._head = other._head
        else:
            self._tail.link = other._head
            other._head.prev = self._tail
        self._tail = other._tail
        self._length = self._length + other._length
        other.__init__()
    return self
  • The internal __iadd__ method is called when += is used
  • The other DoublyLinkedList is concatenated to self

Concatenation for DoublyLinkedList

L = DoublyLinkedList()
[L.addlast(i) for i in range(11)]
B = DoublyLinkedList()
[B.addlast(i+11) for i in range(10)]

L += B

n = L._head
while n is not None:
    print(n.data, end = ' ')
    n = n.link
  • The DoublyLinkedList L is concatenated with B
  • Using += calls the __iadd__ method in DoublyLinkedList
  • The B list is drained of values when it is the other parameter

How can we now implement a Deque with a DoublyLinkedList?

  • A Deque, or a double-ended queue, will contain an instance of the DoublyLinkedList as was done previously
  • Enables efficient addition and removal at both Deque ends
  • Hooray, all methods in this Deque are \(O(1)\) operations!

Insights about the design and implementation of data structures

Design Considerations

  • One structure can be used to build another
  • A structure’s design can influence its performance
  • Fundamental limitations arise with certain designs

Implementation Trade-offs

  • Refactoring can improve a structure’s implementation
  • Assumptions about modification influence performance
  • Overloading operator built-ins enhances convenience