Data Structures in Python

Introduction

Available Data Structures

Sequence types: list, array, tuple, string, ...
Python List vs. Array - when to use?
Hash tables: dict
Additional types in the Python standard library: collections
Types defined in third party libraries: numeric arrays, matrices, ...
Your own types ... (node, route, ...)

Singly Linked List

class ListNode(object):
    """
    Simple implementation of a singly linked list node.
    """
    def __init__(self, data, succ=None):
        self.data = data
        self.succ = succ

# populating the list
head = ListNode(4)
tail = head
for i in range(3, 0, -1):
    tail.succ = ListNode(i)
    tail = tail.succ

# using the list (eg. print its elements)
list_iterator = head
while list_iterator:
    print(list_iterator.data)
    list_iterator = list_iterator.succ

Visualize execution

Doubly Linked List

class ListNode(object):
    """
    Simple implementation of a doubly linked list node.
    """
    def __init__(self, data, pred=None, succ=None):
        """
        `succ`: the current node's successor
        `pred`: the current node's predecessor
        """
        self.data = data
        self.pred = pred
        self.succ = succ

# populating the list
head = ListNode(10)
tail = head
for i in range(9, 0, -1):
    tail.succ = ListNode(i, tail)
    tail = tail.succ

# iterate over the list
list_iterator = head
while list_iterator:
    print(list_iterator.data)
    list_iterator = list_iterator.succ

# iterate from the back
list_iterator = tail
while list_iterator:
    print(list_iterator.data)
    list_iterator = list_iterator.pred

Visualize execution

Binary Tree

class TreeNode(object):
    """
    Simple implementation of a binary tree.
    """
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

# populating the tree
root = TreeNode(10)
root.left = TreeNode(5)
root.right = TreeNode(30)
root.left.left = TreeNode(2)
root.left.right = TreeNode(8)
root.right.left = TreeNode(21)

def print_tree(tree):
    """
    Print the elements of a tree in no specific order.
    """
    if tree:
        print(tree.data)
        print_tree(tree.left)
        print_tree(tree.right)
print_tree(root)

Visualize execution

Representing Graphs

Purpose

Graphs are useful in many operations research (OR) contexts like

Shortest and longest path problems
Maximum network flow
Minimum spanning trees
...
Ulrich Pferschy's course on graph algorithms is highly recommended!

Graph $G = (V, E)$
Consists of vertices $V$ and edges $E$ connecting some of the vertices

flowchart LR
  A(A) <--> B(B)
  A(A) <--> X(X)
  A(A) <--> Y(Y)
  B <--> C(C)
  X <--> C
  X <--> Y

Directed Graph

flowchart LR
  1(1) --> 2(2)
  2 --> 3(3)
  1 --> 3
  3 --> 3
  3 --> 4(4)
  4 --> 5(5)
  5 --> 1

Useful representations differ depending on the required algorithm
and the number of edges in the graph

Dedicated Class

#!/usr/bin/env python3

"""
One of the many ways to represent a graph in Python.
"""

import collections

Vertex = collections.namedtuple("Vertex", ("id", "data"))
Edge = collections.namedtuple("Edge", ("first", "second"))


class Graph(object):
    """
    This graph class uses two sets to store the vertices and edges.

    However, it would also be possible to use an adjacency matrix or adjacency
    list. Depending on the purpose of the graph and the algorithms using the
    graph, these alternative implementations might highly benefit the runtime.
    """
    def __init__(self, vertici, edges):
        self.__vertici = set(vertici)
        self.__edges = set(edges)

    def add_vertex(self, v):
        self.__vertici.add(v)

    def add_edge(self, e):
        self.__vertici.add(e.first)
        self.__vertici.add(e.second)
        self.__edges.add(e)

    def remove_vertex(self, v):
        try:
            self.__vertici.remove(v)
        except KeyError:
            pass
        to_remove = {e for e in self.__edges if v == e.first or v == e.second}
        self.__edges -= to_remove

    def is_adjacent(self, v1, v2):
        # this would be much faster with an adjacency matrix
        for e in self.__edges:
            if e.first == v1 and e.second == v2:
                return True
            if e.first == v2 and e.second == v1:
                return True
        return False

    # many more useful methods...

v1 = Vertex('1', '')
v2 = Vertex('2', '')
v3 = Vertex('3', '')
v4 = Vertex('4', '')
v5 = Vertex('5', '')
g = Graph(tuple(), (Edge(v1, v2), Edge(v1, v3), Edge(v2, v3), Edge(v3, v3),
                    Edge(v3, v4), Edge(v4, v5), Edge(v5, v1)))
print("1->2", g.is_adjacent(v1, v2))
print("1->5", g.is_adjacent(v1, v5))
print("1->4", g.is_adjacent(v1, v4))

Download graph.py

Adjacency Matrix

$n^{2}$ elements where $n$ is the number of vertices

$(\begin{array}{ccccc} 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{array})$

Testing adjacency is $O (1)$
Traversing the neighborhood is $O (n)$
Edge weights can easily be represented

Adjacency Lists

Requires less memory for sparse graphs

graph = {1: [2, 3],
         2: [3],
         3: [3, 4],
         4: [5],
         5: [1]}

Testing adjacency is $O (# neighbors)$
Traversing the neighborhood is $O (# neighbors)$

Questions
and feedback...

Literature

Magnus Lie Hetland Python Algorithms: Mastering Basic Algorithms in the Python Language Apress (2010)

Mark Allen Weiss Data Structures and Algorithm Analysis in C++ Pearson (2013)

Data Structures in Python

Introduction

Available Data Structures

Singly Linked List

Doubly Linked List

Binary Tree

Representing Graphs

Purpose

Directed Graph

Dedicated Class

Adjacency Matrix

Adjacency Lists

Questions and feedback...

Literature

Questions
and feedback...