Graph Structures

Gregory M. Kapfhammer

April 22, 2024

What is a graph? Why is it a useful structure?

  • Mathematical object often used in computer science

  • A graph is a pair \((V, E)\) where \(V\) is any set of entities and \(E\) is a set of pairs of elements of \(V\)

  • \(V\) is called the vertex set and \(E\) is the edge set

Any examples of data we could store in a graph?

Representing a primitive graph

graph_one = {
    1: [2, 3, 4], 2: [], 3: [], 4: []
}

def add_edge(graph, node1, node2):
    if node1 in graph:
        graph[node1].append(node2)
    else:
        graph[node1] = [node2]

add_edge(graph_one, 2, 3)
print(graph_one)
{1: [2, 3, 4], 2: [3], 3: [], 4: []}

graph_two = ({1,2,3,4}, {(1,2), (1,3), (1,4)})
print(graph_two)
({1, 2, 3, 4}, {(1, 2), (1, 3), (1, 4)})

Methods for a simple directed graph

  • __init__(V, E): Initialize a new graph with vertex set V and edge set E
  • vertices(): Return an iterable collection of the vertices
  • edges(): Return an iterable collection of the edges
  • addvertex(v): Add vertex to graph that is identified with the v object
  • addedge(u, v): Add edge to graph between vertices with keys u and v
  • removeedge(u,v) : Remove the edge u,v from the graph
  • __contains__(v) : Return True if vertex v in graph; False otherwise
  • hasedge(u,v) : Return True if edge (u,v) in graph; False otherwise.
  • nbrs(v) : Return an iterable collection of (out)neighbors of v, i.e., those vertices w such that (v, w) is an edge. (For directed graphs, this is out-neighbors.)
  • __len__() : Return the number of vertices in the graph.

Defining the EdgeSetGraph class

class EdgeSetGraph:
    def __init__(self, V = (), E = ()):
        self._V = set()
        self._E = set()
        for v in V: self.addvertex(v)
        for u,v in E: self.addedge(u,v)

    def vertices(self):
        return iter(self._V)

    def edges(self):
        return iter(self._E)

    def addvertex(self, v):
        self._V.add(v)

    def addedge(self, u, v):
        self._E.add((u,v))

    def removeedge(self, u, v):
        self._E.remove((u,v))

    def __contains__(self, v):
        return v in self._V

    def hasedge(self, u, v):
        return (u,v) in self._E

    def nbrs(self, v):
        return (w for u,w in self._E if u == v)

    def __len__(self):
        return len(self._V)
  • Represents a graph using a set of edges between vertices

Using an EdgeSetGraph instance

graph = EdgeSetGraph()
graph.addvertex('A')
graph.addvertex('B')
graph.addvertex('C')
graph.addedge('A', 'B')
graph.addedge('B', 'C')

print('A' in graph)             # Output: True
print(graph.hasedge('A', 'B'))  # Output: True
print(list(graph.nbrs('B')))    # Output: ['C']
graph.removeedge('A', 'B')      # Remove edge
print(graph.hasedge('A', 'B'))  # Output: False
print(len(graph))               # Output: 3
True
True
['C']
False
3
  • How to create an un-directed version of the EdgeSetGraph?

UndirectedEdgeSetGraph class

class UndirectedEdgeSetGraph(EdgeSetGraph):
    def addedge(self, u, v):
        self._E.add(frozenset({u,v}))

    def removeedge(self, u, v):
        self._E.remove(frozenset({u,v}))

    def nbrs(self, v):
        for u, w in self._E:
            if u == v:
                yield w
            elif w == v:
                yield u
  • UnidirectedEdgeSetGraph subclass of EdgeSetGraph
  • Standard sets in Python do not support storing a set
  • The frozenset is an immutable version of the set

Problems with the EdgeSetGraph class

  • The EdgeSetGraph class is not efficient for large graphs
  • Very slow to enumerate all edges incident to a vertex!

Key insight: To find the vertices that are incident to a given vertex, it should not be necessary to go through all the edges! Alternatively, can we store a set of neighbors for each vertex?

Creating AdjacencySetGraph

class AdjacencySetGraph:
    def __init__(self, V = (), E = ()):
        self._V = set()
        self._nbrs = {}
        for v in V: self.addvertex(v)
        for e in E: self.addedge(*e)

    def vertices(self):
        return iter(self._V)

    def edges(self):
        for u in self._V:
            for v in self.nbrs(u):
                yield (u,v)

    def addvertex(self, v):
        self._V.add(v)
        self._nbrs[v] = set()

    def addedge(self, u, v):
        self._nbrs[u].add(v)

    def removeedge(self, u, v):
        self._nbrs[u].remove(v)

    def __contains__(self, v):
        return v in self._nbrs

    def nbrs(self, v):
        return iter(self._nbrs[v])

    def __len__(self):
      return len(self._nbrs)
  • Now nbrs works with return iter(self._nbrs[v])

Using AdjacencySetGraph

G = AdjacencySetGraph({1,2,3}, {(1,2),(2,1),(1,3)})
print("Neighbors of 1:", list(G.nbrs(1)))
print("Neighbors of 2:", list(G.nbrs(2)))
print("Neighbors of 3:", list(G.nbrs(3)))
Neighbors of 1: [2, 3]
Neighbors of 2: [1]
Neighbors of 3: []
  • Calling the nbrs method is now much faster!
  • Direct access to the neighbors of a vertex
  • Preserves same interface in the constructor
  • Does not explicitly store edges of the graph

How could we implement the un-directed version?

Finding cycles and paths in a graph

  • A path is a sequence of vertices connected by edges

A non-empty sequence of vertices \((v_0, v_1,\ldots, v_k)\) is a path from \(v_0\) to \(v_k\) as long as \((v_{i-1}, v_i)\in E\) for all \(i \in \{1,\ldots, k\}\).

  • A path is simple if it does not repeat any vertices
  • The length of a path is the number of edges in the path

Defining a cycle in a graph

  • A cycle is a path of length at least one that starts and ends at the same vertex. The cycle length is the number of edges.

  • A cycle is simple if it’s a cycle and removing last edge makes a simple path (i.e., no repeated vertices until the last one).

How could we implement these definitions in Python?

def iscycle(self, V):
    """Return True if and only if the vertices V form a cycle."""
    return self.ispath(V) and V[0] == V[-1]

def issimplecycle(self, V):
    """Return True if and only if the vertices V form a simple cycle."""
    return self.iscycle(V) and self.issimplepath(V[:-1])

Enhanced AdjacencySetGraph

class AdjacencySetGraphWithPath:
    def __init__(self, V = (), E = ()):
        self._V = set()
        self._nbrs = {}
        for v in V: self.addvertex(v)
        for e in E: self.addedge(*e)

    def vertices(self):
        return iter(self._V)

    def edges(self):
        for u in self._V:
            for v in self.nbrs(u):
                yield (u,v)

    def addvertex(self, v):
        self._V.add(v)
        self._nbrs[v] = set()

    def addedge(self, u, v):
        self._nbrs[u].add(v)

    def removeedge(self, u, v):
        self._nbrs[u].remove(v)

    def __contains__(self, v):
        return v in self._nbrs

    def nbrs(self, v):
        return iter(self._nbrs[v])

    def __len__(self):
      return len(self._nbrs)

    def hasedge(self, u, v):
        return v in self._nbrs[u]

    def ispath(self, V):
      return V and all(self.hasedge(V[i-1], V[i]) for i in range(1, len(V)))

    def issimplepath(self, V):
      return self.ispath(V) and len(V) == len(set(V))

    def iscycle(self, V):
        return self.ispath(V) and V[0] == V[-1]

    def issimplecycle(self, V):
        return self.iscycle(V) and self.issimplepath(V[:-1])
  • Add ispath, issimplepath, iscycle, and issimplecycle

Using AdjacencySetGraphWithPath

G = AdjacencySetGraphWithPath({1,2,3,4}, {(1,2),(3,1), (2,3), (3,4), (4,3)})
print("[1,2,3,1] is a path", G.ispath([1,2,3,1]))
print("[1,2,3,1] is a simple path", G.issimplepath([1,2,3,1]))
print("[1,2,3] is a simple path", G.issimplepath([1,2,3]))
print("[1,2,3] is a simple cycle:", G.issimplecycle([1,2,3]))
print("[1,2,3,1] is a simple cycle:", G.issimplecycle([1,2,3]))
print("[1,2,3,4] is a simple path:", G.issimplepath([1,2,3,4]))
print("[1,2,3,4] is a simple cycle:", G.issimplecycle([1,2,3,4]))
print("[1,2,3,4,3,1] is a cycle:", G.iscycle([1,2,3,4,3,1]))
print("[1,2,3,4,3,1] is a simple cycle:", G.issimplecycle([1,2,3,4,3,1]))
[1,2,3,1] is a path True
[1,2,3,1] is a simple path False
[1,2,3] is a simple path True
[1,2,3] is a simple cycle: False
[1,2,3,1] is a simple cycle: False
[1,2,3,4] is a simple path: True
[1,2,3,4] is a simple cycle: False
[1,2,3,4,3,1] is a cycle: True
[1,2,3,4,3,1] is a simple cycle: False

Traversing a graph

  • After defining a graph, we can search its nodes for data
  • However, we have to be super careful to avoid infinite loops!
  • Why is searching a Tree easier than searching a Graph?
def printall(G, v):
    print(v)
    for n in G.nbrs(v):
        printall(G, n)
  • printall might yield RecursionError for certain graphs G

  • Can we implement graph search without this defect?

Traversing a graph with a set

def printall(G, v, visited):
    visited.add(v)
    print(v)
    for n in G.nbrs(v):
        if n not in visited:
            printall(G, n, visited)

G = AdjacencySetGraphWithPath({1,2,3,4}, {(1,2), (2,3), (3,4), (4,1)})
printall(G, 1, set())
1
2
3
4
  • Intuitively, this approach leaves “bread crumbs” for tracking
  • Traversal only proceeds to vertices not visited before
  • Wait, does this really print all the nodes?

Depth-first search (DFS)

def dfs(G, v):
    visited = {v}
    _dfs(G, v, visited)
    return visited

def _dfs(G, v, visited):
    for n in G.nbrs(v):
        if n not in visited:
            visited.add(n)
            _dfs(G, n, visited)

G = AdjacencySetGraphWithPath({1,2,3,4}, {(1,2), (2,3), (3,4), (4,2)})
print('reachable from 1:', dfs(G, 1))
print('reachable from 2:', dfs(G, 2))
reachable from 1: {1, 2, 3, 4}
reachable from 2: {2, 3, 4}
  • Graph reachability is a key concept in graph theory
  • A depth-first search can be used to find the reachable nodes

Determining graph connectivity

def connected(G, u, v):
    return v in dfs(G, u)

G = AdjacencySetGraphWithPath({1,2,3,4}, {(1,2), (2,3), (3,4), (4,2)})
print("1 is connected to 4:", connected(G, 1, 4))
print("4 is connected to 3:", connected(G, 4, 3))
print("4 is connected to 1:", connected(G, 4, 1))
1 is connected to 4: True
4 is connected to 3: True
4 is connected to 1: False
  • connected uses the dfs function to determine connectivity
  • connected returns True if there is a path between u and v
  • Otherwise, it returns False as there is no connection
  • Useful in a wide variety of graph theoretic algorithms!

Additional avenues for exploring the Graph

  • Develop a non-recursion-based traversal algorithm
  • Implement a breadth-first search (BFS) algorithm
  • Handle weighted edges in the Graph data structure
  • Find the shortest path between two vertices in the Graph

Generalizing a Tree, a Graph can model real-world data!