Running Time Analysis through Analytical Evaluation

Gregory M. Kapfhammer

February 19, 2024

Programmers must write programs that are correct, efficient, and maintainable

  • Prior Focus: is the program implemented correctly?
  • New Focus: does the program have good performance?

A two-week road map for exploring Python program performance

  • Week 1: Empirical Evaluation of Runtime

    • How do you conduct timing experiments to measure the performance of real Python programs?
  • Week 2: Analytical Evaluation of Time Complexity

    • How do you use an instruction cost model to prove that a program belongs to a specific time complexity class?

Ultimate Goal: create a nuanced description of program efficiency that adapts to different inputs and to different computers

Algorithm analysis challenges

Different Inputs

  • Varying sizes of input data
  • Different types of data
  • Degrees of sortedness
  • Degrees of duplication
  • Different data distributions

Different Computers

  • Diverse hardware setups
  • Varying system performance
  • Diverse system configurations
  • Different operating systems
  • Variation in system load

Ultimate goal: experimental and analytical evaluation methods that yield actionable insights that support understanding and prediction

Review the “naive” duplicate detector

from typing import List
def duplicates(input_list: List[int]) -> bool:
    n = len(input_list)
    for i in range(n):
        for j in range(n):
            if i != j and input_list[i] == input_list[j]:
                return True
    return False

assert(duplicates([1,2,6,3,4,5,6,7,8]))
assert(not duplicates([1,2,3,4]))
print(duplicates([1,2,6,3,4,5,6,7,8]))
print(not duplicates([1,2,3,4]))
True
True
  • Function contains a double-nested for loop
  • How do we measure the performance of duplicates?

Review the timetrials function

from typing import Callable
import time

def timetrials(function: Callable, n: int, trials: int = 10) -> None:
    """Time the function with n inputs trials times"""
    totaltime = 0
    for _ in range(trials):
        start = time.time()
        function(list(range(n)))
        totaltime += time.time() - start
    print("average =%10.7f for n = %d" % (totaltime/trials, n))

# conduct a doubling experiment for a provided function
for n in [50, 100, 200, 400, 800, 1600, 3200]:
    timetrials(duplicates, n)
average = 0.0001422 for n = 50
average = 0.0005499 for n = 100
average = 0.0020999 for n = 200
average = 0.0088664 for n = 400
average = 0.0368137 for n = 800
average = 0.1390809 for n = 1600
average = 0.5626869 for n = 3200

Performance of duplicates

# calculate a doubling ratio
# for the last two performance
# values when run on the
# instructor computer; note
# that this will vary from
# the displayed values slightly
ratio = (0.2225006 / 0.0541614)
print(f"ratio = {ratio:.4f}")
ratio = 4.1081
  • Repeatedly double the size of the input when running experiments
  • A doubling ratio is the current time divided by the prior time
  • A ratio of approximately 4 reveals a quadrupling of the running time
  • Offers experimental evidence that performance is quadratic!
  • What would a ratio of about 2 or 8 reveal about performance? See the sketch below
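
As a minimal sketch, the doubling ratios for the entire experiment can be computed directly from the averages reported above (the values will differ on other computers):

# compute doubling ratios from the averages reported above
times = [0.0001422, 0.0005499, 0.0020999, 0.0088664,
         0.0368137, 0.1390809, 0.5626869]
for prior, current in zip(times, times[1:]):
    # a ratio near 2 suggests linear time, near 4 quadratic, near 8 cubic
    print(f"ratio = {current / prior:.4f}")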

Connection between the empirical and the analytical evaluation of algorithm performance

  • Runtime: conduct experiments to evaluate performance
  • Running Time: analyze program structure to characterize performance

Function to sum the first k integers

from typing import Tuple
def sumk(k: int) -> Tuple[int, float]:
    start = time.time()
    total = 0
    for i in range(k+1):
        total = total + i
    end = time.time()
    return total, (end - start)

print(sumk(5))
print(sumk(10))
print(sumk(15))
print(sumk(20))
print(sumk(25))
(15, 1.6689300537109375e-06)
(55, 1.430511474609375e-06)
(120, 1.430511474609375e-06)
(210, 1.430511474609375e-06)
(325, 2.1457672119140625e-06)

Function to sum the first k integers

def timetrials_int(func: Callable, k: int, trials: int = 10) -> None:
    """Time a function that returns a (result, elapsed time) tuple"""
    totaltime = 0
    for _ in range(trials):
        totaltime += func(k)[1]
    print("average =%10.7f for k = %d" % (totaltime/trials, k))

timetrials_int(sumk, 10000)
timetrials_int(sumk, 100000)
timetrials_int(sumk, 1000000)
timetrials_int(sumk, 10000000)
average = 0.0003331 for k = 10000
average = 0.0036208 for k = 100000
average = 0.0371214 for k = 1000000
average = 0.3757293 for k = 10000000
  • The sumk function outputs the sum of the first k integers
  • This function has a for loop that iterates k times
  • A ratio of 0.203677/0.0220115 ≈ 9.25 on the instructor computer, close to the tenfold increase in k, reveals linear performance

What do we know about the performance of the sumk function?

  • As k goes up by a factor of 10, the timetrials function shows the sumk runtime also goes up by a factor of 10

  • sumk has to do about k additions and assignments

  • Or, the runtime of sumk is proportional to k

  • Find what the runtime is proportional to, not the exact time

  • Relationship should hold regardless of the computer used!

  • Wait, \(\sum_{i = 1}^k i = 1 + 2 + 3 + \cdots + k = k (k + 1) / 2\)

Fast summing for the first k integers

def sumk_calc(k: int) -> Tuple[int, float]:
    start = time.time()
    total = k * (k + 1) // 2
    end = time.time()
    return total, end-start

timetrials_int(sumk_calc, 10000)
timetrials_int(sumk_calc, 100000)
timetrials_int(sumk_calc, 1000000)
timetrials_int(sumk_calc, 10000000)
timetrials_int(sumk_calc, 100000000)
average = 0.0000003 for k = 10000
average = 0.0000002 for k = 100000
average = 0.0000003 for k = 1000000
average = 0.0000002 for k = 10000000
average = 0.0000002 for k = 100000000
  • Wow, the sumk_calc function is much faster than sumk!
  • Direct calculation is faster than using the for loop in sumk!

Modeling program running time

def f001(k):
    return [sum([i, i + 1] * 100) for i in range(k)]

print(f001(9))
[100, 300, 500, 700, 900, 1100, 1300, 1500, 1700]
  • Synthetic function f001 contains many Python constructs
  • Remember, the number of lines of code by itself does not signal performance
  • Instead, focus on counts of the number of atomic operations
  • Contextualize the operation count based on input size
  • Faster functions require fewer atomic operations
  • Remember, as the input grows, the code slows!

Wait, what is an atomic operation?

  • Atomic operations include:
    • Arithmetic and boolean operations
    • Variable assignment
    • Accessing the value of a variable from its name
    • Branching (jumping to another part of the code for if/for/while statements)
    • Calling a function
    • Returning from a function
  • Atomic operations are roughly equivalent in required work, as tallied in the sketch below
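
As an illustrative sketch, the atomic operations in a short statement can be tallied under this cost model:

# tally atomic operations for one statement under this cost model
y, z = 3, 4    # two variable assignments
x = y + z * 2  # access y (1), access z (1), multiply (1), add (1), assign (1)
# the second statement costs 5 atomic operations, independent of input size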

Atomic operations for list functions

Operation Name                 Code                     Cost
Index access                   L[i]                     \(1\)
Index assignment               L[i] = newvalue          \(1\)
Append                         L.append(newitem)        \(1\)
Pop (from end of list)         L.pop()                  \(1\)
Pop (from index i)             L.pop(i)                 \(n - i\)
Insert at index i              L.insert(i, newitem)     \(n - i\)
Delete an item (at index i)    del L[i]                 \(n - i\)
Membership testing             item in L                \(n\)
Slice                          L[a:b]                   \(b - a\)
Concatenate two lists          L1 + L2                  \(n_1 + n_2\)
Sort                           L.sort()                 \(n \log_2 n\)

What do we know about the performance of List operations?

  • Table reports the asymptotic cost of List operations

  • Partly driven by which operations produce a new list

  • Concatenation and slicing both produce a new collection

  • Grasp the running time by intuiting the work completed

  • List and str have similar performance characteristics

Key Insight: counting the maximum number of atomic operations for a function’s input size yields its worst-case running time
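
As a quick check of what the table predicts, a small timing sketch (a hypothetical helper; exact times vary by computer) can compare the constant-cost L.pop() against the linear-cost L.pop(0):

import time

def time_pops(n: int, from_front: bool) -> float:
    """Time popping every item from a list with n items."""
    data = list(range(n))
    start = time.time()
    while data:
        if from_front:
            data.pop(0)  # shifts all remaining items, so cost is n - i
        else:
            data.pop()   # removes the last item at constant cost
    return time.time() - start

for n in [10000, 20000, 40000]:
    print(f"n = {n}: pop() {time_pops(n, False):.4f}, "
          f"pop(0) {time_pops(n, True):.4f}")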

Atomic operations for dict functions

Operation Name               Code             Cost
Get item                     D[key]           \(1\)
Set item                     D[key] = value   \(1\)
Key membership testing       key in D         \(1\)
Delete an item by its key    del D[key]       \(1\)

  • The dict operations have a constant cost for all input sizes

  • A dictionary is fast due to its use of hash functions that map “hashable” data to a fixed-size backing array; the sketch below contrasts dict and list membership costs

  • Is this too good to be true? Well, yes, if not designed correctly!
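
A minimal sketch can contrast the constant-cost membership test of a dict with the linear-cost membership test of a list; the timings below are illustrative:

import time

n = 1000000
as_list = list(range(n))
as_dict = dict.fromkeys(as_list)

for name, container in [("list", as_list), ("dict", as_dict)]:
    start = time.time()
    # worst case for the list: the key is absent, so every item is checked
    found = n in container
    print(f"{name}: {time.time() - start:.7f} seconds")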

Atomic operations for set functions

Operation Name          Code              Cost
Add a new item          A.add(newitem)    \(1\)
Delete an item          A.remove(item)    \(1\)
Union                   A | B             \(n_A + n_B\)
Intersection            A & B             \(\min\{n_A, n_B\}\)
Set difference          A - B             \(n_A\)
Symmetric difference    A ^ B             \(n_A + n_B\)

  • Can you explain these cost functions for set operations?
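
One way to explain, for instance, the intersection cost is to sketch how it could be implemented: iterate over the smaller set and perform a constant-cost membership test in the larger one, a simplified model of what Python actually does:

def intersection(A: set, B: set) -> set:
    """Sketch of A & B: iterate over the smaller set."""
    smaller, larger = (A, B) if len(A) <= len(B) else (B, A)
    # min(n_A, n_B) iterations, each with a constant-cost membership test
    return {item for item in smaller if item in larger}

print(intersection({1, 2, 3, 4}, {3, 4, 5}))  # displays {3, 4}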

Asymptotic analysis and the order of growth

  • Predict order of growth of time as the input size grows

    • Sublinear: time grows slower than linearly with input size
    • Linear: time grows linearly with input size
    • Quadratic: time grows quadratically with input size
    • Cubic: time grows cubically with input size
    • Exponential: time grows exponentially with input size

What is the size of the input?

  • For a container, the size is the number of elements

  • Otherwise, it is the number of bits needed to encode the input

  • Or, if a word is 64 bits, we could use the number of words

  • Assume that an int or a float fits into constant bit depth

  • Yet, note that Python supports arbitrary precision integers

  • Our model is useful even though it is not entirely accurate

Key Insight: as the input size grows, the code slows! Models help to characterize the relationship between input size and running time

Focus on the worst-case running time!

  • Different inputs of the same size may have different:
    • Runtimes in actual software implementations
    • Running times according to the performance models
  • Worst-case running time will characterize performance, as the sketch below illustrates
  • Describe running time as a function of the input size:
    • Slow programs have fast-growing functions
    • Fast programs have slow-growing functions
  • Count the number of atomic operations for the input size
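
As a sketch, the duplicates function defined earlier makes this input sensitivity concrete: two lists of the same size can have very different runtimes (timings are illustrative and machine dependent):

import time

n = 2000
best = [1] * n           # a duplicate is found on the first pair checked
worst = list(range(n))   # no duplicates, so all n * n pairs are checked

for name, data in [("best", best), ("worst", worst)]:
    start = time.time()
    duplicates(data)
    print(f"{name}: {time.time() - start:.7f} seconds")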

Exploring the big-O notation

  • Running time on an input of size \(n\) might be \(5n^2 + 3n + 2\)
  • The dominant term is \(n^2\) and the running time is \(O(n^2)\)
  • Wait, that is some new notation! What does it mean?
  • Formal definition of the big-O notation:

Given (non-decreasing) functions \(f\) and \(g\), we say \(f(n) = O(g(n))\) if there exist constants \(c\) and \(n_0\) such that for all \(n>n_0\) we have \(f(n) \le cg(n)\).
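
For example, to show that \(5n^2 + 3n + 2 = O(n^2)\), pick \(c = 10\) and \(n_0 = 1\): whenever \(n > 1\), it holds that \(3n \le 3n^2\) and \(2 \le 2n^2\), so \(5n^2 + 3n + 2 \le 10n^2 = c \cdot n^2\).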

Very exciting! What are the important features of the big-O notation?

  • The big-O hides constant factors. Any term that does not depend on the size of the input is considered a constant and will be conveniently suppressed in the big-O

  • The big-O tells us what will eventually be true when the input is sufficiently large, which pairs naturally with a focus on the worst case

Notation Confusion: be careful because the definition of big-O refers to a “function”; this is not a Python function but rather a mathematical function that characterizes the Python function’s running time!

Functions with big-O Notation

  • Constant Functions, \(O(1)\)
  • Logarithmic Functions, \(O(\log n)\)
  • Linear Functions, \(O(n)\)
  • Linearithmic Functions, \(O(n \log n)\)
  • Quadratic Functions, \(O(n^2)\)
  • Polynomial Functions, \(O(n^k)\) for some constant \(k\).
  • Exponential Functions, \(O(2^n)\) (this is different from \(2^{O(n)}\))
  • Factorial Functions, \(O(n!)\)
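
A brief sketch can make these growth rates concrete by tabulating several of the functions as the input size doubles; the column layout is arbitrary:

import math

print(f"{'n':>6} {'log n':>8} {'n log n':>10} {'n^2':>8} {'2^n':>12}")
for n in [2, 4, 8, 16, 32]:
    # the exponential function quickly dwarfs the polynomial functions
    print(f"{n:>6} {math.log2(n):>8.1f} {n * math.log2(n):>10.1f} "
          f"{n**2:>8} {2**n:>12}")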

f002: A linear-time algorithm

def f002(L):
    newlist = []  # 2 creating a new list and variable assignment
    for i in L:   # loops n times
        if i % 2 == 0:  # 1
            newlist.append(i) # 1 (append is constant time on lists)
    return newlist  # 1 return
  • Assume the input is a list of length \(n\)
  • Count the cost of each line of code
  • Total cost is \(2n+3\) in the worst case (i.e., all items are even)
  • Report the running time as \(O(n)\)
  • Classify this as a linear-time algorithm
  • Worst-case performance similar to all linear algorithms!

f003: A quadratic-time algorithm

def f003(L):
    x = 0   # 1
    for i in L:   # loop n times
        for j in L:   # loop n times
            x += i*j  # 3 two arithmetic operations and assignment
    return x # 1
  • Assume the input is a list of length \(n\)
  • Count the cost of each line of code
  • Total cost is \(3n^2+2\) in the worst case
  • Report the running time as \(O(n^2)\)
  • Classify this as a quadratic-time algorithm
  • Worst-case performance similar to all quadratic algorithms!

f004: A quadratic-time algorithm

def f004(L):
    x = 0 # 1
    for j in range(1, len(L)):  # loops n-1 times
        for i in range(j):  # loops j times
            x += L[i] * L[j]  # 5 operations in total
    return x # 1
  • Assume the input is a list of length \(n\)
  • Count the cost of each line of code
  • Initialization and return cost \(2\) operations
  • Inner loop iterations change based on outer loop value
  • \(2 + \sum_{j=1}^{n-1}5j = 2 + 5\sum_{j=1}^{n-1}j = 2 + \frac{5n(n-1)}{2} = O(n^2)\)

Studying program performance

  • Runtime: conduct experiments to evaluate performance
  • Running Time: analyze program structure to characterize performance
  • Worst-case time complexity with big-O notation
    • Count the number of atomic operations
    • Relate operation count to input size
    • Characterize performance using \(O(f(n))\)