Measuring execution time

On my last post I talked about how to use cache to improve the performance of a recursive implementation of fibonacci. All charts were built with Gnuplot.

Since I wanted a significant amount of data for the charts, I could not generate them manually. I had to automate.

I had already used the time tool, a utility offered by bash, but had a hard time formatting it's output, which has the following form:

bla@laptop:~$ time python3 fibonacci.py 25
real    0m0.064s
user    0m0.056s
sys     0m0.008s

After countless internet searches I found out that there's a program unrelated to bash also called time that measures execution and accepts format operators. It is installed on /usr/bin (at least on my Debian system) and, despite this directory being on my PATH, it is obfuscated by bash's time. In order to call it I had to use the file path. By my own preference I used it's absolute path (/usr/bin/time).

This is how /usr/bin/time output looks like:

bla@laptop:~$ /usr/bin/time python3 fibonacci.py 25
0.06user 0.00system 0:00.06elapsed 95%CPU (0avgtext+0avgdata 7724maxresident)k
0inputs+0outputs (0major+865minor)pagefaults 0swaps

As I was only interested in the total time (elapsed in the above snippet), I was able to use a quite simple format string to retrieve only this field.

bla@laptop:~$ /usr/bin/time -f "%E" python3 fibonacci.py 25
0:00.06

To remove the minutes portion (what precedes :) and have an easily parseable output I used cut. In this specific case I simply had to specify the field delimiter (:) and the field I wanted (2).

bla@laptop:~$ /usr/bin/time -f "%E" python3 fibonacci.py 25 | cut -d: -f2
0:00.07

That was not as expected. What is happening here is that the pipe (|) is redirecting the stdout of the command on it's left to the stdin of the command on it's right, in this case cut. but /usr/bin/time doesn't print to stdout but to stderr! A way around this issue is to redirect stderr to stdout. Thankfully bash lets us acomplish it easily. The standard output (stdout) is represented by 1 and standard error (stderr) by 2. To redirect stderr to stdout we can use 2>&1. The other way around is also possible (send stdout to stderr) using 1>&2. Using the first form we get the expected result.

bla@laptop:~$ /usr/bin/time -f "%E" python3 fibonacci.py 25 2>&1 | cut -d: -f2
00.05

To test fibonacci.py with multiple numbers we can simply run it inside a loop and send the output to a file.

bla@laptop:~$ for n in {1..5}
> do
> /usr/bin/time -f "%E" python3 fibonacci.py $n 2>&1 | cut -d: -f2 > tempos.txt
> done
00.02
00.02
00.02
00.02
00.02

Thi is how I generated last post's graphs. Since then I've been asking myself if this is the most pythonic solution. I believe it's not. It's excessively complex. I've been digging into code so that I don't need to deal with all this shell.

Measuring time with python

The task of measuring execution time of pieces of code is usually part of the profiling process, which aims to identify performance bottlenecks. As this process is widely used, so are its subtasks. Measuring execution time is no exception.

Using `time.time()` directly

As most programming language, python provides a module to deal with time in its standard library. The function time, which belongs to the module with the same name, returns the amount of seconds elapsed since epoch (January 1, 1970). This popular time representation is also known as Unix time.

With this function it's possible to easily measure the execution time of any code block. For example:

import time

def slow_func():
    time.sleep(10)

start_time = time.time()
slow_func()
end_time = time.time()

total_time = end_time - start_time

print('{:.3f}s'.format(total_time))

This code can already be used inside a loop to get all of the measurements in the desired format in a much cleaner way than we did before but, can we do better?

Building a decorator

Yes! Using a decorator we can make this code much more reusable and pythonic!

import time
import functools


def timer(func):

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()

        print('{} ran in: {:.3f}s'.format(func.__name__, end - start))
        return result

    return wrapper

def _fibonacci(n):
    if n < 2:
        return n
    return _fibonacci(n - 1) + _fibonacci(n - 2)

@timer
def fibonacci(n):
    return _fibonacci(n)

Assuming the code above is in the file fibonacci_with_decorator.py we can use in the interactive shell like so:

>>> from fibonacci_with_decorator import fibonacci
>>> fibonacci(3)
fibonacci ran in: 0.000s
2
>>> fibonacci(25)
fibonacci ran in: 0.053s
75025

We had to add a wrapper to the fibonacci function due to its recursive nature. If the decorator were applied to the recursive function directly the execution time for every call would be shown.

This method allows us to measure execution time of multiple code blocks running it once. Besides, performance data can be used in logs, allowing the application to be analysed without having to stop it.

Using a context manager

I had completely overlooked this approach until I read a friend's article about speeding up MongoDB queries with tornado.

You've probably already seen a context manager, they are very common when handling files. A context manager is any object that implements the context manager's interface (methods __enter__ and __exit__). The objects returned by open are the most common example.

Using these objects in a with block we implicitly make __enter__ be called when entering the block and __exit__ when exiting it. For files we usually have something like:

with open('some_file_path') as f:
    pass # read from f

# f is no longer open

When we exit the block the file is automatically closed (i.e. f.close()), so we don't have to do so.

As any object that implements the context manager interface can be used with with, we can make a timer with it. We just have to start the timer on __enter__ and stop it on __exit__.

import time
import sys


class Timer:

    def __enter__(self):
        self.time = time.time()
        return self

    def __exit__(self, *args):
        self.time = time.time() - self.time
        print('Time elapsed: {}s'.format(self.time))


def slow_func():
    time.sleep(10)

with Timer():
    slow_func()

Running the above code we get:

bla@laptop:~$ python context_timer.py
Time elapsed: 10.008252143859863s

Measuring time with python

Using time.time() directly

Building a decorator

Using a context manager

Using `time.time()` directly