When debugging performance issues, I usually rely on the good old line_profiler. It makes it easy to identify which lines of a specific function are slow and need to be investigated and/or fixed.
Using it is straightforward: you add the @profile decorator to the function you want to profile:
@profile
def slow_function(a, b, c):
    ...
Then you launch your script with:
$ kernprof -l script_to_profile.py
And generate the report with:
$ python -m line_profiler script_to_profile.py.lprof
Timer unit: 1e-06 s

File: pystone.py
Function: Proc2 at line 149
Total time: 0.606656 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   149                                           @profile
   150                                           def Proc2(IntParIO):
   151     50000        82003      1.6     13.5      IntLoc = IntParIO + 10
   152     50000        63162      1.3     10.4      while 1:
   153     50000        69065      1.4     11.4          if Char1Glob == 'A':
   154     50000        66354      1.3     10.9              IntLoc = IntLoc - 1
   155     50000        67263      1.3     11.1              IntParIO = IntLoc - IntGlob
   156     50000        65494      1.3     10.8              EnumLoc = Ident1
   157     50000        68001      1.4     11.2          if EnumLoc == Ident1:
   158     50000        63739      1.3     10.5              break
   159     50000        61575      1.2     10.1      return IntParIO
Which is great… until it’s not anymore.
Too much magic
Line profiler is great for profiling small scripts: you add the decorator, run the script with kernprof, and voilà. The problem is that you always need to launch it with kernprof, because kernprof is what injects the profile decorator. If you launch your script without kernprof, you get a nice NameError: name 'profile' is not defined.
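One common workaround for that NameError (not the approach this post builds toward, just a stopgap) is to fall back to a do-nothing decorator when kernprof has not injected profile. A minimal sketch, reusing the slow_function example from above with a made-up body:

```python
import builtins

# kernprof injects `profile` into builtins. When the script is launched with
# plain `python`, that name is missing, so define a no-op stand-in instead.
if not hasattr(builtins, "profile"):
    def profile(func):
        """Return the function unchanged: no profiling happens."""
        return func


@profile
def slow_function(a, b, c):
    # Illustrative body only; the original example elides it.
    return a + b + c


print(slow_function(1, 2, 3))
```

The script now runs both with and without kernprof, but it still only produces a report when kernprof launches it.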
Moreover, in some cases you simply cannot use kernprof, for example when profiling a web server, or when the Python interpreter is launched by a bash script or another process you cannot modify.
Luckily for us, it’s not that hard to use line_profiler without kernprof.
The magic trick
The kernprof magic trick is not that complicated, as you’ll see.
First, it instantiates the right object:
import line_profiler
prof = line_profiler.LineProfiler()
Then it injects it into the builtins:
builtins.__dict__['profile'] = prof
Finally, it executes the script:
execfile(script_file, ns, ns)
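To make the mechanism concrete, here is a rough standalone sketch of the same three steps using only the standard library. It is not kernprof's actual code: run_with_injected_profile and recording_decorator are hypothetical names, and the stand-in decorator only records which functions were decorated so the example runs without line_profiler installed.

```python
import builtins


def run_with_injected_profile(source, decorator):
    """Run `source` with `decorator` reachable as the builtin name `profile`."""
    # Step 2 of the trick: make `profile` resolvable from any module.
    builtins.__dict__["profile"] = decorator
    try:
        # Step 3: execute the script in a fresh namespace, as if it were __main__.
        ns = {"__name__": "__main__"}
        exec(compile(source, "<script>", "exec"), ns, ns)
    finally:
        # Clean up so the injected name does not leak into later code.
        builtins.__dict__.pop("profile", None)
    return ns


decorated = []


def recording_decorator(func):
    """Stand-in for a LineProfiler instance: just remember what it decorated."""
    decorated.append(func.__name__)
    return func


script = '''
@profile
def slow_function(a, b, c):
    return a + b + c

result = slow_function(1, 2, 3)
'''

ns = run_with_injected_profile(script, recording_decorator)
print(decorated, ns["result"])
```

Passing a line_profiler.LineProfiler() instance instead of recording_decorator would give roughly what kernprof -l does, minus the report handling.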
No rocket science involved here. With all this information, we can now use it manually.
Let’s use a simple Python script to show how. The following script answers this exercise:
Given a list of integers and a target integer, the function should answer True if the target could be created by adding exactly two integers from the list, False if not.
Here is a naive solution:
def is_addable(l, t):
    for i, n in enumerate(l):
        for m in l[i:]:
            if n + m == t:
                return True

    return False

assert is_addable(range(20), 25) == True  # 25 = 6 + 19
assert is_addable(range(20), 40) == False
The goal is to optimize this simple function. Let’s create a line_profiler and decorate our function:
import line_profiler
profile = line_profiler.LineProfiler()

@profile
def is_addable(l, t):
    for i, n in enumerate(l):
        for m in l[i:]:
            if n + m == t:
                return True

    return False

assert is_addable(range(20), 25) == True
assert is_addable(range(20), 40) == False
Launch the script. It should run a bit slower, which is expected since the script is now being profiled. But you don’t get the report either.
That’s expected too: we never called the print_stats function. We need to call it at the end of the script. We could do that by hand, but in some cases it would be tedious.
Instead, we can use the atexit module to call it for us when the current Python process exits:
import line_profiler
import atexit
profile = line_profiler.LineProfiler()
atexit.register(profile.print_stats)

@profile
def is_addable(l, t):
    for i, n in enumerate(l):
        for m in l[i:]:
            if n + m == t:
                return True

    return False

assert is_addable(range(20), 25) == True
assert is_addable(range(20), 40) == False
Now let’s run the script once again:
$ python script.py
Timer unit: 1e-06 s

Total time: 0.000171 s
File: script.py
Function: is_addable at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     6                                           @profile
     7                                           def is_addable(l, t):
     8        28         12.0      0.4      7.0      for i, n in enumerate(l):
     9       355         70.0      0.2     40.9          for m in l[i:]:
    10       329         87.0      0.3     50.9              if n + m == t:
    11         1          1.0      1.0      0.6                  return True
    12
    13         1          1.0      1.0      0.6      return False
Hey, much better! Optimizing the function is left as an exercise for the reader.
One last tip: if you want to profile several functions, instantiate the LineProfiler only once and import it in the other files. Otherwise, you might run into issues and weird reporting.
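One way to arrange that is a small dedicated module that owns the single instance. The sketch below assumes a hypothetical module name, profiling.py. The ImportError fallback to a no-op decorator is my own addition, so that decorated code keeps running on machines where line_profiler is not installed.

```python
# profiling.py (hypothetical module name): the only place a profiler is created.
import atexit

try:
    import line_profiler
except ImportError:  # assumption: degrade gracefully when line_profiler is absent
    line_profiler = None

if line_profiler is not None:
    # The one and only LineProfiler for the whole process.
    profile = line_profiler.LineProfiler()
    # Print a single combined report when the interpreter exits.
    atexit.register(profile.print_stats)
else:
    def profile(func):
        """No-op stand-in: return the function unchanged."""
        return func


# In any other file you would write `from profiling import profile`
# and then decorate as usual:
@profile
def is_addable(l, t):
    for i, n in enumerate(l):
        for m in l[i:]:
            if n + m == t:
                return True

    return False


found = is_addable(range(20), 25)
missing = is_addable(range(20), 40)
```

Because Python caches imported modules, every file that imports profile from this module gets the same object, so all decorated functions end up in one report.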