Create an account


Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[Tut] Python cProfile – A Helpful Guide with Prime Example

#1
Python cProfile – A Helpful Guide with Prime Example

<div><p>Python comes with different profilers. If you’re new to performance optimization, you may ask: <strong>what’s a profiler anyway?</strong></p>
<p><strong><em>A performance profiler allows you to monitor your application more closely. If you just run a Python script in your shell, you see nothing but the output produced by your program. But you don’t see how much bytes were consumed by your program. You don’t see how long each function runs. You don’t see the data structures that caused most memory overhead.</em></strong></p>
<p>Without those things, you cannot know what’s the bottleneck of your application. And, as you’ve already learned above, you cannot possibly start optimizing your code. Why? Because else you were complicit in “premature optimization”—one of the deadly sins in programming.</p>
<blockquote class="wp-block-quote">
<p>Instrumenting profilers insert special code at the beginning and end of each routine to record when the routine starts and when it exits. With this information, the profiler aims to measure the actual time taken by the routine on each call. This type of profiler may also record which other routines are called from a routine. It can then display the time for the entire routine and also break it down into time spent locally and time spent on each call to another routine.</p>
<p><cite><a href="https://smartbear.com/learn/code-profiling/fundamentals-of-performance-profiling/">Fundamentals Profiling</a></cite></p></blockquote>
<p>Fortunately, there are a lot of profilers. In the remaining article, I’ll give you an overview of the most important profilers in Python and how to use them. Each comes with a reference for further reading.</p>
<h2>Python cProfile</h2>
<p>The most popular Python profiler is called <a rel="noreferrer noopener" aria-label="cProfile (opens in a new tab)" href="https://docs.python.org/2/library/profile.html#module-cProfile" target="_blank">cProfile</a>. You can import it much like any other library by using the statement:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import cProfile</pre>
<p>A simple statement but nonetheless a powerful tool in your toolbox. </p>
<p>Let’s write a Python script which you can profile. Say, you come up with this (very) raw Python script to find 100 random prime numbers between 2 and 1000 which you want to optimize:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import random def guess(): ''' Returns a random number ''' return random.randint(2, 1000) def is_prime(x): ''' Checks whether x is prime ''' for i in range(x): for j in range(x): if i * j == x: return False return True def find_primes(num): primes = [] for i in range(num): p = guess() while not is_prime(p): p = guess() primes += [p] return primes print(find_primes(100)) '''
[733, 379, 97, 557, 773, 257, 3, 443, 13, 547, 839, 881, 997,
431, 7, 397, 911, 911, 563, 443, 877, 269, 947, 347, 431, 673,
467, 853, 163, 443, 541, 137, 229, 941, 739, 709, 251, 673, 613,
23, 307, 61, 647, 191, 887, 827, 277, 389, 613, 877, 109, 227,
701, 647, 599, 787, 139, 937, 311, 617, 233, 71, 929, 857, 599,
2, 139, 761, 389, 2, 523, 199, 653, 577, 211, 601, 617, 419, 241,
179, 233, 443, 271, 193, 839, 401, 673, 389, 433, 607, 2, 389,
571, 593, 877, 967, 131, 47, 97, 443] '''</pre>
<p>The program is slow (and you sense that there are many optimizations). But where to start?</p>
<p>As you’ve already learned, you need to know the bottleneck of your script. Let’s use the cProfile module to find it! The only thing you need to do is to add the following two lines to your script:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import cProfile
cProfile.run('print(find_primes(100))')</pre>
<p>It’s really that simple. First, you write your script. Second, you call the <code>cProfile.run()</code> method to analyze its performance. Of course, you need to replace the execution command with your specific code you want to analyze. For example, if you want to test function <code>f42()</code>, you need to type in <code>cProfile.run('f42()')</code>. </p>
<p>Here’s the output of the previous code snippet (don’t panic yet):</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[157, 773, 457, 317, 251, 719, 227, 311, 167, 313, 521, 307, 367, 827, 317, 443, 359, 443, 887, 241, 419, 103, 281, 151, 397, 433, 733, 401, 881, 491, 19, 401, 661, 151, 467, 677, 719, 337, 673, 367, 53, 383, 83, 463, 269, 499, 149, 619, 101, 743, 181, 269, 691, 193, 7, 883, 449, 131, 311, 547, 809, 619, 97, 997, 73, 13, 571, 331, 37, 7, 229, 277, 829, 571, 797, 101, 337, 5, 17, 283, 449, 31, 709, 449, 521, 821, 547, 739, 113, 599, 139, 283, 317, 373, 719, 977, 373, 991, 137, 797] 3908 function calls in 1.614 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 1.614 1.614 &lt;string>:1(&lt;module>) 535 1.540 0.003 1.540 0.003 code.py:10(is_prime) 1 0.000 0.000 1.542 1.542 code.py:19(find_primes) 535 0.000 0.000 0.001 0.000 code.py:5(guess) 535 0.000 0.000 0.001 0.000 random.py:174(randrange) 535 0.000 0.000 0.001 0.000 random.py:218(randint) 535 0.000 0.000 0.001 0.000 random.py:224(_randbelow) 21 0.000 0.000 0.000 0.000 rpc.py:154(debug) 3 0.000 0.000 0.072 0.024 rpc.py:217(remotecall) 3 0.000 0.000 0.000 0.000 rpc.py:227(asynccall) 3 0.000 0.000 0.072 0.024 rpc.py:247(asyncreturn) 3 0.000 0.000 0.000 0.000 rpc.py:253(decoderesponse) 3 0.000 0.000 0.072 0.024 rpc.py:291(getresponse) 3 0.000 0.000 0.000 0.000 rpc.py:299(_proxify) 3 0.000 0.000 0.072 0.024 rpc.py:307(_getresponse) 3 0.000 0.000 0.000 0.000 rpc.py:329(newseq) 3 0.000 0.000 0.000 0.000 rpc.py:333(putmessage) 2 0.000 0.000 0.047 0.023 rpc.py:560(__getattr__) 3 0.000 0.000 0.000 0.000 rpc.py:57(dumps) 1 0.000 0.000 0.047 0.047 rpc.py:578(__getmethods) 2 0.000 0.000 0.000 0.000 rpc.py:602(__init__) 2 0.000 0.000 0.026 0.013 rpc.py:607(__call__) 2 0.000 0.000 0.072 0.036 run.py:354(write) 6 0.000 0.000 0.000 0.000 threading.py:1206(current_thread) 3 0.000 0.000 0.000 0.000 threading.py:216(__init__) 3 0.000 0.000 0.072 0.024 threading.py:264(wait) 3 0.000 0.000 0.000 0.000 threading.py:75(RLock) 3 0.000 0.000 0.000 0.000 {built-in method _struct.pack} 3 0.000 0.000 0.000 0.000 {built-in method _thread.allocate_lock} 6 0.000 0.000 0.000 0.000 {built-in method _thread.get_ident} 1 0.000 0.000 1.614 1.614 {built-in method builtins.exec} 6 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance} 9 0.000 0.000 0.000 0.000 {built-in method builtins.len} 1 0.000 0.000 0.072 0.072 {built-in method builtins.print} 3 0.000 0.000 0.000 0.000 {built-in method select.select} 3 0.000 0.000 0.000 0.000 {method '_acquire_restore' of '_thread.RLock' objects} 3 0.000 0.000 0.000 0.000 {method '_is_owned' of '_thread.RLock' objects} 3 0.000 0.000 0.000 0.000 {method '_release_save' of '_thread.RLock' objects} 3 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.RLock' objects} 6 0.071 0.012 0.071 0.012 {method 'acquire' of '_thread.lock' objects} 3 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects} 535 0.000 0.000 0.000 0.000 {method 'bit_length' of 'int' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 3 0.000 0.000 0.000 0.000 {method 'dump' of '_pickle.Pickler' objects} 2 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects} 553 0.000 0.000 0.000 0.000 {method 'getrandbits' of '_random.Random' objects} 3 0.000 0.000 0.000 0.000 {method 'getvalue' of '_io.BytesIO' objects} 3 0.000 0.000 0.000 0.000 {method 'release' of '_thread.RLock' objects} 3 0.000 0.000 0.000 0.000 {method 'send' of '_socket.socket' objects} </pre>
<p>Let’s deconstruct it to properly understand the meaning of the output. The filename of your script is ‘code.py’. Here’s the first part:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>>import cProfile
>>>cProfile.run('print(find_primes(100))')
[157, 773, 457, 317, 251, 719, 227, 311, 167, 313, 521, 307, 367, 827, 317, 443, 359, 443, 887, 241, 419, 103, 281, 151, 397, 433, 733, 401, 881, 491, 19, 401, 661, 151, 467, 677, 719, 337, 673, 367, 53, 383, 83, 463, 269, 499, 149, 619, 101, 743, 181, 269, 691, 193, 7, 883, 449, 131, 311, 547, 809, 619, 97, 997, 73, 13, 571, 331, 37, 7, 229, 277, 829, 571, 797, 101, 337, 5, 17, 283, 449, 31, 709, 449, 521, 821, 547, 739, 113, 599, 139, 283, 317, 373, 719, 977, 373, 991, 137, 797]
...</pre>
<p>It still gives you the output to the shell—even if you didn’t execute the code directly, the <code>cProfile.run()</code> function did. You can see the list of the 100 random prime numbers here.</p>
<p>The next part prints some statistics to the shell:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""> 3908 function calls in 1.614 seconds</pre>
<p>Okay, this is interesting: the whole program took 1.614 seconds to execute. In total, 3908 function calls have been executed. Can you figure out which?</p>
<ul>
<li>The print() function once.</li>
<li>The find_primes(100) function once.</li>
<li>The find_primes() function executes the for loop 100 times.</li>
<li>In the for loop, we execute the range(), guess(), and is_prime() functions. The program executes the guess() and is_prime() functions multiple times per loop iteration until it correctly guessed the next prime number.</li>
<li>The guess() function executes the randint(2,1000) method once.</li>
</ul>
<p>The next part of the output shows you the detailed stats of the function names ordered by the function name (not its performance):</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""> Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 1.614 1.614 &lt;string>:1(&lt;module>) 535 1.540 0.003 1.540 0.003 code.py:10(is_prime) 1 0.000 0.000 1.542 1.542 code.py:19(find_primes) ...</pre>
<p>Each line stands for one function. For example the second line stands for the function is_prime. You can see that is_prime() had 535 executions with a total time of 1.54 seconds. </p>
<p>Wow! You’ve just found the bottleneck of the whole program: is_prime(). Again, the total execution time was 1.614 seconds and this one function dominates 95% of the total execution time!</p>
<p>So, you need to ask yourself the following questions: Do you need to optimize the code at all? If you do, how can you mitigate the bottleneck?</p>
<p>There are two basic ideas: </p>
<ul>
<li>call the function is_prime() less frequently, and</li>
<li>optimize performance of the function itself.</li>
</ul>
<p>You know that the best way to optimize code is to look for more efficient algorithms. A quick <a href="https://stackoverflow.com/questions/15285534/isprime-function-for-python-language">search </a>reveals a much more efficient algorithm (see function <code>is_prime2()</code>). </p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import random def guess(): ''' Returns a random number ''' return random.randint(2, 1000) def is_prime(x): ''' Checks whether x is prime ''' for i in range(x): for j in range(x): if i * j == x: return False return True def is_prime2(x): ''' Checks whether x is prime ''' for i in range(2,int(x**0.5)+1): if x % i == 0: return False return True def find_primes(num): primes = [] for i in range(num): p = guess() while not is_prime2(p): p = guess() primes += [p] return primes import cProfile
cProfile.run('print(find_primes(100))')
</pre>
<p>What do you think: is our new prime checker faster? Let’s study the output of our code snippet:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[887, 347, 397, 743, 751, 19, 337, 983, 269, 547, 823, 239, 97, 137, 563, 757, 941, 331, 449, 883, 107, 271, 709, 337, 439, 443, 383, 563, 127, 541, 227, 929, 127, 173, 383, 23, 859, 593, 19, 647, 487, 827, 311, 101, 113, 139, 643, 829, 359, 983, 59, 23, 463, 787, 653, 257, 797, 53, 421, 37, 659, 857, 769, 331, 197, 443, 439, 467, 223, 769, 313, 431, 179, 157, 523, 733, 641, 61, 797, 691, 41, 751, 37, 569, 751, 613, 839, 821, 193, 557, 457, 563, 881, 337, 421, 461, 461, 691, 839, 599] 4428 function calls in 0.074 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.073 0.073 &lt;string>:1(&lt;module>) 610 0.002 0.000 0.002 0.000 code.py:19(is_prime2) 1 0.001 0.001 0.007 0.007 code.py:27(find_primes) 610 0.001 0.000 0.004 0.000 code.py:5(guess) 610 0.001 0.000 0.003 0.000 random.py:174(randrange) 610 0.001 0.000 0.004 0.000 random.py:218(randint) 610 0.001 0.000 0.001 0.000 random.py:224(_randbelow) 21 0.000 0.000 0.000 0.000 rpc.py:154(debug) 3 0.000 0.000 0.066 0.022 rpc.py:217(remotecall)</pre>
<p>Crazy – what a performance improvement! With the old bottleneck, the code takes 1.6 seconds. Now, it takes only 0.074 seconds—a 95% runtime performance improvement! </p>
<p>That’s the power of bottleneck analysis.</p>
<p>The cProfile method has many more functions and parameters but this simple method cProfile.run() is already enough to resolve many performance bottlenecks. <a href="https://docs.python.org/2/library/profile.html#module-cProfile">If you want to learn more, study the official documentation.</a></p>
<h2>Where to Go From Here?</h2>
<p>You’ve learned how to use the cProfile module in Python to find the bottleneck of your application.</p>
<p><strong>If you’re already optimizing performance of your Python apps, chances are that you can already earn six figures by selling your Python skills. Would you like to learn how?</strong></p>
<p>Join the free webinar that shows you how to become a thriving coding business owner online!</p>
<p><a href="https://blog.finxter.com/webinar-freelancer/">[Webinar] Are You a Six-Figure Freelance Developer?</a></p>
<p>Join us. It’s fun! <img src="https://s.w.org/images/core/emoji/12.0.0-1/72x72/1f642.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
</div>


https://www.sickgaming.net/blog/2020/02/...e-example/
Reply



Forum Jump:


Users browsing this thread:
3 Guest(s)

Forum software by © MyBB Theme © iAndrew 2016