[Tut] How to Calculate the Column Standard Deviation of a DataFrame in Python Pandas?

<div><p>Want to calculate the standard deviation of a column in your <a rel="noreferrer noopener" href="" target="_blank">Pandas</a> DataFrame?</p>
<p>In case you attended your last statistics course a few years ago, let’s quickly recap the definitions. The <strong>variance</strong> is the <em>average squared deviation of the list elements from the average value</em>, and the <strong>standard deviation</strong> is the <em>square root of the variance</em>.</p>
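<p>To make the definition concrete, here is a minimal sketch that computes the standard deviation by hand using only the standard library. The three values in <code>ages</code> are illustrative, and the variable names are assumptions made for this example. It shows both variants: dividing by <code>n</code> (population) and by <code>n - 1</code> (sample).</p>

```python
import math
import statistics

ages = [18, 22, 43]  # illustrative values, chosen for this sketch

# Average value and squared deviation of each element from it
mean = sum(ages) / len(ages)
squared_deviations = [(x - mean) ** 2 for x in ages]

# Population variance divides by n, sample variance divides by n - 1
population_variance = sum(squared_deviations) / len(ages)
sample_variance = sum(squared_deviations) / (len(ages) - 1)

# The standard deviation is the square root of the variance
population_std = math.sqrt(population_variance)
sample_std = math.sqrt(sample_variance)

print(round(population_std, 4))  # 10.9646
print(round(sample_std, 4))      # 13.4288

# Cross-check against the standard library implementations
assert math.isclose(population_std, statistics.pstdev(ages))
assert math.isclose(sample_std, statistics.stdev(ages))
```

<p>Which divisor is used matters for small data sets like this one, so it is worth checking which variant a library computes by default.</p>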
<p><strong>To calculate the standard deviation, call the <code>std()</code> method on your DataFrame, for example <code>df.std()</code>. It calculates the standard deviation of each column. You can then select the column you’re interested in from the result.</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

# Create your Pandas DataFrame
d = {'username': ['Alice', 'Bob', 'Carl'], 'age': [18, 22, 43], 'income': [100000, 98000, 111000]}
df = pd.DataFrame(d)
print(df)</pre>
<p>Your DataFrame looks like this:</p>
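<figure class="wp-block-table is-style-stripes"><table><thead><tr><th></th><th>username</th><th>age</th><th>income</th></tr></thead><tbody><tr><td>0</td><td>Alice</td><td>18</td><td>100000</td></tr><tr><td>1</td><td>Bob</td><td>22</td><td>98000</td></tr><tr><td>2</td><td>Carl</td><td>43</td><td>111000</td></tr></tbody></table></figure>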
<p>Here’s how you can calculate the standard deviation of all numeric columns. Passing <code>numeric_only=True</code> skips the text column <code>username</code>; pandas 2.0 and later raise an error without it:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(df.std(numeric_only=True))</pre>
<p>The output is the standard deviation of all numeric columns:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">age         13.428825
income    7000.000000
dtype: float64</pre>
<p>To get the standard deviation of an individual column, access it using simple indexing:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(df.std(numeric_only=True)['age'])
# approximately 13.4288</pre>
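<p>As an alternative, you can select the column first and call <code>std()</code> on the resulting Series, which avoids computing the statistic for columns you don’t need. The following is a minimal sketch that rebuilds the example DataFrame; the names <code>age_std</code> and <code>subset_std</code> are assumptions made for this example.</p>

```python
import pandas as pd

# Rebuild the example DataFrame so the sketch is self-contained
d = {'username': ['Alice', 'Bob', 'Carl'],
     'age': [18, 22, 43],
     'income': [100000, 98000, 111000]}
df = pd.DataFrame(d)

# Select the column first, then call std() on the resulting Series
age_std = df['age'].std()
print(round(age_std, 6))  # 13.428825

# Select several numeric columns and compute their standard deviations together
subset_std = df[['age', 'income']].std()
print(subset_std)
```

<p>Because only numeric columns are selected here, no <code>numeric_only</code> argument is needed.</p>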
<h2>Standard Deviation in NumPy Library</h2>
<p>Python’s package for data science computation <a rel="noreferrer noopener" href="" target="_blank">NumPy</a> also has great statistics functionality. You can calculate all basic statistics such as <a rel="noreferrer noopener" href="" target="_blank">average</a>, median, <a rel="noreferrer noopener" href="" target="_blank">variance</a>, and <a rel="noreferrer noopener" href="" target="_blank">standard deviation</a> on NumPy arrays. Simply import the NumPy library and use the <code>np.std(a)</code> function to calculate the standard deviation of NumPy array <code>a</code>.</p>
<p>Here’s the code:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import numpy as np
a = np.array([1, 2, 3])
print(np.std(a))
# 0.816496580927726</pre>
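<p>One caveat when switching between the two libraries: <code>np.std()</code> divides by <code>n</code> by default (<code>ddof=0</code>, the population standard deviation), while the pandas <code>std()</code> method divides by <code>n - 1</code> by default (<code>ddof=1</code>, the sample standard deviation). The sketch below compares the two on the ages from the DataFrame example; <code>pd.Series(ages)</code> stands in for the column, and the variable names are assumptions made for this example.</p>

```python
import numpy as np
import pandas as pd

ages = [18, 22, 43]  # same ages as in the DataFrame example

# NumPy divides by n by default (ddof=0, population standard deviation)
numpy_default = np.std(ages)
# pandas divides by n - 1 by default (ddof=1, sample standard deviation)
pandas_default = pd.Series(ages).std()

print(round(numpy_default, 4))   # 10.9646
print(round(pandas_default, 4))  # 13.4288

# Passing ddof explicitly makes the two libraries agree
numpy_sample = np.std(ages, ddof=1)
pandas_population = pd.Series(ages).std(ddof=0)
print(np.isclose(numpy_sample, pandas_default))     # True
print(np.isclose(pandas_population, numpy_default))  # True
```

<p>If results from both libraries need to match, setting <code>ddof</code> explicitly in both calls is the safer choice.</p>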