[Tut] How to Use Pandas Rolling – A Simple Illustrated Guide


<div>
<figure class="wp-block-image"><img src="https://lh4.googleusercontent.com/oZyocnFR62x1vikMg_8MVm2WHSkCFjGdTHUIzqVj2Ium47ZKd-_-IYZo8xy7nLtkzK97gWfeRgWgWLQtAT5E9nl-FDNBlusIL7ujR5UKuzOL1icKvPVPBwhYZlxPrjhgjIY1-KM4" alt=""/></figure>
<p>This article will demonstrate how to use a pandas dataframe method called <code>rolling()</code>. </p>
<p><strong>What does the <code>pandas.DataFrame.rolling()</code> method do? </strong></p>
<p>In short, it performs <strong><em>rolling windows calculations</em></strong>. </p>
<p>It is often used when working with time-series data or signal processing. I will shortly dive into a few practical examples to clarify what this means in practice. </p>
<p>The method takes a parameter that specifies the size of the window over which the desired calculation is performed. </p>
<p>As a simple time-series example, suppose each row of a pandas dataframe represents one day with some values. </p>
<p>Let’s say that the desired window size is five days. The rolling method is given 5 as input, and for each row it performs the calculation over that row and the four rows before it, moving the window forward one row at a time. </p>
<p>Before an example of this, let’s see the method, its syntax, and its parameters.  </p>
<h2>pandas.DataFrame.rolling()</h2>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">DataFrame.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None, method='single')</pre>
<p>Let’s dive into the parameters one by one:</p>
<h3>window</h3>
<p><strong><code>window</code></strong>: <strong><code>int</code></strong>, <strong><code>offset</code>, or <code>BaseIndexer</code> subclass</strong></p>
<p>This is the size of the moving window. </p>
<p>If an integer, the fixed number of observations is used for each window. </p>
<p>If an offset, the time period of each window. Each window will be variable-sized based on the observations included in the time period. This is only valid for <code><a href="https://blog.finxter.com/how-to-work-with-dates-and-times-in-python/" data-type="post" data-id="4927" target="_blank" rel="noreferrer noopener">datetime</a></code>-like indexes.</p>
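<p>The difference between an integer window and an offset window can be sketched as follows. The dates and prices are made up for illustration, with one calendar day deliberately missing; this is a minimal example, not the data used later in the article.</p>

```python
import pandas as pd

# Hypothetical daily data with one missing day (2021-04-04)
dates = pd.to_datetime(
    ["2021-04-01", "2021-04-02", "2021-04-03", "2021-04-05", "2021-04-06"]
)
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], index=dates)

# Integer window: always the last 3 rows, regardless of the dates
by_rows = prices.rolling(3).sum()

# Offset window: all rows that fall within the last 3 calendar days
by_days = prices.rolling("3D").sum()

print(pd.DataFrame({"price": prices, "3 rows": by_rows, "3 days": by_days}))
```

<p>With the integer window, the first two rows are <code>NaN</code> because three observations are not yet available. With the offset window, the rows after the gap sum fewer observations because only two rows fall inside the three-day period.</p>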
<p>If a <code>BaseIndexer</code> subclass, the window boundaries are based on the defined <code>get_window_bounds()</code> method. Additional rolling keyword arguments, namely <code>min_periods</code>, <code>center</code>, and <code>closed</code>, will be passed to <code>get_window_bounds()</code>.</p>
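<p>As a small sketch of the <code>BaseIndexer</code> case, pandas ships a ready-made subclass, <code>FixedForwardWindowIndexer</code>, that looks forward instead of backward. The values below are made up for illustration.</p>

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# A built-in BaseIndexer subclass: each window covers the current row
# and the next one, instead of looking backwards
indexer = FixedForwardWindowIndexer(window_size=2)
forward_sum = values.rolling(window=indexer, min_periods=1).sum()

print(forward_sum.tolist())
```

<p>The last window contains only the final row, which is why <code>min_periods=1</code> is used here to still get a value for it.</p>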
<h3>min_periods</h3>
<p><strong><code>min_periods</code>: <code>int</code>, default <code>None</code></strong></p>
<p>This is the minimum number of observations in the window required to have a value. </p>
<p>Otherwise, the result is assigned <code><a href="https://blog.finxter.com/check-for-nan-values-in-python/" data-type="post" data-id="273492" target="_blank" rel="noreferrer noopener">np.nan</a></code>. </p>
<ul>
<li>For a window that is specified by an offset, <code>min_periods</code> will default to 1. </li>
<li>For a window specified by an integer, <code>min_periods</code> will default to the size of the window. </li>
</ul>
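<p>The effect of <code>min_periods</code> on an integer window can be illustrated with a few made-up values:</p>

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0])

# Default for an integer window: min_periods equals the window size (3),
# so the first two results are NaN
strict = values.rolling(3).mean()

# min_periods=1: a result is produced as soon as one observation exists
relaxed = values.rolling(3, min_periods=1).mean()

print(pd.DataFrame({"value": values, "default": strict, "min_periods=1": relaxed}))
```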
<h3>center</h3>
<p><strong><code>center</code>: <code>bool</code>, default <code>False</code></strong></p>
<p>If <code>False</code>, set the window labels as the right edge of the window index. If <code>True</code>, set the window labels as the center of the window index. </p>
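<p>A minimal sketch of how <code>center</code> changes where each result is placed, using made-up values:</p>

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Default: the result is labelled at the right edge of each window
right_aligned = values.rolling(3).mean()

# center=True: the result is labelled at the middle row of each window
centered = values.rolling(3, center=True).mean()

print(pd.DataFrame({"value": values, "right": right_aligned, "center": centered}))
```

<p>The computed means are the same in both cases. Only the row each mean is attached to differs, so the <code>NaN</code> values move from the first two rows to the first and last row.</p>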
<h3>win_type</h3>
<p><strong><code>win_type</code>: <code>str</code>, default <code>None</code></strong></p>
<p>If <code>None</code>, all points are evenly weighted. </p>
<p>If a string, it must be a valid window function from <code><a href="https://docs.scipy.org/doc/scipy/reference/signal.windows.html#module-scipy.signal.windows" data-type="URL" data-id="https://docs.scipy.org/doc/scipy/reference/signal.windows.html#module-scipy.signal.windows" target="_blank" rel="noreferrer noopener">scipy.signal</a></code>. </p>
<p>Some of the Scipy window types require additional parameters to be passed in the aggregation function. </p>
<p>The additional parameters must match the keywords specified in the Scipy window type method signature. </p>
<h3>on</h3>
<p><strong><code>on</code>: <code>str</code>, optional</strong></p>
<p>For a DataFrame, a column label or index level on which to calculate the rolling window, rather than the DataFrame’s index. </p>
<p>The provided integer column is ignored and excluded from the result since an integer index is not used to calculate the rolling window. </p>
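<p>A short sketch of the <code>on</code> parameter, assuming a dataframe whose dates live in an ordinary column rather than in the index. The column names and values are hypothetical.</p>

```python
import pandas as pd

# Hypothetical data: the dates are a regular column, the index is 0..3
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-04-01", "2021-04-02", "2021-04-05", "2021-04-06"]
    ),
    "volume": [100, 200, 300, 400],
})

# Roll over the "date" column instead of the default integer index
two_day_volume = df.rolling("2D", on="date").sum()

print(two_day_volume)
```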
<h3>axis</h3>
<p><strong><code>axis</code>: <code>int</code> or <code>str</code>, default 0</strong></p>
<p>If 0 or <code>'index'</code>, roll across the rows. If 1 or <code>'columns'</code>, roll across the columns. </p>
<h3>closed</h3>
<p><strong><code>closed</code>: <code>str</code>, default <code>None</code></strong></p>
<ul>
<li>If <code>'right'</code>, the first point in the window is excluded from calculations. </li>
<li>If <code>'left'</code>, the last point in the window is excluded from calculations.</li>
<li>If <code>'both'</code>, then no points in the window are excluded from the calculations. </li>
<li>If <code>'neither'</code>, the first and last points in the window are excluded from the calculations. </li>
</ul>
<p>Default <code>None</code> means <code>'right'</code>.</p>
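<p>The four <code>closed</code> options are easiest to compare side by side. The sketch below sums a series of ones over a two-day offset window, so each result is simply the number of points that fall inside the window. The dates are made up, with 2021-04-05 missing on purpose.</p>

```python
import pandas as pd

dates = pd.to_datetime(
    ["2021-04-01", "2021-04-02", "2021-04-03", "2021-04-04", "2021-04-06"]
)
ones = pd.Series(1.0, index=dates)

# Summing ones counts the points inside each 2-day window
result = pd.DataFrame({
    side: ones.rolling("2D", closed=side).sum()
    for side in ["right", "both", "left", "neither"]
})

print(result)
```

<p>Windows that contain no points at all, such as the first row with <code>'left'</code>, come out as <code>NaN</code>.</p>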
<h3>method</h3>
<p><strong><code>method</code>: <code>str</code> <code>{'single', 'table'}</code>, default <code>'single'</code></strong></p>
<p>Execute the rolling operation per single column or row for <code>'single'</code> or over the entire object for <code>'table'</code>. </p>
<p>This argument is only implemented when specifying <code>engine='numba'</code> in the method call. </p>
<hr class="wp-block-separator"/>
<p>This part was obtained from the <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html" data-type="URL" data-id="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html" target="_blank" rel="noreferrer noopener">official pandas documentation</a>. </p>
<h2>Data</h2>
<p>The data I will be working with for this tutorial is historical data for a single stock, Amazon (<code>AMZN</code>). </p>
<p>I use the Python package <code>yfinance</code> to import the data. I will use data starting from <code>2021-04-01</code> and running one year forward in time. </p>
<p>The data only includes trading days, i.e., days when the stock market was open.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Get the stock data from Yahoo Finance
import yfinance

AmazonData1y = yfinance.Ticker("AMZN").history(period='1y', actions=False, end='2022-04-01')
# display() is available in Jupyter/IPython; use print() in a plain script
display(AmazonData1y.head(20))</pre>
<div class="wp-block-image">
<figure class="aligncenter"><img src="https://lh6.googleusercontent.com/zvwb_9G7ndEgIGlYaImPZXbTlNp-XAHIhp9RfR7zVxmGuvRZ4ZB9AUNKnChtgd30CShiCZ6KjDvQpHWGkgWfc6OMhVD2r1l6BmlI3FpR5nh3ElE6munQWbxfF3LKQEhz8XymsCTv" alt=""/></figure>
</div>
<p>The resulting dataframe contains data about the opening price, the highest price, the lowest price, the closing price, and the trading volume for each day.&nbsp;</p>
<h2>Calculating moving averages</h2>
<p>The first calculations I will do with the rolling method are moving averages, which are often applied in stock analysis. </p>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> A moving average value is a statistic that captures the average change in a data series over time. (<a rel="noreferrer noopener" href="https://www.investopedia.com/terms/m/movingaverage.asp" data-type="URL" data-id="https://www.investopedia.com/terms/m/movingaverage.asp" target="_blank">source</a>)</p>
<p>Let’s calculate the seven-day and 15-day moving averages of the stock closing price and add those values as new columns to the existing Amazon dataframe. </p>
<p>They are named <code>'7MA'</code> and <code>'15MA'</code>.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Calculating the 7 and 15 day moving averages based on closing price
# and adding them as new columns
AmazonData1y['7MA'] = AmazonData1y['Close'].rolling(7).mean()
AmazonData1y['15MA'] = AmazonData1y['Close'].rolling(15).mean()
display(AmazonData1y.head(20))</pre>
<div class="wp-block-image">
<figure class="aligncenter"><img src="https://lh3.googleusercontent.com/clZ14KnZ_PAwayYCoeaWT16QUiQCelMHYdf7uF6d4K915xmqSSaMK9R0tafg3hAuVbACgjCPdsIsE_xgir1LV07rUd4hhZ3lsiP_WrOR1R-UelQLXEeDWjxnrtVOkER9TNf6Ab_p" alt=""/></figure>
</div>
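<p>Since there is no data before <code>2021-04-01</code>, no seven-day moving average can be calculated before <code>2021-04-13</code> and no 15-day moving average before <code>2021-04-23</code>. Until a full window of observations is available, the result is <code>NaN</code>, because <code>min_periods</code> defaults to the window size for an integer window. </p>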
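<p>To see what the moving average actually computes without downloading anything, here is a minimal sketch on made-up closing prices. These are not real AMZN prices, and the start date is an assumption chosen only for illustration.</p>

```python
import pandas as pd

# Hypothetical closing prices for ten business days (not real AMZN data)
dates = pd.bdate_range("2021-04-05", periods=10)
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 108.0, 111.0, 115.0],
    index=dates,
)

ma7 = close.rolling(7).mean()

# The first six rows have fewer than seven observations, so they are NaN
first_valid = ma7.first_valid_index()

# Each value is the plain mean of the current and six preceding closes
manual = close.iloc[0:7].mean()

print(first_valid.date(), ma7.loc[first_valid], manual)
```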
<h2>Calculating the Sum of Trading Volume</h2>
<p>Let’s now instead use the rolling method to calculate the sum of the volume from the last five trading days to spot if there was any spike in volume.</p>
<p>It is done in the same way as for the moving average, but here the <a rel="noreferrer noopener" href="https://blog.finxter.com/python-sum/" data-type="post" data-id="22221" target="_blank"><code>sum()</code></a> method is used together with the rolling method instead of the <code>mean()</code> method. </p>
<p>I will also add this as a new column to the existing Amazon dataframe. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Calculating 5 day volume using rolling
AmazonData1y['5VOL'] = AmazonData1y['Volume'].rolling(5).sum()
display(AmazonData1y.head(20))</pre>
<div class="wp-block-image">
<figure class="aligncenter"><img src="https://lh6.googleusercontent.com/rSd107FYEgpg88bQgfdDZpvSn942PyuygydvwdAQ3SDpZM02lee31bHlvPKKhmvuwvwSQZN7qHTVB6sDcg3nNq6DYVmAP88wmmGSkgT0tu7hGL_-ikMqJD2qbjKNGdv9krkDrxhQ" alt=""/></figure>
</div>
<p>This metric might not be the most useful but it is a good way to explain how you could use the rolling method together with the <code>sum()</code> method. </p>
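<p>One possible way to turn the rolling sum into a spike check is sketched below. The volumes are made up, and the 50% threshold is an arbitrary assumption for illustration, not an established rule.</p>

```python
import pandas as pd

# Hypothetical daily volumes (not real AMZN data); row 7 is an obvious spike
volume = pd.Series([100, 110, 95, 105, 100, 98, 102, 400, 101, 99])

vol5 = volume.rolling(5).sum()

# Assumed rule of thumb: flag days where the 5-day sum jumps by more
# than 50% compared with the previous day's 5-day sum
spike = vol5 > 1.5 * vol5.shift(1)

print(volume[spike])
```

<p>Comparisons against <code>NaN</code> evaluate to <code>False</code>, so the first rows, where no five-day sum exists yet, are never flagged.</p>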
<h2>Using rolling() with Aggregation</h2>
<p>Combining the <code>rolling()</code> method with the aggregation method <code>agg()</code> makes it easy to perform rolling calculations on multiple columns simultaneously. </p>
<p>Say that I would like to find the highest high and the lowest low for the last seven trading days. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Performing rolling calculations on multiple columns at the
# same time using .agg()
SevenHighAndLow = AmazonData1y.rolling(7).agg({'High': 'max', 'Low': 'min'})
display(SevenHighAndLow.head(20))</pre>
<div class="wp-block-image">
<figure class="aligncenter"><img src="https://lh6.googleusercontent.com/UtKxjPzoAASwzbcdDubHkvW2bV9n-3T3ZOGdLCR62k7JsJ4ma-ScQGZDW1ONBJmxYaZOtJQdldJWyzIjrAr4sgC8QzXeYacPOv2YDz9TLsV2wFckJv2WZlMofgthOnZ7Ifp5wnpW" alt=""/></figure>
</div>
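<p>The same pattern can be tried on a small made-up dataframe, which makes the results easy to check by hand. The sketch also shows a second form of <code>agg()</code>, where a list of functions is applied to a single column. The values are hypothetical, and a window of three rows is used to keep the example short.</p>

```python
import pandas as pd

# Hypothetical highs and lows (not real AMZN data)
prices = pd.DataFrame({
    "High": [10.0, 12.0, 11.0, 15.0, 13.0],
    "Low": [8.0, 9.0, 7.0, 10.0, 11.0],
})

# Different function per column: highest high and lowest low over 3 rows
high_low = prices.rolling(3).agg({"High": "max", "Low": "min"})

# Several functions on one column: one result column per function
high_stats = prices["High"].rolling(3).agg(["min", "max"])

print(high_low)
print(high_stats)
```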
<h2>Plotting the Values</h2>
<p>This part visualizes the calculated values, which is more appealing than looking at the columns of a dataframe. </p>
<p>First, let’s plot the calculated moving averages alongside the closing price. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Plotting the closing price with the 7 and 15 day moving averages
import matplotlib.pyplot as plt

AmazonData1y.plot(y=['Close', '7MA', '15MA'], kind='line', figsize=(14,12))
plt.title('Closing price, 7MA and 15MA', fontsize=16)
plt.xlabel('Date')
plt.ylabel('Stock price($)')
plt.show()</pre>
<div class="wp-block-image">
<figure class="aligncenter"><img src="https://lh6.googleusercontent.com/jQSp1ogckMdep922IJMkK3PKa6ENu1-o8GWNm2g0AhyID1JYYhO9La2nPac0QWvCFe9V0_Nz4KnMl490DE5EGXhg7IallhEjy8C9iQGR2NV7TiMXa8oqv0517ue6mUc5IRgY2_p1" alt=""/></figure>
</div>
<p>Next, let’s plot the accumulated five-day volume alongside the closing price. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Plotting the closing price alongside the 5 day volume
AmazonData1y.plot(y=['Close', '5VOL'], secondary_y='5VOL', kind='line', ylabel='Stock Price ($)', figsize=(14,12))
plt.title('Closing price and 5 day accumulated volume', fontsize=16)
plt.xlabel('Date')
plt.ylabel('Volume')
plt.show()</pre>
<div class="wp-block-image">
<figure class="aligncenter"><img src="https://lh5.googleusercontent.com/83qk16yeQwY9zCJJY3w2KbOIvQWZzCX2RjCjUHZtkADH9smK10IfGK4y5-gHLclys9eCUnNfoaUXjjeWzPJS2B4_umhPg0mOviZ2W8Hlg1tNfkWhej4k4_TZRSHMF6dDscI5IaEB" alt=""/></figure>
</div>
<h2>Summary</h2>
<p>This was a short tutorial on applying the <code>rolling()</code> method to a pandas dataframe to calculate a few rolling statistics. </p>
<p>The goal of this article was to demonstrate some simple examples of how the <code>rolling()</code> method works, and I hope it accomplished that. </p>
<p>The <code>rolling()</code> method can be combined with most statistical calculations, so try exploring it with methods other than those used in this article. </p>
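<p>As a starting point for that exploration, here is a small sketch with a few other rolling statistics, including a custom one via <code>apply()</code>. The values are made up, and the "range" statistic is only an illustrative choice.</p>

```python
import pandas as pd

values = pd.Series([1.0, 4.0, 2.0, 8.0, 5.0])

# Built-in rolling statistics
rolling_std = values.rolling(3).std()
rolling_median = values.rolling(3).median()

# Custom statistic via apply(): range (max - min) of each window.
# raw=True passes each window to the function as a NumPy array
rolling_range = values.rolling(3).apply(lambda w: w.max() - w.min(), raw=True)

print(pd.DataFrame({
    "value": values,
    "std": rolling_std,
    "median": rolling_median,
    "range": rolling_range,
}))
```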
<hr class="wp-block-separator"/>
</div>


https://www.sickgaming.net/blog/2022/04/...ted-guide/