[Tut] How to Get the Last N Rows of a Pandas DataFrame?


<div>
<p>In this tutorial, we will answer three common questions that come up when working with large datasets.</p>
<h2><strong>Problem Formulation</strong></h2>
<p><strong>Given: </strong>Consider the following CSV file (Note: You need to load it as a Pandas DataFrame).</p>
<figure class="wp-block-image size-full is-style-default"><img loading="lazy" width="329" height="244" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-57.png" alt="" class="wp-image-463115" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-57.png 329w, https://blog.finxter.com/wp-content/uplo...00x222.png 300w" sizes="(max-width: 329px) 100vw, 329px" /></figure>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df = pd.read_csv('countries.csv')
print(df)</pre>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="atomic" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">   Country     Capital     Population        Area
0  Germany      Berlin     84,267,549     348,560
1   France       Paris     65,534,239     547,557
2    Spain      Madrid     46,787,468     498,800
3    Italy        Rome     60,301,346     294,140
4    India       Delhi  1,404,495,187   2,973,190
5      USA  Washington    334,506,463   9,147,420
6    China     Beijing  1,449,357,022   9,388,211
7   Poland      Warsaw     37,771,789     306,230
8   Russia      Moscow    146,047,418  16,376,870
9  England      London     68,529,747     241,930</pre>
<p>Here are the questions that we will focus on in this article:</p>
<ul class="has-background" style="background:linear-gradient(135deg,rgb(255,245,203) 0%,rgb(182,227,212) 37%,rgb(51,167,181) 100%)">
<li><strong>How to get the last N rows of a Pandas DataFrame?</strong></li>
<li><strong>How to get the last N rows from the last N columns of a Pandas DataFrame?</strong></li>
<li><strong>How to read the last N rows of a large CSV file in Pandas?</strong></li>
</ul>
<p><strong>Recommended Read: <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-select-rows-from-a-dataframe-based-on-column-values/" target="_blank">How to Select Rows From a DataFrame Based on Column Values?</a></strong></p>
<p>Without further delay, let us dive into the solutions to the first question and learn how to get the last N rows of a Pandas DataFrame.</p>
<h2><strong>Method 1: Using iloc</strong></h2>
<p class="has-global-color-8-background-color has-background"><strong>Approach: </strong>Use the <code>iloc</code> indexer as <code>pandas.DataFrame.iloc[-n:]</code>.</p>
<p>The <code>iloc</code> indexer gets or sets values by integer position. Select the last <strong>n</strong> rows by passing the slice <strong>[-n:]</strong> to <code>iloc</code>. Here, <strong>-n</strong> is the starting position counted from the end of the DataFrame, and the open-ended slice runs from that row through the last row.</p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df = pd.read_csv('countries.csv')
rows = df.iloc[-5:]
print(rows)</pre>
<p><strong>Output:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">   Country     Capital     Population        Area
5      USA  Washington    334,506,463   9,147,420
6    China     Beijing  1,449,357,022   9,388,211
7   Poland      Warsaw     37,771,789     306,230
8   Russia      Moscow    146,047,418  16,376,870
9  England      London     68,529,747     241,930</pre>
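<p>The slice behaviour can be checked on a small in-memory DataFrame. The following is a minimal sketch: the data is a hypothetical stand-in for <code>countries.csv</code>, so it runs without the file.</p>

```python
import pandas as pd

# Hypothetical stand-in for countries.csv so the example runs without the file.
df = pd.DataFrame({
    "Country": ["Germany", "France", "Spain", "Italy", "India"],
    "Capital": ["Berlin", "Paris", "Madrid", "Rome", "Delhi"],
})

# Positional slice: start two rows before the end and run through the last row.
last_two = df.iloc[-2:]
print(last_two)

# Asking for more rows than exist does not raise; the slice is simply clipped.
everything = df.iloc[-100:]
print(len(everything))
```

<p>Note that the selected rows keep their original index labels (3 and 4 here); they are not renumbered from 0.</p>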
<h2><strong>Method 2: Using tail()</strong></h2>
<p class="has-global-color-8-background-color has-background"><strong>Approach: </strong>Use <code>pandas.DataFrame.tail(n)</code> to select the last <strong>n </strong>rows of the given DataFrame.</p>
<p>The <code>tail(n)</code> method returns the last <strong>n</strong> rows of the DataFrame. Here, <strong>n</strong> is an integer that gives the number of rows you want to fetch from the bottom end of the DataFrame; if you omit it, it defaults to 5.</p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df = pd.read_csv('countries.csv')
rows = df.tail(5)
print(rows)</pre>
<p><strong>Output:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">   Country     Capital     Population        Area
5      USA  Washington    334,506,463   9,147,420
6    China     Beijing  1,449,357,022   9,388,211
7   Poland      Warsaw     37,771,789     306,230
8   Russia      Moscow    146,047,418  16,376,870
9  England      London     68,529,747     241,930</pre>
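<p>For positive <strong>n</strong>, <code>tail(n)</code> and <code>iloc[-n:]</code> return the same rows, but they differ at the edges. The sketch below, which uses a small made-up DataFrame, compares the two for <code>n = 0</code> and shows what a negative <strong>n</strong> does.</p>

```python
import pandas as pd

# Small illustrative DataFrame (not the tutorial's CSV file).
df = pd.DataFrame({"Country": ["Germany", "France", "Spain", "Italy"]})

n = 3
same = df.tail(n).equals(df.iloc[-n:])
print(same)

# Edge case n = 0: tail(0) is empty, but -0 equals 0, so iloc[-0:] is iloc[0:].
empty = df.tail(0)
all_rows = df.iloc[-0:]
print(len(empty), len(all_rows))

# A negative n returns every row except the first |n| rows.
without_first = df.tail(-1)
print(without_first)
```

<p>If <strong>n</strong> comes from user input or a computation and may be zero, <code>tail(n)</code> is therefore the safer choice.</p>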
<p>Well, that brings us to the next question in line – <strong>“How to get the last N rows from last N columns of a Pandas DataFrame?”</strong></p>
<h2><strong>Method 1: </strong>Integer Based Indexing</h2>
<p><strong>Approach: </strong>Use <code>pandas.DataFrame.iloc[-n:, -m:]</code> to select the last <strong>n </strong>rows from the last <strong>m </strong>columns of the given DataFrame. The first slice picks rows and the second picks columns, so <strong>n</strong> and <strong>m</strong> can differ.</p>
<p><strong>Code:</strong> In the following code snippet, we fetch the last 5 rows from the last 2 columns, i.e., <em>Population</em> and <em>Area</em>.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df = pd.read_csv('countries.csv')
rows = df.iloc[-5:, -2:]
print(rows)</pre>
<p><strong>Output:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">      Population        Area
5    334,506,463   9,147,420
6  1,449,357,022   9,388,211
7     37,771,789     306,230
8    146,047,418  16,376,870
9     68,529,747     241,930</pre>
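<p>As a self-contained illustration, the sketch below applies the same two-axis slice to a small made-up DataFrame (numbers stored as plain integers, which is an assumption of this example) and checks it against a two-step version.</p>

```python
import pandas as pd

# Hypothetical stand-in for countries.csv (numbers stored as plain integers).
df = pd.DataFrame({
    "Country": ["Germany", "France", "Spain", "Italy"],
    "Capital": ["Berlin", "Paris", "Madrid", "Rome"],
    "Population": [84267549, 65534239, 46787468, 60301346],
    "Area": [348560, 547557, 498800, 294140],
})

n, m = 3, 2  # rows and columns are chosen independently

# Last n rows (first slice) of the last m columns (second slice).
block = df.iloc[-n:, -m:]
print(block)

# The same selection in two steps: rows with tail(), then columns by position.
two_step = df.tail(n).iloc[:, -m:]
print(block.equals(two_step))
```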
<h2><strong>Method 2: </strong>Name Based Indexing</h2>
<p>If you know the names of the columns you need and want the last <strong>N</strong> records from those columns, you can follow a two-step process.</p>
<ul>
<li>Use the <code>pandas.DataFrame.loc[:, 'start_column_name':'end_column_name']</code> indexer. It lets you slice by column names instead of integer positions, which can be more convenient. Note that it takes square brackets, not parentheses. </li>
<li><code>.loc</code> is for label-based indexing, so a negative number is treated as a label rather than as a position counted from the end, and it cannot be relied on to pick the last rows. Instead, use the <code>tail()</code> method to extract the last <strong>N</strong> records from the selected columns. </li>
</ul>
<p><strong>Code:</strong> The following code snippet shows how you can use the column names to fetch the corresponding values from the last 5 rows of the given DataFrame.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df = pd.read_csv('countries.csv')
rows = df.loc[:, 'Population':'Area']
print(rows.tail(5))</pre>
<p><strong>Output:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">      Population        Area
5    334,506,463   9,147,420
6  1,449,357,022   9,388,211
7     37,771,789     306,230
8    146,047,418  16,376,870
9     68,529,747     241,930</pre>
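<p>Two details of label-based selection are easy to miss: a label slice includes both endpoints, and a list of labels can pick columns that are not adjacent. The following sketch shows both on a small made-up DataFrame; the data is illustrative only.</p>

```python
import pandas as pd

# Hypothetical stand-in for countries.csv (numbers stored as plain integers).
df = pd.DataFrame({
    "Country": ["Germany", "France", "Spain", "Italy"],
    "Capital": ["Berlin", "Paris", "Madrid", "Rome"],
    "Population": [84267549, 65534239, 46787468, 60301346],
    "Area": [348560, 547557, 498800, 294140],
})

# Label slices in .loc include BOTH endpoints, unlike positional slices.
span = df.loc[:, "Capital":"Area"].tail(2)
print(span)

# A list of labels selects non-adjacent columns in the order given.
picked = df.loc[:, ["Country", "Area"]].tail(2)
print(picked)
```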
<p>Last but not least, let us solve the third and final problem of today’s tutorial – “<strong>How to read last N rows of a large csv file in Pandas?</strong>” </p>
<p>Unfortunately, <code>read_csv()</code> has no parameter that directly reads only the last <strong>N</strong> lines of a file. This can be a problem when you are dealing with large datasets that you do not want to load in full. </p>
<p>A workaround is to first count the total number of lines in the file, then use the <code>skiprows</code> parameter to skip every data row before the ones you want while keeping the header line. </p>
<p><strong>Code:</strong> In the following code snippet, we read only the last 5 rows from the CSV file into our DataFrame.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

def num_of_lines(fname):
    # Count the lines in the file (header included) without loading it all.
    with open(fname) as f:
        for i, _ in enumerate(f):
            pass
    return i + 1

num_lines = num_of_lines("countries.csv")
n = 5
# Keep line 0 (the header) and skip every data line before the last n.
df = pd.read_csv("countries.csv", skiprows=range(1, num_lines - n))
print(df)</pre>
<p><strong>Output:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">   Country     Capital     Population        Area
0      USA  Washington    334,506,463   9,147,420
1    China     Beijing  1,449,357,022   9,388,211
2   Poland      Warsaw     37,771,789     306,230
3   Russia      Moscow    146,047,418  16,376,870
4  England      London     68,529,747     241,930</pre>
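<p>The approach above reads the file twice: once to count lines and once to parse. A possible alternative, sketched below, makes a single pass and keeps only the header and the last <strong>n</strong> lines in memory using <code>collections.deque</code>. The helper name <code>read_last_rows</code> and the temporary file are hypothetical, and the sketch assumes one record per line (no quoted fields that contain newlines).</p>

```python
import io
import os
import tempfile
from collections import deque

import pandas as pd

def read_last_rows(path, n):
    """Read the header plus the last n data lines of a CSV (hypothetical helper).

    Assumes one record per line, i.e. no quoted fields containing newlines.
    """
    with open(path, newline="") as f:
        header = f.readline()
        # A bounded deque keeps only the most recent n lines in memory.
        last_lines = deque(f, maxlen=n)
    return pd.read_csv(io.StringIO(header + "".join(last_lines)))

# Build a small throwaway file so the sketch is runnable on its own.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("Country,Capital\n")
    tmp.write("Germany,Berlin\nFrance,Paris\nSpain,Madrid\nItaly,Rome\n")
    path = tmp.name

df_tail = read_last_rows(path, 2)
os.remove(path)
print(df_tail)
```

<p>As with the <code>skiprows</code> version, the resulting DataFrame gets a fresh index starting at 0, because pandas never sees the skipped rows.</p>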
<h2><strong>Conclusion</strong></h2>
<p>Phew! We have successfully solved all the problems that were presented to us in this tutorial. &nbsp;I hope this tutorial helped you to sharpen your coding skills. Please&nbsp;<strong><a rel="noreferrer noopener" href="https://www.youtube.com/channel/UCRlWL2q80BnI4sA5ISrz9uw" target="_blank">stay tuned</a></strong>&nbsp;and&nbsp;<strong><a rel="noreferrer noopener" href="https://blog.finxter.com/subscribe/" target="_blank">subscribe</a></strong>&nbsp;for more interesting coding problems.</p>
<p><strong>Recommended Reads: </strong></p>
<ul class="has-base-background-color has-background">
<li><strong><a href="https://blog.finxter.com/pandas-dataframe-head-method/" target="_blank" rel="noreferrer noopener">Pandas DataFrame head() and tail() Method</a></strong></li>
<li><strong><a href="https://blog.finxter.com/delete-column-pandas-dataframe/" target="_blank" rel="noreferrer noopener">Delete Column from Pandas DataFrame</a></strong></li>
<li><strong><a href="https://blog.finxter.com/change-column-type-in-pandas/" target="_blank" rel="noreferrer noopener">Change Column Type in Pandas</a></strong></li>
</ul>
<hr class="wp-block-separator has-alpha-channel-opacity" />
<h2>Learn Pandas the Fun Way by Solving Code Puzzles</h2>
<p>If you want to boost your Pandas skills, consider checking out my puzzle-based learning book <a href="https://amzn.to/3lyM5iZ" title="https://amzn.to/3lyM5iZ" target="_blank" rel="noreferrer noopener">Coffee Break Pandas</a> (Amazon Link). </p>
<div class="wp-block-image">
<figure class="aligncenter is-resized"><a href="https://amzn.to/3lyM5iZ" target="_blank" rel="noopener"><img loading="lazy" src="https://blog.finxter.com/wp-content/uploads/2020/11/cover.jpg" alt="Coffee Break Pandas Book" class="wp-image-16780" width="340" height="511" title="Coffee Break Pandas Book" srcset="https://blog.finxter.com/wp-content/uploads/2020/11/cover.jpg 680w, https://blog.finxter.com/wp-content/uplo...00x300.jpg 200w, https://blog.finxter.com/wp-content/uplo...50x225.jpg 150w" sizes="(max-width: 340px) 100vw, 340px" /></a></figure>
</div>
<p>It contains 74 hand-crafted Pandas puzzles including explanations. By solving each puzzle, you’ll get a score representing your skill level in Pandas. Can you become a Pandas Grandmaster?</p>
<p><a href="https://amzn.to/3lyM5iZ" target="_blank" rel="noreferrer noopener" title="https://amzn.to/3lyM5iZ">Coffee Break Pandas</a> offers a fun-based approach to data science mastery—and a truly gamified learning experience.</p>
</div>


https://www.sickgaming.net/blog/2022/07/...dataframe/