Create an account


Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[Tut] How to Read Specific Columns from CSV File in Python

#1
How to Read Specific Columns from CSV File in Python

Rate this post
A Quick Glance at The Solutions [Each solution stays for 5-10 secs.]

Problem: Given a CSV file, how to read only specific column(s) from the csv file? (Reading a specific column from a csv file will yield all the row values pertaining to that column.)

Example: Consier the following csv file (countries.csv):

Country,Capital,Population,Area
Germany,Berlin,"84,267,549","348,560"
France,Paris,"65,534,239","547,557"
Spain,Madrid,"46,787,468","498,800"
Italy,Rome,"60,301,346","294,140"
India,Delhi,"1,404,495,187","2,973,190"
USA,Washington,"334,506,463","9,147,420"
China,Beijing,"1,449,357,022","9,388,211"
Poland,Warsaw,"37,771,789","306,230"
Russia,Moscow,"146,047,418","16,376,870"
England,London,"68,529,747","241,930"

Question: How will you read the above csv file and display the following columns

  1. Country column along with the Capital column?
  2. All values in the population column?

Method 1: Using Pandas


Using the Pandas library is probably the best option if you are dealing with csv files. You can easily read a csv file and store an entire column within a variable.

Code:

import pandas as pd df = pd.read_csv("countries.csv")
country = df['Country']
# or
# country = df.Country
capital = df['Capital']
# or
# capital = df.Capital # displaying selected columns (Country and Capital)
for x, y in zip(country, capital): print(f"{x} {y}") # displaying a single column (Country)
print()
print(df['Population'])

Output:

Germany Berlin
France Paris
Spain Madrid
Italy Rome
India Delhi
USA Washington
China Beijing
Poland Warsaw
Russia Moscow
England London 0 84,267,549
1 65,534,239
2 46,787,468
3 60,301,346
4 1,404,495,187
5 334,506,463
6 1,449,357,022
7 37,771,789
8 146,047,418
9 68,529,747
Name: Population, dtype: object

Explanation:

  • Read the csv file using pd.read_csv() Pandas function.
  • Save all the information of the columns Country and Capital within independent variables using
    • country = df['Country']
      • Alternatively, you can also use country = df.Country
    • capital = df['Capital']
      • Alternatively, you can also use capital = df.Capital
  • To display the country names and their capitals simultaneously, you can bind the two columns, country and capital, using the zip() function and then display each country along with its capital using a for loop upon the zipped object.
  • To display all the values in the population column, you can simply use df['Population'].

TRIVIA
zip() is a built-in function in Python that takes an arbitrary number of iterables and binds them into a single iterable, a zip object. It combines the n-th value of each iterable argument into a tuple.
Read more about zip() here.

List-Based Indexing of a DataFrame


In case you are not comfortable with using zip() to display multiple columns at once, you have another option. You can simply use list-based indexing to accomplish your goal.

List-based indexing is a technique that allows you to pass multiple column names as a list within the square-bracket selector.

Example:

import pandas as pd df = pd.read_csv("countries.csv")
print()
print(df[['Country', 'Capital']])

Output:

 Country Capital
0 Germany Berlin
1 France Paris
2 Spain Madrid
3 Italy Rome
4 India Delhi
5 USA Washington
6 China Beijing
7 Poland Warsaw
8 Russia Moscow
9 England London

Method 2: Integer Based Indexing with iloc


Approach: The idea here is to use the df.iloc[rows, columns].values to access individual columns from the DataFrame using indexing. Note that the first column always has the index 0, while the second column has index 1, and so on.

  • rows is used to select individual rows. Use the slicing colon: to ensure all rows have been selected.
  • columns is used to select individual columns.
    • Use country = data.iloc[:, 0].values to save the values of the Country column.
    • capital = data.iloc[:, 1].values to save the values of the Capital column.
    • population = data.iloc[:, 2].values to save the values of the Population column.
import pandas as pd data = pd.read_csv('countries.csv')
country = data.iloc[:, 0].values
capital = data.iloc[:, 1].values
population = data.iloc[:, 2].values
# displaying selected columns
print(data[['Country', 'Capital']])
print()
# displaying a single column (Population)
print(population)

Output:

Country Capital
0 Germany Berlin
1 France Paris
2 Spain Madrid
3 Italy Rome
4 India Delhi
5 USA Washington
6 China Beijing
7 Poland Warsaw
8 Russia Moscow
9 England London ['84,267,549' '65,534,239' '46,787,468' '60,301,346' '1,404,495,187' '334,506,463' '1,449,357,022' '37,771,789' '146,047,418' '68,529,747']

Method 3: Name-Based Indexing with loc()


Instead of selecting the columns by their index, you can also select them by their name using the df.loc[] selecter.

The following example shows how to select the columns Country and Capital from the given DataFrame.

import pandas as pd data = pd.read_csv('countries.csv')
val = data.loc[:, ['Country', 'Capital']]
print(val)

Output:

Country Capital
0 Germany Berlin
1 France Paris
2 Spain Madrid
3 Italy Rome
4 India Delhi
5 USA Washington
6 China Beijing
7 Poland Warsaw
8 Russia Moscow
9 England London

Related Tutorial: Slicing Data from a Pandas DataFrame using .loc and .iloc

Method 4: Using csv Module


csv module is yet another spectacular option in Python that allows you to play with csv files. Let us have a look at the code that helps us to read the given csv file and then read specific columns from it:

import csv population = []
with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile: data = csv.DictReader(csvfile) for r in data: print("Country", ":", "Capital") # append values from population column to population list population.append(r['Population']) # displaying specific columns (Country and Capital) print(r['Country'], ":", r['Capital']) # display the population list print(population)

Output:

Country : Capital
Germany : Berlin
Country : Capital
France : Paris
Country : Capital
Spain : Madrid
Country : Capital
Italy : Rome
Country : Capital
India : Delhi
Country : Capital
USA : Washington
Country : Capital
China : Beijing
Country : Capital
Poland : Warsaw
Country : Capital
Russia : Moscow
Country : Capital
England : London
['84,267,549', '65,534,239', '46,787,468', '60,301,346', '1,404,495,187', '334,506,463', '1,449,357,022', '37,771,789', '146,047,418', '68,529,747']

Explanation:

  • Import the csv module and open up the csv file. Ensure that you feed in the encoding argument as it helps to eliminate any unreadable characters that may occur in the given csv file.
    • with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile
  • Allow Python to read the csv file as a dictionary using csv.Dictreader object.
  • Once the file has been read in the form of a dictionary, you can easily fetch the values from respective columns by using the keys within square bracket notation from the dictionary. Here each column represents the key within the given dictionary.

Bonus: Here’s a quick look at how the DictReader() class looks like:

import csv population = []
with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile: data = csv.DictReader(csvfile) for row in data: print(row)

Output:

{'Country': 'Germany', 'Capital': 'Berlin', 'Population': '84,267,549', 'Area': '348,560'}
{'Country': 'France', 'Capital': 'Paris', 'Population': '65,534,239', 'Area': '547,557'}
{'Country': 'Spain', 'Capital': 'Madrid', 'Population': '46,787,468', 'Area': '498,800'}
{'Country': 'Italy', 'Capital': 'Rome', 'Population': '60,301,346', 'Area': '294,140'}
{'Country': 'India', 'Capital': 'Delhi', 'Population': '1,404,495,187', 'Area': '2,973,190'}
{'Country': 'USA', 'Capital': 'Washington', 'Population': '334,506,463', 'Area': '9,147,420'}
{'Country': 'China', 'Capital': 'Beijing', 'Population': '1,449,357,022', 'Area': '9,388,211'}
{'Country': 'Poland', 'Capital': 'Warsaw', 'Population': '37,771,789', 'Area': '306,230'}
{'Country': 'Russia', 'Capital': 'Moscow', 'Population': '146,047,418', 'Area': '16,376,870'}
{'Country': 'England', 'Capital': 'London', 'Population': '68,529,747', 'Area': '241,930'}

It is evident from the output that csv.DictReader() returns a dictionary for each row such that the column header is the key while the value in the row is the associated value in the dictionary.

Conclusion


To sum things up, there are majorly four different ways of accessing specific columns from a given csv file:

  • List-Based Indexing.
  • Integer-Based Indexing.
  • Name-Based Indexing.
  • Using csv modules DictReader class.

Feel free to use the one that suits you best. I hope this tutorial helped you. Please subscribe and stay tuned for more interesting tutorials. Happy learning!


Learn Pandas the Fun Way by Solving Code Puzzles


If you want to boost your Pandas skills, consider checking out my puzzle-based learning book Coffee Break Pandas (Amazon Link).

Coffee Break Pandas Book

It contains 74 hand-crafted Pandas puzzles including explanations. By solving each puzzle, you’ll get a score representing your skill level in Pandas. Can you become a Pandas Grandmaster?

Coffee Break Pandas offers a fun-based approach to data science mastery—and a truly gamified learning experience.



https://www.sickgaming.net/blog/2022/05/...in-python/
Reply



Possibly Related Threads…
Thread Author Replies Views Last Post
  [Tut] How to Create and Run a Batch File That Runs a Python Script? xSicKxBot 0 1,240 11-09-2022, 09:53 PM
Last Post: xSicKxBot
  [Tut] Python Create JSON File xSicKxBot 0 1,252 11-03-2022, 01:09 PM
Last Post: xSicKxBot
  [Tut] How to Filter Data from an Excel File in Python with Pandas xSicKxBot 0 1,227 10-31-2022, 05:36 AM
Last Post: xSicKxBot
  [Tut] How to Return a File From a Function in Python? xSicKxBot 0 1,329 10-21-2022, 09:47 AM
Last Post: xSicKxBot
  [Tut] How to Delete a Line from a File in Python? xSicKxBot 0 1,216 09-24-2022, 10:31 AM
Last Post: xSicKxBot
  [Tut] How to Convert a Log to a CSV File in Python? xSicKxBot 0 1,291 08-30-2022, 02:11 AM
Last Post: xSicKxBot
  [Tut] How to Append a New Row to a CSV File in Python? xSicKxBot 0 1,154 08-18-2022, 10:44 AM
Last Post: xSicKxBot
  [Tut] How to Convert Tab-Delimited File to CSV in Python? xSicKxBot 0 1,265 08-14-2022, 01:12 AM
Last Post: xSicKxBot
  [Tut] How to Convert a List of Dicts to a CSV File in Python [4 Ways] xSicKxBot 0 1,182 08-13-2022, 04:32 AM
Last Post: xSicKxBot
  [Tut] How to Read and Convert a Binary File to CSV in Python? xSicKxBot 0 1,379 08-09-2022, 04:41 AM
Last Post: xSicKxBot

Forum Jump:


Users browsing this thread:
1 Guest(s)

Forum software by © MyBB Theme © iAndrew 2016