Reading and Writing CSV Files in Python using CSV Module & Pandas

By Vinay | October 24, 2021
Python offers several ways to read and write CSV files. Of these, the standard library's csv module and the pandas library provide the most straightforward methods. Because a CSV file is plain text, you can also read it with Python's ordinary file handling and the open() function.

In this Python tutorial, we will walk through how to read and write CSV files in Python. By the end, you will have a solid idea of what a CSV file is and how to handle CSV files in Python.

What is a CSV file?

A CSV (Comma-Separated Values) file is a simple text file with the file extension .csv. Unlike an arbitrary text file, the data inside a CSV file must be organized in a specific format: it is stored in a tabular layout, and, as the name suggests, the values are separated by commas.

As with tabular data in relational databases, every row (line) of the CSV file represents a record, and every column represents a specific data field.

#movies.csv

movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance

A CSV file can also be opened in a spreadsheet application such as Excel, where you can see the data laid out as a table.

In the movies.csv file above, the values within each record are separated by commas, and every record ends with a newline.

Now let’s discuss how we can read and write data in a CSV file in Python.
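Because a CSV file is plain text, it is tempting to split each line on commas by hand. The short sketch below suggests why a dedicated parser is usually the safer choice. The sample string, and in particular its second record with a comma inside the title, is a made-up illustration rather than part of movies.csv.

```python
import csv
import io

# In-memory stand-in for a small CSV file. The second record is a made-up
# example whose title contains a comma, so the title has to be quoted.
sample = (
    "movieId,title,genres\n"
    "1,Toy Story (1995),Comedy\n"
    '2,"Monsters, Inc. (2001)",Comedy\n'
)

# Naive approach: split every line on commas.
naive_rows = [line.split(",") for line in sample.splitlines()]

# csv.reader understands the quoting rules and keeps the title in one field.
parsed_rows = list(csv.reader(io.StringIO(sample)))

print(naive_rows[2])   # the quoted title is broken into two pieces
print(parsed_rows[2])  # the title stays intact
```

The naive split yields four fields for the last record, while csv.reader returns the expected three.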

Python CSV Module

Python comes with a csv module in its standard library for reading and writing CSV files. To use it, we first have to import it with the import statement.

import csv

Create a CSV file in Python and write data

Let’s start by creating a CSV file with Python and writing some data to it. Although we could simply use the file object's write() method, here we will use csv.writer() to create a writer object and its writerow() method to write the data row by row.

Example: Write a CSV file in Python

import csv

#open or create file
with open("movies.csv", 'w', newline="") as file:
    writer = csv.writer(file)
    
    #write data
    writer.writerow(["movieId", "title", "genres"])
    writer.writerow(["1","Toy Story (1995)","Adventure|Animation|Children|Comedy|Fantasy"])
    writer.writerow(["2","Jumanji (1995)","Adventure|Children|Fantasy"])
    writer.writerow(["3","Grumpier Old Men (1995)","Comedy|Romance"])
    writer.writerow(["4","Waiting to Exhale (1995)","Comedy|Drama|Romance"])

From the above example you can see that to write a CSV file in Python you first need to open it using the Python open() function.

When you execute the above program, it will create a movies.csv file in the current working directory, which is the directory of your Python script if you run the script from there.

#movies.csv

movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance

In the above example, the open("movies.csv", 'w', newline="") statement also passes the newline="" argument. It turns off Python's own newline translation so that the csv module controls the line endings itself; without it, an extra blank line can appear between records on platforms such as Windows.
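To see why the newline="" argument matters, it can help to look at the exact characters the writer produces. The sketch below writes to an in-memory io.StringIO buffer instead of a real file (an assumption made only to keep the example free of side effects) and prints the result with repr() so the line endings are visible.

```python
import csv
import io

# In-memory text buffer used in place of a real file; it stores the
# characters exactly as the writer sends them.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["movieId", "title"])
writer.writerow(["1", "Toy Story (1995)"])

# repr() makes the line endings visible.
written = buffer.getvalue()
print(repr(written))
```

Each record ends with \r\n, the writer's default line terminator. If the file object translated newlines a second time, as text files do by default on Windows, every \n would grow into \r\n and the resulting \r\r\n would show up as a blank line between records.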

Write CSV data in Python using writerows() method

In the above example, we wrote data to our movies.csv file using the writerow() method. Because writerow() writes a single row per call, we had to call it once for every row.

The writer object returned by csv.writer() also provides the writerows() method, which writes multiple rows to the CSV file with one call.

Python Example: Write multiple rows in a csv file with writerows() 

Let’s continue with our above example and append new rows of movie data in our movies.csv file with the writer.writerows() method.

import csv

movies_rows = [
                ["5","Father of the Bride Part II (1995)","Comedy"],
                ["6","Heat (1995)","Action|Crime|Thriller"],
                ["7","Sabrina (1995)","Comedy|Romance"]
               ]

#append data to movies.csv
with open("movies.csv", 'a', newline="") as file:
    writer = csv.writer(file)
    
    #write multiple rows
    writer.writerows(movies_rows)

In this example, we append new data to our movies.csv file by opening the file in "a" (append) mode. When you execute this program, your movies.csv file will be populated with 3 more rows.

movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance

Note: The default delimiter of csv.writer() is the comma (,), which makes sense for a comma-separated values file. If you want to use some other single character, such as $, > or <, pass it as the delimiter parameter of csv.writer().

writer = csv.writer(file, delimiter=">")
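A natural follow-up question is what happens when a value contains the delimiter itself. The following sketch, again using an in-memory buffer and a made-up second row, illustrates the writer's default behavior: such a field is wrapped in double quotes so that it can still be read back as a single value.

```python
import csv
import io

buffer = io.StringIO()

# Writer that separates fields with ">" instead of a comma.
writer = csv.writer(buffer, delimiter=">")
writer.writerow(["1", "Toy Story (1995)", "Comedy"])

# Made-up row whose title contains the delimiter character; with the
# default quoting settings the writer wraps that field in double quotes.
writer.writerow(["2", "A > B", "Drama"])

lines = buffer.getvalue().splitlines()
print(lines)
```

Only the field that needs it is quoted; the first row is written without any quotes.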

Python CSV read Data

Now that you know how to write data to a CSV file, let’s discuss how you can read data from a CSV file using the Python csv module.

To parse a CSV file in Python, that is, to read data from it, we can use the csv.reader() function.

In the above examples, we created a movies.csv file and wrote some data to it. Now let’s read the data back from the same movies.csv file.

Example: Python Parse CSV file and read data using csv.reader()

The csv.reader() function parses the CSV file and returns a reader object, an iterator that yields each row as a list of strings, one string per comma-separated field. As with other iterable objects, we can use a Python for loop to iterate over the value returned by reader().

import csv

#open movies.csv file to read
with open("movies.csv", 'r', newline="") as file:
    rows = csv.reader(file)
    
    for row in rows:
        print(row)

Output

['movieId', 'title', 'genres']
['1', 'Toy Story (1995)', 'Adventure|Animation|Children|Comedy|Fantasy']
['2', 'Jumanji (1995)', 'Adventure|Children|Fantasy']
['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']
['4', 'Waiting to Exhale (1995)', 'Comedy|Drama|Romance']
['5', 'Father of the Bride Part II (1995)', 'Comedy']
['6', 'Heat (1995)', 'Action|Crime|Thriller']
['7', 'Sabrina (1995)', 'Comedy|Romance']

Note: By default, the csv.reader() function splits each line on the comma (,) delimiter. If your CSV file uses a different delimiter, such as >, \t, $ or @, you can pass it explicitly as the delimiter parameter of the reader() function.

rows = csv.reader(file, delimiter=">")
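As a rough illustration of reading with a custom delimiter, the sketch below parses a small in-memory sample that uses ">" between fields (the sample text is an assumption, not a file from this tutorial). It also shows one common way to separate the header row from the data rows by calling next() on the reader once.

```python
import csv
import io

# In-memory stand-in for a file that uses ">" between fields.
data = io.StringIO(
    "movieId>title>genres\n"
    "1>Toy Story (1995)>Comedy\n"
    "6>Heat (1995)>Action|Crime|Thriller\n"
)

reader = csv.reader(data, delimiter=">")
header = next(reader)    # the first row holds the column names
records = list(reader)   # the remaining rows are the data

print(header)
print(records)
```

Because the reader is an iterator, the call to next() consumes the header, so the list that follows contains only the two data rows.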

Parse CSV file to Dict in Python

The Python csv module provides the csv.DictReader class, which parses each row of the CSV file into a Python dictionary.

Calling csv.DictReader() returns a DictReader object, an iterable that yields one dictionary per data row, with the column names from the header row as keys and that row's values as values.

Example

import csv

#open movies.csv file to read
with open("movies.csv", 'r', newline="") as file:
    
    rows = csv.DictReader(file)
    
    for row in rows:
        print(row)

Output

{'movieId': '1', 'title': 'Toy Story (1995)', 'genres': 'Adventure|Animation|Children|Comedy|Fantasy'}
{'movieId': '2', 'title': 'Jumanji (1995)', 'genres': 'Adventure|Children|Fantasy'}
{'movieId': '3', 'title': 'Grumpier Old Men (1995)', 'genres': 'Comedy|Romance'}
{'movieId': '4', 'title': 'Waiting to Exhale (1995)', 'genres': 'Comedy|Drama|Romance'}
{'movieId': '5', 'title': 'Father of the Bride Part II (1995)', 'genres': 'Comedy'}
{'movieId': '6', 'title': 'Heat (1995)', 'genres': 'Action|Crime|Thriller'}
{'movieId': '7', 'title': 'Sabrina (1995)', 'genres': 'Comedy|Romance'}
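The benefit of DictReader is that fields can be looked up by column name instead of by position. The sketch below, which reads two of the movie records from an in-memory buffer rather than from movies.csv, builds a mapping from each title to its list of genres; the variable name genres_by_title is chosen only for this illustration.

```python
import csv
import io

# Two records from the movies data, held in memory for the example.
data = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "3,Grumpier Old Men (1995),Comedy|Romance\n"
)

reader = csv.DictReader(data)
print(reader.fieldnames)  # column names taken from the first row

# Look up fields by name and split the pipe-separated genres into a list.
genres_by_title = {row["title"]: row["genres"].split("|") for row in reader}
print(genres_by_title["Grumpier Old Men (1995)"])
```

Note that every value is still a string, so a numeric column such as movieId would need an explicit int() conversion before doing arithmetic with it.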

Reading and Writing CSV file using Python Pandas Library

pandas is one of the most powerful Python data science libraries and comes with many built-in methods and features. It is widely used for data manipulation and analysis. Using this library, we can read and write data in many file formats, including CSV.

In this Python tutorial, we will only discuss how to write and read CSV files using pandas.

Unlike the Python csv module, pandas is not part of the standard library, so make sure you have installed it before using it.

Installing pandas is easy: the pip install command installs it into your Python environment.

pip install pandas

Write CSV file with Python pandas to_csv()

Creating a CSV file with pandas takes one more step than with the Python csv module. Before we can create a CSV file and write data to it, we have to create a pandas DataFrame. A pandas DataFrame can be understood as a two-dimensional table with labeled rows and columns.

Example

import pandas as pd

#2d array of movies
movies_rows = [
        ['1', 'Toy Story (1995)', 'Adventure|Animation|Children|Comedy|Fantasy'],
        ['2', 'Jumanji (1995)', 'Adventure|Children|Fantasy'],
        ['3', 'Grumpier Old Men (1995)', 'Comedy|Romance'],
        ['4', 'Waiting to Exhale (1995)', 'Comedy|Drama|Romance'],
        ['5', 'Father of the Bride Part II (1995)', 'Comedy'],
        ['6', 'Heat (1995)', 'Action|Crime|Thriller'],
        ['7', 'Sabrina (1995)', 'Comedy|Romance'],
             ]

heading = ['movieId', 'title', 'genres']

#pandas dataframe
movies = pd.DataFrame(movies_rows, columns=heading)

#create the movies.csv file from dataframe
movies.to_csv("movies.csv")

This will create a movies.csv file in the current working directory. Because to_csv() also writes the DataFrame's row index by default, the file gains an unnamed first column:

,movieId,title,genres
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
1,2,Jumanji (1995),Adventure|Children|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama|Romance
4,5,Father of the Bride Part II (1995),Comedy
5,6,Heat (1995),Action|Crime|Thriller
6,7,Sabrina (1995),Comedy|Romance
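If the extra index column is not wanted, to_csv() accepts an index=False argument. The sketch below is a minimal illustration on a two-row DataFrame with shortened sample data; it calls to_csv() without a file name, in which case pandas returns the CSV text as a string instead of writing a file.

```python
import pandas as pd

# Small DataFrame with shortened sample data for the illustration.
movies = pd.DataFrame(
    [
        ["1", "Toy Story (1995)", "Comedy"],
        ["6", "Heat (1995)", "Action|Crime|Thriller"],
    ],
    columns=["movieId", "title", "genres"],
)

# With no path given, to_csv() returns the CSV text as a string.
# index=False leaves out the row-index column.
csv_text = movies.to_csv(index=False)
print(csv_text)
```

The header line now starts directly with movieId, and passing a file name such as "movies.csv" as the first argument would write the same content to disk.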

Read from a CSV file in Python using pandas read_csv() method

To read a CSV file in Python using pandas, we use the pd.read_csv() function. It accepts the CSV file name as a parameter and returns a pandas DataFrame.

Example:

import pandas as pd

df = pd.read_csv("movies.csv")

print(df)

Output

 Unnamed: 0 ... genres
0 0 ... Adventure|Animation|Children|Comedy|Fantasy
1 1 ... Adventure|Children|Fantasy
2 2 ... Comedy|Romance
3 3 ... Comedy|Drama|Romance
4 4 ... Comedy
5 5 ... Action|Crime|Thriller
6 6 ... Comedy|Romance
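The "Unnamed: 0" column in this output is the row index that was saved into the file. One way to handle it, sketched below on an in-memory copy of the first two records, is to pass index_col=0 so that pandas treats the first column as the index rather than as data.

```python
import io

import pandas as pd

# In-memory copy of the first lines of a file that was saved with its index.
csv_text = (
    ",movieId,title,genres\n"
    "0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "1,2,Jumanji (1995),Adventure|Children|Fantasy\n"
)

# index_col=0 tells pandas to use the first column as the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df.columns))
print(df.loc[1, "title"])
```

With this argument the DataFrame has only the three original columns, and rows can be selected by their saved index labels.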

Conclusion

If you just want to parse a CSV file and read and write data, the Python standard csv module is usually sufficient, because pandas is a large third-party dependency that adds installation and import overhead for simple read and write operations.

To write data to a CSV file using the Python standard csv module, we create a writer object with csv.writer() and call its writerow() or writerows() method. To read data from a CSV file, we can use csv.reader().

In pandas, we first create a DataFrame and then write its data to a CSV file using the to_csv() method. To read data from a CSV file using pandas, we use the pd.read_csv() function.

Author: Vinay

I am a Full Stack Developer with a Bachelor's Degree in Computer Science, who also loves to write technical articles that can help fellow developers.
