CSV is the acronym of “Comma Separated Values”. A csv file is just a plain text document used to represent and exchange tabular data. Each row in a csv file represents an “entity”, and each column represents one of its attributes. Columns are usually separated by a comma, but other characters can be used as the field separator instead. In this tutorial we will see how to read and create csv files using Python, and specifically the csv module, which is part of the language’s standard library.

In this tutorial you will learn:

  • How to read csv rows as a list of strings
  • How to read a csv as a list of dictionaries
  • How to create a csv using Python
  • How to create a csv starting from a list of dictionaries
How to read and create csv files using Python

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions

Category      Requirements, Conventions or Software Version Used
System        Distribution independent
Software      Python3
Other         Basic knowledge of Python and Object Oriented Programming
Conventions   # - requires given linux-commands to be executed with root privileges, either directly as the root user or by use of the sudo command
              $ - requires given linux-commands to be executed as a regular non-privileged user

CSV - Comma Separated Value

As we already mentioned in the introduction of this tutorial, a csv is just a simple plain text file, formatted in a way which lets us represent and exchange tabular data. Each row in a csv file represents an entity of some kind, except the first row, which usually contains the field titles. Let’s see an example. Suppose we want to represent characters from The Lord of the Rings in csv format:

Name,Race
Frodo,hobbit
Aragorn,man
Legolas,elf
Gimli,dwarf

The one above is a trivial example of the content of a csv file. As you can see we used the , (comma) as field separator. We save that data in a file called lotr.csv. Let’s see how we can read it using the Python programming language, and the csv module.

Reading a csv file

To interact with a csv file with Python, the first thing we have to do is import the csv module. Let’s write a simple script, just a few lines of code:

#!/usr/bin/env python3
import csv

if __name__ == '__main__':
    with open('lotr.csv', newline='') as csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            print(row)

In this example we suppose that the script we created above (let’s call it script.py) is in the same directory as the csv file, and that this directory is our current working directory.

The first thing we did was to import the csv module; then we opened the file in read mode (the default) with a context manager, so that we are sure the file object is always closed when the interpreter exits the with block, even if some kind of error occurs. You can also notice that we used the newline argument of the open function to specify an empty string as the newline character. This is a safety measure, since, as stated in the csv module documentation:

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
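As a rough illustration of the first point in that quote, the sketch below writes a field containing an embedded newline to a temporary file and reads it back, opening the file with newline='' both times. The file name notes.csv, the temporary directory and the sample row are assumptions made only for this example.

```python
import csv
import os
import tempfile

# Hypothetical sample data: the second field of the last row contains a newline
rows = [["Name", "Note"], ["Frodo", "hobbit\nring bearer"]]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "notes.csv")

    # newline='' lets the csv module control line endings on write
    with open(path, "w", newline="") as csvfile:
        csv.writer(csvfile).writerows(rows)

    # newline='' on read keeps the embedded newline inside the quoted field
    with open(path, newline="") as csvfile:
        read_back = list(csv.reader(csvfile))

print(read_back == rows)
```

The writer quotes the field that contains the newline, and the reader returns it as a single value instead of splitting it into two rows.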

The csvfile object represents our opened file: we pass it as argument to the csv.reader function, which returns a reader object we reference via the reader variable. We use this object to iterate over the rows of the file, each of which is returned as a list of strings. In this case we just print them. If we execute the script we obtain the following result:

$ ./script.py
['Name', 'Race']
['Frodo', 'hobbit']
['Aragorn', 'man']
['Legolas', 'elf']
['Gimli', 'dwarf']
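If the header row should not be treated as data, one common approach is to consume it with the built-in next function before looping over the remaining rows. The sketch below uses io.StringIO as a stand-in for the opened file so that it runs without lotr.csv on disk; that substitution, and the names header and records, are assumptions of this example.

```python
import csv
import io

# In-memory stand-in for the lotr.csv file (assumption, keeps the example self-contained)
csvfile = io.StringIO("Name,Race\nFrodo,hobbit\nAragorn,man\n", newline="")

reader = csv.reader(csvfile)
header = next(reader)    # the first row: the field titles
records = list(reader)   # every remaining row, each a list of strings

print(header)
print(records)
```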

That was pretty easy, wasn’t it? What if a character other than the comma is used as the field separator? In that case we can use the delimiter parameter of the csv.reader function to specify the character which should be used. Let’s say said character is |. We would write:

reader = csv.reader(csvfile, delimiter="|")
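A minimal, self-contained sketch of the delimiter parameter in action follows. The pipe-separated sample text is made up for the example, and io.StringIO stands in for a real file.

```python
import csv
import io

# Hypothetical pipe-separated content, held in memory instead of a file
csvfile = io.StringIO("Name|Race\nFrodo|hobbit\nAragorn|man\n", newline="")

# Tell the reader that fields are separated by "|" rather than ","
pipe_reader = csv.reader(csvfile, delimiter="|")
pipe_rows = list(pipe_reader)

print(pipe_rows)
```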

Read the csv fields in a dictionary

The approach we used above is probably the easiest way to read a csv file with Python. The csv module also defines the DictReader class, which lets us map each row in a csv file to a dictionary, where the keys are the field names and the values are their actual content in the row. Let’s see an example. Here is how we modify our script:

#!/usr/bin/env python3
import csv

if __name__ == '__main__':
    with open('lotr.csv', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            print(row)

The mandatory first argument of the DictReader class constructor is the file object created when we opened the file. If we launch the script, this time we obtain the following result:

{'Name': 'Frodo', 'Race': 'hobbit'}
{'Name': 'Aragorn', 'Race': 'man'}
{'Name': 'Legolas', 'Race': 'elf'}
{'Name': 'Gimli', 'Race': 'dwarf'}
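Since each row is a mapping, individual values can be looked up by field name rather than by position. The following sketch builds a name-to-race lookup from a shortened copy of the data; io.StringIO replaces the file on disk, and the races variable is a name chosen only for this example.

```python
import csv
import io

# Shortened in-memory copy of the sample data (assumption for a runnable example)
csvfile = io.StringIO("Name,Race\nFrodo,hobbit\nAragorn,man\n", newline="")

reader = csv.DictReader(csvfile)

# Access each value by its field name
races = {row["Name"]: row["Race"] for row in reader}

print(reader.fieldnames)  # the field names taken from the first row
print(races)
```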

As already said, the fields contained in the first row are used as the dictionary keys; but what if the first row of the file doesn’t contain the field names? In that case we can specify them by using the fieldnames parameter of the DictReader class constructor:

reader = csv.DictReader(csvfile, fieldnames=['Name', 'Race'])
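The sketch below shows the effect on data that has no header row: when fieldnames is given, the first line is treated as a regular record. The headerless sample text and the use of io.StringIO are assumptions for illustration.

```python
import csv
import io

# Hypothetical data without a header row
csvfile = io.StringIO("Frodo,hobbit\nAragorn,man\n", newline="")

# Field names are supplied explicitly, so no row is consumed as a header
reader = csv.DictReader(csvfile, fieldnames=["Name", "Race"])
characters = [dict(row) for row in reader]

print(characters)
```

Note that if the file did contain a header row and fieldnames were also passed, the header would come back as an ordinary record.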

Create a csv file

Until now we just saw how to read data from a csv file, both as lists of strings, each representing a row, and as dictionaries. Now let’s see how to create a csv file. As always, we start with an example, and then we explain it. Imagine we want to programmatically create the csv file we created manually before. Here is the code we would write:

#!/usr/bin/env python3
import csv

if __name__ == '__main__':
    with open('lotr.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for row in (('Name', 'Race'), ('Frodo', 'hobbit'), ('Aragorn', 'man'), ('Legolas', 'elf'), ('Gimli', 'dwarf')):
            writer.writerow(row)


The first thing you should notice is that this time we opened the lotr.csv file in write mode (w). In this mode a file is created if it doesn’t exist, and is truncated otherwise (check our article about performing input/output operations on files with Python if you want to know more about this subject).

Instead of a reader object, this time we created a writer one, using the writer function provided in the csv module. The parameters this function accepts are very similar to those accepted by the reader one. We could, for example, specify an alternative delimiter using the parameter with the same name.
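As a small sketch of that last point, the example below writes two rows using | as the delimiter. It writes to an in-memory io.StringIO buffer instead of a file so the result can be inspected directly; the buffer and the output variable are assumptions of this example.

```python
import csv
import io

# In-memory buffer standing in for a file opened with newline=''
buffer = io.StringIO(newline="")

writer = csv.writer(buffer, delimiter="|")
writer.writerow(("Name", "Race"))
writer.writerow(("Frodo", "hobbit"))

output = buffer.getvalue()
print(repr(output))
```

By default the writer terminates each row with \r\n, which is why the file should be opened with newline='' when writing to disk.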

Since in this case we already know all the csv rows beforehand, we can avoid using a loop, and write all of them at once using the writerows method of the writer object:

#!/usr/bin/env python3
import csv

if __name__ == '__main__':
    with open('lotr.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows((('Name', 'Race'), ('Frodo', 'hobbit'), ('Aragorn', 'man'), ('Legolas', 'elf'), ('Gimli', 'dwarf')))

Create a csv file with the DictWriter object

The csv module provides a DictWriter class, which lets us map a dictionary to a csv row. This can be very useful when the data we are working on comes in this form and we want to represent it in tabular form. Let’s see an example. Suppose our LOTR characters data is represented as a list of dictionaries (perhaps as it would be returned from an API call made with the requests module). Here is what we could write to create a csv based on it:

#!/usr/bin/env python3
import csv

characters_data = [
  {
    'Name': 'Frodo',
    'Race': 'hobbit'
  },
  {
    'Name': 'Aragorn',
    'Race': 'man'
  },
  {
    'Name': 'Legolas',
    'Race': 'elf'
  },
  {
    'Name': 'Gimli',
    'Race': 'dwarf'
  }
]

if __name__ == '__main__':
    with open('lotr.csv', 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=('Name', 'Race'))
        writer.writeheader()
        writer.writerows(characters_data)

Let’s see what we did. First we created an instance of the DictWriter class, passing as arguments the file object (csvfile) and then fieldnames, which must be a sequence of values to be used as the csv field names, and which determines in what order the values contained in each dictionary are written to the file. While in the case of the DictReader class constructor this parameter is optional, here it is mandatory, and it’s easy to understand why: a dictionary alone does not tell the writer which columns to produce or in which order.

After creating the writer object, we called its writeheader method: this method is used to create the initial csv row, containing the field names we passed in the constructor.

Finally, we called the writerows method to write all the csv rows at once, passing the list of dictionaries as argument (here we referenced them by the characters_data variable). All done!
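One way to check that the written data is what we expect is to read it back with DictReader and compare it with the original list. The sketch below does this round trip in memory with io.StringIO rather than with lotr.csv; the shortened data set and the read_back variable are assumptions for the example.

```python
import csv
import io

# Shortened version of the characters data (assumption, to keep the example brief)
characters_data = [
    {"Name": "Frodo", "Race": "hobbit"},
    {"Name": "Aragorn", "Race": "man"},
]

# Write the header and the rows to an in-memory buffer
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=("Name", "Race"))
writer.writeheader()
writer.writerows(characters_data)

# Rewind the buffer and read the rows back as dictionaries
buffer.seek(0)
read_back = [dict(row) for row in csv.DictReader(buffer)]

print(read_back == characters_data)
```

The comparison holds here because every value is a string; values of other types, such as numbers, are written as text and come back as strings.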

Conclusions

In this article we learned the basics of reading and creating csv files using the Python programming language. We saw how to read the rows of a csv file both as lists of strings and as dictionaries using a DictReader object, and how to create a new csv file writing one row at a time, or all rows at once. Finally, we saw how to create a csv file starting from a list of dictionaries, as could be returned from an API call. If you want to know more about the csv Python module, please consult the official documentation.
