How to implement a Case Insensitive CSV DictReader in Python


A common way to load a large amount of data into a system is to upload it as a single CSV file. For example, on an e-commerce site we can list the details of thousands of products in one CSV file and upload it.

In Python, we can read a CSV file in two ways: with the plain csv.reader, or with csv.DictReader. To learn more about both readers, see https://docs.python.org/2/library/csv.html

Why go for Case Insensitive CSV DictReader?

With csv.reader we read each value by its column index; with DictReader we read it by its column name. If the column order in the file changes, index-based extraction silently returns the wrong data. To overcome this, we use DictReader, which looks values up by column name.
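As a small illustration of that difference, the sketch below reads the same data twice, once with the columns in their usual order and once reordered. The two in-memory strings and the helper names titles_by_index and titles_by_name are made up for this example; io.StringIO stands in for a real file.

```python
import csv
import io

# Two hypothetical exports of the same data; only the column order differs.
original = "Title,UPC\nMotoG2,MOT-123\n"
reordered = "UPC,Title\nMOT-123,MotoG2\n"


def titles_by_index(text):
    # csv.reader yields plain lists, so we must assume the title is column 0.
    rows = list(csv.reader(io.StringIO(text)))
    return [row[0] for row in rows[1:]]  # skip the header row


def titles_by_name(text):
    # csv.DictReader maps each value to its header, so column order is irrelevant.
    return [row["Title"] for row in csv.DictReader(io.StringIO(text))]


print(titles_by_index(original))   # ['MotoG2']
print(titles_by_index(reordered))  # ['MOT-123']  (wrong column, no error raised)
print(titles_by_name(original))    # ['MotoG2']
print(titles_by_name(reordered))   # ['MotoG2']
```

The index-based version does not fail when the columns move; it quietly returns the UPC instead of the title, which is why the name-based lookup is the safer choice here.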

Example CSV file format of products.csv:

Title,UPC,Cost
Samsung Galaxy,SAM-123,"20,000"
MotoG2,MOT-123,"11,000"

With the following snippet we can read the above CSV file with DictReader.

import csv
with open('products.csv') as csvfile:
     reader = csv.DictReader(csvfile)
     for row in reader:
         print(row['Title'], row['UPC'])

The above snippet prints the title and UPC of each product. It has one disadvantage: the lookup is case sensitive, so if a header in the file differs from the key in the code by case or by surrounding spaces, it raises an error.

For example, if the Title header is written as TITLE or 'Title ', the above snippet raises a KeyError.
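The failure is easy to reproduce. The following is a minimal sketch that uses an in-memory string instead of products.csv; the header text 'TITLE ' (upper case, trailing space) is a made-up example of a badly written header.

```python
import csv
import io

# Hypothetical file contents: the header is upper case with a trailing space.
data = "TITLE ,UPC\nMotoG2,MOT-123\n"

reader = csv.DictReader(io.StringIO(data))
row = next(reader)

# The csv module keeps headers exactly as they are written in the file.
print(reader.fieldnames)  # ['TITLE ', 'UPC']

try:
    row["Title"]
    lookup_failed = False
except KeyError:
    # 'Title' does not match 'TITLE ' exactly, so the lookup fails.
    lookup_failed = True

print(lookup_failed)  # True
```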

To overcome the above problem, we have to make DictReader case insensitive. For this, we subclass the csv module's DictReader and override part of it.

In custom_dict.py :

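import csv


class InsensitiveDictReader(csv.DictReader):
    # Overrides the fieldnames property of csv.DictReader so that every header
    # is stripped of leading and trailing spaces and converted to lower case.

    @property
    def fieldnames(self):
        names = csv.DictReader.fieldnames.fget(self)
        if names is None:  # empty file: there is no header row to normalize
            return None
        return [field.strip().lower() for field in names]

    # Python 3 iterates through __next__(). On Python 2, name this method
    # next() and call csv.DictReader.next(self) instead.
    def __next__(self):
        return InsensitiveDict(csv.DictReader.__next__(self))


class InsensitiveDict(dict):
    # Overrides __getitem__ to strip() and lower() the key before the lookup,
    # so row['Title'], row['TITLE'] and row[' title '] return the same value.

    def __getitem__(self, key):
        return dict.__getitem__(self, key.strip().lower())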

In the above code snippet, we have subclassed Python's dict and the csv module's DictReader. We can use InsensitiveDictReader as below.

from custom_dict import InsensitiveDictReader

with open('products.csv') as csvfile:
     reader = InsensitiveDictReader(csvfile)
     for row in reader:
         print(row['Title'], row['UPC'])

The above code snippet works whether the CSV header 'Title' is written as '   title', 'Title', ' titLE ', or any other variation in case or in leading and trailing spaces.
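To check that claim without creating files, here is a self-contained sketch for Python 3. It repeats a compact version of the two classes so that it runs on its own, and it feeds each header variant through an in-memory file. The helper read_titles and the sample row are made up for this example.

```python
import csv
import io


class InsensitiveDict(dict):
    def __getitem__(self, key):
        # Normalize the requested key the same way the headers are normalized.
        return dict.__getitem__(self, key.strip().lower())


class InsensitiveDictReader(csv.DictReader):
    @property
    def fieldnames(self):
        names = csv.DictReader.fieldnames.fget(self)
        return None if names is None else [n.strip().lower() for n in names]

    def __next__(self):
        return InsensitiveDict(csv.DictReader.__next__(self))


def read_titles(header):
    # Build a tiny CSV whose first header is written in the given style.
    text = header + ",UPC\nMotoG2,MOT-123\n"
    return [row["Title"] for row in InsensitiveDictReader(io.StringIO(text))]


for header in ["   title", "Title", " titLE "]:
    print(read_titles(header))  # ['MotoG2'] each time
```

One limitation of this approach: only square-bracket lookups are normalized. Calls such as row.get('Title') or 'Title' in row go through the plain dict methods and still expect the lower-case key 'title'.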

Posted by a Senior Developer at MicroPyramid
