Working with Python Collections Part 1

The collections module provides high-performance container data types for Python. They are specialized alternatives to the built-in dict, list, set, and tuple.

The collections module provides, among others, the following data types:

namedtuple
deque
Counter
OrderedDict
defaultdict

Most of us already know about tuples in Python. A tuple is a lightweight, immutable sequence of Python objects.

For example: role = ('developer', 'designer', 'tester')

Here we need integer indexes to access the elements of the tuple: role[0] and role[1] give 'developer' and 'designer' respectively.

A namedtuple, in contrast, has a type name and field names, so we can access its elements by name as well as by index.

For Example:

from collections import namedtuple

Company = namedtuple('Company', 'name location website')

mp = Company(name='micropyramid', location='hyderabad', website='micropyramid.com')

# or, with positional arguments:
mp = Company('micropyramid', 'hyderabad', 'micropyramid.com')

Now we can access the fields by name, such as mp.name and mp.location:

>>> mp.name

 'micropyramid'

>>> mp.location

  'hyderabad'

We can create a new instance from an existing sequence or iterable with Company._make():

>>> google = Company._make(['Google', 'hyderabad', 'google.com'])

>>> google

  Company(name='Google', location='hyderabad', website='google.com')
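_make() is handy when records arrive as plain sequences. The following is a minimal sketch, assuming some hypothetical rows (for instance, as they might come out of a CSV file or a database query); the rows variable and its contents are made up for illustration.

```python
from collections import namedtuple

Company = namedtuple('Company', 'name location website')

# Hypothetical input rows; each one lists the fields in definition order.
rows = [
    ['micropyramid', 'hyderabad', 'micropyramid.com'],
    ['Google', 'hyderabad', 'google.com'],
]

# _make() builds one Company per row from any iterable of the right length.
companies = [Company._make(row) for row in rows]

# Fields can then be read by name instead of by position.
websites = [company.website for company in companies]
print(websites)
```

Note that _make() raises a TypeError if a row does not contain exactly as many items as the namedtuple has fields.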

A namedtuple is immutable, so we cannot change its values in place. Instead, _replace() returns a new object in which the specified fields have new values and all other fields are copied unchanged.

>>> google._replace(location="Bangalore")

 Company(name='Google', location='Bangalore', website='google.com')

We can also convert a namedtuple to a dictionary with mp._asdict(). In Python 3.1 through 3.7 it returns an OrderedDict, as shown here; since Python 3.8 it returns a regular dict.

>>> mp._asdict()

  OrderedDict([('name', 'micropyramid'), ('location', 'hyderabad'), ('website', 'micropyramid.com')])

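The main advantage of namedtuple is readability: fields are accessed by name instead of by position. Instances have no per-instance dictionary, so they need no more memory than regular tuples and are more compact than a dictionary holding the same data.

Like a tuple, a namedtuple keeps its items in the order in which the fields were defined. (A regular dict only guarantees insertion order since Python 3.7.)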
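To make these points concrete, here is a small sketch that checks them on the Company example. The sizes come from sys.getsizeof, which reports only the container itself (not the strings inside), and the exact numbers are a CPython implementation detail, so treat the comparison as indicative rather than guaranteed.

```python
import sys
from collections import namedtuple

Company = namedtuple('Company', 'name location website')
mp = Company('micropyramid', 'hyderabad', 'micropyramid.com')

# A namedtuple is still a tuple: indexing and unpacking keep working.
is_tuple = isinstance(mp, tuple)
first = mp[0]
name, location, website = mp

# Field names are kept in definition order.
fields = Company._fields

# Instances are immutable: assigning to a field raises AttributeError.
try:
    mp.name = 'other'
    mutated = True
except AttributeError:
    mutated = False

# Compare the container sizes (CPython-specific figures).
plain_size = sys.getsizeof(('micropyramid', 'hyderabad', 'micropyramid.com'))
named_size = sys.getsizeof(mp)
dict_size = sys.getsizeof(
    {'name': 'micropyramid', 'location': 'hyderabad', 'website': 'micropyramid.com'}
)
print(plain_size, named_size, dict_size)
```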

A deque is a double-ended queue that lets us append and pop elements at both ends. Appends and pops at either end are thread-safe, memory efficient, and take approximately O(1) time.

We can also specify a maximum length for a deque. Once the deque is full, adding an item at one end discards an item from the opposite end.

from collections import deque
dq = deque()

Append items to the deque:

dq.append('a')
dq.append('b')
dq.append('c')

>>> dq
deque(['a', 'b', 'c'])

Pop an element from the left:

>>> dq.popleft()
'a'

>>> dq

deque(['b', 'c'])

Append an element on the left (appendleft() returns None, so nothing is printed):

>>> dq.appendleft('a')

>>> dq

 deque(['a', 'b', 'c'])

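We can limit the length of a deque by passing maxlen when creating it. Here we copy the existing items into a bounded deque:

dq = deque(dq, maxlen=10)

Once the deque holds 10 items, appending on the right discards items from the left, and appending on the left discards items from the right.

>>> dq.extend(['d', 'e', 'f'])

>>> dq

  deque(['a', 'b', 'c', 'd', 'e', 'f'], maxlen=10)

extendleft() adds the items one at a time on the left, so they end up in reverse order:

>>> dq.extendleft([1, 2, 3])

>>> dq

  deque([3, 2, 1, 'a', 'b', 'c', 'd', 'e', 'f'], maxlen=10)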
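The session above stops short of the maximum length, so it never shows an item being discarded. A minimal sketch of the overflow behaviour, using a deliberately small maxlen of 3 (the name recent is only illustrative):

```python
from collections import deque

# Keep only the three most recent values.
recent = deque(maxlen=3)
for n in range(1, 6):
    recent.append(n)  # once full, each append drops the leftmost item

after_appends = list(recent)

# Adding on the left of a full deque drops the rightmost item instead.
recent.appendleft(0)
after_appendleft = list(recent)

print(after_appends)     # [3, 4, 5]
print(after_appendleft)  # [0, 3, 4]
```

This makes a bounded deque a natural fit for "last N items" buffers such as recent log lines or a sliding window.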

A deque is very useful for fast insertion and deletion at either end. It is not recommended for random access, though: indexing is O(1) at the ends but slows to O(n) toward the middle.
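A typical use of the fast operations at both ends is a first-in, first-out queue. The sketch below assumes a hypothetical list of task names and a made-up process_in_order helper; the point is only that popleft() removes the oldest item in O(1) time, whereas list.pop(0) has to shift every remaining element.

```python
from collections import deque

def process_in_order(tasks):
    """Process tasks first-in, first-out and return the results."""
    queue = deque(tasks)
    done = []
    while queue:
        task = queue.popleft()     # take the oldest task from the left end
        done.append(task.upper())  # stand-in for real work on the task
    return done

result = process_in_order(['build', 'test', 'deploy'])
print(result)  # ['BUILD', 'TEST', 'DEPLOY']
```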
