
Created by Nathan Kelber and Ted Lawless for JSTOR Labs under Creative Commons CC BY License
For questions/comments/improvements, email nathan.kelber@ithaka.org.


Counter Objects

Description: This notebook describes:

  • What a Counter object is

  • The difference between counters and dictionaries

  • Using Counter objects for finding the most common elements

Use Case: For Learners (Detailed explanation, not ideal for researchers)

Difficulty: Intermediate

Completion Time: 20 minutes

Knowledge Required:

Knowledge Recommended: None

Data Format: None

Libraries Used: Counter from the collections module

Research Pipeline: None


The Counter Container Datatype

A Counter is very similar to a dictionary: each key is an item being counted, and each value is the number of times that item occurs. As the name suggests, Counters are very useful for counting the occurrences of objects. We can create a Counter object from a list.

# A Counter object created from a list
sample_list = ['a', 'c', 'a', 'b', 'a', 'a', 'c', 'b', 'a', 'b', 'b', 'a', 'c', 'a']

from collections import Counter
Counter(sample_list)

We can also create a Counter object from a dictionary.

# A Counter object created from a dictionary
sample_dictionary = {'a': 4, 'b' : 13, 'c' : 2}
Counter(sample_dictionary)
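To make the relationship between the two construction styles concrete, here is a minimal sketch. The names `letters`, `from_list`, and `from_dict` are hypothetical and chosen only for illustration. It suggests that a Counter built from a list tallies the elements, while a Counter built from a dictionary takes the existing counts as given.

```python
from collections import Counter

# Hypothetical sample data, for illustration only
letters = ['a', 'c', 'a', 'b', 'a']

from_list = Counter(letters)                   # tallies each element of the list
from_dict = Counter({'a': 3, 'b': 1, 'c': 1})  # takes the supplied counts as-is

print(from_list == from_dict)       # True: both hold the same counts
print(isinstance(from_list, dict))  # True: Counter is a subclass of dict
```

Because Counter is a dictionary subclass, the usual dictionary methods such as .keys(), .values(), and .items() are also available on it.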

The contents of a Counter object may look identical to a dictionary, but there are some significant differences. Let’s imagine we are using our Counter object to count the occurrences of five words in a text.

# An example dictionary with key/value pairs of words and numbers
wordcounts_dictionary = {
    'word_a': 23,
    'word_b': 3,
    'word_c': 4,
    'word_d': 4,
    'word_e': 32} 

# Create a Counter object `wordcounts_counter` from `wordcounts_dictionary`
wordcounts_counter = Counter(wordcounts_dictionary)
print(wordcounts_counter)

When printed, the Counter object looks just like a dictionary wrapped in the parentheses () of Counter(). Both dictionaries and counters can return a value from a key.

# Returning a value for a given key in Python dictionaries vs. Counter objects
print(wordcounts_dictionary['word_a']) # Using a dictionary
print(wordcounts_counter['word_a'])  # Using a Counter

However, a Counter has some helpful differences from a dictionary. One difference is that a Counter returns 0 when no such key exists.

# With a Counter, the value of the made-up key `no_such_key_exists` is 0. 
print(wordcounts_counter['no_such_key_exists']) 
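One practical consequence of the zero default is that counts can be incremented without first checking whether a key exists. The following is a small sketch using a hypothetical `words` list and `tally` counter (neither is part of the notebook's data).

```python
from collections import Counter

# Hypothetical word list, for illustration only
words = ['tea', 'coffee', 'tea', 'water', 'tea']

tally = Counter()
for word in words:
    tally[word] += 1  # a missing key reads as 0, so no initialization is needed

print(tally['tea'])      # 3
print(tally['juice'])    # 0
print('juice' in tally)  # False: looking up a missing key does not add it
```

Note that reading a missing key only reports 0. The key is not stored in the Counter unless a value is assigned to it.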

If a key is not in a dictionary, Python raises a KeyError.

# With a dictionary, looking up the made-up key `no_such_key_exists` raises a KeyError in Python
# print(wordcounts_dictionary['no_such_key_exists']) 

If we wanted to overcome this difficulty using a dictionary, we could use the .get() method to retrieve the value for a given key.

# A demonstration of returning a default value when no such key exists
print(wordcounts_dictionary.get('no_such_key_exists')) # If no key is found, `None` is returned
print(wordcounts_dictionary.get('no_such_key_exists', 'No such key')) # We can also supply a second argument that defines the value to be returned, here a string
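The same .get() method can be used to count with a plain dictionary by supplying 0 as the default. The sketch below uses a hypothetical `words` list and compares the manual tally with the one a Counter produces. It is meant as an illustration of the pattern, not as the only way to do it.

```python
from collections import Counter

# Hypothetical word list, for illustration only
words = ['tea', 'coffee', 'tea', 'water', 'tea']

# Counting with a plain dictionary: .get() supplies 0 for unseen words
tally_dict = {}
for word in words:
    tally_dict[word] = tally_dict.get(word, 0) + 1

# Counting with a Counter: the tally happens when the object is created
tally_counter = Counter(words)

print(tally_dict)                        # {'tea': 3, 'coffee': 1, 'water': 1}
print(tally_dict == dict(tally_counter)) # True: both approaches agree
```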

The Counter object, however, is also useful for a second purpose. Counter objects can be easily sorted using the most_common() method, which returns a list of (element, count) pairs ordered from most to least common. We can pass an argument to this method to receive only a certain number of results. Let’s try it on our example wordcounts_counter.

wordcounts_counter.most_common(3) # Return the top 3 most common items in `wordcounts_counter`

There is no least_common method, but we can get the least common elements by applying a negative index or slice to the list that most_common() returns.

# The least common element is at index -1
wordcounts_counter.most_common()[-1]
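A negative slice extends the same idea to several elements at once. The sketch below rebuilds the example counts under the hypothetical name `counts` and uses a hypothetical variable `n` for how many of the least common items to return.

```python
from collections import Counter

# The example word counts from above, rebuilt so this cell runs on its own
counts = Counter({
    'word_a': 23,
    'word_b': 3,
    'word_c': 4,
    'word_d': 4,
    'word_e': 32})

n = 2  # hypothetical: how many of the least common items we want

# Step backwards through the sorted list, starting from the last item
least_common = counts.most_common()[:-n-1:-1]
print(least_common)  # [('word_b', 3), ('word_d', 4)]
```

The result is ordered from least common upward. Elements with equal counts, such as word_c and word_d here, keep the order in which they were first encountered, so which of them appears depends on insertion order.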

The final advantage of the Counter object is that it is one of the specialized container types in Python's collections module, which is designed for tasks like tallying, so it can count the elements of a large collection in a single pass. Counter objects are the preferred method for working quickly and efficiently with counted elements in Python.