Analyzing the word lengths in the first sentence of the Gettysburg Address with a regular dictionary requires code to catch the KeyError raised for each missing key and set a default value:
text = ("Four score and seven years ago our fathers brought forth on this "
        "continent, a new nation, conceived in Liberty, and dedicated to the "
        "proposition that all men are created equal")
text = text.replace(',', '').lower()  # strip commas and lowercase
word_lengths = {}
for word in text.split():
    try:
        word_lengths[len(word)] += 1
    except KeyError:
        # first word of this length: start its count at 1
        word_lengths[len(word)] = 1
print(word_lengths)
Using defaultdict in this case would be more concise and elegant:
from collections import defaultdict
word_lengths = defaultdict(int)
for word in text.split():
    word_lengths[len(word)] += 1
print(word_lengths)
This prints:
defaultdict(<class 'int'>, {4: 3, 5: 5, 3: 9, 7: 4, 2: 3, 9: 3, 1: 1, 6: 1, 11: 1})
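Because defaultdict is a subclass of dict, the result compares equal to a plain dictionary with the same items and can be converted with dict() for a cleaner printout. A minimal sketch, assuming a short sample string (hypothetical, not the full sentence used above):

```python
from collections import defaultdict

# Hypothetical short sample, used only to keep the output small.
sample = "four score and seven years ago"

counts = defaultdict(int)
for word in sample.split():
    counts[len(word)] += 1

# Convert to a plain dict to drop the defaultdict(...) wrapper when printing.
plain = dict(counts)
print(plain)            # {4: 1, 5: 3, 3: 2}
print(counts == plain)  # True
```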
Note that defaultdict is not a built-in: it must be imported from the collections module.
The default_factory function is set to int: if a key is missing, it will be inserted into the dictionary and initialized with a call to int(), which returns 0.
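The insertion behavior can be checked directly. The following is a small sketch (the variable names are illustrative) showing that merely looking up a missing key with square brackets stores the default, while a membership test or .get() does not:

```python
from collections import defaultdict

lengths = defaultdict(int)
print(lengths.default_factory)  # <class 'int'>
print(int())                    # 0

# Reading a missing key calls int() and stores the result under that key.
value = lengths[5]
print(value)          # 0
print(dict(lengths))  # {5: 0}

# Membership tests and .get() do not call the factory or insert anything.
print(7 in lengths)    # False
print(lengths.get(7))  # None
print(len(lengths))    # 1
```

This side effect is worth keeping in mind: to test for a key without adding it, use the in operator or .get() rather than indexing.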