The file shark-species.txt contains a list of extant shark species arranged in a hierarchy by order, family, genus and species (with the species given as binomial name : common name). Read the file into a data structure of nested dictionaries which can be accessed as follows:
>>> sharks['Lamniformes']['Lamnidae']['Carcharodon']['C. carcharias']
'Great white shark'
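To make the target structure concrete, the fragment below spells out by hand the nesting implied by the lookup above. It is only an illustrative sketch containing the single entry mentioned in the text; the real dictionary would hold every order, family, genus and species in the file.

```python
# Illustrative fragment only: one order -> family -> genus -> species path,
# written out by hand rather than read from shark-species.txt.
sharks = {
    'Lamniformes': {
        'Lamnidae': {
            'Carcharodon': {
                'C. carcharias': 'Great white shark',
            },
        },
    },
}

# Each successive key descends one level of the hierarchy.
print(sharks['Lamniformes']['Lamnidae']['Carcharodon']['C. carcharias'])
```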
Here is one approach, which relies on the consistent formatting of the provided data file.
# The file shark-species.txt has order names starting in the first column,
# family indented by 4 spaces, genus indented by 8 spaces and
# binomial : common name indented by 12 spaces.
sharks = {}
with open('shark-species.txt', 'r') as fi:
    for line in fi:
        if line.startswith(' '*12):
            # binomial name : common name
            binomial, common_name = line.strip().split(' : ')
            # Abbreviate the binomial name to the standard form,
            # e.g. C. carcharias
            species = '{}. {}'.format(genus[0], binomial.split()[1])
            sharks[order][family][genus][species] = common_name
        elif line.startswith(' '*8):
            # A new GENUS: start a new dictionary in its name
            genus = line.strip()
            sharks[order][family][genus] = {}
        elif line.startswith(' '*4):
            # A new FAMILY: start a new dictionary in its name
            family = line.strip()
            sharks[order][family] = {}
        else:
            # A new ORDER: start a new dictionary in its name
            order = line.strip()
            sharks[order] = {}
# Test with the taxonomic form for a Great white shark
print(sharks['Lamniformes']['Lamnidae']['Carcharodon']['C. carcharias'])
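The same idea can be tried without the data file. The sketch below is a variation rather than the solution above: it works out the nesting level from the number of leading spaces, skips blank lines, and reads from an in-memory sample. The function name parse_sharks and the SAMPLE text are hypothetical, and the sample assumes the 4-spaces-per-level layout described in the comments; only the Great white shark entry is taken from the text.

```python
import io

# Hypothetical excerpt in the assumed layout (4 spaces per level).
SAMPLE = """\
Lamniformes
    Lamnidae
        Carcharodon
            Carcharodon carcharias : Great white shark
"""

def parse_sharks(lines):
    """Build order -> family -> genus -> species -> common name."""
    sharks = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        # Nesting depth: 0 = order, 1 = family, 2 = genus, 3 = species
        depth = (len(line) - len(line.lstrip(' '))) // 4
        text = line.strip()
        if depth == 0:
            order = text
            sharks[order] = {}
        elif depth == 1:
            family = text
            sharks[order][family] = {}
        elif depth == 2:
            genus = text
            sharks[order][family][genus] = {}
        else:
            binomial, common_name = text.split(' : ')
            # Abbreviate the genus to its initial, e.g. C. carcharias
            species = '{}. {}'.format(genus[0], binomial.split()[1])
            sharks[order][family][genus][species] = common_name
    return sharks

demo = parse_sharks(io.StringIO(SAMPLE))
print(demo['Lamniformes']['Lamnidae']['Carcharodon']['C. carcharias'])
```

Because parse_sharks accepts any iterable of lines, an open file object for shark-species.txt could be passed in place of the io.StringIO sample.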