Learning Scientific Programming with Python (2nd edition)
P9.5.2: Analysing Formula One data
Question P9.5.2
Read in the data file f1-data.csv concerning recent Formula One Grands Prix seasons, and rank (a) the drivers by their number of wins; (b) the constructors by their number of wins; and (c) the circuits by their average fastest lap per race.
Solution P9.5.2
The following code uses groupby to determine the necessary rankings.
import pandas as pd
df = pd.read_csv("f1-data.csv")
# Create a DataFrame of race winners.
winners = df[df["Position"] == 1]
# Group by Driver and count wins.
g = winners.groupby("Driver")
print("Drivers by number of wins")
print(g["Driver"].count().sort_values(ascending=False))
# Group by Constructor and count wins.
g = winners.groupby("Constructor")
print()
print("Constructors by number of wins")
print(g["Constructor"].count().sort_values(ascending=False))
# Ensure the 'Fastest Lap' column is a datetime, and convert to ms.
df["Fastest Lap"] = pd.to_datetime(df["Fastest Lap"], format="%M:%S.%f")
df["Fastest Lap (ms)"] = (
    df["Fastest Lap"].dt.minute * 60000
    + df["Fastest Lap"].dt.second * 1000
    + df["Fastest Lap"].dt.microsecond / 1000
)
# Group Fastest Lap by Circuit and calculate mean.
g = df[df["Fastest Lap (ms)"].notna()].groupby("Circuit")
tdf = g["Fastest Lap (ms)"].mean().sort_values()
def to_time_str(time_ms):
    """Convert from ms to a string in the form MM:SS.mmm"""
    mins, time_ms = divmod(time_ms, 60000)
    secs = time_ms / 1000
    # Zero-pad the seconds field so that, e.g., 5.123 s appears as 05.123.
    return "{:02d}:{:06.3f}".format(int(mins), secs)
print()
print("Mean fastest lap times by circuit")
for circuit, time in tdf.items():
    print(to_time_str(time), circuit)
Output:
Drivers by number of wins
Driver
Michael Schumacher 91
Lewis Hamilton 84
Sebastian Vettel 53
Fernando Alonso 32
Nico Rosberg 23
Damon Hill 22
Ayrton Senna 21
Kimi Räikkönen 21
Mika Häkkinen 20
Nigel Mansell 16
Jenson Button 15
David Coulthard 13
Alain Prost 12
Felipe Massa 11
Jacques Villeneuve 11
Rubens Barrichello 11
Mark Webber 9
Max Verstappen 8
Juan Pablo Montoya 7
Valtteri Bottas 7
Daniel Ricciardo 7
Ralf Schumacher 6
Gerhard Berger 5
Riccardo Patrese 4
Eddie Irvine 4
Johnny Herbert 3
Nelson Piquet 3
Giancarlo Fisichella 3
Heinz-Harald Frentzen 3
Charles Leclerc 2
Jean Alesi 1
Pastor Maldonado 1
Heikki Kovalainen 1
Robert Kubica 1
Jarno Trulli 1
Thierry Boutsen 1
Olivier Panis 1
Name: Driver, dtype: int64
Constructors by number of wins
Constructor
Ferrari 141
McLaren 102
Mercedes 93
Williams 72
Red Bull 62
Benetton 25
Renault 20
Brawn 8
Jordan 4
Lotus F1 2
BMW Sauber 1
Honda 1
Ligier 1
Stewart 1
Toro Rosso 1
Name: Constructor, dtype: int64
Mean fastest lap times by circuit
01:10.703 Red Bull Ring
01:14.853 Indianapolis Motor Speedway
01:16.006 Autódromo José Carlos Pace
01:17.736 Circuit de Nevers Magny-Cours
01:18.190 Circuit Gilles Villeneuve
01:18.725 Hockenheimring
01:18.918 Circuit de Monaco
01:21.746 Autódromo Hermanos Rodríguez
01:24.477 Autodromo Enzo e Dino Ferrari
01:24.738 Hungaroring
01:25.098 Circuit de Barcelona-Catalunya
01:26.289 Fuji Speedway
01:26.378 Autodromo Nazionale di Monza
01:29.332 Istanbul Park
01:30.071 Albert Park Grand Prix Circuit
01:30.172 Buddh International Circuit
01:32.043 Silverstone Circuit
01:35.307 Circuit Paul Ricard
01:37.482 Suzuka Circuit
01:37.489 Nürburgring
01:37.933 Bahrain International Circuit
01:40.180 Sepang International Circuit
01:40.519 Sochi Autodrom
01:41.088 Shanghai International Circuit
01:41.902 Valencia Street Circuit
01:42.041 Circuit of the Americas
01:44.496 Yas Marina Circuit
01:46.231 Korean International Circuit
01:46.886 Baku City Circuit
01:50.564 Marina Bay Street Circuit
01:52.076 Circuit de Spa-Francorchamps
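
As an aside, the same three rankings can be sketched in a slightly different way. This is one possible alternative, not the approach taken in the solution above. It counts wins with value_counts instead of an explicit groupby, and it parses the lap-time strings with a small helper function rather than pd.to_datetime. The rows below are made up purely for illustration and are not taken from f1-data.csv; the helper names lap_to_seconds and format_lap are hypothetical. Only the column names follow the solution code.

```python
import pandas as pd

# Made-up rows for illustration only: NOT taken from f1-data.csv.
df = pd.DataFrame(
    {
        "Circuit": ["Circuit A", "Circuit A", "Circuit B",
                    "Circuit B", "Circuit A", "Circuit A"],
        "Driver": ["Driver X", "Driver Y", "Driver Y",
                   "Driver X", "Driver X", "Driver Z"],
        "Constructor": ["Team 1", "Team 2", "Team 2",
                        "Team 1", "Team 1", "Team 2"],
        "Position": [1, 2, 1, 2, 1, 2],
        "Fastest Lap": ["1:05.250", "1:05.750", None,
                        "1:30.000", "1:06.000", "1:07.000"],
    }
)

# value_counts counts occurrences and sorts in descending order,
# so no separate groupby or sort_values call is needed here.
winners = df[df["Position"] == 1]
driver_wins = winners["Driver"].value_counts()
constructor_wins = winners["Constructor"].value_counts()


def lap_to_seconds(lap):
    """Convert an 'M:SS.fff' string to seconds; missing values become NaN."""
    if pd.isna(lap):
        return float("nan")
    mins, secs = lap.split(":")
    return int(mins) * 60 + float(secs)


def format_lap(seconds):
    """Format a time in seconds as MM:SS.mmm, zero-padding the seconds."""
    mins, secs = divmod(seconds, 60)
    return f"{int(mins):02d}:{secs:06.3f}"


df["Fastest Lap (s)"] = df["Fastest Lap"].map(lap_to_seconds)

# mean() skips NaN by default, so rows with no lap time are ignored.
mean_lap = df.groupby("Circuit")["Fastest Lap (s)"].mean().sort_values()

print(driver_wins)
print(constructor_wins)
for circuit, seconds in mean_lap.items():
    print(format_lap(seconds), circuit)
```

Working in seconds rather than milliseconds is an arbitrary choice here; either unit gives the same ordering of circuits.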