loc and iloc in pandas

Learning Scientific Programming with Python (2nd edition)

E9.2: loc and iloc in pandas

There is a potential source of confusion when using loc on a Series or DataFrame with an integer index: loc always refers to the index labels, whereas iloc takes a zero-based integer position, regardless of what the labels are:

In [x]: import numpy as np
In [x]: import pandas as pd
In [x]: df = pd.DataFrame(np.arange(12).reshape(4, 3) + 10,
                          index=[1, 2, 3, 4], columns=list('abc'))
In [x]: df
Out[x]:
    a   b   c
1  10  11  12
2  13  14  15
3  16  17  18
4  19  20  21

In [x]: df.loc[1]   # the row with index *label* 1 (the first row)
Out[x]:
a    10
b    11
c    12
Name: 1, dtype: int64

In [x]: df.iloc[1]  # the row with index *location* 1 (the row labeled 2)
Out[x]:
a    13
b    14
c    15
Name: 2, dtype: int64
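
The same distinction applies to a Series. The following is a minimal sketch rather than part of the original exercise; the Series s and its values are chosen here purely for illustration, with integer labels that deliberately differ from the positions 0, 1, 2.

```python
import pandas as pd

# Illustrative Series (not from the original text): integer labels 1, 2, 3
# sit at positions 0, 1, 2 respectively.
s = pd.Series([10, 13, 16], index=[1, 2, 3])

by_label = s.loc[1]      # label 1 is the first element
by_position = s.iloc[1]  # position 1 is the second element (labeled 2)

print(by_label, by_position)
```

Here loc[1] and iloc[1] pick out different elements because the label 1 is stored at position 0.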

Note also that index labels do not have to be unique:

In [x]: df.index = [1, 2, 2, 3]    # change the index labels
In [x]: df
Out[x]:
    a   b   c
1  10  11  12
2  13  14  15
2  16  17  18
3  19  20  21

In [x]: df.loc[2]     # a DataFrame: all rows labeled 2
Out[x]:
    a   b   c
2  13  14  15
2  16  17  18

In [x]: df.iloc[2]    # a Series: there is only one row located at index 2
Out[x]:
a    16
b    17
c    18
Name: 2, dtype: int64
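
Because the type returned by loc depends on whether a label is repeated, it can help to check the index before relying on a label lookup. The sketch below rebuilds the DataFrame above as a standalone script and uses the Index.is_unique attribute, which is an addition not discussed in the original text; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# The DataFrame from above, built directly with the repeated label 2.
df = pd.DataFrame(np.arange(12).reshape(4, 3) + 10,
                  index=[1, 2, 2, 3], columns=list('abc'))

# is_unique is False here because the label 2 appears twice.
print(df.index.is_unique)

rows_labeled_2 = df.loc[2]   # label lookup: both rows labeled 2
row_at_2 = df.iloc[2]        # position lookup: exactly one row

print(type(rows_labeled_2).__name__, rows_labeled_2.shape)
print(type(row_at_2).__name__, row_at_2.tolist())
```

A position passed to iloc can only ever match one row, so the repeated labels affect loc but not iloc.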