Selecting Rows And Columns in Python Pandas
Slicing dataframes by rows and columns is a basic tool every analyst should have in their skill-set. We'll run through a quick tutorial covering the basics of selecting rows, selecting columns, and selecting both at once. This is an extremely lightweight introduction to rows, columns and pandas, perfect for beginners!
Import Dataset
import pandas as pd
df = pd.read_csv('iris-data.csv')
df.head()
| | sepal_length_cm | sepal_width_cm | petal_length_cm | petal_width_cm | class |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
df.shape
(150, 5)
Selecting the first ten rows
df[:10]
| | sepal_length_cm | sepal_width_cm | petal_length_cm | petal_width_cm | class |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
5 | 5.4 | 3.9 | 1.7 | 0.4 | Iris-setosa |
6 | 4.6 | 3.4 | 1.4 | 0.3 | Iris-setosa |
7 | 5.0 | 3.4 | 1.5 | NaN | Iris-setosa |
8 | 4.4 | 2.9 | 1.4 | NaN | Iris-setosa |
9 | 4.9 | 3.1 | 1.5 | NaN | Iris-setosa |
Selecting the last five rows
df[-5:]
| | sepal_length_cm | sepal_width_cm | petal_length_cm | petal_width_cm | class |
---|---|---|---|---|---|
145 | 6.7 | 3.0 | 5.2 | 2.3 | Iris-virginica |
146 | 6.3 | 2.5 | 5.0 | 2.3 | Iris-virginica |
147 | 6.5 | 3.0 | 5.2 | 2.0 | Iris-virginica |
148 | 6.2 | 3.4 | 5.4 | 2.3 | Iris-virginica |
149 | 5.9 | 3.0 | 5.1 | 1.8 | Iris-virginica |
Selecting rows 15-19 (the stop position, 20, is excluded)
df[15:20]
| | sepal_length_cm | sepal_width_cm | petal_length_cm | petal_width_cm | class |
---|---|---|---|---|---|
15 | 5.7 | 4.4 | 1.5 | 0.4 | Iris-setosa |
16 | 5.4 | 3.9 | 1.3 | 0.4 | Iris-setosa |
17 | 5.1 | 3.5 | 1.4 | 0.3 | Iris-setosa |
18 | 5.7 | 3.8 | 1.7 | 0.3 | Iris-setossa |
19 | 5.1 | 3.8 | 1.5 | 0.3 | Iris-setosa |
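The three row slices above follow Python's usual slicing rule: the start position is included and the stop position is not. Here is a minimal sketch of that rule on a small stand-in frame, so it runs without `iris-data.csv`. The `toy` DataFrame and its `value` column are assumptions made for illustration and are not part of the iris data.

```python
import pandas as pd

# Hypothetical stand-in for the iris data: 25 rows, one numeric column.
toy = pd.DataFrame({"value": range(25)})

first_ten = toy[:10]   # rows 0 through 9
last_five = toy[-5:]   # rows 20 through 24
middle = toy[15:20]    # rows 15 through 19; the stop position is excluded

# head(), tail() and iloc express the same selections more explicitly.
assert first_ten.equals(toy.head(10))
assert last_five.equals(toy.tail(5))
assert middle.equals(toy.iloc[15:20])

print(middle.index.tolist())  # [15, 16, 17, 18, 19]
```

`head` and `tail` are often easier to read than bare slices when all you want is the top or bottom of a frame.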
Selecting Columns
The quickest way to select a column in pandas is to pass its name in square brackets:
df['class']
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
5 Iris-setosa
6 Iris-setosa
7 Iris-setosa
8 Iris-setosa
9 Iris-setosa
10 Iris-setosa
11 Iris-setosa
12 Iris-setosa
13 Iris-setosa
14 Iris-setosa
15 Iris-setosa
16 Iris-setosa
17 Iris-setosa
18 Iris-setossa
19 Iris-setosa
20 Iris-setosa
21 Iris-setosa
22 Iris-setosa
23 Iris-setosa
24 Iris-setosa
25 Iris-setosa
26 Iris-setosa
27 Iris-setosa
28 Iris-setosa
29 Iris-setosa
...
120 Iris-virginica
121 Iris-virginica
122 Iris-virginica
123 Iris-virginica
124 Iris-virginica
125 Iris-virginica
126 Iris-virginica
127 Iris-virginica
128 Iris-virginica
129 Iris-virginica
130 Iris-virginica
131 Iris-virginica
132 Iris-virginica
133 Iris-virginica
134 Iris-virginica
135 Iris-virginica
136 Iris-virginica
137 Iris-virginica
138 Iris-virginica
139 Iris-virginica
140 Iris-virginica
141 Iris-virginica
142 Iris-virginica
143 Iris-virginica
144 Iris-virginica
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica
Name: class, dtype: object
df[['class','petal_width_cm']] #two columns
| | class | petal_width_cm |
---|---|---|
0 | Iris-setosa | 0.2 |
1 | Iris-setosa | 0.2 |
2 | Iris-setosa | 0.2 |
3 | Iris-setosa | 0.2 |
4 | Iris-setosa | 0.2 |
5 | Iris-setosa | 0.4 |
6 | Iris-setosa | 0.3 |
7 | Iris-setosa | NaN |
8 | Iris-setosa | NaN |
9 | Iris-setosa | NaN |
10 | Iris-setosa | NaN |
11 | Iris-setosa | NaN |
12 | Iris-setosa | 0.1 |
13 | Iris-setosa | 0.1 |
14 | Iris-setosa | 0.2 |
15 | Iris-setosa | 0.4 |
16 | Iris-setosa | 0.4 |
17 | Iris-setosa | 0.3 |
18 | Iris-setossa | 0.3 |
19 | Iris-setosa | 0.3 |
20 | Iris-setosa | 0.2 |
21 | Iris-setosa | 0.4 |
22 | Iris-setosa | 0.2 |
23 | Iris-setosa | 0.5 |
24 | Iris-setosa | 0.2 |
25 | Iris-setosa | 0.2 |
26 | Iris-setosa | 0.4 |
27 | Iris-setosa | 0.2 |
28 | Iris-setosa | 0.2 |
29 | Iris-setosa | 0.2 |
... | ... | ... |
120 | Iris-virginica | 2.3 |
121 | Iris-virginica | 2.0 |
122 | Iris-virginica | 2.0 |
123 | Iris-virginica | 1.8 |
124 | Iris-virginica | 2.1 |
125 | Iris-virginica | 1.8 |
126 | Iris-virginica | 1.8 |
127 | Iris-virginica | 1.8 |
128 | Iris-virginica | 2.1 |
129 | Iris-virginica | 1.6 |
130 | Iris-virginica | 1.9 |
131 | Iris-virginica | 2.0 |
132 | Iris-virginica | 2.2 |
133 | Iris-virginica | 1.5 |
134 | Iris-virginica | 1.4 |
135 | Iris-virginica | 2.3 |
136 | Iris-virginica | 2.4 |
137 | Iris-virginica | 1.8 |
138 | Iris-virginica | 1.8 |
139 | Iris-virginica | 2.1 |
140 | Iris-virginica | 2.4 |
141 | Iris-virginica | 2.3 |
142 | Iris-virginica | 1.9 |
143 | Iris-virginica | 2.3 |
144 | Iris-virginica | 2.5 |
145 | Iris-virginica | 2.3 |
146 | Iris-virginica | 2.3 |
147 | Iris-virginica | 2.0 |
148 | Iris-virginica | 2.3 |
149 | Iris-virginica | 1.8 |
150 rows × 2 columns
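The two outputs above look different because they are different types: a single column name returns a Series, while a list of names returns a DataFrame. The sketch below checks this on a three-row sample. The `toy` frame is a hypothetical stand-in that only borrows the tutorial's column names.

```python
import pandas as pd

# Hypothetical three-row sample using the tutorial's column names.
toy = pd.DataFrame({
    "petal_width_cm": [0.2, 0.4, 2.3],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica"],
})

one_column = toy["class"]                       # single label -> Series
two_columns = toy[["class", "petal_width_cm"]]  # list of labels -> DataFrame
still_a_frame = toy[["class"]]                  # one-item list -> one-column DataFrame

print(type(one_column).__name__)     # Series
print(type(two_columns).__name__)    # DataFrame
print(two_columns.columns.tolist())  # ['class', 'petal_width_cm']
```

The columns come back in the order given in the list, not in the order they are stored in the frame.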
Selecting Rows & Columns
df['class'][:5] #just first 5 instances
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
Name: class, dtype: object
df[df.columns[4]][5:10] #observations 5-9, looking up the column name with 'columns'
5 Iris-setosa
6 Iris-setosa
7 Iris-setosa
8 Iris-setosa
9 Iris-setosa
Name: class, dtype: object
df.iloc[:, 4][-5:] # last five observations of column using 'iloc' ('ix' was removed in pandas 1.0)
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica
Name: class, dtype: object
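Each selection above picks a column first and then slices its rows. As a possible alternative, `iloc` (by position) and `loc` (by label) accept the row and column selectors together in one call. The sketch below uses a hypothetical six-row `toy` frame rather than the iris file. Note that `loc` slices include the end label, unlike positional slices.

```python
import pandas as pd

# Hypothetical six-row sample using the tutorial's column names.
toy = pd.DataFrame({
    "petal_width_cm": [0.2, 0.2, 0.4, 1.8, 2.3, 2.0],
    "class": ["Iris-setosa"] * 3 + ["Iris-virginica"] * 3,
})

by_position = toy.iloc[-3:, 1]    # last three rows, column at position 1
by_label = toy.loc[3:5, "class"]  # labels 3 through 5, end label included

# Both calls return the same three-row Series.
assert by_position.equals(by_label)
print(by_label.tolist())  # ['Iris-virginica', 'Iris-virginica', 'Iris-virginica']
```

Chained selections like `df['class'][:5]` are fine for reading values. When assigning values, a single `loc` or `iloc` call is the safer choice, because a chained selection may operate on a temporary copy.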