Pandas - Manipulating data row-wise

import pandas as pd

iris = pd.read_csv("archive.ics.uci.edu/ml/machine-learning-dat..")

df = iris.copy()

df.columns = ['sl', 'sw', 'pl', 'pw', 'flower_type']

1) deleting a particular row

df.drop(0, inplace = True)

df.drop(3, inplace = True)

df.head()

    sl   sw   pl   pw  flower_type
1  4.7  3.2  1.3  0.2  Iris-setosa
2  4.6  3.1  1.5  0.2  Iris-setosa
4  5.4  3.9  1.7  0.4  Iris-setosa
5  4.6  3.4  1.4  0.3  Iris-setosa
6  5.0  3.4  1.5  0.2  Iris-setosa

df.drop(3) deletes the row with the label 3, not the row at position 3. By default, pandas labels the rows 0 to n - 1, and these labels are easily mistaken for positions. In reality, each one is just a label that stays attached to its row. Hence, if we run the same code again, it raises a KeyError, because the row with label 3 was already deleted in the first run.

If we don't use the inplace = True argument, drop returns a modified copy and the original df is left unchanged.
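Both points can be checked on a small frame. The sketch below is only illustrative: demo is a made-up three-row frame, not the iris data, and errors="ignore" is shown as one possible way to make a repeated drop harmless.

```python
import pandas as pd

# Hypothetical toy frame standing in for df; default labels are 0, 1, 2.
demo = pd.DataFrame({"sl": [4.9, 4.7, 4.6], "sw": [3.0, 3.2, 3.1]})

# Without inplace=True, drop returns a new frame and leaves demo untouched.
trimmed = demo.drop(1)
print(len(demo), len(trimmed))

# With inplace=True, demo itself is modified and the call returns None.
demo.drop(1, inplace=True)

# Label 1 is gone now, so dropping it again raises KeyError.
try:
    demo.drop(1, inplace=True)
    second_drop_failed = False
except KeyError:
    second_drop_failed = True

# errors="ignore" skips labels that are not present instead of raising.
demo.drop(1, inplace=True, errors="ignore")
print(list(demo.index))
```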

2) see all the labels

print(df.index)

Int64Index([  1,   2,   4,   5,   6,   7,   8,   9,  10,  11,
            ...
            139, 140, 141, 142, 143, 144, 145, 146, 147, 148],
           dtype='int64', length=147)
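Since the labels can have gaps after a drop, it can help to check whether a label is still present before using it. A minimal sketch, assuming a made-up frame demo whose labels mimic the gaps above (in pandas 2.0 and later the index prints as Index rather than Int64Index):

```python
import pandas as pd

# Hypothetical toy frame with a gap in its labels (3 is missing).
demo = pd.DataFrame({"sl": [4.7, 4.6, 5.4]}, index=[1, 2, 4])

labels = demo.index.tolist()   # plain Python list of the labels
has_three = 3 in demo.index    # membership test on labels, not positions
print(labels, has_three)
```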

3) deleting a particular row by position rather than by label

df.drop(df.index[0], inplace = True)

df.head()

    sl   sw   pl   pw  flower_type
2  4.6  3.1  1.5  0.2  Iris-setosa
4  5.4  3.9  1.7  0.4  Iris-setosa
5  4.6  3.4  1.4  0.3  Iris-setosa
6  5.0  3.4  1.5  0.2  Iris-setosa
7  4.4  2.9  1.4  0.2  Iris-setosa

4) deleting multiple rows at the same time

df.drop(df.index[[0, 1]], inplace = True)

df.head()

    sl   sw   pl   pw  flower_type
5  4.6  3.4  1.4  0.3  Iris-setosa
6  5.0  3.4  1.5  0.2  Iris-setosa
7  4.4  2.9  1.4  0.2  Iris-setosa
8  4.9  3.1  1.5  0.1  Iris-setosa
9  5.4  3.7  1.5  0.2  Iris-setosa
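The position-based drops in 3) and 4) work because df.index[...] turns positions into labels, which drop then removes. A minimal sketch on a made-up frame demo (not the iris data) whose labels already have gaps:

```python
import pandas as pd

# Hypothetical toy frame whose labels have gaps, like df after earlier drops.
demo = pd.DataFrame({"sl": [4.6, 5.4, 4.6, 5.0]}, index=[2, 4, 5, 6])

# demo.index[0] is the label of the first row (2 here), whatever the labels are.
first_label = demo.index[0]
demo = demo.drop(first_label)

# A list of positions gives the matching labels, so several rows go in one call.
first_two_labels = demo.index[[0, 1]]
demo = demo.drop(first_two_labels)
print(list(demo.index))
```

The drop itself is still label-based, so if labels are duplicated, every row sharing a dropped label is removed.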

5) selecting rows that satisfy a condition

df.sl > 5

5      False
6      False
7      False
8      False
9       True
       ...  
144     True
145     True
146     True
147     True
148     True
Name: sl, Length: 144, dtype: bool

This returns a boolean Series, with True or False for each row.

Better representation method

df[df.sl > 5]

      sl   sw   pl   pw     flower_type
9    5.4  3.7  1.5  0.2     Iris-setosa
13   5.8  4.0  1.2  0.2     Iris-setosa
14   5.7  4.4  1.5  0.4     Iris-setosa
15   5.4  3.9  1.3  0.4     Iris-setosa
16   5.1  3.5  1.4  0.3     Iris-setosa
..   ...  ...  ...  ...             ...
144  6.7  3.0  5.2  2.3  Iris-virginica
145  6.3  2.5  5.0  1.9  Iris-virginica
146  6.5  3.0  5.2  2.0  Iris-virginica
147  6.2  3.4  5.4  2.3  Iris-virginica
148  5.9  3.0  5.1  1.8  Iris-virginica

116 rows × 5 columns

This method returns only those rows which satisfy the given condition.

df[df.flower_type == 'Iris-setosa']

     sl   sw   pl   pw  flower_type
5   4.6  3.4  1.4  0.3  Iris-setosa
6   5.0  3.4  1.5  0.2  Iris-setosa
7   4.4  2.9  1.4  0.2  Iris-setosa
8   4.9  3.1  1.5  0.1  Iris-setosa
9   5.4  3.7  1.5  0.2  Iris-setosa
10  4.8  3.4  1.6  0.2  Iris-setosa
11  4.8  3.0  1.4  0.1  Iris-setosa
12  4.3  3.0  1.1  0.1  Iris-setosa
13  5.8  4.0  1.2  0.2  Iris-setosa
14  5.7  4.4  1.5  0.4  Iris-setosa
15  5.4  3.9  1.3  0.4  Iris-setosa
16  5.1  3.5  1.4  0.3  Iris-setosa
17  5.7  3.8  1.7  0.3  Iris-setosa
18  5.1  3.8  1.5  0.3  Iris-setosa
19  5.4  3.4  1.7  0.2  Iris-setosa
20  5.1  3.7  1.5  0.4  Iris-setosa
21  4.6  3.6  1.0  0.2  Iris-setosa
22  5.1  3.3  1.7  0.5  Iris-setosa
23  4.8  3.4  1.9  0.2  Iris-setosa
24  5.0  3.0  1.6  0.2  Iris-setosa
25  5.0  3.4  1.6  0.4  Iris-setosa
26  5.2  3.5  1.5  0.2  Iris-setosa
27  5.2  3.4  1.4  0.2  Iris-setosa
28  4.7  3.2  1.6  0.2  Iris-setosa
29  4.8  3.1  1.6  0.2  Iris-setosa
30  5.4  3.4  1.5  0.4  Iris-setosa
31  5.2  4.1  1.5  0.1  Iris-setosa
32  5.5  4.2  1.4  0.2  Iris-setosa
33  4.9  3.1  1.5  0.1  Iris-setosa
34  5.0  3.2  1.2  0.2  Iris-setosa
35  5.5  3.5  1.3  0.2  Iris-setosa
36  4.9  3.1  1.5  0.1  Iris-setosa
37  4.4  3.0  1.3  0.2  Iris-setosa
38  5.1  3.4  1.5  0.2  Iris-setosa
39  5.0  3.5  1.3  0.3  Iris-setosa
40  4.5  2.3  1.3  0.3  Iris-setosa
41  4.4  3.2  1.3  0.2  Iris-setosa
42  5.0  3.5  1.6  0.6  Iris-setosa
43  5.1  3.8  1.9  0.4  Iris-setosa
44  4.8  3.0  1.4  0.3  Iris-setosa
45  5.1  3.8  1.6  0.2  Iris-setosa
46  4.6  3.2  1.4  0.2  Iris-setosa
47  5.3  3.7  1.5  0.2  Iris-setosa
48  5.0  3.3  1.4  0.2  Iris-setosa

Generate summary statistics for the selected rows

df[df.flower_type == 'Iris-setosa'].describe()

              sl         sw         pl         pw
count  44.000000  44.000000  44.000000  44.000000
mean    5.013636   3.422727   1.465909   0.245455
std     0.362543   0.389313   0.179071   0.110925
min     4.300000   2.300000   1.000000   0.100000
25%     4.800000   3.175000   1.400000   0.200000
50%     5.000000   3.400000   1.500000   0.200000
75%     5.200000   3.700000   1.600000   0.300000
max     5.800000   4.400000   1.900000   0.600000
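The steps in this section (build a boolean mask, use it to keep rows, then summarize the result) can be sketched on a small frame. The frame demo and its three rows are made up for illustration, and combining two conditions with & goes slightly beyond the code above.

```python
import pandas as pd

# Hypothetical toy frame; labels and values are illustrative only.
demo = pd.DataFrame(
    {
        "sl": [4.6, 5.4, 6.3],
        "flower_type": ["Iris-setosa", "Iris-setosa", "Iris-virginica"],
    },
    index=[5, 9, 145],
)

mask = demo["sl"] > 5      # boolean Series, one value per row
selected = demo[mask]      # keeps only the rows where mask is True

# Conditions combine with & (and) or | (or); each one needs its own parentheses.
long_setosa = demo[(demo["sl"] > 5) & (demo["flower_type"] == "Iris-setosa")]

# describe() summarizes the numeric columns of whatever rows were selected.
summary = demo[demo["flower_type"] == "Iris-setosa"].describe()
print(summary)
```

Writing demo["sl"] instead of demo.sl behaves the same here; the bracket form also works for column names that are not valid Python identifiers.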

6) checking a particular row

print(df.iloc[0]) # position based

print(df.loc[5]) # label based


sl                     4.6
sw                     3.4
pl                     1.4
pw                     0.3
flower_type    Iris-setosa
Name: 5, dtype: object
sl                     4.6
sw                     3.4
pl                     1.4
pw                     0.3
flower_type    Iris-setosa
Name: 5, dtype: object
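Both calls print the same row here only because the first remaining row happens to carry the label 5. A minimal sketch of the difference, assuming a made-up three-row frame demo with labels 5, 6, 7:

```python
import pandas as pd

# Hypothetical toy frame; labels start at 5, as in df after the drops.
demo = pd.DataFrame({"sl": [4.6, 5.0, 4.4]}, index=[5, 6, 7])

by_position = demo.iloc[0]   # first row, regardless of its label
by_label = demo.loc[5]       # row whose label is 5

# Position 5 does not exist in a three-row frame, so iloc raises IndexError.
try:
    demo.iloc[5]
    iloc_failed = False
except IndexError:
    iloc_failed = True

# Label 0 does not exist either, so loc raises KeyError.
try:
    demo.loc[0]
    loc_failed = False
except KeyError:
    loc_failed = True

print(by_position.name, by_label.name, iloc_failed, loc_failed)
```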

7) adding a row

df.loc[0] = [1, 2, 3, 4, 'Iris-sertosa']

df.tail()

      sl   sw   pl   pw     flower_type
145  6.3  2.5  5.0  1.9  Iris-virginica
146  6.5  3.0  5.2  2.0  Iris-virginica
147  6.2  3.4  5.4  2.3  Iris-virginica
148  5.9  3.0  5.1  1.8  Iris-virginica
0    1.0  2.0  3.0  4.0    Iris-sertosa

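This adds a row at the end with the label 0 and the provided data. It appends only because label 0 was deleted earlier; if a row with label 0 still existed, the assignment would overwrite it instead.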
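Whether df.loc[label] = values appends or overwrites depends on whether the label is already in the index. A minimal sketch, using a made-up two-column frame demo rather than the iris data:

```python
import pandas as pd

# Hypothetical toy frame; label 0 is absent, as in df at this point.
demo = pd.DataFrame(
    {"sl": [6.2, 5.9], "flower_type": ["Iris-virginica", "Iris-virginica"]},
    index=[147, 148],
)

# Label 0 is not in the index, so this appends a new row at the end.
demo.loc[0] = [1.0, "Iris-setosa"]
print(list(demo.index))

# Label 0 now exists, so the same kind of assignment overwrites that row.
demo.loc[0] = [2.0, "Iris-setosa"]
print(len(demo), demo.loc[0, "sl"])
```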