IndexError: index 5000 is out of bounds for axis 0 with size 5000 #474

venkidevictor · 2020-03-14T10:54:33Z

Hi,
I am working on the dataset for my capstone project. When I try to clean it by removing the outliers, I get the error below.
The code is attached below.

#Removing Outliers
#Tukey Method

# import required libraries
import numpy as np
from collections import Counter

# Outlier detection

def detect_outliers(df, n, features):
    outlier_indices = []

    # iterate over features(columns)
    for col in features:
        # 1st quartile (25%)
        Q1 = np.percentile(df[col], 25)
        # 3rd quartile (75%)
        Q3 = np.percentile(df[col], 75)
        # Interquartile range (IQR)
        IQR = Q3 - Q1

        # outlier step
        outlier_step = 1.5 * IQR

        # Determine the index labels of outliers for feature col
        outlier_list_col = df[(df[col] < Q1 - outlier_step) | (df[col] > Q3 + outlier_step)].index

        # append the found outlier indices for col to the list of outlier indices
        outlier_indices.extend(outlier_list_col)

    # select observations containing more than n outliers
    outlier_indices = Counter(outlier_indices)
    multiple_outliers = list(k for k, v in outlier_indices.items() if v > n)

    return multiple_outliers

# List of Outliers

Outliers_to_drop = detect_outliers(data1.drop('Class',axis=1),0,list(data1.drop('Class',axis=1)))
data1.drop('Class',axis=1).loc[Outliers_to_drop]

#Create New Dataset without Outliers
good_data = data1.drop(data1.index[Outliers_to_drop]).reset_index(drop = True)
good_data.info()

IndexError Traceback (most recent call last)
in
1 #Create New Dataset without Outliers
----> 2 good_data = data1.drop(data1.index[Outliers_to_drop]).reset_index(drop = True)
3 good_data.info()

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in __getitem__(self, key)
4289
4290 key = com.values_from_object(key)
-> 4291 result = getitem(key)
4292 if not is_scalar(result):
4293 return promote(result)

IndexError: index 5000 is out of bounds for axis 0 with size 5000

Can anyone help me fix this and code it properly?
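
A possible explanation, offered as a sketch rather than a confirmed diagnosis: `detect_outliers` collects index labels (it reads them from `.index`), but `data1.index[Outliers_to_drop]` treats those values as positions. If `data1` does not have a default 0-based index (for example, labels running from 1 to 5000, which is an assumption here since the issue does not show how `data1` was loaded), then the label 5000 is not a valid position in an index of size 5000. The small frame below is a hypothetical stand-in for `data1`; it reproduces the same kind of error and shows the labels being passed to `drop` directly.

```python
import pandas as pd

# Hypothetical stand-in for data1: 5 rows whose index labels run 1..5
# (an assumption; the real data1 is not shown in the issue).
data1 = pd.DataFrame(
    {"V1": [1.0, 2.0, 3.0, 4.0, 500.0], "Class": [0, 0, 0, 0, 1]},
    index=[1, 2, 3, 4, 5],
)

# Index labels, as detect_outliers would return them (taken from df.index).
Outliers_to_drop = [5]

# Positional lookup: position 5 does not exist in a 5-row index.
try:
    data1.index[Outliers_to_drop]
except IndexError as exc:
    print("positional lookup failed:", exc)

# drop() already expects labels, so the list can be passed as-is.
good_data = data1.drop(index=Outliers_to_drop).reset_index(drop=True)
good_data.info()
```

If the index really is the default `RangeIndex`, labels and positions coincide and this change makes no difference, so it is worth checking `data1.index` first.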
