Predicting penguin species using supervised machine learning algorithms. KNeighborsClassifier is used for our model because it produced the best results compared with the other algorithms tried.

koustubh1317/penguinspecies


Predicting Penguin Species

ML project:- supervised learning

Made by:- Koustubh Sinha

Using the penguin dataset, we want to predict the species of a penguin from certain features with an appropriate model.

About penguin dataset

It is a great introductory dataset for data exploration and visualization.

The dataset consists of 7 columns.

- species: penguin species (Chinstrap, Adélie, or Gentoo)
- culmen_length_mm: culmen length (mm)
- culmen_depth_mm: culmen depth (mm)
- flipper_length_mm: flipper length (mm)
- body_mass_g: body mass (g)
- island: island name (Dream, Torgersen, or Biscoe) in the Palmer Archipelago (Antarctica)
- sex: penguin sex
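To make the column layout concrete, here is a minimal sketch of a pandas frame with these seven columns. The rows are invented for illustration and are not taken from the dataset; in practice the frame would be read from the dataset file (for example with `pd.read_csv`), whose name is not given here.

```python
import pandas as pd

# Illustrative rows only: these measurements are invented, not real dataset values.
penguins = pd.DataFrame(
    {
        "species": ["Adelie", "Chinstrap", "Gentoo"],
        "island": ["Torgersen", "Dream", "Biscoe"],
        "culmen_length_mm": [39.1, 46.5, 47.3],
        "culmen_depth_mm": [18.7, 17.9, 14.2],
        "flipper_length_mm": [181.0, 192.0, 215.0],
        "body_mass_g": [3750.0, 3500.0, 5100.0],
        "sex": ["MALE", "FEMALE", "FEMALE"],
    }
)

# Separate the target column from the candidate feature columns.
target = penguins["species"]
features = penguins.drop(columns="species")

print(penguins.shape)          # (rows, columns)
print(list(features.columns))  # the six predictor columns
```

Here `species` is the prediction target and the remaining six columns are the candidate features.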

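What are culmen length & depth? The culmen is the upper ridge of a bird's bill. Culmen length is measured along that ridge, and culmen depth is the height of the bill.

What are flippers? Flippers are the penguin's wings, adapted for swimming.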

The dataset description and a detailed exploratory analysis are in the predicting_penguins.ipynb file. Please go through it.

Getting started

Workflow:-

* Step 1:- Data Cleaning and visualization

The data is cleaned and then visualized through different plots.
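As a rough illustration of the cleaning part of this step, the sketch below counts missing values and drops incomplete rows. The rows are invented, and dropping rows is only one possible strategy (an assumption here); the notebook may handle missing values differently. Plotting is left out to keep the example short.

```python
import numpy as np
import pandas as pd

# Invented rows with deliberately missing values, for illustration only.
raw = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Gentoo", "Chinstrap"],
        "culmen_length_mm": [39.1, np.nan, 47.3, 46.5],
        "body_mass_g": [3750.0, 3800.0, np.nan, 3500.0],
        "sex": ["MALE", None, "FEMALE", "FEMALE"],
    }
)

# Count missing values per column before deciding how to handle them.
missing_per_column = raw.isna().sum()
print(missing_per_column)

# Simplest option: drop every row that has at least one missing value.
clean = raw.dropna().reset_index(drop=True)
print(len(raw), "->", len(clean))
```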

* Step 2:- Exploratory data analysis and data preprocessing

Understand the data and prepare the dataset for model fitting (label encoding, normalization).
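A minimal sketch of the two preprocessing operations named above, using scikit-learn. The values are invented, and `MinMaxScaler` is an assumption: "normalization" could equally mean standardization with `StandardScaler`.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Invented values for illustration only.
species = np.array(["Adelie", "Gentoo", "Chinstrap", "Gentoo"])
measurements = np.array(
    [
        [39.1, 181.0],  # culmen_length_mm, flipper_length_mm
        [47.3, 215.0],
        [46.5, 192.0],
        [50.0, 220.0],
    ]
)

# Label encoding: map each category name to an integer code.
encoder = LabelEncoder()
species_codes = encoder.fit_transform(species)
print(encoder.classes_, species_codes)

# Normalization (assumed min-max): rescale each numeric column to the [0, 1] range.
scaler = MinMaxScaler()
scaled = scaler.fit_transform(measurements)
print(scaled.min(axis=0), scaled.max(axis=0))
```

Scaling matters for distance-based models such as KNeighborsClassifier, because otherwise a column with large values (body mass in grams) would dominate the distance.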

* Step 3:- Model Training and selection

Based on the comparison of models, KNeighborsClassifier (KNC) fits our dataset best.
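One common way to run such a comparison is cross-validated accuracy, sketched below. The data is synthetic (`make_blobs`) and the candidate list is hypothetical; the models and settings actually compared in the notebook may differ.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed penguin features (3 classes, 4 features).
X, y = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)

# Hypothetical candidate list, chosen only for illustration.
candidates = {
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")

# Pick the candidate with the highest mean accuracy.
best_name = max(scores, key=scores.get)
print("best:", best_name)
```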

About KNeighborsClassifier

  • KNeighborsClassifier (KNC) is one of the simplest machine learning algorithms based on the supervised learning technique.
  • The KNC algorithm assumes that a new case is similar to the available cases and puts it into the category whose cases it most resembles.
  • The KNC algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category.
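The idea in the points above can be sketched from scratch in a few lines. This is an illustrative implementation with Euclidean distance and majority voting on toy data, not scikit-learn's actual code; the function name `knn_predict` and the data are made up for the example.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Euclidean distance from the new point to every stored training point.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training points.
    nearest = np.argsort(distances)[:k]
    # The most common label among those neighbours wins.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D data: two well-separated groups (invented values).
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])

print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))  # A
print(knn_predict(X_train, y_train, np.array([5.0, 5.1])))  # B
```

Note that there is no training step beyond storing `X_train` and `y_train`: all the work happens at prediction time.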

* Step 4:- Testing new data points
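A minimal sketch of classifying a new data point with a fitted KNeighborsClassifier. The training rows and the new measurement are invented, and the choices of `n_neighbors=3` and `MinMaxScaler` are assumptions rather than the notebook's settings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Invented training rows: culmen_length_mm, culmen_depth_mm, flipper_length_mm, body_mass_g.
X_train = np.array(
    [
        [39.1, 18.7, 181.0, 3750.0],
        [38.6, 17.2, 185.0, 3600.0],
        [40.3, 18.0, 190.0, 3650.0],
        [47.3, 14.2, 215.0, 5100.0],
        [48.7, 15.1, 222.0, 5350.0],
        [46.1, 13.9, 217.0, 4900.0],
    ]
)
y_train = np.array(["Adelie"] * 3 + ["Gentoo"] * 3)

# Fit the scaler on the training data only, then train the classifier.
scaler = MinMaxScaler()
model = KNeighborsClassifier(n_neighbors=3)
model.fit(scaler.fit_transform(X_train), y_train)

# A new, unseen measurement must go through the same scaler before prediction.
new_point = np.array([[47.0, 14.5, 216.0, 5000.0]])
prediction = model.predict(scaler.transform(new_point))
print(prediction[0])
```

The new point is transformed with the scaler fitted on the training data, so it is measured on the same scale as the stored neighbours.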
