The goal of this project is to classify whether protein sequences have antioxidant properties. To achieve this goal, we perform the following steps:
- Feature extraction: To obtain protein features from their sequences
- Feature selection: To keep only the impactful features, using two methods: a) eliminate highly correlated features, b) perform RFECV (recursive feature elimination with cross-validation)
- SVM parameter analysis: To determine which parameters to use in hyperparameter tuning
- Hyperparameter tuning of the SVM
- Model evaluation on the test data
- Analysis of whether the model overfits.
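
The steps from feature selection onward can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the feature matrix is synthetic (`make_classification` stands in for the extracted protein features), and the 0.9 correlation threshold, the parameter grid, the 5-fold cross-validation, and all variable names are hypothetical choices made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted protein features (assumption).
X_raw, y = make_classification(
    n_samples=200, n_features=12, n_informative=5, n_redundant=0, random_state=0
)
X = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(12)])
X["f_dup"] = X["f0"] * 2.0 + 0.01  # perfectly correlated column, to be removed

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def correlated_columns(frame, threshold=0.9):
    """Return columns whose absolute correlation with an earlier column exceeds threshold."""
    corr = frame.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return [col for col in upper.columns if (upper[col] > threshold).any()]


# Feature selection (a): drop highly correlated features, decided on training data only.
to_drop = correlated_columns(X_train)
X_train_f = X_train.drop(columns=to_drop)
X_test_f = X_test.drop(columns=to_drop)

# Scale with statistics from the training split, since SVMs are scale sensitive.
scaler = StandardScaler().fit(X_train_f)
Xs_train = scaler.transform(X_train_f)
Xs_test = scaler.transform(X_test_f)

# Feature selection (b): RFECV needs an estimator exposing coef_, hence a linear SVM.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
selector = RFECV(SVC(kernel="linear"), step=1, cv=cv, scoring="accuracy")
selector.fit(Xs_train, y_train)
Xsel_train = selector.transform(Xs_train)
Xsel_test = selector.transform(Xs_test)

# Hyperparameter tuning: grid search over an illustrative C / gamma grid.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=cv, scoring="accuracy")
search.fit(Xsel_train, y_train)

# Evaluation and overfitting check: compare training and test accuracy.
train_acc = search.score(Xsel_train, y_train)
test_acc = search.score(Xsel_test, y_test)
gap = train_acc - test_acc

print("dropped:", to_drop)
print("features kept by RFECV:", selector.n_features_)
print("best params:", search.best_params_)
print(f"train acc={train_acc:.3f} test acc={test_acc:.3f} gap={gap:.3f}")
```

A large positive gap between training and test accuracy suggests overfitting. The per-combination scores in `search.cv_results_` can be inspected to see how sensitive the model is to each parameter.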