This repository contains the implementation and results of an acceleration-based user authentication system built with neural networks and optimization techniques. The system uses motion data from wearable devices to distinguish legitimate users from imposters, with applications in secure authentication.
📢 An extended version of the report is available.
This project explores a novel method for continuous user authentication using motion data collected from wearable devices like smartwatches. The primary goals include:
- Developing a robust machine learning model for user authentication.
- Optimizing model performance using advanced feature selection and genetic algorithms.
- Evaluating system performance with metrics like accuracy, precision, recall, FAR, FRR, and EER.
The data consists of accelerometer readings collected from each user on two separate days. Features are extracted in both the time and frequency domains:
- Time Domain: 88 statistical features (e.g., mean, standard deviation).
- Frequency Domain: 43 features generated via Fast Fourier Transform (FFT).
- Neural Networks: Designed and tuned for binary classification with optimized architectures.
- Optimization Techniques: Leveraging ANOVA, Mutual Information, Steepest Gradient, and Genetic Algorithms.
- Evaluation Metrics: FAR, FRR, EER, precision, recall, and accuracy are utilized for performance assessment.
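The time- and frequency-domain features described above can be illustrated with a minimal Python sketch. It is not the repository's MATLAB code: the function names, the four statistics, and the number of frequency bins are assumptions chosen for illustration, and a plain discrete Fourier transform stands in for the FFT (it yields the same coefficients, only more slowly).

```python
import cmath
import math
import statistics


def time_domain_features(window):
    """Return a few statistical features of one accelerometer window."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "min": min(window),
        "max": max(window),
    }


def frequency_domain_features(window, n_bins=4):
    """Return normalized magnitudes of the first n_bins DFT coefficients.

    A direct O(n^2) DFT keeps the sketch dependency-free; an FFT
    computes the same coefficients faster.
    """
    n = len(window)
    magnitudes = []
    for k in range(n_bins):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(window)
        )
        magnitudes.append(abs(coeff) / n)
    return magnitudes


# Hypothetical single-axis window: offset 1.0 plus a sine completing
# exactly 2 cycles over 16 samples.
window = [1.0 + math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
time_feats = time_domain_features(window)
freq_feats = frequency_domain_features(window)
```

For this synthetic window the mean and the bin-0 magnitude are both 1.0 (the offset), and bin 2 carries the sine's energy with magnitude 0.5.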
1. Data Splitting:
- Experiments with different training/testing combinations (e.g., day-wise splitting).
- The optimal ratio of legitimate to imposter samples was found to be 1:5.
2. Cross-validation:
- Leave-One-User-Out (LOUO) for generalization.
- Dynamic Time Warping (DTW) is used to select the left-out user. DTW measures the distance between temporal sequences that differ in length or speed.
3. Cosine Similarity Analysis
Cosine similarity was employed to identify similarities and anomalies in walking patterns. For example:
- User 7: Displayed identical samples across days, flagged as a potential anomaly.
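The two comparison measures named in the list above, DTW and cosine similarity, can be sketched in a few lines of Python. This is an illustrative sketch only: the sample vectors, the absolute-difference cost in DTW, and the `IDENTICAL_THRESHOLD` cut-off are assumptions, not values taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def dtw_distance(s, t):
    """Classic dynamic-programming DTW with absolute-difference cost."""
    inf = float("inf")
    n, m = len(s), len(t)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            # Extend the cheapest of: insertion, deletion, match.
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical feature vectors for one user on two days.
day1 = [0.2, 0.5, 0.1]
day2 = [0.2, 0.5, 0.1]
IDENTICAL_THRESHOLD = 0.9999  # assumed cut-off for "identical" samples
similarity = cosine_similarity(day1, day2)
flagged = similarity >= IDENTICAL_THRESHOLD

# The same shape walked at half speed costs nothing under DTW.
stretched = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

Here `flagged` is true because the two days are identical, which is the kind of pattern reported for User 7. `stretched` is 0.0, showing how DTW tolerates a change in speed, while `shifted` is 2.0.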
Principal Component Analysis (PCA) was used to reduce dataset complexity while preserving variance. The PCA revealed overlapping user clusters, highlighting challenges in separating similar patterns.
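As a rough illustration of what PCA computes, the sketch below estimates the first principal component by power iteration on the sample covariance matrix and reports the share of variance it explains. It is a simplified stand-in, not the project's implementation: the function name, the iteration count, and the toy data are assumptions, and a full PCA would extract further components as well.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the top principal component and its explained-variance ratio.

    Assumes the data is not constant, so the covariance matrix is non-zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
    )
    total = sum(cov[a][a] for a in range(d))
    return v, variance / total


# Toy data lying exactly on the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
component, explained = first_principal_component(rows)
```

Because the toy points lie on one line, a single component explains essentially all the variance. Overlapping user clusters, as reported above, correspond to the opposite situation, where the leading components do not separate users cleanly.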
- Initial models achieved an average accuracy of 93.14%, with FAR of 7.59% and EER of 3.93%.
- Optimized models using Genetic Algorithms and SVMs reached lower EERs (e.g., 0.97%) and improved robustness.
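For readers unfamiliar with the metrics quoted above, the following Python sketch shows one common way to compute FAR, FRR, and an approximate EER from similarity scores. The score lists are made up for illustration, and the convention that a score at or above the threshold means "accept" is an assumption, not something the project documents here.

```python
def far_frr(genuine_scores, imposter_scores, threshold):
    """FAR: share of imposter scores accepted. FRR: share of genuine scores rejected."""
    far = sum(s >= threshold for s in imposter_scores) / len(imposter_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def equal_error_rate(genuine_scores, imposter_scores):
    """Approximate the EER by trying every observed score as the threshold.

    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for threshold in sorted(set(genuine_scores) | set(imposter_scores)):
        far, frr = far_frr(genuine_scores, imposter_scores, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Hypothetical similarity scores (higher means more likely genuine).
genuine = [0.9, 0.8, 0.7, 0.4]
imposter = [0.6, 0.3, 0.2, 0.1]
eer, eer_threshold = equal_error_rate(genuine, imposter)
```

With these made-up scores, FAR and FRR meet at a threshold of 0.6, giving an EER of 0.25. Raising the threshold lowers FAR at the cost of a higher FRR, and the EER is the operating point where the two are equal.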
Similarity scores provided key insights:
- Legitimate Users: High scores for their own data.
- Imposters: Consistently low scores.
- Feature Selection: Combining ANOVA, Mutual Information, and Steepest Gradient techniques yielded the most discriminative features.
- Dimensionality Reduction: PCA successfully identified clusters and reduced redundancy.
- Advanced Optimization: Techniques like GA+SVM improved precision and recall.
- Robust Validation: LOUO cross-validation ensured generalization to unseen users.
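The LOUO scheme and the 1:5 legitimate-to-imposter ratio mentioned in this README can be combined as in the sketch below. It is one plausible reading under stated assumptions, not the project's actual procedure: the helper names, the toy data, and the choice to hold out one imposter at a time while training on the rest are all hypothetical.

```python
import random


def louo_folds(samples_by_user):
    """Yield (held_out_user, remaining_users), holding each user out once."""
    users = sorted(samples_by_user)
    for held_out in users:
        yield held_out, [u for u in users if u != held_out]


def build_training_set(samples_by_user, legit_user, train_users, ratio=5, seed=0):
    """Label legit_user's samples 1 and draw up to ratio times as many imposter samples (label 0)."""
    rng = random.Random(seed)
    legit = [(x, 1) for x in samples_by_user[legit_user]]
    pool = [
        x
        for u in train_users
        if u != legit_user
        for x in samples_by_user[u]
    ]
    n_imposter = min(len(pool), ratio * len(legit))
    imposters = [(x, 0) for x in rng.sample(pool, n_imposter)]
    return legit + imposters


# Toy data: 8 hypothetical users with 2 feature vectors each.
samples_by_user = {
    f"u{i}": [[float(i), i + 0.5], [i + 0.1, i + 0.6]] for i in range(1, 9)
}
legit_user = "u1"
# The legitimate user must stay in training, so only imposters are held out.
folds = [(h, t) for h, t in louo_folds(samples_by_user) if h != legit_user]
held_out, train_users = folds[0]
train = build_training_set(samples_by_user, legit_user, train_users)
```

In each fold the held-out user's samples never enter training, so testing on them measures how the model handles an imposter it has not seen before.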
🟢 Initial Model Score:
🟢 Optimized Model Scores:
| | GA+SVM model | ANOVA+MI+SG model |
|---|---|---|
| with LOUO | | |
| without LOUO | | |
- Clone the repository.
- Install MATLAB and required toolboxes.
- Run the scripts in sequence, starting with ☛ model_initial.m
- The full reference list is available in ☛ Extended Version of AI & ML Report _ GROUP - 34.pdf