Voice Activity Detection LSTM-RNN learning model

BongkiLee/LSTM-RNN-VAD

 
 

LSTM-RNN Voice Activity Detection

REQUIRED PACKAGES

numpy, tensorflow, libROSA, matplotlib

FILES

- dataset_utils.py
Dataset-related utilities: one-hot encoding, WAV file normalisation, TRS-to-CSV conversion, JSON-to-CSV conversion, YouTube WAV download for the Google AudioSet corpus, and data transformations for the LIBLINEAR library

- metrics_utils.py
(NOT FINALISED) Metrics-related utilities for the baseline VAD methods

- feature_extractor.py
Feature extraction class for MFCCs, deltas, double deltas, and RSE

- VAD_model.py
LSTM-RNN TensorFlow learning model

- _main_.py
The program's main entry point

- /checkpoint
TensorFlow checkpoint directory for saving and restoring learning models

- /parameter
LSTM-RNN learning model hyper-parameters, training parameters, and log/checkpoint directory names

- /notebook
Jupyter notebooks to test initial VAD prototypes
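
EXAMPLE (ILLUSTRATIVE)

The dataset utilities listed above include one-hot encoding and WAV file normalisation. The following is a minimal sketch of what such helpers might look like, not the code in dataset_utils.py. The function names `one_hot` and `peak_normalise`, the two-class label convention (0 = non-speech, 1 = speech), and the use of peak normalisation are all assumptions made for illustration.

```python
import numpy as np


def one_hot(labels, num_classes=2):
    """Map integer class labels to one-hot rows.

    Assumed convention (hypothetical): 0 = non-speech, 1 = speech.
    """
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


def peak_normalise(signal, eps=1e-8):
    """Scale a non-empty waveform so its largest absolute sample is 1.0.

    eps guards against division by zero for an all-silent signal.
    """
    signal = np.asarray(signal, dtype=np.float32)
    peak = np.max(np.abs(signal))
    return signal / max(peak, eps)


# Per-frame VAD labels for four frames.
frame_labels = [0, 1, 1, 0]
encoded = one_hot(frame_labels)

# A toy three-sample waveform standing in for decoded WAV data.
waveform = np.array([0.1, -0.5, 0.25], dtype=np.float32)
normalised = peak_normalise(waveform)
```

Here `encoded` has one row per frame and one column per class, which is the usual target layout for a softmax output layer. Peak normalisation is only one option; the repository may normalise differently, for example by RMS level.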
