This repo is an implementation of Tacotron 2 wrapped in a web application. I generated a David Attenborough TTS model by training on over 20 hours of labeled audio and transcript data.
- Integrate mutual information maximization into the loss function
- Create an ARPAbet converter to break the text representation down into a lower-dimensional phoneme representation
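
As a rough illustration of the converter item above, here is a minimal sketch of a dictionary-based text-to-ARPAbet step. The function name `text_to_arpabet` and the tiny `TOY_LEXICON` are hypothetical and not taken from this repo; a real converter would presumably load the full CMU Pronouncing Dictionary. Wrapping phonemes in curly braces is an assumption borrowed from a convention some Tacotron implementations use, and unknown words are simply passed through as plain text.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real converter would
# load a full pronunciation dictionary such as the CMU Pronouncing Dictionary.
TOY_LEXICON = {
    "hello": ["HH", "AH0", "L", "OW1"],
    "world": ["W", "ER1", "L", "D"],
    "the": ["DH", "AH0"],
    "planet": ["P", "L", "AE1", "N", "AH0", "T"],
}


def text_to_arpabet(text, lexicon=TOY_LEXICON):
    """Replace each known word with its ARPAbet phonemes in curly braces.

    Words missing from the lexicon are kept as lowercase text, and basic
    punctuation is kept as separate tokens.
    """
    # Split into lowercase words (letters and apostrophes) and punctuation marks.
    tokens = re.findall(r"[a-z']+|[.,!?;:]", text.lower())
    converted = []
    for token in tokens:
        if token in lexicon:
            converted.append("{" + " ".join(lexicon[token]) + "}")
        else:
            converted.append(token)
    return " ".join(converted)


if __name__ == "__main__":
    print(text_to_arpabet("Hello, world!"))
```

The phoneme inventory is far smaller than the space of possible spellings, which is why the model sees a more compact input when text is converted this way. How out-of-vocabulary words should be handled (fallback to characters, a grapheme-to-phoneme model, or something else) is left open here.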