We present the PhonVoc toolkit, a cascaded deep neural network (DNN)
composed of a speech analyser and a speech synthesizer that use a shared phonetic
and/or phonological speech representation. The free toolkit is distributed
as open-source software under a BSD 3-Clause License, available at
https://github.com/idiap/phonvoc together with pre-trained US English analysis
and synthesis DNNs, so it is ready for immediate use.
In a broader context, the toolkit implements training and testing of the
analysis-by-synthesis heuristic model. It is thus designed for the wider speech community
working in acoustic phonetics, laboratory phonology, and parametric
speech coding. The toolkit interprets the phonetic posterior probabilities
as a sequential scheme, whereas the phonological posterior-class probabilities
are treated as a parallel scheme over K different phonological classes.
A case study is presented on the LibriSpeech database and a native US English
female speaker from LibriVox. Phonetic and phonological vocoding
yield comparable performance, and merging the phonetic and phonological
speech representations improves speech quality.
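
The distinction between the sequential and the parallel scheme can be illustrated with a minimal sketch. It is not taken from the toolkit: the function names, the softmax and sigmoid output layers, the toy logits, and the merge by per-frame concatenation are all assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Normalise one frame of logits into a distribution that sums to one."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map one logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def phonetic_posteriors(frame_logits):
    # Sequential scheme (assumed softmax): phones are mutually exclusive,
    # so each frame carries a single distribution over the phone set.
    return [softmax(frame) for frame in frame_logits]


def phonological_posteriors(frame_logits):
    # Parallel scheme (assumed sigmoids): each of the K phonological classes
    # is scored independently, so several classes can be active in one frame.
    return [[sigmoid(x) for x in frame] for frame in frame_logits]


def merge(phonetic, phonological):
    # Hypothetical merge: concatenate both representations frame by frame.
    return [p + q for p, q in zip(phonetic, phonological)]


# Toy input: 2 frames, 3 hypothetical phones, K = 2 hypothetical classes.
phone_logits = [[0.0, 0.0, 0.0], [2.0, 0.5, -1.0]]
class_logits = [[0.0, 3.0], [-2.0, 1.0]]

phonetic = phonetic_posteriors(phone_logits)
phonological = phonological_posteriors(class_logits)
merged = merge(phonetic, phonological)
```

In this sketch each phonetic frame sums to one, whereas the K phonological values in a frame are unconstrained relative to each other; the merged frame simply has 3 + K entries.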