Final strategy: I'm going to use CMU Sphinx with a small vocabulary trained to my voice for most commands. I'll use the kaldi-gstreamer-server, or maybe even an online service, for longer, arbitrary stretches of speech: the stuff I can't predict.
That means I'll have two separate behemoth systems installed on the computer. Ouch. At least I can stream Kaldi from a different computer. Sphinx should be small enough not to be a problem.
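One way the split could look in code, as a minimal sketch: try the small-vocabulary recognizer first, and only hand the audio to the large one if nothing in the command list matched. Everything here is hypothetical: `COMMANDS`, `route`, and the two stub recognizers are stand-ins I made up for illustration, not the Sphinx or Kaldi APIs.

```python
from typing import Callable, Optional, Tuple

# Hypothetical command vocabulary; the real one would come from the trained model.
COMMANDS = {"open terminal", "close window", "scroll down"}


def route(
    audio: bytes,
    command_recognizer: Callable[[bytes], Optional[str]],
    dictation_recognizer: Callable[[bytes], str],
) -> Tuple[str, str]:
    """Try the small-vocabulary recognizer first, then fall back to the large one."""
    text = command_recognizer(audio)
    if text is not None and text in COMMANDS:
        return ("command", text)
    # No known command matched, so treat the audio as arbitrary speech.
    return ("dictation", dictation_recognizer(audio))


# Stub recognizers standing in for Sphinx and Kaldi (assumptions, not real bindings).
def fake_command_recognizer(audio: bytes) -> Optional[str]:
    return "open terminal" if audio == b"cmd" else None


def fake_dictation_recognizer(audio: bytes) -> str:
    return "some arbitrary sentence"


result_command = route(b"cmd", fake_command_recognizer, fake_dictation_recognizer)
result_dictation = route(b"other", fake_command_recognizer, fake_dictation_recognizer)
```

The point of passing the recognizers in as plain callables is that the large one could just as easily be a network call to another machine or an online service without the routing logic changing.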
Here's what I need to be able to train the command and control language model.