Tuesday, July 26, 2016

Installing Kaldi and Kaldi-Gstreamer-server on Ubuntu 16.04

Notes on the process of installing Kaldi and Kaldi-GStreamer-server on Ubuntu 16.04 LTS.  I documented this retroactively for my own benefit, so the steps may differ somewhat from what I actually typed.

Kaldi is a state-of-the-art speech transcription engine, geared towards researchers and people who already know what they're doing.  I'm just trying to set it up.

Decide where to put Kaldi and make that your new working directory.
mkdir -p ~/tools/
cd ~/tools
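Clone Kaldi from GitHub.  Naming the target directory kaldi-master keeps the paths used in the rest of this post valid (a plain git clone would create a directory called kaldi instead).
git clone https://github.com/kaldi-asr/kaldi.git kaldi-master
cd into this new location.
cd ./kaldi-master/tools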
Check for any missing dependencies.  There were a few things I needed to add to my Ubuntu installation; I don't remember what they were.  Do whatever this script's output instructs.
extras/check_dependencies.sh
Now comes the actual installation.
make
cd ../src
./configure --shared
make depend
make
Run this next to install the online extensions.
make ext
Note: if you have more than one core in your machine, you can run make -j 4 to do make in parallel.
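
As a rough sketch of the note above, the job count can be read from the machine instead of hard-coding 4.  The helper name make_jobs is made up for illustration and is not part of Kaldi; nproc comes with GNU coreutils, and getconf is only there as a fallback.

```shell
#!/bin/sh
# Print the number of parallel jobs to hand to make.
# make_jobs is a hypothetical helper name, not something Kaldi provides.
make_jobs() {
    n="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    # Fall back to 1 if the result is empty, zero, or not a plain number.
    case "$n" in
        ''|*[!0-9]*|0) n=1 ;;
    esac
    echo "$n"
}

# Only prints the command; nothing is built here.
echo "would run: make -j $(make_jobs)"
```

In the Kaldi tools and src directories, that would mean running make -j "$(make_jobs)" in place of plain make.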

Congratulations.  Kaldi is installed.

Installing Kaldi-GStreamer-server

Before actually installing the kaldi-gstreamer-server, there are a few more things to do with Kaldi itself.
Compile the Gstreamer plugin.  First, install dependencies. Note they are older versions of the packages.  Make sure you get the right version.  On Ubuntu/Debian, run:

sudo apt-get install libgstreamer1.0-dev gstreamer1.0-plugins-good gstreamer1.0-tools gstreamer1.0-pulseaudio
Kaldi-Gstreamer-server requires the gstreamer plugin to be compiled (makes sense).
cd ~/tools/kaldi-master/src/gst-plugin/
make depend
make
This folder (gst-plugin) should now contain the file libgstkaldi.so which contains the Gstreamer plugin.
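
A quick way to confirm that, sketched under the assumption that Kaldi lives where this post put it.  The function name plugin_built is mine, purely for illustration.

```shell
#!/bin/sh
# Succeed if the compiled plugin exists in the given directory.
# plugin_built is a hypothetical helper; the default path assumes this post's layout.
plugin_built() {
    dir="${1:-$HOME/tools/kaldi-master/src/gst-plugin}"
    [ -f "$dir/libgstkaldi.so" ]
}

if plugin_built; then
    echo "libgstkaldi.so found"
else
    echo "libgstkaldi.so missing: re-run make depend and make in gst-plugin"
fi
```

Pass a different directory as the first argument if Kaldi is installed somewhere else.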

Now it's time to install the kaldi-gstreamer-server package.  First, more dependencies.
sudo apt-get install python-pip python-yaml python-gi
pip install tornado ws4py==0.3.2 pyyaml
Note: You might need to run pip with sudo, e.g. sudo pip install tornado, for the command above.
Note: I couldn't figure out which YAML package to install, so I used both.  At least, they're both installed, and I don't remember which I actually needed.  If I do this again, I'll try to remember to change this.
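
For what it's worth, the apt package python-yaml and the pip package pyyaml both provide the same Python module, yaml, so either one should satisfy the import.  Here is a small sketch for checking which copy Python actually picks up.  The helper name yaml_location is hypothetical, and the interpreter name python is an assumption that may need to be python2 or python3 on other systems.

```shell
#!/bin/sh
# Report where Python's yaml module is loaded from, if it can be imported.
# yaml_location is a hypothetical helper; pass another interpreter name as $1 if needed.
yaml_location() {
    py="${1:-python}"
    "$py" -c 'import yaml; print(yaml.__file__)' 2>/dev/null \
        || echo "yaml not importable with $py"
}

yaml_location
```

On Ubuntu I'd expect a path under /usr/lib to mean the apt package and one under /usr/local/lib or ~/.local to mean the pip one, but treat that as a rule of thumb.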

Clone kaldi-gstreamer-server from GitHub into your tools folder.
cd ~/tools/
git clone https://github.com/alumae/kaldi-gstreamer-server.git

This completes the installation.

cd into the main folder.
cd ./kaldi-gstreamer-server/
Open the README file, peruse until understood.
gedit ./README.md
Now you'll understand what I mean by server and worker.  You can start the server with:
python kaldigstserver/master_server.py --port=8888
Before starting a worker, make sure that the GST plugin path includes the gstreamer plugin you compiled.  If you put everything where I recommended, this is all you have to do:
export GST_PLUGIN_PATH=~/tools/kaldi-master/src/gst-plugin
Test to make sure it worked.  If it fails, take a look at the README file again.  This command should spit out a bunch of information.  If it just says something like, 'not found', you did something wrong.  I have no idea what.
gst-inspect-1.0 onlinegmmdecodefaster
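
The export and the test can be rolled into one check.  This is only a sketch: element_visible is a name I made up, the default path assumes the layout used in this post, and it assumes gst-inspect-1.0 exits non-zero when it can't find the element.

```shell
#!/bin/sh
# Check whether GStreamer can see a named element, given a plugin directory.
# element_visible is a hypothetical helper; the default path assumes this post's layout.
element_visible() {
    element="$1"
    GST_PLUGIN_PATH="${2:-$HOME/tools/kaldi-master/src/gst-plugin}"
    export GST_PLUGIN_PATH
    if ! command -v gst-inspect-1.0 >/dev/null 2>&1; then
        echo "gst-inspect-1.0 is not installed"
        return 2
    fi
    # Assumption: a non-zero exit status means the element was not found.
    gst-inspect-1.0 "$element" >/dev/null 2>&1
}

if element_visible onlinegmmdecodefaster; then
    echo "plugin found"
else
    echo "plugin not found: check GST_PLUGIN_PATH and that libgstkaldi.so was built"
fi
```

Because the function exports GST_PLUGIN_PATH, sourcing this in the same shell before starting a worker leaves the variable set.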
Now you can start a worker.
python kaldigstserver/worker.py -u ws://localhost:8888/worker/ws/speech -c sample_worker.yaml
Example of how to use the server to transcribe an audio file:
python kaldigstserver/client.py -r 32000 ~/tools/kaldi-gstreamer-server/test/data/english_test.raw
You can also use a Deep Neural Network (DNN) to process the data, but at time of writing the readme walkthrough was giving me errors.

That's it!
