After a round of optimization, refactoring, bug fixing and testing, it is time for a new blog post. Since fundamentals have changed and due to public requests, this one is a step-by-step tutorial. First of all, the good news: SOPARE 1.5 is out and was successfully developed, installed and tested on Raspbian Wheezy, Jessie and Stretch. In addition, people mentioned that SOPARE works on Orange Pi and on some Ubuntu versions. Just in case you have no idea what SOPARE is, let’s do a quick introduction:
SOPARE stands for SOund PAttern REcognition and is a Python project developed on and for the Raspberry Pi. The goal is to provide offline and real time audio processing for some words that must be trained upfront.
As SOPARE is able to learn sounds from training sessions, it can identify the same sound later on, even under different circumstances. This means that you can train words in any language. Or just sounds like doorbells, knocks and whatever you want. Of course, there are limitations. However, SOPARE provides a simple plug-in architecture for further processing. Here are some real-life operational areas: SOPARE runs 24/7 and controls smart home things like lights (on/off) and a magic mirror (wake up, change views, …), and another installation controls a robotic arm via voice commands. The source code and even more information are available on GitHub.
You want to see SOPARE in action? Here is a 32 second video that shows the potential:
Now let us start with the hardware requirements. You need a computer. Yep, seriously. As SOPARE was developed for and on a Raspberry Pi, we go with this one, even if SOPARE runs on other hardware as well. Make sure that the hardware comes with a multi-core processor. This means Raspberry Pi 2 or 3. Please note: The Pi Zero was not tested and is probably too weak, as the „Zero“ comes with a single-core processor. SOPARE does not run on older hardware like Raspberry Pi B or B+ due to the lack of multi-core processors. Of course, you need a power supply and a micro SD card if you go with the Raspberry Pi.
Then you need a microphone. Maybe some USB mic. The microphone is extremely important and should fit your own requirements. For example: If you want speech recognition across a larger distance (more than 1 meter), you may find out that the cheap USB mic for 5 Euros does not do the trick. But if you plan to speak directly into the microphone, the same mic could do the job perfectly well. I’m using different microphones for different environments and requirements.
That’s it for the hardware. Now let’s talk about software. SOPARE should run on every Raspbian version that is out there. The latest version is Stretch. All of my Raspberry Pis are running the „lite“ version without a desktop UI. But this is up to you and you can choose whatever you prefer. There is some good information available on how to download, install and configure Raspbian. I don’t cover this topic as it would get out of hand.
Now you should have a computer, a mic and the operating system installed and configured. In terms of Raspbian you already got most of the software for the further installation. Only some required libraries must be installed manually with the following commands:
sudo apt-get update
sudo apt-get install build-essential python-pyaudio python-numpy python-scipy python-matplotlib
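If you want to double check that the libraries really ended up where Python can find them, a small script like the following may help. It is only a sketch: the module names are my assumption of what the packages above provide (python-pyaudio as pyaudio, python-numpy as numpy and so on), and the helper check_modules is made up for this post and is not part of SOPARE.

```python
import importlib

# Module names assumed to match the packages installed above
# (python-pyaudio -> pyaudio, python-numpy -> numpy, and so on).
REQUIRED_MODULES = ["pyaudio", "numpy", "scipy", "matplotlib"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Print one line per module so a missing library is easy to spot.
    for name, ok in sorted(check_modules(REQUIRED_MODULES).items()):
        print("%-12s %s" % (name, "ok" if ok else "MISSING"))
```

Any module reported as MISSING points to a package that was not installed correctly.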
I recommend creating a development directory in your home directory, but this is really optional. In case you follow my recommendation, execute the following commands:
cd
mkdir dev
cd dev
You are now ready to install SOPARE from GitHub:
git clone https://github.com/bishoph/sopare.git
Voilà. To really be ready and to follow the complete instructions we need two more directories:
cd sopare
mkdir tokens
mkdir samples
You successfully installed SOPARE. Congratulations. We can fire up some tests to find out if all requirements are met and if the microphone is configured and used correctly. Start the SOPARE unit tests and the audio test with the following commands:
python sopare.py -u
python test/test_audio.py
Let us assume that everything went well and you got no errors. In that case you see something like this:
sopare 1.5.1
starting unit tests...
...
unit_tests run successful!
done.
test_audio init...
... ALSA related information ...
testing different SAMPLE_RATEs ... this may take a while!
Your sopare/config.py recommendations:
SAMPLE_RATE = 48000
CHUNK = 512
THRESHOLD = 100
Great! You can now edit the configuration in sopare/config.py and set SAMPLE_RATE, CHUNK and THRESHOLD according to the recommendations.
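With the recommendations from the example output, the three relevant settings could look like the sketch below. The values are simply copied from the output in this post, so use the numbers from your own test run. The small calculation at the end is my own addition and assumes that CHUNK is the number of samples read per buffer.

```python
# Values copied from the test_audio recommendations shown in this post.
# Yours will probably differ, so use the numbers from your own test run.
SAMPLE_RATE = 48000  # samples per second
CHUNK = 512          # assumed here: samples read per buffer
THRESHOLD = 100      # level that triggers a recording

# Rough duration of one chunk in milliseconds, under the assumption above.
chunk_ms = 1000.0 * CHUNK / SAMPLE_RATE
print("one chunk covers about %.2f ms of audio" % chunk_ms)
```

Leave the other entries in the file as they are and only change these three lines.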
As soon as you have saved the config, you are ready to do a first training round. Let’s train the word „test“. This is as easy as eating cake:
./sopare.py -v -t test
Start saying the word „test“ shortly after the line
INFO:sopare.recorder:start endless recording
appears on the screen. You should see lots of lines rush across your monitor. This is good, as SOPARE logs some debug information. If the lines start rushing before you have said anything, SOPARE started the training because something triggered the THRESHOLD. In that case I recommend deleting the trained file(s) and starting the training again, maybe with a higher THRESHOLD.
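To get a feeling for why a higher THRESHOLD helps, here is a tiny sketch of a loudness trigger. Please read it as an illustration only: this post does not describe how SOPARE measures the level internally, so the RMS measure, the function names and the sample values are all assumptions.

```python
import math


def rms(samples):
    """Root mean square of a sequence of signed integer samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / float(len(samples)))


def is_triggered(samples, threshold):
    """True if the chunk is loud enough to count as the start of a sound."""
    return rms(samples) > threshold


# Made-up sample values for illustration.
background = [12, -8, 15, -11, 9, -14]         # quiet room noise
spoken = [900, -1200, 1500, -700, 1100, -950]  # someone says "test"

for threshold in (10, 100):
    print(threshold, is_triggered(background, threshold), is_triggered(spoken, threshold))
```

With the low threshold even the room noise starts a recording, while the higher one only reacts to the spoken word.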
Here are the commands to delete the files and the dictionary so that you can start again:
rm dict/*.raw
./sopare.py -d "*"
You can repeat the training round a few times. Normally 3 times is enough to get first results.
After the training SOPARE must create an internal dictionary from the training:
Finally we reached the end of the step-by-step tutorial. You may want to check if your trained words are recognized, right? Here we go. Start SOPARE in endless loop mode and say „test“:
Depending on your mic, your environment, the number of training rounds and lots of other things, you should see that SOPARE is able to recognize the word „test“, as it appears on the screen in square brackets.
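Since the recognized words show up as a list, further processing can be as simple as mapping words to actions. The following sketch shows the idea. Everything in it is hypothetical: the function handle_results, the ACTIONS table and the light functions are made up for illustration and are not SOPARE's plugin interface, which is not covered in this post.

```python
def light_on():
    return "light on"


def light_off():
    return "light off"


# Hypothetical mapping from trained words to actions.
ACTIONS = {"on": light_on, "off": light_off}


def handle_results(recognized_words):
    """Run the action for every recognized word that has one; return the outcomes."""
    outcomes = []
    for word in recognized_words:
        action = ACTIONS.get(word)
        if action is not None:
            outcomes.append(action())
    return outcomes


print(handle_results(["test", "on"]))
```

Words without an entry in the table, like „test“ here, are simply ignored.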
Amazing. You can now fine tune, train more or different words or write your custom plugin. See the other available content for more information. Leave me a comment and tell me about your experience and your achievement. The video tutorial for this post:
Happy voice control and have fun!