SOPARE architecture and plugins

After installing, configuring and training SOPARE, you naturally want to do something with the recognized results. I run some installations where SOPARE turns lights on and off, controls a magic mirror and a robotic arm. With my voice. On a Raspberry Pi. Offline and in real time. How does that work? Glad you are asking. This post gives you an overview of the SOPARE architecture and provides some insights into how to write your own custom plugins for further processing.

Architecture

SOPARE is written in Python, runs very well with Python 2.7 and was tested successfully on several Raspberry Pis running all kinds of operating systems. In fact, SOPARE should run on all kinds of *UNIX systems as long as the underlying hardware comes with a multi core CPU. You can find more information about how to get started here. Now let’s do a small walk-through. When you start SOPARE with the command

./sopare.py

we can look at this simplified list of what’s happening:

  1. Check for arguments from the command line and init parameters
  2. Load the config file
  3. Evaluate the init parameters
  4. Initialize the audio
  5. Read input from microphone
  6. Check if the sound level is above the THRESHOLD
    • Prepare
    • Filter
    • Analyse
    • Check for matching patterns
      • Call plugin(s)
  7. Stop everything after the configured timeout is reached

Below is an architecture overview:

SOPARE architecture overview

Now we can dig a bit deeper and take a more detailed view. The first thread (thread 1) is listening the whole time and records chunks of data. These small chunks of data are compared in terms of sound volume. Whenever the volume of one CHUNK is above the THRESHOLD, the chunks are transformed and filtered and a characteristic is created. This happens in „thread 2“. The chain of characteristics is compared against the trained results and the plugins are called in „thread 3“. Each thread runs on a different CPU core. This can be observed while SOPARE is running by starting the command

top -d 1

and then pressing „1“ to show all CPUs:

Output from command top -d 1

Press „q“ to exit the top output.
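
Coming back to the three stages: to make this flow more tangible, here is a minimal conceptual sketch of such a pipeline with Python’s multiprocessing module. This is NOT SOPARE’s actual source code and all the names and numbers are made up for illustration, but it shows how recording, analysis and plugin dispatch can run as separate processes connected by queues:

# Conceptual sketch only -- NOT SOPARE's real source code.
# It mimics the flow described above: process 1 records chunks,
# process 2 filters/analyses them, process 3 matches and calls the plugins.

import multiprocessing
import random
import time

THRESHOLD = 400  # made-up value for this sketch

def record(chunk_queue):
    # "Thread 1": listen the whole time and forward loud chunks only
    while True:
        chunk = [random.randint(0, 600) for _ in range(8)]  # fake audio chunk
        if max(chunk) > THRESHOLD:
            chunk_queue.put(chunk)
        time.sleep(0.1)

def analyse(chunk_queue, result_queue):
    # "Thread 2": transform/filter the chunk into a characteristic
    while True:
        chunk = chunk_queue.get()
        characteristic = sum(chunk) / len(chunk)  # stand-in for the real analysis
        result_queue.put(characteristic)

def dispatch(result_queue):
    # "Thread 3": compare against trained results and call the plugins
    while True:
        characteristic = result_queue.get()
        readable_results = ['test'] if characteristic > THRESHOLD else []
        if readable_results:
            print(readable_results)  # a real system would call the plugins here

if __name__ == '__main__':
    chunks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    for target, args in ((record, (chunks,)),
                         (analyse, (chunks, results)),
                         (dispatch, (results,))):
        p = multiprocessing.Process(target=target, args=args)
        p.daemon = True
        p.start()
    time.sleep(3)  # let the sketch run for a few seconds, then exit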

As we can see in the top output above, the SOPARE Python processes run with the PIDs 656, 655 and 649. I want to show two ways to kill the SOPARE processes:

First via the „kill“ command and a list of all PIDs:

kill -3 649 655 656

Second via „pkill“ where the processes are killed based on a name:

pkill -3 -f ./sopare.py

I usually send a SIGQUIT because I get a dump that shows what the program is doing, but you can send any appropriate signal.
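
For completeness, the same thing can also be done from Python, for example from a small supervisor script. This is just a sketch of one possible approach, not part of SOPARE, and it assumes that pgrep is available on the system:

# Sketch: send SIGQUIT to all running sopare.py processes from Python.
# Roughly equivalent to "pkill -3 -f ./sopare.py".

import os
import signal
import subprocess

def quit_sopare():
    try:
        # pgrep -f matches against the full command line, just like pkill -f
        output = subprocess.check_output(['pgrep', '-f', './sopare.py'])
    except subprocess.CalledProcessError:
        return  # no matching processes found
    for pid in output.split():
        os.kill(int(pid), signal.SIGQUIT)

if __name__ == '__main__':
    quit_sopare()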

Ok. Now that we have touched on the architecture and the processes, let’s talk about plugins. Plugins are stored in the

plugins

directory. Below you see the plugin structure from my magic mirror pi:

SOPARE Plugin Structure

To create a plugin I recommend copying the print plugin into a new directory, where you replace the name „custom_plugin_directory“ with any meaningful name for your plugin:

cd plugins
cp -r print/ custom_plugin_directory

You now have a directory with all the necessary files for your custom plugin in „custom_plugin_directory“, or whatever name you have chosen. Let’s modify the plugin and see what’s inside:

cd custom_plugin_directory

The file „__init__.py“ is the one that is called by SOPARE when a sound is recognized. Open it:

nano __init__.py

You see some lines of code and the interesting part is this one:

def run(readable_results, data, rawbuf):
    print readable_results

The function „run“ is called and the value „readable_results“ is handed over as an array. This means that each recognized sound is inside this array with its corresponding ID. Let’s assume that you trained a sound with the ID „test“. When the same sound is recognized, the ID „test“ shows up like this:

[u'test']

If more words are recognized, you get all of them in the array in the same order as they were recognized:

[u'test', u'another_id', u'whatever']

You can now work with the results and write your own conditions. For example, here is code that just checks for the presence of two words:

def run(readable_results, data, rawbuf):
    if ('some_word' in readable_results and 'another_word' in readable_results):
        print ('Tschakka! Got my two words...now do some awesome stuff')

Here is an example where the results must be in a specific order:

def run(readable_results, data, rawbuf):
    if (len(readable_results) == 2 and 'word_1' in readable_results[0] and 'word_2' in readable_results[1]):
        print ('Tschakka! Got my two words in the right order...now do some awesome stuff')

With this knowledge you are hopefully able to read and understand the robotic arm example and write your own plugins. There is one thing I want to mention: all plugins are called sequentially. This means you should not execute any complex or long running code without threading!
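
To illustrate that last point, here is a minimal sketch of a plugin „__init__.py“ that hands the heavy lifting over to a background thread so that run() returns immediately. The function do_something_slow() is just a made-up placeholder for your own code:

# Sketch of a plugin __init__.py that does not block SOPARE:
# the long running work happens in a background thread so run() returns fast.

import threading
import time

def do_something_slow(readable_results):
    # placeholder for your own long running code (HTTP calls, motors, ...)
    time.sleep(5)
    print(readable_results)

def run(readable_results, data, rawbuf):
    if ('some_word' in readable_results and 'another_word' in readable_results):
        worker = threading.Thread(target=do_something_slow, args=(readable_results,))
        worker.daemon = True  # do not keep SOPARE alive because of this thread
        worker.start()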

Ok, I’ll stop here for now. I’ll make a video tutorial whenever I have the time. In the meantime don’t hold back: ask questions, make suggestions and give feedback.

Happy coding 🙂

42 thoughts on “SOPARE architecture and plugins”

  1. Hello, me again, who would like to detect specific noises from a closing door. Can we dig a little deeper into the architecture of sopare here? 🙂

  2. Hi
    I am attempting to run the code with Python 2.7 in a Windows environment. I am getting some errors in the pickling.py routine. In general, should SOPARE run in a Windows 7 64 bit environment? Is there anything special I need to change in the configuration?

    Thanks
    BJ

    • Never tested SOPARE on Windows as SOPARE is developed on and for the Raspberry Pi. If you resolve the dependencies it might work, but I can’t guarantee it.

  3. Hello,
    I am trying to understand the logical architecture of sopare. However, I can’t figure out how the training takes place. Can you help me?

    • Sure, I’ll try to help, but I don’t have a clue what your point is … please explain where you are stuck and how I can help 🙂

      • I need to understand how the training is done: the algorithm and the dependencies to other developed modules.
        Basically, the log info gives me an idea about the modules used, but I still need to understand the algorithm.

        • The architecture overview should give you quite a good overview of the modules and the flow. And a brief description of how SOPARE works was given here and here 🙂

          • good, thank you
            I have another question: the features characterising each word are length, frequency and length, is that correct?

          • Frequencies, shape and length are the most prominent characteristics, but as you can configure quite a lot of stuff, you are in control.

  4. Hey,

    I am currently working on a speech recognition for the control of a mobile robot. All commands are limited to two consecutive words. After successful training of the vocabulary, I start the endless mode via „./sopare.py -l“. Here, however, my problem arises. SOPARE recognizes a few words, but hangs up after those few words (I can’t say after which number of words it happens … sometimes after 3, sometimes after 24) and stops responding to my audio input. Have you ever experienced this problem? And could you help me with that?
    I would be very grateful

    Greetings, Julian

    • Hey. Do you have a custom plugin? Do you experience 100% CPU usage after some time? How many trained words are in the dictionary? One potential reason for the described behavior could be that one process or task takes very long and the work adds up to a point where the system starts lagging and therefore SOPARE becomes unresponsive.

      • Hey, I think this problem could be solved by allocating one processor core per process to guarantee full parallel computing.
        I haven’t tried that yet, but I’m working my way through CUDA computing and I might post about it soon if that works and, of course, if you agree.

      • Hey. I haven’t checked the part with the CPU usage so far as I didn’t have time last week, but I will do it tomorrow. And yes, I do have a custom plugin which is responsible for the movement of the robot. In my dictionary are 7 words, each trained around 5 times. Maybe I should try to start my own code as a thread as you recommended. I will come back here after I have tried it. Thank you so far for your advice!

      • Hey. I just want to inform you that the problem was caused by a mistake in my custom plugin. So there is no problem at all with your speech recognition engine :). But thank you very much for your quick response and your help.

  5. Great work and it is very fun using this tool. I am intermediate with Python and I know my commands and operations. It is rare to see educational material or tutorials about back-end architecture. Where can I go to learn more about how Python REALLY works? Such as how scripts interact and communicate, etc. I have looked around but not had much luck; not sure if I am using the wrong words.

    • I already answered this question for you. If you don’t like the answer you can try asking a different, more specific question. BTW: you have asked the same question 3 times so far …!?

  6. I had an error when implementing plugins.
    When I tried to write to the __init__.py file, the following error appeared:
    Error writing __init__.py: permission denied
    Please, I need your help to fix this problem.

  7. Hi,
    I have the same problem as Julian. When I run ./sopare.py -l, after 3 or 4 words that I say, sopare stops responding and doesn’t react to my voice. I’m not using any plugin and I just followed the step by step instructions. I have 2 words in my dict, each trained about 4 or 5 times. I’m trying to control a robot and add my own plugin, but right now it’s just sopare itself and the problem occurs. Would you please guide me how to solve it?
    Plus, I want to use the speech recognition for long periods. Is that possible, or are there any limitations or something that needs to be done?
    Thanks

  8. Hi, do you think it’s possible to recognize phonemes rather than words? Then the API wouldn’t need as many resources. I would like to use SOPARE for my own home assistant. There I’ve got some servers which will process the NLP afterwards.

    Thanks in advance.
    BR Daniel

  9. Hey, is there a way you can make a sequence? So you say for example „Bathroom“, the system processes the command, then asks you for your next command like „lights“, then some processing, and after that another command from you. That would be really helpful for me. I hope it is clear what I mean.

    Thanks for the great documentation.
    VG Christopher

  10. ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
    ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition ‚defaults.bluealsa.device‘
    ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4996:(snd_config_expand) Args evaluate error: No such file or directory
    ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM bluealsa
    ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition ‚defaults.bluealsa.device‘
    ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:4996:(snd_config_expand) Args evaluate error: No such file or directory
    ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM bluealsa
    connect(2) call to /tmp/jack-1000/default/jack_0 failed (err=No such file or directory)
    attempt to connect to server failed

    • Yeah, an unrelated snippet of ALSA output without further context. What the heck!

      Ok, let’s comment on this and assume it has something to do with SOPARE: if you run SOPARE and an error occurs, the keyword error appears in the output.

      Note to myself: on my TODO list there is now an issue to mark ALSA output as ALSA output and separate it completely from SOPARE output, as 95% of all problems, questions and stuff revolve around ALSA output.

  11. Hey bishoph, how are you? I’m new to Python and I have some doubts. Using SOPARE, can I get the characteristic chain of the word that was spoken and recognized as a trained word? My goal is to find some patterns and differences through this chain of features and to be able to estimate the distance from the sound source to the microphone. I thought about maybe using the THRESHOLD of words, doing a calibration and determining an estimated distance value for each THRESHOLD. Thank you.

    • You can train sounds with SOPARE and SOPARE may get you predictions and results based on your training. Distance estimation is beyond SOPARE’s scope. You should look at a different project or approach.

      • Yes, I want to use SOPARE just for the recognition of the sound; regarding the distance estimation I will still implement something myself. So I would like to know if I can access the chain of characteristics that SOPARE generates for the comparison with the trained sound. Do you know if I can save this file? Then I can see if I can work with that data.

        • Hey bishoph, how are you? I was reading some topics in git and found the --wave function of SOPARE. Does this function create wave files only for the words trained and saved in the dictionary, or are the words that are recognized while the program is being executed also saved? Could you help me with this question?

          • Hey. SOPARE creates „.wav“ files for the recognized words and their parts as SOPARE processes them. You find the files in the folder „tokens/“.
