Sopare basic usage. Voice controlling a magic mirror

In the last post I did a quick sopare intro and we controlled a robotic arm via voice. Today I want to focus on simple one word commands and how to add custom features to sopare. And because I need something to control, I use the smart mirror web interface from one of my upcoming projects. It's not yet a mirror, but the frame, screen, a USB mic and some more parts are already assembled, and I think this is a perfect example of how to use sopare with one word commands.

The magic mirror prototype that will be controlled via voice. Offline and in real time.

So, let's start with the requirements. Obviously, you need a Raspberry Pi 2/3 and a microphone. All my sopare systems use USB microphones. Here are the mics I'm using for different purposes:

  • Blue Microphones Snowball USB Mic (light control)
  • Samson Meteor Mic USB Studio (robotic arm control)
  • Foxnovo Portable USB 2.0 Mic (magic mirror control)

And of course, you need sopare in the latest version. Before we start the training, it's a good time to check and adjust the mic input. As I'm working most of the time with a headless Pi, „alsamixer" is my preferred tool. Just make sure that the input level is neither too high nor too low. I get good results with the mic input level around the 2/3 mark, just below the red sector (see the video for a visual reference).
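
If you are not sure which ALSA card your USB mic is, you can list the capture devices and pass the card number to alsamixer (the card index below is just an example and may differ on your system):

arecord -l
alsamixer -c 1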

There is one more thing to check and to adjust: the „THRESHOLD" value in the sopare/config.py file. This threshold defines when sopare „starts listening". Or, in other words, everything below this threshold is treated as noise or environment sound. But for the training and, later on, the voice recognition, the threshold must be low enough to trigger the processing. This depends a lot on your environment. To figure out your ideal threshold you can start sopare in debug/verbose mode and just watch what's happening. If you see a lot of lines scrolling down the screen while you are not saying your voice command, then the threshold is too low. If you start talking and nothing happens, the threshold is too high. Find your personal threshold by experimenting with the following command:

./sopare.py -l -v

and adjust the config setting accordingly.
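
For orientation, the relevant entry in sopare/config.py looks something like this (the value shown is just an example, your ideal threshold will most likely differ):

# excerpt from sopare/config.py
# everything below this value is treated as noise/environment sound
THRESHOLD = 400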

Now we are ready for training. Training is as easy as eating cake. In my case I want to train a word that sopare should then recognize under the identifier „main". All I have to do is start the training by running the following command:

./sopare.py -t main

When you see the line „start endless recording" on the screen it's time to train by saying a word, playing a tone or making the sound that should be trained. Repeat this at least 3 times. After training, we need to compile the dictionary which contains the information about the trained sound. This can be done with the command:

./sopare.py -c

We are now ready to test the trained command(s):

./sopare.py -l

You should now test whether your trained command is recognized. Normally you get quite far with the default settings, but be aware that short one word commands will result in false positives with the default settings. How to gain more precision is a topic for one of the next posts, as it is quite complex and requires more background than fits into this basic usage post.

Press CTRL-c to terminate the process and return to the command line. In the rare case that the process does not terminate completely, find out the PID with the command

ps aux | grep python

and kill the process(es) via

kill -3 PID
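
If „pkill" is available on your system, finding and killing the process can also be done in one step:

pkill -f sopare.py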

Now you may want to ask: How can I control stuff with sopare? Where does my code go? Good questions. Sopare comes with a simple plugin interface. Just take a look into the directory „./plugins". There you find the standard plugin „print/__init__.py". This plugin contains only two lines of code that are relevant in terms of coding:

def run(readable_results, data, rawbuf):
    print readable_results

The interesting part for the basic usage is the list value „readable_results“. In my case the value contains something like

[u'main']

If you simply copy the print plugin you can add your own code and control stuff. Offline and in real time. In any language you want and for any sound you want.
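
To give you an idea, here is a minimal sketch of what such a copied plugin could look like. It simply forwards the recognized identifier to a local web interface; the URL and the path are made up and only stand for whatever your own project (in my case the magic mirror) expects:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# plugins/mirror/__init__.py - a copy of the print plugin with custom code.
# The URL below is a made-up example, not a real magic mirror API.

import urllib2

MIRROR_URL = 'http://localhost:8080/command/'

def run(readable_results, data, rawbuf):
    # readable_results is a list of recognized identifiers, e.g. [u'main']
    if len(readable_results) == 1 and readable_results[0] == 'main':
        try:
            urllib2.urlopen(MIRROR_URL + readable_results[0], timeout=2)
        except Exception:
            # never let a failed request break the recognition loop
            pass

Next time we will talk a bit more about configuration and customization but for today I have to stop. Have fun and tell me what you think and where/how you are using sopare 🙂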

9 thoughts on “Sopare basic usage. Voice controlling a magic mirror”

  1. Hi Bishoph,

    thanks a lot for this post and also the video tutorial. Very much appreciated as I am looking for a solution to voice control my home automation system.

    Everything works now according to your great instructions, however the precision of the word recognition could be better. Sometimes, completely unrelated words are detected as a key word. I think that's what you call a false positive.

    Would very much appreciate some hints or another great tutorial!

    Best regards
    Frank (also from Germany 😉 )

  2. Hi Frank, you are welcome and cool that it works so far. In terms of precision and better recognition I guess I just need to step up and explain how this works 😉 All I can say right now is that it's all about configuration… and the video tutorial would be really helpful to explain that. Please be patient until my priorities allow me to provide a new blog post and the corresponding video tutorial.

  3. Thanks for your feedback.

    Well I am eagerly waiting for your tutorial then. Would really appreciate those tips, I’m hitting kind of a ceiling here.

    Also: How do you make sure that commands are recognised from different positions in the room? I have the issue that a command recorded directly next to the mic also only gets recognised next to the mic.

    Cheers

  4. Hope I get some time next weekend…

    In regards to different positions in a room: it depends a lot on your microphone, where the mic is located and on the configuration. For example, if the mic offers an omnidirectional mode and is located in the center of the room, this would be my preferred setup. Of course the training and the recognition go hand in hand. If you have a mic that covers only one direction and you train very closely to the mic, but in reality you want results from different angles and distances, you can either lower the similarity settings, use a different mic, tweak other config options or train the same word more often under different conditions. There is no silver bullet I can offer here, so let's see if and how the next tutorial sheds some light on this topic 🙂

    • Could you help me with my project? I am doing a project where I control a car via voice with 9 voice commands:
      1. right
      2. left
      3. slow
      4. fast
      5. forward
      6. backward
      7. open
      8. close
      9. stop
      I want it to perform these functions. I use an h-bridge l29 driver.

      • Hi. Not sure what you expect in terms of help, but let's give it a try and start with the basic steps:

        1) Get a mic and configure it
        2) Install sopare and resolve the dependencies
        3) Check your mic input levels via ./sopare.py -l -v and adjust the config settings accordingly
        4) Train your 9 commands, each word at least 3 times via ./sopare.py -t YOUR_VOICE_COMMAND
        5) Compile the dictionary: ./sopare.py -c
        6) Test if the words are recognized and adjust the config settings for good results (see the blog post about precision and accuracy)
        7) Get some experience by checking the commands under different conditions and with background noise. Also check what happens if you give the commands from far away or from different angles. Adapt and configure based on your practical knowledge. You may need to do a GOTO 4 or even consider a different mic and GOTO 1 😉
        8) Copy the sopare print plugin and write your very own h-bridge l29 plugin that checks for the command and then sets the GPIO pins low/high according to the action. I recommend adding some timeout logic (see the sketch below).
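
        As a rough sketch (not tested and with made-up pin numbers!) step 8 could look something like this. IN1 and IN2 only stand for the two direction inputs of your motor driver, and only three of the nine commands are shown:

        #!/usr/bin/env python
        # -*- coding: utf-8 -*-

        import time
        import RPi.GPIO as GPIO

        GPIO.setmode(GPIO.BOARD)

        IN1 = 11  # hypothetical pin for the "forward" direction
        IN2 = 13  # hypothetical pin for the "backward" direction
        GPIO.setup(IN1, GPIO.OUT)
        GPIO.setup(IN2, GPIO.OUT)

        RUN_TIME = 2  # simple timeout so the motor stops again

        def run(readable_results, data, rawbuf):
            if len(readable_results) != 1:
                return
            command = readable_results[0]
            if command == 'forward':
                GPIO.output(IN2, GPIO.LOW)
                GPIO.output(IN1, GPIO.HIGH)
                time.sleep(RUN_TIME)
                GPIO.output(IN1, GPIO.LOW)
            elif command == 'backward':
                GPIO.output(IN1, GPIO.LOW)
                GPIO.output(IN2, GPIO.HIGH)
                time.sleep(RUN_TIME)
                GPIO.output(IN2, GPIO.LOW)
            elif command == 'stop':
                GPIO.output(IN1, GPIO.LOW)
                GPIO.output(IN2, GPIO.LOW)

        Keep in mind that time.sleep() blocks the plugin while the motor is running; a real implementation would rather use a timer or a separate thread for the timeout.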

  5. Hi,
    according to the reference above: if I detect 'forward', I should define something like

    if readable_result == 'forward':
        turn GPIO HIGH
        delay
        turn GPIO LOW

    Please correct me if I am wrong.

    Regards

  6. Hello

    It would be great if you could explain how to control RPi.GPIO with a trained keyword.
    Can you show an example of controlling an LED through a trained voice command, e.g. after training the keyword, if 'OFF' is detected it switches off the LED light?

    regards

  7. Hi,

    not sure if you saw the robotic arm control source code:
    https://github.com/bishoph/Misc/blob/master/robotic_arm_control.py

    Controlling GPIO works the same way. Here is a simple plugin example with „on" and „off" (not tested!):

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    import RPi.GPIO as GPIO

    # use physical pin numbering
    GPIO.setmode(GPIO.BOARD)

    LED = 11
    GPIO.setup(LED, GPIO.OUT)

    def run(readable_results, data, rawbuf):
        # expect exactly one recognized identifier per result
        if (len(readable_results) == 1 and readable_results[0] == 'on'):
            GPIO.output(LED, GPIO.HIGH)
        elif (len(readable_results) == 1 and readable_results[0] == 'off'):
            GPIO.output(LED, GPIO.LOW)

    Have fun!
