Raspberry Pi and offline speech recognition

Yes, I must admit the test phase took longer than initially thought. But good things take time, right? When you develop speech recognition software or a pattern detection system, things can go horribly wrong, and the learning curve gets steep at some point.

But anyway, what the hell am I talking about? In a nutshell: SoPaRe, the SOund PAttern REcognition project. With Sopare and a Raspberry Pi (technically it works on any Linux system with a multi core environment) everybody can voice control stuff. Like lights, robotic arms, general purpose input and output … offline and in real time.

Voice, speech, microphone, Raspberry Pi, Sound Pattern Recognition with Sopare, voice control a robotic arm

Even without a wake-up word. The local dependencies are minimal. Sopare is developed in Python and the code is on GitHub. Cool? Crazy? Spectacular? Absolutely! I made a video to show you what’s possible. In the video I control a robotic arm. With my voice. Running Sopare on a Raspberry Pi. In real time. Offline. And it is as easy as pie. You are pumped? So am I. Here it is:

I may prepare some more tutorials about fine tuning, increasing precision and the difference between single and multiple word detection. Let me know what you are using Sopare for, what’s missing or about potential issues.

Happy voice controlling. Have fun 🙂

34 thoughts on “Raspberry Pi and offline speech recognition”

  1. Hello Martin!

    Is it also possible for sopare to learn non-speech patterns? I’m thinking of monitoring our coffee maker at the office, which, depending on the brew, has its own distinctive melody. 😉

    Kind regards,
    Sebastian

  2. Hi Sebastian,

    yes, that should be possible. Currently many settings in the config are optimized for the human voice, like LOW_FREQ and HIGH_FREQ. But you can configure sopare to filter specific frequencies and, for example, pay more attention to the dominant frequency (f0) or to the wave model instead of the FFT result. It’s also possible to train really roughly (MIN_PROGRESSIVE_STEP and MAX_PROGRESSIVE_STEP), which could help if the sound is not really repeatable. I think the hard part is to „separate“ the coffee machine specific sound … but it sounds like fun and you should give it a try 🙂
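    For experimenting, the tweaks could look roughly like this in sopare/config.py – just a sketch: the option names are the ones mentioned above, but the values are pure guesses you would have to tune for your coffee maker:

```python
# Hypothetical sopare/config.py tweaks for a non-speech sound.
# The option names come from the discussion above; the values are
# guesses and need tuning for the actual machine.
LOW_FREQ = 100             # widen the band below typical speech...
HIGH_FREQ = 8000           # ...and above it, to catch machine noise
MIN_PROGRESSIVE_STEP = 25  # coarse, "rough" training steps for
MAX_PROGRESSIVE_STEP = 25  # sounds that are not exactly repeatable
```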

  3. Hi Martin,
    thank you for your nice idea and project, great work. Voice recognition independent from Google&Co is great for an autonomous project. I’d like to try it, could you please add a little manual for dummies? Would be nice! I’m not so sure how to handle all the files …
    Kind regards,
    Markus

    • Hi Markus,

      great that you like the project 🙂 The current plan is to provide a short manual „how to start from scratch“ within the next two weeks. So please stay tuned!

      • Hi Bishoph,
        such a tutorial would also help me a lot.
        For example, I don’t know where I should create the folders „samples“ and „tokens“. Is „mkdir samples“ the correct command? And how do I start it afterwards?
        For me, a very basic „how to start from scratch“ video from installation to learning the first word would be super useful.

        thanks

        • Hi. I really thought that the currently available information was enough, sorry for that – I’ll do another very simple one with all the steps to start, incl. the git/bash commands that are required. But don’t expect this very soon 😉

          In the meantime, it’s indeed
          „mkdir samples“
          and
          „mkdir tokens“

          In the latest testing branch I’ve added the section „Quick start, useful commands and usage examples“ to the readme.md file:

          https://github.com/bishoph/sopare/tree/testing

          Hope that gets you going. If you still have questions please ask me as it helps me to identify what content is currently missing and what steps are unclear.

          Have fun 🙂

  4. Thank you very much for sharing this project.
    It is very interesting and works without any effort on Orange PI LITE.

    Question: Can it take the patterns from files already generated in .wav format?
    This would avoid having to use the microphone to capture each of the sounds.

    Greetings from Mallorca.

    • You are welcome and great to hear that it works for you on an Orange PI LITE!

      In regards to your question: training from „wav“ files is currently not supported. But there is an option to store recorded input:
      ./sopare.py -w samples/test.raw
      and the recorded file can be used for training or testing:
      ./sopare.py -r samples/test.raw -t test -v

      As these options are already available, it should be possible to create a converter that transforms „wav“ to „raw“ files to achieve what you are asking for. I’ve added this to my to-do list, but I remember that there was an issue converting „wav“ files, so it’s not guaranteed that this becomes a feature 😉
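      A converter sketch could be as small as this – assuming the input „wav“ is already signed 16 bit mono PCM (the format SOPARE records itself); the function name and paths are just examples:

```python
import wave

def wav_to_raw(wav_path, raw_path):
    """Strip the WAV header and keep the raw little-endian samples."""
    with wave.open(wav_path, 'rb') as w:
        # SOPARE expects signed 16 bit mono PCM, so refuse anything else
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError('expected 16 bit mono PCM input')
        frames = w.readframes(w.getnframes())
    with open(raw_path, 'wb') as out:
        out.write(frames)  # headerless raw audio, like sopare.py -w writes
```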

      Saludos

      • Thank you for answering so quickly!

        With your suggestion I have found this great tutorial where it explains how to do the conversion
        https://www.hellomico.com/getting-started/convert-audio-to-raw/

        After that, I have used:
        ./sopare.py -r samples/test.raw -t test -v
        ./sopare.py -c
        But there are no changes in the dict.json file, so it does not recognize the new „test“ pattern.
        Is there any way to attach or send the test.raw so that you can take a look and validate this method of conversion?

        Thanks in advance!

      • I have also tried:

        ./sopare.py -w samples/test.raw
        ./sopare.py -r samples/test.raw -t test -v
        ./sopare.py -c
        The same, no changes.

        But, if you open the test.raw file generated by sopare in audacity, the format is Signed 16 bit PCM – Little-endian – 1 Channel (MONO) and the audio is correct, same like generated by the conversion from wav to raw.

        • Hmm, it’s possible that the volume is below the THRESHOLD and therefore „test“ doesn’t show up in the dict.json file as it was never learned, but this is just a wild guess. I tested again and, as expected, everything works in my environment 🙂

          I would like to suggest moving this thread and the issue we are discussing to GitHub, as you can easily attach files and track the issue, and I can close it when it’s done: https://github.com/bishoph/sopare/issues/new

          BTW: You can check what’s already learned with the command:

          ./sopare.py -o

          The output is a list of string pairs, each consisting of an ID and a UUID. The UUID corresponds to a JSON file in the dict/ directory (even if the suffix pretends something different) …

          In the case you want to see only unique learned IDs, this command chain is quite handy:

          ./sopare.py -s '*' | sed 's/[^a-z].*//' | sed 's/\///g' | grep -v '^$' | sort | uniq
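          If the sed quoting gets fiddly, the same extraction works in a few lines of Python – a hypothetical helper, assuming the „id uuid“ pair format that ./sopare.py -o prints:

```python
def unique_ids(lines):
    """Collect the unique learned IDs from "id uuid" output lines."""
    ids = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # expect exactly "id uuid"
            ids.add(parts[0])
    return sorted(ids)

print(unique_ids([
    'hello 8e84d433-90de-4744-b170-35eaaa1baf34',
    'test c23b527e-6aaa-4df8-a080-90305a3b1007',
    'hello 18190a99-a6cd-4194-8e6f-f972395921f8',
]))  # prints ['hello', 'test']
```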

          Thanks y un saludo 🙂

  5. Hi Martin:

    I am very interested in getting SoPaRe working on a Raspberry Pi 3 with the Raspbian Jessie OS so that I can use it to create a voice controller for the trolling motor on my fishing boat. It sounds like just the thing I need, and I appreciate your efforts and that you shared it.

    I went through a fresh Raspbian Jessie install on a 16 GB SD card, performed the OS update, cloned SoPaRe and installed all the dependencies listed on your web link describing what to install. But when I try:

    $ sopare/sopare.py -l -v

    from my home directory, after about 5 seconds I get:

    sopare 1.3.0

    I try speaking into the microphone and get no response. I confirmed that the microphone is working through other software. I tried moving the threshold value down from 400 to 200 and then down to 10 but still no response. After about 10-15 seconds with no response, I get:

    Segmentation fault

    So, I’m at a loss as to what I did wrong. Do you have any insight into what could be wrong? Is there any way to get more detailed diagnostics that might point me toward what the problem might be? Is there any information that I could generate and send you to diagnose the issue?

    Thanks!

    Jeff

    • Hi Jeff,

      a segmentation fault means that something went wrong in regards to memory access. As SOPARE is pure Python, the issue must be located somewhere else. My guess is that pyaudio is the culprit, as there are others with a similar problem and without SOPARE:
      https://www.raspberrypi.org/forums/viewtopic.php?f=32&t=77696

      Unfortunately, I have no clue about the root cause and a quick search does not show any steps how to solve this.

      Please share your sopare/config.py and your sound card model/name and I will try to reproduce the issue. You may want to file a new GitHub issue, as this forum is not very handy in terms of file uploads and bug tracking: https://github.com/bishoph/sopare/issues

      Let’s see what we can do!

  6. Great work! This sounds like the software I was looking for for quite a while now.

    Just to be sure about my plan/wish:
    Could it be possible to set up a Raspberry with this software, attach a microphone to the Raspberry, plug the Raspberry via USB to a PC and (for example) say „e, space, k, k, p, l, enter, 1, 6, s“ and the Raspberry acts like a keyboard?

    Could I combine this with a footpedal to activate/deactivate the mic?

    Could I use my own commands for different keystrokes? For example, „extract“ triggers the keystroke „e“, „delete“ triggers the shortcut „Ctrl+X“.

    Could it talk to a specific software via the API?

    Thanks and please excuse my noobness. This is completely new to me. Due to a condition with my wrists I am looking for alternatives to use a software.

  7. Hey Thorsten,

    you want a remote voice controlled keyboard, right? Some of your use cases are technically feasible, like turning the mic on/off via a pedal (you could leverage the GPIO interface), but I have no clue about the Raspberry-to-PC bridge, to be honest.

    If you want to give it a try and develop this yourself, I would start with a proof of concept of the most important stuff and see how far you can go. See, SOPARE is a tool that was designed to learn certain sounds and to make predictions for sound input. With the simple plugin interface you can assign any further logic to the given predictions – including shortcuts and alike. In theory this could work, even if it sounds like something SOPARE was not designed for in the first place 😉
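    A word-to-keystroke plugin could be sketched like this – treat the run() hook and its arguments as an assumption based on the plugin examples shipped with SOPARE, and the print() stands in for whatever would actually inject the keystroke (e.g. python-uinput):

```python
# Sketch of a SOPARE plugin mapping recognized words to keystrokes.
# KEYMAP and the print() are placeholders; sending real key events
# would need an extra library such as python-uinput.
KEYMAP = {
    'extract': 'e',
    'delete': 'ctrl+x',  # "Strg" on a German keyboard = Ctrl
}

def run(readable_results, data, rawbuf):
    # readable_results holds the predicted IDs/words for the last input
    for word in readable_results:
        key = KEYMAP.get(word)
        if key is not None:
            print('would send keystroke: %s' % key)
```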

  8. Hello Martin,

    I am going to build a robot from scratch, and one of my ideas is to control the bot with my voice. One of the first steps is to get the voice module running.

    So I tried your solution and I get these messages:

    ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition ‚cards.bcm2835.pcm.front.0:CARD=0‘
    ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
    ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
    ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM front
    ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
    ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
    ….

    Any Idea, whats going wrong?

    Sorry for my bad English.

    Regards

    Michael

    • Hi Michael,

      this is just some information telling you that you should (not must, as there is no real error) clean up your alsa.conf. I recommend simply commenting out the lines that appear when you start SOPARE.

      Simply edit your alsa.conf with your favorite editor:

      sudo nano /usr/share/alsa/alsa.conf
      

      and comment out some lines like this one:

      #pcm.front cards.pcm.front
      

      Save the file and the info lines will be fewer or even gone.

      Have fun!

  9. Hi Martin,

    thanks a lot for your quick answer. As those messages only have an informational character, I think these lines will show more interesting information; they appeared when I started the command ./sopare.py -t „test“:

    Cannot connect to server socket err = No such file or directory
    Cannot connect to server request channel
    jack server is not running or cannot be started
    JackShmReadWritePtr::~JackShmReadWritePtr – Init not done for -1, skipping unlock
    JackShmReadWritePtr::~JackShmReadWritePtr – Init not done for -1, skipping unlock
    Traceback (most recent call last):
    File „/usr/lib/python2.7/multiprocessing/queues.py“, line 268, in _feed
    send(obj)
    IOError: [Errno 32] Broken pipe

    The unit test did not show any errors:

    sopare 1.5.0
    starting unit tests…
    starting analyze tests…
    analyze test preparation…
    testing analyze get_match…
    testing normal conditions (1)[u’test1′] == [u’test1′]
    testing normal conditions (2)[u’test1′, u’test3′] == [u’test1′, u’test3′]
    testing normal conditions (3)[u’test1′, u’test3′, u’test2′] == [u’test1′, u’test3′, u’test2′]
    testing leading space [u’test1′, u’test3′, u’test2′] == [u’test1′, u’test3′, u’test2′]
    testing ending space [u’test1′, u’test3′, u’test2′] == [u’test1′, u’test3′, u’test2′]
    testing correct order [u’test1′, u’test3′, u’test2′, u’test1′, u’test3′, u’test2′] == [u’test1′, u’test3′, u’test2′, u’test1′, u’test3′, u’test2′]
    testing strict length [u’test1′, u’test3′, “, u’test2′] == [u’test1′, u’test3′, “, u’test2′]
    testing false leading results [“, u’test1′, “, u’test2′] == [“, u’test1′, “, u’test2′]
    analyze tests run successful.
    filter test preparation…
    testing filter n_shift…
    testing n_shift [5, 6, 7, 8, 9, 10, 11, 12, 13, 14] == [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
    testing n_shift [15, 16, 17, 18, 19, 20, 21, 22, 23, 24] == [15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
    testing n_shift [25, 26, 27, 28, 29, 30, 31, 32, 33, 34] == [25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
    filter tests run successful.
    unit_tests run successful!
    done.

    I am running Raspbian 9

    Thanks for your help

    Michael

    • Hi Michael,

      happy to help. There is still no error you should worry about. The broken pipe is not nice – maybe you can get rid of it by changing the config: try to find a good „THRESHOLD“ for your environment by starting the audio test „python test/test_audio.py“. After changing the config you should be able to train with the default values and get quick results. Once you have first results, tweak more for better results.

      Have fun and happy voice controlling!

  10. Hello Friend,
    I am happy to say hello, and I do so with the intention of talking about your beautiful project „sopare“. It is very interesting and I would like to know more about it; I hope this is no bother.

  11. I apologize, I’m new to this.
    It shows me the following error.
    Could you help me with the solution?
    Traceback (most recent call last):
    File „./sopare.py“, line 23, in
    import sopare.util as util
    File „/dev/sopare/sopare/util.py“, line 29, in
    from scipy.io.wavfile import write
    ImportError: No module named scipy.io.wavfile

  12. Hello Martin,

    I am still a first-year student and not such a geek in software and programming. We are building an Arduino car that has to park itself automatically. All of that is done, but we want to add more features to it, and one of these features is controlling the car with voice. So we have an Arduino on the car and we want to add a Raspberry Pi 3 to communicate with the Arduino. Is that something that can be done with your project?

    best regards, Amjad

    • I don’t know your hardware and software stack, so I can’t give you any advice here. But I want to make a statement: even though I’m a geek and a nerd, I would not want to drive a car that starts a parking maneuver on a voice command. Anyway, if you are talking about a toy car or a prototype and want to see how far you can go and learn about the odds, just give it a try and have fun 🙂

  13. Incredible work! I really appreciate the simplicity of it. I would just like to clarify a few things. Is this how the process should go to train a new word?

    – Run command – ./sopare.py -v -t test
    – Actually speak into the microphone and repeat the word „test“ until it stops
    – Do I repeat the above 2 actions more than once? (as you can see in my CLI log below, that is what I did)
    – To train a second word, do I just run the same command again and perform the same steps?
    – ./sopare.py -v -t hello

    If that is correct: when I start the endless loop and speak the words „Test“ and „Hello“, it doesn’t appear to recognize anything, as the [] results that are returned are blank. Should they show the words „test“ and „hello“ if it recognizes the words?

    Thank you for everything you’ve done!

    pi@raspberrypi:~/dev/sopare $ ./sopare.py -c
    sopare 1.5.0
    recreating dictionary from raw input files…

    pi@raspberrypi:~/dev/sopare $ ./sopare.py -o
    sopare 1.5.0
    current entries in dictionary:
    hello 8e84d433-90de-4744-b170-35eaaa1baf34
    test c23b527e-6aaa-4df8-a080-90305a3b1007
    hello 18190a99-a6cd-4194-8e6f-f972395921f8
    hello b29dfece-4333-4fed-81e4-d90772d38103
    test d042f532-6033-47b5-8462-69a1da0594e4
    hello 54ee4fb8-f288-4028-9b9a-7a32897cba40
    test 63653068-9b26-4b30-80a5-b4960c521953
    hello 500e526a-98b0-4e5a-b377-f79252b8299b
    test b0e18dfa-cf51-484d-811c-57c5f272dc58

    pi@raspberrypi:~/dev/sopare $ ./sopare.py -l
    sopare 1.5.0
    *** BUNCH OF ALSA WARNINGS WERE HERE ***
    Cannot connect to server socket err = No such file or directory
    Cannot connect to server request channel
    jack server is not running or cannot be started
    JackShmReadWritePtr::~JackShmReadWritePtr – Init not done for -1, skipping unlock
    JackShmReadWritePtr::~JackShmReadWritePtr – Init not done for -1, skipping unlock
    []
    []
    []
    []
    []
    []
    []
    []

    • Glad that you like SOPARE and the overall concept 🙂

      Q: „Do I repeat the above 2 actions more than once?“
      A: „Yes, I do this normally 3 to 5 times for one id/word“

      Q: „… do I just run the same command again and perform the same steps?“
      A: „Yes, exactly“

      Q: „Should they show the words „test“ and „hello“ in them if it recognizes the words?“
      A: „Yes, the recognized id(s) shows up as text in the brackets“

      I recommend following this blog post until you get results, and after that you can fine tune.

      Have fun 🙂 !
