Amazon Echo to HA Controllers

mdesmarais said:
How does the Echo do with distinguishing a voice if there is music playing that is NOT sourced from the Echo?
 
It's mixed.  We have a distributed audio system that uses ceiling speakers connected to Sonos.  For a lot of music, the Echo hears me fine.  But for music with strong and clear vocals (like country) the Echo simply doesn't understand what I'm saying.  If I look at the history, it's clear that it is picking up lyrics from the audio.
 
I wish there were a way to tie into the receipt of the "Alexa" keyword and send an external command to pause my Sonos as soon as the keyword is recognized (i.e., when the blue ring is lit up).  Eventually I suspect this will be possible with the API and bridge software, but I haven't found a way to do it yet.

Edit: To be clear, the Echo almost always picks up the keyword when music is playing; it just has trouble interpreting the subsequent command.  Hence my desire to pause (or mute) the music after the keyword is acknowledged.  This is what the Echo does when you're playing music from the Echo itself.
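
If that wake-word hook ever becomes available (Amazon doesn't expose it today), the Sonos side would be simple.  Something like this rough Python sketch using the third-party soco library (zone handling and error cases simplified, purely illustrative):

import soco

def pause_all_zones():
    # Pause every Sonos zone so the Echo can hear the command clearly.
    for zone in soco.discover() or []:
        try:
            zone.pause()
        except soco.exceptions.SoCoUPnPException:
            pass  # zone wasn't playing anything

def resume_all_zones():
    for zone in soco.discover() or []:
        try:
            zone.play()
        except soco.exceptions.SoCoUPnPException:
            pass

# e.g. call pause_all_zones() when the wake word is detected,
# then resume_all_zones() once the command has been handled.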
 
Geez, just started the jar file (emulator) on the RPi2 and the CPU spiked (hadn't seen it do that before today; never paid attention).  Will move it to the Haswell i-series automation mothership.
 
i.e.:
 
java -jar -Djava.net.preferIPv4Stack=true amazon-echo-bridge-0.1.3.jar --logging.level.com.armzilla.ha.upnp=DEBUG --logging.file=ha.log --upnp.config.address=xyz
 
CPU: 108%

It is staying there right now.  Well, I can run it on another box.
 
Thanks to the interest generated by this thread, the very positive reviews, and the future potential, I have just put in an order for an Echo; it should arrive next week.
 
I have some experience with speech recognition using Microsoft SAPI & speech recognition APIs (e.g. grammar construction for command/control). I have also done simple phrase recognition for PIC microcontrollers using Dynamic Time Warping (sort of like fuzzy logic: find the closest pattern match for short phrases by comparing against templates), which was fun, informative, and worked well, especially considering the limitations (1K RAM!). The PIC micro listened for the control phrase (like Alexa for the Echo), then switched the local mic to the server for phrase recognition & speech synthesis. This setup worked well, and I had recognition grammars set up to play music and control lights. That was the old system; I'm prototyping a new system to replace the PIC micro with a more capable Raspberry Pi using Windows 10 & Cortana (once available on Win10 IoT).
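
For anyone curious what the DTW matching boils down to, here is a rough sketch in Python (for illustration only, not the original PIC code; it assumes each phrase has already been reduced to a sequence of per-frame feature values):

def dtw_distance(a, b):
    # Classic dynamic-time-warping distance between two feature sequences.
    n, m = len(a), len(b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # frame-to-frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def recognize(utterance, templates):
    # templates: dict mapping phrase -> stored feature sequence.
    # The template with the smallest warped distance wins.
    return min(templates, key=lambda p: dtw_distance(utterance, templates[p]))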
 
I'm interested to see how the Echo works and if it possibly could replace what I'm planning with Win10 IoT.
 
 How does the Echo do with distinguishing a voice if there is music playing that is NOT sourced from the Echo?
 
Sometimes my kids blast the Echo itself at the highest volume and I cannot get it to respond via voice (crazy kids), so I have to get its attention manually.
 
I use the Echo for basic control of my Russound/Sonos system (turn on Russound zones, select and play Pandora stations on Sonos, etc.).  We don't typically blast music through my whole-house audio - it's just for casual listening.  I have no problem getting the Echo to hear me over it or anything else of normal volume (TV, etc.).  If I'm having a party with lots of people talking loudly, it's sometimes a challenge if I'm not close to the Echo, but it usually works fine.
 
 
Thanks to the interest generated by this thread, the very positive reviews, and the future potential, I have just put in an order for an Echo; it should arrive next week.
 
 
Did you get one at the $129 Prime Day deal?  I was almost tempted to get a 3rd Echo at that price but I restrained myself.
 
deandob said:
I have some experience with speech recognition using Microsoft SAPI & speech recognition APIs (e.g. grammar construction for command/control). I have also done simple phrase recognition for PIC microcontrollers using Dynamic Time Warping (sort of like fuzzy logic: find the closest pattern match for short phrases by comparing against templates), which was fun, informative, and worked well, especially considering the limitations (1K RAM!). The PIC micro listened for the control phrase (like Alexa for the Echo), then switched the local mic to the server for phrase recognition & speech synthesis. This setup worked well, and I had recognition grammars set up to play music and control lights. That was the old system; I'm prototyping a new system to replace the PIC micro with a more capable Raspberry Pi using Windows 10 & Cortana (once available on Win10 IoT).
What kind of microphone are you using?
 
To simplify, I could imagine talking into a wristwatch microphone and having it send the audio to a server for processing.  I can see how that might be preferable (and cheaper) than putting an echo in every room.
 
Hi NeverDie,
 
I used a simple electret mic, but I did build a special preamp as part of the solution that limited the amplification to just voice frequencies, used a threshold gate (only let through signals that were loud enough to be voice), and also an automatic gain control - all done via opamps. As well, I used a 'boundary' mic technique that makes the mic more sensitive. Still a simple solution, but I could get voice recognition across a large room with 80+% success (not as good as Echo / Kinect, but pretty good for an all-DIY solution).
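
For anyone doing the same thing in software instead of opamps, the digital equivalent is roughly a band-pass filter plus a crude threshold gate. A Python sketch (assumes NumPy/SciPy and 16 kHz mono float samples; purely illustrative, thresholds made up):

import numpy as np
from scipy.signal import butter, lfilter

def voice_band(samples, rate=16000, low=300.0, high=3400.0):
    # Band-pass to roughly the voice frequencies, like the analog preamp.
    b, a = butter(4, [low / (rate / 2), high / (rate / 2)], btype="band")
    return lfilter(b, a, samples)

def gate(samples, threshold=0.02, frame=320):
    # Crude gate: zero out frames whose RMS is below the threshold.
    out = samples.copy()
    for i in range(0, len(out), frame):
        chunk = out[i:i + frame]
        if np.sqrt(np.mean(chunk ** 2)) < threshold:
            out[i:i + frame] = 0.0
    return out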
 
My current plan is to have wall-mounted RPi2s with a similar audio/mic circuit/setup (and a capacitive touchscreen), but I will see how the Echo works; it may end up being a superior overall solution, and Amazon would love you to buy one per room!
 
MikeB said:
... I use the Echo for basic control of my Russound/Sonos system (turn on Russound zones, select and play Pandora stations on Sonos, etc.).  ...
 
Are you using one of the emulator hacks?  I'd like to hear how you've done this.
 
My Echos get here tomorrow, so I'm going to have a fun weekend.  :)
 
Here I have disabled the emulator hack from running on the RPi2.  It ran fine, but I noticed the RPi2 (a quad core) was at 100%-plus (over 100%) utilization.
 
Might put it on the bigger HA mothership, which runs a Haswell i-series and has 16 GB of memory.
 
 Are you using one of the emulator hacks?  I'd like to hear how you've done this.
 
My Echos get here tomorrow, so I'm going to have a fun weekend.  :)
 
Yes, I'm running the Hue emulator on a Windows server and using it to send REST commands to my ISY, which triggers my INSTEON devices, Russound system, Sonos, etc.  Absolutely love it!
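
For anyone wondering what actually ends up on the wire: the bridge just hits the ISY's REST interface. Something like this Python sketch (the IP, credentials, and node address are placeholders, and the exact bridge config fields vary by version):

import requests
from urllib.parse import quote

ISY = "http://192.168.1.50"   # your ISY's address (placeholder)
AUTH = ("admin", "admin")     # ISY credentials (placeholder)

def insteon(node, on=True):
    # ISY REST: /rest/nodes/<address>/cmd/DON turns a node on, DOF turns it off.
    cmd = "DON" if on else "DOF"
    r = requests.get(f"{ISY}/rest/nodes/{quote(node)}/cmd/{cmd}", auth=AUTH, timeout=5)
    r.raise_for_status()

# e.g. insteon("1A 2B 3C 1", on=True) - the Hue emulator fires the same sort
# of URL when the Echo hears "turn on the kitchen lights".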
 
Re: Amazon Prime Day to purchase the Echo: no, I'm not a Prime member, so I paid full price. I figured if I don't like it I can always eBay it for close to the purchase price (checking eBay, there doesn't seem to be a lot of Echos for sale at a reasonable discount).
 
Thinking the price will be dropping again soon.
 
Here I am redoing my "just for the Amazon Echo" stuff, so I shut the Echo off a couple of days ago... baby steps... building a sort of virtual clean-room environment for it with its own internet connection (all by itself), as I have done for other internet-connected products.  What I have tested so far works, which I like.  Now I want to work out a particular way to access the automation mothership, as I do prefer my zoned Russound audio and my collection of SAPI voice fonts to those of the Amazon Echo (it sounds nice, though).
 
Back to the Kinect stuff for a bit, as I have all of the hardware pieces installed but haven't tested the functionality of it... might as well... here using a Baytrail mini PC running W81lite, HDMI to a multitouch screen, and one Kinect (purchased a few to play with).
 
Here too, I have always played with TTS/VR (started a bit in the '80s with this stuff).  I did at one time use VR with MS SAPI for my automation mothership.  I shut off the VR stuff over the years, though, as it became a bit of a PITA.  I did leave the TTS stuff on and today use it a lot with my old automation mothership(s).
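
For anyone who wants to poke at the SAPI voice fonts from a script, a minimal sketch (Python with pywin32 on Windows; purely illustrative, the spoken text is just an example):

import win32com.client

voice = win32com.client.Dispatch("SAPI.SpVoice")

# List the installed SAPI voices ("voice fonts").
voices = voice.GetVoices()
for i in range(voices.Count):
    print(voices.Item(i).GetDescription())

# Speak through the default audio output.
voice.Speak("Garage door is open")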
 
ChrisCicc said:
The Kinect shouldn't be included in that list. Unlike everything else in the list, it doesn't record data and doesn't send it to the cloud.
 
According to the EPIC document (link from the article):
 
"Microsoft’s “always on” voice and motion recorder, called Kinect, is now installed in its
Xbox videogame consoles. 16 The Kinect sensor tracks and records users’ voice and hand gestures
when users say the word “Xbox” followed by various permissible command options. 17 For
example, users may turn on their Xbox console by saying, “Xbox on.” 18 In order to accomplish
this, the Xbox console monitors conversations taking place around it, even when Xbox is turned
off. 19 The Xbox console can also register users’ faces using the Xbox camera as well as record
users’ facial expressions and biometric data such as heartbeat rate. 20"
 
I cannot prove or disprove this statement but it got me thinking....
 
picta said:
And a view from another perspective on this (and other) wonderful toys:
 
https://finance.yahoo.com/news/generation-genuinely-creepy-electronic-devices-221000672.html
 
LOL at the guy who really thinks we have any privacy left.  That was lost 3.5 minutes after 9/11.  I have friends who left the DoJ in 2004; they joke that they were doing significant stuff then, and at what's been done in the last 10 years they can only shudder.
 
My view: The government already knows all, sees all. Corporations have already purchased both arms of the government, and have us squabbling over distractions. I may as well get some modicum of enjoyment out of the cloud.
 