Amazon Echo to HA Controllers

BTW, I would argue that the statement isn't "not many will do a whole platform switchover to gain voice functionality"; it's
 
"not many will do a whole platform switchover to gain far-field voice functionality, given the current viability of near-field voice functionality".
 
I have near-field voice functionality right now. RIGHT NOW. What I don't have is far field, and that alone isn't remotely close to the uplift needed to justify swapping out systems.
 
jkmonroe said:
You're like a kid standing next to a fire pit with a bottle of lighter fluid, aren't you?
 
:axe:
 
 
ChrisCicc said:
That's what Thread, Weave, etc. are ultimately hoping to accomplish. Until then, it's all proprietary.

Internally, we're having discussions about how far we want to take it. Should we build plugins for Control4 and the like?
 
I don't know anything about your company beyond a quick read of your website and demo attempt.  
 
But I will tell you that if you offered your Kinect VR software with the ability to have commands fire HTTP triggers, and priced it right, I would probably buy it. Package something up that does nothing but accept VR input from a Kinect, plug into something like a RasPi2, and spit out custom HTTP.
 
I already have a Kinect and a RasPi2, so I'd happily hand over $50 for a license of the VR-to-HTTP software. And if it doesn't work (not saying it won't, just being illustrative), then who cares, because it was only $50. If it does work, and I need four Kinects, then it looks like you get $200 instead of $50.
 
I should really do this for a living.
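Just to make that concrete, something like the rough sketch below is all the "VR to HTTP" half of that box would need to do once a phrase is recognized. It's purely illustrative, not anyone's actual product: the recognizer (Kinect/SAPI or whatever) is assumed to hand over a text string, and the phrases and URLs are made up.

# Minimal sketch: map a recognized phrase to a custom HTTP trigger.
# The recognizer itself is out of scope; assume it hands us a text string.
# All phrases and URLs below are hypothetical examples.
import urllib.request

PHRASE_TO_URL = {
    "lights on":  "http://192.168.1.10:8080/trigger?cmd=lights_on",
    "lights off": "http://192.168.1.10:8080/trigger?cmd=lights_off",
    "movie mode": "http://192.168.1.10:8080/trigger?cmd=movie_mode",
}

def handle_phrase(phrase: str) -> None:
    url = PHRASE_TO_URL.get(phrase.lower().strip())
    if url is None:
        return                      # not a phrase we know; ignore it
    urllib.request.urlopen(url, timeout=2)

handle_phrase("Lights On")          # would fire the lights_on trigger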
 
jkmonroe said:
I don't know anything about your company beyond a quick read of your website and demo attempt.  
 
But I will tell you that if you offered your Kinect VR software with the ability to have commands fire HTTP triggers, and priced it right, I would probably buy it. Package something up that does nothing but accept VR input from a Kinect, plug into something like a RasPi2, and spit out custom HTTP.
 
I already have a Kinect and a RasPi2, so I'd happily hand over $50 for a license of the VR-to-HTTP software. And if it doesn't work (not saying it won't, just being illustrative), then who cares, because it was only $50. If it does work, and I need four Kinects, then it looks like you get $200 instead of $50.
 
I should really do this for a living.
 
 
Meaning, in our case, since we own CQC (but anyone else could do whatever): Kinect -> COS -> HTTP sent -> CQC Web Server -> CQC -> Device? So, for a $50 Kinect + $50 COS, $100 total, there's VR in each room? And can each instance of COS be linked to a single programming location? Totally, there's your abstraction layer.
 
BTW, this is where I see Tasker's Achilles heel: no easy cloud component to share programming across devices. I have to export, manually move to GDrive, pull down from GDrive on the new device, and restore. And it usually takes a few attempts.
 
If you really want to be pushing the edge, create an Android version of COS that accepts voice input from Android devices and pushes out HTTP commands. And add the ability to share configs between the Android apps and the Windows instance.
 
And then in 2016 you can write a Linux version, so we don't have to spend $100 on a Windows license and it can run on an RPi.
 
There's zero chance CastleOS can catch up to home automation packages in terms of devices under control. Let them do what they do best, and focus your efforts on voice recognition.
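On the "single programming location" question, the simplest version is probably just having every room's instance pull its command set from one shared source instead of being programmed by hand (which is exactly the Tasker export/GDrive dance above). A minimal sketch, with a made-up server address and file layout:

# Each room's voice box pulls its phrase->URL config from one shared location.
# The server address, path, and JSON layout are assumptions for illustration.
import json
import urllib.request

CONFIG_URL = "http://192.168.1.5/voice/config.json"   # the one shared source

def load_shared_config() -> dict:
    with urllib.request.urlopen(CONFIG_URL, timeout=5) as resp:
        return json.load(resp)      # e.g. {"lights on": "http://.../trigger?cmd=lights_on", ...}

PHRASE_TO_URL = load_shared_config()
print(f"loaded {len(PHRASE_TO_URL)} phrases from the shared config")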
 
The Kinect is a Microsoft product and will not work with Linux.  It is also a product that does a bit more than TTS and VR.
 
There is testing going on with Windows on the RPi2. You can also triple-boot the Intel Baytrail to Windows, Linux, or Android.
 
Here I have only purchased Kinects and tested the Windows drivers.
 
The new Windows lite is very reasonably priced, but not free yet.
 
BTW the automation mothership here is Linux.  The remote plugins run in whatever OS.
 
The nice part of software is that you can mix and match hardware products to suit your automation needs.
 
IVB said:
Meaning, in our case, since we own CQC (but anyone else could do whatever): Kinect -> COS -> HTTP sent -> CQC Web Server -> CQC -> Device? So, for a $50 Kinect + $50 COS, $100 total, there's VR in each room? And can each instance of COS be linked to a single programming location? Totally, there's your abstraction layer.
 
BTW, this is where I see Tasker's Achilles heel: no easy cloud component to share programming across devices. I have to export, manually move to GDrive, pull down from GDrive on the new device, and restore. And it usually takes a few attempts.
 
If you really want to be pushing the edge, create an Android version of COS that accepts voice input from Android devices and pushes out HTTP commands. And add the ability to share configs between the Android apps and the Windows instance.
 
And then in 2016 you can write a Linux version, so we don't have to spend $100 on a Windows license and it can run on an RPi.
 
There's zero chance CastleOS can catch up to home automation packages in terms of devices under control. Let them do what they do best, and focus your efforts on voice recognition.
 
 
Yep.  Kinect -> COS -> HTTP Sent -> CQC HTTP Trigger -> Device
 
It is that COS piece that is dependent on the QoS of the transport. As previously stated, the more transport layers, the slower the response times. You also have transaction times in the COS itself.
 
Slow or fast is relative to the user/client/Internet transport/transaction times.
 
Securifi is using Amazon servers, like the Amazon Echo. The Amazon cloud is very fast; it's the connection you have to it that can vary.
 
For VR in the scenario described above, all transactions would be local; well within reach of low latency and an instant-enough reaction. I'd guess the entire chain would be under 500 ms, depending on the processing speed of the COS bit.
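A quick way to sanity-check the local legs of that 500 ms guess is just to time an HTTP round-trip to the controller on the LAN; it's typically single-digit milliseconds, which means the recognition and COS processing dominate the budget. The URL below is a stand-in, not a real endpoint:

# Time one local HTTP round-trip to get a feel for the transport overhead.
import time
import urllib.request

URL = "http://192.168.1.10:8080/trigger?cmd=ping"   # hypothetical local endpoint

start = time.perf_counter()
urllib.request.urlopen(URL, timeout=2)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"local HTTP round-trip: {elapsed_ms:.1f} ms")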
 
All we have to do is convince Dean to dig into the Speech SDK for his next endeavor. :-) I looked at the Microsoft Speech Macros, etc., but to do it right you really want to be able to pull down a list of capabilities from the HA system to build your grammar trees: things like light names, movies in your library, etc. I think the good news is that the V2 driver architecture puts CQC in a good position to do something like that, since a "light", for example, is now a specifically defined thing across drivers. As for the Kinect, you just need to pick it as your mic; as far as I can tell, most of the special features of the Kinect are abstracted from the developer.
 
The examples I have seen make the coding look pretty simple; I think the hardest parts are the grammars and the integration.
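As a rough illustration of the "pull device names, build a grammar" step, here is a minimal sketch that writes an SRGS-style grammar file from a list of light names. The names, the output file name, and the idea of pulling the list from the HA system are all assumptions; the exact grammar dialect the Kinect/SAPI recognizer expects should be checked against the Speech SDK docs.

# Build a simple "turn on/off <light>" grammar from a device list.
# In a real setup the list would come from the HA system (e.g. CQC's V2 drivers).
lights = ["kitchen light", "porch light", "theater lights"]

items = "\n".join(f"      <item>{name}</item>" for name in lights)
grammar = f"""<grammar version="1.0" xml:lang="en-US" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="command" scope="public">
    <one-of>
      <item>turn on</item>
      <item>turn off</item>
    </one-of>
    <one-of>
{items}
    </one-of>
  </rule>
</grammar>"""

with open("ha_commands.grxml", "w") as f:
    f.write(grammar)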
 
I'm pretty sure the Alexa app kit can make these HTTP calls. As it stands right now (at least while it's in beta), it sends HTTP requests to a service you define at an endpoint you define.

However, I don't think this is quite what you're talking about. It sounds like you guys are proposing generic HTTP requests that something else would have to interpret and translate into something meaningful. However, that leaves Amazon in the position of waiting for someone else to integrate and consume the generic HTTP requests in order for them to be useful, which I'm certain is not a position a company like Amazon wants to be in. I think this is why they are trying to integrate directly with devices like Hue, so something useful is built in at release time. If another service acts like a Hue bridge and interprets the request differently, that's a step in the direction I think you guys are talking about, but it's not truly a generic interface the way you would like.

So from that perspective, what can we realistically expect from Amazon? Should they define generic requests and contract with third-party systems to utilize those, trying to make them a de facto standard? Or should they continue to integrate directly with device APIs?
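For what it's worth, the "endpoint you define" half is easy to picture. Below is a rough sketch of a local service that accepts a simplified, Alexa-ish JSON intent and re-issues it as a generic HTTP call to a controller. The JSON shape and the controller URL are assumptions for illustration, not the real ASK payload or anyone's actual API.

# Accept a simplified JSON intent over HTTP and forward it as a generic call.
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CONTROLLER = "http://192.168.1.10:8080/trigger"   # hypothetical HA endpoint

class IntentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        intent = json.loads(body)   # e.g. {"intent": "LightsOn", "room": "kitchen"}
        # Translate the intent into whatever the controller understands.
        url = f"{CONTROLLER}?cmd={intent['intent']}&room={intent.get('room', '')}"
        urllib.request.urlopen(url, timeout=2)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'{"ok": true}')

HTTPServer(("0.0.0.0", 8000), IntentHandler).serve_forever()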
 
jkmonroe said:
I don't know anything about your company beyond a quick read of your website and demo attempt.  
 
But I will tell you that if you offered your Kinect VR software with the ability to have commands fire HTTP triggers, and priced it right, I would probably buy it. Package something up that does nothing but accept VR input from a Kinect, plug into something like a RasPi2, and spit out custom HTTP.
 
I already have a Kinect and a RasPi2, so I'd happily hand over $50 for a license of the VR-to-HTTP software. And if it doesn't work (not saying it won't, just being illustrative), then who cares, because it was only $50. If it does work, and I need four Kinects, then it looks like you get $200 instead of $50.
 
You can do that right now, and the best part is, it's free! You can define any custom command and send any custom HTTP calls you'd like. This can easily accomplish setting a scene, e.g. "house, start movie mode", and have it send calls all over the place.
 
The next major release of CastleOS is coming with a new API (which is in testing now). That will offer two new options.
 
First, you can create a virtual device and have it send any custom command when the various commands are called (i.e., on, off, dim, etc.).
 
Second, you can create a protocol driver to any system you'd like. So you can connect it to CQC and pull in all the devices, groups, scenes, etc. Then, when a CastleOS app or voice command is issued, it's routed to CQC.

All of the internal drivers have also been ported to use the new API. It's very powerful and flexible.
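To make the "one voice command fans out to many HTTP calls" idea concrete, here is a minimal sketch of a movie-mode scene as a simple list of calls. The URLs are invented for the example; CastleOS's own custom command configuration has its own format.

# Fire a set of HTTP calls for a "movie mode" scene.
import urllib.request

MOVIE_MODE = [
    "http://192.168.1.20/lights/theater?level=10",   # dim the theater lights
    "http://192.168.1.21/projector?power=on",        # power up the projector
    "http://192.168.1.22/receiver?input=hdmi1",      # switch the AVR input
]

def run_scene(urls):
    for url in urls:
        try:
            urllib.request.urlopen(url, timeout=2)
        except OSError as err:
            print(f"call failed: {url} ({err})")

run_scene(MOVIE_MODE)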
 
IVB said:
Meaning, in our case, since we own CQC (but anyone else could do whatever): Kinect -> COS -> HTTP sent -> CQC Web Server -> CQC -> Device? So, for a $50 Kinect + $50 COS, $100 total, there's VR in each room? And can each instance of COS be linked to a single programming location? Totally, there's your abstraction layer.
 
BTW, this is where I see Tasker's Achilles heel: no easy cloud component to share programming across devices. I have to export, manually move to GDrive, pull down from GDrive on the new device, and restore. And it usually takes a few attempts.
 
If you really want to be pushing the edge, create an Android version of COS that accepts voice input from Android devices and pushes out HTTP commands. And add the ability to share configs between the Android apps and the Windows instance.
 
And then in 2016 you can write a Linux version, so we don't have to spend $100 on a Windows license and it can run on an RPi.
 
There's zero chance CastleOS can catch up to home automation packages in terms of devices under control. Let them do what they do best, and focus your efforts on voice recognition.
 
The issue is whether you want natural language processing or not. If you just want to issue a fixed command and get a custom response, that's easy. Natural language processing within the context of home automation is hard, and for that CastleOS needs a full protocol driver.
 
CastleOS is already Linux compatible. We're in the process of removing two Windows-only dependencies and will be releasing a Mac/Linux version as soon as Microsoft makes the new runtimes live. That will work on the RPi2 too. However, the Kinect voice control itself will always be limited to Windows. That doesn't mean you can't use the Android or Cortana voice interfaces though...
 
DeLicious said:
I'm pretty sure the Alexa app kit can make these HTTP calls. As it stands right now (at least while it's in beta), it sends HTTP requests to a service you define at an endpoint you define.

However, I don't think this is quite what you're talking about. It sounds like you guys are proposing generic HTTP requests that something else would have to interpret and translate into something meaningful. However, that leaves Amazon in the position of waiting for someone else to integrate and consume the generic HTTP requests in order for them to be useful, which I'm certain is not a position a company like Amazon wants to be in. I think this is why they are trying to integrate directly with devices like Hue, so something useful is built in at release time. If another service acts like a Hue bridge and interprets the request differently, that's a step in the direction I think you guys are talking about, but it's not truly a generic interface the way you would like.

So from that perspective, what can we realistically expect from Amazon? Should they define generic requests and contract with third-party systems to utilize those, trying to make them a de facto standard? Or should they continue to integrate directly with device APIs?
 
Why does it have to be an either/or proposition?  Go ahead and keep building out native compatibility, but also give us a way to do some things on our own.
 
CastleOS is already Linux compatible
 
@Chris: does that mean that CastleOS will run using Mono, and will it also run on iOS using Mono?
 
I just recently noticed that my favorite weather program (Cumulus), written for Wintel, now also runs on Linux / iOS using Mono, and HomeGenie does the same. Personally, I find it (Mono) runs faster than, say, OpenHAB, which will also work on Wintel, Linux, or iOS.
 
I am currently using MS SAPI on a Wintel embedded PC running whatever speech fonts (one is running a Portuguese font to bug my wife). The Linux automation mothership just tells the client to say this or that (and do VR), and it works fine in this manner. It is quick on an Intel Atom-based tabletop touch tablet. The step up from just a microphone is using the Kinect here (Baytrail, Kinect, and dual-touch capacitive screen combo).
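As a rough sketch of that "mothership tells the client to say this or that" pattern: a small speech client just listens for text over the network and speaks it locally. pyttsx3 (which drives SAPI on Windows) stands in here for whatever TTS engine the real setup uses, and the port is made up.

# Speak whatever text the automation server POSTs to this client.
from http.server import BaseHTTPRequestHandler, HTTPServer
import pyttsx3

engine = pyttsx3.init()             # uses SAPI5 on Windows

class SayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        text = self.rfile.read(int(self.headers["Content-Length"])).decode()
        engine.say(text)            # queue the phrase sent by the server
        engine.runAndWait()         # speak it on this client's speakers
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 8700), SayHandler).serve_forever()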
 
I look at the Amazon Echo / Alexa to provide an alternate TTS/VR automation methodology (like Cortana) to my mobile phone.
 
We're also stuck with only the Alexa voice font at this time, though (I didn't look to see whether there are other voice fonts).
 
pete_c said:
@Chris: does that mean that CastleOS will run using Mono, and will it also run on iOS using Mono?
 
The reason we never ported to Mono to start with is that we used WCF, and Mono hadn't added support for it. However, now that Microsoft has pledged full support for Linux, Mac, and Mono, we're good to go as soon as it launches.
 
The other two dependencies are easily fixable, but WCF was the deal breaker. Now, as a .NET app, platform doesn't matter. #MicrosoftRules #DevelopersDevelopersDevelopers
 