File Submitter: etc6849
File Submitted: 21 Jun 2013
File Category: Premise
Author: John in VA
Contact: John in VA
Speech Recognition for the Premise Home Control System
Premise SYS provides an excellent framework for controlling objects in your home from a variety of input channels. Its architecture abstracts the (sometimes) complex low-level interactions with devices in your home behind a relatively small number of high-level commands. Those commands can be issued by many different types of user interface devices (browsers, touch-pads, etc.). SpeechCommander adds speech as an input channel to Premise SYS.
Speech is the most natural interface mechanism for human beings. The primary design goal for SpeechCommander was to implement a “natural language-like” interface for Premise. Unfortunately, true natural language processing is only available via very expensive software solutions. Those solutions can infer the intent of the speaker even when the speaker doesn’t say exactly the phrase programmed into the recognizer. SpeechCommander implements the next best thing: Command and Control Grammars.
Command and Control Grammars are Microsoft SAPI constructs that define the words and syntax the recognizer should listen for and act on. When the grammars are designed properly, the end user can have a natural language-like experience with a little training, using commands like these:
Play the album Chicago’s Greatest Hits in the Kitchen
Turn on the Living Room Lights
Tell me the forecast
Play scene Movie in theater
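SpeechCommander’s actual grammars ship with the kit; purely to illustrate the SAPI 5 XML grammar format, a rule for the lights command above might look like the following (the rule names and room list are invented for this example):

```xml
<GRAMMAR LANGID="409">
  <!-- Top-level rule: the recognizer only listens for phrases this rule generates -->
  <RULE NAME="LightsCommand" TOPLEVEL="ACTIVE">
    <L>
      <P>turn on</P>
      <P>turn off</P>
    </L>
    <O>the</O>
    <RULEREF NAME="Room"/>
    <P>lights</P>
  </RULE>
  <!-- Sub-rule listing the rooms the command applies to -->
  <RULE NAME="Room">
    <L>
      <P>living room</P>
      <P>kitchen</P>
      <P>theater</P>
    </L>
  </RULE>
</GRAMMAR>
```

Because every legal phrase is enumerated up front, the recognizer only has to pick among a small set of alternatives, which is what makes command-and-control recognition far more reliable than free dictation.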
SpeechCommander uses SAPI Command and Control grammars to recognize the intent of the speaker via a SAPI-compliant Speech Recognizer. Integrated with Premise via minibroker, that recognizer gives SpeechCommander the ability to control objects in your home.
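To make the idea concrete, here is a toy stand-in (not SpeechCommander’s actual code) for what a command-and-control grammar buys you: each pattern maps a fixed phrase shape to an intent plus slot values, and anything outside the grammar is simply not recognized.

```python
import re

# Illustrative only: each entry pairs a fixed phrase shape with an intent name.
# The capture groups play the role of grammar "slots" (room, album, etc.).
GRAMMAR = [
    (re.compile(r"turn (on|off) the (.+) lights", re.I), "SetLights"),
    (re.compile(r"play the album (.+) in the (.+)", re.I), "PlayAlbum"),
    (re.compile(r"tell me the forecast", re.I), "Forecast"),
]

def recognize(utterance):
    """Return (intent, slots) for an in-grammar phrase, else None."""
    for pattern, intent in GRAMMAR:
        m = pattern.fullmatch(utterance.strip())
        if m:
            return intent, list(m.groups())
    return None
```

A real SAPI recognizer does far more (acoustic matching, confidence scoring), but the control flow is the same: an utterance either matches a grammar rule and yields an intent with slot values, or it is ignored.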
Requirements:
- Premise Home Control System V2.0 (latest service pack)
- Rob Brun’s ScriptTools, SYSTools, and MS SAPI modules (download these separately)
- Microsoft SAPI 5.1 SDK, installed and trained on the machine you will use for Speech Recognition. You will also need to install the SDK on your Premise Server (needed by Rob Brun’s MS SAPI module)
- The Speech Recognition engine that comes with Microsoft Office 2003 is optional, but highly recommended.
- AT&T Natural Voices Text to speech (optional, but recommended)
- Modified version of Rob Brun’s Forecast module
- NetNews module (included)
- SpeechZone module (included)
Installation:
- Install Rob Brun’s SpeechTools and MS SAPI modules according to his instructions
- Install the modified version of Rob Brun’s Forecast module (provided in this kit) following his instructions
- Set the FTPurl property to point to NOAA’s forecast file for your area. (I use: weather.noaa.gov/data/forecasts/city/va/washington_dulles_intl_airport.txt)
- Import the NetNews module and create NetNews instances in your Home, setting the NewsCategory attribute on each to the type of news you want to fetch (e.g., Top Stories, Business, etc.).
- Create a Speech Object (right click on the MSSAPI Device) for each of the audio cards you wish to use for Text to Speech. Don’t forget to set the AudioOutput attribute to the actual Audio Card in your computer.
- Import the SpeechZone module and create SpeechZone instances around your home. I created a SpeechZone associated with every MediaZone in my Home; however, you probably want to create just one to start off with.
- Set the SpeechObject property in your SpeechZone to one of the Speech Objects you created under MSSAPI.
- Set the MediaZone property in your SpeechZone to the MediaZone you want it associated with, that is, the MediaZone that will actually render any spoken responses. I have a MediaZone called “Whole House Page” associated with a SpeechZone that handles all the spoken responses from SpeechCommander. It is essentially a connection between an audio card and the music inputs of my intercom system; any audio coming out of that card ends up on the intercom stations.
- If your MediaZone is connected to an A/V Switching device, and you’ve selected a SpeechObject whose Audio Card is directly connected to that Switching Device, the SwitcherInput attribute should be set to the Input that is connected to the Audio Device. If not, you can set the input manually. I’ve only tested automatic selection of inputs with a Nuvo Essentia, so I can’t confirm it works with any other Switcher.
- Install SpeechCommander by double-clicking Setup and following the prompts. I recommend installing it on a computer other than your Premise server; speech recognition is fairly computationally intensive, so you probably want to isolate it on a different machine. I’ve also discovered that connecting via Remote Desktop Connection (RDC) to a Windows 2000 server hosting SpeechCommander simply does not work: the Microsoft Speech Recognition engine can’t enumerate the audio input device correctly when connected via RDC. You must be logged into the console to run SpeechCommander.
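As a rough illustration of the Forecast step above: the NOAA file is plain text, and turning it into something speakable amounts to splitting it into forecast periods. The sample text and the parsing below are assumptions for illustration only, not the Forecast module’s actual code.

```python
# Hypothetical sample in the NOAA zone-forecast style; the real file for
# your area may differ, so treat this parsing as a starting point.
SAMPLE = """\
.TODAY...SUNNY. HIGHS IN THE MID 80S. WEST WINDS 5 TO 10 MPH.
.TONIGHT...CLEAR. LOWS IN THE LOWER 60S.
.SATURDAY...PARTLY CLOUDY. HIGHS AROUND 90.
"""

def parse_periods(text):
    """Split a NOAA-style plaintext forecast into (period, statement) pairs."""
    periods = []
    for line in text.splitlines():
        # Period lines look like ".TODAY...forecast text"
        if line.startswith(".") and "..." in line:
            name, _, body = line[1:].partition("...")
            periods.append((name.title(), body.strip()))
    return periods
```

Each (period, statement) pair can then be handed to the Text to Speech engine as a sentence, which is the kind of response “Tell me the forecast” produces.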