Tuesday, March 22, 2011

Speech Recognition Part 1 - Dictation Mode

The previous post was about speech output with .NET. This post is dedicated to the inverse, namely speech recognition.
Recognition is much harder for computers than output, but the .NET Framework again provides ready-made classes, this time in the namespace System.Speech.Recognition, with which speech recognition can be realized with little effort.
In general there are two modes in which speech recognition can be run: this post is about the Dictation Mode, the next about the Command Mode.
The Dictation Mode is, as the name already suggests, suited for dictating texts. The recorded sound is treated as free dictation and the program tries to work out which words were spoken.
As for the speech output, a reference to the assembly System.Speech first has to be added to the project. This time, though, the needed namespace is System.Speech.Recognition, so we start with the following using directive:

using System.Speech.Recognition;

For recognizing speech we use an instance of the class SpeechRecognitionEngine. It needs a grammar, which is a set of rules describing what input the engine should expect and how to interpret it.
As the grammar we hand over an instance of the class DictationGrammar to indicate that we want to use the dictation mode.
The recognition of spoken words now works with the function Recognize(). It prepares the recognition and starts it as soon as the microphone records sound. If the speaker pauses (the needed duration can be set), the recording is finished and the program tries to interpret the sound as words. Recognize() is synchronous, so it blocks until this is done (an asynchronous recognition is also possible). Finally the dictation result is returned.
Now the code:

            SpeechRecognitionEngine SRE = new SpeechRecognitionEngine();
            SRE.LoadGrammar(new DictationGrammar()); // load dictation grammar
            SRE.SetInputToDefaultAudioDevice(); // set recording source to the default device

            RecognitionResult Result = SRE.Recognize(); // record sound and recognize (blocks until done)
            string ResultString = "";
            // Recognize() returns null if nothing could be recognized
            if (Result != null)
            {
                // add all recognized words, separated by spaces, to the result string
                foreach (RecognizedWordUnit w in Result.Words)
                    ResultString += w.Text + " ";
                ResultString = ResultString.Trim();
            }
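
The post mentions above that the pause duration can be set and that an asynchronous recognition is possible. Here is a minimal sketch of how both could look. The class name, the timeout of one second and the console output are only my assumptions for illustration; the sketch also assumes that a microphone and a speech recognizer for the system language are installed.

```csharp
using System;
using System.Speech.Recognition;

class DictationAsyncSketch // hypothetical class name
{
    static void Main()
    {
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine())
        {
            sre.LoadGrammar(new DictationGrammar()); // load dictation grammar
            sre.SetInputToDefaultAudioDevice();      // use the default microphone

            // Length of silence after which the engine finalizes an unambiguous
            // phrase (1 second is an assumed value, adjust as needed)
            sre.EndSilenceTimeout = TimeSpan.FromSeconds(1);

            // Raised every time the engine has recognized a phrase
            sre.SpeechRecognized += (sender, e) => Console.WriteLine(e.Result.Text);

            // Start recognizing in the background and keep listening
            // after each recognized phrase
            sre.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Speak now, press Enter to quit.");
            Console.ReadLine();

            sre.RecognizeAsyncStop(); // stop after the current recognition completes
        }
    }
}
```

With RecognizeAsync() the program does not block while listening, so a user interface stays responsive. RecognizeMode.Single would stop after the first recognized phrase instead.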
