Speech! - Python

Hi there!

I'm working on a simple script, and I was wondering if there is some way to use the Microsoft built-in speech SDKs for both text-to-speech and speech-to-text. I'd like to capture raw_input()-style data from my microphone whenever the program detects that something was said into it. I'd also like the program to use the default text-to-speech voice I have installed, currently Microsoft Mary.

Is this possible? If so, how might I go about doing it? Are there any simple modules with functions such as say() or listen()? Thanks a lot!

Jul 2 '08 #1

Dekudude

Hmm, okay. I figured out text to voice. Is there some way to do voice to text, though? I already have the Microsoft Speech API (SAPI), but I don't know how to make it work with Python...

Jul 3 '08 #2

heiro

posting again with code tags

Jul 3 '08 #3

heiro

from win32com.client import constants
import win32com.client
import pythoncom
import time


class SpeechRecognition:
    """Initialize the speech recognition with the passed-in list of words."""

    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control,
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (i.e. we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add(
            "wordsRule", constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        # Commit the new rule, then set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started
        self.say("Started successfully")

    def say(self, phrase):
        self.speaker.Speak(phrase)


class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):

    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        # Parenthesized so the line works on both Python 2 and Python 3
        print("You said: " + newResult.PhraseInfo.GetText())


if __name__ == '__main__':
    wordsToAdd = ["One", "Two", "Three", "Four"]
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        # Deliver pending COM events (such as OnRecognition) to the handler
        pythoncom.PumpWaitingMessages()
        # Pause briefly so the loop does not spin a CPU core
        time.sleep(0.1)

For text to speech:

import sys
import win32com.client

speaker = win32com.client.Dispatch("SAPI.SpVoice")

while 1:
    try:
        # raw_input() is Python 2; on Python 3 use input() instead
        s = raw_input('Type word or phrase: ')
        speaker.Speak(s)
    except EOFError:
        # End of input (Ctrl-Z then Enter on Windows): exit cleanly
        sys.exit()

Jul 3 '08 #4
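
For anyone after the say() function asked about in the first post, the text-to-speech snippet above could be wrapped up roughly like this. It is only a sketch: the name say and its optional speaker parameter are made up for illustration, and the only SAPI calls used are the Dispatch("SAPI.SpVoice") and Speak() calls already shown in this thread.

```python
def say(text, speaker=None):
    """Speak `text` aloud and return the voice object that was used.

    `say` and `speaker` are hypothetical names for this sketch. Passing an
    existing voice object avoids creating a new COM object on every call.
    """
    if speaker is None:
        # Imported lazily so this module still loads where pywin32 is absent.
        import win32com.client
        speaker = win32com.client.Dispatch("SAPI.SpVoice")
    speaker.Speak(text)
    return speaker
```

On a Windows machine with pywin32 installed, voice = say("Hello world") should speak with whatever default voice is configured, and say("Again", voice) reuses the same object.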

Dekudude

Thanks for your response.

However, that is not exactly what I want. For starters, I already got the text-to-voice working. As for voice-to-text, I already tried that example, but it doesn't work how I want it to. As far as my beginner eyes can see, there's no way to replicate raw_input() using it, where if I said, "This is a test, hello world", it would enter that.

Do you understand what I'm saying? If you can show me how to use that code example, that would be great, but I just want a raw_input() style function that enters data via voice-to-text.

Thanks!

Jul 3 '08 #5
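
A raw_input()-style function along the lines requested above might look roughly like the following. Treat it as a sketch under assumptions rather than a tested answer: listen, wait_for_phrase, DictationEvents and the timeout and poll parameters are all names invented here, and it assumes that turning dictation on with DictationLoad() and DictationSetState(1) (the opposite of the DictationSetState(0) call in the earlier example) makes SAPI deliver free-form phrases to the same OnRecognition callback. It also assumes pywin32 is installed and a microphone has been set up in the Windows speech settings.

```python
import time


def wait_for_phrase(heard, pump, timeout=10.0, poll=0.05):
    """Call `pump` until a phrase appears in the list `heard` or time runs out.

    Returns the first phrase, or None on timeout. Hypothetical helper.
    """
    deadline = time.time() + timeout
    while not heard and time.time() < deadline:
        pump()            # gives COM a chance to run OnRecognition
        time.sleep(poll)  # avoid spinning a CPU core while waiting
    return heard.pop(0) if heard else None


def listen(timeout=10.0):
    """Hypothetical raw_input() stand-in: return one dictated phrase."""
    # Imported lazily so this module still loads where pywin32 is absent.
    import pythoncom
    import win32com.client

    heard = []

    # Make sure the makepy wrappers exist; getevents() relies on them.
    listener = win32com.client.gencache.EnsureDispatch("SAPI.SpSharedRecognizer")

    class DictationEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
        def OnRecognition(self, StreamNumber, StreamPosition,
                          RecognitionType, Result):
            result = win32com.client.Dispatch(Result)
            heard.append(result.PhraseInfo.GetText())

    context = listener.CreateRecoContext()
    grammar = context.CreateGrammar()
    # Assumption: load the dictation topic and switch it on (1 = active),
    # so recognition is not limited to a fixed word list.
    grammar.DictationLoad()
    grammar.DictationSetState(1)
    handler = DictationEvents(context)  # keep a reference so events keep firing
    try:
        return wait_for_phrase(heard, pythoncom.PumpWaitingMessages, timeout)
    finally:
        grammar.DictationSetState(0)    # stop dictating once we have a phrase
```

With that in place, text = listen() would take the place of text = raw_input(), returning None if nothing is recognized before the timeout. The waiting loop is kept separate from the SAPI setup so it can be tried out with a fake pump function on a machine without a microphone.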
