Hi there!
I'm working on a simple script, and I was wondering if there is some way to use the Microsoft built-in speech SDKs (to voice, and text FROM voice)? I'd like to capture raw_input() data from my microphone whenever the program recognizes that something went through the microphone. I'd also like to have the program use the default text-to-voice synthesizer I have installed... currently Microsoft Mary.
Is this possible? If so, how might I go about doing it? Are there any simple modules with functions such as, "say()" or "listen()"? Thanks a lot!
Hmm, okay. I figured out text to voice. Is there some way to do voice to text, though? I already have the Microsoft speech API, SAPI, but I don't know how to make that work with Python...
posting again with code tags
from win32com.client import constants
import win32com.client
import pythoncom

class SpeechRecognition:
    """ Initialize the speech recognition with the passed in list of words """
    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        # Set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started
        self.say("Started successfully")

    def say(self, phrase):
        self.speaker.Speak(phrase)


class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    # Called by SAPI whenever a phrase from the grammar is recognized
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        print "You said: ", newResult.PhraseInfo.GetText()

if __name__ == '__main__':
    wordsToAdd = [ "One", "Two", "Three", "Four" ]
    speechReco = SpeechRecognition(wordsToAdd)
    # Pump COM messages so that recognition events get delivered
    while True:
        pythoncom.PumpWaitingMessages()

##### for text to speech #####

import win32com.client

speaker = win32com.client.Dispatch("SAPI.SpVoice")

while True:
    try:
        s = raw_input('Type word or phrase: ')
    except EOFError:
        # End of input (Ctrl+Z then Enter on Windows): stop the loop
        break
    speaker.Speak(s)

Thanks for your response:
However, that is not exactly what I want. For starters, I already got the text-to-voice working. As for voice-to-text, I already tried that example, but it doesn't work how I want it to. As far as my beginner eyes can see, there's no way to replicate raw_input() using it, where if I said, "This is a test, hello world", it would enter that.
Do you understand what I'm saying? If you can show me how to use that code example, that would be great, but I just want a raw_input() style function that enters data via voice-to-text.
Thanks!
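
One possible way to get a raw_input()-style call out of the example above is to let the event handler hand each recognized phrase to a small buffer, and have a blocking listen() keep pumping messages until something arrives. The sketch below is only an outline under assumptions: the names VoiceInput, on_phrase and listen are made up for illustration and are not part of SAPI or pywin32, and the pump function is passed in, so the waiting logic itself does not depend on Windows.

```python
import time


class VoiceInput(object):
    """Buffers recognized phrases so they can be read like raw_input().

    Hypothetical helper, not part of SAPI or pywin32.
    """

    def __init__(self, pump, poll_interval=0.05):
        # pump: a callable that delivers pending events,
        # e.g. pythoncom.PumpWaitingMessages on Windows.
        self._pump = pump
        self._poll_interval = poll_interval
        self._pending = []

    def on_phrase(self, text):
        """Call this from the recognition callback with the recognized text."""
        self._pending.append(text)

    def listen(self, timeout=None):
        """Block until a phrase arrives and return it, or None on timeout."""
        deadline = None if timeout is None else time.time() + timeout
        while not self._pending:
            if deadline is not None and time.time() >= deadline:
                return None
            self._pump()
            time.sleep(self._poll_interval)
        return self._pending.pop(0)


# Wiring on Windows (assumption, not tested here):
#   voice = VoiceInput(pythoncom.PumpWaitingMessages)
#   ...and inside ContextEvents.OnRecognition:
#   voice.on_phrase(newResult.PhraseInfo.GetText())
#   text = voice.listen()
```

Note that the posted example switches dictation off with DictationSetState(0), so it will only ever return the words in the list. For free-form sentences such as "This is a test, hello world", the grammar would presumably need dictation loaded and switched on instead of the fixed word rule. I have not verified that part, so check the SAPI documentation before relying on it.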