Techie: The sound of your voice
Recognition technology getting better, but not perfect
John "jaQ" Andrews
You talk to your computer already. Admit it. We can't print what you
probably yell at it, but you know what you've said.
Wouldn't it be great if your computer understood you?
Perfect voice recognition has been the holy grail of software
development for decades. Dozens of applications, not to mention hardware
devices like GPS units and light controllers, will recognize a small set
of commands from just about any person. Download the Opera Web browser (www.opera.com)
to get a free taste of software that responds to your decrees.
The trickier side of voice recognition comes not in responding to a
small list of designated words, but in correctly interpreting the
bizarre sounds coming out of our primitive, fleshy mouths and
transcribing those words into text. Even top-of-the-line software claims
only a 99 percent accuracy rate.
Ninety-nine percent sounds pretty good, you say? I suppose it does. But
if you consider your really big documents (say, a novel, at a
conservative 50,000 words), you're dealing with 500 errors right there.
If it's a fantasy novel with words like Glyllenphage or ponyrabbit,
You don't write novels? Okay. Make it about one error every other
paragraph. And that error is hidden in some embarrassing homophone or
some word you never even knew existed.
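For the numerically inclined, the arithmetic above can be sketched in a few lines of Python. The function names are made up for illustration, and the 50-word paragraph is an assumption (it is what "one error every other paragraph" implies at 99 percent), not a measurement. It also treats the accuracy figure as a flat per-word rate, which real software will not match exactly.

```python
def expected_errors(word_count: int, accuracy: float) -> int:
    """Expected number of misrecognized words at a flat per-word accuracy."""
    return round(word_count * (1.0 - accuracy))


def words_per_error(accuracy: float) -> int:
    """Roughly how many words go by, on average, between mistakes."""
    return round(1.0 / (1.0 - accuracy))


if __name__ == "__main__":
    novel_words = 50_000       # the "conservative" novel from the column
    paragraph_words = 50       # assumed paragraph length, for illustration only

    print(expected_errors(novel_words, 0.99), "errors in the novel")
    gap = words_per_error(0.99)
    print("one error about every", gap / paragraph_words, "paragraphs")
```

Swap in a different accuracy figure to see how quickly the proofreading pile grows as the rate drops.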
Not to mention that the 99 percent accuracy rate is only achieved after
you've configured the software to your specifications and trained it to
recognize your voice. Other users will need to create their own profiles
so the software can deal with their unique inflections, accents and
And don't even think about trying voice recognition in a crowded office.
Not because your computer will pick up background chatter; a decent
headset microphone will have a narrow enough focus to listen only to
you. The real problem is you yammering away all the livelong day while
your officemates are trying to get work done. Or surf porn. Whatever.
All that aside, voice recognition does have its place. You can dictate
informal correspondence or notes where you don't mind a few errors.
Transcribing a speech or interview you've recorded is a heck of a lot
easier, as long as you're meticulous about combing back through it and
removing mistakes. Above all, it's great if you're hopped up on caffeine
and have to get words out of your head but can't sit still to type.
The current giant in the consumer voice recognition business is Dragon
NaturallySpeaking (www.nuance.com). It's a very mature product that's in
version 8 at the moment and has absorbed IBM's ViaVoice product. It
comes in several editions: different dictionaries come with the medical
and legal editions, so arcane terminology is still recognized. You can
even build voice recognition into your own applications with their
Software Development Kit (SDK). They claim punctuation is automatically
inserted, but I'd add that to your list of things to check over once
you're finished talking.
Other software is sparse, but includes Wave to Text and Dictation 2005 (www.research-lab.com).
Both offer a three-day trial version for download, but boast only 80
percent accuracy. Their own screenshot shows the text "This is testing
one to three computer," which, only guessing at the speaker's
intentions, isn't encouraging.