Have you ever had a tool that was so painfully cool that, even though it was a giant pain in the ass to use, you still loved it so much that you used it all the time anyway? That is how this reviewer felt about computer voice/speech recognition. Back in the misty dawn of the modern age, voice recognition software and the technologies that incorporated it fell so far short of the advertised and often over-hyped mark that bummed-out users were left to cry in their beer. We wanted so much for the technology to be usable, for it to work! For speech to be a serious input source of actual utility, like a keyboard or mouse. For as long as there have been computers, humans have hoped for the ability to communicate with them the way Dave Bowman communicated with the HAL 9000 in “2001: A Space Odyssey”, or the crews of the Enterprise with their silicon shipmates in the various versions of “Star Trek”. Heck, even before computers were conceived, science fiction writers and dreamers prophesied the day when one could simply speak to machines or automata and have that speech recognized as specific commands and acted upon, or, in the world of business, captured, transcribed perfectly and faithfully, and then processed in whatever manner was appropriate for the business purpose at hand. Not only did the transcription have to be accurate, it also had to be discreet; until about 1997, one had to rely on faithful and trustworthy assistants (or secretaries, back in the Pleistocene when yours truly grew up). But as of 1997, we have been able to use Dragon NaturallySpeaking. This seminal voice recognition software was originally developed by Dragon Systems, a company founded by James and Janet Baker in 1982, but it has been lovingly and carefully nurtured and developed by Nuance Communications for the last decade.
Much has changed in the world of voice recognition technology since 1982, but one consistent element has been the fact that Nuance and a few key competitors invented, and continue to develop, the algorithms and software found in virtually all of the technology that utilizes speech recognition to this day. This includes, but is not limited to, incredibly cool things like home automation systems that make you feel like you really are living in the 21st century, and incredibly obnoxious things, like those horrible interactive voice response (IVR) systems that pass for customer service these days. But voice recognition technology is not limited to ridiculously cool domiciles and obtuse robotic operators; today it is as ubiquitous as the air itself. Voice recognition is utilized in everything from cars to telephones, from classrooms to assistive technologies for people with disabilities; Dragon NaturallySpeaking 11.5 represents, if not the pinnacle of this technology, then certainly the state of the art as it exists as of the writing of this review.
To begin with, in the interest of full disclosure, I am using the software to write this review. I routinely use Dragon dictation on my iPhone, and at this moment I am using the Dragon Remote Microphone app in lieu of a physical headset (more about that later). In other parts of this review I have used the Andrea mic-in headset that was supplied with the software. I have been using Dragon NaturallySpeaking products for a number of years now, so I am familiar with this software and its feature history. I started with one of its earliest iterations, version 3.0, which was pretty darn good, and the most recent version with which I am familiar, version 10, was very good. But this version, at least in the Premium edition, stands head and shoulders above its esteemed ancestors. The progress achieved by Nuance engineers in the software’s ability to recognize normal human speech patterns and flow, compared to earlier versions, which required a robot-like staccato cadence in order to capture the data in one’s speech, is nothing short of miraculous. Now, rather than speaking in short, choppy sentence fragments, one can speak as naturally and as comfortably as if one were speaking to another human being. I know this seems counter-intuitive, but it’s true: the longer the speech and the more fluent and well-enunciated the diction, the more accurately the software captures and transcribes one’s words. I won’t lie; it’s pretty freaking cool.