Looking for the best free Windows speech to text software? The most-repeated paid recommendation is Dragon Naturally Speaking (DNS). But some might scoff at paying money for software. Fortunately, there are several great free programs out there like Google Docs Voice Typing (GDVT) and Windows Speech Recognition (WSR).
You might wonder how these two products compare against DNS — and whether or not DNS meets your needs. For this article, I’ve identified three kinds of users: those who need speech to text transcription for writing novels, those who need academic transcription, and those who write business documents, like memos. To this end, I tested three speech transcription programs (DNS, GDVT, and WSR).
Speech Transcription Setup
Before we head into the test, let’s first look at the recommended hardware and software setup.
Hardware and Software Requirements
Google Docs Voice Typing requires the Chrome Browser and a microphone. It also needs a persistent internet connection (which isn’t mentioned in the requirements). Otherwise, this is probably the easiest way to get started with speech transcription.
DNS requires a processor made in 2001 or later, Windows 7 or later, and around 4GB of free storage. Its strictest requirement is 2GB of RAM. Here’s a complete list of DNS’s hardware requirements.
I use a dynamic microphone (best microphone for podcasting) and a relatively fast Intel Core i7 processor. While a high-quality microphone is desired, it isn’t required. Even so, your results will improve with better sound quality and reduced background noise.
The lowest-priced microphone that I would recommend for high-quality recording is the Audio-Technica ATR-2100. However, the accuracy difference between a $5 microphone and a $200 device is pretty minimal.
On the other hand, Windows Speech Recognition runs on pretty much any remotely modern computer (most computers made in the last ten years) with a microphone. If you own a laptop or tablet made in the last five years, it should have what you need by default.
Configuring Speech Transcription Programs
Here’s how to use Google Voice Typing:
Here’s how to get started with Windows Speech Recognition:
And, finally, here’s how to get started with Dragon Naturally Speaking:
I want to find the best free Windows-based speech to text application. Because different consumers may need a different product, I’ve devised a simple test. I read three different passages from texts without copyright: one from Charles Darwin’s On the Tendency of Species to Form Varieties, one from H.P. Lovecraft’s The Call of Cthulhu, and one from Jerry Brown’s 2017 State of the State speech. My methodology is by no means perfect, but it does give an impression of each voice recognition suite’s accuracy.
Fiction Writing Sample (From H.P. Lovecraft’s The Call of Cthulhu)
“The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.”
Business Writing Sample (Jerry Brown’s 2017 State of the State speech)
“It is customary on an occasion like this to lay out a specific agenda for the year ahead. Six times before from this rostrum, I have done that, and in some detail. And, as I reread those proposals set forth in previous State of the State speeches, I was amazed to see how much we have accomplished together.”
Academic Writing Sample (Charles Darwin’s On the Tendency of Species to Form Varieties)
“Now when a variety of such an animal occurs, having increased power or capacity in any organ or sense, such increase is totally useless, is never called into action, and may even exist without the animal ever becoming aware of it. In the wild animal, on the contrary, all its faculties and power being brought into full action for the necessities of existence, any increase becomes immediately available, is strengthened by exercise, and must even slightly modify the food, the habits, and the whole economy of the race.”
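For readers who want to score their own dictation in a similar way, here is a minimal Python sketch. It is an illustration and not necessarily the exact procedure behind the numbers in this article: the `words` and `score` helpers and their normalization rules are my own assumptions. It uses `difflib` from Python’s standard library to line up the words that were read with the words a program transcribed, ignoring punctuation and capitalization.

```python
import re
from difflib import SequenceMatcher


def words(text):
    # Lowercase the text and keep only letters, digits, and apostrophes,
    # so punctuation and capitalization differences are not counted as errors.
    return re.findall(r"[a-z0-9'’]+", text.lower())


def score(reference, transcript):
    # Hypothetical helper: returns (words matched, words in the reference).
    ref, hyp = words(reference), words(transcript)
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct, len(ref)


# The reference is the start of the business sample; the transcript is a
# made-up example in which one word ("a") was dropped.
reference = "It is customary on an occasion like this to lay out a specific agenda"
transcript = "It is customary on an occasion like this to lay out specific agenda"

correct, total = score(reference, transcript)
print(f"{correct} of {total} words correct ({correct / total:.1%})")
```

Because the comparison works on whole words, a dropped word and a mistranscribed word each cost one point, and a numeral written in place of a spelled-out number (such as “6” for “six”) counts as a miss unless you normalize it first.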
3 Voice Transcription Suites Tested
It’s surprising how well the free voice recognition programs performed against paid software. At the end of the day, the most accurate app is Dragon Naturally Speaking. However, both Google Voice Typing and Windows Speech Recognition cost nothing and deliver roughly 90 percent accuracy or better. Each has its own strengths and weaknesses, and you might prefer one over the other.
H.P. Lovecraft (Fiction Writing Test)
Lovecraft loved writing in long, unbroken, parenthetically dense prose. While all three suites do a great job of accurately transcribing Lovecraft’s vocalized text, DNS comes out ahead of its competitors. It includes both capitalization and punctuation (which is completely insane).
DNS: DNS only dropped a single word from the text. Overall, it scored 107 correct out of 108 words. It nailed several long, non-stop sentences as well.
WSR: Windows did a very good job — but not amazing — of transcribing Lovecraft. It got around 97 of around 108 words correct. While that falls short of both GDVT and DNS, it’s still good for a free speech to text program that doesn’t require online access.
GDVT: I’m not sure what happened because Google nailed the transcription for the other excerpts. GDVT only achieved 103 right out of 108, dropping two words and mistranscribing three. It even once spelled out “semicolon” instead of inserting the correct punctuation. It also capitalized certain words, turning them into proper nouns (but I won’t penalize it since it’s accuracy and not capitalization that matters).
I’m pretty sure that if I read the passage aloud a second time, the transcript wouldn’t have any errors.
Charles Darwin (Scientific or Academic Writing Test)
Like Lovecraft, Darwin writes in long sentences loaded with parenthetical information. However, his language is very clear and he uses almost no jargon, unlike a lot of nearly incomprehensible science writing today.
DNS: Darwin’s text comes out near perfect in Dragon Naturally Speaking. DNS misspelled only one word (“into”) and otherwise completely nailed the test with 87 words right out of 88.
WSR: Microsoft did a great job, matching 82 out of 88 words. It made some relatively bizarre errors, though, like spelling “sense” as “cents”.
GDVT: Google did great on Darwin’s excerpt. GDVT only fouled up two words, out of 88. Overall, for a free application, you can’t find a more accurate alternative.
Jerry Brown State of the State Address 2017 (Business Writing Test)
Brown’s speech doesn’t use a lot of complicated sentences or vocabulary (aside from the word “rostrum”). Overall, all three transcription services performed amazingly. More or less, if you need a service that handles simple sentences and limited vocabulary, any one of these works great.
DNS: DNS nailed Brown’s State of the State Address. While it dropped a period, otherwise, it got every word perfectly. Note, though, that political speeches oftentimes lack the sort of complex language that you might see in fiction or academia. A memo or speech is direct and to the point. That’s something a speech recognition client shouldn’t have any problems handling.
WSR: Windows Speech Recognition did a great job — although not as great as DNS or Google — at transcribing Brown’s speech. It scored 55 out of 58 words. It even recognized the word “rostrum,” which I didn’t even know was a word, nor did I know how to pronounce it. Apparently, either I got it right or speech recognition technology can even catch mispronunciations.
GDVT: Google’s transcription software absolutely nailed the transcription, with 100 percent accuracy. It even managed to correctly capitalize “State of the State”, without needing user input. Oddly, it did use the numeral rather than the spelled-out word for “six”, which is a stylistic error rather than a transcription error.
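To put the scores above side by side as percentages, here is a minimal Python sketch. The numbers come straight from the results in this article, with a few assumptions on my part: the Windows score for Lovecraft is approximate (“around 97 of around 108”), Google’s two errors on Darwin are counted as 86 of 88, and a dropped period or a numeral in place of a word is not counted as a word error. The `scores`, `accuracy`, and `overall` names are only for illustration.

```python
# (correct words, total words) for each program on each passage,
# taken from the results reported in this article.
scores = {
    "Lovecraft": {"DNS": (107, 108), "WSR": (97, 108), "GDVT": (103, 108)},
    "Darwin": {"DNS": (87, 88), "WSR": (82, 88), "GDVT": (86, 88)},
    "Brown": {"DNS": (58, 58), "WSR": (55, 58), "GDVT": (58, 58)},
}


def accuracy(correct, total):
    # Word accuracy as a percentage.
    return 100 * correct / total


def overall(program):
    # Pool all three passages so longer passages carry more weight.
    correct = sum(scores[passage][program][0] for passage in scores)
    total = sum(scores[passage][program][1] for passage in scores)
    return accuracy(correct, total)


for passage, results in scores.items():
    row = ", ".join(f"{name} {accuracy(*s):.1f}%" for name, s in results.items())
    print(f"{passage}: {row}")

for program in ("DNS", "WSR", "GDVT"):
    print(f"{program} overall: {overall(program):.1f}%")
```

Pooling the three passages gives a single figure per program, but with only 254 words in total, a difference of a word or two moves the percentages noticeably, so treat them as an impression and not a benchmark.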
Are Free Transcription Services Worth Using?
There is a difference between Dragon Naturally Speaking, Google Voice Typing, and Microsoft’s Windows Speech Recognition. Dragon is more accurate than its competitors. However, the best free program in terms of accuracy is — by a narrow margin — Google’s Voice Typing. While both Microsoft’s and Google’s transcription services compare less-than-favorably against DNS, they do not cost $30.
Contrasting the two free services against each other, Google offers better voice recognition accuracy, punctuation, and case, but it requires an internet connection. Google also sometimes captures things that you don’t intend, like spelling out a punctuation mark or capitalizing an ordinary word.
However, if you want a free transcription program that you don’t need an internet connection to use, Windows Speech Recognition fits the bill. It’s by no means bad and offers 90 percent of what Dragon Naturally Speaking offers. Give it a shot if you haven’t already.
What’s your favorite transcription service? Please let us know in the comments!