Voice recognition used to be horribly inaccurate. It only worked for a handful of people, a handful of times. But now it’s actually rather good, thanks to the combined efforts of Nuance, Microsoft, Apple, and Google, who have thrown countless resources at improving it.
Of all those companies, few have matched Google’s commitment to voice recognition, which the company has made a linchpin of its mobile and services strategy.
One of Google’s earliest forays was the short-lived GOOG-411 (or Google Voice Local Search). It launched in 2007 and allowed people to search for business phone numbers using their voice. Voice recognition technology has also been a centerpiece of Android, and with the launch of Jelly Bean it finally became available offline.
Earlier this week, Google finally introduced voice recognition into Google Docs.
Users can literally dictate their documents (much like I am doing with this article) without the need to install any additional software or plug-ins. It’s a significant leap forward for the online office suite, but is it any good?
Before we start diving into its features, I want to touch on how you get Google Voice Typing. If you have a Google account, you already have this. Just open Google Docs, and open a new or existing document. Then, a window will pop up that will ask if you would like to try voice dictation. Click Try It.
Next, you have to give Google Docs permission to use your microphone. That’s just a matter of clicking Allow in the pop-up window.
Then, you have to select the language you want to use with Google Voice Typing. The languages and dialects on offer range from English and Spanish to Afrikaans and Arabic.
Then, just click the microphone icon and start to talk.
How Accurate Is It?
One of the biggest hurdles to voice recognition hitting the mainstream is that it’s often not accurate enough. It used to be a given that if you used voice recognition, you would have to spend a good few hours editing and correcting your text. So how does Google’s offering fare in this respect?
Pretty favorably, actually. For the most part, Google Voice Typing understood what I said, even though I’ve got a regional English accent (we’ll talk about accents later).
I was especially impressed with the way Google’s voice recognition handled background noise. As I wrote this article, a Yorkshire Terrier was barking in my living room, and my window was partially open. I live on a busy road where cars drive past constantly. Despite all that, Google was able to filter out the noise and focus on just what I was saying.
The biggest problem was that Google Voice Typing often struggled with punctuation. I would say “comma”, “period” and “full stop”, and it would interpret that as me wanting to write the words “comma”, “period”, and “full stop”. This was frustrating for two reasons.
Firstly, it would taunt me by briefly inserting the correct punctuation before immediately reverting to the spelled-out version of the word, leaving me to manually edit the document to fix it.
But, perhaps worse, I couldn’t prevent it from happening. There’s no dictionary where you can override spellings. It just happens, and you have to deal with it.
I don’t want to understate how frustrating this is. It’s seriously annoying. But it’s also something I’m confident will be improved upon as more and more people use this feature, and as Google commits more resources to improving its voice recognition.
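Until that happens, one workaround is to clean up the text after dictating. What follows is only a rough sketch of the kind of override dictionary I wished for, written as a Python post-processing step. It is not a Google Docs feature, and the names `SPOKEN_PUNCTUATION` and `fix_spoken_punctuation` are made up for illustration.

```python
import re

# Hypothetical override table: spoken phrase (lowercase) -> symbol.
# This is an illustration only; Google Docs exposes nothing like it.
SPOKEN_PUNCTUATION = {
    "full stop": ".",
    "period": ".",
    "comma": ",",
}


def fix_spoken_punctuation(text, overrides=None):
    """Replace spelled-out punctuation words with the symbols they name."""
    if overrides is None:
        overrides = SPOKEN_PUNCTUATION
    # Try longer phrases first so multi-word entries such as "full stop"
    # are matched as a whole.
    phrases = sorted(overrides, key=len, reverse=True)
    pattern = re.compile(
        r"\s*\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        flags=re.IGNORECASE,
    )
    # Swallow the whitespace before the word so the symbol attaches to
    # the preceding word, as normal punctuation would.
    return pattern.sub(lambda m: overrides[m.group(1).lower()], text)


print(fix_spoken_punctuation("Hello comma world full stop"))
```

The obvious limitation is that a blind find-and-replace can’t tell when you actually meant the word, so a sentence about “the Victorian period” would be mangled too. It also doesn’t capitalize the start of the next sentence.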
Besides that particular annoyance, I was pretty pleased with the accuracy of Google’s voice recognition.
How It Handles Accents
I was amazed by how many languages and dialects Google Voice Typing supports. In English alone, it supports the New Zealand, Australian, Indian, South African, American, and British dialects, to name just a few. The problem is there isn’t really an American accent, much like there isn’t a British accent. Rather, there is a range of accents and dialects that differ from place to place.
It’s a truism that the UK has an accent for each postcode. The MakeUseOf Team boasts a range of different accents among the British staff. Christian Cawley speaks with a broad Middlesbrough accent, and Rob Nightingale, who hails from Southport, has a more Northern drawl, while Mark O’Neil has a Scottish twang.
I live in Liverpool, so I have a Scouse accent that slightly drifts into the Atlantic, largely thanks to my American fiancee and the time I spent living in Switzerland.
And it’s fair to say that voice recognition programs often struggle to understand regional English dialects. When Siri came out, for example, its inability to understand Scottish users became a running joke.
But Google’s offering was exceptional. Believe me when I say you won’t have to practice speaking with a different accent. I’ve spoken to a handful of friends who also have regional English accents, and they’ve had similarly positive experiences with it. While I admit that’s a small and completely unscientific sample, it’s certainly promising.
Voice Dictation Speed
Voice recognition programs have traditionally been hamstrung by an inability to keep up with the speed at which the user dictates. Admittedly, I was a little bit concerned that Google’s offering would be no different, especially given that it’s an online service, rather than a program running on my souped-up MacBook Pro.
But I was impressed. Google was able to keep up with my highly-caffeinated rate of speaking, and didn’t act as a bottleneck to my productivity. It was the complete opposite of my experiences with other voice dictation tools.
I don’t know whether that was because I have a fast FTTC (Fiber to the Cabinet) home Internet connection, or the fact that Google has a limitless supply of fast servers at its disposal. Either way, I was able to get stuff done.
A Note on Microphones
Built-in microphones tend to be hit-and-miss. In my experience, they are either excellent, like they are on Apple’s laptops, or they aren’t. There’s very seldom any middle ground.
As a general rule, the cheapest laptops will have the worst internal microphones. It’s just one of those features that tend to be overlooked by device manufacturers.
I started dictating this article using the internal microphone on my MacBook Pro. Although Google Docs frequently said it was having trouble hearing me, that didn’t translate to slower or inaccurate dictation. Everything worked just fine.
I also tried Google Voice Typing with an expensive Blue Yeti external microphone. These are podcast-quality microphones that retail on Amazon for over $100.
Admittedly, I didn’t notice any differences when it came to the accuracy or the speed of the dictation. However, the biggest advantage to using this microphone was that I was able to plug in a pair of AKG headphones and use them as a monitor. This allowed me to be more aware of background noise, and to self-adjust if I was being either too loud or too quiet.
Google voice recognition isn’t perfect. But that’s hardly a surprise, as solid voice recognition is a pretty hard feat to pull off. There’s a lot that I felt could be improved.
This mostly centers around how the software deals with punctuation and sentence structure. In an ideal world, Google would automatically insert punctuation based on the rhythm and cadence of your voice, but we’re a long way away from that.
It’s also a pity that this software has yet to make its way into Google’s other offerings, like Gmail. Ideally, I’d like the opportunity to download Google Voice Typing as an app, and use it with other pieces of software, like iWork’s Pages or the markdown editor iA Writer.
But those are two minor annoyances. Otherwise, Google Voice Typing is as good as it gets. For comparison, I wrote this section of the article using the built-in voice recognition of OS X, and it was nowhere near as accurate, nor as fast.
If this doesn’t persuade people to switch to Google Drive, I don’t know what will.
Now over to you! Have you been tempted by Google Drive’s speech recognition? Have you tried it out yet? Tell me all about it in the comments below.