Tag Archives: nuance

Review: Nuance Dragon NaturallySpeaking 13

I have great admiration for Nuance Dragon NaturallySpeaking, mainly because of its superb speech recognition engine. New versions appear regularly and the recognition engine seems to improve a little each time. The recently released version 13 is no exception, and I am getting excellent results right now as I dictate into Word.

If you are still under the illusion that dictation is not viable unless you are unable to type, it may be that you have not tried Dragon recently. Another possibility is that you tried Dragon with a poor microphone. I recommend a high quality USB headset such as those from Plantronics or Jabra. USB is preferred since you are not dependent on the microphone preamplifier built into your PC, which is often poor.

At the same time, Dragon can be an intrusive application. The problem is that Dragon tries to accomplish two distinct tasks. One is to enable dictation and to some extent transcription of recordings, which is something anybody might want to take advantage of. For example, one of my uses is transcribing interviews, where I play my recording into a headset and read it back into the microphone. It is a lot quicker than the normal stop-start typing approach and even if it is a little less accurate the time-saving is worthwhile.

Incidentally, Dragon is nowhere near smart enough yet to transcribe an interview directly. Background noise combined with the variety of accents used make this generally a hopeless task. In principle though, there is no reason why software should not be able to accomplish this as both processing power and algorithms improve so watch this space.

The other task for which people use Dragon is as an assistive technology. Those unable to use mouse and keyboard need to be able to navigate the operating system and its applications by other means, and Dragon installs the hooks necessary for this to work. This is where the intrusive aspect comes to the fore, and I wish Dragon had a stripped down install option for those who simply want dictation.

I had some issues with the Outlook add-in, which I do not use anyway. Outlook complained about the add-in and automatically disabled it, following which it was Dragon’s turn to sulk:

image

That said, it is possible to configure it as you want. Because of this kind of annoyance, I tend to avoid Dragon’s add-ons for applications like Microsoft Outlook and Internet Explorer. If you are using Dragon as an assistive tool though, you probably need to get them working.

Dragon can be fiddly then, which is why users who dive in and expect excellent results quickly may well have a bad experience. Speech recognition and interaction with applications that were primarily designed for mouse and keyboard is a hard task; you will have to make some effort to get the best from it.

What’s new?

So what is new in version 13? The first thing you will notice is that the Dragon bar, which forms the main user interface, has been redesigned. The old one is docked right across the top of the screen by default and has traditional drop-down menus. You can also have it floating like this:

image

The new bar has modern touch-friendly icons, though these turn out to be drop-down menus in disguise:

image

There is also an option to collapse the bar when not in use, in which case it goes tiny:

image

Another user interface change is that the handy Dragon Sidebar, a help panel which shows what commands you can use in the current application and which changes dynamically according to context, has been revamped as the Learning Center. Here it is in Word, for example:

image

I like the Learning Center, which is a genuine help until you are familiar with all the commands.

The changes to the Dragon user interface are mostly cosmetic, but not entirely. One innovation is that the Dragon Bar now works in Store apps in Windows 8. Here I am dictating into Code Writer, a Store app:

image

It works, but this seems to be work in progress. Dragon is really a desktop application, and I found that some commands would mysteriously bounce me back to the desktop, and others just did not work. For example, the Bar prompted me to open the Dictation Box for an unsupported application, and moments later informed me that it could not be used here.

Another issue is that the Bar sits over the full-screen app, obstructing some of the text. You can work around this by shunting it to the right. My guess though is that you will have a frustrating time trying to use Dragon with Store apps; but it is good to see Nuance making the effort.

What else is new? Well, Nuance has made it easier to get started, and no longer forces you to complete a training exercise (training Dragon to understand you, not you to understand Dragon) before you can use a profile. It is not really a big change, since you should do this anyway in order to get good results.

There is also better support for web browsers other than Internet Explorer. In particular, there are extensions for Chrome and Firefox which Nuance says give “full text control”.

Worth upgrading?

If you want or need speech to text, Dragon is the best option out there, much better in my experience than what is built into Windows, and better on Windows than on a Mac. In that respect, I recommend it; though with the caveat that you should work with a high quality microphone and be willing to invest time and effort in training its recognition engine and learning to use it.

If you have an earlier version, even as far back as 11, is 13 worth the upgrade? That is hard to say. The user interface changes are mostly cosmetic; but if you use the latest Microsoft Office then getting the latest Dragon is worth it for best compatibility.

The other factor is the gradually improving speech recognition. Comparing the accuracy of, say, version 11 with version 13 would be a valuable exercise but sadly I have not found time to do it. I can report my impression that it makes fewer errors than ever in this version, but that is subjective.

Frankly, if you use dictation a lot, get the latest version anyway; even small improvements add up to more productivity and less frustration.

Review: Nuance Dragon Dictate 4 for the Mac

There is something liberating about working without a keyboard – and I do not mean stabbing hopefully at a touch screen. Voice control means you can sit back, easily refer to books or papers, and input text more quickly and naturally than is possible using a keyboard. Some conditions including RSI (Repetitive Strain Injury) may make dictation a necessity. I use dictation for transcribing interviews and for rapid text input generally. I do not often use dictation for controlling a computer, as opposed to entering and editing text, but this is also a key feature.

image

Nuance has the best voice recognition system available as far as I can tell, though my experience is mainly with Nuance Dragon NaturallySpeaking on Windows. But what about Mac users? For them, Nuance provides Dragon Dictate, which has recently been updated to version 4. It is not a port of Dragon NaturallySpeaking, but rather has its own distinctive features, though it is less comprehensive, and a glance at the Nuance forums suggests that Mac users feel a bit neglected.

Does Dragon Dictate 4 change that? The good news is that the voice recognition engine in Dragon Dictate appears to be just as good as the one in Dragon NaturallySpeaking. The accuracy is superb though you still have to be realistic. Some recognition problems are just very difficult and the software is bound to make mistakes especially in specialist fields – mine is programming and a specialist phrase like “JIT compiler” is bound to cause an error (Dragon thinks I want “Jet compiler”). Similarly, “pull request” became “full request”. Over time you can build up a custom vocabulary, but recognition will never be 100%, so a dictation system has to handle corrections as well as original input.

Setting up Dragon Dictate involves installing the software and then letting it create a profile and doing some training so that Dragon can learn the characteristics of your voice. I highly recommend using a good quality headset since without it we cannot expect accurate recognition. I found the setup process quick and painless and was soon up and running.

Dragon Dictate has five modes:

  • Dictation Mode is what you use most of the time.
  • Spelling Mode is for spelling out problematic words. You can speak the letters naturally or use the International Radio Alphabet (Alpha Bravo Charlie etc). It is a nice feature since if you know Dragon is likely to get something wrong, you can switch to Spelling Mode, enter the difficult word, and then go back to Dictation Mode.
  • Numbers Mode is for typing numbers.
  • Command Mode is for non-dictation commands. However, commands also work in Dictation Mode. The advantage of Command Mode is that Dragon will not misinterpret your commands as text input; but there is no way to configure Dictation Mode to prevent it interpreting speech intended as text as commands. The manual suggests that you use unnatural pauses for this. For example, if you are reviewing Dragon Dictate and want to type “Command Mode”, you can say “Command [pause] Mode” and get what you want.
  • Sleep Mode puts Dragon in a resting state, so for example you can take a telephone call without Dragon trying to transcribe it.

Switching mode is easy: just speak the mode you want. If Dragon is in Sleep Mode, you can say “Wake up”.

My initial experience with Dragon Dictate 4 was not too good. The problems were not with recognition but rather with navigating and correcting existing text, which I found harder than in Dragon NaturallySpeaking on Windows. In fact my attempts to make corrections all too often ended up with more and more errors as a correction went wrong and I would be trying to correct the correction, getting increasingly frustrated.

Using Microsoft Word 2011, I experienced unexpected behaviour. For example, if I put the cursor in between two words and dictated a word to insert, sometimes the word appeared elsewhere in the text.

Another odd thing: I dictated “for example”, and Dragon recognised it as “one example”. No problem: Dragon has a Recognition Window which lists alternatives when you say “correct” followed by the word you want to amend. I said “correct one” and the Recognition Window appeared offering “for example” as one of the choices. I selected it, but Dragon then entered “for example example” in the text. I was not offered the word “for” on its own.

Dragon Dictate 4 was rescued from a terrible review when I studied the manual. Towards the end is a section entitled “The Cache and the Golden Rule”. This explains that you should not combine the use of keyboard and mouse with dictation when editing a document. If you do, Dragon gets confused about the contents of the document and you see unexpected results. You can fix this with a special command, “Cache Document”, which tells the software to clear and rebuild the cache for the entire document.

If you are not aware of this issue, then you are likely to make increasing use of keyboard and mouse as Dragon gets it wrong, making the issue worse. That is exactly what had happened to me.
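The effect of a stale cache can be illustrated with a toy model. This is only a sketch of the general idea, not Nuance's implementation: the class `CachedDictation` and its methods are hypothetical names, and the assumption is simply that voice edits are located in a cached copy of the text while keyboard edits change the real document without updating that copy.

```python
class CachedDictation:
    """Toy model of a dictation tool that edits via its own cached copy of the text."""

    def __init__(self, text=""):
        self.document = text  # what is really in the editor
        self.cache = text     # what the dictation tool believes is there

    def keyboard_insert(self, pos, text):
        """A keyboard edit changes the document, but the cache never hears about it."""
        self.document = self.document[:pos] + text + self.document[pos:]

    def dictate_replace(self, old, new):
        """A voice correction: find `old` in the cache, then edit the document at that offset."""
        pos = self.cache.find(old)
        if pos == -1:
            raise ValueError(f"{old!r} not found in cached text")
        end = pos + len(old)
        self.cache = self.cache[:pos] + new + self.cache[end:]
        self.document = self.document[:pos] + new + self.document[end:]

    def cache_document(self):
        """Stand-in for a 'Cache Document' style command: rebuild the cache from the document."""
        self.cache = self.document


# Mixing keyboard and voice without rebuilding the cache puts the edit in the wrong place.
stale = CachedDictation("one example of good software")
stale.keyboard_insert(0, "Here is ")
stale.dictate_replace("good", "excellent")
print(stale.document)

# Rebuilding the cache first gives the intended result.
fresh = CachedDictation("one example of good software")
fresh.keyboard_insert(0, "Here is ")
fresh.cache_document()
fresh.dictate_replace("good", "excellent")
print(fresh.document)
```

In the first case the replacement lands eight characters too early, because the cached offsets no longer match the document. That is the same kind of misplaced text described above.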

Another key point is the difference between training and correcting. If you use the Recognition Window to make a change that is not in fact a recognition fault – such as changing “good” to “excellent” – then you will confuse the voice training. Rather, you should say “Select good”, to select the word you want to change, and then say “excellent” to overtype it.

After studying the manual, I got much better results, though Dragon Dictate still occasionally seems to have a mind of its own.

Nevertheless, this fussiness is a weakness in the software. The best software works the way you want it to, rather than making the user do things a certain way. Why cannot Dragon do its cache repairs automatically in the background?

Still, what Dragon offers is of high value, and in this case if you want the best results you have to do the homework.

There are a few other things to mention. Nuance offers a free app for the iPhone that lets you use it as a remote microphone. Personally I find a headset more convenient but I guess there are scenarios where this is useful.

There are also features in Dragon Dictate aimed at general system control. I tried the MouseGrid, which overlays a grid over the entire screen and lets you zoom into the area of interest for accurate mouse control. You can also move the mouse using Up, Down and so on under voice control, and perform single, double or triple clicks.

Conclusion? The software does not feel as complete or as polished as Dragon NaturallySpeaking, but the excellent voice recognition means that this is the best available for the Mac. Recommended, but with reservations.

Dragon Notes Review: quick voice to text for Windows 8, but is it good enough?

When I saw that Nuance had released a Dragon Notes app for Windows 8 I was intrigued for two reasons.

First, I am interested in tracking the health of the app market for Windows 8, and an app from a company as well respected as Nuance is worth looking at.

Second, I have great respect for the Dragon Dictate application for speech to text. Dragon Dictate is superb; indispensable if you cannot use a keyboard for some reason, and valuable even if you can, whether to fend off RSI (Repetitive Strain Injury) or to help transcribe an interview. If Notes is based on the same engine, it could be very useful.

I installed it for review and was intrigued to find that it is not a real Windows 8 app, installed from the Windows Store. Rather, it is a desktop app designed to look superficially like Metro, the touch-friendly user interface in Windows Store apps. That said, the effect is rather odd since it does not run full screen or support the normal gestures and conventions, like settings in the Charms menu.

image

Still, it is mostly touch-friendly. I say “mostly” because occasionally it departs from the Metro-style user interface and reverts to something more like desktop-style – like these small and ugly buttons in the delete confirmation dialog:

image

This is sloppy design; look at the lack of margin around the button captions, the childish “No Way!”, and the fact that these buttons are smaller than they should be for comfortable touch control.

In the main part of the user interface the design remains poor. The font size is too small and there appears to be no way to change it. “Settings” lets you access Help, select language, connect to Twitter and Facebook, and register the product. That is all.

The big question though: how well does it work? Dragon Notes is different from Dragon Dictate, in that there is no voice training; it just does its best with whatever voice it hears.

Notes are easy to make; just tap Record, and tap again (or stop talking) to finish. You can transcribe for a maximum of 30 seconds, though you can also append to an existing note.

My initial results on a Surface Pro tablet, using the built-in microphone, were dire. Hardly any words were recognised. Before giving up though, I had a look at the microphone settings and made a recording using Sound Recorder. The result was a distorted mess, and I do not blame Dragon Notes for making no sense of it. I changed the levels in Windows, reducing the “Microphone Boost” until the level was reasonable but not distorted.

image  

The improvement in Dragon Notes was dramatic. Speaking a simple note slowly and carefully I could get almost perfect accuracy.
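A quick way to confirm that too much microphone boost is the culprit is to check how many samples in a test recording sit at full scale. The following is a minimal sketch, assuming 16-bit PCM samples already loaded as a list of integers; the function names and the 99% threshold are illustrative choices, not anything taken from Dragon or Windows.

```python
import math


def clipped_fraction(samples, full_scale=32767, margin=0.99):
    """Return the fraction of 16-bit samples at or very near full scale."""
    if not samples:
        raise ValueError("no samples to analyse")
    threshold = full_scale * margin
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / len(samples)


def sine_with_gain(gain, n=8000, freq=440.0, rate=8000):
    """Synthetic test tone: a sine wave scaled by `gain`, then clamped to the 16-bit range."""
    out = []
    for i in range(n):
        value = gain * 20000 * math.sin(2 * math.pi * freq * i / rate)
        out.append(int(max(-32768, min(32767, value))))
    return out


# A sensible level never reaches full scale; an over-boosted one spends much of its time there.
print(clipped_fraction(sine_with_gain(1.0)))
print(clipped_fraction(sine_with_gain(3.0)))
```

If more than a small fraction of samples are clipped, reducing the input level is likely to help recognition more than anything else.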

I attached a high quality Plantronics headset and tried Wordsworth’s Daffodils:

image

Not bad, but not perfect either. (I did dictate “over” rather than “o’er” as the latter is just too difficult for Dragon).

Here is one of my efforts with the built-in microphone:

image

Again, not that bad, but not something you could use without editing.

And that could be a problem. In the full Dragon Dictate you can use commands like “Select Fattening” and then select a correction, or repeat the word, or spell it. The only commands in Dragon Notes are for basic punctuation, posting to Facebook and Twitter, sending in an email, or searching the web.

This last is fun when it works. Tap to record, speak a word or phrase, then when it is recognised say “Search the web”.

image 

Summary: simple voice to text that works somewhat, terrible user interface design but basic enough that you will not struggle to use it.

Imitating a Metro user interface is a mistake; it is neither one thing nor the other. It is a shame Nuance did not do a proper Windows Store app.

That aside, how useful is this? It all hinges on the quality of the voice recognition, which will vary according to your voice, your microphone, and the quietness of your surroundings.

In the worst case it will be useless. In the best case, I can see some value in dictating a quick note rather than struggling to type with the on-screen keyboard, presuming you are in fact using a tablet.

It would help though if Dragon would record your voice as well as transcribing it, so that if the text is not intelligible you can later refer back to the recording.

A lot of the time you will end up having to edit the note with the keyboard to fix problems, which lessens its value.

Plenty of potential here, but with sloppy fake Metro design and features that are too limited it cannot yet be recommended.

More information on Dragon Notes is here.

Review: Dragon NaturallySpeaking 12. Stunning accuracy, a few annoyances

I am writing this review, or should I say dictating, in Nuance’s Dragon NaturallySpeaking 12, the latest version of what is in my experience the most accurate speech recognition system out there. Accuracy has got to the point where the great majority of words are recognised perfectly. There are a few intractable problems though. How is a dictation system meant to distinguish between nuances and Nuance’s, for example? The answer is generally that it cannot, but in mitigation Dragon has an excellent correction box. You speak a command to select the intransigent word, and either select the correct spelling from a list or in the worst case spell it out. After a bit of practice you can progress quickly and easily.

image

First, a few quick facts about the system. Your first task after running setup is to set levels and check the quality of your microphone. Nuance supplies a microphone in the box, which is worth it because the average user is unlikely to have a suitable microphone of good enough quality. That said, I was unhappy with the quality of the microphone supplied this time around and will return to this issue later. There is a handy fold-out reference card supplied, a nice touch.

Once set up, Dragon walks you through a quick training exercise during which it sets up a profile with some knowledge about your particular voice. I remember spending ages training early voice recognition systems and it was a tedious procedure. This is no longer the case and Dragon can be set up effectively in just a few minutes.

Dragon runs by default with a menu bar across the top of the screen and a contextual sidebar which lists common commands for the particular application you are using. The sidebar also gives a quick reference to global commands such as those to wake or sleep the microphone, move the mouse, or even post to Twitter or Facebook. Once you have learned all the commands, you can close the sidebar to get your screen space back.

image

Dragon works best in applications which are supported, which includes the obvious ones like Word and OpenOffice. In other applications you can use a dictation box which lets you dictate into a Dragon window and then transfer your text in either plain or Rich Text Format. Microsoft Office support depends on an add-in. Unfortunately I am currently running the Office 2013 preview and the add-in currently causes Word to crash. No doubt this will be fixed when the final version of Office is released. As an alternative I used OpenOffice which worked fine. I was also able to use Word 2013 with the dictation box.

While the accuracy is impressive, I did find that recognition slows down on occasion for no obvious reason, which is annoying and slows down your work.

Dragon is not limited to text input. You can run your entire Windows session with speech, using it to switch between windows, move and click the mouse. I found that Dragon works well in dialogs, using the Tab command to switch between fields, and Click … to click buttons and checkboxes.

If you have the Premium edition, you can also use Dragon to transcribe recordings and to read back editable text. Do not get your hopes up too much. If you create a recording of your own voice using a high quality recorder, you can get good results. I tried transcribing a telephone call though, and got gibberish.

So what is new in Dragon 12? It has to be said that version 11.5 was already very good. Accuracy is perhaps slightly improved, but not as much as 11.5 improved over 11. You do get the Dictation Box. You also get browser extensions for the Web-based Gmail and Hotmail provided you use a supported browser, which includes IE9, Firefox 12 or higher, and Google Chrome 16 or higher. I tested this with Gmail in Chrome and it does make a big difference to usability. Go to a Google Doc though, and it is back to the Dictation Box.

Also new in version 12 is the ability to disable voice commands that you do not use to boost performance. The full list of new features is available on the Nuance website.

Now about that microphone. The headset that came in my box is called the HS-GEN-C, and includes an adaptor so it can be used with the combined earbud/microphone inputs now common, especially on tablets and laptops. However I had difficulty getting this to work well. It failed Dragon’s built-in microphone test at first, though with some effort and speaking more loudly than usual I managed to get it reported as “acceptable”. This could be because of a poor microphone preamp on the PC, though I got the same results with another machine. I did not want to test the software with doubtful microphone input, so I used the Plantronics Bluetooth headset that came with Dragon 11.5 instead. This passed the microphone check first time.

image

I also tried Dragon NaturallySpeaking with Windows 8. The news is mixed. On the plus side, Dragon worked fine in the Windows desktop and with applications like Google Chrome and OpenOffice Writer. When I switched to the Modern UI (formerly known as Metro) though, I could not get Dragon to work at all. This does not surprise me since the Windows Runtime environment is different from the desktop. I do not see how the Dragon sidebar will ever work, for example, since all apps run full-screen. Nor is the Dragon bar available in the Modern UI. Microsoft does claim an accessibility story for Windows 8, and I am asking Nuance what if anything is planned for Dragon NaturallySpeaking in this respect.

Do not try to use Dragon with Microsoft’s Office 2013 preview; wait for the final version and proper support.

Conclusion

Dragon NaturallySpeaking combines a high standard of accuracy with strong correction tools. If you are wondering whether speech recognition is a viable and productive technique for text input, have no doubt that it is.

There is still scope for improvement. If I can make sense of my recorded telephone call, then in principle voice recognition should be able to do so as well. It will get there.

Is Dragon now more productive than keyboard and mouse, if you have the choice? It may be in some scenarios, but probably not for expert typists. If you are in the habit of frequently switching applications, for example to research an article you are typing, Dragon can get in the way.

Is Dragon 12 worth the upgrade? From 11.5, that is doubtful unless one of the new features matters a lot to you, perhaps because you use Gmail frequently, for example. From older versions, it probably is.

I am puzzled why Nuance supplies what in my experience was a poor headset for the purpose, though you may be luckier (and the box says “actual model may vary”). I preferred the Plantronics headsets that used to be bundled, but guess that the cost was higher. If you do serious amounts of dictation, do not skimp on the headset as it soon pays for itself.

  

Review: Dragon NaturallySpeaking 11.5

Nuance Dragon NaturallySpeaking is a voice dictation system for Windows, and there is a similar but not identical version available for the Mac. I have been trying version 11.5 in its Premium edition.

Voice recognition is interesting on several levels. Dictation can be quicker than typing, avoids repetitive strain injury, and for some users may be the only practical way to input text and control a computer.

Voice control is also a computing aspiration. In science fiction novels and films from 40 or 50 years ago, the characters use voice to interact with computers like Asimov’s Multivac or Kubrick’s HAL in 2001: A Space Odyssey as a matter of course. It has proved a difficult problem though, and even the best voice recognition systems are frustrating to work with, since mistakes are frequent and corrections difficult.

That said, Dragon NaturallySpeaking is the best I have used. Let me answer a few questions:

Q: Is Dragon good enough to use for real work?

A: Yes. Fire up Dragon, then Microsoft Word, start dictating, and you can write a document without too much pain. Of course there will be errors, but Dragon has an excellent correction system. In the following example, I said “The reason” but Dragon heard “Losing”. I then spoke the command “Select losing” and Dragon popped up a selection box.

image

Now I just have to say “Choose 1” and the error will be fixed.

It is not always so easy, and you may have to spell words like place names and specialist vocabulary, but Dragon learns and you get better at dictating, so perseverance pays.

Dragon has a sidebar which is great when you are learning the system, as it shows brief contextual help for the most commonly used commands. It does occupy significant screen space, so best used when you have a large screen or more than one display.

Q: What is the key to success with dictation?

First, use a good microphone. Some editions of Dragon come with a Plantronics Bluetooth headset, which is ideal for the task. Trying to dictate using the mic built into a laptop, or one of those cheap gaming mics, will only lead to frustration.

Second, be patient. Your first day or two with Dragon will be frustrating, but it gets better.

A quiet room also helps, but with a headset this is not so critical.

Q: Is Dragon good enough that you would use it by choice, even when you could use keyboard and mouse?

For me, not yet. I type professionally, so I am pretty fast, and I do find Dragon gets in the way. If I could reel off a few thousand words in one blast, I might use Dragon, but in practice I find I need to task-switch frequently, checking a fact, searching the web, finding a screenshot, or listening to an interview. You can do almost anything in Windows using Dragon, but using a mouse and keyboard is much quicker. If you use Dragon just for dictation that is fine, though you do have to set Dragon to stop listening when you are performing other tasks, otherwise Dragon will do something unexpected.

Work patterns vary, and some voices are easier than others for Dragon to interpret, so this is a matter of individual preference.

Q: Do you need Dragon when Windows has its own voice recognition system?

I did a quick test. I read the following paragraph, from a guide book that happens to be close by:

Original:

This little book is not properly a “guide” but rather a collection of random notes and thoughts, and I have published it mainly as a souvenir for those who make a short journey from Wroxham with Broads Tours.

Windows 7:

This little book it is not properly A “guide” but rather a collection of London dates and courts, and I had published in mainly as a souvenir for those who make a short journey from locks on withdrawn schools.

Dragon:

This little book is not properly a “guide” but rather a collection of random notes and thoughts, and I have published it mainly as a souvenir for those who make a short journey from locks and with Broads Tours.

Not a rigorous test; but with my voice and on this particular passage Dragon is well ahead, and that accords with my general impression. I do think the Windows system is usable, but the extra cost of Dragon is worth it if you expect to use dictation frequently.
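A comparison like this can be put on a numerical footing with word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the recognised text into the original, divided by the number of words in the original. The sketch below is one simple way to compute it, assuming case-insensitive matching and whitespace-separated words with punctuation left attached; `word_error_rate` is an illustrative name, not part of any Dragon or Windows tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text is empty")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from the hypothesis
                            curr[j - 1] + 1,      # extra word in the hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


original = ("This little book is not properly a “guide” but rather a collection of "
            "random notes and thoughts, and I have published it mainly as a souvenir "
            "for those who make a short journey from Wroxham with Broads Tours.")
windows7 = ("This little book it is not properly A “guide” but rather a collection of "
            "London dates and courts, and I had published in mainly as a souvenir "
            "for those who make a short journey from locks on withdrawn schools.")
dragon = ("This little book is not properly a “guide” but rather a collection of "
          "random notes and thoughts, and I have published it mainly as a souvenir "
          "for those who make a short journey from locks and with Broads Tours.")

print(f"Windows 7: {word_error_rate(original, windows7):.1%}")
print(f"Dragon:    {word_error_rate(original, dragon):.1%}")
```

On this one passage the measure gives roughly 5% for Dragon against 26% for Windows 7. A single sentence is far too small a sample to generalise from, but the same function could be run over longer dictations to compare versions.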

Q: Any other snags with Dragon?

Yes. Dragon hooks deeply into Windows, as it must do in order to control things like window switching and mouse movement, and I saw an impact on performance and stability. I suspect this can be improved by fine-tuning Dragon’s configuration and by keeping Windows as plain as possible. It also seems to work much better with software for which it is specifically designed, such as Microsoft Office, than with generic text input into software it does not know about, such as Windows Live Writer.

Q: What is new in version 11.5?

Dragon NaturallySpeaking 11.5 is a free upgrade from 11. The most obvious new feature from 11 is that you can use an iPhone as a remote wi-fi microphone. I tried this, which requires creating a new profile specifically for the purpose, and found it works nearly as well as with the Plantronics headset. However, the headset is a lot more convenient so I am not sure what the benefit is.

There are also new commands including “Post to Twitter” and “Post to Facebook”, and both the user interface and the voice recognition engine have been fine-tuned in this version.

Finally, version 11.5 specifically supports Windows 7 SP1 and Internet Explorer 9.

Q: Any other features worth mentioning?

The Premium edition has a transcription feature. No, this will not successfully interpret your recorded interview, though I suppose this might work in ideal circumstances. Rather, it is intended to let you dictate into a recording device for transcription later. This is an interesting way of working. It is easier to pause and restart a recorder than to interrupt a live dictation session, and Dragon can take more time over analysing a recording than when it has to keep up with your voice.

Concluding remarks

Nuance Dragon NaturallySpeaking gets significantly better with each new version, tipping me further towards the point where I may start using it in preference to typing. It is not only a matter of improved algorithms, but also more powerful hardware that enables Dragon to do more intensive processing. Although I am not quite ready to use it myself day to day, I think this is a brilliant product, and would not hesitate to recommend it. I also think it is inevitable that voice dictation will eventually become the norm for text input, at least in quiet environments, as the technology continues to improve.