I was emailed an attachment scanned from a magazine; it was a nuisance and I wanted to convert it to text. There are of course a million ways to do this, and I recall that every multifunction printer used to come with an OCR facility, but what is the easiest way now? For a while I’ve used Microsoft OneNote for this: you just paste in an image, right-click, and choose the Copy Text from Picture option.
This normally works OK, but not this time. The results were not completely useless but included lots of errors: words missing, and words wrongly recognised or scrambled. I am not sure, for example, how the word “score” got recognised as “scMe”.
So I looked for a better solution online, trying to avoid ad-laden free OCR sites of unknown quality. I found Convertio, which has a straightforward introductory service with no registration or ads for the first 10 pages. It did a much better job, with only 3 or 4 errors; the text was converted correctly to two columns in a Word document, and a table was converted to a Word table. The main issue was that the converted text was tiny – 4pt – but that was reasonably easy to fix up. Convertio seems to have a much better recognition engine than OneNote.
I’ll be inclined to use Convertio again, but it also seems that Microsoft has got behind with this little corner of Office 365. Perhaps it should do something based on its Cognitive Services.