Photo Corners


A SCRAPBOOK OF SOLUTIONS FOR THE PHOTOGRAPHER

Enhancing the enjoyment of taking pictures with news that matters, features that entertain and images that delight. Published frequently.

Macro: Extract Text From an Image

19 April 2023

With the release of macOS Monterey, Apple brought Shortcuts, an automation tool previously available only on iOS, to the Mac. Shortcuts tapped into Vision, Apple's software framework for analyzing images, to scan images on the clipboard for text.

Our OCR Macro. A Keyboard Maestro macro with embedded AppleScript tapping into the Vision framework.

We never got Shortcuts to perform that trick on our unsupported hardware and wondered if Vision was for hardier metal than ours.

A BREAKTHROUGH

But the other day Chris Stone, moderator of the Keyboard Maestro forum, posted a note pointing to a discussion on the MacScripter forum about scripting Vision with AppleScript.

In that discussion, a user from Arizona named peavine shared several AppleScripts and one JavaScript for harnessing Vision. And because Keyboard Maestro can embed either in a macro, we thought we'd give it a try.

Our first attempt, which simply prompted for an image file and displayed the text recovered, was a success, although it took 24 seconds to process the data.

Our second try was a screen crop of a newspaper story that itself had a photo of some protestors holding signs. Not only was the text of the story recovered but so was most of the text on the signs.

A Newspaper Story. Even the hand-painted signs were read.

The font sizes ranged from large (easy to convert) to small (hard) and included ligatures and kerning, not to mention tight letterspacing. But the signs were on dark backgrounds and hand-written. The red lettering isn't legible to begin with so Vision didn't manage much there but it handled the white lettering without a problem.

We were impressed.

SOME OPTIONS

So we wondered what we could bring to this party.

Reading peavine's posts we learned that Vision has several options. The options let you fine-tune the recognition process to optimize results. Here's a brief rundown:

  • Recognition Method can be Accurate or Fast. The first mimics how we read by using a neural network to find text as strings and lines before looking for individual words and sentences. But the faster approach simply mimics traditional OCR techniques using a small machine learning model to recognize individual characters and words.
  • Languages can be configured to tell Vision what language the document primarily uses as well as which other languages appear in it.
  • Language Correction takes a few clock cycles to evaluate the processed text against specific language conventions. Disabling it is quicker but less accurate.
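
For readers who would rather see those three options in code, here is a minimal sketch in Swift, not the AppleScript embedded in our macro. The `OCROptions` struct and the two function names are our own inventions for illustration. The Vision calls themselves (`VNRecognizeTextRequest`, `recognitionLevel`, `recognitionLanguages`, `usesLanguageCorrection` and `VNImageRequestHandler`) are the framework's own. It assumes a Mac with the Vision framework available.

```swift
import Foundation
import Vision

// Hypothetical container for the three options described above.
struct OCROptions {
    var accurate = true              // Recognition Method: Accurate (true) or Fast (false)
    var languages = ["en-US"]        // primary language first, then any others
    var languageCorrection = true    // Language Correction on or off
}

// Build a Vision text recognition request configured from those options.
func makeTextRequest(_ options: OCROptions) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = options.accurate ? .accurate : .fast
    request.recognitionLanguages = options.languages
    request.usesLanguageCorrection = options.languageCorrection
    return request
}

// Run the request against an image file and return the recognized lines of text.
func recognizeText(in imageURL: URL, options: OCROptions = OCROptions()) throws -> [String] {
    let request = makeTextRequest(options)
    let handler = VNImageRequestHandler(url: imageURL, options: [:])
    try handler.perform([request])
    // The cast keeps the sketch compiling on SDKs where results is untyped.
    let observations = (request.results as? [VNRecognizedTextObservation]) ?? []
    // Keep only the top candidate string for each detected line.
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Calling `recognizeText(in:)` with the URL of an image file uses the defaults (Accurate, English, correction on). Passing something like `OCROptions(accurate: false, languages: ["fr-FR", "en-US"], languageCorrection: false)` trades accuracy for speed on a mostly French document.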

Depending on your source text, these options can have a significant effect on accuracy. So we incorporated them into our macro.

We only included a few languages in our lists but it's easy to add any others.

And we used two buttons to activate the macro after setting up the options for any particular image. One reads the image from the clipboard and the other reads an image file.

THE MACRO

At the moment, we have the file option running reliably but can't quite figure out how to get the clipboard option to work on our unsupported system (it could just be us).

You can download the macro and follow the discussion about it on the Keyboard Maestro forum. And if you still need a copy of Keyboard Maestro, the indispensable macOS utility, you can get a discount from us.

