Recognizing Text from Screenshots in Ubuntu

Motivation

The two top posts[1] on my very humble Facebook feed today:

  1. Tim posting about how he tried to use LLMs to learn Spanish.

  2. One of my high school friends posting an article titled Replying ‘Haha So True!’ to Every Meme Your Friend Sends: An Experimental Study in Preserving Social Bonds with Minimum Effort.

And these are quite representative of my two personas.

Anyway, Tim’s post contained a screenshot of his conversation with ChatGPT with a prompt that went like:

My native language is English. I want to practice writing in Spanish, at a high A1 level. Give me, in my native language, 10 sentences suitable for this level, one by one, starting with simple structures and gradually increasing in complexity. Please use a variety of verbs and adjectives. After each sentence, ask me for the translation in Spanish. I will provide my translation, and then you will correct my mistakes and explain the grammar or vocabulary points I haven’t mastered. Praise me when my answers are correct. Use common vocabulary and practical sentences for everyday life. At the end of the exercise, give me a performance review and provide me with advice to improve.

I did not want to retype this word for word, so I decided to find a script that could pull the text out of the screenshot for me.

Thankfully, a quick Kagi search was all it took.

The StackExchange Answer

The top answer on this StackExchange thread has a script, which I will copy over here in case that link goes down:

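#!/bin/bash
# Dependencies: tesseract-ocr (plus a language pack such as tesseract-ocr-fra) imagemagick scrot xsel

select tesseract_lang in eng fra equ; do break; done
# quick language menu, add more if you need other languages.
# note: tesseract's code for French is "fra", not "fre".

SCR_IMG=$(mktemp)
trap 'rm -f "$SCR_IMG"*' EXIT

scrot -s "$SCR_IMG.png" -q 100
# increase image quality with option -q from default 75 to 100

mogrify -modulate 100,0 -resize 400% "$SCR_IMG.png"
# desaturate and upscale; should increase detection rate

tesseract "$SCR_IMG.png" "$SCR_IMG" -l "${tesseract_lang:-eng}" &> /dev/null
# pass the language picked in the menu (falls back to eng if nothing was picked)
cat "$SCR_IMG.txt" | xsel -bi

exit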

This worked quite well. I just had to install xsel and an amusingly named utility called scrot (hello, second persona!).
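
Since the script sends tesseract's output to /dev/null, a missing dependency fails silently and you just end up with an empty clipboard. Here is a minimal sketch of a pre-flight check, assuming the four commands the script calls (tesseract, mogrify, scrot, xsel). The function name missing_commands is my own and not part of the StackExchange answer.

```bash
#!/bin/bash
# Print the name of every given command that cannot be found on PATH.
# (missing_commands is a hypothetical helper, not from the original answer.)
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
}

# mogrify ships with the imagemagick package, tesseract with tesseract-ocr.
missing=$(missing_commands tesseract mogrify scrot xsel)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing
    echo "On Ubuntu, try: sudo apt install tesseract-ocr imagemagick scrot xsel"
fi
```

If it prints nothing, all four commands are available. Language packs are separate packages, so for French you would also need tesseract-ocr-fra.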

Awesome Tool

StackExchange user danpla also posted his solution, which is actually a GUI program called dpScreenOCR.

Once it’s installed and opened, it’s simply a matter of pressing the hotkey (default is Ctrl+`) to start selecting a region of the screen and pressing it again to finish the selection.

From there, it can either copy the recognized text to your clipboard or save it in its history.

Conclusion

With this task done, I finally managed to ask ChatGPT to give me a B2 French quiz similar to Tim’s. And here is my performance review:

You’re making great progress! You’ve been able to handle a variety of sentence structures, including conditionals, comparisons, and compound tenses. You’re getting better with verb tenses and understanding nuances in vocabulary, especially in complex sentences.

It was honestly a pleasant experience. Initially, this part of the prompt made me raise an eyebrow

Praise me when my answers are correct.

since it felt like patting yourself on the back. But it did help encourage me to finish the quiz.

Plus, it added some suspense, because it wasn’t only pointing out my mistakes but also reinforcing the things I did well.

Bonus: Sample Question

Here is a sample question:

Sentence 8:
“They would travel more often if they had enough money.”

What is your translation?

And here was my answer:

Ils voyageraient plus souvent s’ils ont assez de sou.

And the response:

  1. “voyageraient”: Perfect use of the conditional tense! Great job!
  2. “s’ils ont”: When the main clause is in the conditional (voyageraient), the “if” clause must use the imperfect tense (avaient).
  3. “sou” vs. “argent”: Sou is an old-fashioned word for money, used in historical contexts or idiomatic expressions. Argent is the modern, standard term for “money.”
  4. “assez de”: Perfect use of the partitive construction. Well done!

Honestly, I’m not convinced about sou because I have been hearing it a lot lately.

Footnotes

  1. There was also a third post, Tuwing kelan dapat bumibili ng bagong brip? (roughly, “How often should you buy new briefs?”), which did spark some interesting discussions.