A cross-platform CLI utility to extract text from a screenshot.
Run it, select an area on your screen, and the text will be copied to your clipboard.
It relies on existing external tools for screenshotting, OCR, and clipboard management.
Built as a good-enough solution until someone builds a cross-platform, super-fast native Rust application that does the same thing better 😅
Testing Details
| Platform | Tested On | Status |
|---|---|---|
| Linux (X11) | Native Ubuntu 24.04 (X11) | |
| Linux (Wayland) | Quickemu Ubuntu 25.04 (Wayland) | |
| macOS | Quickemu macOS Sequoia | |
You will need to have one of the supported tools for each category installed on your system.
| Platform | Screenshot Tools | OCR | Clipboard Tools |
|---|---|---|---|
| Linux (X11) | flameshot, gnome-screenshot | tesseract | xclip |
| Linux (Wayland) | flameshot | tesseract | wl-copy |
| macOS | flameshot, screencapture | tesseract | pbcopy |
Linux (X11):
sudo apt-get install gnome-screenshot tesseract-ocr xclip
Linux (Wayland):
sudo apt-get install flameshot tesseract-ocr wl-clipboard
macOS:
brew install tesseract
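To check which of the supported tools are already installed, one option is to probe the `PATH` for each executable. The sketch below is not taken from the project's source: `SUPPORTED_TOOLS`, `find_available`, and `missing_categories` are hypothetical names, and only the tool lists come from the table above.

```python
import shutil

# Executable names per platform and category, copied from the table above.
# The dictionary layout itself is an assumption made for illustration.
SUPPORTED_TOOLS = {
    "linux-x11": {
        "screenshot": ["flameshot", "gnome-screenshot"],
        "ocr": ["tesseract"],
        "clipboard": ["xclip"],
    },
    "linux-wayland": {
        "screenshot": ["flameshot"],
        "ocr": ["tesseract"],
        "clipboard": ["wl-copy"],
    },
    "macos": {
        "screenshot": ["flameshot", "screencapture"],
        "ocr": ["tesseract"],
        "clipboard": ["pbcopy"],
    },
}


def find_available(platform, which=shutil.which):
    """Return, per category, the supported tools found on PATH."""
    return {
        category: [tool for tool in tools if which(tool) is not None]
        for category, tools in SUPPORTED_TOOLS[platform].items()
    }


def missing_categories(available):
    """Return the categories for which no supported tool was found."""
    return [category for category, tools in available.items() if not tools]
```

Calling `missing_categories(find_available("linux-x11"))` returns the categories that still need a tool installed. The `which` parameter exists only so the lookup can be replaced in tests.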
We recommend installing with pipx or uv:
pipx install screenshot-to-text
or
uv tool install screenshot-to-text
Before the first run, you need to configure the tool:
s2t config
This command will prompt you to select the tools you want to use for screenshotting, OCR, and clipboard operations from the available tools on your system. If your system is missing the necessary tools, it will ask you to install them.
It will create a config.toml file in your user configuration directory.
To take a screenshot and extract text to clipboard, run:
s2t run
It is recommended to attach s2t run to a hotkey for maximum convenience.
You can also use the alias screenshot-to-text.
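Conceptually, a run chains three external tools: capture, OCR, clipboard. The following is a minimal sketch of that chain, assuming the X11 tool set (`gnome-screenshot`, `tesseract`, `xclip`). It is not the project's actual implementation, and `screenshot_to_clipboard` is a hypothetical name.

```python
import subprocess
import tempfile
from pathlib import Path


def screenshot_to_clipboard(run=subprocess.run):
    """Capture a screen region, OCR it, and copy the text to the clipboard.

    Assumes gnome-screenshot, tesseract, and xclip are installed (X11).
    """
    with tempfile.TemporaryDirectory() as tmp:
        image = Path(tmp) / "capture.png"
        # -a: select an area interactively, -f: write the image to this file.
        run(["gnome-screenshot", "-a", "-f", str(image)], check=True)
        # Passing "stdout" as the output base makes tesseract print the text.
        ocr = run(
            ["tesseract", str(image), "stdout"],
            check=True,
            capture_output=True,
            text=True,
        )
    text = ocr.stdout.strip()
    # xclip reads the text to copy from standard input.
    run(["xclip", "-selection", "clipboard"], input=text, check=True, text=True)
    return text
```

The `run` parameter defaults to `subprocess.run` and is there so the external calls can be stubbed out. The temporary directory is removed after OCR, which corresponds to not keeping the screenshot.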
The run command accepts the following options:
- --keep-screenshot: Overrides the default config to keep the screenshot.
- --no-keep-screenshot: Overrides the default config to not keep the screenshot.
- --ocr-enabled: Overrides the default config to enable OCR.
- --no-ocr-enabled: Overrides the default config to disable OCR.
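The way such paired flags can take precedence over config values is sketched below with the standard library's `argparse.BooleanOptionalAction`. This is an illustration under assumptions, not the project's own CLI code: the config keys `keep_screenshot` and `ocr_enabled` and the helper names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with paired --flag / --no-flag options."""
    parser = argparse.ArgumentParser(prog="s2t run")
    # default=None lets us tell "flag not given" apart from an explicit choice.
    parser.add_argument(
        "--keep-screenshot", action=argparse.BooleanOptionalAction, default=None
    )
    parser.add_argument(
        "--ocr-enabled", action=argparse.BooleanOptionalAction, default=None
    )
    return parser


def effective_settings(config, argv):
    """Start from the config values and let any flag given on the CLI win."""
    args = build_parser().parse_args(argv)
    settings = dict(config)
    for key in ("keep_screenshot", "ocr_enabled"):
        value = getattr(args, key)
        if value is not None:
            settings[key] = value
    return settings
```

With a config of `{"keep_screenshot": False, "ocr_enabled": True}`, passing `["--keep-screenshot"]` flips only the first value, and passing no flags leaves the config untouched.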
Python versions >=3.11 are supported.
Contributions are welcome! Here are some areas where you can help:
- Add Windows Support: The current implementation is focused on Linux and macOS. Adding support for Windows would be a great contribution. This would involve finding and integrating with Windows-native command-line tools for screenshots and clipboard management.
- Support More OS-Native Tools: If you use a screenshot or clipboard tool that is not currently supported, feel free to open a pull request to add it.
- Add Support for AI OCR Libraries: Currently, only Tesseract is supported for OCR. It would be great to add support for modern AI-based OCR libraries and services (e.g., Google's OCR, or local AI models).
- Improve Image Preprocessing: The current image preprocessing is basic. You can contribute by improving the existing preprocessing steps or adding new ones to improve OCR accuracy. The preprocessing parameters could also be made configurable.
- Tweak Tesseract Parameters: The parameters for Tesseract could be exposed in the configuration file to allow users to fine-tune them for their specific needs.
- Better CLI Flags and Flexibility: The current CLI flags are not especially useful as implemented and can be improved.
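As a possible starting point for the Tesseract parameters idea above, the sketch below maps an optional settings dictionary to Tesseract command-line arguments. The `-l`, `--psm`, and `--oem` options are part of the Tesseract CLI. The `tesseract_command` helper and the `lang`, `psm`, and `oem` key names are hypothetical and do not reflect the current config format.

```python
def tesseract_command(image_path, options=None):
    """Build a tesseract argv list from an optional settings dictionary."""
    options = options or {}
    # "stdout" as the output base makes tesseract print the recognized text.
    command = ["tesseract", str(image_path), "stdout"]
    if "lang" in options:
        # -l selects the recognition language(s), e.g. "eng" or "eng+deu".
        command += ["-l", str(options["lang"])]
    if "psm" in options:
        # --psm sets the page segmentation mode.
        command += ["--psm", str(options["psm"])]
    if "oem" in options:
        # --oem sets the OCR engine mode.
        command += ["--oem", str(options["oem"])]
    return command
```

For example, `tesseract_command("capture.png", {"lang": "eng", "psm": 6})` yields `["tesseract", "capture.png", "stdout", "-l", "eng", "--psm", "6"]`, which can be passed to `subprocess.run`.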
Please open an issue or a pull request to discuss your ideas.
