Image to Text (OCR) — Extract Text from Any Image
Drop in a photo, screenshot, or scanned document and pull out the text in seconds. Powered by Tesseract.js — 100% in your browser, no upload, no server.
Image OCR (Optical Character Recognition) converts the text inside a photo, screenshot, or scanned document into machine-readable characters. It's the technology behind the "scan to editable" features in modern productivity apps, and you can now run it entirely in your browser without uploading anything.
TryDocsy's Image to Text tool uses Tesseract.js, a JavaScript library that runs the open-source Tesseract OCR engine compiled to WebAssembly. The OCR model therefore runs locally on your device: there's no server, no API key, and no usage limit. The trade-off is that the first run downloads the language data (~5–15 MB per language), so the very first OCR may take a few extra seconds.
Common use cases include extracting text from receipts, business cards, screenshots, handwritten notes, whiteboard photos, scanned book pages, and product labels. The extracted text can be copied to the clipboard or downloaded as a .txt file.
How it works
1. Choose an image (JPG, PNG, WEBP, BMP, or GIF). The file is processed locally; nothing is uploaded.
2. Choose the language of the text in your image. The default is English. You can select multiple languages for mixed content.
3. Click "Extract Text". The Tesseract.js engine runs in your browser. The first run is slower because the language data is downloaded once and then cached for subsequent uses.
4. Review the extracted text in the result panel. Use the bounding-box overlay (when enabled) to verify each detected character cluster.
5. Copy the result to your clipboard, or download it as a .txt file for use in other apps.
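The steps above can be sketched in TypeScript. This is a minimal sketch under stated assumptions, not TryDocsy's actual code: `OcrEngine`, `OcrResult`, and `extractText` are hypothetical names, and the engine is passed in as a parameter so the flow can be exercised without a browser.

```typescript
// Hypothetical abstraction over an in-browser OCR engine (an assumption
// made for illustration, not the Tesseract.js API itself).
interface OcrEngine<TImage> {
  // Loads (or reuses cached) language data, then recognizes the image.
  recognize(image: TImage, languages: string[]): Promise<string>;
}

interface OcrResult {
  text: string;
  languages: string[];
  fileName: string; // suggested name for the .txt download
}

// Steps 1-5: take a local image, pick languages (English by default),
// run OCR on the device, and prepare the text for copy or download.
async function extractText<TImage>(
  engine: OcrEngine<TImage>,
  image: TImage,
  imageName: string,
  languages: string[] = [],
): Promise<OcrResult> {
  const langs = languages.length > 0 ? languages : ["eng"];
  const raw = await engine.recognize(image, langs);
  // Swap the image extension for .txt, e.g. "receipt.jpg" -> "receipt.txt".
  const fileName = imageName.replace(/\.[^./\\]+$/, "") + ".txt";
  return { text: raw.trim(), languages: langs, fileName };
}
```

With Tesseract.js, the engine's `recognize` would wrap a worker created for the selected languages. A stub engine that returns a fixed string is enough to exercise the flow in tests.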
Supported languages
We support 20+ languages through Tesseract.js.
Frequently Asked Questions
Is my image uploaded to a server?
No. OCR processing happens entirely in your browser using Tesseract.js and WebAssembly. Your image is never uploaded, logged, or transmitted to any server.
How accurate is the OCR?
Accuracy depends on the image quality, text size, and language. Clean, high-resolution (300+ DPI) printed text typically achieves 95%+ accuracy. Handwriting and low-resolution text are recognized less accurately.
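To make the 300 DPI guideline concrete, resolution can be estimated from an image's pixel width and the physical width of the page it shows. This is a small sketch. `effectiveDpi` and `meetsOcrGuideline` are illustrative names, and the 300 DPI default is the guideline from the answer above, not a hard limit enforced by Tesseract.js.

```typescript
// Dots per inch = pixels across / inches across.
function effectiveDpi(pixelWidth: number, physicalWidthInches: number): number {
  if (pixelWidth <= 0 || physicalWidthInches <= 0) {
    throw new RangeError("width values must be positive");
  }
  return pixelWidth / physicalWidthInches;
}

// True when the image reaches the suggested resolution for printed text.
function meetsOcrGuideline(
  pixelWidth: number,
  physicalWidthInches: number,
  minDpi: number = 300,
): boolean {
  return effectiveDpi(pixelWidth, physicalWidthInches) >= minDpi;
}
```

For example, a US Letter page (8.5 inches wide) captured at 2550 pixels across works out to 300 DPI, while the same page at 1275 pixels across is only 150 DPI and is likely to be recognized less accurately.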
What languages are supported?
Tesseract.js supports 100+ languages including English, Spanish, French, German, Italian, Portuguese, Chinese (Simplified and Traditional), Japanese, Korean, Arabic, Hindi, Bengali, Russian, and many more. Use the language selector to add multiple languages for mixed-content images.
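Tesseract identifies each language by a short code for its trained-data file, such as `eng` for English or `chi_sim` for Simplified Chinese. The sketch below maps a selection from a language picker to those codes. The `LANGUAGE_CODES` table covers only the languages named above, and `toLanguageCodes` is a hypothetical helper, not part of Tesseract.js.

```typescript
// Display name -> Tesseract language code (limited to the languages listed above).
const LANGUAGE_CODES: Record<string, string> = {
  English: "eng",
  Spanish: "spa",
  French: "fra",
  German: "deu",
  Italian: "ita",
  Portuguese: "por",
  "Chinese (Simplified)": "chi_sim",
  "Chinese (Traditional)": "chi_tra",
  Japanese: "jpn",
  Korean: "kor",
  Arabic: "ara",
  Hindi: "hin",
  Bengali: "ben",
  Russian: "rus",
};

// Converts selected display names to unique codes, keeping selection order.
// Falls back to English when nothing is selected, matching the tool's default.
function toLanguageCodes(selected: string[]): string[] {
  const codes: string[] = [];
  for (const name of selected) {
    const code = LANGUAGE_CODES[name];
    if (code === undefined) {
      throw new Error(`Unsupported language: ${name}`);
    }
    if (codes.indexOf(code) === -1) {
      codes.push(code);
    }
  }
  return codes.length > 0 ? codes : ["eng"];
}
```

A mixed English and French document would then be recognized with the codes `eng` and `fra` together.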
What image formats are supported?
JPG, PNG, WEBP, BMP, and GIF. HEIC files are not directly supported in the browser — convert them to JPG first using our HEIC to JPG converter.
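A format check like this can be expressed as a lookup on the file's MIME type. This is a minimal sketch, assuming the type string comes from the browser's file picker. `checkImageType` and `FormatCheck` are hypothetical names, and some browsers report an empty type for HEIC files, which this sketch treats as unknown.

```typescript
// MIME types for the formats the tool lists as supported.
const SUPPORTED_TYPES: readonly string[] = [
  "image/jpeg",
  "image/png",
  "image/webp",
  "image/bmp",
  "image/gif",
];

// HEIC/HEIF are called out separately so the user gets a useful hint.
const HEIC_TYPES: readonly string[] = ["image/heic", "image/heif"];

type FormatCheck = { ok: true } | { ok: false; reason: string };

function checkImageType(mimeType: string): FormatCheck {
  const type = mimeType.trim().toLowerCase();
  if (SUPPORTED_TYPES.indexOf(type) !== -1) {
    return { ok: true };
  }
  if (HEIC_TYPES.indexOf(type) !== -1) {
    return { ok: false, reason: "HEIC is not supported; convert the file to JPG first." };
  }
  return { ok: false, reason: `Unsupported image type: ${type || "unknown"}` };
}
```

Checking before OCR starts lets the page point HEIC users to a converter instead of failing partway through recognition.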
Why is the first run slow?
The first run downloads the language data (typically ~5–15 MB per language). After the first run, the data is cached by your browser and subsequent OCR is much faster.
Can I use the OCR offline?
Once the language data is cached, you can use the OCR tool offline. The TryDocsy service worker caches the page and the WebAssembly bundle so the tool continues to work without an internet connection.
Is there a file size limit?
There is no hard limit. Because processing happens on your device, the practical limit is your available memory. Most images up to 50 MB process smoothly.
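The memory limit is easier to reason about with a number. A compressed image expands when decoded: as a rule of thumb, an uncompressed RGBA bitmap takes 4 bytes per pixel. The sketch below estimates that footprint. `decodedBytes` and `toMegabytes` are illustrative helpers, and the OCR engine's actual peak memory use is higher and is not specified here.

```typescript
// Uncompressed RGBA: one byte each for red, green, blue, and alpha.
const BYTES_PER_PIXEL = 4;

// Approximate size of the decoded bitmap, independent of file size on disk.
function decodedBytes(widthPx: number, heightPx: number): number {
  return widthPx * heightPx * BYTES_PER_PIXEL;
}

// Decimal megabytes (1 MB = 1,000,000 bytes).
function toMegabytes(bytes: number): number {
  return bytes / 1e6;
}
```

For instance, a 6000 × 4000 pixel photo decodes to 96,000,000 bytes (96 MB), however small the JPEG is on disk, so pixel dimensions matter more than file size on low-memory devices.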
