Internal Tool

Document OCR Tool

Auto-capture & Apple Vision OCR. Instant Markdown & PDF generation.

Automatic screen capture with Apple Vision OCR for high-accuracy text extraction. Two modes — Web UI and headless CLI — running entirely locally on your machine.

Problem

Document Management Challenges

When you need to reuse text from digital documents, manual copy-paste is inefficient.

Inefficient Manual Copying

Manually copying text page by page is extremely time-consuming for large documents.

Lost Formatting

Copied text loses its original structure and formatting, requiring additional cleanup work.

No Searchability

Image-based documents can't be text-searched, making it hard to find specific information.

No Batch Processing

No way to convert multi-page documents to text at once — each page must be handled individually.

How It Works

5-Tab Web UI

A browser-based UI for intuitive operation from capture to export.

01

Auto — Screen Capture

Automatically capture screenshots from your document app, turning the page after each capture.

02

Upload — Image Upload

Drag & drop manually captured screenshots for upload.

03

OCR — Text Extraction

Batch-process all pages with Apple Vision OCR. High-accuracy Japanese text recognition.

04

Edit — Review & Edit

Review and edit extracted text in the browser. Make manual corrections as needed.

05

Export — Download

Export as Markdown or PDF. Download instantly.

Features

Key Features

Everything you need for document OCR.

Auto Capture + Page Turning

Automatically screenshot document app pages with auto page turning. Manual upload also supported.
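The capture-and-turn loop described above might look roughly like the sketch below. It is an illustration, not the tool's actual code: it assumes the macOS `screencapture` command for screenshots and an AppleScript key press (via `osascript`) for page turning, and the function names, file naming scheme, and delay are hypothetical.

```python
import subprocess
import time
from pathlib import Path

# macOS virtual key code for the right-arrow key (123 is the left arrow).
RIGHT_ARROW_KEY_CODE = 124


def capture_command(out_dir: Path, page: int) -> list[str]:
    """Build the screencapture call for one page (-x suppresses the shutter sound)."""
    return ["screencapture", "-x", str(out_dir / f"page_{page:04d}.png")]


def page_turn_command(key_code: int = RIGHT_ARROW_KEY_CODE) -> list[str]:
    """Build an osascript call that sends one key press to the frontmost app."""
    script = f'tell application "System Events" to key code {key_code}'
    return ["osascript", "-e", script]


def capture_pages(out_dir: Path, pages: int, delay: float = 1.0) -> None:
    """Screenshot the screen, turn the page, wait for the redraw, and repeat."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for page in range(1, pages + 1):
        subprocess.run(capture_command(out_dir, page), check=True)
        subprocess.run(page_turn_command(), check=True)
        time.sleep(delay)  # hypothetical delay; tune it to the document app
```

Calling `capture_pages(Path("shots"), 50)` would write `shots/page_0001.png` through `shots/page_0050.png`. Sending key presses through System Events typically requires the Accessibility permission in addition to Screen Recording.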

Apple Vision OCR

High-accuracy Japanese text recognition using macOS built-in Vision framework. Fully local, no external services required.

Auto Spread Page Splitting

Automatically splits spread pages into left/right halves for correct reading order. Handles landscape scans.
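As a rough illustration of spread splitting, the sketch below computes crop boxes for the two halves of a landscape image. It makes two assumptions that the page does not confirm: a spread is detected simply by the image being wider than it is tall, and the right-hand page is read first (common for vertically set Japanese books). The function name `split_spread` is hypothetical.

```python
def split_spread(
    width: int, height: int, right_to_left: bool = True
) -> list[tuple[int, int, int, int]]:
    """Return crop boxes (left, upper, right, lower) in reading order.

    Portrait images are treated as single pages and returned unchanged.
    Landscape images are assumed to be two-page spreads and cut at the middle.
    """
    if width <= height:
        return [(0, 0, width, height)]

    mid = width // 2
    left_page = (0, 0, mid, height)
    right_page = (mid, 0, width, height)
    # Assumption: right-to-left binding, so the right-hand page comes first.
    return [right_page, left_page] if right_to_left else [left_page, right_page]
```

The boxes use the same (left, upper, right, lower) convention as Pillow's `Image.crop`, so each one could be passed to it directly to produce the page images that are then sent to OCR.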

Markdown & PDF Export

Export OCR results as Markdown or PDF. Text-only MD export also available.
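A Markdown export step could be sketched as follows. The layout is an assumption: one heading per page, with an optional reference to the page image, and "text-only" interpreted here as leaving those image references out. The function name, heading format, and image file naming are hypothetical.

```python
from typing import Optional


def pages_to_markdown(
    title: str, pages: list[str], image_dir: Optional[str] = None
) -> str:
    """Join per-page OCR text into one Markdown document.

    When image_dir is None the output is text-only; otherwise each page
    section also links to its screenshot (hypothetical naming scheme).
    """
    parts = [f"# {title}"]
    for number, text in enumerate(pages, start=1):
        section = [f"## Page {number}"]
        if image_dir is not None:
            section.append(f"![Page {number}]({image_dir}/page_{number:04d}.png)")
        body = text.strip()
        if body:
            section.append(body)
        parts.append("\n\n".join(section))
    return "\n\n".join(parts) + "\n"
```

For example, `pages_to_markdown("Title", ["first page", "second page"])` yields a text-only document with a `## Page N` heading above each page's text.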

Headless CLI

Run from terminal with a single command. Perfect for automation scripts and batch processing.

Fully Local Operation

No data is ever sent externally. All processing runs entirely on your local machine — safe for confidential documents.

Tech Stack

Technology

FastAPI + uvicorn

High-performance API server

Apple Vision

macOS built-in high-accuracy OCR

Tailwind CSS

Dark mode modern UI

Local Only

No external data transfer

Setup

Getting Started

Up and running in minutes.

Launch Web UI

Run uvicorn server:app --port 8000 to start the server, then open http://localhost:8000 in your browser.

Launch Headless CLI

Run python kindle_cli.py --title "Title" --pages 50 for one-command execution from terminal.
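For readers curious how such a command line might be parsed, here is a minimal sketch using `argparse`. Only `--title` and `--pages` come from the command shown above; the `--format` option, its choices, and the help texts are hypothetical and may not match the real `kindle_cli.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags, plus one hypothetical option."""
    parser = argparse.ArgumentParser(
        prog="kindle_cli.py",
        description="Capture pages, run OCR, and export the result.",
    )
    parser.add_argument("--title", required=True, help="Title used for the exported file")
    parser.add_argument("--pages", type=int, required=True, help="Number of pages to capture")
    # Hypothetical option, not confirmed by the documentation above.
    parser.add_argument("--format", choices=["md", "pdf"], default="md", help="Export format")
    return parser
```

With this sketch, `build_parser().parse_args(["--title", "Title", "--pages", "50"])` returns a namespace where `pages` is the integer 50, so a wrapper script can loop over the page count without further conversion.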

First-Time Setup

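Run python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt to create a virtual environment and install the dependencies into it. Do this once before launching the Web UI or the CLI.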

Requirements

System Requirements

macOS (Apple Silicon / Intel)

Python 3.10+

Screen Recording permission (enable in System Settings under Privacy & Security)

