
ocrtool-mcp

MCP.Pizza Chef: ihugang

ocrtool-mcp is a lightweight, fast OCR server for macOS built with Swift and Apple's Vision framework. It provides accurate text recognition in both Chinese and English, returning line-wise OCR results with bounding boxes. Designed as an MCP-compatible server, it exposes a JSON-RPC interface over stdin, enabling seamless integration with LLM tools and custom agents for real-time OCR processing within AI workflows.

Use This MCP server To

  • Extract text from images in macOS applications
  • Integrate OCR capabilities into LLM-powered agents
  • Process scanned documents for searchable text
  • Enable real-time text recognition in AI workflows
  • Convert screenshots to editable text with bounding boxes
  • Support multilingual OCR for Chinese and English text
  • Automate data extraction from forms and receipts

README

ocrtool-mcp

🇨🇳 Chinese documentation

ocrtool-mcp is an open-source, macOS-native OCR module built with Swift and Apple's Vision framework. It implements the stdin-based Model Context Protocol (MCP) module interface, so LLM tools such as Cursor, Continue, OpenDevin, or custom agents can call it with JSON-RPC over stdin.



✨ Features

  • ✅ Accurate OCR powered by macOS Vision Framework
  • ✅ Recognizes both Chinese and English text
  • ✅ MCP-compatible JSON-RPC interface
  • ✅ Returns line-wise OCR results with bounding boxes (in pixels)
  • ✅ Lightweight, fast, and fully offline
  • ✅ Open source free software

🚀 Quick Start

git clone https://github.com/ihugang/ocrtool-mcp.git
cd ocrtool-mcp
swift build -c release

Run as MCP Module:

.build/release/ocrtool-mcp

Send a JSON-RPC request via stdin:

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "ocr_text",
  "params": {
    "image_path": "test.jpg",
    "lang": "zh+en",
    "enhanced": true
  }
}
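
As a rough illustration of how a custom agent might send the request above, here is a minimal Python sketch. It assumes the server reads one newline-terminated JSON request from stdin and writes one JSON reply line to stdout before exiting at end of input; the README does not spell out the framing, so treat that as an assumption. The helper names `build_ocr_request` and `call_ocr` are hypothetical and not part of ocrtool-mcp.

```python
import json
import subprocess


def build_ocr_request(image_path, lang="zh+en", enhanced=True, request_id="1"):
    """Build the JSON-RPC 2.0 request from the README as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "ocr_text",
        "params": {"image_path": image_path, "lang": lang, "enhanced": enhanced},
    }


def call_ocr(command, request, timeout=30):
    """Send one request on the server's stdin and decode one reply from stdout.

    Assumption (not confirmed by the README): requests and replies are
    single-line JSON documents separated by newlines.
    """
    completed = subprocess.run(
        command,
        input=json.dumps(request) + "\n",
        capture_output=True,
        encoding="utf-8",  # OCR text may contain Chinese characters
        timeout=timeout,
        check=True,        # raise if the server exits with a non-zero status
    )
    # Treat the first non-empty stdout line as the reply.
    for line in completed.stdout.splitlines():
        if line.strip():
            return json.loads(line)
    raise RuntimeError("server produced no output")
```

With the binary built as in the Quick Start, a call might look like `call_ocr([".build/release/ocrtool-mcp"], build_ocr_request("test.jpg"))`.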

Expected output:

{
  "jsonrpc": "2.0",
  "id": "1",
  "result": {
    "lines": [
      { "text": "你好", "bbox": { "x": 120, "y": 200, "width": 300, "height": 20 } },
      { "text": "Hello", "bbox": { "x": 122, "y": 240, "width": 290, "height": 20 } }
    ]
  }
}
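
A reply shaped like the one above could be unpacked on the client side roughly as follows. This is a sketch, not part of ocrtool-mcp: the `OcrLine` class and `parse_ocr_response` function are hypothetical names, and the README does not say where the coordinate origin lies, so the code only derives the far edges of each box without assuming which way the y axis points.

```python
import json
from dataclasses import dataclass


@dataclass
class OcrLine:
    """One recognized line of text with its bounding box in pixels."""
    text: str
    x: float
    y: float
    width: float
    height: float

    @property
    def x_max(self):
        # Far edge of the box along the x axis.
        return self.x + self.width

    @property
    def y_max(self):
        # Far edge of the box along the y axis (axis direction not assumed).
        return self.y + self.height


def parse_ocr_response(raw):
    """Turn a JSON-RPC reply string into a list of OcrLine objects."""
    reply = json.loads(raw)
    if "error" in reply:
        # JSON-RPC 2.0 reports failures in an "error" member instead of "result".
        raise RuntimeError(f"OCR request failed: {reply['error']}")
    lines = []
    for item in reply["result"]["lines"]:
        box = item["bbox"]
        lines.append(
            OcrLine(item["text"], box["x"], box["y"], box["width"], box["height"])
        )
    return lines
```

Each returned object exposes `text` together with `x`, `y`, `width`, and `height`, which is usually enough to crop or highlight the matching region of the source image.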

📁 Project Structure

.
├── Package.swift
├── Sources/OCRToolMCP/main.swift
├── .mcp/
│   ├── config.json
│   └── schema/ocr_text.json
├── README.md
├── LICENSE
└── .gitignore

📘 MCP Integration

You can use this module with:

  • Continue
  • Cursor
  • Any custom LLM agent that supports MCP stdin/stdout JSON-RPC

🛠 Cursor Configuration

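To use this module in Cursor, add the following to your Cursor MCP configuration file (the README calls it cursor.json; recent Cursor versions read ~/.cursor/mcp.json or a project-level .cursor/mcp.json), setting "command" to the full path of the built binary: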

{
  "mcpServers": {
    "ocrtool-mcp": {
      "command": "Full path ... /ocrtool-mcp"
    }
  }
}
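
If the configuration file already lists other servers, the entry above has to be merged in rather than pasted over the file. A minimal Python sketch of that merge follows; `add_mcp_server` is a hypothetical helper, and "/path/to/ocrtool-mcp" is a placeholder for wherever the binary was built, not a real location.

```python
import json


def add_mcp_server(config, name, command):
    """Return a copy of `config` with one entry added under "mcpServers"."""
    updated = dict(config)
    # Copy the existing server map so the caller's dict is left untouched.
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command}
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    # Both paths below are placeholders for illustration only.
    existing = {"mcpServers": {"other-tool": {"command": "/path/to/other-tool"}}}
    merged = add_mcp_server(existing, "ocrtool-mcp", "/path/to/ocrtool-mcp")
    print(json.dumps(merged, indent=2))
```

The merged dictionary can then be written back to the configuration file with `json.dump`.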

👨‍💻 Author

📝 License

MIT License

ocrtool-mcp FAQ

How does ocrtool-mcp integrate with LLM tools?
It uses a JSON-RPC interface over stdin, allowing LLM tools like Cursor and Continue to invoke OCR functions seamlessly.
What languages does ocrtool-mcp support?
It supports accurate OCR for both Chinese and English text using the macOS Vision framework.
Is ocrtool-mcp compatible with MCP?
Yes, it fully implements the MCP module protocol for easy integration with MCP clients and agents.
What platform does ocrtool-mcp run on?
It is a macOS-native server built with Swift, leveraging the macOS Vision framework.
Can ocrtool-mcp return detailed OCR results?
Yes, it returns line-wise OCR results including bounding boxes in pixels for precise text localization.
Is ocrtool-mcp lightweight and fast?
Yes, it is designed to be lightweight and fast for efficient OCR processing on macOS.
How can I invoke ocrtool-mcp from my application?
You can call it via JSON-RPC over stdin, making it easy to integrate with custom agents or LLM tools.
Does ocrtool-mcp support other languages beyond Chinese and English?
Currently, it focuses on Chinese and English OCR using the macOS Vision framework.