paperless-gpt: Yet another Paperless-ngx AI companion, with a focus on LLM-based OCR

Hey everyone,

I've noticed discussions in other threads about paperless-ai (which is awesome), and some folks asked how it differs from my project, paperless-gpt. Since I’m a newer user here, I’ll keep things concise:

Context

  1. paperless-ai leans toward doc-based AI chat, letting you converse with your documents.
  2. paperless-gpt focuses on LLM-based OCR (for more accurate scanning of messy or low-quality docs) and a robust pipeline for auto-generating titles/tags.

Why Another Project?

  • I didn't know about paperless-ai back in Sept. '24. True story :D
  • LLM-based OCR: I wanted a solution that does advanced text extraction from scans, harnessing Large Language Models (OpenAI or Ollama).
  • Tag & Title Workflows: My main passion is building flexible, automated naming and tagging pipelines for paperless-ngx.
  • No Chat (Yet): If you do want doc-based chatting, paperless-ai might be a better fit. Or you can run both—use paperless-gpt for scanning/tags, then pass that cleaned text into paperless-ai for Q&A.

Key Features

  • Multiple LLM Support (OpenAI or Ollama).
  • Customizable Prompts for specialized docs.
  • Auto Document Processing via a “paperless-gpt-auto” tag.
  • Vision LLM-based OCR (experimental) that can outperform standard OCR on tough inputs such as messy or low-quality scans.
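
To make the tag-driven flow a bit more concrete, here is a rough Python sketch of how a companion tool could pick up documents carrying the "paperless-gpt-auto" tag, ask an LLM for a title and tags, and then drop the trigger tag. Treat it as an illustration only: the Document shape, suggest_metadata, and process_tagged are hypothetical names, the LLM call is a stub, removing the trigger tag after processing is my assumption for the sketch, and none of this is paperless-gpt's actual code or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

AUTO_TAG = "paperless-gpt-auto"  # trigger tag named in the post


@dataclass
class Document:
    """Hypothetical, simplified stand-in for a paperless-ngx document."""
    id: int
    title: str
    content: str
    tags: Set[str] = field(default_factory=set)


def suggest_metadata(doc: Document) -> Tuple[str, Set[str]]:
    """Stub for the LLM call (assumption): returns a (title, tags) suggestion.

    A real tool would send doc.content to OpenAI or Ollama here. This stub
    just uses the first line of the text as the title.
    """
    text = doc.content.strip()
    title = text.splitlines()[0][:60] if text else doc.title
    return title, {"needs-review"}


def process_tagged(
    docs: List[Document],
    suggest: Callable[[Document], Tuple[str, Set[str]]] = suggest_metadata,
) -> List[int]:
    """Update every document that carries AUTO_TAG; return the ids touched."""
    processed = []
    for doc in docs:
        if AUTO_TAG not in doc.tags:
            continue  # untagged documents are left alone
        title, new_tags = suggest(doc)
        doc.title = title
        # Drop the trigger tag so the document is not picked up again.
        doc.tags = (doc.tags - {AUTO_TAG}) | new_tags
        processed.append(doc.id)
    return processed
```

Passing the suggestion function in as a parameter keeps the selection logic independent of the LLM backend, so swapping the stub for a real OpenAI or Ollama call would not change the loop.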

Combining With paperless-ai?

  • Totally possible. You could have paperless-gpt handle the scanning & metadata assignment, then feed those improved text results into paperless-ai for doc-based chat.
  • Some folks asked about overlap: we do share the “metadata extraction” idea, but the focus differs.

If You’re Curious

  • The project has a short README, Docker Compose snippet, and minimal environment vars.
  • I’m grateful to a few early sponsors who donated (thank you so much!). That support motivates me to keep adding features (like multi-language OCR support).

Anyway, just wanted to clarify the difference, since people were asking. If you’re looking for OCR specifically—especially for messy scans—paperless-gpt might fit the bill. If doc-based conversation is your need, paperless-ai is out there. Or combine them both!

Happy to answer any questions and hear any feedback you have. Thanks for reading!

Links (in case you want them):

Cheers!