PDF Translator

This repository offers a WebUI and an API endpoint that translate PDF files using Google Translate or OpenAI GPT while preserving the original layout.

Features

  • translate PDF files while preserving layout

  • translation engines:

    • Google Translate (default)
    • OpenAI (best)
  • layout recognition engines:

    • UniLM DiT
  • OCR engines:

    • PaddleOCR
  • Font recognition engines:

    • none yet

Installation

  1. Clone this repository
   git clone https://github.com/ppisljar/pdf_translator.git
   cd pdf_translator
  2. To use OpenAI, edit config.yaml: change type to 'openai' and enter your key under openai_api_key. If this is not changed, the translation engine defaults to Google Translate.
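
The fallback described in step 2 can be sketched in Python as follows. This is only an illustration and not the repository's code: `select_engine` is a hypothetical helper, the flat dictionary stands in for the parsed contents of config.yaml (the real file may nest these keys differently), and treating an empty key as "not changed" is an assumption.

```python
def select_engine(config: dict) -> str:
    """Return the engine name implied by a parsed config mapping.

    Hypothetical helper: 'type' and 'openai_api_key' are the key names
    mentioned in this README; the flat layout is an assumption.
    """
    engine = str(config.get("type", "")).strip().lower()
    api_key = str(config.get("openai_api_key") or "").strip()
    # Use OpenAI only when both the type and a non-empty key are set
    # (assumed behaviour); otherwise fall back to Google Translate.
    if engine == "openai" and api_key:
        return "openai"
    return "google"


if __name__ == "__main__":
    # "sk-example" is a placeholder, not a real key.
    print(select_engine({"type": "openai", "openai_api_key": "sk-example"}))
    print(select_engine({}))
```

With an untouched config the sketch returns "google", which matches the default stated above.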

Docker installation

  1. Build the docker image via Makefile
   make build
  2. Run the docker container via Makefile
   make run

venv installation

Prerequisites:

  • ffmpeg, and possibly more; check the Dockerfile if you run into issues

  1. Create a venv and activate it
   python3 -m venv .
   source bin/activate
  2. Install requirements
   pip3 install -r requirements.txt
   pip3 install "git+https://github.com/facebookresearch/detectron2.git"
  3. Get models
   make get_models
  4. Run
   python3 server.py
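
Before running the venv steps, it can help to confirm that the system tools they rely on are on PATH. The following is a minimal sketch using only the Python standard library; `missing_tools` is a hypothetical helper, and the list contains only ffmpeg because that is the one prerequisite named here (the Dockerfile may list more).

```python
import shutil


def missing_tools(names):
    """Return the subset of command names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Only ffmpeg is named in this README; extend the list as needed.
    missing = missing_tools(["ffmpeg"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```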

GUI Usage

Access the GUI in a browser:

http://localhost:8765
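
If the page does not load, a quick way to tell whether the server is listening is to probe the address from Python. This is a rough sketch using only the standard library; `is_server_up` is a hypothetical helper, and it checks the GUI root URL only, since the API routes are not documented here.

```python
import urllib.error
import urllib.request


def is_server_up(url="http://localhost:8765", timeout=2.0):
    """Return True if anything answers HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("server reachable:", is_server_up())
```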

Requirements

  • NVIDIA GPU (currently only NVIDIA GPUs are supported)
  • Docker

License

This repository does not allow commercial use.

This repository is licensed under CC BY-NC 4.0. See LICENSE for more information.

TODOs

  • Make it possible to highlight the translated text
  • Support M1 Mac or CPU
  • Switch to VGT for layout detection
  • Add font detection (family/style/color/size/alignment)
  • Add support for translating lists
  • Add support for translating tables
  • Add support for translating text within images

References