This repository offers a WebUI and an API endpoint that translate PDF files using OpenAI GPT while preserving the original layout.
- translate PDF files while preserving layout
- translation engines:
  - Google Translate (default)
  - OpenAI (best)
- layout recognition engines:
  - UniLM DiT
- OCR engines:
  - PaddleOCR
- font recognition engines:
  - / (none yet)
- Clone this repository
git clone https://github.com/ppisljar/pdf_translator.git
cd pdf_translator
- Edit config.yaml to use OpenAI: change type to 'openai' and enter your API key under openai_api_key. If this is not changed, the translation engine defaults to Google Translate.
- Build the docker image via Makefile
make build
- Run the docker container via Makefile
make run
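The config step above amounts to a simple fallback rule. Below is a minimal Python sketch of that rule, assuming config.yaml has already been parsed into a flat dict. `select_engine` is a hypothetical helper written for illustration, not a function from this repository, and the real config.yaml may nest `type` and `openai_api_key` differently.

```python
def select_engine(config: dict) -> str:
    """Return 'openai' only when the config opts in, otherwise 'google'."""
    # 'type' is the key name mentioned in the config step; treating it as a
    # top-level key of the parsed config is an assumption of this sketch.
    engine_type = str(config.get("type", "")).strip().lower()
    if engine_type == "openai":
        return "openai"
    # Any other value, or a missing key, falls back to Google Translate.
    return "google"
```

For example, `select_engine({"type": "openai", "openai_api_key": "..."})` picks OpenAI, while an untouched config falls back to Google Translate.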
- prerequisites: ffmpeg, ... possibly more; check the Dockerfile if you run into issues
- create a venv and activate it
python3 -m venv .
source bin/activate
- install requirements
pip3 install -r requirements.txt
pip3 install "git+https://github.com/facebookresearch/detectron2.git"
- get models
make get_models
- run
python3 server.py
Access the GUI in a browser:
http://localhost:8765
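If the page does not load, it can help to check whether anything is listening on that port at all. The following is a small sketch using only the Python standard library. `server_is_up` is a hypothetical helper, and it only tests that a TCP connection succeeds, not that the translator itself is healthy.

```python
import socket


def server_is_up(host: str = "localhost", port: int = 8765, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `server_is_up()` with no arguments checks the default address used above.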
- NVIDIA GPU (currently only NVIDIA GPUs are supported)
- Docker
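A quick way to see whether these requirements are likely met is to look for the usual command-line tools on PATH. This is only a rough sketch: `missing_tools` is a hypothetical helper, and using `nvidia-smi` as a proxy for a working NVIDIA driver is an assumption. It does not verify that Docker can actually access the GPU.

```python
import shutil


def missing_tools(tools=("docker", "nvidia-smi")) -> list:
    """Return the names from tools that are not found on PATH."""
    # shutil.which returns None when no executable with that name is found.
    return [name for name in tools if shutil.which(name) is None]
```

An empty list from `missing_tools()` means both executables were found.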
This repository does not allow commercial use.
This repository is licensed under CC BY-NC 4.0. See LICENSE for more information.
- Make it possible to highlight the translated text
- Support M1 Mac or CPU
- switch to VGT for layout detection
- add font detection (family/style/color/size/alignment)
- add support for translating lists
- add support for translating tables
- add support for translating text within images
- DiT is used for PDF layout analysis.
- A PaddlePaddle model is used for PDF-to-text conversion.