pdf-images-extractor is a very easy-to-use image extractor for PDF files.
It extracts all images from one or more PDF files and saves them as separate image files.
In the following example, we extract the images from two PDF files, test1.pdf and test2.pdf:
berthier@cogip:~$ ./pdfimex.py ~/Downloads/test1.pdf ~/Downloads/test2.pdf
The folder test1-images, which contains all the images extracted from test1.pdf, has been created in the same folder as test1.pdf:
berthier@cogip:~$ ll ~/Downloads/test1-images/
total 272
drwxrwxr-x 2 berthier berthier 4096 déc. 21 23:30 ./
drwxr-xr-x 4 berthier berthier 4096 déc. 21 23:30 ../
-rw-rw-r-- 1 berthier berthier 163255 déc. 21 23:30 image_1.jpg
-rw-rw-r-- 1 berthier berthier 105597 déc. 21 23:30 image_2.jpg
The same goes for test2.pdf:
berthier@cogip:~$ ll ~/Downloads/test2-images/
total 272
drwxrwxr-x 2 berthier berthier 4096 déc. 21 23:30 ./
drwxr-xr-x 4 berthier berthier 4096 déc. 21 23:30 ../
-rw-rw-r-- 1 berthier berthier 163255 déc. 21 23:30 image_1.jpg
-rw-rw-r-- 1 berthier berthier 105597 déc. 21 23:30 image_2.jpg
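The naming convention seen in the listings above (test1.pdf gives test1-images, created next to the PDF) can be sketched in a few lines of Python. This is only an illustration of the convention, not the code used by pdfimex.py; the function name images_dir_for is hypothetical.

```python
from pathlib import Path


def images_dir_for(pdf_path):
    """Return the output folder for a PDF: '<name>-images' next to the PDF.

    Hypothetical helper, written only to illustrate the layout shown above.
    """
    pdf = Path(pdf_path).expanduser()
    # Keep the parent folder, replace 'test1.pdf' with 'test1-images'.
    return pdf.with_name(pdf.stem + "-images")


if __name__ == "__main__":
    for name in ("~/Downloads/test1.pdf", "~/Downloads/test2.pdf"):
        print(images_dir_for(name))
```

Deriving the folder from the PDF's own path keeps the images of each input file separate, which is why image_1.jpg can appear in both test1-images and test2-images without a clash.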
The following dependencies must be satisfied:
pip install Pillow
pip install PyPDF2
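To check that both dependencies are available before running the script, a small sketch like the following may help. It assumes the usual import names, PIL for Pillow and PyPDF2 for PyPDF2; the function name missing_modules is hypothetical and is not part of pdfimex.py.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from 'names' that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: Pillow is imported as 'PIL', PyPDF2 as 'PyPDF2'.
    missing = missing_modules(["PIL", "PyPDF2"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```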