TypeError raised by extract_text method with compressed PDF file #886
Comments
Here’s a simple and uncompressed PDF to reproduce the problem, in case you’d like to avoid installing another tool 😄:
The error is caused by the XRef table with Instead of using
Fixed in #1029 (and thank you for WeasyPrint, it is very nice software!)
Bug report
Description
I'm generating PDF documents with WeasyPrint. Since version 59.0 of that package, I'm no longer able to extract text from the generated compressed PDF files with the pdfminer.highlevel.extract_text method. The method raises a TypeError (invalid length). The exception is raised from a utility function called nunpack.
I first opened an issue on the WeasyPrint repository, but it appears the source of the problem could be pdfminer itself.
You can take a look at the answer from the WeasyPrint maintainer to understand how pdfminer is involved in this problem.
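To illustrate the kind of failure described above, here is a minimal sketch of how a fixed-width integer unpacker can raise a TypeError with an "invalid length" message. It is not pdfminer's actual code: unpack_fixed and unpack_any are hypothetical names, the set of accepted widths is an assumption, and nothing here says which byte widths the generated PDF really uses. The second function is shown only for contrast, as one way to decode a big-endian integer of any width.

```python
import struct


def unpack_fixed(data: bytes, default: int = 0) -> int:
    """Hypothetical stand-in: decode a big-endian unsigned integer,
    accepting only a few fixed widths (assumed here: 0 to 4 bytes)."""
    length = len(data)
    if length == 0:
        return default
    if length == 1:
        return data[0]
    if length == 2:
        return struct.unpack(">H", data)[0]
    if length == 3:
        # Pad to 4 bytes so struct can read it as a 32-bit value.
        return struct.unpack(">L", b"\x00" + data)[0]
    if length == 4:
        return struct.unpack(">L", data)[0]
    # Any other width is rejected, which is the failure mode in question.
    raise TypeError("invalid length: %d" % length)


def unpack_any(data: bytes, default: int = 0) -> int:
    """Hypothetical alternative: decode a big-endian unsigned integer
    of any width using int.from_bytes."""
    if not data:
        return default
    return int.from_bytes(data, "big")
```

With these definitions, a five-byte field makes unpack_fixed raise TypeError, while unpack_any decodes it without complaint.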
Steps to reproduce