Replies: 2 comments
-
Hello, did you try specifying the positions of the vertical lines, as below, and then filtering by an identical value for each line? table=doc.pages[l].extract_table( where |
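For reference, here is a minimal sketch of what passing explicit vertical line positions can look like with pdfplumber's `table_settings`. It is not the snippet from the comment above, which is cut off. The helper names `explicit_vertical_settings` and `extract_with_vertical_lines`, and whatever x-coordinates you pass in, are placeholders for illustration.

```python
def explicit_vertical_settings(x_positions):
    """Build table settings that use caller-supplied vertical lines (hypothetical helper)."""
    return {
        "vertical_strategy": "explicit",
        # x-coordinates of the column boundaries, in page units, sorted left to right
        "explicit_vertical_lines": sorted(x_positions),
    }


def extract_with_vertical_lines(pdf_path, page_index, x_positions):
    """Extract one table from a page using the explicit vertical lines (hypothetical helper)."""
    import pdfplumber  # imported here so the settings helper works without it

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_index]
        return page.extract_table(
            table_settings=explicit_vertical_settings(x_positions)
        )
```

The x-positions usually have to be read off the page first, for example by inspecting `page.rects` or `page.lines`, or by drawing candidates on a page image and adjusting.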
-
Hi @YeHW, and thanks for your interest in this library. Unfortunately, it is very difficult to help with a PDF based only on a screenshot, since it describes very little of the actual underlying structure of the PDF. Are you able to share that, or a version redacted via https://github.com/JoshData/pdf-redactor?
There is not currently a way to incorporate rectangle color information in `.extract_text()`, but you could write custom code to examine the `non_stroking_color` attributes of all `page.rects` objects, and then use that information to inform your table-extraction strategy.
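A rough sketch of that custom code might look like the following. It assumes each entry in `page.rects` is a dict with `x0`, `x1`, and `non_stroking_color` keys, as pdfplumber provides. The helper names `group_rects_by_fill` and `vertical_edges`, the file name, and the example fill color are all made up for illustration.

```python
from collections import defaultdict


def group_rects_by_fill(rects):
    """Group rect dicts by their fill (non-stroking) color."""
    groups = defaultdict(list)
    for rect in rects:
        color = rect.get("non_stroking_color")
        # Colors can be lists, tuples, scalars, or None; make them hashable.
        key = tuple(color) if isinstance(color, (list, tuple)) else color
        groups[key].append(rect)
    return dict(groups)


def vertical_edges(rects):
    """Return sorted, de-duplicated x-coordinates of the rects' left and right edges."""
    return sorted({round(r[k], 1) for r in rects for k in ("x0", "x1")})


# Possible usage with pdfplumber (file name and fill color are hypothetical):
#
# import pdfplumber
# with pdfplumber.open("example.pdf") as pdf:
#     page = pdf.pages[0]
#     groups = group_rects_by_fill(page.rects)
#     shaded = groups.get((0.9, 0.9, 0.9), [])
#     table = page.extract_table({
#         "vertical_strategy": "explicit",
#         "explicit_vertical_lines": vertical_edges(shaded),
#     })
```

Printing the keys of `group_rects_by_fill(page.rects)` first is a quick way to see which fill colors actually occur on the page before deciding which group marks the table cells.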