Possible bug: two letters in 'text' field of 'char' #361
Answered
by
jsvine
LivingDeadCloud
asked this question in
Q&A
Answered by
jsvine
Feb 25, 2021
Hi @LivingDeadCloud, I think you've come across a ligature, a typographic convention in which two letters are combined into a single symbol: https://en.wikipedia.org/wiki/Orthographic_ligature

So the char you see is, indeed, one symbol, but it represents two letters. I don't know exactly where the conversion happens, but it's "before" pdfplumber. My best guess, without doing further research, is that it's "before" pdfminer.six as well, and is instead part of the font's definition.
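If it helps to see which chars are affected, here is a rough sketch (not anything built into pdfplumber) that flags char entries whose text stands for more than one letter. It assumes each char is a dict with a `"text"` key, as in pdfplumber's `page.chars`; the helper names `expand_ligature` and `multi_letter_chars` and the sample data are made up for illustration. It also covers the case where the text comes through as a single Unicode ligature code point such as U+FB01, which NFKC normalization expands into plain letters.

```python
import unicodedata


def expand_ligature(text):
    """Expand Unicode compatibility ligatures (e.g. U+FB01) into plain letters.

    NFKC also rewrites other compatibility characters, so treat this as a
    heuristic rather than a ligature-only transformation.
    """
    return unicodedata.normalize("NFKC", text)


def multi_letter_chars(chars):
    """Return the char dicts whose text expands to more than one character."""
    return [c for c in chars if len(expand_ligature(c["text"])) > 1]


# Hypothetical sample shaped like pdfplumber char dicts (only a few keys shown).
sample_chars = [
    {"text": "o", "x0": 10.0, "x1": 15.0},
    {"text": "fi", "x0": 15.0, "x1": 21.0},      # one glyph mapped to two letters
    {"text": "\ufb01", "x0": 21.0, "x1": 27.0},  # single ligature code point
]

flagged = multi_letter_chars(sample_chars)
print([c["text"] for c in flagged])
```

With a real document you could pass `page.chars` in place of `sample_chars`. This only detects the ligatures; it doesn't split the glyph's bounding box into per-letter positions.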