this would more properly be called the Phonologizer, and it could borrow heavily from spaCy's Morphologizer. for reference, see Wikipedia on "distinctive features".
ultimately this could just be another function of the Phonemizer: when the output of the model is just a vector, it's up to the component how to translate that information into phonological data. we could have a new component type that sets phonological properties on tokens, or we could just make this a method available on the Token itself, so that the downstream consumer can request either the phonological features or the phonemes themselves from the same source data.
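a minimal sketch of the "same source data, two views" idea, in plain Python with no spaCy dependency. everything here is an assumption made for illustration: the feature order in `FEATURES`, the toy `INVENTORY`, the 0.5 threshold, and the `SegmentPrediction` class are all hypothetical and not part of any existing API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical order of features in the model's output vector (assumption).
FEATURES = ("syllabic", "voiced", "nasal")

# Toy inventory mapping feature bundles to phoneme symbols (assumption).
INVENTORY = {
    (False, False, False): "p",
    (False, True, False): "b",
    (False, True, True): "m",
    (True, True, False): "a",
}


@dataclass(frozen=True)
class SegmentPrediction:
    """One model output vector, exposed as features or as a phoneme."""

    scores: Tuple[float, ...]  # one score in [0, 1] per entry in FEATURES

    @property
    def features(self) -> Dict[str, bool]:
        # Threshold each score to get a binary distinctive feature.
        return {name: score >= 0.5 for name, score in zip(FEATURES, self.scores)}

    @property
    def phoneme(self) -> Optional[str]:
        # Look up the thresholded bundle; None if it is not in the inventory.
        key = tuple(self.features[name] for name in FEATURES)
        return INVENTORY.get(key)


seg = SegmentPrediction(scores=(0.1, 0.9, 0.8))
print(seg.features)  # feature view
print(seg.phoneme)   # phoneme view, derived from the same scores
```

in a real pipeline the two properties could instead be exposed on the Token, but the decoding step would look much the same: one vector in, two views out.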
this becomes synonymous with the existing phonemizer as part of #24; we should rename it Phonologizer accordingly.
also, with #22 we should make both components respect the overwrite/extend config options (as spaCy builtins do) so that they can work in concert: the rule-based component runs first, then the statistical one runs and fills in the gaps (e.g. polyphones).
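a rough sketch of how the two flags could interact when the statistical component runs after the rule-based one. the semantics are modeled on my understanding of how spaCy's Morphologizer treats `overwrite` and `extend`, and should be read as an assumption; `merge_features` and the feature names are hypothetical.

```python
from typing import Dict


def merge_features(
    existing: Dict[str, bool],
    predicted: Dict[str, bool],
    overwrite: bool = False,
    extend: bool = False,
) -> Dict[str, bool]:
    """Combine annotations already on a token with new predictions."""
    if not existing:
        # Nothing set yet: always take the prediction.
        return dict(predicted)
    if overwrite and extend:
        # Keep existing feature types, but predictions win on conflicts.
        return {**existing, **predicted}
    if extend:
        # Add new feature types only; existing values are preserved.
        return {**predicted, **existing}
    if overwrite:
        # Replace the existing annotation wholesale.
        return dict(predicted)
    # Neither flag set: leave the existing annotation untouched.
    return dict(existing)


rule_based = {"voiced": True}                     # set by the rule-based pass
statistical = {"voiced": False, "nasal": False}   # predicted afterwards

filled = merge_features(rule_based, statistical, extend=True)
print(filled)
```

with `extend=True` and `overwrite=False`, the statistical pass only fills in what the rule-based pass left unset, which is the "fills in the gaps" behavior described above.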
thatbudakguy changed the title from "create pipeline component for tagging phonological features" to "support tagging phonological features" on Apr 11, 2022