support tagging phonological features #7

thatbudakguy · 2022-02-26T19:18:37Z

this component would more properly be called the Phonologizer, and it could borrow heavily from spaCy's Morphologizer. for reference, see the Wikipedia article on "distinctive features".


thatbudakguy · 2022-04-03T20:55:28Z

ultimately this could just be another function of the Phonemizer: when the output of the model is just a vector, it's up to the component how to translate that vector into phonological data. we could add a new component type that sets phonological properties on tokens, or we could expose this as a method on the Token itself, so that a downstream consumer can request either the phonological features or the phonemes themselves from the same source data.
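a minimal sketch of the "same source data, two views" idea, in plain Python rather than against the spaCy API. everything here is hypothetical: the `TokenPhonology` class, the three-feature inventory, the 0.5 threshold, and the tiny phoneme table are illustrative assumptions, not the project's actual design.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical feature inventory; a real one would follow a full
# distinctive-feature chart.
FEATURES = ("syllabic", "voiced", "nasal")

# Hypothetical lookup from a feature bundle to a phoneme symbol.
PHONEMES: Dict[Tuple[bool, ...], str] = {
    (True, True, False): "a",
    (False, True, True): "m",
    (False, False, False): "t",
}


@dataclass
class TokenPhonology:
    """Holds the raw model output for one token (one score per feature)."""

    scores: Tuple[float, ...]

    def features(self) -> Dict[str, bool]:
        # View 1: threshold each score into a binary distinctive feature.
        return {name: score >= 0.5 for name, score in zip(FEATURES, self.scores)}

    def phoneme(self) -> Optional[str]:
        # View 2: map the same feature bundle to a phoneme, if one is known.
        feats = self.features()
        return PHONEMES.get(tuple(feats[name] for name in FEATURES))


tp = TokenPhonology(scores=(0.1, 0.9, 0.8))
print(tp.features())
print(tp.phoneme())
```

both views are derived from the one stored vector, so whether this lives on a separate component or as a method on the token is mostly a question of where the decoding code sits.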

thatbudakguy · 2022-04-11T23:09:13Z

this becomes synonymous with the existing phonemizer as part of #24; we should rename it Phonologizer accordingly.

also, with #22 we should make both components respect the overwrite/extend config options (as spaCy built-ins do) so that they can work in concert: the rule-based component runs first, then the statistical one runs and fills in the remaining gaps (e.g. polyphones).
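a rough sketch of how the two passes could combine, again in plain Python. the `merge` helper and its exact semantics are an assumption loosely modeled on how I understand the overwrite/extend options to behave; the readings below are toy data, and the "statistical" predictions are hard-coded stand-ins.

```python
from typing import Dict, List

Annotation = Dict[str, str]


def merge(existing: Annotation, predicted: Annotation,
          overwrite: bool, extend: bool) -> Annotation:
    """Hypothetical helper: combine an existing annotation with a new one."""
    if not existing:
        # Nothing set yet, so the new prediction always fills the gap.
        return dict(predicted)
    if extend:
        # Keep existing keys and add new ones; overwrite decides conflicts.
        merged = dict(existing)
        for key, value in predicted.items():
            if overwrite or key not in merged:
                merged[key] = value
        return merged
    # Without extend, the annotation is replaced or kept as a whole.
    return dict(predicted) if overwrite else dict(existing)


# Rule-based pass: the unambiguous character gets a reading, while the
# polyphone is left empty for the statistical pass.
rule_based: List[Annotation] = [{"reading": "yín"}, {}]

# Stand-in for the statistical pass, which predicts for every token.
statistical: List[Annotation] = [{"reading": "yǐn"}, {"reading": "háng"}]

combined = [
    merge(rule, stat, overwrite=False, extend=False)
    for rule, stat in zip(rule_based, statistical)
]
print(combined)
```

with `overwrite=False` the rule-based reading wins wherever one exists, and the statistical prediction is used only for the token the rules left blank.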

thatbudakguy added the enhancement New feature or request label Feb 26, 2022

thatbudakguy changed the title ~~create pipeline component for tagging phonological features~~ support tagging phonological features Apr 11, 2022

thatbudakguy mentioned this issue Apr 11, 2022

add phonology module #24

Open

4 tasks
