Serialization format/API #1

Open
porpoiseless opened this issue May 17, 2019 · 1 comment

Comments

@porpoiseless

To encourage development of related tooling, we should establish a standard way of transmitting and inputting Ithkuil/TNIL words, e.g. to a web frontend. In particular, we should provide canonical solutions to the following problems:

  • Some field names (e.g. "Function") are reserved words in some languages. Should we use "fn" instead?
  • Should fields like "Stem" be addressed as 0-indexed or 1-indexed arrays? Should the new "Stem 0" be at index 0 or at index 3?
  • Some fields, like the one containing the lexical consonant cluster, lack an official singular name. Do we call this field "root" or "cr"?
@HactarCE

HactarCE commented Sep 1, 2019

  • Some field names (e.g. "Function") are reserved words in some languages. Should we use "fn" instead?

This can be handled on a language-by-language basis. While function may be reserved in one language, fn is the function definition keyword in some Lisps. I know that the standard way to handle conflicts with builtins in Python is to append an underscore: class --> class_. In any case, any reasonable language should have no trouble with reserved words inside strings, which is how field names would most likely be represented. My Clojure program uses keywords, which are idiomatic and can handle alphanumerics and most special characters.
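
As a rough illustration of the Python convention mentioned above (a sketch only, not code from anyone's project), a small helper could append the underscore only when a name actually collides with a reserved word. The helper name `to_identifier` is hypothetical, and `class` appears only because it is the example used above; it is not one of the proposed field names.

```python
import keyword


def to_identifier(field_name: str) -> str:
    """Map a serialized field name to a name usable as a Python identifier.

    Hypothetical helper: appends an underscore when the name is reserved.
    """
    if keyword.iskeyword(field_name):
        return field_name + "_"
    return field_name


# In the interchange data itself the names are ordinary strings, so nothing
# needs renaming there; only identifiers in source code are affected.
field_names = ["function", "stem", "class"]
identifiers = [to_identifier(name) for name in field_names]
# identifiers == ["function", "stem", "class_"]
```

Since `function` is not reserved in Python, it passes through unchanged, which is the language-by-language point: each consumer renames only what it has to.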

  • Should fields like "Stem" be addressed as 0-indexed or 1-indexed arrays? Should the new "Stem 0" be at index 0 or at index 3?

I don't think "stem 0" or "degree 0" ever has its own definition like the other stems/degrees, so this shouldn't be an issue for dictionaries. For word interchange, I think a human-oriented approach should be used: if a human would say it's "degree 2", then the number 2 would be used. If a programming language uses zero-indexed arrays, then the program would have to adjust the index when fetching the definition from a one-indexed lexicon.
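
A minimal sketch of that adjustment, assuming a lexicon entry stores its definitions in an ordinary zero-indexed Python list; the function name `definition_for` and the placeholder strings are made up for illustration.

```python
def definition_for(definitions: list, human_number: int) -> str:
    """Look up a definition by its human-facing (one-based) number.

    The interchange format carries the number a person would say
    ("stem 2", "degree 2"); the shift to a zero-based index happens
    only here, at the point of lookup.
    """
    if not 1 <= human_number <= len(definitions):
        raise ValueError(f"no definition numbered {human_number}")
    return definitions[human_number - 1]


# Placeholder entry: three definitions, for stems 1 through 3.
entry = ["first definition", "second definition", "third definition"]
second = definition_for(entry, 2)  # "second definition"
```

Rejecting 0 here matches the assumption above that "stem 0" has no definition of its own; a tool that did need one would have to handle it separately.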

  • Some fields, like the one containing the lexical consonant cluster, lack an official singular name. Do we call this field "root" or "cr"?

Here's my proposal for all the grammatical category names:

  • affiliation
  • aspect
  • bias
  • case
  • case_scope
  • configuration
  • context
  • degree
  • designation
  • effect
  • essence
  • format (incorp_case?)
  • function
  • illocution
  • incorp_root
  • incorp_specification
  • incorp_stem
  • level
  • mood
  • perspective
  • phase
  • referent
  • register
  • root
  • sanction
  • specification
  • stem
  • type
  • valence
  • version

(I may have forgotten some.)
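
If tools agreed on these names, checking an incoming word against them could look roughly like the sketch below. It assumes a word is exchanged as a mapping from field name to value, which has not been decided in this thread; `PROPOSED_FIELDS` and `unknown_fields` are hypothetical names, and `format` is used rather than the `incorp_case` alternative.

```python
# The category names proposed above, as plain strings.
PROPOSED_FIELDS = frozenset({
    "affiliation", "aspect", "bias", "case", "case_scope", "configuration",
    "context", "degree", "designation", "effect", "essence", "format",
    "function", "illocution", "incorp_root", "incorp_specification",
    "incorp_stem", "level", "mood", "perspective", "phase", "referent",
    "register", "root", "sanction", "specification", "stem", "type",
    "valence", "version",
})


def unknown_fields(word: dict) -> list:
    """Return, in sorted order, the keys of `word` that are not proposed names."""
    return sorted(set(word) - PROPOSED_FIELDS)


# Placeholder values; "cr" is the alternative spelling asked about in the issue.
word = {"root": "...", "stem": 2, "cr": "..."}
rejected = unknown_fields(word)  # ["cr"]
```

A check like this would flag `cr` if `root` became the agreed name, which is one way tools could catch naming drift early.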

Names for word types: formative, simple_formative, complex_formative, adjunct_1, adjunct_2, adjunct_3, csvx_adjunct, carrier_adjunct, register_adjunct, pronoun_adjunct (maybe personal_reference_adjunct instead -- but let's face it, they're basically just pronouns), case_stacking_pronoun_adjunct, and parsing_adjunct.

Any other ideas?
