It's becoming increasingly apparent that the Pelias API will have to call both libpostal and placeholder to come up with the correct answer. For example, libpostal parsed "Fort Hood, TX" as:
There's no reasonable, non-hacky way to correct for this, so the idea is to call both placeholder and libpostal for each input, then figure out an answer from both responses.
For example, for the input "30 W 26th St, New York, NY", placeholder throws away the "30 W 26th St". For the above strategy to work, the API would need to know which tokens are unknown. In the case of "Fort Hood, TX", if the API knew that placeholder had no leftover tokens, it could be reasonably assured that the input was only admin data and could disregard the libpostal response (which is incorrect in this case).
the tokenize endpoint returns the token 'groups', so it would be a simple matter of working from left to right through this array to find the tokens which didn't match:
[["street","neutral bay","north sydney","new south wales","au"]]
it might actually be better to do this on the placeholder end as these tokens have been normalized and so may not match verbatim the input tokens.
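As a rough illustration of the left-to-right pass described above, here is a minimal sketch. It assumes a token group is a flat list of matched strings like the one shown, and that lowercasing plus stripping punctuation is close enough to placeholder's normalization, which it may well not be. The function names and the sample token groups are hypothetical, not placeholder's actual code or output.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and split into words.

    This is only an approximation; placeholder's real normalization may differ.
    """
    return re.sub(r"[^\w\s]", " ", text.lower()).split()


def find_unmatched_tokens(input_text, token_group):
    """Walk the input left to right and collect words not covered by the group."""
    words = normalize(input_text)
    unmatched = []
    pos = 0
    for token in token_group:
        token_words = normalize(token)
        # Find the next position at or after `pos` where this token occurs.
        start = next(
            (
                i
                for i in range(pos, len(words) - len(token_words) + 1)
                if words[i:i + len(token_words)] == token_words
            ),
            None,
        )
        if start is None:
            # The token does not appear verbatim in the input (for example
            # because it was normalized differently), so skip it.
            continue
        # Everything between the previous match and this one is unknown.
        unmatched.extend(words[pos:start])
        pos = start + len(token_words)
    # Any words after the last match are unknown as well.
    unmatched.extend(words[pos:])
    return unmatched


# Hypothetical token groups, made up for illustration only.
street_case = find_unmatched_tokens("30 W 26th St, New York, NY", ["new york", "ny"])
admin_case = find_unmatched_tokens("Fort Hood, TX", ["fort hood", "tx"])
```

Under these assumptions, `street_case` holds the normalized street words and `admin_case` is empty. A token that does not match the input verbatim is simply skipped, which is the weakness raised above and a reason to compute the leftovers on the placeholder side instead.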
would you like the tokens returned as normalized values or verbatim as they were input by the user? what about punctuation such as commas, periods etc?
At this point, I'm not terribly concerned with the format of the unknown tokens, just whether there were any. An array of them would be nice in case the API wants to know what they are, but for the initial steps, the API would just make decisions based on whether there were any or not.
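The decision described here could be sketched roughly as follows. This assumes placeholder reports its unknown tokens as a plain array; the function name, the argument names, and the opaque result objects are all hypothetical. What the API should do when tokens are left over is not settled in this thread, so falling back to libpostal in that branch is purely an assumption.

```python
def choose_result(placeholder_result, libpostal_result, unknown_tokens):
    """Pick which parser's result to trust, based only on leftover tokens."""
    if not unknown_tokens:
        # Placeholder accounted for every token, so the input is treated as
        # admin-only and the libpostal result is disregarded.
        return placeholder_result
    # Assumption: with leftover tokens (such as a street address), defer to
    # libpostal. A real implementation would likely combine both responses.
    return libpostal_result


# Opaque stand-ins for the two responses; their real shapes are not shown here.
placeholder_stub = {"source": "placeholder"}
libpostal_stub = {"source": "libpostal"}

admin_only = choose_result(placeholder_stub, libpostal_stub, [])
with_street = choose_result(placeholder_stub, libpostal_stub, ["30", "w", "26th", "st"])
```

Only the emptiness of the array matters in this sketch, which matches the point that the exact format of the unknown tokens is not important for the first steps.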