This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Update regex if Huggingface fixes a presumed bug in the OpenAI implementation #2317

Closed
schmmd opened this issue Jan 9, 2019 · 4 comments

@schmmd
Member

schmmd commented Jan 9, 2019

#2310 changed our regular expressions to match Huggingface's, but their regular expressions may have a bug (see huggingface/pytorch-openai-transformer-lm#48). If that issue is resolved, we should update

text = re.sub(r'''(-+|~+|!+|"+|;+|\?+|\++|,+|\)+|\(+|\\+|\/+|\*+|\[+|\]+|}+|{+|\|+|_+)''', r' \1 ', text)
with their fix.
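
For reference, here is a minimal sketch of what the current substitution does: it surrounds each run of the listed punctuation characters with spaces. The pattern is copied from the line above; the function name `pad_punctuation` and the sample inputs are hypothetical, and the sketch does not try to reproduce the presumed bug discussed in the linked issue.

```python
import re

# Pattern copied verbatim from the line quoted in this issue.
# Each alternative matches a run of one repeated punctuation character.
PUNCT_RUNS = re.compile(
    r'''(-+|~+|!+|"+|;+|\?+|\++|,+|\)+|\(+|\\+|\/+|\*+|\[+|\]+|}+|{+|\|+|_+)'''
)


def pad_punctuation(text: str) -> str:
    """Hypothetical helper: put a space on each side of every matched run."""
    return PUNCT_RUNS.sub(r' \1 ', text)


if __name__ == "__main__":
    # repr() makes the trailing space visible.
    print(repr(pad_punctuation("hello,world!!")))  # 'hello , world !! '
```

Any fix on the Huggingface side would presumably change the pattern itself, so keeping it in one named constant would make the update a one-line change.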

@schmmd schmmd self-assigned this Jan 9, 2019
@matt-gardner
Contributor

@schmmd, ping.

@schmmd
Member Author

schmmd commented Jun 14, 2019

Cool! Although I likely won't get to this before my vacation.

@schmmd
Member Author

schmmd commented Jul 8, 2019

Huggingface hasn't made any changes (my issue on their repo is still open), so there's nothing to do here presently.

@matt-gardner
Contributor

We're going to be removing that code entirely, in favor of PretrainedTransformer stuff. Closing this issue.
