I've just found that `classifier_dropout` is fixed to 0.1 in both `LlamaForTokenClassification` and `MistralForTokenClassification`:

```python
self.dropout = nn.Dropout(0.1)
```
However, in `GPT2ForTokenClassification` of HuggingFace Transformers it is configurable:

```python
if hasattr(config, "classifier_dropout") and config.classifier_dropout is not None:
    classifier_dropout = config.classifier_dropout
elif hasattr(config, "hidden_dropout") and config.hidden_dropout is not None:
    classifier_dropout = config.hidden_dropout
else:
    classifier_dropout = 0.1
self.dropout = nn.Dropout(classifier_dropout)
```
Now we are trying to include `LlamaForTokenClassification` and `MistralForTokenClassification` in HuggingFace Transformers at huggingface/transformers#29878. Please show us a better way to include them.
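A minimal sketch of what we have in mind, reusing the GPT-2 lookup order (`classifier_dropout`, then `hidden_dropout`, then 0.1). This is only an illustration of the idea, not the implementation proposed in the PR, and the head/loss details are assumptions:

```python
import torch.nn as nn
from transformers import LlamaModel, LlamaPreTrainedModel


class LlamaForTokenClassification(LlamaPreTrainedModel):
    """Sketch: token-classification head on top of LlamaModel with a
    configurable classifier dropout, following the GPT-2 pattern."""

    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels
        self.model = LlamaModel(config)

        # Resolve the dropout rate the same way GPT2ForTokenClassification does:
        # prefer classifier_dropout, fall back to hidden_dropout, else 0.1.
        if getattr(config, "classifier_dropout", None) is not None:
            classifier_dropout = config.classifier_dropout
        elif getattr(config, "hidden_dropout", None) is not None:
            classifier_dropout = config.hidden_dropout
        else:
            classifier_dropout = 0.1
        self.dropout = nn.Dropout(classifier_dropout)
        self.score = nn.Linear(config.hidden_size, config.num_labels)

        self.post_init()

    def forward(self, input_ids=None, attention_mask=None, labels=None, **kwargs):
        outputs = self.model(input_ids, attention_mask=attention_mask, **kwargs)
        sequence_output = self.dropout(outputs[0])
        logits = self.score(sequence_output)

        loss = None
        if labels is not None:
            # Plain token-level cross-entropy over all positions (simplified).
            loss = nn.CrossEntropyLoss()(
                logits.view(-1, self.num_labels), labels.view(-1)
            )
        return (loss, logits) if loss is not None else (logits,)
```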
@KoichiYasuoka Thank you for your suggestions! I will add a feature to support `classifier_dropout`.
BTW, BiLLM's implementation of token classification differs from the official one: in BiLLM, we convert the attention mask from uni-directional to bi-directional. According to our experiments in the paper (https://arxiv.org/abs/2310.01208), this change can significantly improve token-classification performance.
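As a rough illustration of the uni- vs. bi-directional difference (not BiLLM's actual code; the mask helpers and tensor shapes below are assumptions), the change at the attention level amounts to replacing the causal additive mask with an all-zeros mask:

```python
import torch
import torch.nn.functional as F


def causal_mask(seq_len, dtype=torch.float32):
    # Upper triangle set to -inf: position i may not attend to j > i.
    mask = torch.full((seq_len, seq_len), float("-inf"), dtype=dtype)
    return torch.triu(mask, diagonal=1)


def bidirectional_mask(seq_len, dtype=torch.float32):
    # All zeros: every position may attend to every other position.
    return torch.zeros((seq_len, seq_len), dtype=dtype)


# (batch, heads, seq_len, head_dim) -- arbitrary sizes for the demo.
q = k = v = torch.randn(1, 8, 16, 64)

# Uni-directional (causal) attention, as in a standard decoder-only LM.
uni = F.scaled_dot_product_attention(q, k, v, attn_mask=causal_mask(16))

# Bi-directional attention: every token sees the full sequence.
bi = F.scaled_dot_product_attention(q, k, v, attn_mask=bidirectional_mask(16))
```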