Support for Longformer (new 2023 request) (#47)
* Add files via upload

* Update __init__.py

Added LongformerWithTabular

* Update tabular_modeling_auto.py

Added support for Longformer

* Update tabular_transformers.py

Added support for Longformer

* Update setup.py

Changed URL to point to this package.

* Update tabular_transformers.py

Changed add_start_docstring

* Update tabular_transformers.py

Removed the @

* Update tabular_transformers.py

Changed @add_start_docstrings_to_model_forward

* Update tabular_transformers.py

Removed the @ from the import: from transformers.file_utils import add_start_docstrings_to_model_forward

* Update tabular_transformers.py

Moved this import to load last: from transformers.file_utils import add_start_docstrings_to_model_forward

* Update tabular_transformers.py

Added the @ back to the import: from transformers.file_utils import @add_start_docstrings_to_model_forward

* Update tabular_transformers.py

Removed the @ from the import again: from transformers.file_utils import add_start_docstrings_to_model_forward

* Update tabular_transformers.py

Uncommented XLMRobertaConfig and moved add_start_docstrings_to_model_forward to the bottom of the import group

* Update tabular_transformers.py

Commented out self.embedding_layer = nn.Embedding.from_pretrained(torch.from_numpy(embedding_weights).float(), freeze=True)

* Update tabular_transformers.py

Uncommented embeddings section

* Update tabular_transformers.py

Removed text after .format in @add_start_docstrings_to_model_forward(LONGFORMER_INPUTS_DOCSTRING.format("(batch_size, sequence_length)")

* Update tabular_transformers.py

* Update tabular_transformers.py

Copied over the Longformer section from sidharrth2002 and changed it to use @add_start_docstrings_to_model_forward

* Update tabular_transformers.py

Commented out #self.embedding_layer = nn.Embedding.from_pretrained(torch.from_numpy(embedding_weights).float(), freeze=True)
        #self.embedding_layer = nn.Embedding()

* Update tabular_transformers.py

Added import torch

* Update tabular_modeling_auto.py

Moved Longformer to the top of the lists

* Update tabular_transformers.py

hf_model_config.summary_proj_to_labels=False #Added from XLM example

* Update tabular_transformers.py

Updated the Longformer class to match XLM

* Update tabular_transformers.py

Removed XLM edits

* Update tabular_transformers.py

Commented out #self.dropout = nn.Dropout(hf_model_config.hidden_dropout_prob)

* Update layer_utils.py

Changed loss_fct based on this suggestion - https://discuss.huggingface.co/t/data-format-for-bertforsequenceclassification-with-num-labels-2/4156

* Update layer_utils.py

Changed back to original value.

* Update tabular_combiner.py

Changed to torch.matmul based on https://stackoverflow.com/questions/67957655/runtimeerror-self-must-be-a-matrix

* Update tabular_combiner.py

Trying torch.mul instead of torch.matmul

* Update tabular_combiner.py

Reverted to torch.mm

* Update tabular_combiner.py

Changed line 447 back to torch.mul

* Update tabular_combiner.py

Changed back to torch.mm on line 449 and added shape print statements.

* Update tabular_combiner.py

Removed the shape print statements

* Update tabular_combiner.py

Changed dim to -1 in combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=-1)

* Update tabular_combiner.py

Changed from torch.cat to torch.stack in combined_feats = torch.stack((text_feats, cat_feats, numerical_feats), dim=1)

* Update tabular_combiner.py

Changed back to original code for combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=1)

* Update tabular_transformers.py

Changed forward to input_ids=torch.LongTensor(batch_size, sequence_length)

* Update tabular_transformers.py

Updated forward to input_ids(torch.LongTensor(batch_size, sequence_length))

* Update tabular_transformers.py

Added comma to input_ids(torch.LongTensor(batch_size, sequence_length)),

* Update tabular_transformers.py

Changed back to input_ids=None,

* Add files via upload

* Update tabular_transformers.py

Load embeddings

* Update tabular_combiner.py

combined_feats = torch.stack((text_feats, cat_feats, numerical_feats), dim=1)

* Update tabular_transformers.py

Removed load embeddings

* Update tabular_combiner.py

combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=1)

* Update tabular_combiner.py

Testing with only text: combined_feats = torch.cat((text_feats), dim=1)

* Update tabular_combiner.py

Changed back to combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=1)

* Update tabular_combiner.py

combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=0)

* Update tabular_combiner.py

combined_feats = torch.stack((text_feats, cat_feats, numerical_feats), dim=0)

* Update tabular_transformers.py

Updated the Longformer class to match Roberta

* Update tabular_combiner.py

Changed to combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=1); see the shape sketch after this list.

* Add files via upload

* Delete Longformer_text_w_tabular_classification_042623.ipynb

* Delete text_w_tabular_classification.ipynb

* Add files via upload

* Delete Longformer_text_w_tabular_classification_050423.ipynb

* Update layer_utils.py

Removed commented line and 3 blank lines

* Update tabular_combiner.py

Removed commented lines from testing

* Add files via upload

Added the original notebook to this folder.

* Rename Longformer_text_w_tabular_classification_051823.ipynb to longformer_text_w_tabular_classification.ipynb

Changed file name

* Update setup.py

Removed changes in __version__
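For context on the dim experiments recorded above for tabular_combiner.py, the following is a minimal sketch of how torch.cat and torch.stack behave on 2-D feature tensors. The batch size and feature dimensions are illustrative assumptions, not values taken from this repository.

    import torch

    # Illustrative shapes only (assumed): batch of 8, 768-dim text features,
    # 9 categorical features, 5 numerical features.
    text_feats = torch.randn(8, 768)
    cat_feats = torch.randn(8, 9)
    numerical_feats = torch.randn(8, 5)

    # torch.cat along dim=1 joins the per-example feature vectors side by side:
    # (8, 768) + (8, 9) + (8, 5) -> (8, 782), which is the combined width the
    # downstream classifier expects.
    combined_feats = torch.cat((text_feats, cat_feats, numerical_feats), dim=1)
    print(combined_feats.shape)  # torch.Size([8, 782])

    # torch.stack adds a new dimension and requires all inputs to have identical
    # shapes, so it raises "stack expects each tensor to be equal size" here.
    # torch.cat along dim=0 concatenates batches rather than features and requires
    # the trailing dimensions to match, which these tensors do not.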
jtfields authored May 31, 2023
1 parent bd5bf14 commit bfc6c24
Showing 5 changed files with 3,946 additions and 4 deletions.
2 changes: 2 additions & 0 deletions multimodal_transformers/model/__init__.py
@@ -5,6 +5,7 @@
BertWithTabular,
RobertaWithTabular,
DistilBertWithTabular,
LongformerWithTabular
)


@@ -15,4 +16,5 @@
"BertWithTabular",
"RobertaWithTabular",
"DistilBertWithTabular",
"LongformerWithTabular"
]
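With the export above in place, the new class can be imported directly from the package; a minimal sketch whose import path simply mirrors the __init__.py shown above:

    # Import the newly exported tabular Longformer wrapper.
    from multimodal_transformers.model import LongformerWithTabular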
2 changes: 1 addition & 1 deletion multimodal_transformers/model/tabular_combiner.py
@@ -455,7 +455,7 @@ def forward(self, text_feats, cat_feats=None, numerical_feats=None):
if cat_feats.shape[1] != 0:
if self.cat_feat_dim > self.text_out_dim:
cat_feats = self.cat_mlp(cat_feats)
w_cat = torch.mm(cat_feats, self.weight_cat)
w_cat = torch.mm(cat_feats, self.weight_cat)
g_cat = (
(torch.cat([w_text, w_cat], dim=-1) * self.weight_a)
.sum(dim=1)
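For context on the torch.mm / torch.mul / torch.matmul experiments recorded in the commit messages, here is a minimal sketch of how they differ on 2-D tensors; the shapes below are illustrative assumptions, not values from tabular_combiner.py.

    import torch

    cat_feats = torch.randn(8, 9)      # assumed (batch_size, cat_feat_dim)
    weight_cat = torch.randn(9, 768)   # assumed (cat_feat_dim, text_out_dim)

    # torch.mm is a strict 2-D matrix multiply: (8, 9) @ (9, 768) -> (8, 768).
    # Passing anything that is not a matrix triggers "RuntimeError: self must be a matrix".
    w_cat = torch.mm(cat_feats, weight_cat)

    # torch.matmul gives the same result for 2-D inputs but also handles 1-D and
    # batched operands via broadcasting.
    assert torch.allclose(w_cat, torch.matmul(cat_feats, weight_cat))

    # torch.mul is an element-wise product and needs broadcastable shapes;
    # torch.mul(cat_feats, weight_cat) fails here because (8, 9) and (9, 768)
    # do not broadcast.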
9 changes: 6 additions & 3 deletions multimodal_transformers/model/tabular_modeling_auto.py
@@ -2,36 +2,39 @@

from transformers.configuration_utils import PretrainedConfig
from transformers import (
LongformerConfig,
AutoConfig,
AlbertConfig,
BertConfig,
DistilBertConfig,
RobertaConfig,
XLNetConfig,
XLMConfig,
XLMRobertaConfig,
XLMRobertaConfig
)

from .tabular_transformers import (
LongformerWithTabular,
RobertaWithTabular,
BertWithTabular,
DistilBertWithTabular,
AlbertWithTabular,
XLNetWithTabular,
XLMWithTabular,
XLMRobertaWithTabular,
XLMRobertaWithTabular
)


MODEL_FOR_SEQUENCE_W_TABULAR_CLASSIFICATION_MAPPING = OrderedDict(
[
(LongformerConfig, LongformerWithTabular),
(RobertaConfig, RobertaWithTabular),
(BertConfig, BertWithTabular),
(DistilBertConfig, DistilBertWithTabular),
(AlbertConfig, AlbertWithTabular),
(XLNetConfig, XLNetWithTabular),
(XLMConfig, XLMWithTabular),
(XLMRobertaConfig, XLMRobertaWithTabular),
(XLMRobertaConfig, XLMRobertaWithTabular)
]
)

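To illustrate how a config-to-model mapping like MODEL_FOR_SEQUENCE_W_TABULAR_CLASSIFICATION_MAPPING is typically consumed, here is a rough sketch. The lookup helper below is hypothetical and not the package's actual auto-class implementation, and the subclassing note is a general observation about transformers config classes rather than a claim about this repository.

    from collections import OrderedDict

    # Hypothetical stand-ins for the real config/model classes in the mapping.
    class LongformerConfig: ...
    class LongformerWithTabular: ...

    MAPPING = OrderedDict([
        (LongformerConfig, LongformerWithTabular),
        # ... remaining (config, model) pairs in the order shown in the diff ...
    ])

    def model_class_for_config(config):
        # Return the first model class whose config type matches. Because some
        # transformers config classes subclass one another, iteration order can
        # matter, which is one reason to keep Longformer at the top of the list.
        for config_cls, model_cls in MAPPING.items():
            if isinstance(config, config_cls):
                return model_cls
        raise ValueError(f"Unrecognized configuration class {type(config)}")

    print(model_class_for_config(LongformerConfig()).__name__)  # LongformerWithTabular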
90 changes: 90 additions & 0 deletions multimodal_transformers/model/tabular_transformers.py
@@ -1,3 +1,4 @@
import torch
from torch import nn
from transformers import (
BertForSequenceClassification,
@@ -6,6 +7,7 @@
AlbertForSequenceClassification,
XLNetForSequenceClassification,
XLMForSequenceClassification,
LongformerForSequenceClassification
)
from transformers.models.bert.modeling_bert import BERT_INPUTS_DOCSTRING
from transformers.models.roberta.modeling_roberta import ROBERTA_INPUTS_DOCSTRING
@@ -15,6 +17,7 @@
from transformers.models.albert.modeling_albert import ALBERT_INPUTS_DOCSTRING
from transformers.models.xlnet.modeling_xlnet import XLNET_INPUTS_DOCSTRING
from transformers.models.xlm.modeling_xlm import XLM_INPUTS_DOCSTRING
from transformers.models.longformer.modeling_longformer import LONGFORMER_INPUTS_DOCSTRING
from transformers.models.xlm_roberta.configuration_xlm_roberta import XLMRobertaConfig
from transformers.file_utils import add_start_docstrings_to_model_forward

@@ -679,3 +682,90 @@ def forward(
class_weights,
)
return loss, logits, classifier_layer_outputs

class LongformerWithTabular(LongformerForSequenceClassification):
"""
Longformer Model With Sequence Classification Head
"""
def __init__(self, hf_model_config): #, embedding_weights=None):
#hf_model_config.summary_proj_to_labels=False #Added from XLM example
super().__init__(hf_model_config)
tabular_config = hf_model_config.tabular_config
if type(tabular_config) is dict: # when loading from saved model
tabular_config = TabularConfig(**tabular_config)
else:
self.config.tabular_config = tabular_config.__dict__

tabular_config.text_feat_dim = hf_model_config.hidden_size
tabular_config.hidden_dropout_prob = hf_model_config.hidden_dropout_prob
self.tabular_combiner = TabularFeatCombiner(tabular_config)
self.num_labels = tabular_config.num_labels
combined_feat_dim = self.tabular_combiner.final_out_dim
self.dropout = nn.Dropout(hf_model_config.hidden_dropout_prob)
if tabular_config.use_simple_classifier:
self.tabular_classifier = nn.Linear(combined_feat_dim,
tabular_config.num_labels)
else:
dims = calc_mlp_dims(combined_feat_dim,
division=tabular_config.mlp_division,
output_dim=tabular_config.num_labels)
self.tabular_classifier = MLP(combined_feat_dim,
tabular_config.num_labels,
num_hidden_lyr=len(dims),
dropout_prob=tabular_config.mlp_dropout,
hidden_channels=dims,
bn=True)

# load embeddings
#self.embedding_layer = nn.Embedding.from_pretrained(torch.from_numpy(embedding_weights).float(), freeze=True)
#self.embedding_layer = nn.Embedding()

@add_start_docstrings_to_model_forward(LONGFORMER_INPUTS_DOCSTRING.format("(batch_size, sequence_length)"))
def forward(
self,
#input_ids(torch.LongTensor(batch_size, sequence_length)),
input_ids=None,
attention_mask=None,
token_type_ids=None,
position_ids=None,
#global_attention_mask=None,
head_mask=None,
inputs_embeds=None,
labels=None,
output_attentions=None,
output_hidden_states=None,
#return_dict=None,
class_weights=None,
cat_feats=None,
numerical_feats=None
):
# if global_attention_mask is None:
# print("Initializing global attention on CLS token...")
# global_attention_mask = torch.zeros_like(input_ids)
# # global attention on cls token
# global_attention_mask[:, 0] = 1

outputs = self.longformer(
input_ids,
attention_mask=attention_mask,
#global_attention_mask=global_attention_mask,
token_type_ids=token_type_ids,
position_ids=position_ids,
head_mask=head_mask,
inputs_embeds=inputs_embeds,
output_attentions=output_attentions,
output_hidden_states=output_hidden_states,
#return_dict=return_dict,
)
sequence_output = outputs[0] #added from Roberta
text_feats = sequence_output[:,0,:] #added from Roberta
text_feats = self.dropout(text_feats) #added from Roberta
combined_feats = self.tabular_combiner(text_feats,
cat_feats,
numerical_feats)
loss, logits, classifier_layer_outputs = hf_loss_func(combined_feats,
self.tabular_classifier,
labels,
self.num_labels,
class_weights)
return loss, logits, classifier_layer_outputs
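
Finally, a hedged usage sketch for the new class. The forward signature and the hf_model_config.tabular_config attachment follow the diff above; the TabularConfig keyword names other than num_labels, the TabularConfig import from the package root, and the example checkpoint are assumptions for illustration.

    import torch
    from transformers import AutoConfig, AutoTokenizer
    from multimodal_transformers.model import LongformerWithTabular, TabularConfig  # TabularConfig export assumed

    # num_labels is read by the class above; cat_feat_dim / numerical_feat_dim are
    # assumed TabularConfig field names for the tabular feature widths.
    tabular_config = TabularConfig(num_labels=2, cat_feat_dim=9, numerical_feat_dim=5)

    hf_config = AutoConfig.from_pretrained("allenai/longformer-base-4096")  # example checkpoint
    hf_config.tabular_config = tabular_config  # read back in LongformerWithTabular.__init__

    model = LongformerWithTabular.from_pretrained("allenai/longformer-base-4096", config=hf_config)
    tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")

    enc = tokenizer(["a long example document"], padding=True, truncation=True, return_tensors="pt")

    # Forward returns (loss, logits, classifier_layer_outputs), as in the diff.
    loss, logits, layer_outs = model(
        input_ids=enc["input_ids"],
        attention_mask=enc["attention_mask"],
        labels=torch.tensor([1]),
        cat_feats=torch.randn(1, 9),
        numerical_feats=torch.randn(1, 5),
    )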