Benchmark for sequence classification #874
-
I am trying to classify a sequence of tokens. The setup can be thought of as an NER task in NLP, where each token has multiple possible entity classes. The setup in question is similar to the discussion here, i.e., the point 1 requirement by @AndreaCossu is fulfilled, but it differs in point 2. Instead of whole-sequence classification, classification is required for each token (each timestep), but the number of timesteps (the maximum sentence length in tokens) is known and fixed. I tried to create an AvalancheDataset from a custom PyTorch dataset, but got the error - So is it possible to set up a benchmark for such scenarios in Avalanche? PS: If I understand correctly, this discussion suggests that it's not possible, or that it needs extra care when using metrics. Please correct me if wrong.
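For concreteness, here is a minimal sketch of the kind of custom dataset described above: each sample is a fixed-length token sequence with one label per token. This is a pure-Python stand-in (no torch), only to illustrate the shapes involved; the class name, padding token, and padding label are all made up for the example.

```python
class TokenTaggingDataset:
    """Hypothetical token-level classification dataset (NER-style):
    one class label per token, padded to a known, fixed max length."""

    def __init__(self, sequences, labels, max_len, pad_token=0, pad_label=-100):
        # sequences: list of lists of token ids; labels: parallel per-token labels
        assert len(sequences) == len(labels)
        self.sequences = sequences
        self.labels = labels
        self.max_len = max_len
        self.pad_token = pad_token
        self.pad_label = pad_label

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        # truncate to max_len, then pad both inputs and targets to the fixed length
        x = self.sequences[idx][: self.max_len]
        y = self.labels[idx][: self.max_len]
        pad = self.max_len - len(x)
        return x + [self.pad_token] * pad, y + [self.pad_label] * pad


ds = TokenTaggingDataset([[5, 7, 9]], [[1, 0, 2]], max_len=5)
x, y = ds[0]
print(x)  # [5, 7, 9, 0, 0]
print(y)  # [1, 0, 2, -100, -100]
```

In a real setup `x` and `y` would be tensors, but the key point is the same: the target is a fixed-length vector of per-token labels, not a single class.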
Replies: 1 comment
-
Hi @davians12! Thanks for reaching out.
One hack that should work exploits the fact that an Avalanche dataset (like a PyTorch dataset) can return a variable number of elements.
So, when looping over the dataloader you can have a variable number of tensors: `for x, y, a, b, ..., t in dataloader`. Consider that in Avalanche the `BaseStrategy` defines the input `mb_x` as the first element returned by the dataloader, the target `mb_y` as the second, and the (optional) task label `mb_task_id` as the last one (see here the properties I am mentioning). So, you can:
Let me know if this is clear and works (I haven't tried it yet) or if you need further help in coding this up. Of course, if you find a better solution feel free to share!
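The unpacking convention described above can be sketched as follows. This is a pure-Python stand-in, not actual Avalanche code; the `samples` list and its extra payload are invented for illustration. The point is only the tuple layout: input first, target second, task label last, with any extra tensors in between.

```python
# Each item mimics what the dataloader would yield per sample:
# (x, y, extra..., task_label), where y holds one class label per timestep.
samples = [
    ([5, 7, 9], [1, 0, 2], "meta-a", 0),
    ([4, 4, 8], [0, 0, 1], "meta-b", 0),
]

for item in samples:
    # first element -> mb_x, second -> mb_y, last -> mb_task_id;
    # anything in between is extra payload carried along untouched
    x, y, *extras, t = item
    assert len(x) == len(y)  # one target per token (token-level classification)
    print(x, y, extras, t)
```

Because the target is the *second* element regardless of how many extras follow, a per-token label sequence can ride in the `y` slot without changing anything else in the loop.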