Approaches

We include baselines (Finetuning, Freezing and Incremental Joint Training) and the approaches defined in Class-incremental learning: survey and performance evaluation (arxiv). The regularization-based approaches are EWC, MAS, PathInt, LwF, LwM and DMC. The rehearsal approaches are iCaRL, EEIL and RWalk. The bias-correction approaches are IL2M, BiC and LUCIR.

Main usage

When running an experiment, the approach is selected in main_incremental.py with --approach. Each approach is called by the name of its *.py file. All approaches inherit from the class Inc_Learning_Appr, which has the following main arguments:

  • --nepochs: number of epochs per training session (default=200)
  • --lr: starting learning rate (default=0.1)
  • --lr-min: minimum learning rate (default=1e-4)
  • --lr-factor: learning rate decreasing factor (default=3)
  • --lr-patience: maximum patience to wait before decreasing learning rate (default=5)
  • --clipping: clip gradient norm (default=10000)
  • --momentum: momentum factor (default=0.0)
  • --weight-decay: weight decay (L2 penalty) (default=0.0)
  • --warmup-nepochs: number of warm-up epochs (default=0)
  • --warmup-lr-factor: warm-up learning rate factor (default=1.0)
  • --multi-softmax: apply separate softmax for each task (default=False)
  • --fix-bn: fix batch normalization after first task (default=False)
  • --eval-on-train: show train loss and accuracy (default=False)
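The interplay of --lr, --lr-factor, --lr-patience and --lr-min can be illustrated with a patience-based decay. This is a minimal sketch, assuming the learning rate is divided by the factor once the validation loss has failed to improve for `lr_patience` consecutive epochs and that training stops when the rate drops below the minimum. The function name `patience_schedule` and the loss values are hypothetical, and the actual training loop may differ.

```python
def patience_schedule(val_losses, lr=0.1, lr_min=1e-4, lr_factor=3, lr_patience=5):
    """Return the learning rate used at each epoch (hypothetical helper)."""
    best = float("inf")
    patience = lr_patience
    history = []
    for loss in val_losses:
        history.append(lr)
        if loss < best:
            # Improvement: remember the best loss and reset the patience counter.
            best = loss
            patience = lr_patience
        else:
            patience -= 1
            if patience <= 0:
                # Patience exhausted: decay the learning rate.
                lr /= lr_factor
                if lr < lr_min:
                    # Assumed stopping rule: end the session below lr_min.
                    break
                patience = lr_patience
    return history


if __name__ == "__main__":
    losses = [1.0, 0.9] + [0.95] * 6
    print(patience_schedule(losses))
```

With the defaults, the rate stays at 0.1 until five epochs in a row fail to beat the best validation loss, after which it is divided by 3.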

If the approach has some specific arguments, those are defined in the specific extra_parser() of each approach file and are also listed below. All of this information is also available by using --help.
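The split between shared arguments and approach-specific ones can be sketched with argparse. This is an illustrative sketch only: `ToyApproach` and `parse` are hypothetical stand-ins, and the assumption is that the shared parser leaves unrecognized flags for the approach's extra_parser() to consume. The real classes in the repository may be organized differently.

```python
from argparse import ArgumentParser


class ToyApproach:
    """Hypothetical stand-in for an approach class (not the real Inc_Learning_Appr)."""

    @staticmethod
    def extra_parser(args):
        # Approach-specific flags live next to the approach itself.
        parser = ArgumentParser()
        parser.add_argument("--lamb", default=1.0, type=float)
        parser.add_argument("--T", default=2.0, type=float)
        return parser.parse_known_args(args)


def parse(argv):
    # Shared arguments (a small subset, for illustration).
    base = ArgumentParser()
    base.add_argument("--approach", default="finetuning", type=str)
    base.add_argument("--nepochs", default=200, type=int)
    base.add_argument("--lr", default=0.1, type=float)
    base_args, remaining = base.parse_known_args(argv)
    # Whatever the shared parser did not recognize goes to the approach.
    appr_args, unknown = ToyApproach.extra_parser(remaining)
    return base_args, appr_args, unknown


if __name__ == "__main__":
    print(parse(["--approach", "lwf", "--lr", "0.05", "--T", "4"]))
```

Using `parse_known_args` at both levels keeps each approach's flags out of the shared parser, which is one way to let new approaches add options without touching common code.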

Allowing rehearsal

For all approaches using exemplars, the corresponding arguments are:

  • --num-exemplars: fixed memory, total number of exemplars (default=0)
  • --num-exemplars-per-class: growing memory, number of exemplars per class (default=0)
  • --exemplar-selection: exemplar selection strategy (default='random')

Note that --num-exemplars and --num-exemplars-per-class cannot be used at the same time. We extend LwF, EWC, MAS and Path Integral to allow exemplar rehearsal.
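The difference between the fixed and growing memory options can be sketched as follows. This is a minimal sketch, assuming a fixed memory is divided evenly among the classes seen so far (a common convention in rehearsal methods). The helper name `exemplars_per_class` is hypothetical and is not taken from the repository.

```python
def exemplars_per_class(num_seen_classes, num_exemplars=0, num_exemplars_per_class=0):
    """Return how many exemplars each class may keep (hypothetical helper)."""
    if num_exemplars > 0 and num_exemplars_per_class > 0:
        raise ValueError("use either a fixed or a growing memory, not both")
    if num_exemplars_per_class > 0:
        # Growing memory: the per-class budget is constant, total memory grows.
        return num_exemplars_per_class
    if num_exemplars > 0:
        # Fixed memory: the total is constant, the per-class budget shrinks.
        return num_exemplars // num_seen_classes
    # Both defaults are 0: no rehearsal.
    return 0


if __name__ == "__main__":
    print(exemplars_per_class(10, num_exemplars=2000))
    print(exemplars_per_class(50, num_exemplars=2000))
    print(exemplars_per_class(50, num_exemplars_per_class=20))
```

Under this assumption, a fixed memory of 2000 exemplars gives 200 per class after 10 classes and 40 per class after 50 classes, whereas a growing memory keeps the per-class count unchanged.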

Adding new approaches

To add a new approach, follow these steps:

  1. Create a new file similar to finetuning.py. The name used will be the one that can be called with --approach.
  2. Implement the method as needed and override the necessary functions and methods from incremental_learning.py.
  3. Add the necessary arguments to the approach parser. Avoid modifying calculate_metrics() unless strictly necessary, so that metrics remain comparable across approaches.

Baselines

Finetuning

--approach finetuning

Learning approach which learns each task incrementally while not using any data or knowledge from previous tasks. By default, weights corresponding to the outputs of previous classes are not updated. This can be changed by using --all-outputs. This approach allows the use of exemplars.

Freezing

--approach freezing

Learning approach which freezes the model after training the first task so that only the heads are learned. The task after which the model is frozen can be changed by using --freeze-after num_task (int). As in Finetuning, by default only the outputs corresponding to the current task are updated, but this can be changed by using --all-outputs.

Incremental Joint Training

--approach joint

Learning approach which has access to all data from previous tasks and serves as an upper-bound baseline. Joint training can be combined with Freezing by using --freeze-after num_task (int). This option is disabled by default (default=-1).

Approaches

Learning without Forgetting

--approach lwf arxiv | TPAMI 2017

  • --lamb: forgetting-intransigence trade-off (default=1)
  • --T: temperature scaling (default=2)
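The roles of --lamb and --T can be illustrated with a generic temperature-scaled distillation loss. This is a sketch of standard knowledge distillation in plain Python, not necessarily the exact loss implemented in lwf.py. All function names here are hypothetical.

```python
import math


def softmax(logits, T=1.0):
    """Numerically stable softmax of logits divided by temperature T."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(new_logits, old_logits, T=2.0):
    """Cross-entropy between the softened old and new model outputs."""
    p_old = softmax(old_logits, T)
    p_new = softmax(new_logits, T)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))


def total_loss(ce_new, new_logits_old_heads, old_logits, lamb=1.0, T=2.0):
    """New-task loss plus lamb times the distillation term on the old heads."""
    return ce_new + lamb * distillation_loss(new_logits_old_heads, old_logits, T)


if __name__ == "__main__":
    old = [0.0, 2.0]
    print(distillation_loss(old, old))         # outputs unchanged
    print(distillation_loss([2.0, 0.0], old))  # outputs drifted
```

A larger `lamb` weights the preservation of old outputs more heavily (less forgetting, more intransigence), while a larger `T` softens both distributions before they are compared.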

iCaRL

--approach icarl arxiv | CVPR 2017 | code

  • --lamb: forgetting-intransigence trade-off (default=1)

Elastic Weight Consolidation

--approach ewc arxiv | PNAS 2017

  • --lamb: forgetting-intransigence trade-off (default=5000)
  • --alpha: trade-off for how old and new Fisher are fused (default=0.5)
  • --fi-sampling-type: sampling type for Fisher information (default='max_pred')
  • --fi-num-samples: number of samples for Fisher information (-1: all available) (default=-1)
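How --lamb and --alpha enter the method can be sketched with the usual EWC quadratic penalty and a convex combination of Fisher estimates. This is a minimal sketch on flat lists of floats. It assumes `alpha` weights the old Fisher estimate and `1 - alpha` the new one, which should be checked against ewc.py, and both function names are hypothetical.

```python
def fuse_fisher(old_fisher, new_fisher, alpha=0.5):
    """Blend the stored Fisher estimate with the one from the latest task.

    Assumption: alpha weights the old estimate, (1 - alpha) the new one.
    """
    return [alpha * fo + (1 - alpha) * fn for fo, fn in zip(old_fisher, new_fisher)]


def ewc_penalty(params, old_params, fisher, lamb=5000):
    """Quadratic penalty: lamb / 2 * sum_i F_i * (theta_i - theta_old_i)^2."""
    return lamb / 2 * sum(
        f * (p - po) ** 2 for f, p, po in zip(fisher, params, old_params)
    )


if __name__ == "__main__":
    fisher = fuse_fisher([1.0, 0.0], [0.0, 2.0], alpha=0.5)
    print(fisher)
    print(ewc_penalty([1.0, 1.0], [0.0, 1.0], fisher, lamb=2))
```

Parameters with a high Fisher value are pulled strongly toward their previous values, while parameters with a value near zero remain free to adapt to the new task.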

Path Integral (aka Synaptic Intelligence)

--approach path_integral arxiv | ICML 2017 | code

  • --lamb: forgetting-intransigence trade-off (default=0.1)
  • --damping: damping (default=0.1)

Memory Aware Synapses

--approach mas arxiv | ECCV 2018 | code

  • --lamb: forgetting-intransigence trade-off (default=1)
  • --alpha: trade-off for how old and new Fisher are fused (default=0.5)
  • --fi-num-samples: number of samples for Fisher information (-1: all available) (default=-1)

Riemannian Walk

--approach r_walk arxiv | ECCV 2018 | code

  • --lamb: forgetting-intransigence trade-off (default=1)
  • --alpha: trade-off for how old and new Fisher are fused (default=0.5)
  • --damping: damping (default=0.1)
  • --fi-sampling-type: sampling type for Fisher information (default='max_pred')
  • --fi-num-samples: number of samples for Fisher information (-1: all available) (default=-1)

End-to-End Incremental Learning

--approach eeil arxiv | ECCV 2018 | code

  • --lamb: forgetting-intransigence trade-off (default=1)
  • --T: temperature scaling (default=2)
  • --lr-finetuning-factor: finetuning learning rate factor (default=0.01)
  • --nepochs-finetuning: number of epochs for balanced training (default=40)
  • --noise-grad: add noise to gradients (default=False)

Learning without Memorizing

--approach lwm arxiv | CVPR 2019

  • --beta: trade-off for distillation loss (default=1)
  • --gamma: trade-off for attention loss (default=1)
  • --gradcam-layer: which layer to use for GradCAM calculations (default='layer3')
  • --log-gradcam-samples: how many examples of GradCAM to log (default=0)

Deep Model Consolidation

--approach dmc arxiv | WACV 2020 | code

  • --aux-dataset: auxiliary dataset (default='imagenet_32_reduced')
  • --aux-batch-size: batch size for auxiliary dataset (default=128)

Bias Correction

--approach bic arxiv | CVPR 2019 | code

  • --lamb: forgetting-intransigence trade-off (-1: original moving trade-off) (default=-1)
  • --T: temperature scaling (default=2)
  • --val-exemplar-percentage: percentage of exemplars that will be used for validation (default=0.1)
  • --num-bias-epochs: number of epochs for training bias (default=200)
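The idea behind the bias-correction stage and --val-exemplar-percentage can be sketched as follows. This is a minimal sketch, assuming a two-parameter linear correction applied only to the logits of new classes and a simple split of the exemplar memory into training and validation parts. The helper names are hypothetical and the code in bic.py may differ.

```python
def bias_correct(logits, num_old_classes, alpha=1.0, beta=0.0):
    """Leave old-class logits untouched, rescale and shift new-class logits."""
    return [
        z if i < num_old_classes else alpha * z + beta
        for i, z in enumerate(logits)
    ]


def split_exemplars(exemplars, val_percentage=0.1):
    """Reserve a fraction of the exemplars for fitting alpha and beta."""
    n_val = int(len(exemplars) * val_percentage)
    return exemplars[n_val:], exemplars[:n_val]


if __name__ == "__main__":
    print(bias_correct([1.0, 2.0, 3.0, 4.0], num_old_classes=2, alpha=0.5, beta=-1.0))
    train, val = split_exemplars(list(range(20)), val_percentage=0.1)
    print(len(train), len(val))
```

In this sketch the two correction parameters would be fitted on the held-out exemplars for the number of epochs given by --num-bias-epochs, with the rest of the network kept fixed.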

Learning a Unified Classifier Incrementally via Rebalancing

--approach lucir CVPR 2019 | code

  • --lamb: trade-off for distillation loss (default=5)
  • --lamb-mr: trade-off for the MR loss (default=1)
  • --dist: margin threshold for the MR loss (default=0.5)
  • --K: number of new class embeddings chosen as hard negatives for the MR loss (default=2)
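The roles of --dist and --K in the margin ranking (MR) loss can be sketched for a single old-class sample. This is a minimal sketch, assuming the scores are cosine similarities, the K highest-scoring new classes act as hard negatives, and the per-negative hinge terms are summed. The function name is hypothetical and the reduction used in lucir.py may differ.

```python
def margin_ranking_loss(gt_score, new_class_scores, dist=0.5, K=2):
    """Hinge loss pushing the true-class score above the top-K new-class scores."""
    # Hard negatives: the K new classes the sample is most similar to.
    hard_negatives = sorted(new_class_scores, reverse=True)[:K]
    # Each negative contributes only if it is within the margin of the true score.
    return sum(max(dist - gt_score + neg, 0.0) for neg in hard_negatives)


if __name__ == "__main__":
    print(margin_ranking_loss(0.9, [0.1, 0.6, 0.3], dist=0.5, K=2))
```

The result would then be weighted by --lamb-mr before being added to the other loss terms.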

Dual Memory (IL2M)

--approach il2m ICCV 2019 | code