Train DINO from scratch #1

xiaoyatang · 2024-11-29T22:50:05Z

Dear authors,

Thanks for your work! A simple question: why do we need a dino_scratch.pth for training DINO from scratch? Isn't it from scratch?
Running case:
python compute_feats.py \
  --embedder=DINO \
  --num_classes=2048 \
  --backbone=vit_small \
  --weights=embedders/dino_scratch.pth \
  --version_name=dino_scratch

da-nial · 2024-11-29T23:17:16Z

Thanks for the question! Let me clarify:

The dino_scratch.pth file contains the weights of the DINO model after it has been trained from scratch on the patches. While DINO is indeed trained "from scratch", we need to save these trained weights so we can use them later for feature extraction. The compute_feats.py script you referenced is the step that comes after training - it uses the already-trained DINO model (i.e. embedder, loaded from dino_scratch.pth) to extract features (embeddings) from patches.

xiaoyatang · 2024-11-29T23:29:47Z

Hi Danial,

Thank you for your prompt reply! That helps a lot! I have 3 more questions:

1. Did you provide the code for training the embedder from scratch on patches?
2. I am using my own patches generated from Camelyon16. Does the tile_label.csv you generated in 'deepzoom_tiler_camelyon16.py' contain labels for all patches or tiles?
3. What does 'bags' mean in compute_feats.py? In your provided dataset structure, {DATASETS_PATH}/{args.dataset}/'single/{args.fold}'/{train, validation, test}/{0_normal,1_tumor}, I can't see what the 'bags' are.

da-nial · 2024-11-30T00:36:53Z

Hi! Thanks for your follow-up questions. Let me address each one:

1. For training DINO from scratch, we actually don't provide the code in our repository because the original DINO repository implementation is sufficient. You can follow their training procedure as mentioned in this section. We only provide the code for our adapter-based methods (DINO with Adapter, MAE with Adapter).
2. Yes, the tile_label.csv generated by deepzoom_tiler_camelyon16.py contains labels for all patches/tiles. This file is necessary if you want to evaluate patch-level metrics (e.g. accuracy or AUC) for patch classification and ROI detection.
3. The term "bags" comes from treating WSI classification as a Multiple Instance Learning (MIL) problem. In this context:
   - A "bag" refers to a single WSI.
   - The "instances" are the individual patches/tiles from that slide.

   We sometimes use these terms interchangeably in the code: Slide/Bag and Instance/Patch/Tile.
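
To make the bag/instance terminology concrete, here is a minimal sketch in plain Python. It is not code from the repository: the slide IDs, tile names, the record layout, and the helpers `group_into_bags` and `bag_label` are all hypothetical, chosen only for illustration. It groups tiles by slide (one bag per WSI) and derives a slide-level label using the standard MIL assumption that a bag is positive if at least one of its instances is positive.

```python
from collections import defaultdict

# Hypothetical tile records: (slide_id, tile_name, tile_label), 1 = tumor, 0 = normal.
# The real tile_label.csv may use different columns; this layout is an assumption.
tiles = [
    ("slide_A", "0_0.jpeg", 0),
    ("slide_A", "0_1.jpeg", 1),
    ("slide_B", "0_0.jpeg", 0),
    ("slide_B", "1_0.jpeg", 0),
]


def group_into_bags(records):
    """Group instance (tile) records by slide, giving one bag per WSI."""
    bags = defaultdict(list)
    for slide_id, tile_name, label in records:
        bags[slide_id].append((tile_name, label))
    return dict(bags)


def bag_label(instances):
    """Standard MIL assumption: a bag is positive if any instance is positive."""
    return int(any(label == 1 for _, label in instances))


bags = group_into_bags(tiles)
slide_labels = {slide_id: bag_label(instances) for slide_id, instances in bags.items()}
print(slide_labels)  # {'slide_A': 1, 'slide_B': 0}
```

In this sketch each key of `bags` plays the role of a Slide/Bag, and each entry in its list is an Instance/Patch/Tile.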

Please let me know if you have any other questions!

xiaoyatang · 2024-11-30T06:06:49Z

Thank you so much!!
