SparkDLTrigger/Training_Distributed at master · cerndb/SparkDLTrigger

MultiWorker_Notebooks
MultiWorker_PythonCode
Training_TFKeras_CPU_GPU_K8S_Distributed
4.3a-Model_evaluate_ROC_and_CM.ipynb
README.md


Distributed training

This folder contains code for training the Inclusive classifier with tf.keras in distributed mode. Training is parallelized with the tf.distribute API, using MultiWorkerMirroredStrategy, and the input data is read in TFRecord format with tf.data.
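As an illustration of how these pieces fit together, here is a minimal sketch, not the code from this folder. The feature names (`features`, `label`), their sizes, the toy dense model, and the helper functions are all assumptions made for the example; the real Inclusive classifier uses its own schema and architecture. When `TF_CONFIG` is not set, `MultiWorkerMirroredStrategy` runs as a single worker, so the sketch can be tried on one machine.

```python
import os
import random
import tempfile

import tensorflow as tf

# Create the strategy first: collective ops must be configured
# before any other TensorFlow op runs.
strategy = tf.distribute.MultiWorkerMirroredStrategy()

# Hypothetical schema, for illustration only.
NUM_FEATURES = 14
NUM_CLASSES = 3


def write_toy_tfrecord(path, n=32):
    """Write n random examples so the sketch runs without the real data."""
    with tf.io.TFRecordWriter(path) as writer:
        for i in range(n):
            label = [0.0] * NUM_CLASSES
            label[i % NUM_CLASSES] = 1.0
            feature = {
                "features": tf.train.Feature(float_list=tf.train.FloatList(
                    value=[random.random() for _ in range(NUM_FEATURES)])),
                "label": tf.train.Feature(
                    float_list=tf.train.FloatList(value=label)),
            }
            example = tf.train.Example(
                features=tf.train.Features(feature=feature))
            writer.write(example.SerializeToString())


def parse_example(serialized):
    """Decode one serialized tf.train.Example into (features, label)."""
    schema = {
        "features": tf.io.FixedLenFeature([NUM_FEATURES], tf.float32),
        "label": tf.io.FixedLenFeature([NUM_CLASSES], tf.float32),
    }
    parsed = tf.io.parse_single_example(serialized, schema)
    return parsed["features"], parsed["label"]


def make_dataset(path, global_batch_size):
    """Build a tf.data pipeline that reads and batches TFRecord data."""
    return (tf.data.TFRecordDataset([path])
            .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
            .shuffle(1024)
            .batch(global_batch_size)
            .prefetch(tf.data.AUTOTUNE))


def build_model():
    """Create and compile a toy classifier inside the strategy scope."""
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(NUM_FEATURES,)),
            tf.keras.layers.Dense(32, activation="relu"),
            tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
    return model


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.tfrecord")
    write_toy_tfrecord(path)
    # The batch size passed here is the global one; the strategy
    # splits each batch across the available replicas.
    build_model().fit(make_dataset(path, global_batch_size=8), epochs=1)
```

The model and its variables are created inside `strategy.scope()` so that they are mirrored on every replica, while the dataset is built outside the scope and handed to `fit`, which distributes it.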

MultiWorker_Notebooks: distributed training and visualization of model performance metrics using notebooks
MultiWorker_PythonCode: distributed training using "manual" Python code, suitable for testing on a local node
Training_TFKeras_CPU_GPU_K8S_Distributed: distributed training on Kubernetes using the custom tool TF-Spawner
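For local testing with several worker processes, MultiWorkerMirroredStrategy learns about the cluster from the `TF_CONFIG` environment variable, which each worker must set before creating the strategy. The sketch below shows one way to build that value. The helper `make_tf_config` and the host and port values are hypothetical, chosen for illustration, and are not taken from this repository.

```python
import json
import os


def make_tf_config(worker_hosts, task_index):
    """Return the TF_CONFIG JSON string for one worker of the cluster.

    worker_hosts: list of "host:port" strings, identical on every worker.
    task_index: position of this worker in worker_hosts.
    """
    if not 0 <= task_index < len(worker_hosts):
        raise ValueError("task_index must refer to an entry in worker_hosts")
    return json.dumps({
        "cluster": {"worker": list(worker_hosts)},
        "task": {"type": "worker", "index": task_index},
    })


if __name__ == "__main__":
    # Hypothetical two-worker setup on a single node.
    workers = ["localhost:12345", "localhost:12346"]
    # Run once per process, changing only task_index (0 for the first
    # worker, 1 for the second), before importing and using TensorFlow.
    os.environ["TF_CONFIG"] = make_tf_config(workers, task_index=0)
    print(os.environ["TF_CONFIG"])
```

Every worker uses the same `cluster` description and differs only in its `task` index; the worker with index 0 acts as the chief.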

Note: see also Training_TFKeras_CPU_GPU_K8S_Distributed for distributed training on Kubernetes clusters using the custom TF-Spawner tool.
