Fine-Tune a Chest X-ray Classification Model Using NVIDIA Clara Train
Training with NVIDIA Clara Train
In this section, you will fine-tune a pretrained model with your own data. The resulting model can then be used for inference. Disclaimer: this model is not for clinical use, and these models are not fit for clinical decision making.
To set up the Clara Train SDK, refer to https://blog.uplandr.com/2021/08/29/Setup-NVIDIA-Clara-Train-Medical-Image.html
I am following https://medium.com/@integratorbrad/how-i-built-a-space-to-train-and-infer-on-medical-imaging-ai-models-part-9-26bbaae9ca2f
and a few other sources.
Prepare the training data
For this exercise we will use training images from https://github.com/ieee8023/covid-chestxray-dataset
Download the repository to the ~/Downloads folder
Unzip the file into the experiments folder
The contents of the folder
Install a few Python libraries for cleaning up the input images:
Create a folder for storing training images
I installed the Sublime Text 3 editor for writing the source code.
https://linuxize.com/post/how-to-install-sublime-text-3-on-ubuntu-18-04/
Start Sublime Text
Create a convert.py Python file for processing the images
Run the script to process the files
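The contents of my convert.py are not reproduced here, so the following is only a minimal sketch of what such a script could look like. It assumes the job is simply to write every image as a PNG (a later step renames .jpg/.jpeg references to .png), that Pillow is installed, and that RGB output is wanted. The function names and folder arguments are hypothetical.

```python
from pathlib import Path

# Source formats to pick up; everything is written back out as .png.
SOURCE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def plan_conversions(src_dir, dst_dir):
    """Pair each source image with the .png path it should be written to."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    return [
        (path, dst_dir / (path.stem + ".png"))
        for path in sorted(src_dir.iterdir())
        if path.is_file() and path.suffix.lower() in SOURCE_SUFFIXES
    ]


def convert_all(src_dir, dst_dir):
    """Write every planned image as a PNG file (requires Pillow)."""
    from PIL import Image  # imported lazily so planning works without Pillow

    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in plan_conversions(src_dir, dst_dir):
        with Image.open(src) as img:
            # Assumption: 3-channel output; use "L" for single-channel images.
            img.convert("RGB").save(dst, format="PNG")
```

Calling convert_all with the dataset's image folder and the training-images folder would produce one .png per source image. Two source files that share a stem (for example a.jpg and a.png) would overwrite each other, so check for that first.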
Next up, create a JSON file listing each image and its finding.
I took the metadata.csv file and cleaned it up to create a datalist.json file.
In the process I removed some training data with biases, or with unique findings that had very few data points.
I used Excel to open the CSV and sort, filter, and remove rows as required.
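The spreadsheet cleanup can also be scripted. Here is a minimal sketch, assuming metadata.csv has `filename` and `finding` columns and that each datalist entry carries an `image` name plus a one-hot `label`. The `classes` key, the `min_count` threshold, and the function name are my own illustration, so compare the output with the sample datalist that ships with the MMAR before using it.

```python
import csv
import io
import json
from collections import Counter


def build_datalist(csv_text, min_count=2):
    """Build a datalist dict from the text of a metadata CSV.

    Findings seen fewer than min_count times are dropped, which mirrors
    the manual removal of findings with very few data points.
    """
    rows = [
        (r["filename"].strip(), r["finding"].strip())
        for r in csv.DictReader(io.StringIO(csv_text))
        if r.get("filename") and r.get("finding")
    ]
    counts = Counter(finding for _, finding in rows)
    classes = sorted(f for f, n in counts.items() if n >= min_count)
    index = {f: i for i, f in enumerate(classes)}

    training = []
    for filename, finding in rows:
        if finding not in index:
            continue  # rare finding, skipped
        label = [0] * len(classes)
        label[index[finding]] = 1  # one-hot encoding (assumed format)
        training.append({"image": filename, "label": label})
    return {"classes": classes, "training": training}


if __name__ == "__main__":
    sample = "filename,finding\na.jpg,COVID-19\nb.jpg,COVID-19\nc.jpg,SARS\n"
    print(json.dumps(build_datalist(sample), indent=2))
```

This does not handle the bias-related removals, which still need a human eye on the data.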
Use https://jsonlint.com/ to check the json for correctness.
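If you would rather not paste the file into a website, Python's built-in json module can do the same check locally. A small sketch (the function name is hypothetical):

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at where the parser gave up.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Read datalist.json into a string and pass it in. A return value of None means the file parsed cleanly.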
Replace all references to .jpeg and .jpg with .png in the datalist.json file.
Take one or two of the images out of the JSON so you can run inference on them later
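A find-and-replace in the editor works, but the rename can also be done on the parsed JSON so that only image names are touched. A minimal sketch, assuming the datalist is a dict whose list-valued sections hold entries with an `image` key (the function name is hypothetical):

```python
import re

# Matches a trailing .jpg or .jpeg in any letter case.
_JPEG_SUFFIX = re.compile(r"\.jpe?g$", re.IGNORECASE)


def to_png_names(datalist):
    """Return a copy of the datalist with .jpg/.jpeg image names set to .png."""
    fixed = {}
    for section, entries in datalist.items():
        if isinstance(entries, list):
            fixed[section] = [
                {**e, "image": _JPEG_SUFFIX.sub(".png", e["image"])}
                if isinstance(e, dict) and "image" in e
                else e
                for e in entries
            ]
        else:
            fixed[section] = entries  # non-list values are copied unchanged
    return fixed
```

The new names only help if the converted files on disk carry the same names, so run this after the images have been converted.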
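Holding images back can be done by hand, or with a few lines of Python if you want the choice to be repeatable. This is a sketch only: the function name, `k`, and `seed` are hypothetical, and `entries` is assumed to be the list of image entries from the datalist.

```python
import random


def hold_out(entries, k=2, seed=0):
    """Split entries into (kept, held_out), holding out k randomly chosen items."""
    rng = random.Random(seed)  # fixed seed so the same items are picked each run
    picked = set(rng.sample(range(len(entries)), k))
    kept = [e for i, e in enumerate(entries) if i not in picked]
    held_out = [e for i, e in enumerate(entries) if i in picked]
    return kept, held_out
```

Write `kept` back to the datalist and keep `held_out` aside for the inference test later.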
Training the model
Training needs GPU resources, so stopping Clara Deploy first will help free them up.
Get inside the Clara Train container. *Correction* Use clara-train-sdk:v3.0 in place of v4. The model I am going to download works with V3 (the TensorFlow version of Clara). From V4, Clara moved to PyTorch.
Download the chest X-ray model (MMAR)
As you can see, the /clara/experiments folder is mapped inside the Clara Train container
Let's clone the MMAR to the host
Some cleanup
Run the following in a separate terminal to grant permissions. Close the terminal when done.
Change some settings in the training script, train_finetune.sh
Change 3 settings: 1) the JSON source, 2) the epoch count (to 500), and 3) the learning rate (to 0.00002)
Change training configuration
There are 6 things we are going to change.
After the changes:
1) The epoch count is 500.
2) The learning rate is 2e-5.
3) The “subtrahend” and “divisor” parameters of the CenterData transform are updated in both the “train” and “validate” sections.
4) & 5) The image_pipeline name is changed to ClassificationKerasImagePipeline, and “sampling” : “automatic” is added for training (not for validation).
6) computeAUC is aligned with the 6 categories.
Make changes #3 and #6 in the validation configuration file as well
Make the following changes to the environment configuration. The changes are 1) data_root and 2) dataset_json
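Because the CenterData change has to be made in several places, it is easy to miss one when editing by hand. The sketch below walks a parsed config and updates every CenterData transform it finds. It assumes each transform is a JSON object with a `name` and an `args` object, which you should verify against your own config files. The function name is hypothetical, and the subtrahend and divisor values are yours to supply.

```python
def set_center_data(node, subtrahend, divisor):
    """Update every CenterData transform inside a parsed config, in place.

    Returns the number of transforms that were updated.
    """
    updated = 0
    if isinstance(node, dict):
        # Assumed layout: {"name": "CenterData", "args": {...}}
        if node.get("name") == "CenterData":
            args = node.setdefault("args", {})
            args["subtrahend"] = subtrahend
            args["divisor"] = divisor
            updated += 1
        for value in node.values():
            updated += set_center_data(value, subtrahend, divisor)
    elif isinstance(node, list):
        for item in node:
            updated += set_center_data(item, subtrahend, divisor)
    return updated
```

Load the config with json.load, call the function, check that the returned count matches the number of sections you expected to change, and write the file back with json.dump.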
Let's begin fine-tuning the model!
Start by making the script executable
*correction*
The datalist.json file should be in this location: /workspace/data/covid-training-set/training-images/
Otherwise you will get this error!
A few other errors may arise; fix these by cleaning up or deleting the offending data.
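Many of these errors come down to the datalist pointing at files that are not there, and that can be checked before starting a long training run. A minimal pre-flight sketch, assuming image paths in the datalist are relative to the data folder and that entries carry an `image` key (the function name and default file name are hypothetical):

```python
import json
from pathlib import Path


def missing_images(data_root, datalist_name="datalist.json"):
    """Return the image names listed in the datalist that are not on disk."""
    root = Path(data_root)
    datalist_path = root / datalist_name
    if not datalist_path.is_file():
        raise FileNotFoundError(f"datalist not found: {datalist_path}")

    datalist = json.loads(datalist_path.read_text())
    missing = []
    for entries in datalist.values():
        if not isinstance(entries, list):
            continue  # skip non-list values such as metadata fields
        for entry in entries:
            if isinstance(entry, dict) and "image" in entry:
                if not (root / entry["image"]).is_file():
                    missing.append(entry["image"])
    return missing
```

Run it against the training-images folder from inside the container. An empty list means every referenced image was found.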
After this, training will start
Now the training is complete.
As you can see, the best metric was at epoch 570, with a validation metric of 0.83
Get back inside the Docker container if you exited it
View the TensorBoard output
Export the model
Exit the Docker container and check the model we trained
Models are listed
So we have fine-tuned a model on the data we prepared! Awesome.