|Ahead of print publication
Autodelineation of organ at risk in head and neck cancer radiotherapy using artificial intelligence
Ramesh S Bilimagga1, Pichandi Anchineyan1, Murli Shivashanmugam2, Seshashayi Thalluri2, P Sudheer Kumar Goud2
1 Department of Radiation Oncology, Healthcare Global Enterprises, Bangalore, Karnataka, India
2 Great Lakes, Bengaluru, Karnataka, India
|Date of Submission||07-Aug-2020|
|Date of Acceptance||25-Dec-2020|
|Date of Web Publication||05-Aug-2021|
Ramesh S Bilimagga,
Department of Radiation Oncology, Healthcare Global Enterprises, P. Kalingarao Road, Sampangiramnagar, Bengaluru - 560 027, Karnataka
Source of Support: None, Conflict of Interest: None
Aim: The aim of this study is to check the practical feasibility of artificial intelligence for day-to-day operations and how it generalizes when the data have considerable interobserver variability.
Background: Automated delineation of organs at risk (OARs) using a deep learning model is reasonably accurate. It can considerably reduce the time medical professionals spend manually contouring OARs and can reduce the interobserver variation among radiation oncologists. It also allows quicker radiation planning, which helps in adaptive radiotherapy planning.
Materials and Methods: Head and neck (HN) computed tomography (CT) scan data of 113 patients were used in this study. CT scan was done as per the institute protocol. Each patient had about 100–300 slices in Dicom format. A total number of 19,240 images were used as the data set. The OARs were delineated by the radiation oncologist in the contouring system. Of the 113 patient records, 13 records were kept aside as test dataset and the remaining 100 records were used for training the UNet 2D model. The study was performed on the spinal cord and left and right parotids as OARs on HN CT images. The model performance was quantified using the Dice similarity coefficient (DSC) score.
Results: The trained model is used to predict three OARs, spinal cord and left and right parotids. The DSC score of 84% and above could be achieved using the UNet 2D Convolutional Neural Network.
Conclusion: This study showed that the accuracy of predicted organs was within acceptable DSC scores, even when the underlying dataset has significant interobserver variability.
Keywords: Contouring, organ at risk, radiotherapy, UNet, UNet 2D
|How to cite this URL:|
Bilimagga RS, Anchineyan P, Shivashanmugam M, Thalluri S, Goud P S. Autodelineation of organ at risk in head and neck cancer radiotherapy using artificial intelligence. J Can Res Ther [Epub ahead of print] [cited 2022 Dec 4]. Available from: https://www.cancerjournal.net/preprintarticle.asp?id=323169
| > Introduction|| |
According to Globocan data, the worldwide incidence of head and neck (HN) cancer is 550,000. In India, the incidence estimated for 2020 is 1,75,791 as per NCRP India. The main causes of HN cancer are the use of tobacco and alcohol, HPV infection, poor nutrition, and poor oral hygiene. The incidence of this malignancy is expected to rise in the years to come. The organs that are commonly affected are the mouth, throat, larynx, and sinuses. The standard treatment of HN cancer consists of surgery, radiation, and chemo/immunotherapy. The management of HN cancer involves radiotherapy in 60%-70% of the patients. The dictum of radiation is to deliver the maximum dose to the tumor while reducing the dose to normal structures as much as possible. To accomplish this, the critical normal structures must be identified and contoured precisely so that the treatment planning system can be directed to minimize the radiation delivered to those areas. The radiation oncologist (RO) uses computed tomography (CT) scans to identify and contour the organs at risk (OARs) in each slice. This is a time-consuming and laborious manual process. Even though ROs follow international guidelines for OAR delineation, there is significant inter- and intraobserver variability in the contours. The accuracy of contouring also depends on the experience and workload of the RO. A few ROs use multiatlas technology, which relies on a prepared anatomic atlas of the human body and contours the organs by deformable registration. However, the number of available atlas sections is limited, whereas human anatomy varies widely. This gives rise to many errors and difficulties in the delineation of OARs; hence, it is not an effective approach.
The clinical workload associated with contouring has increased drastically in recent years, which has a bearing on the quality, consistency, and accuracy of contouring. For HN cancer segmentation, several deep learning approaches have been proposed. Some of them are standard classification networks applied to patches with tailored pre- and postprocessing. Others use “UNet”-based architectures. A few commercially available autocontouring stations have also emerged in recent times. The objective of this study is to assess the feasibility of autocontouring OARs using the clinical images and contours created by ROs.
| > Materials and Methods|| |
In this study, CT scan images of HN cancer patients who had undergone radiotherapy in our department were utilized. The patients were scanned using a GE CT scanner, and the CT images were acquired in the supine position with a custom thermoplastic mask for immobilization as per our hospital protocol. The scanned images were pushed into the Eclipse Treatment Planning System (TPS) v13.7 (Varian Medical Systems, Palo Alto, CA). The OARs delineated by ROs in these images were used as ground truth. From the Eclipse TPS, the CT Dicom images of 113 patients were extracted together with their contours. Of these 113 patients, the data of 13 patients were kept aside as the test data set and the data of the remaining 100 patients were used for training the model. Once the model is trained, the test data set is used to evaluate the model performance. The distribution of the number of slices per patient in the 100-patient training data set and the 13-patient test data set is shown in [Figure 1]. The images in the 100-patient data set are further split in an 80:20 ratio for training and validating the model.
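The split described above can be illustrated with a minimal sketch. It assumes that the split is made at the patient level (the data preprocessing section states that the 80-20 split is based on patient ID), and the patient IDs, the random seed, and the function name `split_patients` are hypothetical and not taken from the study.

```python
import random


def split_patients(patient_ids, n_test=13, val_fraction=0.2, seed=0):
    """Split patient IDs into train, validation, and test groups.

    Splitting by patient (not by slice) keeps every slice of a patient
    in a single group, so neighbouring slices never straddle two groups.
    """
    ids = sorted(patient_ids)      # fixed order before shuffling
    rng = random.Random(seed)      # hypothetical seed, for reproducibility only
    rng.shuffle(ids)

    test_ids = ids[:n_test]
    remaining = ids[n_test:]
    n_val = round(len(remaining) * val_fraction)
    val_ids = remaining[:n_val]
    train_ids = remaining[n_val:]
    return train_ids, val_ids, test_ids


# Hypothetical anonymized IDs standing in for the 113 patients.
patients = [f"patient_{i:03d}" for i in range(113)]
train_ids, val_ids, test_ids = split_patients(patients)
print(len(train_ids), len(val_ids), len(test_ids))  # 80 20 13
```

Because the groups are formed from patient IDs, the number of slices per group will differ from an exact 80:20 ratio when patients have different slice counts.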
|Figure 1: Number of slices per patient in (a) the training data set and (b) the test data set|
A UNet 2D model with five levels, as shown in [Figure 2], is used for this study. At the output layer, a 2D convolution layer with three filters is used, where each filter output corresponds to the mask of one organ.
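To illustrate how a three-filter output layer yields one mask per organ, the following NumPy sketch converts a raw output of shape (H, W, 3) into three binary masks. It is not the authors' implementation: the sigmoid activation, the 0.5 threshold, the channel order in `ORGANS`, and the function name `logits_to_masks` are all assumptions made for illustration.

```python
import numpy as np

# Assumed channel order; the source does not state which filter maps to which organ.
ORGANS = ("spinal_cord", "left_parotid", "right_parotid")


def logits_to_masks(logits, threshold=0.5):
    """Convert raw outputs of shape (H, W, 3) into one binary mask per organ."""
    probs = 1.0 / (1.0 + np.exp(-logits))   # per-channel sigmoid (assumed activation)
    binary = probs >= threshold             # assumed decision threshold
    return {name: binary[..., i] for i, name in enumerate(ORGANS)}


# Random numbers standing in for the network output on one 512 x 512 slice.
rng = np.random.default_rng(0)
fake_logits = rng.normal(size=(512, 512, 3))
masks = logits_to_masks(fake_logits)
print({name: mask.shape for name, mask in masks.items()})
```

Treating each channel independently means a pixel could in principle be assigned to more than one organ, which is consistent with training each output against its own binary mask.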
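The binary cross-entropy is used as the loss function, and the error is minimized using the Adam optimizer. The final model performance is evaluated using the Dice coefficient on the 13-patient test data set. Given two sets X and Y (here, the pixels of the ground-truth mask and of the predicted mask), the Dice coefficient is defined as

$$\mathrm{DSC}(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}$$

where |X| and |Y| denote the number of elements in each set.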
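The two quantities named in this paragraph can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the Dice coefficient is computed as twice the overlap divided by the sum of the two mask sizes, the handling of two empty masks is a convention chosen here, and the small clipping constant `eps` in the cross-entropy is an assumption to avoid taking the logarithm of zero.

```python
import numpy as np


def dice_coefficient(x, y):
    """Dice similarity coefficient 2|X n Y| / (|X| + |Y|) for two binary masks."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    intersection = np.logical_and(x, y).sum()
    total = x.sum() + y.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return float(2.0 * intersection / total)


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between a 0/1 mask and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))


# Toy masks: the ground truth covers 4 pixels, the prediction covers 6,
# and they overlap on 4 pixels, so DSC = 2 * 4 / (4 + 6) = 0.8.
truth = np.zeros((4, 4), dtype=np.uint8)
truth[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:4] = 1

dice = dice_coefficient(truth, pred)
loss = binary_cross_entropy(truth, pred)
print(round(dice, 3))
```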
| > Results|| |
Data extraction and analysis
From the database, a few patient records were selected randomly, and the contours of the spinal cord, left parotid, and right parotid were visually inspected for validity. Since the patients were selected randomly across different timelines and doctors, the contours have inter- and intraobserver variability. The selected patient record IDs were fed into a script that downloaded the patient records, anonymized the patient data as per hospital protocol, and exported the details into Dicom and RT structure files. The exported data set contains one folder per patient. Each folder contains one Dicom file per slice and one metadata file, called the RT structure file, containing information such as the patient ID, the order of slices, and the contours of each organ per slice. Each Dicom image has a dimension of 512 × 512 pixels. We used the dicom_contour utility to extract pixel and metadata information from the patient folder. The pixel values of each slice are stored as a numpy file, with the target file name corresponding to the slice ID, in a target folder corresponding to the patient ID. The contour details for each slice are extracted from the RT structure file and converted to a numpy array mask. The masks of the spinal cord, left parotid, and right parotid are stacked together to create a 3D numpy array and stored as a numpy file with the target file name corresponding to the slice ID. Mask files that contain at least one mask among the spinal cord, left parotid, and right parotid, together with the corresponding slice files, are extracted from the folder structure and used for training.
Data preprocessing and model training
The training data of 17,220 images, each containing at least one contour, are split in an 80–20 ratio based on patient ID into the “train” and “valid” data sets. On the “train” data, 5% Gaussian noise is added to reduce variance error. A UNet-2D model is trained for 200 epochs with an image size of 512 × 512 pixels and a batch size of four. “Adam” with a learning rate of 1e-3 is used as the optimizer, and binary_cross_entropy is used as the loss function. The model weights that gave the highest Dice coefficient on the “valid” data set are selected as the final model. The UNet-2D model is used to predict three OARs in the HN region, namely the spinal cord and the left and right parotids. A median Dice similarity coefficient score of 83% was achieved on the test data across the three OAR predictions. The results are shown in [Figure 3] and [Figure 4]. A comparison of our results with published literature is captured in [Table 1].
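The mask stacking and slice filtering described above can be sketched as follows. This sketch does not use the dicom_contour utility and assumes the per-organ masks have already been rasterized to 512 × 512 arrays; the channel-last layout, the slice IDs, and the helper names `stack_organ_masks` and `has_any_contour` are hypothetical.

```python
import numpy as np


def stack_organ_masks(cord, left_parotid, right_parotid):
    """Stack three (H, W) binary masks into one (H, W, 3) array (assumed layout)."""
    return np.stack([cord, left_parotid, right_parotid], axis=-1).astype(np.uint8)


def has_any_contour(stacked_mask):
    """Return True if at least one organ is contoured on this slice."""
    return bool(stacked_mask.any())


# Two hypothetical slices: one without contours, one with a small spinal cord mask.
empty = np.zeros((512, 512), dtype=np.uint8)
cord = empty.copy()
cord[250:260, 250:260] = 1

slices = {
    "slice_001": stack_organ_masks(empty, empty, empty),
    "slice_002": stack_organ_masks(cord, empty, empty),
}

# Keep only the slices that carry at least one contour.
kept = [slice_id for slice_id, mask in slices.items() if has_any_contour(mask)]
print(kept)  # ['slice_002']
```

In a pipeline like the one described, each kept array could then be written with `np.save` under a file name derived from the slice ID.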
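The noise augmentation mentioned in this paragraph could look like the sketch below. The source does not say what the 5% refers to, so this example assumes zero-mean Gaussian noise whose standard deviation is 5% of the standard deviation of the image intensities; that interpretation, the function name `add_gaussian_noise`, and the synthetic image are assumptions for illustration only.

```python
import numpy as np


def add_gaussian_noise(image, noise_fraction=0.05, rng=None):
    """Add zero-mean Gaussian noise with std = noise_fraction * image std.

    The meaning of "5%" is an assumption; the study may have scaled the
    noise differently (for example, relative to the intensity range).
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = noise_fraction * float(image.std())
    return image + rng.normal(0.0, sigma, size=image.shape)


# Synthetic intensities standing in for one CT slice (hypothetical values).
rng = np.random.default_rng(0)
image = rng.normal(40.0, 100.0, size=(128, 128))
noisy = add_gaussian_noise(image, noise_fraction=0.05, rng=rng)
print(noisy.shape)
```

Noise of this kind would be applied only to the training images, leaving the masks and the validation and test images unchanged.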
|Figure 3: Dice similarity coefficient score of all organs together (a) the histogram of combined Dice similarity coefficient scores of all organ at risk, (b) the box plot of the Dice coefficient. The top of the rectangle represents the third quartile, the bottom of the rectangle represents the first quartile, and the middle of the rectangle represents the median. The vertical line extended from the top and bottom of the rectangle represents the maximum and minimum values, respectively|
|Figure 4: Dice similarity coefficient score of the left parotid, right parotid, and spinal cord (a, c and e) histogram of Dice coefficient score for the left parotid, right parotid, and spinal cord, respectively; (b, d, and f) the box plot for three organs at risk|
A visual comparison of the ground truth and the predicted contours is shown in [Figure 5]. The green mask indicates the three OARs delineated by the RO, and the red mask indicates the delineation predicted by the UNet-2D model for the spinal cord and the left and right parotids.
|Figure 5: (a-e) Ground truth versus prediction in multiple slices of one representative patient |
| > Discussion|| |
The quality and technology of radiotherapy imaging and treatment have come a long way in the recent decade. Delineating OARs remains the cornerstone of radiotherapy treatment planning and a major factor in determining the patient's quality of life after the treatment. The process of delineation is subject to interobserver variability. There have been many efforts to reduce interobserver variability and provide a common guideline. In spite of its importance and these efforts, the issue still remains in day-to-day operations and is primarily attributed to human judgment. Variability is also observed for the same doctor (intraobserver variability) across different timelines and workload conditions. An automated system for delineating OARs will not only reduce the time but also help in reducing inter- and intraobserver variability, as the system will be deterministic all the time.
Many studies and systems have addressed the autocontouring of OARs. As the field of AI progressed, many studies were conducted to validate the effectiveness of AI-based approaches in contouring OARs. Methodologies evolved from pure atlas-based approaches, to a combination of atlas and AI approaches, to pure AI-based approaches. In spite of these advancements, most of the software still uses an atlas-based approach for autocontouring of OARs, primarily due to resource constraints and complexities in the workflow.
Many studies on AI-based OAR contouring used 3D deep learning models for segmentation. Training a 3D model is complex, especially for medical images. In one of the recent studies, a UNet-3D model was used for OAR contouring and was trained using 32 GPUs with the synchronous SGD optimizer. The training loop was executed for 120,000 steps with varying learning rates. We observed that when a radiotherapist contours the organs manually, it is done slice by slice, on one 2D image at a time. The ground truth was thus generated on 2D images, whereas a 3D AI model is trained to contour on the 3D representation of the image. Furthermore, a study conducted by Hänsch et al. concluded that a 2D UNet model is better suited for HN OAR autosegmentation than a 3D UNet model.
Hence, we wanted to explore the feasibility of a relatively simpler AI model based on UNet 2D for medical image segmentation on a data set that had significant inter- and intraobserver variability. In data science, it is often observed that a simpler solution that generalizes well serves the use case better than a complex solution.
| > Conclusion|| |
The 2D-UNet developed as part of this study can considerably reduce the time taken by the RO in manually contouring the OARs when treating patients with HN cancers by radiation. Furthermore, it can, to some extent, minimize the inter- and intraobserver variability of radiation oncologists in contouring OARs. In the process of preparing these modules, we learned that AI deep learning techniques can perform contouring of OARs at a fairly acceptable level. We also learned that there is great potential in the application of AI to various activities of this medical domain, which will significantly help cancer patient care.
We are grateful to Healthcare Global Enterprises for providing us access to head and neck CT scan images of patients who had undergone radiotherapy treatment.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| > References|| |
Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Cancer statistics. CA Cancer J Clin 2011;61:69-90.
Brouwer CL, Steenbakkers RJ, Bourhis J, Budach W, Grau C, Grégoire V, et al. CT-based delineation of organs at risk in the head and neck region: DAHANCA, EORTC, GORTEC, HKNPCSG, NCIC CTG, NCRI, NRG Oncology and TROG consensus guidelines. Radiother Oncol 2015;117:83-90.
Tong N, Gou S, Yang S, Ruan D, Sheng K. Fully automatic multi-organ segmentation for head and neck cancer radiotherapy using shape representation model constrained fully convolutional neural networks. Med Phys 2018;45:4558-67.
Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation 2015;arXiv: 1505.04597.
KeremTurgutlu. Dicom-Contour. GitHub Repository n.d.
Zhu W, Huang Y, Zeng L, Chen X, Liu Y, Qian Z, et al. AnatomyNet: Deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy. Med Phys 2019;46:576-89.
Nikolov S, Blackwell S, Zverovitch A, Mendes R, Livne M, De Fauw J, et al. Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. ArXiv E-Prints 2018:arXiv: 1809.04430.
Tam C, Tian S, Beitler JJ, Jiang X, Li S, Yang X. Automated delineation of organs-at-risk in head and neck CT images using multi-output support vector regression. In: Gimi B, Krol A, editors. Medical Imaging 2018: Biomedical Applications in Molecular, Structural, and Functional Imaging, 1057824 : SPIE Proceedings Vol.10578; 2018.
Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys 2017;44:547-57.
Hänsch A, Schwier M, Gass T, Morgas T, Haas B, Dicken V, et al. Evaluation of deep learning methods for parotid gland segmentation from CT images. J Med Imaging 2018;6:1-9.
Nazemi-Gelyan H, Hasanzadeh H, Makhdumi Y, Abdollahi S, Akbari F, Varshoee-Tabrizi F, et al. Evaluation of organs at risk's dose in external radiotherapy of brain tumors. Iran J Cancer Prev 2015;8:47-52.
Brouwer CL, Steenbakkers RJ, van den Heuvel E, Duppen JC, Navran A, Bijl HP, et al. 3D Variation in delineation of head and neck organs at risk. Radiat Oncol 2012;7:32.
Hoogeman MS, Han X, Teguh D, Voet P, Nowak P, Wolf T, et al. Atlas-based Auto-segmentation of CT images in head and neck cancer: What is the best approach? Int J Radiat Oncol Biol Phys 2008;72:S591.
Levendag PC, Hoogeman M, Teguh D, Wolf T, Hibbard L, Heijmen B, et al. Atlas based auto-segmentation of CT images: Clinical evaluation of using auto-contouring in high-dose, high-precision radiotherapy of cancer in the head and neck. Int J Radiat Oncol Biol Phys 2008;72:S401.
Chen A, Dawant B. A multi-atlas approach for the automatic segmentation of multiple structures in head and neck CT images. MIDAS J Head and Neck Auto Segmentation Challenge 2016. http://hdl.handle.net/10380/3540
Qazi AA, Pekar V, Kim J, Xie J, Breen SL, Jaffray DA. Auto-segmentation of normal and target structures in head and neck CT images: A feature-driven model-based approach. Med Phys 2011;38:6160-70.
Hänsch A, Schwier M, Morgas T, Gass T, Haas B, Klein J, et al. Comparison of different deep learning approaches for parotid gland segmentation from CT images. In: Mori K, Petrick N, editors. Medical Imaging 2018: Computer-Aided Diagnosis. SPIE 10575, 1057519 (27 February 2018); doi: 10.1117/12.2292962.
Fritscher K, Raudaschl P, Zaffino P, Spadea MF, Sharp GC, Schubert R. Deep Neural Networks for Fast Segmentation of 3D Medical Images. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016 , Part II, LNCS 9901, pp. 158–165, 2016: Springer International Publishing AG 2016.
Ren X, Xiang L, Nie D, Shao Y, Zhang H, Shen D, et al. Interleaved 3D-CNNs for joint segmentation of small-volume structures in head and neck CT images. Med Phys 2018;45:2063-75.