Fertility-sparing treatments decision in patients with endometrial cancer based on machine learning
1College of Electronics and Information Engineering, University of Sichuan, 610065 Chengdu, Sichuan, China
2Department of Neurology, Chengdu Third People’s Hospital, 610031 Chengdu, Sichuan, China
3Department of Gynecology and Obstetrics, Key Laboratory of Obstetric and Gynecologic and Pediatric Diseases and Birth Defects of Ministry of Education, West China Second Hospital, University of Sichuan, 610041 Chengdu, Sichuan, China
DOI: 10.22514/ejgo.2022.046 Vol. 43, Issue 5, October 2022, pp. 91-99
Submitted: 30 July 2022 Accepted: 06 September 2022
Published: 15 October 2022
Although fertility-sparing treatments for patients with endometrial cancer (EC) and endometrial atypical hyperplasia (EAH) have been widely studied in recent years, most of these studies were retrospective and based on small samples. It is therefore essential to assess the necessity of hysterectomy in EC patients more thoroughly using the experience accumulated so far. Machine learning (ML) has been gradually integrated into oncologic research, but it has seldom been applied to predict the efficacy of hysterectomy because few patients did not undergo hysterectomy, which leads to a class imbalance during learning. As a result, commonly used ML models cannot provide satisfactory performance. In this study, we aimed to develop ML models that predict whether hysterectomy is necessary for EC patients and help gynecologists determine whether fertility-preserving treatment is possible. A clinical dataset of 1534 women with EC was analyzed. The Borderline-SMOTE algorithm was employed to address the imbalanced learning problem. Then, the Adaptive Boosting (AdaBoost) algorithm, which is generally considered less susceptible to overfitting than many other ML algorithms, was used to build a high-performance ensemble classification model. The findings indicated that the method outperformed conventional ML models and provided a realistic strategy for making fertility-preserving treatment decisions. The proposed model provides a platform for physicians to precisely predict the efficacy of fertility-sparing therapy in EC patients, allows gynecologists to select the optimal treatment for a patient, and reduces resource waste and the risk of overtreatment.
Endometrial cancer; Machine learning; Borderline-SMOTE; AdaBoost
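As a rough illustration of the pipeline described in the abstract, the following Python sketch oversamples the minority class with a simplified Borderline-SMOTE (variant 1) and then trains a discrete AdaBoost ensemble of decision stumps on the balanced data. It is not the authors' implementation: the toy data, the function names (borderline_smote, adaboost, predict), the use of decision stumps as weak learners, and the parameters k=3 and rounds=10 are all assumptions made for illustration.

```python
import math
import random


def k_nearest(i, points, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def borderline_smote(X, y, minority=1, k=3, seed=0):
    """Oversample the minority class near the class boundary (variant 1)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = len(y) - 2 * len(min_idx)  # synthetic samples needed for balance
    # "Danger" points: at least half, but not all, of the k nearest
    # neighbours in the full dataset belong to the majority class.
    danger = []
    for i in min_idx:
        n_major = sum(y[j] != minority for j in k_nearest(i, X, k))
        if k / 2 <= n_major < k:
            danger.append(i)
    if n_new <= 0 or not danger or len(min_idx) < 2:
        return list(X), list(y)
    min_pts = [X[i] for i in min_idx]
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        i = rng.choice(danger)
        # Interpolate towards a randomly chosen minority-class neighbour.
        j = rng.choice(k_nearest(min_idx.index(i), min_pts, k))
        gap = rng.random()
        a, b = X[i], min_pts[j]
        X_out.append(tuple(ai + gap * (bi - ai) for ai, bi in zip(a, b)))
        y_out.append(minority)
    return X_out, y_out


def stump_predict(stump, x):
    """Decision stump: predict `sign` if feature f exceeds t, else -sign."""
    f, t, sign = stump
    return sign if x[f] > t else -sign


def fit_stump(X, y, w):
    """Threshold rule with the lowest weighted error (labels are -1/+1)."""
    best_err, best = float("inf"), None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                stump = (f, t, sign)
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best_err, best = err, stump
    return best_err, best


def adaboost(X, y, rounds=10):
    """Discrete AdaBoost over decision stumps; returns [(alpha, stump), ...]."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, stump = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    return 1 if score > 0 else -1


# Hypothetical toy data: 8 majority samples (-1) and 4 minority samples (+1).
X = [(float(v), 0.0) for v in range(8)]
X += [(6.5, 0.0), (7.5, 0.0), (8.5, 0.0), (9.5, 0.0)]
y = [-1] * 8 + [1] * 4
X_bal, y_bal = borderline_smote(X, y, minority=1, k=3)
model = adaboost(X_bal, y_bal, rounds=10)
accuracy = sum(predict(model, xi) == yi for xi, yi in zip(X_bal, y_bal)) / len(y_bal)
print(len(y), "->", len(y_bal), "samples; training accuracy:", accuracy)
```

In a real study the oversampling step would be applied to the training split only, so that synthetic samples do not leak into the data used for evaluation.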
Yue Sun, Zhi Li, Li Gao, Wenhan Yuan, Fan Yang. Fertility-sparing treatments decision in patients with endometrial cancer based on machine learning. European Journal of Gynaecological Oncology. 2022. 43(5);91-99.