Mapping the Advanced-Stage Epithelial Ovarian Cancer Landscape Goes Beyond Words: Two Large Language Models, Eight Tasks, One Journey.

Quaranta, M.; Laios, A.; Rogers, C.; Mavromatidou, AI.; Thangavelu, A.; Theophilou, G.; Nugent, D.; DeJong, D.; Kalampokis, E.

All Authors

Quaranta, M.

Laios, A.

Rogers, C.

Mavromatidou, AI.

Thangavelu, A.

Theophilou, G.

Nugent, D.

DeJong, D.

Kalampokis, E.

LTHT Author

Quaranta, Michela
Laios, Alexandros
Rogers, Charlie
Thangavelu, Amudha
Theophilou, Georgios
Nugent, David
DeJong, Diederick

LTHT Department

Oncology
Leeds Cancer Centre
Gynaecological Oncology

Publication Date

2025

Item Type

Journal Article

Abstract

Background/Objectives: The advancement of natural language processing (NLP) technologies has transformed various sectors. However, their application in the healthcare domain, particularly for analysing clinical notes, remains underdeveloped. We investigated the use of deep neural networks, specifically transformer-based models, to predict intraoperative and post-operative outcomes of cytoreductive surgery for advanced-stage epithelial ovarian cancer (aEOC) using unstructured surgical notes.

Methods: We evaluated the performance of RoBERTa, a general-purpose language model, and GatorTron, a domain-specific model, across eight binary classification tasks using the same dataset. The dataset consisted of 560 surgical records from patients with aEOC who underwent cytoreductive surgery at a tertiary UK reference centre. Each outcome to be predicted was converted into a binary label so that it could be treated as a classification task. To enhance the contextual information available to the models, textual data from "operative findings" and "operative notes" were concatenated.

Results: Our findings highlight the tangible benefits of employing domain-specific language models for clinical text analysis. GatorTron outperformed RoBERTa on most predictive tasks, underscoring the advantages of domain-specific pretraining for understanding medical terminology and context. Both models struggled to predict certain outcomes, particularly post-operative events such as major complications and length of hospital stay, despite adjustments to hyperparameters and training strategies. This limitation suggests that operative text alone may not sufficiently capture the complexities of post-operative recovery.

Conclusions: These findings have valuable implications for developing medical AI systems to improve the delivery of modern aEOC healthcare.
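
The preprocessing outlined in the Methods (concatenating the two free-text sections and turning each outcome into a binary label) might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the field names, the space separator, the length-of-stay threshold and the example record are all invented here, and none of them is taken from the study.

```python
from dataclasses import dataclass


@dataclass
class SurgicalRecord:
    # Hypothetical field names; the abstract names only the two text sections.
    operative_findings: str
    operative_notes: str
    length_of_stay_days: float  # illustrative outcome, not the study's schema


def build_text(record: SurgicalRecord, sep: str = " ") -> str:
    """Concatenate the two free-text sections into a single model input."""
    parts = [record.operative_findings.strip(), record.operative_notes.strip()]
    # Skip empty sections so the result has no leading or trailing separator.
    return sep.join(p for p in parts if p)


def binarise(value: float, threshold: float) -> int:
    """Map a continuous outcome to a 0/1 label (1 if value exceeds threshold)."""
    return int(value > threshold)


# Synthetic record, written purely for illustration.
record = SurgicalRecord(
    operative_findings="Omental cake, diaphragmatic disease.",
    operative_notes="Total omentectomy and diaphragm stripping performed.",
    length_of_stay_days=9,
)

text = build_text(record)
# The 7-day cut-off is an assumed value, not one reported in the abstract.
label = binarise(record.length_of_stay_days, threshold=7)
```

The resulting `text` and `label` pairs would then be tokenised and passed to whichever sequence-classification model is being fine-tuned, once per task.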