ESGO/ISUOG/IOTA/ESGE Consensus Statement on preoperative diagnosis of ovarian tumours

The European Society of Gynaecological Oncology (ESGO), the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG), the International Ovarian Tumour Analysis (IOTA) group and the European Society for Gynaecological Endoscopy (ESGE) jointly developed clinically relevant and evidence-based statements on the preoperative diagnosis of ovarian tumours, including imaging techniques, biomarkers and prediction models. ESGO/ISUOG/IOTA/ESGE nominated a multidisciplinary international group, including expert practising clinicians and researchers who have demonstrated leadership and expertise in the preoperative diagnosis of ovarian tumours and management of patients with ovarian cancer (19 experts across Europe). A patient representative was also included in the group. To ensure that the statements were evidence-based, the current literature was reviewed and critically appraised. Preliminary statements were drafted based on the review of the relevant literature. During a conference call, the whole group discussed each preliminary statement and a first round of voting was carried out. Statements were removed when a consensus among group members was not obtained. The voters had the opportunity to provide comments/suggestions with their votes. The statements were then revised accordingly. Another round of voting was carried out according to the same rules to allow the whole group to evaluate the revised version of the statements. The group achieved consensus on 18 statements. This Consensus Statement presents these ESGO/ISUOG/IOTA/ESGE statements on the preoperative diagnosis of ovarian tumours and the assessment of carcinomatosis, together with a summary of the evidence supporting each statement.


Introduction
Facts Views Vis Obgyn, 2021, 13 (2): 107-130 (ESGE Pages)
The accurate characterization of newly diagnosed adnexal lesions is of paramount importance to define appropriate treatment pathways. Patients with masses that are suspicious for malignancy should be referred to a gynaecological oncology centre, in order to receive specialist care, as per the definitions of the European Society of Gynaecological Oncology (ESGO) (Querleu et al., 2017) and national and international recommendations and guidelines. For a non-gynaecological primary tumour, patients need to be referred to an appropriate specialist, while patients with benign lesions may be followed up and treated conservatively or may be suitable for less radical surgical treatment, depending on the clinical context (du Bois et al., 2009; Elit et al., 2008; Engelen et al., 2006; Froyman et al., 2019; Vernooij et al., 2007; Woo et al., 2012). Treatment decision-making processes should be based on a combination of the patient's overall clinical picture, symptoms, preferences, previous medical and surgical history, tumour markers and clinical and radiological findings. A single diagnostic modality alone should not determine the patient's journey.
The ESGO, the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG), the International Ovarian Tumour Analysis (IOTA) group and the European Society for Gynaecological Endoscopy (ESGE) have jointly developed clinically relevant and evidence-based statements on the preoperative diagnosis of ovarian tumours and assessment of disease spread, including imaging techniques, biomarkers and predictive models. Neither screening and follow-up modalities, nor economic analysis of the imaging techniques, biomarkers and prediction models addressed herein, are included within the remit of this Consensus Statement.

Responsibilities
The present series of statements form a consensus of the authors regarding their currently accepted approaches for the preoperative diagnosis of ovarian tumours and assessment of disease spread, based on the available literature and evidence. Any clinician applying or consulting these statements is expected to use independent medical judgment in the context of individual clinical circumstances to determine all patients' care and treatment. These statements are presented without any warranty regarding their content, use or application and the authors disclaim any responsibility for their application or use in any way.

Methods
This Consensus Statement on the preoperative diagnosis of ovarian tumours and assessment of disease spread was developed using an eight-step process, chaired by Professors Christina Fotopoulou and Dirk Timmerman (Figure 1). Aiming to assemble a multidisciplinary international group, ESGO/ISUOG/IOTA/ESGE nominated 19 practising clinicians and researchers who have demonstrated leadership and expertise in the preoperative diagnosis of ovarian tumours and clinical management of ovarian cancer patients through research, administrative responsibilities, and/or committee membership (including eight members of ESGO, five members of ISUOG, four members of IOTA and two members of ESGE).
These experts included seven gynaecologists with special interest in ultrasonography, two radiologists and 10 gynaecological oncologists. They did not represent the societies from which they were selected, and were asked to base their decisions on their own experience and expertise. Also included in the group was a patient representative, who is Chair of the Clinical Trial Project of the European Network of Gynaecological Cancer Advocacy Groups, ENGAGe. An initial conference call, including the whole group, was held to facilitate introductions, as well as to review the purpose and scope of this Consensus Statement.
To ensure that the statements were evidence-based, the current literature was reviewed and critically appraised. Thus, a systematic literature review of relevant studies published between 1 May 2015 and 1 May 2020 was carried out using the MEDLINE database (Appendix 1). The literature search was limited to publications in the English language. Priority was given to high-quality systematic reviews, meta-analyses and validating cohort studies, although studies with lower levels of evidence were also evaluated. The search strategy excluded editorials, letters and case reports. The reference list of each identified article was reviewed for other potentially relevant articles. Final results of the literature search were distributed to the whole group, including electronic full-text versions of each article. F. Planchamp provided the methodology and medical writing support for the entire process, and did not participate in voting for statements.
The chairs were responsible for drafting preliminary statements based on the review of the relevant literature. These were then sent to the multidisciplinary international group prior to a second conference call. During this conference call, the whole group discussed each preliminary statement and a first round of binary voting (agree/disagree) was carried out for each potential statement. All 20 participants took part in each vote, but they were permitted to abstain from voting if they felt they had insufficient expertise to agree/disagree with the statement or if they had a conflict of interest that could be considered to influence their vote. Statements were removed when a consensus among group members was not obtained. The voters had the opportunity to provide comments/suggestions with their votes. The chairs then discussed the results of this first round of voting and revised the statements if necessary. The voting results and the revised version of the statements were again sent to the whole group and another round of binary voting was organized, according to the same rules, to allow the whole group to evaluate the revised version of the statements. The statements were finalized based on the results of this second round of voting. The group achieved consensus on 18 statements. In this Consensus Statement, we present a summary of the supporting evidence, the finalised series of statements, and their levels of evidence and grades.
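The voting results in this Consensus Statement are reported as a percentage together with a count (n). As a minimal sketch of that arithmetic, assuming that every percentage is taken over all 20 participants with abstentions counted as their own option, the tally could be computed as follows. The function name `tally` and the vote labels are hypothetical and chosen only for illustration; the example counts reproduce one of the tallies reported in the statements on tumour markers.

```python
from collections import Counter

# From the Methods: all 20 participants took part in each vote.
PARTICIPANTS = 20


def tally(votes):
    """Return {option: (count, percent)} for the votes cast on one statement.

    Percentages are taken over all participants, so abstentions
    are reported as their own option (an assumption of this sketch).
    """
    if len(votes) != PARTICIPANTS:
        raise ValueError(f"expected {PARTICIPANTS} votes, got {len(votes)}")
    counts = Counter(votes)
    return {option: (n, 100 * n / PARTICIPANTS) for option, n in counts.items()}


# Example: 12 agree, 2 disagree, 6 abstain.
votes = ["yes"] * 12 + ["no"] * 2 + ["abstain"] * 6
result = tally(votes)
for option, (n, pct) in result.items():
    print(f"{option}, {pct:.0f}% (n = {n})")
```

The sketch deliberately does not decide whether consensus was reached, because the text does not state the agreement threshold that was used.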

General remarks
Even though the test performance of any biochemical or radiological diagnostic test appears to increase after excluding borderline ovarian tumours and non-gynaecological primary tumours, such as of the gastrointestinal tract or breast, we included in our literature assessment studies addressing all types of adnexal tumour, as this is a better reflection of clinical reality.

Ultrasonography
A transvaginal ultrasound examination is often regarded in clinical practice as the standard first-line imaging investigation for the assessment of adnexal pathology (Meys et al., 2016; Timmerman, 2004; Valentin et al., 2001). The diagnostic accuracy of ultrasonography in differentiating between benign and malignant adnexal masses has been shown to relate to the expertise of the operator (Timmerman et al., 1999; Valentin, 1999; Yazbek et al., 2008). The European Federation of Societies for Ultrasound in Medicine and Biology has published minimum training requirements for gynaecological ultrasound practice in Europe, including standards for theoretical knowledge and practical skills (Education and Practical Standards Committee, European Federation of Societies for Ultrasound in Medicine and Biology, 2006). These identify three levels of training and expertise. Thus, Level-III (expert) can be attributed to a practitioner who is likely to spend the majority of their time undertaking gynaecological ultrasound and/or teaching, research and development in the field. A Level-II practitioner should have undertaken at least 2000 gynaecological ultrasound examinations. The training required to attain this level of practice would usually be gained during a period of expert ultrasound training, which may be within, or after completion of, a specialist training program. To maintain competence at Level-II, practitioners should perform at least 500 examinations each year. A Level-I practitioner should have performed a minimum of 300 examinations under the supervision of a Level-II practitioner or an experienced Level-I practitioner with at least 2 years' regular practical experience. To maintain Level-I status, the practitioner should perform at least 300 examinations each year.
A prospective randomized controlled trial to assess the effect of the quality of gynaecological ultrasonography on the management of patients with suspected ovarian cancer has demonstrated that women with a Level-III (expert) ultrasound examination undergo significantly fewer unnecessary major procedures and have a shorter inpatient hospital stay compared with those having a Level-II (routine) examination by a sonographer (Yazbek et al., 2008).
Subjective assessment by expert ultrasound examiners has excellent performance to distinguish between benign and malignant ovarian tumours (Meys et al., 2016;Timmerman, 2004;Timmerman et al., 1999;Valentin, 1999;Valentin et al., 2001;Yazbek et al., 2008). In many cases, expert examiners should be able to narrow the diagnosis down further, to a specific histological subtype. The typical pathognomonic ultrasound features of some key histological types have been published in the series, 'Imaging in gynecological disease', in Ultrasound in Obstetrics and Gynecology. The most common and typical findings for each pathology are summarized in Table I.

Risk of malignancy index (RMI) and risk of ovarian malignancy algorithm (ROMA)
Several attempts have been made to develop more objective ultrasound-based approaches for discriminating between benign and malignant adnexal tumours. These include the risk of malignancy index (RMI), a scoring system based on menopausal status, a transvaginal ultrasound score and serum cancer antigen 125 (CA 125) level (Jacobs et al., 1990). Many studies have demonstrated the diagnostic performance of the [...] Westwood et al. (2018) pooled data comparing the ROMA with the RMI-I to guide referral decisions for women with suspected ovarian cancer and found similar performance if women with borderline tumours and non-epithelial cancers were excluded from the analyses. More recently, another meta-analysis showed a higher specificity of the RMI-I than the ROMA in premenopausal women but a similar performance for detecting ovarian cancer in postmenopausal women presenting with an adnexal mass (Chacon et al., 2019). Limitations of the RMI are the absence of an estimated risk of malignancy, and its considerable dependence on serum CA 125, the latter resulting in a relatively low sensitivity for early-stage invasive and borderline disease, especially in premenopausal women (Kaijser et al., 2013; Timmerman et al., 2007) (see Tumour Markers).
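The text above names the three inputs of the RMI but does not restate how they are combined. As an illustration only, the sketch below uses the commonly cited form of the original index (RMI-I), in which the score is the product of an ultrasound score U (0, 1 or 3, depending on the number of suspicious ultrasound features), a menopausal score M (1 or 3) and the serum CA 125 level in U/mL. The function and argument names, and the frequently quoted cut-off of 200, are assumptions of this sketch and are not taken from this Consensus Statement.

```python
def rmi_i(n_ultrasound_features: int, postmenopausal: bool,
          ca125_u_per_ml: float) -> float:
    """Risk of malignancy index (RMI-I) as commonly cited: U x M x CA 125."""
    if n_ultrasound_features < 0 or ca125_u_per_ml < 0:
        raise ValueError("inputs must be non-negative")
    # Ultrasound score U: 0 for no features, 1 for one, 3 for two or more.
    if n_ultrasound_features == 0:
        u = 0
    elif n_ultrasound_features == 1:
        u = 1
    else:
        u = 3
    # Menopausal score M: 1 if premenopausal, 3 if postmenopausal.
    m = 3 if postmenopausal else 1
    return u * m * ca125_u_per_ml


ASSUMED_CUTOFF = 200  # frequently quoted threshold; an assumption here

# Same ultrasound findings, different menopausal status and CA 125 level.
premenopausal_score = rmi_i(2, False, 30.0)   # 3 * 1 * 30
postmenopausal_score = rmi_i(2, True, 35.0)   # 3 * 3 * 35
print(premenopausal_score, premenopausal_score > ASSUMED_CUTOFF)
print(postmenopausal_score, postmenopausal_score > ASSUMED_CUTOFF)
```

Because the score is a plain product, a low CA 125 value keeps the index low whatever the ultrasound findings, which is consistent with the limitation noted above for early-stage and borderline disease. The output is a score rather than an estimated risk of malignancy.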

IOTA methods
To homogenize and standardize the quality, description and evaluation of ultrasonography across different centres, and thereby increase diagnostic accuracy, the IOTA group first published a consensus paper on terms and definitions to describe adnexal lesions in 2000 (Timmerman et al., 2016). Using this standardized methodology, the IOTA group has developed different prediction models based on logistic regression analysis (Timmerman et al., 2005; Timmerman et al., 2016; Van Calster et al., 2014). In a large-scale external validation study, Van Holsbeke et al. (2012) showed that the IOTA logistic regression models 1 (LR1, with 12 variables) and 2 (LR2, with six variables) outperformed 12 other models, including the RMI. The LR2 model was easier to use than the LR1 model. Demonstrating the standardization and reproducibility of the IOTA models, Sayasneh et al. (2013) showed that even less-experienced sonographers are able to differentiate accurately between benign and malignant ovarian masses using the IOTA LR1 model. The IOTA group also developed 'Simple Rules' that may be applied to a mass based on the presence or absence of five benign and five malignant ultrasound features. These rules can be applied to about 80% of adnexal masses, with the rest being classed as inconclusive. They have now been broadly accepted and are widely used in clinical practice (Alcazar et al., 2013; Hartman et al., 2012; Knafel et al., 2016; Nunes et al., 2014; Ruiz de Gauna et al., 2015; Sayasneh et al., 2013; Tantipalakorn et al., 2014; Timmerman et al., 2010; Timmerman et al., 2008). More recently, a logistic regression model based on the ultrasound features of the original Simple Rules was developed, i.e. the Simple Rules risk model. This model is able to provide an individual estimated risk of malignancy for any type of lesion (Timmerman et al., 2016). A summary of the main models and scoring systems for the preoperative diagnosis of ovarian tumours is presented in Table II.
As many ovarian masses can be recognized relatively easily, the IOTA group also proposed four 'Simple Descriptors' of the features typical of common benign lesions and two suggestive of malignancy, which can give an 'instant diagnosis' and reflect the pattern recognition that is a key part of ultrasonography. These are applicable to about 43% of adnexal masses (Ameye et al., 2012). A three-step strategy, consisting of the sequential use of Simple Descriptors, Simple Rules and subjective assessment by an expert, had high accuracy for discriminating between benign and malignant adnexal lesions (Ameye et al., 2012). A systematic review and meta-analysis reported better performance of the IOTA Simple Rules and the IOTA LR2 model compared with all other scoring systems, including the RMI. Besides confirming these findings, another meta-analysis highlighted that a two-step approach, with the IOTA Simple Rules as the first step and subjective assessment by an expert for inconclusive tumours as the second step, matched the test performance of expert ultrasound examiners (Meys et al., 2016). The IOTA Simple Rules have been integrated into several national clinical guidelines for the evaluation and management of adnexal masses (American College of Obstetricians and Gynecologists' Committee on Practice Bulletins, 2016) and they were considered the main diagnostic strategy (Glanc et al., 2017) as part of a first international consensus report for the assessment of adnexal masses.
A randomized controlled trial assessing surgical intervention rates and the oncologic safety of decision-making processes using an RMI-based protocol developed by the British Royal College of Obstetricians and Gynaecologists (RCOG) vs triage using the IOTA Simple Rules (Nunes et al., 2017) showed that the IOTA protocol resulted in lower surgical intervention rates compared with the RMI-based RCOG protocol. The IOTA Simple Rules did not result in more cases in which a diagnosis of cancer was delayed. It was found that the addition of biomarkers such as serum CA 125 and HE4 when using the IOTA Simple Rules, with or without subjective assessment by an expert sonographer, offered no additional diagnostic advantage for the characterization of ovarian masses, but was more costly than a three-step strategy based on the sequential use of the IOTA Simple Descriptors, Simple Rules and expert evaluation (Alcazar et al., 2016; Piovano et al., 2017).
The IOTA group have also developed the Assessment of Different NEoplasias in the adneXa (ADNEX) model. This multiclass prediction model is the first risk model to differentiate between benign and malignant tumours, whilst also offering subclassification of any malignancy into borderline tumours, Stage-I and Stage-II-IV primary cancers and secondary metastatic tumours. The IOTA ADNEX model was developed and validated using parameters collected by experienced ultrasound examiners (Van Calster et al., 2014). Several external validation studies have shown good to excellent performance of the ADNEX model in discriminating different types of ovarian tumour, with a higher clinical value than the RMI (Araujo et al., 2017; Meys et al., 2017; Sayasneh et al., 2016; Szubert et al., 2016; Van Calster, 2017; Van Calster et al., 2016; Wynants et al., 2017). A study aiming to validate the ADNEX model when applied by Level-II examiners has confirmed that it can be used successfully by less-experienced examiners (Viora et al., 2020). A large multicentre cohort study of 4905 masses in 17 centres, comparing six different prediction models (RMI, LR2, Simple Rules, Simple Rules risk model and ADNEX model with or without CA 125), demonstrated the IOTA ADNEX model and the IOTA Simple Rules risk model to be the best models for the characterization of ovarian masses in patients who present with an adnexal lesion (Van Calster et al., 2020).

GI-RADS
The Gynaecologic Imaging Reporting and Data System (GI-RADS) was first introduced by Amor et al. (Amor et al., 2011; Amor et al., 2009). This reporting system quantifies the risk of malignancy into five categories: GI-RADS 1, definitively benign (estimated probability of malignancy (EPM) = 0%); GI-RADS 2, very probably benign (EPM < 1%); GI-RADS 3, probably benign (EPM = 1-4%); GI-RADS 4, probably malignant (EPM = 5-20%); and GI-RADS 5, very probably malignant (EPM > 20%).
More recently, several studies have demonstrated the value of the GI-RADS system for the assessment of malignant adnexal masses in women who are candidates for surgical intervention. Furthermore, the addition of GI-RADS to CA 125 improves the identification of adnexal masses at high risk of malignancy compared with using CA 125 alone (Basha et al., 2019; Behnamfar et al., 2019; Koneczny et al., 2017; Migda et al., 2018; Zhang et al., 2017; Zheng et al., 2019).
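The five GI-RADS categories and their estimated probability of malignancy (EPM) bands can be written as a small lookup. The sketch below is illustrative only: in practice a GI-RADS category is assigned from the ultrasound findings, and the EPM is the probability associated with that category, so this function is merely a reverse lookup of the published bands. The function name `gi_rads_category` is hypothetical, and the handling of values that fall between the stated bands (for example between 4% and 5%) is an assumption.

```python
def gi_rads_category(epm_percent: float) -> int:
    """Map an estimated probability of malignancy (in %) to a GI-RADS band.

    Bands as listed in the text: 1 (0%), 2 (<1%), 3 (1-4%),
    4 (5-20%), 5 (>20%). Values between 4% and 5% are placed in
    category 3, which is an assumption of this sketch.
    """
    if not 0 <= epm_percent <= 100:
        raise ValueError("EPM must be between 0 and 100 percent")
    if epm_percent == 0:
        return 1  # definitively benign
    if epm_percent < 1:
        return 2  # very probably benign
    if epm_percent < 5:
        return 3  # probably benign
    if epm_percent <= 20:
        return 4  # probably malignant
    return 5      # very probably malignant


for epm in (0, 0.5, 3, 12, 45):
    print(f"EPM {epm}% -> GI-RADS {gi_rads_category(epm)}")
```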

O-RADS
The Ovarian-Adnexal Reporting and Data System (O-RADS) lexicon for ultrasound was published in 2018, providing a standardized glossary that includes all appropriate descriptors and definitions of the characteristic ultrasound appearance of normal ovaries and various adnexal lesions (Andreotti et al., 2018). The O-RADS ultrasound working group developed an adnexal-mass triage system based either on the O-RADS descriptors or on the risk of malignancy assigned to the mass using the IOTA ADNEX model to classify ovarian tumours into different risk categories (Andreotti et al., 2020). However, to date, neither the triage system nor the O-RADS descriptors have been externally validated. Basha et al. (2020) determined the malignancy rates, validity and reliability of the O-RADS approach when applied to a database of 647 adnexal masses collected before the development of the O-RADS system. In this retrospective study, the O-RADS system had significantly higher sensitivity than did the GI-RADS system and the IOTA Simple Rules, with a non-significant slightly lower specificity compared with both GI-RADS and IOTA Simple Rules, and with similar reliability.

Statements on ultrasonography (Statements 1-6)
1. Subjective assessment by expert (Level-III) ultrasound examiners has the best performance to distinguish between benign and malignant ovarian tumours.
[...]
-Level of evidence: 5
-Grade of statement: D
-Consensus: 5% threshold, 10% (n = 2); 10% threshold, 75% (n = 15); 15% threshold, 0% (n = 0); 20% threshold, 0% (n = 0); abstain, 15% (n = 3)
Levels of evidence and grades are described in Table III.

Tumour markers
According to a systematic quantitative review assessing the accuracy of CA 125 level in the diagnosis of benign, borderline and malignant ovarian tumours, CA 125 is the best available single-protein biomarker identified to date (Medeiros et al., 2009). Although it lacks sensitivity and specificity for early stages of the disease and has a relatively low specificity overall, it can help direct treatment options in patients with suspicious ovarian masses. Pooled analyses have highlighted that a high body mass index and ethnicity might influence CA 125 levels, representing an additional diagnostic challenge (Babic et al., 2017). Other factors that influence CA 125 levels are the age of the patient, pregnancy, inflammatory processes and the presence of fibroids or endometriosis (Babic et al., 2017; Cramer et al., 2010; Johnson et al., 2008; Pauler et al., 2001).

Statements on tumour markers (Statements 7-12)
7. CA 125 is the best single-protein biomarker for the preoperative characterization of ovarian tumours. However, it is not useful as a screening test for ovarian cancer.
-Level of evidence: 2b
-Grade of statement: B
-Consensus: yes, 60% (n = 12); no, 10% (n = 2); abstain, 30% (n = 6)
10. CA 125 is helpful as a biomarker in cases of suspected malignancy and it helps to distinguish between subtypes of malignant tumours, such as borderline and early- and advanced-stage primary ovarian cancers and secondary metastatic tumours.
-Level of evidence: 3b
-Grade of statement: C
-Consensus: yes, 75% (n = 15); no, 5% (n = 1); abstain, 20% (n = 4)
Levels of evidence and grades are described in Table III.
Note: A minus sign '-' may be added to denote evidence that fails to provide a conclusive answer because it is either (a) a single result with a wide confidence interval; or (b) a systematic review with considerable heterogeneity. Such evidence is inconclusive, and therefore can only generate Grade D recommendations.
*'Absolute SpPin' is a diagnostic finding whose specificity is so high that a positive result rules in the diagnosis; 'Absolute SnNout' is a diagnostic finding whose sensitivity is so high that a negative result rules out the diagnosis.

Magnetic resonance imaging
Several reports have found that magnetic resonance imaging (MRI), alone or in combination with computed tomography (CT), predicts accurately the presence of peritoneal carcinomatosis in patients undergoing preoperative evaluation for cytoreductive surgery, particularly when the assessment is carried out by an experienced radiologist (Dohan et al., 2017;Gadelhak et al., 2019;Low et al., 2015;Torkzad et al., 2015).
Recently, a prospective study reported higher specificity of the IOTA LR2 model compared with subjective interpretation of MRI findings by an experienced radiologist, as well as similar sensitivities for both imaging modalities for discriminating between benign and malignant tumours (Shimada et al., 2018). The addition of diffusion-weighted techniques to conventional imaging modalities has been shown in multiple pooled studies to increase diagnostic accuracy in discriminating between benign tumours and ovarian cancer, especially in the Caucasian population, with data even suggesting a value in predicting resectability (Dai et al., 2019; Espada et al., 2013; Meng et al., 2016; Michielsen et al., 2017; Rizzo et al., 2020). However, the true extent of such a benefit needs to be validated further in multicentre, large-scale prospective randomized studies, which are currently being designed or underway (Michielsen et al., 2017). The addition of quantitative dynamic contrast-enhanced MRI to diffusion-weighted imaging and anatomical MRI sequences and the development of a 5-point scoring system (O-RADS MRI score) is another modern diagnostic development with promising potential for the differentiation between benign and malignant adnexal masses in cases in which ultrasound is unable to arrive at a clear diagnosis (i.e. indeterminate masses). When this technique is enhanced with volume quantification, it can help to discriminate between Type-I and Type-II epithelial ovarian cancers (Carter et al., 2013; Gity et al., 2019; He et al., 2020; Li et al., 2018; Malek et al., 2019; Thomassin-Naggara et al., 2012; Thomassin-Naggara et al., 2020). However, there are only limited data available on the impact of these modern MRI techniques on clinical decision-making and further studies are needed, with larger sample populations (Dirrichs et al., 2020).

Computed tomography
Dedicated multidetector CT protocols with standardized peritoneal carcinomatosis index forms are the most common diagnostic tool used in routine clinical practice to assess the extent of tumour dissemination and the presence of peritoneal carcinomatosis (Ahmed et al., 2019; Byrom et al., 2002; Esquivel et al., 2010; Marin et al., 2010; Nasser et al., 2016). A radiological peritoneal carcinomatosis index applied at preoperative CT within an expert setting has been shown to have low performance scores as a triage test to identify patients who are likely to have complete cytoreduction to no macroscopic residual disease (Avesani et al., 2020). On retrospective analysis, preoperative CT imaging showed high specificity but rather low sensitivity in detecting tumour involvement at key sites in ovarian cancer surgery (Nasser et al., 2016). Multiple studies that have attempted to cross-validate the accuracy of CT scans in predicting unresectable disease and incomplete cytoreduction have shown a substantial drop in accuracy rates when attempts have been made to validate them in other cohorts (Axtell et al., 2007; Bristow et al., 2000; Dowdy et al., 2004; Gemer et al., 2009; Kim et al., 2014; Nelson et al., 1993; Rutten et al., 2015; Shim et al., 2015). Thus, CT should not be used as the sole tool to predict the resectability of peritoneal carcinomatosis and exclude patients from surgery; rather, the full clinical context should be taken into account. Its widespread availability makes CT useful as a first-line diagnostic tool to identify patients who should not be selected for cytoreductive surgery, such as those with large/multifocal intraparenchymatous distant metastases, acute thromboembolic events or secondary metastatic tumours that limit the prognosis.
The role of radiomics as an additional quantitative mathematical segmentation of conventional preoperative CT images has shown some promising results in preliminary studies; however, larger studies are necessary for validation before this technique is implemented in clinical practice (Lu et al., 2019).

Positron emission tomography-computed tomography
Positron emission tomography-computed tomography (PET-CT) may be useful in differentiating malignant from borderline or benign ovarian tumours, with the limitation that its diagnostic performance can be impacted negatively by certain tumour histological subtypes, due to the lower fluorodeoxyglucose uptake in clear-cell and mucinous invasive subtypes (Castellucci et al., 2007; Kitajima et al., 2011; Nam et al., 2010; Risum et al., 2007; Tanizaki et al., 2014; Yamamoto et al., 2008). PET-CT can also play a role as an additional technique in the diagnosis of lymph-node metastases, especially outside the abdominal cavity, or in characterizing unclear lesions in key areas that would alter clinical management, for example chest lesions (Dauwen et al., 2013; Kim & Lee, 2018; Laghi et al., 2017). However, PET-CT does not seem to be a relevant additional diagnostic modality for the true extent of peritoneal spread of ovarian cancer, specifically bowel and mesenteric serosa, and therefore fails to predict resectability in those key sites, especially in the presence of low-volume disease (Michielsen et al., 2014). Furthermore, PET-CT has been shown to have a low diagnostic value in differentiating borderline from benign tumours and should therefore not be used in clinical decision-making processes in that context, especially when considering fertility-sparing procedures (Kitajima et al., 2011; Tanizaki et al., 2014; Yamamoto et al., 2008).

Statements on MRI, CT and PET-CT (Statements 13-17)
13. MRI with the inclusion of the functional sequences, dynamic contrast-enhanced and diffusion-weighted MRI, is not a first-line tool but may be used as a second-line tool after ultrasonography to further differentiate between benign, malignant and borderline masses.
15. PET-CT cannot differentiate reliably between borderline and benign tumours.

Circulating cell-free DNA and circulating tumour cells
Circulating cell-free DNA and circulating tumour cells as non-invasive cancer biomarkers and in non-invasive biopsy (sometimes called 'liquid biopsy') have been investigated in multiple studies (Barbosa et al., 2018; Chen et al., 2019; Giannopoulou et al., 2018; Guo et al., 2018; Kolostova et al., 2015; B. Li et al., 2019; N. Li et al., 2019; Lou et al., 2018; Phallen et al., 2017; Suh et al., 2017; Vanderstichele et al., 2017; Widschwendter et al., 2017; Yu et al., 2019; Zhou et al., 2016). DNA methylation patterns in cell-free DNA show potential to detect a proportion of ovarian cancers up to 2 years in advance of diagnosis. They may potentially guide personalized treatment, even though validation studies are lacking. The prospective use of novel collection vials, which stabilize blood cells and reduce background DNA contamination in serum/plasma samples, will facilitate the clinical implementation of liquid biopsy analyses (Widschwendter et al., 2017). A prospective evaluation of the potential of cell-free DNA for the diagnosis of primary ovarian cancer using chromosomal instability as a read-out suggested that this might be a promising method to increase the specificity of the presurgical prediction of malignancy in patients with adnexal masses (Vanderstichele et al., 2017). However, even though these circulating biomarkers play a key role in understanding metastasis and tumorigenesis and provide comprehensive insight into tumour evolution and dynamics during treatment and disease progression, they still have not been established as part of routine clinical practice (Barbosa et al., 2018; Chen et al., 2019; Giannopoulou et al., 2018). One meta-analysis suggested that quantitative analysis of cell-free DNA has unsatisfactory sensitivity but acceptable specificity for the diagnosis of ovarian cancer (Zhou et al., 2016). In a more recent meta-analysis, cell-free DNA appeared to be slightly better than CA 125 and similar to HE4 with respect to its diagnostic ability to discriminate individuals with from those without ovarian cancer. Nevertheless, the diagnostic value of cell-free DNA in ovarian cancer patients remains unclear and the data should be interpreted with caution. Further large-scale prospective studies are strongly recommended to validate the potential applicability of using circulating cell-free DNA, alone or in combination with conventional markers, as a diagnostic biomarker for ovarian cancer, and to explore potential factors that may influence the accuracy of ovarian cancer diagnosis (Zhou et al., 2016).

Statement on circulating cell-free DNA and tumour cells (Statement 18)
18. Circulating cell-free DNA and circulating tumour cells should not yet be used in routine clinical practice to differentiate between benign and malignant ovarian masses.

Overview of consensus
The experts also reached a consensus on a flowchart describing steps recommended to distinguish between benign and malignant tumours (Figure 2) and to direct patients towards appropriate treatment pathways. Ultrasonography is recommended as a first step to stratify patients with symptoms suggestive of an adnexal mass, and in those with an incidental finding of an adnexal mass on imaging. If the scan rules out normal ovaries and physiological changes (i.e. rules out O-RADS 1), the IOTA ADNEX model could be applied as a next step in order to determine the risk of malignancy. A consensus was also reached on further steps necessary to differentiate between subgroups of malignancy and extent of disease within gynaecological oncology centres (Figure 3). Ultrasound assessment by an expert or application of the IOTA ADNEX model in combination with the tumour marker profile (CA 125 and CEA, complemented with other markers in specific cases) can often indicate the specific subtype of malignancy. If available, diagnosis of the primary lesion can be confirmed with diffusion- and perfusion-weighted MRI, especially in cases in which fertility-sparing surgery is considered. A CT scan of chest, abdomen and pelvis is mandatory before planned surgery for presumed malignancy, in order to exclude secondary cancers, thromboembolic events, and multifocal intraparenchymal distant metastases that would preclude operability. The final management and treatment journey of the patient should be determined within an expert multidisciplinary setting, taking into account both the diagnostic findings and the overall patient profile, including symptoms, patient preferences and prior surgical, medical and reproductive history, with the ultimate aim of defining an individualized approach for every patient.

Figure 2
Patient with symptoms suggestive of adnexal mass or incidental finding of adnexal mass on clinical examination or imaging (e.g., CT, MRI)

[...] (chair) and François Planchamp (methodologist) wrote the first draft of the manuscript. All other contributors actively gave personal input, reviewed the manuscript and gave final approval before submission.

Appendices
Appendix 1: Identification of scientific evidence: literature search in MEDLINE.
Research period
1 May 2015-1 May 2020

Indexing terms
adnexal masses, alpha fetoprotein, assessment of different neoplasias in the adnexa, assessment of different neoplasias in the adnexa masses, assessment of different neoplasias in the adnexa model, benign ovarian masses, benign ovarian tumours, beta-human chorionic gonadotropin, biomarker, borderline tumours, carbohydrate antigen 19.9, carbohydrate antigen 125, carcinoembryonic antigen, cell-free deoxyribonucleic acid, circulating cancer cells, circulating cell-free deoxyribonucleic acid, circulating free deoxyribonucleic acid, circulating tumour cells, circulating tumour deoxyribonucleic acid, clinical routine, computed tomography, consensus statement, daily practice, diagnosis, diagnostic performance, diagnostic models, diffusion-weighted imaging, diffusion-weighted magnetic resonance imaging, dynamic contrast-enhanced magnetic resonance imaging, expert ultrasound examiners, first line test, functional sequences, gynaecology imaging reporting and data system, human epididymis protein, imaging, imaging methods, immunohistochemical diagnosis, inhibin, international ovarian tumour analysis, international ovarian tumour analysis methods, international ovarian tumour analysis rules, intraoperative ultrasound, investigations, logistic regression 1 test, logistic regression 2 test, magnetic resonance imaging, malignant ovarian masses, malignant ovarian tumours, marker, maximum standardized uptake value, molecular biology, molecular marker, morphological scoring system, multivariate analysis, ovarian cancer, ovarian masses, ovarian tumours, ovary, positron emission tomography, positron emission tomography-computed tomography, pre-operative characterization, pre-operative diagnosis, prognostic factor, prognostic value, protein biomarker, risk factors, risk of malignancy score, risk of malignancy index, risk of ovarian malignancy algorithm, scoring system, screening test, secondary metastatic tumours, second line test, simple rules, simple rules risk, simple rules risk model, single protein biomarker, standardized uptake value, suspected malignancy, suspected metastatic tumour, test performances, threshold risk, transabdominal ultrasound, transvaginal ultrasound, tumour markers, ultrasonography, ultrasound, ultrasound (3D), ultrasound-based diagnostic models, ultrasound-based risk models, ultrasound examiners, vascular endothelial growth factor, whole body diffusion magnetic resonance imaging.

Study design
Priority was given to high-quality systematic reviews, meta-analyses and validating cohort studies, but lower levels of evidence were also evaluated. The search strategy excluded editorials, letters and case reports. The reference list of each identified article was reviewed for other potentially relevant papers.