eISSN: 2084-9869
ISSN: 1233-9687
Polish Journal of Pathology
vol. 72
Original paper

Interlaboratory agreement in assessment of gynaecological cytology in Cervical Cancer Screening Programme in Poland – a pilot evaluation

Kinga Zalewska-Otwinowska
Anna Macios
1, 2
Katarzyna Komerska
Małgorzata Rekosz
Andrzej Nowakowski

Department of Cancer Prevention, Maria Sklodowska-Curie National Research Institute of Oncology, Warsaw, Poland
Department of Gastroenterology, Hepatology and Clinical Oncology, Centre of Postgraduate Medical Education, Warsaw, Poland
Pol J Pathol 2021; 72 (1): 75-83
Online publish date: 2021/05/31

Introduction

Cervical cancer (CC) is the fourth most common cancer and cancer-related cause of death in women worldwide [1], and it remains an unsolved epidemiological problem, especially in low- and middle-income countries [2]. In 2017, 2,502 new cases and 1,609 deaths due to CC were reported in Poland by the National Cancer Registry, which translates into world age-standardized incidence and mortality rates in women of 7.7 and 4.2 per 100,000, respectively [3]. In Poland, more than half of all CC cases occur among women between 45 and 64 years old. Incidence and mortality have been decreasing for the last few decades in women under the age of 60, but these rates remain stable in older women [4]. Also, 5-year relative survival rates for CC have remained unchanged throughout the last decade [5]. Screening for CC has been recommended by the European Commission since 2003 as a form of secondary prevention [6] and has the potential to greatly reduce both CC incidence and mortality through early identification of precancerous lesions, which can be treated far more effectively than symptomatic invasive disease [7]. In most developed countries that run organised or opportunistic screening programmes, exfoliative cytology (Pap test) is still the basic screening test. Even in countries that have switched to the high-risk human papillomavirus (hr-HPV) test as a more sensitive screening method, Pap tests are an integral part of triage algorithms after a positive hr-HPV test [8, 9, 10, 11].
All types of cytological procedures are affected by the problem of ambiguous results that hinder the implementation of proper diagnostic and therapeutic procedures. As a result of a multidisciplinary effort, The Bethesda System (TBS) was developed in 1988 to create a standardized framework for cervical cytology reports (including evaluation of specimen adequacy, optional general categorization and descriptive diagnosis) [12, 13]. TBS was accepted internationally. The system was revised in 1991, 2001 and 2014 in response to the evolution of management and new research on cervical cancer [13]. TBS for assessing abnormalities in the cervix and in the thyroid gland was very readily accepted by pathologists and clinicians, and the presented classifications created the foundation for a unified clinical procedure [12]. In 2006, an organised CC screening programme (OCCSP) for women aged 25 to 59 was initiated in Poland. A Pap test is offered every 3 years, in accordance with the guidelines of the European Commission and the Polish Gynaecological Society [7, 14]. In Poland, by law, cytopathologists (medical doctors with a specialty in pathology) and cytodiagnosticians (laboratory diagnosticians with a specialty in medical cytomorphology) are entitled to assess cytological samples. In the OCCSP, all Pap tests with abnormal results must be approved by a senior cytodiagnostician or a cytopathologist. Results are coded according to a modified TBS 2001. Information about screening results is stored in the IT System for Prevention Monitoring (Polish: System Informatyczny Monitorowania Profilaktyki, SIMP), and the National Cancer Registry (NCR) collects reports of cancer diagnoses. Algorithms for managing abnormal results are defined and include repeated testing and colposcopy/biopsy.
Apart from coverage of the target population by the screening programme, quality assurance is a key factor in obtaining the full impact of screening on CC epidemiology. Certification of laboratories and personnel is also important for maintaining high quality of cytological services. Evaluation of agreement between an expert cytological diagnosis and the diagnoses of laboratories/cytodiagnosticians routinely working in CC screening programmes may be one of the components of the certification process. It may also be important for quality assurance, which plays a crucial role in cytology-based CC screening [7]. In Poland, quality assurance is led by the Central Coordination Centre (Polish: COK – Centralny Ośrodek Koordynujący). Although cytological laboratories carrying out the CC screening programme in Poland undergo certification, to our knowledge interlaboratory agreement has never been evaluated; it is the subject of this pilot project.

Material and methods

In 2018, COK carried out a pilot study to assess interlaboratory variability of cytological diagnoses in selected laboratories operating in the Cervical Cancer Screening Programme (CCSP). A set of 50 expert-selected conventional cytology slides, collected in 2012-2015 and with clinical, colposcopic and histological confirmation of diagnoses, was prepared. Slides originally diagnosed by cytodiagnosticians of the Cytological Laboratory of the Maria Sklodowska-Curie National Research Institute of Oncology in Warsaw, Poland, were reviewed again by an expert cytodiagnostician with 20 years of experience from the same laboratory. The expert confirmed the cytological diagnoses taking into account clinical, colposcopic and histopathology reports, and the selected smears formed a set representative of each category of diagnoses according to the Bethesda 2001 system (expert diagnoses are presented in Table I). The laboratory is a high-throughput one with internal quality assurance, and it co-operates with a Cervical Pathology Clinic that performs large-scale screening, colposcopic/histological triage and treatment of women with cervical pathology, also within the CCSP.
The set of slides, blinded and re-mixed each time, was sent to 15 cytodiagnostic laboratories: 8 in the Mazovian Voivodeship, 4 in the Lublin Voivodeship, 2 in the Świętokrzyskie Voivodeship and 1 in the Łódź Voivodeship, all operating in the CCSP. Standard clinical data accompanied each slide. By decision of the head of each laboratory, one or more cytodiagnosticians were asked to evaluate the set of smears over a week, alongside routine everyday practice. Each slide was to be assessed according to the Bethesda 2001 system. After the evaluation, three types of coding were applied in COK to allow for agreement analyses, as follows:

• Coding 1 (general): unsatisfactory for evaluation vs. normal (no intraepithelial lesion or malignancy – NILM) vs. abnormal (atypical squamous cells of undetermined significance [ASC-US] or low-grade squamous intraepithelial lesion [LSIL] or atypical squamous cells cannot exclude HSIL [ASC-H] or high-grade squamous intraepithelial lesion [HSIL] or atypical glandular cells [AGC] or squamous cell carcinoma [SCC]);
• Coding 2 (aggregated): unsatisfactory for evaluation vs. normal (NILM) vs. low-grade lesions (LSIL or ASC-US) vs. high-grade lesions (ASC-H or HSIL or AGC or SCC);
• Coding 3 (detailed): unsatisfactory for evaluation vs. normal (NILM) vs. ASC-US vs. LSIL vs. ASC-H vs. AGC vs. HSIL vs. SCC.

The study followed the Declaration of Helsinki and the protocol was approved by the Ministry of Health.
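
The three codings listed above amount to a simple recoding of each Bethesda 2001 diagnosis. The following is a minimal sketch of that recoding, not the authors' software; the function name `recode`, the dictionary `AGGREGATED` and the string labels are hypothetical, and only the groupings themselves are taken from the list above.

```python
# Illustrative recoding of a Bethesda 2001 diagnosis into the three codings.
# Names and labels are hypothetical; only the groupings come from the text.
UNSAT = "unsatisfactory"

# Coding 2 (aggregated): NILM vs. low-grade vs. high-grade lesions.
AGGREGATED = {
    "NILM": "normal",
    "ASC-US": "low-grade",
    "LSIL": "low-grade",
    "ASC-H": "high-grade",
    "HSIL": "high-grade",
    "AGC": "high-grade",
    "SCC": "high-grade",
}


def recode(diagnosis: str, coding: int) -> str:
    """Map a detailed diagnosis to coding 1 (general), 2 (aggregated) or 3 (detailed)."""
    if coding not in (1, 2, 3):
        raise ValueError(f"unknown coding: {coding}")
    if diagnosis == UNSAT:
        return UNSAT  # kept as its own category in every coding
    if diagnosis not in AGGREGATED:
        raise ValueError(f"unknown diagnosis: {diagnosis}")
    if coding == 3:
        return diagnosis
    if coding == 2:
        return AGGREGATED[diagnosis]
    # Coding 1: everything other than NILM counts as abnormal.
    return "normal" if diagnosis == "NILM" else "abnormal"
```

For example, an ASC-H reading becomes "abnormal" in coding 1, "high-grade" in coding 2 and stays "ASC-H" in coding 3.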

Statistical analysis

Percentages of laboratories’ diagnoses concordant with the expert’s evaluations were calculated for each type of coding. Unweighted Cohen’s κ statistics were estimated for each laboratory participating in the pilot study. Weighted κ (κw) was also computed, since unweighted κ does not account for the severity of the diagnoses. Weights were set so that an “unsatisfactory for evaluation” diagnosis was considered to be in total disagreement with any other diagnosis, and the remaining diagnoses were quadratically weighted. A level of p < 0.05 was considered significant. Landis and Koch’s approach was applied to interpret the agreement expressed by κ coefficients as: < 0 – poor; 0.0-0.2 – slight; 0.2-0.4 – fair; 0.4-0.6 – moderate; 0.6-0.8 – substantial; and 0.8-1.0 – almost perfect. Stata 15 software [15] was used for statistical analyses.
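
To illustrate the agreement statistics described above, here is a minimal pure-Python sketch of unweighted and weighted Cohen's κ for one laboratory against the expert. It is not the Stata code used in the study. The function names, the severity ordering passed in as `graded`, the exact form of the quadratic weights (1 minus the squared normalized distance) and the treatment of band boundaries in `landis_koch` are all assumptions made for illustration.

```python
from collections import Counter


def agreement_weight(a, b, graded):
    """Agreement weight in [0, 1] for a pair of diagnoses.

    `graded` lists the adequate categories in assumed order of severity.
    Any category outside it (e.g. "unsatisfactory") gets weight 0 against
    every other diagnosis, i.e. total disagreement.
    """
    if a == b:
        return 1.0
    if a not in graded or b not in graded:
        return 0.0
    i, j = graded.index(a), graded.index(b)
    return 1.0 - ((i - j) / (len(graded) - 1)) ** 2  # quadratic weights


def cohen_kappa(expert, lab, graded=None):
    """Cohen's kappa; unweighted if `graded` is None, weighted otherwise."""
    n = len(expert)
    if n == 0 or n != len(lab):
        raise ValueError("need two non-empty rating lists of equal length")

    def w(a, b):
        if graded is None:
            return 1.0 if a == b else 0.0
        return agreement_weight(a, b, graded)

    observed = sum(w(a, b) for a, b in zip(expert, lab)) / n
    e_counts, l_counts = Counter(expert), Counter(lab)
    expected = sum(
        e_counts[a] * l_counts[b] * w(a, b) for a in e_counts for b in l_counts
    ) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Landis and Koch label; upper band limits are treated as inclusive (assumption)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
                         (0.8, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"
```

With four toy readings where the expert gives NILM, NILM, HSIL, HSIL and a laboratory gives NILM, NILM, HSIL, NILM, observed agreement is 0.75, chance agreement is 0.5 and unweighted κ is 0.5, which `landis_koch` labels as moderate.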

Results

Of the 15 diagnostic centres involved in the study, 13 laboratories evaluated all 50 prepared slides, 1 laboratory evaluated 49 smears and 1 laboratory evaluated only 29 slides. Since the distribution of normal and abnormal slides in the set evaluated by the last laboratory was not consistent with the distribution in the full set of 50 smears, this laboratory was excluded from further analysis (Table I: full source data with expert’s and laboratories’ diagnoses). A total of 728 readings were collected and 699 of them were analysed: laboratories considered 21 as unsatisfactory for evaluation and 678 as adequate. However, only 32.1% of truly unsatisfactory for evaluation assessments were recognized correctly (Table II). Slides originally assessed as adequate for evaluation were recognized properly in 99.5% of readings. Cytodiagnosticians correctly evaluated 94.8% of normal smears overall (range from 86.4% to 100% by laboratory) and correctly recognized 76.2% of abnormal slides (range from 50% to 95.8% by laboratory). Of 336 readings of slides with the expert’s diagnosis of “presence of abnormal cells”, 79 were appraised as normal (23.5%). On the other hand, 14 of 307 readings of slides with the expert’s diagnosis of “normal (no intraepithelial lesion or malignancy)” were identified as abnormal (4.6%). The majority of readings of slides assessed originally as “unsatisfactory for evaluation” by the expert (38 of 56) were diagnosed as normal (67.9%).
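
The counts in the paragraph above can be cross-checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the text; the variable names and the helper `pct` are hypothetical.

```python
# Cross-check of the reading counts and percentages quoted in the text.
collected = 13 * 50 + 49 + 29   # 13 full sets plus two partial sets
analysed = collected - 29       # the laboratory with 29 readings was excluded

# Analysed readings grouped by the expert's diagnosis.
expert_abnormal, expert_normal, expert_unsat = 336, 307, 56


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(collected, analysed)                             # 728 699
print(expert_abnormal + expert_normal + expert_unsat)  # 699
print(pct(79, expert_abnormal))                        # 23.5 (abnormal read as normal)
print(pct(14, expert_normal))                          # 4.6 (normal read as abnormal)
print(pct(38, expert_unsat))                           # 67.9 (unsatisfactory read as normal)
```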
High-grade lesions were correctly recognized more often than low-grade ones (64.3% vs. 51.6%). Slides with ASC-US or LSIL were more often misclassified as normal than as high-grade (56 readings vs. 31 readings, 30.8% vs. 17.0%). Of the 8 categories in detailed coding, smears with the expert’s diagnoses of ASC-US and ASC-H were evaluated discordantly by cytodiagnosticians most frequently (21.4% and 28.6% of concordant diagnoses, respectively). Of 14 readings of squamous cell carcinoma slides in the laboratories undergoing assessment, 3 were appraised as normal (21.4%). Among 408 readings considered as normal by laboratories, 117 were misdiagnosed (79 of them, 19.4% of the 408, were abnormal according to the expert’s opinion). The proportion of correct diagnoses and the weighted and unweighted κ coefficients were estimated for each laboratory. The median percentage of correct diagnoses was 82% (range 66-92%) in general coding, 72% (range 60-84%) in aggregated coding and 66% (range 56-80%) in detailed coding (Fig. 1). Unweighted κ coefficients among laboratories ranged from 0.40 to 0.86 for general coding, from 0.37 to 0.76 for aggregated coding and from 0.34 to 0.73 for detailed coding, and the median unweighted κ coefficients correspond to substantial (0.67), moderate (0.59) and moderate (0.53) agreement, respectively (Table III). Weighted κ coefficients were calculated to account for differences in severity of diagnoses. In the detailed coding, κw ranged from 0.40 to 0.76 with a median of 0.65, indicating substantial agreement.
For each slide, the percentage of diagnostic centres that diagnosed it correctly was calculated for each type of coding (Table II). Of 50 smears, 35 (70%) were recognized correctly by more than 80% of laboratories in general coding, 30 (60%) in aggregated coding and 23 (46%) in detailed coding. However, 4 slides (8%) were properly evaluated by at most 20% of diagnostic centres in general coding, 7 (14%) in aggregated coding and 8 (16%) in detailed coding. Two slides were identified incorrectly by all of the laboratories in detailed coding, and one of them also in aggregated coding. The expert diagnosed one of these slides as LSIL and the other as ASC-US. Most of smears with more than 80% proper identifications were normal ones [22]. Overall, each slide was recognized properly by an average of 81.0% of laboratories in general coding, 71.9% in aggregated coding and 65.7% in detailed coding (medians 92.9%, 85.7% and 78.6%, respectively).
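
The per-slide summary described above can be sketched as follows. This is an illustration rather than the authors' procedure: the function names and the toy data are hypothetical, and the choice to leave a missing reading out of that slide's denominator is an assumption (the text does not say how the laboratory with 49 readings was handled).

```python
def share_correct(expert, lab_readings):
    """Fraction of laboratories agreeing with the expert, per slide.

    expert: one diagnosis per slide.
    lab_readings: one list per laboratory, aligned with `expert`;
    None marks a slide that the laboratory did not read.
    """
    shares = []
    for slide, truth in enumerate(expert):
        read = [lab[slide] for lab in lab_readings if lab[slide] is not None]
        shares.append(sum(r == truth for r in read) / len(read) if read else None)
    return shares


def count_bins(shares):
    """Number of slides read correctly by more than 80% and by at most 20% of laboratories."""
    valid = [s for s in shares if s is not None]
    return sum(s > 0.8 for s in valid), sum(s <= 0.2 for s in valid)


# Toy data: three slides, three laboratories (hypothetical values).
expert = ["NILM", "LSIL", "HSIL"]
labs = [
    ["NILM", "NILM", "HSIL"],
    ["NILM", "ASC-US", "HSIL"],
    ["NILM", "NILM", None],
]
shares = share_correct(expert, labs)
high, low = count_bins(shares)
```

In this toy set the first and third slides are read correctly by every laboratory that assessed them, while the LSIL slide is missed by all three, so `high` is 2 and `low` is 1.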

Discussion

Cytological laboratories participating in national population-based cervical cancer screening programmes should be certified and subject to both internal and external quality assurance [7]. According to the EU Guidelines, internal quality assurance includes re-screening of slides, monitoring of primary screening detection rates, assessment of cyto-histologic, cyto-clinical and cyto-virological correlations, and audit of interval cancers [7]. External quality assurance is less well defined, varies between countries and may include proficiency testing and comparison of laboratory and personal reporting rates with set national standards (p. 167 in [7]). Although cervical cytology is commonly used in developed countries worldwide, its accuracy in detecting cervical neoplasia may vary greatly, since it is a very subjective method [14, 16], and diagnoses may differ from laboratory to laboratory [17]. Some studies show only fair agreement (with κ < 0.3) between cytodiagnosticians evaluating cytology samples [18]. Low-grade abnormalities cause the most difficulty in establishing a correct diagnosis [19]. In this study we aimed to lay the groundwork for future periodic proficiency testing in the assessment of cytological slides and to run a pilot evaluation of selected cytodiagnosticians operating in the cervical cancer screening programme in Poland. These activities are part of an effort to establish a certification and periodic recertification process. We also aimed to assess interlaboratory agreement in the assessment of a set of highly selected cytological slides representing a wide spectrum of normal and abnormal conditions of the uterine cervix. Although the CCSP in Poland has been in place since 2006/2007, such activities have been attempted for the first time.
In general, we found moderate to very high agreement between the original expert cytological diagnosis and the diagnoses reached by laboratories participating in our pilot project. Interpretation of Pap smears is subjective, so the achieved concordance with the expert diagnosis of each slide by an average of 81.0% of laboratories in general coding, 71.9% in aggregated coding and 65.7% in detailed coding is a fairly good result. One laboratory (number 1) consistently had the highest agreement rate in all three types of coding. On the other hand, another laboratory (number 4) had the lowest agreement rate in two types of coding (general and aggregated) of diagnoses according to the Bethesda system. This may indicate that the quality of assessment differs between cytological laboratories participating in national screening, and reasons for high disagreement in outlying laboratories should be sought. Our results show that cytodiagnosticians found it difficult to distinguish slides that were unsatisfactory for evaluation from normal slides. Only 32.1% of readings of slides with the original expert diagnosis of “unsatisfactory for evaluation” were recognized correctly. This should be taken into consideration in future verification of cytological diagnoses, and more emphasis should be placed on smear quality during the training of cytodiagnosticians in Poland.
The percentage of slides correctly diagnosed as abnormal was 76.2%; however, results vary strongly between specific diagnoses, from 21.4% of slides correctly diagnosed as ASC-US to 61.2% of slides correctly diagnosed as HSIL. Some previous studies show that the greatest source of disagreement in cytology results involves the interpretation of low-grade lesions [20, 21, 22], and our study seems to be in agreement with these findings. The proportion of recognized abnormal slides varied from 50% to 95.8% by laboratory, which suggests great differences in reporting between some of them and should trigger corrective measures in laboratories with low detection rates, which directly influence the quality of the whole screening process. Slides were evaluated in addition to the normal workload in the laboratories, which may have influenced the results. Another limitation of our pilot study is the limited number of laboratories that took part in the project. However, we have initiated actions to run similar analyses for each cytodiagnostician participating in the CCSP in Poland and to make it a future requirement of certification and periodic recertification.
COK, as part of the Department of Cancer Prevention, The Maria Sklodowska-Curie National Research Institute of Oncology, Warsaw, Poland, is obliged under its agreement with the Ministry of Health, among other responsibilities, to monitor the quality of services provided by laboratories participating in the CCSP. We carried out this study to evaluate the consistency of cytological diagnoses in selected laboratories in Poland. The results of the pilot study will lead to the development of methods for evaluating laboratories in the CCSP and will thereby help to ensure high-quality cytological diagnoses. A periodic skills evaluation system is planned to certify cytologists working in the CCSP. The analysis of the data collected during the pilot study clearly indicates the necessity of further verification of the consistency of cytological diagnoses. External audits should be run systematically and should also include monitoring of colposcopy examinations, histopathological results and effectiveness of treatment. COK has initiated actions and is working to implement these monitoring activities in the CCSP in Poland.
The authors declare no conflict of interest.
The study was financed by the Polish Ministry of Health through the National Cancer Control Programme within the objective of coordination and monitoring of quality of cervical and breast cancer screening.

References

1. Stewart BW, Wild CP (eds.). World cancer report 2014. International Agency for Research on Cancer, Lyon 2014.
2. World Health Organization: Cervical cancer, Geneva 2018. Available at: https://www.who.int/cancer/prevention/diagnosis-screening/cervical-cancer/en/ (access 16.07.2019).
3. Wojciechowska U, Czaderny K, Ciuba A, et al. Cancer in Poland in 2016. Polish National Cancer Registry, Department of Epidemiology and Cancer Prevention 2018.
4. Nowakowski A, Wojciechowska U, Wieszczy P, et al. Trends in cervical cancer incidence and mortality in Poland: is there an impact of the introduction of the organised screening? Eur J Epidemiol 2017; 32: 529-532.
5. The Maria Sklodowska-Curie National Research Institute of Oncology, Warsaw: National Cancer Registry. Available at: http://onkologia.org.pl/nowotwory-szyjki-macicy-kobiet/ (access 16.07.2019).
6. Council of the European Union (2003). Council recommendation of 2 December 2003 on cancer screening (2003/878/EC). Off J Eur Union 2003; L 327: 34-38.
7. Arbyn M, Anttila A, Jordan J, et al. (eds.). European guidelines for quality assurance in cervical cancer screening. 2nd edition. International Agency for Research on Cancer 2008.
8. Gultekin M, Karaca MZ, Kucukyildiz I, et al. Initial results of population based cervical cancer screening program using HPV test in one million Turkish women. Int J Cancer 2018; 142: 1952-1958.
9. Rebolj M, Rimmer J, Denton K, et al. Primary cervical screening with high risk human papillomavirus testing: observational study. BMJ 2019; 364: 1240.
10. Polman NJ, Snijders PJF, Kenter GG, et al. HPV-based cervical screening: rationale, expectations and future perspectives of the new Dutch screening programme. Prev Med 2019; 119: 108-117.
11. Bergengren L, Lillsunde-Larsson G, Helenius G, et al. HPV-based screening for cervical cancer among women 55-59 years of age. PLoS One 2019; 14: e0217108.
12. Zelazik M, Michalska A, Radowicz-Chil A, et al. Unified cytological reports determine the clinical management. Medical Studies 2019; 35: 147-152.
13. Nayar R, Wilbur DC. The Bethesda System for Reporting Cervical Cytology: A Historical Perspective. Acta Cytol 2017; 61: 359-372.
14. Polskie Towarzystwo Ginekologiczne: Rekomendacje PTG dotyczące diagnostyki, profilaktyki i wczesnego wykrywania raka szyjki macicy. Ginekol Pol 2006; 77: 655-659.
15. StataCorp. 2017. Stata Statistical Software: Release 15. College Station, TX: StataCorp LLC.
16. Baena A, Guevara E, Almonte M, et al. Factors related to inter-observer reproducibility of conventional Pap smear cytology: a multilevel analysis of smear and laboratory characteristics. Cytopathology 2017; 28: 192-202.
17. Klinkhamer PJ, Vooijs GP, de Haan AF. Intraobserver and interobserver variability in the quality assessment of cervical smears. Acta Cytol 1989; 33: 215-218.
18. da Fonseca AJ, Murari RS, Moraes IS, et al. Validity of cervicovaginal cytology in a Brazilian State with high incidence rate of cervical cancer. Rev Bras Ginecol Obstet 2014; 36: 347-352.
19. Sørbye SW, Suhrke P, Revå BW, et al. Accuracy of cervical cytology: comparison of diagnoses of 100 Pap smears read by four pathologists at three hospitals in Norway. BMC Clin Pathol 2017; 17: 18.
20. Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA 2001; 285: 1500-1505.
21. Türkmen IC, Başsüllü N, Bingöl B, et al. Interobserver variability in cervical smears from patients with a history of abnormal cytology: comparison of conventional Pap smears and liquid-based cytology. Erciyes Med J 2013; 1: 13-17.
22. Wright TC Jr, Stoler MH, Behrens CM, et al. Interlaboratory variation in the performance of liquid-based cytology: insights from the ATHENA trial. Int J Cancer 2014; 134: 1835-1843.
Copyright: © 2021 Polish Association of Pathologists and the Polish Branch of the International Academy of Pathology. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License (http://creativecommons.org/licenses/by-nc-sa/4.0/), allowing third parties to copy and redistribute the material in any medium or format and to remix, transform, and build upon the material, provided the original work is properly cited and states its license.