Improving Accuracy and Completeness in the Collaborative Staging System for Stomach Cancer in South Korea

BACKGROUND
Cancer staging enables planning for the best treatments, evaluation of prognosis, and predictions for survival. The Collaborative Stage (CS) system makes it possible to significantly reduce the proportion of patients labeled at an "unknown" stage as well as discrepancies among different staging systems. This study aims to analyze the factors that influence the accuracy and validity of CS data.


MATERIALS AND METHODS
Data were randomly selected (233 cases) from stomach cancer cases enrolled for CS survey at the Korea Central Cancer Registry. Two questionnaires were used to assess CS values for each case and to review the cancer registration environment for each hospital. Data were analyzed in terms of the relationships between the time spent for acquisition and registration of CS information, environments relating to cancer registration in the hospitals, and document sources of CS information for each item.


RESULTS
The time for extracting and registering data was found to be shorter when the hospitals had prior experience gained from participating in a CS pilot study and when they were equipped with full-time cancer registrars. Evaluation of the CS information according to medical record sources found that the percentage of items missing for Site Specific Factor (SSF) was 30% higher than for other CS variables. Errors in CS coding were found in variables such as "CS Extension," "CS Lymph Nodes," "CS Metastasis at Diagnosis," and "SSF25 Involvement of Cardia and Distance from Esophagogastric Junction (EGJ)."


CONCLUSIONS
To build CS system data that are reliable for cancer registration and clinical research, the following components are required: 1) training programs for medical records administrators; 2) supporting materials to promote active participation; and 3) format development to improve registration validity.


Hyun-Sook Lim 1&, Young-Joo Won 2&, Yoo-Kyung Boo 3*

Introduction
Cancer is the number one cause of death in South Korea, accounting for 27.6% of the total national death toll in 2012 (Statistics Korea, 2013). As cancer deaths have been steadily increasing, public and individual burdens for cancer treatment are also expected to increase accordingly. The cancer registration project is both fundamental and critical to a national cancer control program that will plan and monitor the priority of cancer management policy (Moore, 2013). Cancer registration statistics lead to predictions regarding medical staff, hospitals, and costs for cancer treatment. Causes of cancer can also be identified by checking the incidence trends and group incidences in the statistics. Furthermore, accurate statistics make it possible to check the effects of prevention, diagnosis, and treatment programs, which can also be used to create materials for cancer information training and promotion. The focus of the cancer registration project is the cancer registration data collected from medical institutions. Factors contributing to a successful cancer registry in South Korea include timeliness of registration reports, completeness and validity of data, and accuracy of reporting data. The cancer registration project, which began in the 1980s (Shin et al., 2005), has made various efforts to ensure quality registration data, such as training cancer registrars, distributing quality management programs, and developing registration guides and training materials.
Stage information plays an essential role in making treatment decisions for patients by helping to predict a patient's prognosis and to compare treatment results so that increasing survival rates for particular cancers can be identified (AJCC, 2010).
The Korea Central Cancer Registry has conducted stage classification using the Surveillance, Epidemiology, and End Results (SEER) Summary Stage (SS) since 2003. The SS was developed to categorize the extent of disease (EOD), which represents how widely the cancer has spread from the primary site (Shambaugh et al., 1977; Young et al., 2012). Although this classification system is applicable to every cancer type and is easy to use, the SS is not readily utilized in clinical research due to the difficulty of associating it with other stage systems, as well as a lack of information provided for clinicians. The data items collected for cancer registration in South Korea consist of personal characteristics and cancer information, including primary site, morphology, dates of diagnosis, methods of final diagnosis, initial treatment, etc. However, the data are limited regarding information on incidence rates for cancer patients.
To overcome these limitations, efforts have been made to produce various cancer statistics by expanding additional factors, and the application of the Collaborative Stage (CS) system is also currently being considered. The CS system was developed to merge the principles of three different cancer staging coding systems: the American Joint Committee on Cancer (AJCC) primary tumor, regional lymph nodes, and distant metastasis (TNM) staging; Summary Stage 1977 (SS77); and Summary Stage 2000 (SS2000) (Collaborative Staging Task Force, 2006; Collaborative Staging Task Force, 2012). The CS system is being used to assign stages for cancer cases diagnosed after 2003 in the US, and it has a decisive benefit of properly adjusting and assigning the stage at diagnosis, even when the stage classification system has changed over time. In South Korea, adopting the CS system for the collection of stage information for major cancers (stomach, colorectal, and breast) is being considered.
CS registration was planned for 47 training hospitals as a part of the "cancer registry statistics project" in 2012. However, CS collection items and customized training for the medical records administrators who would conduct the collection were not yet ready, and resources for efficient work performance had not yet been made available. Accordingly, it is necessary to show evidence for the accuracy and validity of CS data in order to achieve the goal of improving the quality of cancer registration data by expanding the items included in cancer registration. To do this, the factors that influence the accuracy and validity of CS data must be determined.
The purpose of this study is to examine the basic information needed to make specific plans for designing CS classification training and to improve the quality of CS registry data by analyzing the major factors influencing its accuracy and validity.

Materials and Methods
The study identified CS items and their medical record sources in registered cases of stomach cancer, which has one of the highest incidence and mortality rates in Korea (Leung et al., 2008; Seo et al., 2012; Cho et al., 2013; Jung et al., 2014).

Case Selection and Data Collection
The list of participating hospitals for the CS 2012 survey was provided by the Korea Central Cancer Registry (KCCR). From the available cases, each participating hospital randomly selected 10% of its own subject cases. Two questionnaires were developed: one assessing hospital characteristics and the other assessing the source of the information in the medical records for stomach cancer. Questionnaires were developed in consultation with registered cancer registrars through preliminary tests. In the preliminary tests, cancer registrars in ten hospitals were asked to collect 2 or 3 stomach cancer cases using the questionnaire that had been developed, and the questionnaire items were then verified. The main survey was conducted in 39 hospitals, and participants were instructed to answer the verified questionnaires.
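As an illustration of the case selection step, the following minimal Python sketch draws a simple random sample of 10% of one hospital's cases. The case identifiers, the function name `sample_cases`, and the seed are hypothetical; the procedure the hospitals actually followed is not specified beyond random selection of 10% of their cases.

```python
import random


def sample_cases(case_ids, fraction=0.10, seed=None):
    """Return a simple random sample of about `fraction` of the cases."""
    rng = random.Random(seed)
    if not case_ids:
        return []
    # Keep at least one case so that small hospitals still contribute.
    k = max(1, round(len(case_ids) * fraction))
    return sorted(rng.sample(list(case_ids), k))


# Hypothetical registry identifiers for one hospital (not study data).
hospital_cases = [f"CASE-{i:03d}" for i in range(1, 61)]
selected = sample_cases(hospital_cases, seed=2012)
print(len(selected), "of", len(hospital_cases), "cases selected")
```

Fixing the seed makes the selection reproducible, which may help when a sampled case list has to be audited later.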

Questionnaire for institutions
To identify factors influencing the registration of CS data and to understand the hospitals' circumstances, the following six variables were assessed: location of hospital, regional cancer registry operation, participation in the CS pilot study from the KCCR project in 2011, number of all cancer cases diagnosed in 2009, number of stomach cancer cases diagnosed in 2009, and existence of full-time medical record administrators for cancer registration.

Questionnaire for registering CS
Codes were assigned to CS questionnaire items according to the stomach cancer schema (Table 1). Each item's values and original record sources were collected from the medical record documents. In order to measure the time required for abstracting and registering CS information, the length of time was reported for each case.

Statistical analysis
A chi-square test was conducted to check whether the average time spent in abstracting and registering CS in each institution was related to the six variables measured in the institution questionnaire. Frequency analysis was used for diagnostic path, subsite, morphology type, and differentiation for 233 stomach cancer cases. We created a cross tabulation table for the distribution of information collected for CS by medical record sources in the hospitals. Finally, to identify the factors potentially influencing the accuracy, validity, and efficiency of CS registration, factors contributing to errors were analyzed by checking the missing and erroneous items in each case. All p values are two-tailed, with p<0.05 considered to be statistically significant. All statistical analyses were conducted using SPSS Statistics 21 (SPSS, Chicago, IL, USA).
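For readers who wish to reproduce this kind of association test outside SPSS, a minimal Python sketch of a Pearson chi-square test on a 2x2 table is given below. It is not the authors' procedure: it assumes the registration time is dichotomized at 20 minutes and crossed with one binary hospital characteristic, it applies no continuity correction, and the counts are hypothetical rather than taken from Table 2.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    `table` is [[a, b], [c, d]] with non-zero row and column totals.
    Returns (statistic, p value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts of hospitals (rows: pilot study yes / no;
# columns: 20 minutes or less / more than 20 minutes).
stat, p = chi_square_2x2([[8, 4], [9, 18]])
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

With only 39 hospitals, some expected cell counts may be small, in which case an exact test would usually be preferred over the chi-square approximation.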
Results
Among the 39 hospitals, a mean of 23.6 minutes (SD=12.9 minutes) was found for the time it took to register CS data for one case of stomach cancer. The relationship between the hospital locations and the average time for CS data extraction and registration was not statistically significant (p=0.63). In addition, no statistically significant difference in the average time for abstracting and registering CS data was found for the following variables: regional cancer registry operation, number of cancer cases (all types/sites of cancer) diagnosed in 2009, and number of stomach cancer cases diagnosed in 2009 (p=0.27). In contrast, participation in the pilot study and the existence of full-time cancer registrars had borderline significant effects on the average time spent in CS information extraction and registration, with p values of 0.06 and 0.08, respectively. The percentage of hospitals exceeding 20 minutes for abstracting and registering CS was lower by 70% in those that had participated in the CS pilot study compared to those that did not participate. The percentage of hospitals exceeding 20 minutes was also lower by approximately 30% in those with full-time cancer registrars compared to those without such registrars (Table 2).
Characteristics of the cancers in the 233 study cases are as follows. The most frequent precipitant of the stomach cancer diagnosis was the patient's own perception of symptoms (42.1%), and the next was screening (40.3%). The most common stomach cancer subsites were antrum (42.9%) and body (33.9%). Tubular adenocarcinoma was the most common morphological type of stomach cancer (43.8%), followed by adenocarcinoma, not otherwise specified (27.0%). Regarding differentiation, the poorly differentiated type accounted for the highest proportion (34.8%) (Table 3).
Table 4 shows the sources of the medical records used for extracting CS system information, presented according to CS item. Three noteworthy points were identified from the analysis. First, major items, excluding Site-Specific Factors (SSF), were usually obtained from pathology reports and discharge summaries, while information on distant metastasis at the time of diagnosis was abstracted from radiology reports. Second, the information for SSF 13, SSF 14, and SSF 15, which are tumor markers, was found to be mostly abstracted from laboratory reports. Finally, the "missing" rate was relatively high, at 30% of SSF items.
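The cross tabulation of CS items by record source, and the "missing" rate derived from it, can be sketched as follows. This is a minimal Python illustration under assumed inputs: each observation is an (item, source) pair, the label "missing" marks an item with no recorded source, and the example pairs are hypothetical rather than data from Table 4.

```python
from collections import Counter, defaultdict


def missing_rate_by_item(records, missing_label="missing"):
    """Return {item: percentage of records whose source is missing}."""
    counts = defaultdict(Counter)
    for item, source in records:
        counts[item][source] += 1
    return {
        item: 100.0 * tally[missing_label] / sum(tally.values())
        for item, tally in counts.items()
    }


# Hypothetical (item, source) pairs; not data from Table 4.
records = [
    ("CS Extension", "pathology report"),
    ("CS Extension", "discharge summary"),
    ("CS Mets at DX", "radiology report"),
    ("CS Mets at DX", "radiology report"),
    ("SSF13", "laboratory report"),
    ("SSF13", "missing"),
]
rates = missing_rate_by_item(records)
for item, rate in rates.items():
    print(f"{item}: {rate:.1f}% missing")
```

The intermediate `counts` mapping is itself the item-by-source cross tabulation, so the same structure could be reused to report which document type supplies each item most often.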
Errors were examined by analyzing CS registration results for each hospital (Table 5). The most frequent errors were found in "CS Extension," "CS Lymph Nodes," and "CS Metastasis at Diagnosis (Mets at DX)." The errors from these three items were similar in that they were derived from misinterpretation of the clinical descriptions in the medical record. Two types of errors were found in the SSF data. One was a case of inappropriately applying the given schema: the stomach schema was applied when the subsite of a tumor location was the esophagogastric junction (EGJ), meaning that the EGJ schema should have been used. The other SSF error involved confusion among three SSF codes: 988 "not applicable," 998 "test not done," and 999 "unknown or no information." Specific examples of the errors and corrections are provided in Table 5.
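The distinction among the three special SSF codes can be expressed as a simple decision rule. The Python sketch below is illustrative only: the code meanings are those quoted above, while the function name, its two arguments, and the order of the checks are assumptions and are not part of the CS coding manual.

```python
# Meanings as quoted in the text above.
SSF_SPECIAL_CODES = {
    "988": "not applicable",
    "998": "test not done",
    "999": "unknown or no information",
}


def ssf_special_code(applicable, test_done):
    """Pick the special SSF code when no measured value can be recorded.

    applicable: False when the item is not relevant to this case.
    test_done:  False when the record states the test was not performed,
                None when the record gives no information either way.
    """
    if not applicable:
        return "988"
    if test_done is False:
        return "998"
    if test_done is None:
        return "999"
    raise ValueError("test was done: code the measured value instead")


# Hypothetical case: the item applies, but the record is silent on the test.
code = ssf_special_code(applicable=True, test_done=None)
print(code, "=", SSF_SPECIAL_CODES[code])
```

Checking applicability first reflects the point that 988 describes the item itself, whereas 998 and 999 describe what the medical record says about a relevant item.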
Discussion
In this study, the error list from the CS registration data revealed that the most frequent code errors were found in "CS Extension," "CS Lymph Nodes," "CS Mets at DX," and "SSF25 Involvement of Cardia and Distance from EGJ." Major errors were primarily coding errors, which were thought to result from registrars' applying coding guides incorrectly; input errors were also found. There were some cases when the proper schema could not be selected from the available choices of schema for stomach and EGJ. Different schemas should be used for stomach cancer according to the subsite, and the corresponding SSF items should differ accordingly. However, the schema selected was frequently incorrect. In general, there was considerable missing information in the medical records, which are the bases for coding, and, in many cases, the contents were not detailed enough to be used for coding. More detailed and specific guides for selecting items should be created since item selection for medical records is not easy using the information in the current CS registration guide.
We propose corrections based on the results of analyzing in detail the common errors found in 39 hospitals. In SSF1, there were a number of cases when lymphadenopathy or lymph node enlargement was coded as being involved in cancer. Lymphadenopathy or lymph node enlargement should be coded as 999 (Regional lymph nodes involved pathologically, clinical assessment not stated; Unknown if regional lymph nodes clinically evident; Not documented in patient record). The coding of lymphovascular invasion was frequently confused with that of regional lymph node invasion; lymphovascular invasion should not be coded under regional lymph node invasion. The code 988 was incorrectly used; code 988 is usually used as "not applicable (information not collected for this case)" when the item is not relevant to the cancer type. It should not be confused with "no information" or "unknown information" for the given items. A metastatic or extension site was often double-coded as both extension and metastasis. The areas for metastasis should be anatomically understood for each cancer type, and either extension or metastasis should be selected for coding. Non-specific reports of regional lymph node involvement, or of lymph nodes in the pathology report, were often coded as specific lymph node involvement, using unsuitable codes for each unspecified item. It should be confirmed whether specific regional node involvement was reported or specific lymph nodes were indicated in the pathological report, and appropriate codes should be assigned. Codes for TNM information were omitted. When no information is found other than T3 or N2, it should be recognized that relevant codes are in the schema, and proper codes should be assigned.
Based on the findings in this study, we propose the following suggestions for successful adoption of the CS registration system. First, training objectives should be determined for the future expansion of CS registration institutions or cancer types. The following are proposed as factors for successful CS registration expansion: one-on-one customized training, genetic tests for SSF registration, pathology-related training, and customized group clinical and integrative training. Second, data collection needs to be complete to ensure the validity of the CS registration data. Insufficient data, due to missing tests for registration items, have a negative influence on the validity of the data. To solve this problem, flexible expansion of items is needed and can be achieved by limiting CS registration items to only the essential ones or by putting priority on only those items needed for domestic registration. For other cancer types, registration items need to be selected only after checking the current conditions of the medical institutions and their ability to collect data for the given items.
We can draw the following implications from the present study. First, the use of CS registration in hospitals can be expanded by developing supporting materials to promote the active participation of hospitals. Second, education programs for medical record administrators in participating hospitals can be developed. Third, CS registration information can be utilized as the basis for developing an electronic data-processing environment used to extract information. Finally, the study results can be used as base material for the development of a CS registration format, which can help improve the validity of CS item registration.
Figure 1. Flow Chart of Data Collection of CS Registration for Stomach Cancer Cases
DOI: http://dx.doi.org/10.7314/APJCP.2014.15.21.9529

Table 3. Detailed Items Distribution of Registered Stomach Cancers (233 Cases)
a NOS=Not Otherwise Specified *Except for M codes above

Table 4. Distribution of Information for CS Registration by Medical Record Sources
medical scientists, who use them for study, and clinical doctors, who use them for treatment and prognosis.