Assessing the Impact of Socio-economic Variables on Breast Cancer Treatment Outcome Disparity

Surveillance, Epidemiology, and End Results (SEER) (http://seer.cancer.gov/) is a public-use cancer registry of the United States of America (U.S.A.). Cancer is a major human burden. One out of three women and one out of two men in the U.S.A. will develop cancer in a lifetime (Siegel et al., 2012). Breast cancer is the most common cancer among women. It is estimated that 226,870 women will be diagnosed with and 39,510 women will die of cancer of the breast in 2012. The age-adjusted death rate was 23.0 per 100,000 women per year. The age-adjusted incidence rate was 124.3 per 100,000 women per year. These rates are based on cases diagnosed in 2005-2009 from 18 SEER geographic areas (http://seer.cancer.gov/statfacts/html/breast.html). SEER is funded by the National Cancer Institute and the Centers for Disease Control to cover 28% of all oncology cases in the U.S.A. SEER started collecting data in 1973 for 7 states and metropolitan registries. Its main purpose remains to decrease the burden of cancer by collecting and distributing data on cancer. SEER data are widely used as a benchmark data source for monitoring cancer outcomes in the U.S.A. and in other countries (Shavers et al., 2003; Wampler et al., 2005; Gross et al., 2008; Lund et al., 2008; Schootman et al., 2009; Downing et al., 2010; Martinez et al., 2010; Rudat et al., 2012; Schlichting et al., 2012; Yao et al.,


Introduction
Surveillance, Epidemiology, and End Results (SEER) (http://seer.cancer.gov/) is a public-use cancer registry of the United States of America (U.S.A.). Cancer is a major human burden. One out of three women and one out of two men in the U.S.A. will develop cancer in a lifetime (Siegel et al., 2012). Breast cancer is the most common cancer among women. It is estimated that 226,870 women will be diagnosed with and 39,510 women will die of cancer of the breast in 2012. The age-adjusted death rate was 23.0 per 100,000 women per year. The age-adjusted incidence rate was 124.3 per 100,000 women per year.

Min Rex Cheung
2012). Because of the uniformity and scope of the socioeconomic data collected by SEER, its data are ideal for identifying disparities in oncology outcomes across different geographical and cultural areas (Harlan et al., 1995; Shavers et al., 2003; Wampler et al., 2005; Gross et al., 2008; Lund et al., 2008; Downing et al., 2010; Martinez et al., 2010; Martinez et al., 2012; Schlichting et al., 2012; Yao et al., 2012). This study focuses on the disparity in breast cancer treatment outcome in the state of Georgia in the U.S.A. Georgia is a relatively typical American state. By focusing on one state, this study aimed to find out whether unintended socio-economic factors affect breast cancer outcome, using a relatively homogeneous, medium-sized state as a model. Similar epidemiology studies focusing on more homogeneous areas have also been done in Australia (Roder et al., 2012; Roder et al., 2013). We used SEER 18 data, which have covered Atlanta and rural Georgia since 1974 and Greater Georgia since 2010. In particular, we sought to identify the socio-economic factors that contribute to worse outcome in breast cancer treatment, with the goal of eliminating the disparity in the future.

Materials and Methods
The Georgia cancer registry data were obtained from the SEER 18 database. The SEER registry has a massive amount of data available for analysis; however, manipulating these data can be challenging. SEER Clinical Outcome Prediction Expert (SCOPE) was designed and implemented to mine SEER data and construct accurate and efficient prediction models (Cheung, 2012).
SEER is a public-use database that can be analyzed without institutional review board approval. SEER*Stat (http://seer.cancer.gov/seerstat/) was used for listing the cases diagnosed in the years 2004, 2005, 2006, 2007, 2008 and 2009. The last update of the SEER data was in November 2011 and incorporated some of the 2010 U.S. census data.
This study examined a long list of socio-economic factors (SEFs), staging factors and treatment factors available in the SEER database, with the goal of identifying the SEFs that best explain the disparity in breast cancer specific survival (COD='breast' in SEER). We used the receiver operating characteristic (ROC) curve to select the best pretreatment univariate predictors for further analyses (Cheung et al., 2001a; 2001b). Similar strata were fused to make more efficient models if the ROC performance did not degrade (Cheung et al., 2001a; 2001b). Survival analysis was used to compute breast cancer specific survival over time, the Kolmogorov-Smirnov 2-sample test was used to compare two survival curves, and Cox multivariate analysis was performed to ascertain whether SEFs were independently prognostic. To estimate the relative importance of SEFs versus traditional factors by Cox regression, the most significant pretreatment factors were used in a multivariate Cox model: AJCC (American Joint Committee on Cancer) 2006 stage, estrogen receptor status (0=positive, 1=otherwise), progesterone receptor status (0=positive, 1=otherwise), race (0=non African American, 1=African American) and education attainment of the county of residence (0=more than 25% college graduates, 1=otherwise). These factors were scored as 1 for high-risk groups and 0 otherwise, as indicated. All statistical analyses were performed in Matlab (http://www.mathworks.com/matlabcentral/fileexchange/authors/37883).
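The ROC-based screening of univariate predictors described above can be illustrated with a minimal Python sketch. It is not the Matlab code used in the study. It assumes that the outcome is reduced to a binary indicator (death from breast cancer or not), which ignores follow-up time and censoring, and it computes the ROC area as the Mann-Whitney probability that a patient with the event has a higher risk score than a patient without it. The function name `roc_area` and the toy data are hypothetical.

```python
def roc_area(scores, events):
    """Area under the ROC curve via the Mann-Whitney statistic.

    scores: ordinal risk scores (higher means higher predicted risk)
    events: 1 if the patient died of breast cancer, 0 otherwise
    """
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # concordant pair
            elif p == n:
                wins += 0.5   # tied pair counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not SEER data): stage group 1-3 and death indicator.
toy_stage = [1, 1, 2, 2, 3, 3, 1, 2]
toy_death = [0, 0, 0, 1, 1, 1, 0, 0]
toy_auc = roc_area(toy_stage, toy_death)
```

An area of 0.5 corresponds to a predictor with no discrimination and 1.0 to perfect separation, which is the scale on which the ROC areas in the Results are reported.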

Results
We analyzed 34,671 breast cancer cases diagnosed from 2004 to 2009 obtained from the SEER database (Figure 1A). We used the receiver operating characteristic curve (Hanley and McNeil, 1982) to study the performance of various univariate predictors. We found that race/ethnicity, county percentage of college graduates, American Joint Committee on Cancer staging (according to the AJCC 6th edition manual), estrogen receptor (ER) status and progesterone receptor (PR) status were discriminating predictors that could be used to build multivariate models.
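For readers who want to see how an uncertainty can be attached to an ROC area, the following Python sketch implements the standard error approximation published by Hanley and McNeil (1982). Whether the S.D. values reported in this paper were obtained with exactly this formula is an assumption; the function name `hanley_mcneil_se` and its arguments are hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_events, n_nonevents):
    """Approximate standard error of an ROC area (Hanley and McNeil, 1982).

    auc:         estimated area under the ROC curve
    n_events:    number of patients with the event
    n_nonevents: number of patients without the event
    """
    q1 = auc / (2.0 - auc)              # P(two events both outrank one non-event)
    q2 = 2.0 * auc * auc / (1.0 + auc)  # P(one event outranks two non-events)
    variance = (
        auc * (1.0 - auc)
        + (n_events - 1) * (q1 - auc * auc)
        + (n_nonevents - 1) * (q2 - auc * auc)
    ) / (n_events * n_nonevents)
    return math.sqrt(variance)
```

The standard error shrinks as either group grows, which is consistent with the small S.D. values that a cohort of this size would be expected to produce.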
The American Joint Committee on Cancer (AJCC) staging had the highest ROC area (S.D.) of 0.83 (0.004) among the factors tested. The overall surviving fraction for the selected cases was 85% (Figure 1A). About one third of the deaths were not related to the diagnosis of breast cancer. Therefore, this study explicitly built models to predict cause specific survival as opposed to overall survival.
The AJCC model of breast cancer was fed into SCOPE, which successively tested whether adjacent strata could be merged (implemented as a subroutine of SCOPE) based on the ROC areas. The 7 risk strata of the AJCC metastatic model were simplified to 3 strata without sacrificing accuracy (ROC area (S.D.) of 0.82 (0.01)). The second best model was the AJCC non-metastatic model (ROC area (S.D.) of 0.768 (0.01)). The next tier of predictive models comprised the biological ER/PR model (ROC area (S.D.) of 0.656 (0.009)), the treatment model of surgery/radiotherapy receipt (ROC area (S.D.) of 0.654 (0.005)) and the socio-economic race/county % college bivariate model (ROC area (S.D.) of 0.652 (0.004)). Among the SEFs tested, race/ethnicity and college education attainment in a county were the most predictive, and the two SEFs combined had better ROC performance than either one alone.
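The idea of fusing adjacent strata while watching the ROC area can be sketched as a greedy procedure. The Python code below is only an illustration under stated assumptions and is not the SCOPE subroutine: it assumes a binary outcome, a hypothetical tolerance of 0.01 on the ROC area relative to the unmerged model, and a rule that accepts, at each round, the adjacent merge that keeps the ROC area highest. All names are hypothetical.

```python
def roc_area(scores, events):
    """ROC area as the Mann-Whitney concordance; ties count as half."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def merge_adjacent_strata(strata, events, tolerance=0.01):
    """Greedily fuse neighbouring strata while the ROC area stays within
    `tolerance` of the unmerged model. Returns a dict mapping each
    original stratum to its merged group index; never goes below 2 groups.
    """
    levels = sorted(set(strata))
    mapping = {lvl: i for i, lvl in enumerate(levels)}
    baseline = roc_area([mapping[s] for s in strata], events)
    n_groups = len(levels)
    merged = True
    while merged and n_groups > 2:
        merged = False
        best = None
        for g in range(n_groups - 1):
            # Candidate: fuse group g with g + 1, shifting higher groups down.
            trial = {lvl: (idx if idx <= g else idx - 1)
                     for lvl, idx in mapping.items()}
            auc = roc_area([trial[s] for s in strata], events)
            if auc >= baseline - tolerance and (best is None or auc > best[0]):
                best = (auc, trial)
        if best is not None:
            mapping = best[1]
            n_groups -= 1
            merged = True
    return mapping


# Hypothetical toy data: four strata, events only in the two highest.
toy_strata = [0, 0, 1, 1, 2, 2, 3, 3]
toy_events = [0, 0, 0, 0, 1, 1, 1, 1]
toy_mapping = merge_adjacent_strata(toy_strata, toy_events)
# -> {0: 0, 1: 0, 2: 1, 3: 1}
```

In the toy data, strata 0 and 1 carry no events and strata 2 and 3 carry only events, so the four strata collapse to two groups with no loss of ROC area, while merging strata 1 and 2 is rejected because it would lower the area.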
Figure 1B shows the surviving proportions when the patients were separated by race. African American patients (n=9240) had statistically worse survival outcome (Kolmogorov-Smirnov 2-sample test: h=1; p=1.6224e-09; k=0.5459). Figure 1C shows the proportions of surviving patients separated by education attainment of the county of residence. Patients living in less educated areas (n=19764) had statistically worse survival outcome (Kolmogorov-Smirnov 2-sample test of college education attainment of county of residence: h=1; p=0.0300; k=0.2429). The difference in cause specific survival at 60 months was about 2% for county education level and 10% for race/ethnicity (Figure 1B and 1C). AJCC 2006 stage was the most predictive factor tested in this study for breast cancer specific survival (Figure 1D). Table 1 shows the Cox proportional hazards multivariate analysis. Multivariate analysis demonstrated that the SEFs race and college degree attainment of the county of residence added independently significant information even when modeled together with AJCC stage, ER status and PR status. In counties with less than 25% college graduates, 56% of patients did not receive radiotherapy, whereas in counties with higher education attainment, 46% of patients received radiotherapy. The comparable percentages of black patients receiving and not receiving radiotherapy were 35% and 34%, respectively. Thus the predictive power of the black/college bivariate model could only be partially explained by whether the patients received radiotherapy. We hypothesize that patients living in less educated counties fared less well in part because fewer of them received radiotherapy.
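The statistic reported as k above is the largest vertical gap between the two empirical distribution functions being compared. The Python sketch below computes that gap for two samples. It is a generic illustration rather than the Matlab routine used in the study, and it does not reproduce the p-value or the decision flag h. The function name `ks_statistic` is hypothetical, and which quantities were passed to the test in the original analysis is not specified here.

```python
def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical distribution functions of x and y."""
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        v = min(xs[i], ys[j])
        # Step past every observation equal to v in both samples (handles ties).
        while i < n1 and xs[i] == v:
            i += 1
        while j < n2 and ys[j] == v:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    # Once one sample is exhausted the gap can only shrink, so d is final.
    return d
```

A value of 0 means the two empirical distributions coincide and a value of 1 means they do not overlap at all; the null hypothesis is rejected (h=1) when the statistic is large relative to the sample sizes.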

Discussion
Using SEER data to study the effects of radiotherapy for breast cancer is an active area of investigation. The newest SEER data (November 2011) have incorporated the 2010 U.S. census. However, the newly available socio-economic data have not been used in recent studies (Agarwal et al., 2012; Jagsi et al., 2012; Martinez et al., 2012; Sail et al., 2012; Yan et al., 2012). AJCC staging was found to be the best biological model to predict breast cancer specific survival. For comparison, the ROC area in predicting PSA failure based on Gleason score, T-stage and PSA was about 0.75 in our previous studies (Cheung et al., 2001a; 2001b). Hormonal status (Davies et al., 2011) has been found to be predictive of breast cancer outcome in previous studies, and this was confirmed here. In breast cancer, race predicted a decrement of about 10% in cause specific survival at 5 years (Figure 1B) and was significant in multivariate analysis with AJCC stage and hormonal receptor status (Table 1).
The Georgia cancer registry (one of the SEER 18 registries) was used in our current study. This state was used as a model for several reasons. The entire state of Georgia is now covered by the SEER registry. While Georgia is a medium-sized state that provides relative social and economic uniformity, it also has significant social and economic variations that can be used as a model to study the impact of socio-economic factors on oncology outcome.
It has been suggested that disparity in the use of post-operative radiotherapy may have an impact on the survival of advanced breast cancer patients (Martinez et al., 2012). More studies are needed on the impact of socio-economic factors. For example, half of the breast cancer patients in this cohort (Table 1) did not receive radiotherapy. We investigated the disparity in outcome due to race and percentage of college graduates in the county in relation to receipt of post-operative radiation treatment. Lack of radiotherapy has been shown to be associated with inferior outcomes in other studies (Dragun et al., 2012; Feltner et al., 2012; Yao et al., 2012). We found that breast cancer patients living in less educated counties were at a disadvantage in terms of cause specific survival. Based on our data, we suggest that educating the public and patients about the utility of radiotherapy in the treatment of breast cancer may improve the frequency of receipt of radiotherapy and potentially the cause specific survival. Of note, a 10% improvement in 5-year actuarial cause specific survival would be more than the benefit of most chemotherapy regimens and the same as the benefit at 15 years of post-operative radiotherapy after breast conservation therapy for patients with 3 or fewer positive lymph nodes (Clarke et al., 2008; Voordeckers et al., 2009; Beal et al., 2010).

Figure 1. A) Cause Specific Survival of Breast Cancer; B) Survival Plots by Race; C) Survival Plots by County Education Attainment; and D) Survival Plots by AJCC 2006 Stage