Case Report - Annals of Cardiovascular and Thoracic Surgery (2019) Volume 2, Issue 1
External validation of European System for Cardiac Operative Risk Evaluation II in a Tunisian population
Chighaly El Hadj Sidi*, Imen Mgarrech, Amine Tarmiz, Sofiane Jerbi
Department of Cardiovascular and Thoracic Surgery, Sahloul University Hospital, Sousse, Tunisia
Corresponding Author:
Chighaly El Hadj Sidi
Department of Cardiovascular and Thoracic Surgery
Sahloul University Hospital, Sousse Tunisia
E-mail: [email protected]
Accepted date: January 29, 2019
Citation: Sidi CEH, Mgarrech I, Tarmiz A, et al. External validation of European system for cardiac operative risk evaluation II in a Tunisian population. Ann Cardiovasc Thorac Surg. 2019;2(1):10-17.
Objective: The main objective of this study is to evaluate the performance of the EuroSCORE II predictive model in a Tunisian population in order to validate its use in our country.
Methods: This is a retrospective study of data from 418 adult patients who underwent cardiac surgery with cardiopulmonary bypass between 1 January 2015 and 31 December 2016 in the Department of Cardiovascular and Thoracic Surgery of the Sahloul University Hospital of Sousse. EuroSCORE II is calculated using the validated application available at www.euroscore.org. The performance of the score is evaluated by analyzing its discriminative power with the ROC curve and its calibration with the Hosmer-Lemeshow statistic.
Results: EuroSCORE II shows good discriminative power in our population, with an area under the ROC curve >0.7 in all study groups (0.864 ± 0.032 for general cardiac surgery, 0.822 ± 0.061 for coronary surgery, 0.864 ± 0.052 for valvular surgery, and 0.900 ± 0.041 for urgent cardiac surgery). The model also appears to be well calibrated, with p values above the statistical significance level of 0.05 (0.638 for general cardiac surgery, 0.543 for coronary surgery, 0.179 for valvular surgery, and 0.082 for urgent cardiac surgery).
Conclusion: EuroSCORE II presents acceptable performance in our population, attested by good discriminative power and adequate calibration.
Keywords: Cardiac surgery, EuroSCORE II, Surgical mortality, Discriminative power, Calibration
Abbreviations: AUC: Area Under the Curve; CABG: Coronary Artery Bypass Grafting; CI: Confidence Interval; DF: Degrees of Freedom; EuroSCORE: European System for Cardiac Operative Risk Evaluation; MI: Myocardial Infarction; MVR: Mitral Valve Replacement; n: Number; NYHA: New York Heart Association; ROC: Receiver Operating Characteristic; SMR: Standardized Mortality Ratio; STS: Society of Thoracic Surgeons; χ2: Chi-Square
In recent years, adult cardiac surgery has experienced a significant increase in operative risk due to the recruitment of an increasingly elderly population with multiple comorbidities. At the same time, it has benefited from improved surgical techniques and postoperative intensive care. Despite these technical advances and accumulated knowledge, it remains a high-risk surgery, burdened with many potentially fatal complications.
Risk scores in cardiac surgery are intended to estimate operative mortality according to the characteristics of the patient and the modalities of the surgery. They therefore have an important role in estimating the benefit/risk ratio of interventions and in informing the patient, thus guiding the therapeutic choice. These scores are also useful for comparing postoperative outcomes and improving the quality of care in cardiovascular facilities. They have the advantage of reducing the subjectivity of operative risk estimation, but must be interpreted with caution and can never be a substitute for clinical judgment.
Many predictive models have been proposed and used for cardiac surgery. The most widely used currently are the STS (Society of Thoracic Surgeons) score, the main score applied in North America, and EuroSCORE (European System for Cardiac Operative Risk Evaluation), which is the most used model in Europe. The EuroSCORE II, published in 2012, was developed on a database of 154 centers in 43 predominantly European countries. It must therefore be tested and validated in developing countries such as Tunisia before being used as a risk stratification model, as a source of information for patients seeking care, or as a tool for monitoring and evaluating cardiac surgery services.
To our knowledge, EuroSCORE II has not been validated in Tunisia. In this work, we aimed to evaluate the performance of this risk stratification model on a Tunisian population in order to validate its use in our country.
Patients and Methods
This is an observational, cross-sectional, retrospective study conducted in the Department of Cardiovascular and Thoracic Surgery at the Sahloul University Hospital of Sousse. It focuses on adult patients who underwent cardiac surgery with cardiopulmonary bypass (CPB) over a period of 2 years, from 1 January 2015 to 31 December 2016.
We included in our study all adult patients who had cardiac surgery under cardiopulmonary bypass, with or without aortic clamping. In the end, 418 patients were included in this study; they were enrolled and followed up to the 30th postoperative day.
Patient data were collected from the archived records department at Sahloul University Hospital, covering the factors included in EuroSCORE II. EuroSCORE II was calculated for each patient using the validated application on www.euroscore.org.
The statistical analysis was done using SPSS software version 20.0. Quantitative data are presented as means ± standard deviations and qualitative variables as numbers and percentages.
Comparisons were made using Pearson's χ2 test for proportions and Student's t-test for means. A univariate analysis was used to identify independent predictors of hospital mortality, and a p value less than 0.05 was set as the statistical significance level.
The basic overall performance parameter was the Standardized Mortality Ratio (SMR) calculated according to the formula:
SMR = Observed mortality ÷ Expected mortality
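As a worked illustration of this ratio, the short Python sketch below recomputes the SMR from the rounded mortality percentages reported in this article for the total population (9.3% observed, 3.25% expected). The function name `standardized_mortality_ratio` is a hypothetical helper introduced only for this example; it is not part of the SPSS analysis used in the study.

```python
def standardized_mortality_ratio(observed: float, expected: float) -> float:
    """Return observed mortality divided by expected mortality.

    Both arguments must share the same unit (both percentages or both counts).
    """
    if expected <= 0:
        raise ValueError("expected mortality must be positive")
    return observed / expected


# Rounded percentages reported in this study for the total population.
smr_total = standardized_mortality_ratio(observed=9.3, expected=3.25)
print(f"SMR (total population): {smr_total:.2f}")
```

An SMR above 1 means that more deaths were observed than the model predicted, while a value close to 1 indicates agreement between observed and predicted mortality.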
The analysis of the validity of the score was carried out by two approaches:
Study of discriminative power by constructing the ROC curve, which plots the false-positive rate (1 - specificity) on the x-axis against the true-positive rate (sensitivity) on the y-axis. The area under the ROC curve (AUC) was obtained according to the method of Hanley and McNeil.
Study of calibration using the Hosmer-Lemeshow goodness-of-fit test, followed by construction of the calibration plot.
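The discrimination step can be illustrated with a minimal sketch. It computes the AUC as the proportion of (deceased, survivor) pairs in which the deceased patient received the higher predicted risk, counting ties as one half, which is equivalent to the area under the empirical ROC curve. The function name `roc_auc` and the toy scores are hypothetical and are not study data; the study itself obtained its AUC values with SPSS.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """AUC as the probability that a death outranks a survivor (ties count 0.5)."""
    deaths = [s for s, d in zip(scores, died) if d == 1]
    survivors = [s for s, d in zip(scores, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for risk_dead in deaths:
        for risk_alive in survivors:
            if risk_dead > risk_alive:
                wins += 1.0
            elif risk_dead == risk_alive:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))


# Hypothetical predicted risks (not study data) and outcomes (1 = died).
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_died = [0, 0, 1, 1]
print(roc_auc(toy_scores, toy_died))
```

A value of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every death above every survivor.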
The study included 418 patients who underwent cardiac surgery, 245 men (58.6%) and 173 women (41.4%), with a sex ratio of 1.4. The mean age is 55.84 ± 13.84 years, with extremes ranging from 18 to 87 years. Women are younger (55 ± 14 years) than men (57 ± 13 years), with no significant difference (p=0.09).
These patients underwent different types of heart surgery. Table 1 shows the frequency of the various cardiac surgical procedures and their corresponding mortalities in the validation study. Table 2 shows the distribution of risk factors in our population and their relationship to mortality (a p value <0.05 is considered statistically significant).
|Type of surgery||Frequency||Mortality|
|Coronary surgery||160 (38.3%)||11 (6.8%)|
|Valvular surgery||204 (48.8%)||17 (8.3%)|
|Mixed valvulo-coronary surgery||16 (3.8%)||3 (18.7%)|
|Surgery of the thoracic aorta||26 (6.2%)||7 (26.9%)|
|Correction of congenital heart disease||7 (1.7%)||0 (0%)|
|Resection of a heart tumor||4 (1%)||1 (25%)|
|Removal of an endocavitary pacemaker lead||1 (0.2%)||0 (0%)|
Table 1: Frequencies of cardiac surgical procedures and their mortalities in our population.
|Age||55.84 ( ± 13.84)||0.361|
|Diabetes on insulin||54 (12.9%)||0.629|
|Extracardiac arteriopathy||48 (11.5%)||0.063|
|Previous cardiac Surgery||20 (4.8%)||0.916|
|Poor mobility||10 (2.4%)||0.305|
|Chronic lung disease||12 (2.9%)||0.058|
|Recent MI||38 (9.1%)||0.043|
|Angina at rest||50 (12%)||0.025|
|NYHA||Grade I||20 (4.8%)||<0.001|
|Grade II||212 (50.7%)|
|Grade III||168 (40.2%)|
|Grade IV||18 (4.3%)|
|Creatinine clearance||>85 ml/min||204 (48.8%)||<0.001|
|51-85 ml/min||156 (37.3%)|
|<51 ml/min||54 (12.9%)|
|Critical preoperative state||12 (2.9%)||<0.001|
|Pulmonary hypertension||<31 mmHg||||0.008|
|31 - 55 mmHg||124 (29.9%)|
|>55 mmHg||60 (14.1%)|
|Active endocarditis||25 (6.0%)||0.009|
|Weight of the intervention||Isolated CABG||159 (38.0%)||0.379|
|Single non CABG||151 (36.1%)|
|2 procedures||92 (22.1%)|
|3 or more||16 (3.8%)|
|Surgery on thoracic aorta||26 (6.2%)||0.001|
Table 2: Distribution of risk factors and their relationship to mortality.
Of the 418 patients in our study, 39 died, giving an overall mortality rate of 9.3%.
The mean age of the deceased patients was 60 ± 14 years versus 55 ± 14 years in the survivors, without significant difference (p=0.361). Of the deceased, 61.5% are male and 38.5% are female, without significant difference (p=0.697). The observed mortality is 6.8% in the coronary subgroup, 8.3% in the valvular subgroup and 23.3% in the urgency subgroup.
Validation of EuroSCORE II
Standardized mortality ratio
The mortality predicted by EuroSCORE II in the total population (3.25%) is significantly lower (p<0.001) than the observed mortality (9.3%), so that the SMR is 2.86.
In the coronary subgroup, the mortality predicted by EuroSCORE II (2.32%) is lower than the observed mortality (6.8%) without statistical significance (p=0.052), so that the SMR is 2.93, whereas in the valvular subgroup the predicted mortality (3.39%) is significantly (p<0.001) lower than the observed mortality (8.3%), with an SMR of 2.44.
The mortality predicted in the urgency subgroup (6.99%) is lower than the observed mortality (23.3%), but not significantly so (p=0.335); the SMR is 3.33.
The discriminative power of EuroSCORE II was estimated by the area under the ROC curve (AUC). It seems to have good discrimination in the total population as well as in all subgroups studied: the area under the curve is 0.864 ± 0.032 (95% CI 0.801 to 0.927, p<0.001) for the total population, 0.822 ± 0.061 (95% CI 0.703 to 0.941, p=0.001) for the coronary subgroup, 0.864 ± 0.052 (95% CI 0.762 to 0.967, p<0.001) for the valvular subgroup, and 0.900 ± 0.041 (95% CI 0.819 to 0.981, p<0.001) for the urgency subgroup (Figure 1).
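The reported limits are approximately consistent with a normal approximation, AUC ± 1.96 × standard error. The sketch below assumes that approximation (the article does not state how SPSS derived its limits) and recomputes the interval for the total population; `auc_confidence_interval` is a hypothetical helper name.

```python
from typing import Tuple


def auc_confidence_interval(auc: float, se: float, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval, clipped to the valid AUC range [0, 1]."""
    lower = max(0.0, auc - z * se)
    upper = min(1.0, auc + z * se)
    return lower, upper


# AUC and standard error reported in this study for the total population.
low, high = auc_confidence_interval(auc=0.864, se=0.032)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Applied to the subgroups, the same calculation may differ from the published limits in the last digit, presumably because the published AUC and standard error are themselves rounded.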
To assess calibration, we performed a Hosmer-Lemeshow goodness-of-fit test, which gives a χ2 value of 4.28, with a df of 6 and a p value of 0.638, in the total population. The test gives a χ2 value of 2.14 with a df of 3 and a p value of 0.543 for the coronary subgroup, a χ2 value of 4.90 with a df of 3 and a p value of 0.179 for the valvular subgroup, and a χ2 value of 6.70 with a df of 3 and a p value of 0.082 for the urgency subgroup.
EuroSCORE II also seems to have good calibration in the total population and the coronary subgroup, but less good calibration in the other two subgroups (although the p value remains greater than 0.05).
Table 3 corresponds to the contingency table of the Hosmer-Lemeshow goodness-of-fit test in the total population, while Figure 2 illustrates the calibration plots in the total population and the coronary, valvular, and urgency subgroups.
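Each p value here follows from the χ2 statistic and its degrees of freedom through the upper tail of the chi-square distribution. As a minimal sketch using only the Python standard library, the function below evaluates that tail with the closed-form series for integer degrees of freedom; `chi2_sf` is a hypothetical helper name, and in practice a statistics package would normally be used instead.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Statistics reported in this study: (group, chi-square, degrees of freedom).
for group, chi2, df in [("total", 4.28, 6), ("coronary", 2.14, 3),
                        ("valvular", 4.90, 3), ("urgency", 6.70, 3)]:
    print(f"{group}: p = {chi2_sf(chi2, df):.3f}")
```

The values printed by this sketch should agree with the reported p values to within rounding of the published χ2 statistics.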
|Groups||n||Expected mortality||Observed mortality|
Table 3: Contingency table of the Hosmer-Lemeshow test in the total population.
Table 4 presents the distribution of the variables designed for the development of EuroSCORE II in our study as well as in the initial study conducted by Nashef et al.
|Age||55.84 (± 13.84)||64.6 (± 12.5)||<0.001|
|Female gender||173 (41.4%)||6919 (30.9%)||<0.001|
|Diabetes on insulin||54 (12.9%)||1705 (7.6%)||<0.001|
|Extracardiac arteriopathy||48 (11.5%)||N-A|
|Previous cardiac surgery||20 (4.8%)||N-A|
|Poor mobility||10 (2.4%)||713 (3.2%)||0.359|
|Chronic lung disease||12 (2.9%)||2384 (10.7%)||<0.001|
|Recent MI||38 (9.1%)||N-A|
|Angina at rest||50 (12%)||N-A|
|NYHA||Grade II||212 (50.7%)||N-A|
|Grade III||168 (40.2%)||N-A|
|Grade IV||18 (4.3%)||N-A|
|Creatinine clearance||51-85 ml/min||156 (37.3%)||N-A|
|<51 ml/min||54 (12.9%)||N-A|
|Dialysis||4 (1%)||244 (1.1%)||0.795|
|Critical preoperative state||12 (2.9%)||924 (4.1%)||0.199|
|Ejection fraction||31-50%||84 (20.1%)||N-A|
|Pulmonary hypertension||31-55 mmHg||124 (29.9%)||N-A|
|>55 mmHg||60 (14.1%)||N-A|
|Active endocarditis||25 (6%)||497 (2.2%)||<0.001|
|Urgency||Urgent||66 (15.8%)||4135 (18.5%)||0.160|
|Emergent||26 (6.2%)||972 (4.3%)||0.063|
|Salvage||11 (2.6%)||109 (0.5%)||<0.001|
|The weight of the intervention||Single non CABG||151 (36.1%)||N-A|
|2 procedures||92 (22.1%)||N-A|
|3 or more||16 (3.8%)||N-A|
|Surgery on thoracic aorta||26 (6.2%)||1636 (7.3%)||0.396|
Table 4: Comparison of patients’ characteristics between the original EuroSCORE II population and our population.
The average age of our population was 55.84 years and women accounted for 41.4%, against an average age of 64.6 years and a proportion of women of nearly a third in the EuroSCORE II population.
These differences may be related to the longer life expectancy and the lower incidence of rheumatic heart disease (more common among women in our country) in European countries. The external validation studies of EuroSCORE II in Western European [6-8] and Eastern European [3,9,10] countries gave results similar to those of the initial study, as did the study conducted by Borracci et al.
Since the EuroSCORE II population is about 9 years older than ours, it would be expected to have more comorbidities. Paradoxically, we found that diabetes on insulin was more common in our population, while chronic lung disease was more common in the population of the original study. Better management of diabetes mellitus and more widespread smoking (the leading risk factor for lung disease) in developed countries may explain these findings.
Although early surgery improves the vital prognosis in the active phase of infective endocarditis, as indicated by the work of Nagai et al., we found that the presence of this pathology is an independent factor of mortality after cardiovascular surgery.
A higher percentage of patients with active infectious endocarditis in our population, compared to the population of EuroSCORE II, can also be explained by the prevalence of valvular and infectious diseases in our country.
This factor has undoubtedly contributed to increasing our mortality.
With regard to the urgency of surgery, we found no significant difference between our results and those provided by the original study, with the exception of the salvage category, which is more common in our study. This difference may seriously affect our results, bearing in mind that the notion of urgency is subjective and not yet codified.
Table 5 summarizes the results of the EuroSCORE II performance analysis (SMR, discriminative power and calibration) in our study compared to those provided by the literature. Analyzing the overall performance of the model in our population, we found a high SMR throughout (2.86 for the total population, 2.93 for the coronary subgroup, 2.44 for the valvular subgroup and 3.33 for the urgency subgroup). This is very different from the results published by most cardiac surgery centers in recent years, whose SMRs approach 1, the value that indicates good overall performance.
|Atashi ||Iran||All||2581||NA||0.667||0.648- 0.685||936.66||<0.01|
Table 5: Review of the main results of the EuroSCORE II validation studies compared with our results.
Only a few series report results comparable to ours, such as those of Stavridis et al. (2.23) and Kar et al. (1.94) for all types of cardiac surgery, Laurent et al. for aortic valve replacement, Kalender et al. (6.77) for emergency coronary surgery, and De Oliveira et al. and Taamallah et al. for surgery for infective endocarditis.
Calculating the area under the ROC curve (AUC) according to the Hanley and McNeil method gives acceptable figures (0.864 for the total population, 0.822 for the coronary subgroup, 0.864 for the valvular subgroup, and 0.900 for the urgency subgroup), with confidence intervals whose lower limits are always greater than 0.7, the threshold above which a model is considered discriminating. These results are comparable to the value reported by the authors of the original EuroSCORE II article, 0.809 (0.782-0.836).
The majority of the external validation studies also showed results similar to those of the initial study, whereas the study carried out in Egypt by Amr et al. found disappointing results, with an AUC of 0.52. In light of these findings, the results of our work show that the discriminative power of EuroSCORE II is suited to our population for all groups studied: general cardiac surgery, coronary surgery, valvular surgery, and urgent cardiac surgery.
The results obtained by the Hosmer-Lemeshow goodness-of-fit test for evaluating the calibration of EuroSCORE II in our population require careful analysis. This test is currently debated because of its sensitivity to the number of groups and the size of the sample. It was used for this study because it was used in the internal validation of the model.
The Hosmer-Lemeshow goodness-of-fit test in our population shows small χ2 values, with p values above the 0.05 threshold for statistical significance in every group studied, although the value for the urgency subgroup approaches it (0.638 for the total population, 0.543 for the coronary subgroup, 0.179 for the valvular subgroup and 0.082 for the urgency subgroup). Therefore, there is no statistically significant difference between expected mortality and observed mortality.
These results contrast with those in the literature, which generally point to poor calibration of EuroSCORE II, in line with the results published in the initial article, which shows a p value (0.0505) very close to the threshold for statistical significance.
Other series show disappointing results with p values less than 0.05, such as those of Amr et al. in Egypt and Wang et al. in China, both conducted in patients undergoing valvular surgery, or the multicenter study by Grant et al., which included the largest number of patients undergoing emergency cardiac surgery (3342). These authors conclude that there is a significant difference between their observed mortality and their expected mortality.
In general, we can say that EuroSCORE II shows good calibration in our population, with the caveat of the small sample size.
Despite the differences in the profile of risk factors between the Tunisian population and the population constituting the database used for the development of EuroSCORE II, we can say that this risk model presents acceptable performances in our population, as evidenced by adequate discrimination and calibration.
However, the model underestimates mortality, especially in patients presumed to be at low risk.
At the end of this work, we emphasize the need for prospective and, above all, multicenter studies on larger samples before drawing definitive conclusions on the performance of this model in our country, or even before developing an adapted version.
1. Biancari F, Vasques F, Mikkola R, et al. Validation of EuroSCORE II in patients undergoing coronary artery bypass surgery. Ann Thorac Surg. 2012;93(6):1930-5.
2. Ad N, Holmes SD, Patel J, et al. Comparison of EuroSCORE II, original EuroSCORE, and the Society of Thoracic Surgeons risk score in cardiac surgery patients. Ann Thorac Surg. 2016;102(2):573-9.
3. Koszta G, Sira G, Szatmári K, et al. Performance of EuroSCORE II in Hungary: A single-centre validation study. Heart Lung Circ. 2014;23(11):1041-50.
4. García VA, Mestres CA, Bernabeu E, et al. Validation and quality measurements for EuroSCORE and EuroSCORE II in the Spanish cardiac surgical population: A prospective, multicentre study. Eur J Cardiothorac Surg. 2016;49(2):399-405.
5. Nashef SAM, Roques F, Sharples LD, et al. EuroSCORE II. Eur J Cardiothorac Surg. 2012;41(4):734-44.
6. Barili F, Pacini D, Capo A, et al. Does EuroSCORE II perform better than its original versions? A multicentre validation study. Eur Heart J. 2013;34(1):22-9.
7. Carnero AM, Guisasola JAS, Lacruz FJR, et al. Validation of EuroSCORE II on a single-centre 3800 patient cohort. Interact Cardiovasc Thorac Surg. 2013;16(3):293-300.
8. Chalmers J, Pullan M, Fabri B, et al. Validation of EuroSCORE II in a modern cohort of patients undergoing cardiac surgery. Eur J Cardiothorac Surg. 2013;43(4):688-94.
9. Stavridis G, Panaretos D, Kadda O, et al. Validation of the EuroSCORE II in a Greek cardiac surgical population: A prospective study. Open Cardiovasc Med J. 2017;11:94-101.
10. Nezic D, Spasic T, Micovic S, et al. Consecutive observational study to validate EuroSCORE II performances on a single-center, contemporary cardiac surgical cohort. J Cardiothorac Vasc Anesth. 2016;30(2):345-51.
11. Borracci RA, Rubio M, Celano L, et al. Prospective validation of EuroSCORE II in patients undergoing cardiac surgery in Argentinean centres. Interact Cardiovasc Thorac Surg. 2014;18(5):539-43.
12. Atashi A, Amini S, Tashnizi MA, et al. External validation of European System for Cardiac Operative Risk Evaluation II (EuroSCORE II) for risk prioritization in an Iranian population. Braz J Cardiovasc Surg. 2018;33(1):40-6.
13. Kar P, Geeta K, Gopinath R, et al. Mortality prediction in Indian cardiac surgery patients: Validation of European System for Cardiac Operative Risk Evaluation II. Indian J Anaesth. 2017;61(2):157-62.
14. Pillai BS, Baloria KA, Selot N. Validation of the European System for Cardiac Operative Risk Evaluation II model in an urban Indian population and comparison with three other risk scoring systems. Ann Card Anaesth. 2015;18(3):335-42.
15. Nagai T, Takase Y, Hamabe A, et al. Observational study of infective endocarditis at a community-based hospital: Dominance of elderly patients with comorbidity. Intern Med. 2018;57(3):301-10.
16. Laurent M, Fournet M, Feit B, et al. Simple bedside clinical evaluation versus established scores in the estimation of operative risk in valve replacement for severe aortic stenosis. Arch Cardiovasc Dis. 2013;106(12):651-60.
17. Kalender M, Adademir T, Tasar M, et al. Validation of EuroSCORE II risk model for coronary artery bypass surgery in high-risk patients. Kardiochir Torakochirurgia Pol. 2014;11(3):252-56.
18. De Oliveira JLR, Dos Santos MA, Arnoni RT, et al. Mortality predictors in the surgical treatment of active infective endocarditis. Braz J Cardiovasc Surg. 2018;33(1):32-39.
19. Taamallah K, Ibala W, Ghodhbane W, et al. Value of EuroSCORE II to predict operative mortality in infectious endocarditis surgery. Tunis Med. 2017;95(7):471-76.
20. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148(3):839-43.
21. Amr MA, El-shorbagy AAM. Evaluation of accuracy of EuroSCORE II in prediction of in-hospital mortality in patients underwent mitral valve replacement in Egypt. Journal of the Egyptian Society of Cardio-Thoracic Surgery. 2016;24(2):135-42.
22. Lemeshow S, Hosmer DW. A review of goodness of fit statistics for use in the development of logistic regression models. Am J Epidemiol. 1982;115(1):92-106.
23. Wang C, Li X, Lu F, et al. Comparison of six risk scores for in-hospital mortality in Chinese patients undergoing heart valve surgery. Heart Lung Circ. 2013;22(8):612-17.
24. Grant SW, Hickey GL, Dimarakis I, et al. Performance of the EuroSCORE models in emergency cardiac surgery. Circ Cardiovasc Qual Outcomes. 2013;6(2):178-85.