Predicting American Board of Emergency Medicine Qualifying Examination Passage Using United States Medical Licensing Examination Step Scores

Background: The objective of the current study was to determine whether emergency medicine residents' United States Medical Licensing Examination (USMLE) scores are significantly associated with first-attempt passage of the American Board of Emergency Medicine (ABEM) qualifying (written) examination. We hypothesized that USMLE Step 2 Clinical Knowledge (CK) scores would be useful in predicting students who passed the ABEM qualifying examination on their first attempt. Methods: For this retrospective cohort study, we examined the data of residents who successfully completed training at two emergency medicine residency programs between the years 2002-2013. Because scores on the USMLE Step examinations varied greatly across years, we obtained means and standard deviations from the National Board of Medical Examiners. We subtracted the mean score for the year each resident took the examination from the resident's examination score, creating centered Step 1 and centered Step 2 CK scores. Results: A multivariate logistic regression analysis indicated that centered Step 2 CK scores could be used to predict the odds of passing the ABEM qualifying examination (odds ratio = 1.05 [95% confidence interval 1.02 to 1.08, P < 0.001]). Using a Step 2 CK score cutoff of 7 points lower than the mean yielded 64% sensitivity and 81% specificity for predicting passage of the ABEM written examination on the first attempt. Conclusion: Program directors and selection committees may wish to consider whether applicants' Step 2 CK scores are near the national average when making ranking decisions, as this variable is highly predictive of passing the ABEM qualifying examination on the initial attempt.


INTRODUCTION
In the 2017 National Resident Matching Program, a total of 2,047 emergency medicine positions were offered, and the number of applicants was 2,703. 1 This discrepancy between the number of applicants and the number of available positions results in applicants applying to many residency programs to maximize their potential for matching. In 2013, the Accreditation Council for Graduate Medical Education (ACGME) mandated that emergency medicine residency programs demonstrate an 80% first-time passage rate on the American Board of Emergency Medicine (ABEM) qualifying (written) examination. 2 Consequently, selection committees want to identify applicants who will be able to pass the certifying examination on their initial attempt. Residency program directors and selection committees review large numbers of applications from medical students with various backgrounds and training and select a small percentage of competitive applicants to interview. Few standardized metrics are available to help identify applicants who will be successful.
Because all medical students complete the United States Medical Licensing Examination (USMLE), residency program directors place an emphasis on these scores during the selection process, particularly when selecting applicants to invite to interview. 3 Selection committees and program directors assume that USMLE scores are good predictors of resident outcomes. Evidence indicates that USMLE scores are associated with emergency medicine residency in-training examination (ITE) scores, 4 and ITE scores correlate with certifying examination scores. Studies from other medical specialties indicate that USMLE scores are significantly associated with performance on qualifying examinations in internal medicine, 5 pathology, 6 orthopedic surgery, 7 and the American Board of Surgery. 8 Maker et al reported that USMLE Step 2 scores are also associated with passage of the American Board of Surgery oral certifying examination. 9 Only one study has examined the relationship between USMLE Step 1 and Step 2 scores and the ABEM qualifying examination. 10 The authors reported that a Step 1 score of 227, a Step 2 Clinical Knowledge (CK) score of 225, and a composite score of 444 predicted a 95% chance of passing the ABEM qualifying and oral examinations. However, one limitation of this study is that the authors examined raw scores. The mean score for the USMLE examinations has increased steadily since its inception, limiting the utility of raw scores.
The purpose of this study was to examine whether difference from the mean USMLE scores (centered Step scores) can be used to select residents who will pass the written ABEM qualifying examination on their initial attempt. We hypothesized that residents who scored higher than the USMLE mean for the year they took the examinations would be significantly more likely to pass the initial ABEM qualifying examination. We also hypothesized that centered Step 2 CK scores would be more predictive than centered Step 1 scores because of the clinical content of the Step 2 CK examination.

METHODS
Study Design
This retrospective cohort study examined the data of residents who successfully completed training at two emergency medicine residency programs between the years 2002-2013. This study was determined to be exempt by both medical schools' institutional review boards.

Study Setting and Population
One residency program is located in the Mid-Atlantic region, and the other is located in the Southern region.

Study Protocol
The residency programs provided deidentified databases to the investigators. Demographic variables such as sex, age, and ethnicity were not collected because the resident classes tend to be relatively homogenous and outliers might have made these variables identifiable. Data collected included USMLE Step 1 and Step 2 CK scores and performance on the written ABEM qualifying examination (dichotomously coded as pass or fail).

Data Analysis
The national average of USMLE Step 1 and Step 2 CK scores varied greatly during the study period. The investigators contacted the National Board of Medical Examiners to obtain the mean and standard deviations for the USMLE Step 1 and Step 2 CK scores for each year. Because the standard deviations were relatively consistent across years, we calculated the distance from the mean for the year that each resident completed Step 1 and Step 2 CK, respectively. We subtracted the mean score for that year from each resident's USMLE score to create a centered Step score. For example, if a resident scored 220 in a year when the Step 2 CK mean score was 212, his or her centered Step 2 CK score would be +8. However, if a resident scored 220 in a year when the mean score was 228, his or her centered Step 2 CK score would be −8. This approach allows for an equitable comparison of Step scores across the years.
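The centering step described above can be illustrated with a minimal Python sketch. The examination years and the lookup table are hypothetical and are used only to reproduce the worked example in the text (a score of 220 against yearly means of 212 and 228); they are not the values obtained from the National Board of Medical Examiners.

```python
# Hypothetical yearly national means for Step 2 CK. The years are invented
# for illustration; the means 212 and 228 come from the worked example above.
STEP2_CK_MEAN_BY_YEAR = {2004: 212, 2010: 228}


def centered_score(raw_score, exam_year, mean_by_year):
    """Return the raw score minus the national mean for the year taken."""
    return raw_score - mean_by_year[exam_year]


# A resident scoring 220 in a year with mean 212 is 8 points above the mean.
print(centered_score(220, 2004, STEP2_CK_MEAN_BY_YEAR))  # 8
# The same raw score in a year with mean 228 is 8 points below the mean.
print(centered_score(220, 2010, STEP2_CK_MEAN_BY_YEAR))  # -8
```

Because the same raw score maps to different centered scores depending on the year, residents from different cohorts can be compared on a common scale.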
Logistic regression analyses were used to determine whether USMLE centered Step scores were predictive of passing the ABEM qualifying certification examination. First, univariate logistic regression was used to determine whether the centered Step 1 or centered Step 2 CK scores were predictive of initial ABEM qualifying examination passage separately. Then multivariate logistic regression was used to examine if the combination of the centered Step 1 and Step 2 CK scores was more predictive than the simpler univariate model. Results are presented as odds ratios (ORs) with 95% confidence intervals (CIs). Statistical significance was set at P < 0.05. A receiver operating characteristic (ROC) curve was constructed to obtain area under the curve (AUC) and its 95% CI. The ROC was also used to examine potential cutoff scores for centered Step 2 CK scores in predicting passage of the ABEM qualifying examination on the first attempt. Sensitivities and specificities were calculated using the ROC curve. Analyses were conducted using IBM SPSS Statistics v.22.
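As an illustration of the ROC-based portion of this analysis, the following Python sketch computes sensitivity and specificity at a given cutoff and the AUC for a small set of centered scores. It is not the SPSS procedure used in the study. The data are hypothetical, the function names are invented for this example, and the rule that a resident is predicted to pass when the centered score is at or above the cutoff is an assumption.

```python
def sensitivity_specificity(scores, passed, cutoff):
    """Predict 'pass' when the centered score is at or above the cutoff."""
    pairs = list(zip(scores, passed))
    tp = sum(1 for s, p in pairs if p and s >= cutoff)       # passers predicted to pass
    fn = sum(1 for s, p in pairs if p and s < cutoff)        # passers predicted to fail
    tn = sum(1 for s, p in pairs if not p and s < cutoff)    # failures predicted to fail
    fp = sum(1 for s, p in pairs if not p and s >= cutoff)   # failures predicted to pass
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, passed):
    """AUC as the share of (passer, non-passer) pairs in which the passer
    has the higher score; tied pairs count as one half."""
    pos = [s for s, p in zip(scores, passed) if p]
    neg = [s for s, p in zip(scores, passed) if not p]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical centered Step 2 CK scores and first-attempt outcomes.
scores = [12, 5, 0, -3, -7, -10, -15, -20]
passed = [True, True, True, True, False, True, False, False]

sens, spec = sensitivity_specificity(scores, passed, cutoff=-7)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, AUC={auc(scores, passed):.2f}")
```

Repeating the first function over a range of cutoffs traces out the ROC curve from which candidate cutoff scores can be compared.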

RESULTS
In all, 299 residents graduated from the programs within the study period. Data were available for 135 residents from the Mid-Atlantic residency program and 164 residents from the Southern residency program. Complete data were missing for 8 residents from the Southern residency program. USMLE Step 1 data were missing for 16 individuals (4 Mid-Atlantic and 12 Southern), and USMLE Step 2 CK data were missing for 68 individuals (29 Mid-Atlantic and 39 Southern). ABEM qualifying examination data were missing for 12 individuals (10 Mid-Atlantic and 2 Southern). Complete data were available for a total of 206 residents, and Step 2 CK and ABEM qualifying examination data were available for 212 residents.
Volume 18, Number 3, Fall 2018
In the univariate logistic regression analyses, centered Step 1 (OR = 1.05, 95% CI 1.03 to 1.08, P < 0.001) and centered Step 2 CK (OR = 1.06, 95% CI 1.04 to 1.09, P < 0.001) scores were both significant predictors of ABEM qualifying examination passage when considered separately. However, in a multivariate logistic model, the centered Step 1 score was not a significant predictor (OR = 1.02, 95% CI 0.99 to 1.05, P < 0.17), while the centered Step 2 CK score remained significant (OR = 1.05, 95% CI 1.02 to 1.08, P < 0.001). That is, the Step 1 score did not contribute uniquely to the prediction of ABEM qualifying examination passage when the centered Step 2 CK score was in the equation. Based on these results, a more complex multivariate model is not warranted. The OR of 1.06 from the univariate model means that the odds of passing the ABEM qualifying examination are multiplied by 1.06 with each 1-point increase in the centered Step 2 CK score and divided by 1.06 with each 1-point decrease. That is, the odds of passing the ABEM qualifying examination on the first attempt approximately double with every 12-point increase in the centered Step 2 CK score. The Figure illustrates the probability of passing the ABEM qualifying examination on the first attempt as predicted by centered Step 2 CK scores.
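The 12-point doubling figure follows directly from the univariate OR of 1.06, because a per-point odds ratio compounds multiplicatively across points. A short Python check, offered as an illustration only (the variable and function names are invented for this sketch):

```python
import math

odds_ratio_per_point = 1.06  # univariate OR for centered Step 2 CK, from the text

# Number of points k needed to double the odds: solve OR**k = 2 for k.
points_to_double = math.log(2) / math.log(odds_ratio_per_point)


def odds_multiplier(delta_points, or_per_point=odds_ratio_per_point):
    """Factor by which the odds change for a given change in centered score."""
    return or_per_point ** delta_points


print(round(points_to_double, 1))     # 11.9
print(round(odds_multiplier(12), 2))  # 2.01
```

The same function applies to scores below the mean: a negative change in centered score yields a multiplier smaller than 1.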
The AUC was significant (AUC = 0.80, 95% CI 0.72 to 0.87, P < 0.001). The Table presents the sensitivity and specificity of various points from the mean. Overall, scores that were at the mean for that year (a centered score of 0) yielded 97% (95% CI 86.3% to 99.7%) specificity and 53% (95% CI 45.5% to 60.0%) sensitivity for predicting passage of the ABEM written examination on the first attempt. Only one resident who scored at or above the mean for his/her year failed the ABEM written examination on the first try. However, of the 180 residents in our sample who passed the ABEM written examination on their first attempt, 85 had Step 2 CK scores that were below the mean for the year they took the examination. Using a cutoff of Step 2 CK scores of 7 points lower than the mean yielded 64% (95% CI 56.7% to 70.6%) sensitivity and 81% (95% CI 65.4% to 91.8%) specificity. Using this cutoff score, 6 individuals who had centered Step 2 CK scores between −7 and −1 failed the ABEM qualifying examination on their first attempt. However, 65 individuals had Step 2 CK scores more than 7 points below the mean and still passed the ABEM qualifying examination on their first attempt.
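The two sensitivity values can be reproduced from the counts reported in this paragraph. The following Python sketch is a simple arithmetic check rather than part of the original analysis; the variable names are invented for illustration, and only the sensitivities are recomputed because the text does not report the specificity counts directly.

```python
passed_total = 180         # residents who passed on the first attempt
passed_below_mean = 85     # passers with a centered Step 2 CK score below 0
passed_below_minus7 = 65   # passers more than 7 points below the mean

# Sensitivity = passers correctly predicted to pass / all passers.
sens_at_mean = (passed_total - passed_below_mean) / passed_total
sens_at_minus7 = (passed_total - passed_below_minus7) / passed_total

print(f"{sens_at_mean:.0%}")    # 53%
print(f"{sens_at_minus7:.0%}")  # 64%
```

Lowering the cutoff from 0 to −7 reclassifies 20 additional passers as predicted to pass, which is the source of the gain in sensitivity.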

DISCUSSION
The purpose of this study was to examine whether USMLE Step 1 and Step 2 CK scores are predictive of passing the written ABEM qualifying examination on the initial attempt. This study yielded 3 important findings. First, while Step 1 scores were predictive of initial ABEM qualifying examination passage when entered alone in a logistic regression, they were no longer significant once Step 2 CK scores were entered into the logistic regression. Step 2 CK scores were a better predictor of ABEM qualifying examination passage. Second, all but one person in our sample who had a Step 2 CK score at or above the mean in the year he/she took the examination passed the ABEM qualifying examination on the first attempt. Finally, Step 2 CK scores within 7 points of the mean yielded adequate specificity for passing the initial ABEM qualifying examination.
Previous studies have reported that USMLE Step 1 scores are valid predictors of board passage rates for other specialties. 8,11,12 Our results are consistent with the findings of Harmouche et al 10 in that Step 2 CK scores were better predictors of board passage than Step 1 scores. Step 2 CK scores are possibly more relevant than Step 1 scores in emergency medicine specifically. However, another important point is that previous retrospective studies did not account for variations in the scores across years. This difference in the study analysis may be the reason for differences between our study and prior work.
USMLE Step 2 CK scores are available for most applicants to residency programs. While the USMLE provides minimum passage scores, we found that the applicant's performance vs the national average (centered Step 2 CK scores) predicts ABEM board passage. Harmouche et al 10 suggested using raw scores to predict success in passing the initial ABEM examination; however, medical students consistently perform better and better on the USMLE, raising the mean score each year. For this reason, we used a standardized approach to determining scores that would predict initial success on the ABEM and remain useful over time. As stated earlier, the ACGME now requires 80% of a program's graduates from the preceding 5 years to pass the ABEM. 2 Our analysis shows that a cutoff score of 7 points below the mean predicts an 80% probability of passing the ABEM. While the USMLE Step 2 CK score may be a readily available and useful tool, it should not be used alone in making ranking decisions because many individuals in our cohort who scored below the national average still passed the ABEM qualifying examination on their first attempt. Therefore, emergency medicine residency program directors and selection committees should consider other biographic and academic information prior to making final decisions about ranking applicants. Future studies could examine additional variables that may contribute to the prediction of successful residents.
Limitations of our study include its retrospective nature as well as results collected from only 2 residency programs. While our results indicate a significant relationship between USMLE Step 2 CK scores and ABEM qualifying examination passage, additional research is needed to ensure generalizability. In addition, prospective data from multiple institutions would be beneficial in validating these results and giving program directors increased confidence in using USMLE for making decisions when ranking applicants. Examination of all factors associated with performance on the ABEM was beyond the scope of this study; however, programmatic changes during the decade of the study period may have impacted the results.

CONCLUSION
This study has important implications for emergency medicine residency program directors and selection committees, as it proposes a means of quantifying the risk of initial ABEM qualifying examination failure that program directors may be taking when selecting applicants with lower Step 2 CK scores. A centered Step 2 CK score within 7 points of the national average suggests a candidate is likely to pass the ABEM qualifying examination on the initial attempt. This knowledge can be beneficial in making selection decisions in light of ACGME requirements and the increasing applicant pool. However, other applicant data should be considered as well prior to making a final ranking decision. Further studies are needed to validate these findings; however, this study is the first to demonstrate a relationship between centered Step 2 CK scores and emergency medicine board passage.