Board Scores

Years – sometimes a lifetime – of work to get to this point, and a single three-digit score from one day of testing determines your future.  If it’s better than expected, you can dream big; if it’s good enough, you can feel some measure of confidence; if it’s “fair to middling” you have some work to do; and if it’s low, then you need some soul searching.  

Sadly, the United States Medical Licensing Examination (USMLE) was not developed to serve this purpose.  It was initially administered by the National Board of Medical Examiners (NBME) in 1992 after many years of attempting to find a way to unify the complicated, erratic interstate physician licensure process; its official purpose is to aid authorities granting medical licenses (and also assure stakeholders that licensed physicians have “attained a minimum standard of medical knowledge”).  I hate to be redundant, but to drive it home:  the psychometric validity of USMLE scores is as a pass/fail measure for decisions related to physician licensure.  

It was not and is not meant to serve as a “Residency Aptitude Test”, and at one point, the NBME had to issue a disclaimer to that effect.

Why, then, are USMLE scores used in the residency selection process?  Because they provide a nationwide standard for comparison.  If a measure is uniform, validated, easy to interpret and seemingly objective, we are all over it.  “The USMLE scores are currently the only nondemographic continuous variable by which applicants can be rapidly screened.”  Given the perpetual increase in applications programs are receiving, anything that allows applications to be “rapidly screened” is undeniably going to be emphasized.

But, stepping back … what are we screening?  What do the USMLEs reliably predict?  On its face, Step 1 is a multiple-choice test of basic science, little of which has much direct relevance to the practice of ophthalmology (to no one’s surprise).  Perhaps even less shocking is that very little of the material tested is retained by most students – there is significant decline in examinee performance after just one or two years.  There is little correlation between Step 1 scores and patient care or clinical outcomes. The strongest link is between Step 1 and performance on other standardized tests.  

I’m going to throw down this awesome quote – let it stew for a while:  “Such associations raise a question of whether these instruments are truly independent measures of knowledge – or whether we are simply repeatedly assessing a skill in test taking.  While success on standardized tests is a skill prized in our society, it is not necessarily one that adds value to patient care.  In an era when medical knowledge is more accessible than ever before, it seems curious that we have chosen to prioritize a measure of basic science memorization over higher-level analysis and critical thinking.”

Ok, but let’s be realistic – even if that’s all this test measures, standardized test performance is a big deal.  We can’t forget the summative standardized test in medicine – the specialty boards.  Since board certification is of obvious importance on an individual and program level, surely this test is an appropriate predictor of that?  Well, let’s look at the data.

I’m only aware of three studies in our literature (please inform me if you find others), and first a bit of alphabet soup that will become second nature if you matriculate through ophthalmology residency: 

So, first – the most recent publication that just came out this year.  It is an online survey sent to program directors and then disseminated to residents. It was anonymous, self-reported and only 19 programs (15.7%) passed it on to residents, for a completion rate of 13.8% of all ophthalmology residents (read:  major limitations).  Respondents selected their USMLE scores in increments of 10 (210-220, 220-230, etc.) and similarly reported their OKAP scores.  The authors found that, in this sample, each step up to the next 10-point USMLE category was associated with a 9-point increase in OKAP percentile and 2.5 times higher odds of scoring above the 75th percentile on the OKAPs.  Take home – major limitations, but suggests that a higher USMLE score correlates with better OKAP performance. 

Second, a study of 76 residents from 15 consecutive training classes (1991-2006) at 1 ophthalmology residency training program found that OKAP scores were significantly associated with WQE pass rate, and that passing or failing the OKAP exam all three years of residency was associated with significant odds of passing or failing the WQE, respectively (“passing” on OKAPs was considered above the 30th percentile in this study).  Interestingly, the authors did not find an association between USMLE Step 1 scores and WQE performance.  Take home – in this single-institution longitudinal study, passing OKAPs was correlated with passing the boards (and vice versa), but USMLE scores were not. 

Lastly, a study evaluated 339 residents from 15 residency programs who graduated between 2003 and 2007 to determine whether five variables (USMLE scores, OKAP scores in years 1, 2, and 3, and maximum OKAP scores) were predictors of passing or failing the WQE.  The authors found that OKAP scores during the final year of residency were the best predictor, and USMLE scores the poorest predictor, of board performance.  Take home – in this older study, which is the most robust in our field, doing well on your OKAPs just prior to taking the boards is way more predictive of board pass rate than USMLE scores. 

All data and conclusions need careful scrutiny, but based on what I’ve seen, there is little evidence to support using USMLEs as a residency screening tool.  Having hopefully established there is not much demonstrating the scores are helpful, in an upcoming post I’ll cover why this practice is potentially harmful.  In the subsequent post (and last on this topic), I’ll discuss a proposed and seemingly likely major change to USMLE reporting coming our way, and what may (or may not) replace the void in screening. 

Thanks for reading.  Comments are most welcome!

We’re back

It’s that time of year again – leaves are falling, temperatures are supposed to be dipping, pumpkin spice is doing its thing and the collective groans of program directors/coordinators/faculty when wading through residency applications are only silenced by the palpable and justified anxiety of the applicants themselves.

Looking through all these applications, I am annually and increasingly reminded how comparatively unremarkable my application must have been and wonder how I squeezed through.  I am also progressively aware of the limitations and biases embedded within our recruitment process.

With that in mind, I want to share a few posts on some of these issues, including one that has only recently been pointed out to me. I know this website gets little traffic, and I missed the boat by posting this after our application deadline, but hope it sparks some interest for those that read it – whenever that may be. Much of what I will be discussing was provided initially and experientially by others going through this process, and I am very grateful for those that guide me through my ignorance.  I hope these posts can continue that conversation and look forward to any further insights you all have.

Concerns with the match process


It’s been several months since the last post, and in the interim we had a very successful match.  We are incredibly excited to have the opportunity to train the four individuals coming to our department.  From a personal perspective, match day is a polarizing affair – the thrill of first viewing the results and opportunity to call and welcome our new trainees, mixed with the initial concern that our program wasn’t as high on their list as they were on ours.

More disheartening is scrolling through our overall list of ranked applicants and finding those that did not match.  It would be ignorant to think anything other than luck and circumstance separated my medical school match day from theirs.

Clearly, the match is an imperfect process.  Not only are there highly qualified and deserving unmatched applicants each year, but:

  • The number of applications per applicant continues to increase, likely driving even greater metrics-based screening of applications by programs.
  • The costs to applicants (and programs) are substantial.
  • It is a time-consuming process, occupying a sizable portion of the 4th year of medical school.

The San Francisco Match recently released the 2019 summary match report.  This document shows that the average matched applicant submitted 75 applications this past fall.  In 2004, this number was 41.  In 2009 it was 50.  Despite this significant increase in the number of applications, the competitiveness of the match (percent matching) has not changed.

We recently performed a financial analysis of the 2018 match, and found that, conservatively, the mean estimated cost to match for an ophthalmology applicant was $6,613, with an aggregate of $4,636,950 spent by all applicants.  We estimated that our department spent a total of $179,327 in direct and indirect costs over four interview days, or $3,736 per each interviewed applicant.


In the current system, applicants are incentivized to apply to as many programs as possible, while programs respond in large part by limiting interview invitations to candidates with pre-approved metrics and stronger objective criteria on applications.  What can be done to stop the swell and improve this?  Multiple suggestions have been brought forth recently, and I’ll comment on a few of the more common themes.

A mutually beneficial option would be to limit the number of applications an individual can submit.  Data from the 2017 and 2018 ophthalmology residency match found that the number of interviews offered did not increase beyond 40 applications.  Using this number as a cap, the application costs would decrease from $1,665 to $410 per applicant.  Of course, this would come at the cost of the SF Match and its beneficiaries, with an estimated 80% loss in revenue if no further changes were made in the tiered cost structure.  This would similarly result in an average of 176 fewer applications received and estimated 14.6 hours of time saved reviewing charts at a program level.  This approach presents several reasonable objections, reviewed in detail here.  Regardless of potential merits, leaders in ophthalmology and the ACGME both have suggested this possibility is exceedingly unlikely given each applicant’s consumer rights to apply to as many programs as financially feasible.

Another proposal is to conduct an interview match prior to the standard match process.  After applications have been submitted and reviewed, both applicants and programs would create rank lists and utilize the same matching algorithm to fill a more limited number of interview spots.  Individual programs would be able to modify their interview limit based on competitiveness.  In this system, both parties would theoretically interview preferentially with fewer required interviews.  This proposal was initially made for the surgical fellowship interview process and would likely need to first be trialed on a smaller scale (such as ophthalmology?) prior to widespread consideration.  Further, this system necessitates that applicants signal interest in a program prior to the interview itself; there are multiple examples (including my own) of a relatively surprising interview invite and experience ultimately influencing an applicant’s rank list and match results.

A final proposal comes from a study that tested a computer model of the 2014 Otolaryngology match and found that offering applicants the opportunity to signal their preference to programs leads to an increase in overall interview invitations, and allows programs the opportunity to review applications more “holistically” instead of using strict cut-off parameters.  This proposal is entirely voluntary and, at the time of initial application, allows the applicant to choose to reveal if a program is within their list of top programs.  Early editorials on this approach have been very favorable.

While each of these suggestions has relative advantages, any change would require governing bodies to act.  Impetus aside, the financial implications of these and any other proposed changes will be important.  It is worth noting that 2015 fees for all Electronic Residency Application Service (ERAS) applications were $72 million, representing approximately 40% of the Association of American Medical Colleges’ operating revenue for that year.  It may be naïve to hope any substantial changes favor the pockets of the students.


Lastly, one seemingly universal need in this process is increased transparency.  Programs should divulge the internal metrics they use to screen applicants, along with additional pertinent information, so that applicants can make educated decisions on where to selectively apply.  There has been movement within the Association of University Professors of Ophthalmology (AUPO), the voice of academic ophthalmology, to create standard statistics and disclosure for all of our programs that would be readily available to applicants.  While still in the development phase, this is one change that appears quite likely in the near future.  I hope we start to see more.

Residency Applications

Residency interview season is upon us, with our four dates scheduled for the non-holiday Fridays in November (11/2, 11/9, 11/16 and 11/30).  It’s an exciting time for the department and program, but the selection process always weighs heavily on me.

I am increasingly impressed by the caliber of the applications we receive and genuinely wonder how I ever was fortunate enough to match into this specialty.  Attempting to select appropriate candidates from a large pool of exceptional individuals is truly somewhat arbitrary, but I thought I would share some insight into our process.  There are many different ways programs tackle this, so by no means are we a sterling example, but I hope transparency can stimulate some discussion on ways we can improve our system.  I’ll share a couple of thoughts for potential future directions in a later post.

This application cycle we received 388 applications.  We require all applicants to submit a secondary, short essay on “Why I want to come to UK” in an attempt to differentiate, in a limited way, those that are applying broadly from those truly interested in our program.  The deadline for the secondary application is usually the middle of September.  At that point, I review each of the completed applications to get a gestalt of our applicant pool and distribute the applications to faculty volunteers.  They then review the applications and whittle them down to a group of roughly 90-100.  Our faculty are ultimately free to use their own criteria when reviewing applicants, but are encouraged to complete a scoring sheet to frame the process.  Included in this evaluation are six sections:  Aggregate USMLE Scores, Clinical Performance, Academic and Research Accomplishments, Letters of Recommendation, Interest in our Program, and Outside Interests and Diversity Enhancement.

Once I have the final group of applications from our faculty, I select 48 applicants to invite for an interview (12 applicants on each of the 4 days).  We also offer “alternate” status to an additional 20 applicants, who are able to pick up interview slots that either go unfilled or are dropped.  Because of the fluid nature of the interview acceptance process, we generally have 10 or more dropped slots between the invitation for interview (around October 1st) and the interviews themselves.  Hence, we consider the two groups, 68 applicants in total (48 invites + 20 alternates), as essentially the same:  applicants that on paper seem most appropriate for our program.  I then generate a list of the applicants and summative statistics (see below) for final faculty review prior to sending invites.  The goal of this final review is to ensure we have an equitable list of applicants that seem to fit our program priorities.

This final step – going from 100 to 68 – is the most difficult for me.  What differentiates each applicant at this point?  Are we being as fair as possible?  How can we know if applicant A is more or less likely to ultimately have interest in our program than applicant B?  These and so many other questions underlie the potential limitations and bias in this process.

How’d it look this year?  I’ve copied a table below of some of our statistics for the total group.  We determined gender based on the pronoun used in letters of recommendation, and Underrepresented Minority (URM) is self-identified.  A personal connection included a rotation at our program or other associations to the program/university.  For Mean USMLE, we took an average of Step 1 and 2 (if Step 2 was available), weighted toward Step 1:  [(Step 1 score * 2) + (Step 2 score * 1)] / 3.

Gender:  M = 54%, F = 46%
URM:  12%
Region:  MW = 35%, SA = 28%, SE = 18%, SW = 15%, W = 7%, NE = 1%
Personal Connection:  37%
Mean USMLE:  242 (Range 200-265)
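
For anyone who wants to reproduce the Mean USMLE figure, the weighting described above can be sketched in a few lines of Python.  This is only an illustration:  the function name and the example scores are made up, and falling back to Step 1 alone when Step 2 is missing is my reading of “if Step 2 was available” rather than a documented rule.

```python
from typing import Optional


def mean_usmle(step1: float, step2: Optional[float] = None) -> float:
    """Weighted mean of Step 1 and Step 2, with Step 1 counted twice.

    Assumption: when Step 2 is not available, Step 1 is used on its own.
    """
    if step2 is None:
        return float(step1)
    # [(Step 1 score * 2) + (Step 2 score * 1)] / 3
    return (step1 * 2 + step2 * 1) / 3


# Hypothetical applicants, not real data
print(round(mean_usmle(240, 250), 1))  # Step 1 = 240, Step 2 = 250
print(mean_usmle(240))                 # Step 2 not yet reported
```

Because Step 1 carries twice the weight, a 10-point difference in Step 2 moves the aggregate by only about 3.3 points.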

So:  388 applicants trimmed to 68, 48 applicants will be interviewed, and 4 new residents welcomed into the program this coming January.  This is by no means an idealized system, and at the end of the day, both my biggest hope and concern is that we treat our program and all our applicants justly.

At any rate, let me know what you think – about the entire system or how we manage it at UK.  I’ll share some thoughts and data I’ve collected on potential changes to the process sometime soon.

Tri for Sight

On Sunday, September 9th, we had the 16th annual “Tri for Sight” Sprint Triathlon/Duathlon under … wet conditions.  The triathlon was founded and continues to be organized by our own Dr. Sheila Sanders, and all the proceeds go to support our department’s UK Eye Research Fund.  It’s estimated that over $300,000 has been raised, and some of this year’s proceeds will benefit our newly established UK GO outreach division.

Over 350 athletes swam the 400 meter serpentine length of the pool,

Our PGY2 resident Justin Gagel midway through the swim.

biked the (usually picturesque) 12.6 mile ride through the rolling horse farms,

Because of the mist, it’s hard to tell whether that’s Justin or Lance Armstrong.

and 3.1 mile jog around the Spindletop estate.

PGY4 Laura Coyne jogging in her standard rain gear.

Unbeknownst to certain program directors that thought they finally possessed superhuman speed and endurance at the end of the jog, the course was actually shortened to 2 miles this year at the last minute because of the rain and lightning risk.

Despite the weather, it was another great event.  We had numerous faculty, staff and even medical student volunteers that arrived before sunlight to spend the next 6 hours waterlogged.

We also had several faculty and residents compete this year, with a notable relay team of Drs. Ellen Sanders, Laura Coyne and Michelle Abou-Jaoude, aptly named “Velocirefractors,” completing the event.


The date for next year’s race has already been set – Sunday, September 8th.  More than enough time to put it on the calendar!

Department Champs from left to right:  Katie Gagel; Ellen Sanders, OD; Laura Coyne, PGY4; Michelle Abou-Jaoude, PGY4; Justin Gagel, PGY2.


Work in Progress

Sooo …. I’m new to all of this website design and social media activity.  My hope, though, is to start a blog of sorts to discuss happenings in our department and residency program, comment on topics in the field and other random things that may or may not be of interest.  We’re going to try and link other social media to this and create an interactive platform.  More to come …

Feel free to comment below and offer any other suggestions!