At Grand Rounds, we use data to match patients with high-quality doctors. But what does being a high-quality physician actually mean? One of the first challenges is simply identifying which physicians have the appropriate specialization to address a patient’s needs. This may sound straightforward, but physician specialty data can be surprisingly inaccurate. A recent Health Affairs study found that doctors’ specialty information in health plan directories was wrong 30% of the time. In other words, if you searched a directory for a new primary care physician, there would be roughly a 30% chance that you would end up trying to schedule an appointment with a doctor who does not even practice primary care.
Doctors self-report their specialty to the National Plan & Provider Enumeration System (NPPES) when they register for a National Provider Identifier (NPI), typically at the beginning of their careers. Sometimes they misreport their specialty; more commonly, a doctor specializes further during a fellowship and never updates the record to reflect this. One way to close this gap would be to call each doctor’s office and record the information. But with more than 900,000 doctors in the U.S., and assuming each phone call takes five minutes, that process would take more than eight years of nonstop calling! A more scalable approach is to look at which procedures a doctor performs and use machine learning to classify the doctor’s specialty. For example, our data science team recently built a machine learning classifier to identify which orthopedic surgeons had further specialized with a fellowship in one of the six subspecialties of Orthopedic Surgery: Adult Reconstructive Surgery, Foot and Ankle Surgery, Hand Surgery, Spine Surgery, Sports Medicine, and Trauma Surgery.
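The phone-call estimate can be checked with a few lines of arithmetic. This is a rough sketch using the two figures quoted above; the 40-hour work week in the last step is our own added assumption, included only to show how the estimate changes for a single full-time caller.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
NUM_DOCTORS = 900_000     # "more than 900,000 doctors in the U.S."
MINUTES_PER_CALL = 5      # assumed duration of each call

total_minutes = NUM_DOCTORS * MINUTES_PER_CALL
total_hours = total_minutes / 60

# One caller phoning around the clock, with no breaks.
years_nonstop = total_hours / 24 / 365

# Assumption (not from the text): a 40-hour week, 52 weeks a year.
years_full_time = total_hours / (40 * 52)

print(f"{total_hours:,.0f} hours of calls")
print(f"{years_nonstop:.1f} years calling nonstop")
print(f"{years_full_time:.1f} years for one full-time caller")
```

Calling around the clock gives the "more than eight years" figure; under the work-week assumption, a single caller would need several times longer.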
To train a machine learning model to identify which doctors belong to a given subspecialty, we need features that represent the procedures they perform, and labels: doctors we know are members of the subspecialty as well as doctors we know are not. For features, we looked at insurance claims to determine the number of times a doctor billed for a specific procedure. While certain procedures are predictive of a subspecialty (e.g., an anterior cruciate ligament, or ACL, repair is predictive of a Sports Medicine doctor), these highly specific procedures ended up not being the best features, because doctors use multiple billing codes for the same procedure, which splits the count for one procedure across several features. Additionally, our claims dataset is incomplete, so a procedure that was performed might still be missing from our data.
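As a minimal sketch of the counting step described above, the snippet below turns a list of claim records into per-doctor procedure counts. The record layout, the NPIs, and the billing codes are all hypothetical placeholders chosen for illustration; they are not our actual claims schema or real codes.

```python
from collections import Counter, defaultdict

# Hypothetical claim records: (doctor NPI, billing code). Both the NPIs and
# the codes are invented placeholders, not real identifiers.
claims = [
    ("1000000001", "ACL_REPAIR_A"),
    ("1000000001", "ACL_REPAIR_B"),  # same procedure, different billing code
    ("1000000001", "XRAY_KNEE"),
    ("1000000001", "XRAY_KNEE"),
    ("1000000002", "XRAY_HIP"),
    ("1000000002", "HIP_REPLACEMENT"),
]


def count_procedures(claims):
    """Return {npi: Counter({billing_code: times_billed})}."""
    counts = defaultdict(Counter)
    for npi, code in claims:
        counts[npi][code] += 1
    return counts


def to_feature_vector(counter, vocabulary):
    """Order one doctor's counts by a fixed code list; unseen codes count as 0."""
    return [counter[code] for code in vocabulary]


counts = count_procedures(claims)
vocabulary = sorted({code for _, code in claims})
features = {npi: to_feature_vector(c, vocabulary) for npi, c in counts.items()}

print(vocabulary)
print(features)
```

Note how the two placeholder ACL codes each end up with a count of one for the first doctor: this is the splitting effect that makes highly specific procedure codes weaker features than they first appear.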
Instead, we found that the number of X-rays of specific body parts more accurately represented a surgeon’s caseload, since surgeons perform many more X-rays than actual surgeries. The figure below shows the average percentage of X-rays by body part per surgeon in each subspecialty (using surgeons who updated their self-reported subspecialty in NPPES as our positive labels). Adult Reconstructive surgeons, who primarily perform hip and knee replacements on elderly patients, mostly X-rayed patients’ hips and knees, whereas Sports Medicine surgeons, who do procedures like arthroscopic knee and shoulder surgery for athletic injuries, predominantly X-rayed patients’ shoulders and knees. Overall, the average distribution of X-rays for each subspecialty agrees with the procedures and conditions that the subspecialty treats.
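The averaging behind such a figure might look something like the sketch below: convert each surgeon’s raw X-ray counts into fractions by body part, then average those fractions within each subspecialty. The three body parts, the surgeon names, and the counts are made up for illustration and are not our real data or our full list of body parts.

```python
from collections import defaultdict

# Illustrative subset of body parts; not the full list used in the analysis.
BODY_PARTS = ["hip", "knee", "shoulder"]

# Hypothetical surgeons: (self-reported subspecialty, raw X-ray counts).
surgeons = {
    "A": ("Adult Reconstructive Surgery", {"hip": 60, "knee": 35, "shoulder": 5}),
    "B": ("Adult Reconstructive Surgery", {"hip": 40, "knee": 50, "shoulder": 10}),
    "C": ("Sports Medicine", {"hip": 5, "knee": 45, "shoulder": 50}),
}


def xray_fractions(counts):
    """Convert raw counts into the fraction of X-rays per body part."""
    total = sum(counts.get(part, 0) for part in BODY_PARTS)
    if total == 0:
        return [0.0] * len(BODY_PARTS)
    return [counts.get(part, 0) / total for part in BODY_PARTS]


def mean_fractions_by_subspecialty(surgeons):
    """Average the per-surgeon fractions within each subspecialty."""
    grouped = defaultdict(list)
    for subspecialty, counts in surgeons.values():
        grouped[subspecialty].append(xray_fractions(counts))
    return {
        sub: [sum(column) / len(rows) for column in zip(*rows)]
        for sub, rows in grouped.items()
    }


averages = mean_fractions_by_subspecialty(surgeons)
for sub, fractions in averages.items():
    print(sub, dict(zip(BODY_PARTS, (round(f, 3) for f in fractions))))
```

Working with fractions rather than raw counts keeps a high-volume surgeon from dominating the average for a subspecialty.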
While the X-ray counts looked like good features on average, we also wanted to determine whether they were representative of individual surgeons. Because the number of X-rays of each body part is one dimension of our feature space (seven dimensions in total), we applied t-Distributed Stochastic Neighbor Embedding (t-SNE) to the features. t-SNE reduces high-dimensional data to two or three dimensions, which we can then plot.
The figure below shows the fraction of X-rays for each orthopedic surgeon (represented by colored circles) after applying t-SNE: surgeons with similar distributions of X-rays appear closer together, and surgeons with dissimilar distributions appear farther apart. Overall, the orthopedic surgeons separate well into defined clusters based on their subspecialty, suggesting that the distribution of X-rays by body part is a representative feature for orthopedic surgery. Not surprisingly, trauma surgeons have the highest overlap with some of the other subspecialties (a), as they treat a wider range of conditions in emergency situations. Some doctors appear in a different cluster than their self-reported subspecialty (b), suggesting that they are either incorrectly identified in NPPES or have more than one subspecialty.
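The notion of "similar distribution of X-rays" can be illustrated without t-SNE itself, which is considerably more involved. The sketch below is not t-SNE: it only computes plain Euclidean distances between X-ray fraction vectors and finds each surgeon’s nearest neighbor, which is the kind of neighborhood structure a t-SNE plot tries to preserve. The surgeon names and fraction values are hypothetical.

```python
import math

# Hypothetical X-ray fraction vectors over (hip, knee, shoulder).
vectors = {
    "recon_1": [0.60, 0.35, 0.05],
    "recon_2": [0.50, 0.40, 0.10],
    "sports_1": [0.05, 0.45, 0.50],
    "sports_2": [0.10, 0.40, 0.50],
}


def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbor(name, vectors):
    """Return the other surgeon whose X-ray distribution is most similar."""
    others = (other for other in vectors if other != name)
    return min(others, key=lambda other: euclidean(vectors[name], vectors[other]))


for name in vectors:
    print(name, "->", nearest_neighbor(name, vectors))
```

In this toy data each surgeon’s nearest neighbor shares their subspecialty. A surgeon whose nearest neighbors mostly carry a different label would be the analogue of the doctors marked (b) in the figure.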
To train our machine learning model to correctly identify doctors’ subspecialties, we used two kinds of features: the number of X-rays performed on each body part, and additional procedures relevant to each subspecialty. As the labels, we used doctors who had updated their self-reported subspecialty in NPPES.
As doctors may be members of more than one subspecialty, we trained a separate binary classifier (one that predicts yes or no) for each of the six Orthopedic Surgery subspecialties. Each classifier identified its subspecialty with a precision of 95% or greater; that is, at least 95% of the doctors a model flagged as belonging to a subspecialty actually belonged to it. Maximizing precision was important to us, as we strive to match patients with the right doctors for their specific condition, allowing them to get the appropriate care from the very beginning. We then used these models to predict which orthopedic surgeons had further specialized with a fellowship in one of the six subspecialties. Using our machine learning classifiers, we reclassified 15,000 orthopedic surgeons as being more specialized. With these new specialty tags we can, for example, better match patients with joint pain to a doctor who specializes in treating it. Building classifiers to identify a doctor’s subspecialty is just one of the ways the data science team at Grand Rounds is using machine learning to match patients to high-quality, appropriate doctors for their specific clinical needs.
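To make the one-classifier-per-subspecialty setup and the precision metric concrete, here is a small sketch. It does not train anything: it assumes each model’s predictions already exist and only shows how multi-label data breaks down into six yes/no problems and how precision is computed for each. The doctor IDs, true labels, and predictions are hypothetical.

```python
SUBSPECIALTIES = [
    "Adult Reconstructive Surgery", "Foot and Ankle Surgery", "Hand Surgery",
    "Spine Surgery", "Sports Medicine", "Trauma Surgery",
]

# Hypothetical data: known subspecialties and model predictions per doctor.
# A doctor may hold several subspecialties, or none.
true_labels = {
    "dr_1": {"Sports Medicine"},
    "dr_2": {"Sports Medicine", "Trauma Surgery"},
    "dr_3": {"Hand Surgery"},
    "dr_4": set(),
}
predicted_labels = {
    "dr_1": {"Sports Medicine"},
    "dr_2": {"Sports Medicine"},
    "dr_3": {"Hand Surgery", "Sports Medicine"},
    "dr_4": set(),
}


def binary_targets(labels, subspecialty):
    """Reduce multi-label data to one yes/no target per doctor."""
    return {doctor: subspecialty in subs for doctor, subs in labels.items()}


def precision(y_true, y_pred):
    """TP / (TP + FP); None when the model made no positive predictions."""
    tp = sum(1 for d in y_pred if y_pred[d] and y_true[d])
    fp = sum(1 for d in y_pred if y_pred[d] and not y_true[d])
    if tp + fp == 0:
        return None
    return tp / (tp + fp)


per_subspecialty_precision = {
    sub: precision(binary_targets(true_labels, sub),
                   binary_targets(predicted_labels, sub))
    for sub in SUBSPECIALTIES
}

for sub, value in per_subspecialty_precision.items():
    print(f"{sub}: {value}")
```

Because each subspecialty gets its own yes/no target, a doctor such as the hypothetical dr_2 can be a positive example for two classifiers at once, which a single six-way classifier could not express.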