Consumers in the United States spend about US$2.4 billion per day on healthcare. To help the public make better-informed decisions, the Centers for Medicare and Medicaid Services (CMS) has long shared data on hospital care quality via a site called Hospital Compare (HC). Patients just need to type a location to view and compare nearby Medicare-certified hospitals.
Last July, CMS updated the site by rolling out a star rating system that conveniently combined up to 57 metrics into a single quality indicator. Consumers applauded. The industry, not so much. In fact, the rollout had already been postponed for three months while CMS scrambled to contain the backlash against it.
Industry insiders and analysts were aghast that a number of reputable hospitals, such as the Cleveland Clinic, failed to get the full five stars. Institutions of the calibre of Brigham and Women’s Hospital (affiliated with Harvard Medical School) received a mere two or three stars. The methodology unduly penalised teaching and safety-net hospitals, critics said.
Was the controversy created by bruised egos, or are there real issues at hand? In our paper, “Mortality Rate Estimation and Standardization for Public Reporting: Medicare’s Hospital Compare”, my co-authors and I make a case against the statistical modelling used by HC to measure the heart attack mortality rate, a key metric behind the star ratings. Our research shows that this modelling gives patients the wrong advice on where to go in case of a heart attack, a condition that affects more than a million Americans per year.
Size matters
HC’s computation of the heart attack mortality rate has two major flaws. The main one is the way it deals with small hospitals that treat too few heart attack patients to have meaningful data. It essentially says: “If there’s not enough data for a given hospital, we predict the mortality rate to be just like the national average.”
This runs counter to a large number of peer-reviewed medical studies that have consistently shown the mortality risk at small hospitals (as a group) to be well above the national average. It also runs counter to what the Medicare data say: pool the data and you find that aggregate mortality in small hospitals is worse than in larger hospitals. The HC model is not calibrated to Medicare’s own data.
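To make the mechanism concrete, here is a minimal Python sketch of this kind of shrinkage towards the average. The national rate, the strength of the pull and the hospital counts are all made up for illustration; this is not CMS’s actual model or data.

```python
# Illustrative only: a toy partial-pooling ("shrinkage") estimate,
# not CMS's actual model. When a hospital has few cases, its estimate
# is pulled almost entirely towards the national average.

national_rate = 0.15   # hypothetical national 30-day mortality rate
prior_strength = 100   # pseudo-cases standing behind the national average (hypothetical)

def shrunken_rate(deaths, cases):
    """Blend a hospital's raw rate with the national average,
    weighting by how much data the hospital actually has."""
    return (deaths + prior_strength * national_rate) / (cases + prior_strength)

# A small rural hospital: 3 deaths out of 10 heart attack cases (raw rate 30%).
print(shrunken_rate(deaths=3, cases=10))      # ~0.16, barely above the national average

# A large hospital: 300 deaths out of 1,000 cases (the same raw rate of 30%).
print(shrunken_rate(deaths=300, cases=1000))  # ~0.29, the data are allowed to speak
```

Shrinking noisy estimates towards a benchmark is reasonable in itself; the problem is shrinking small hospitals towards the national average when, as a group, they perform worse than that average.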
It’s a matter of common sense, too. Treatment for a heart attack would most likely be better at a hospital that sees two heart attack patients a day than at one that only sees a handful of such cases a year. Yet CMS looks at small hospitals – such as rural establishments with little equipment – and claims they’re just as good as the larger hospitals where most patients go. In short, not taking hospital attributes into account leads to recommendations that contradict sound general advice supported by Medicare’s own data.
We vastly improved the HC model by including patient volume, nurse and resident staffing levels, as well as the ability to perform procedures to open clogged coronary arteries. By further adding the interaction between patient age and hospital volume, the model can even recognise that the best hospital for one patient may not be the best hospital for another.
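As a rough illustration of this direction, and not the exact specification in our paper, one could fit a patient-level logistic regression in which mortality depends on patient age, hospital attributes and an age-by-volume interaction. The variable names, coefficients and synthetic data below are placeholders.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000

# Synthetic patient-level data; the columns are illustrative placeholders.
df = pd.DataFrame({
    "age": rng.normal(77, 8, n),                   # patient age
    "log_volume": rng.normal(4.0, 1.0, n),         # log of the hospital's annual heart attack volume
    "nurse_staffing": rng.normal(1.5, 0.3, n),     # e.g. nurses per bed
    "resident_staffing": rng.normal(0.3, 0.2, n),  # e.g. residents per bed
    "can_open_arteries": rng.integers(0, 2, n),    # hospital can open blocked arteries on site
})

# Simulate 30-day deaths from an assumed "true" model, purely for demonstration.
true_logit = (-1.0 + 0.05 * (df.age - 77) - 0.25 * df.log_volume
              - 0.3 * df.nurse_staffing - 0.2 * df.can_open_arteries
              - 0.02 * (df.age - 77) * df.log_volume)
df["died"] = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

# Fit a mortality model that includes hospital attributes and an
# age-by-volume interaction, so that hospital rankings may differ by patient.
model = smf.logit(
    "died ~ age * log_volume + nurse_staffing + resident_staffing + can_open_arteries",
    data=df,
).fit()
print(model.summary())
```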
A better way to compare apples with apples
The second flaw in HC’s modelling concerns the way it standardises mortality data. The goal of standardisation is to ensure fair comparisons. Hospitals that treat sicker patients – those more likely to die no matter which hospital they go to – should not be penalised in terms of ranking. Conversely, hospitals that see a higher percentage of patients who happen to be in relatively good health should not, on that basis alone, be described as excellent hospitals.
We find that HC uses a form of indirect standardisation that fails to accomplish this purpose. In practice, the formula further underestimates the mortality rates at small hospitals, which are already distorted by the assumption, described above, that small hospitals are just as good as large ones.
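In stylised form, indirect standardisation scores a hospital by comparing the deaths its model predicts for its own patients with the deaths “expected” for those same patients at an average hospital, then scales the national rate by that ratio. The sketch below is a simplification with hypothetical numbers, not HC’s exact formula, but it shows how the first flaw carries over: once a small hospital’s estimate has been shrunk to the national average, predicted and expected deaths nearly cancel, and the hospital ends up looking average almost by construction.

```python
# Stylised indirect standardisation (hypothetical numbers, not HC's exact formula).
national_rate = 0.15

def indirectly_standardised_rate(predicted_deaths, expected_deaths):
    """Scale the national rate by the ratio of the deaths the model predicts
    for a hospital's own patients to the deaths expected for those same
    patients at an average hospital."""
    return (predicted_deaths / expected_deaths) * national_rate

# A small hospital whose shrunken estimate barely differs from the average:
# the ratio is close to 1, so its reported rate sits near the national rate
# regardless of how the hospital actually performs.
print(indirectly_standardised_rate(predicted_deaths=1.65, expected_deaths=1.60))  # ~0.155
```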
In our paper, we propose a direct standardisation model with far greater accuracy. In essence, we suggest standardising for patient sickness by playing a hypothetical game: What if every hospital treated the same population of patients? What if ALL patients in the U.S. had been treated at a given hospital? How would your hospital perform if it treated the typical American? We then run the numbers for every single hospital, factoring in its attributes such as patient volume, staffing and relevant equipment. Though computationally heavy, this method has the potential to become the new gold standard for standardisation.
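Reusing the toy model and synthetic patients from the earlier sketch, direct standardisation might look like the following: hold the patient population fixed, swap in each hospital’s attributes, and average the predicted mortality. The hospital names and attribute values are hypothetical.

```python
# Continues the toy model fitted above (illustrative only).
# Direct standardisation asks how every hospital would have fared had it
# treated the same population of patients.

hospitals = pd.DataFrame({
    "hospital": ["Small Rural", "Mid-size Urban", "Large Teaching"],  # hypothetical
    "log_volume": [2.0, 4.0, 6.0],
    "nurse_staffing": [1.2, 1.5, 1.8],
    "resident_staffing": [0.0, 0.2, 0.6],
    "can_open_arteries": [0, 1, 1],
})

standardised = {}
for _, h in hospitals.iterrows():
    counterfactual = df.copy()                     # the same patient sample...
    for attr in ["log_volume", "nurse_staffing",
                 "resident_staffing", "can_open_arteries"]:
        counterfactual[attr] = h[attr]             # ...treated entirely at hospital h
    standardised[h["hospital"]] = model.predict(counterfactual).mean()

print(standardised)  # directly standardised mortality rate for each hypothetical hospital
```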
Life and death decisions
In the U.S., someone suffers a heart attack every 42 seconds. When this happens, the person should head quickly to a nearby hospital, as the first two hours following the onset of symptoms are critical. Yet a study showed that, as more information about hospital outcomes has become available, heart attack patients have increasingly chosen to travel to hospitals with higher published heart attack survival rates.
Rankings do matter. The public acts on them, literally staking their lives on them. They influence hospitals’ market share. Officials thus have a clear duty to publish the most accurate information they can.
Policymakers should ensure that their models are well calibrated. Results from the models should, on average, agree with the general advice their own data would yield on broad questions, such as the relationship between hospital volume and mortality. Models should agree with the data, not override them.
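A minimal version of such a calibration check, run here on the toy data from the earlier sketches rather than on CMS’s procedure, is to stratify by hospital volume and compare the model’s average predicted mortality with the observed death rate in each stratum.

```python
# Toy calibration check (illustrative only): within each volume group, the
# model's average prediction should track the observed death rate.
# Large gaps would signal a mis-calibrated model.

df["predicted"] = model.predict(df)
df["volume_group"] = pd.qcut(df.log_volume, 4,
                             labels=["lowest", "low", "high", "highest"])

calibration = df.groupby("volume_group", observed=True)[["predicted", "died"]].mean()
print(calibration)  # predicted vs. observed mortality by hospital-volume quartile
```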