Getting “Testy” with Medical Testing!

Back in the 1700s, an English mathematician and Presbyterian minister named Thomas Bayes published two works in his lifetime, one on math and the other on theology. Mr. Bayes never published the work he is most remembered for, yet Bayes’ theorem has been widely applied from mathematics to bioethics.

On the face of it, Bayes’ theorem isn’t that complicated. Essentially, what the math symbols say is that predicting the future likelihood of any event is based on the present likelihood of that event and whatever evidence (test results) you have that might alter that prediction.

Breast cancer offers a concrete example. To keep things simple, and ignoring familial cancer risk, the prevalence of breast cancer increases with age. A 75-year-old woman has a greater chance of breast cancer than does a 40-year-old woman, all other things being equal. A clinician may examine women of these ages and find a lump, yet the post-examination odds of breast cancer are going to be lower in the 40-year-old woman because her pre-examination odds were lower to begin with.

Small numbers multiplied by typically small factors (the technical term is likelihood ratios) render small results. If my odds for a disease are 1:1000 and a physical examination finding or test only contributes a multiplier (a.k.a. likelihood ratio) of 2, then the odds of the disease become 2:1000 and remain low. Thought of differently, if my likelihood ratio was 10 (and that’s an uncommonly good test), the odds of my subject having the outcome of interest increase tenfold, to 1:100. That sounds like a lot, but ten times a very small number is still a small number: about a 1% chance of disease. (The bedside rule of thumb that a likelihood ratio of 10 adds roughly 45 percentage points to the probability applies only when the pre-test probability is in the middle range, not when the disease is rare.)
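The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration using the 1:1000 odds and the likelihood ratios of 2 and 10 from the text; the function names are my own and not part of any standard library.

```python
from fractions import Fraction


def post_test_odds(pre_test_odds, likelihood_ratio):
    """Bayes' theorem in odds form: post-test odds = pre-test odds x likelihood ratio."""
    return pre_test_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds (e.g. 1/100 for 1:100) into a probability."""
    return odds / (1 + odds)


# Pre-test odds of 1:1000, as in the text.
pre_test_odds = Fraction(1, 1000)

for lr in (2, 10):
    odds = post_test_odds(pre_test_odds, lr)
    probability = float(odds_to_probability(odds))
    print(f"LR {lr}: post-test odds {odds}, probability {probability:.2%}")
```

Even with the uncommonly good likelihood ratio of 10, the probability of disease stays just under 1%.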

Healthy people are, well…healthy. Many diseases just aren’t common enough for meager test characteristics to add up to much diagnostic clarity. Herein lies the problem of screening healthy people. Even among the sick, some tests don’t improve upon what could be fairly well estimated from cheaper studies. Bladder testing among low-risk women is a good example. In this setting, basic office examinations have been shown to be as effective in predicting suitability for surgical treatment as expensive testing. Mr. Bayes was a genius.

Recently the American Board of Internal Medicine conveniently pulled together the Five Things Physicians and Patients Should Question, based on the evidence-based recommendations of specialty societies representing 500,000 physicians. Among the recommendations submitted by the American College of Obstetricians and Gynecologists were to not treat mild cervical dysplasia of less than two years’ duration and to not screen for ovarian cancer in asymptomatic women at average risk. The reasoning behind these and the many other excellent society recommendations can often be traced back to Bayes’ theorem. Either the disease is rare or the test/treatment performance is poor, or both, so Bayes’ equation calculates that it’s just not worth the effort. Indeed, beyond the cost of testing, false positive tests cause a lot of harm to a lot of people every year.

Dancing gorillas say a lot about another problem with testing. In a recent study, researchers snuck the image of a dancing gorilla onto a radiograph that was later reviewed by radiologists. The radiologists did their job and reported what they saw clinically in the radiograph, but when asked about the dancing gorilla they were stumped. Few saw the dancing gorilla. Is this really that surprising? Not really. There is collusion between our eyes and brain such that what we see is what our brain says is to be seen. Dan Gilbert’s book, Stumbling on Happiness, nicely describes research supporting this eye-brain conspiracy. The problem of dancing primates is that a test might not always clarify one diagnosis or another but merely confirm the plan already hatched in the mind of the person ordering the test.

Such seems to be the case for a young woman I saw who had undergone three procedures to treat her bladder problem. She had undergone the usual testing to justify the procedures done. Surprisingly, none of the tests confirmed the diagnosis pursued in doing the procedures. In other words, the tests didn’t matter. This scenario could be helped by a better understanding of Mr. Bayes’ theorem. For this young woman, is a bladder problem so troublesome that three procedures prove ineffective a common one? Not so much. Herein was a red flag that the pre-test odds might make any test result suspect, which should have inspired needed diagnostic and therapeutic caution. Understanding how common a disease is in your population can help you see the dancing gorilla.

I have surely abused some treasured statistical concept in my simplification of what can quickly become very complex. For example, I have mixed odds and probabilities as if they are synonyms. They are related but importantly different. I also haven’t mentioned that test results can be combined. Pre-test odds of disease multiplied by the likelihood ratios of test 1 and test 2 and test 3 (provided the tests are reasonably independent of one another) may offer a remarkably good prediction for the presence of a given disease. Nonetheless, Bayes’ theorem says a lot about the limits of evidence-based medicine even if it is at the heart of many of its recommendations. The “art” of medicine may be a clinician’s capacity to understand the prevalence of a given condition among her population. Good medicine will always rely on the merger of common sense and science, and in a way this is what Bayes’ theorem is saying.
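For readers who want to see the difference between odds and probabilities, and how several test results might be chained, here is a minimal sketch. The 1% prevalence and the likelihood ratios of 4, 3, and 5 are made-up numbers for illustration only, the function names are my own, and the calculation assumes the tests are independent of one another, which real tests often are not.

```python
from math import prod


def probability_to_odds(p):
    """A probability of 0.01 corresponds to odds of 0.01 / 0.99, roughly 1:99."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Convert odds back into a probability."""
    return odds / (1 + odds)


def combine_tests(pre_test_probability, likelihood_ratios):
    """Multiply pre-test odds by each likelihood ratio (assumes independent tests)."""
    odds = probability_to_odds(pre_test_probability) * prod(likelihood_ratios)
    return odds_to_probability(odds)


# Hypothetical inputs: 1% prevalence and three positive tests with LRs of 4, 3, and 5.
post_test_probability = combine_tests(0.01, [4, 3, 5])
print(f"Post-test probability: {post_test_probability:.1%}")
```

With these illustrative inputs, three modest tests together move a 1% starting probability to a little under 40%, which no single one of them could do alone.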

Mr. Bayes’ ideas on conditional probability were not immediately embraced by his mathematics peers. Many clinicians still do not embrace statistical methods in their daily practice. I am reminded of George Santayana, who said, “Those who cannot remember the past are condemned to repeat it.” In health care, for both patient and practitioner, Mr. Santayana’s sentiment echoes Mr. Bayes’ theorem and could as easily be reconfigured as: those who do not appreciate mathematics are doomed to spend lots of money.



“It’s hard not to be romantic about baseball.” So goes a refrain uttered by Billy Beane in the movie “Moneyball.” If you haven’t seen the movie, it is based on the true story of William Lamar “Billy” Beane, the general manager of the Oakland Athletics. Billy, given his personal disappointment in how ineffective baseball scouts were in estimating the “value” of a player, turned to “data” to guide who should play and how. In short, in 2006 Billy took a $62 million team to the American League Championship Series the same year the New York Yankees spent $195 million for nothing. Spending more money in baseball doesn’t necessarily guarantee better outcomes. Sound familiar?

Here’s a dialogue from the movie. The two characters are Billy and a fictional character named Peter Brand. Peter in the movie is portrayed as the one who taught Billy how to apply economic principles in interpreting a player’s performance.

Billy Beane: It’s hard not to be romantic about baseball. This kind of thing, it’s fun for the fans. It sells tickets and hot dogs. Doesn’t mean anything.
Peter Brand: Billy, we just won twenty games in a row.
Billy Beane: And what’s the point?
Peter Brand: We just got the record.
Billy Beane: Man, I’ve been doing this for… listen, man. I’ve been in this game a long time. I’m not in it for a record, I’ll tell you that. I’m not in it for a ring. That’s when people get hurt. If we don’t win the last game of the Series, they’ll dismiss us.
Peter Brand: Billy…
Billy Beane: I know these guys. I know the way they think, and they will erase us. And everything we’ve done here, none of it’ll matter. Any other team wins the World Series, good for them. They’re drinking champagne, they get a ring. But if we win, on our budget, with this team… we’ll have changed the game. And that’s what I want. I want it to mean something.

In real life, the person who best fits the character of Peter Brand is Paul DePodesta. In truth, the 20 consecutive wins came in 2002 and the American League Championship Series appearance came in 2006. It’s also more likely that Billy and Paul DePodesta were less concerned about “changing the game” than about simply fielding a decent team after the Athletics’ owners slashed the budget in 1995. Facts can really foul up a romantic tale.

For a moment, however, consider the analogs to medicine. Over the years, U.S. medicine has been flush with money. The “value” of a physician was estimated on metrics that had little to do with bringing patient value to a hospital or, more importantly, a community. To medicine’s shame, baseball has vastly more “stats” on its players, and these “stats” actually mean something insofar as how players are used across a team or teams. Ask surgeons what their reoperation rates are, or their rates of transfusion or average length of stay…silence. To be fair, these are difficult numbers. Many point out that numeracy is so poor among patients that they wouldn’t understand these numbers anyway. But lots of folks can understand a batting average or a pitcher’s ERA. Medicine should and can do better.

It’s hard not to be romantic about baseball. I love baseball. There is some palpable sense that when sitting across from third base that the differences between rich and poor, black and white, young and old, times good and bad are all narrowed. The world seems a better place when considered across the bright green of a manicured baseball field. Medicine can be romantic too. All our differences should be no less narrowed and while the Listerine floors of a ward pale in comparison to a baseball diamond, the sick should sense the same opportunity for betterment in this “house.” Billy Beane turned down an offer from the Boston Red Sox to be their general manager at a salary that would have made him among the highest paid GMs in history. That sort of decision has to come from some deep place that values something more than money. Mission balances margin and margin is not the mission. Do you want to change the “game” of medicine? I do.


Compared to what?

A flyer appeared in my mailbox the other day advertising a surgical technique for a particular hospital system. I’ll refrain from specifics. The advertisement, however, wasn’t unlike a lot of health-related advertisements. In this case the advertisement claimed “less pain,” “shorter hospital stay,” and “faster return to your regular activities” with this new surgical technique. When confronted with this sort of claim, the savvy health consumer should ask, “compared to what?”


Health care is expensive and fee-for-service still keeps the lights on for most hospitals. Getting folks in the door is the first step in getting those fees rolling. I hate to sound like medicine is just another business but in some ways it is. It would be naïve to think advertisements by health systems are motivated exclusively by some sense that coming to one health system over another is in your best interests. That would be altruism, and there are some who legitimately feel this way about advertising health services. In some cases they are right (e.g. going to one hospital is better than going to another) but in most ways the consumer of health services needs to embrace the adage, “let the buyer beware.”


Most of medicine is practiced without substantial proof that it works. Indeed, according to the now defunct Office of Technology Assessment (OTA), fewer than 30% of procedures currently used in conventional medicine have been rigorously tested. Since the OTA was retired in 1995, and given that the introduction of new procedures and technologies has likely outpaced credible investigations of their clinical merit, that 30% number is likely an overestimate of what has actually been tested. This is certainly true for the surgical tool marketed in my advertisement mailer. Indeed, credible investigations comparing this surgical approach with other conventional approaches do not fully support the claims made. Even more poignant is that the innovation was being promoted for a procedure that one study estimated was done without good cause in 70% of cases. Ouch!


To be sure, physicians, and by extension the health systems they represent, want to do well for their patients. At least part of the problem is that much of medicine doesn’t understand what constitutes “proof” of effectiveness. For example, one study found that 60 percent of gynecologists asked to estimate a woman’s chance of having breast cancer following a positive screening mammogram overstated the risk. Statistics, that branch of mathematics that is often central to judging the comparisons made in clinical studies, can be really confusing, and acquiring and maintaining a working knowledge of it is difficult for most clinicians. To make matters worse, industry influence and just plain bad study methods have made identifying credible investigations almost like looking for a lost needle on the hospital operating room floor…or haystack.
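To see why the risk after a positive screening mammogram is so easy to overstate, here is a rough sketch of the calculation. The prevalence, sensitivity, and false positive rate below are illustrative numbers I picked for the example, not figures from the study mentioned above, and the function name is my own.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Chance that a person with a positive test actually has the disease."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Illustrative assumptions only: 1% of screened women have breast cancer,
# the test catches 90% of cancers, and 9% of healthy women test positive.
ppv = positive_predictive_value(
    prevalence=0.01,
    sensitivity=0.90,
    false_positive_rate=0.09,
)
print(f"Chance of cancer after a positive test: {ppv:.1%}")
```

Under these assumptions, fewer than one in ten women with a positive screen has cancer, because the healthy women who test positive far outnumber the sick ones.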


So what are you supposed to do? Here are some helpful rules of thumb about consuming medicine:

1)   Carefully weigh any recommended therapy against what you find important, and seek to understand what evidence exists that the therapy addresses it.

2)   Take advantage of a medical librarian. Many hospitals have medical librarians who can help you find studies relevant to your problem. See what they can find. You can “Google” your problem but sometimes that just brings up junk. A medical librarian can help sift some of the junk out.

3)   Don’t be afraid to ask your physician hard questions about what proof justifies the treatment recommendation. If they appear offended check out the next rule of thumb.

4)   Get a second opinion…maybe a third?

5)   Understand the bias. Bias occurs anytime there are hidden influences that push opinions in a specific direction. Bias can be good or bad but you’d like to minimize bias in medical studies or in treatment recommendations. One problem folks have with industry buying physician lunches is the problem of bias. On September 30, 2014, as part of the Sunshine Act, the federal government will release most of the data showing how much money industry gave to physicians. In pursuit of understanding the bias, that data might be helpful to look over when it becomes available.

6)   Seek a consultation with a physician who publishes research. If a physician conducts and publishes research then there is a better chance they are better at interpreting the current literature on a topic. Note I just said “better.”

7)   If waiting is clinically appropriate, resist the temptation to decide on any aggressive therapy the day you are offered it.


My list isn’t perfect. There are always going to be gaps between what you and your clinician know about a given medical problem. That gap is safeguarded historically by a privileged ethical relationship wherein the patient’s interests are to be given priority over those of the clinician or the health system. There are lots of good and bad reasons why understanding what’s best is not such an easy task. Facing that challenge, when presented with medical facts regarding a treatment being offered to you, don’t forget to ask, “compared to what?”


Lake Wobegon Medicine

If you’re a fan of Garrison Keillor and the NPR show “A Prairie Home Companion,” you know about Lake Wobegon. Keillor’s closing words from this fictional Midwest destination were always the same: “Well, that’s the news from Lake Wobegon, where all the women are strong, all the men are good looking, and all the children are above average.” An amusing sentiment, since clearly not all of any population can be above average with respect to that population.

An average is colloquially the middle of a collection of numbers. Strictly speaking, it is the median that puts precisely half of the collection above and half below; the arithmetic mean does the same only when the numbers are spread symmetrically. I acknowledge there are other measures of central tendency. My point, however, is that across a collection of numbers or children or physicians, not everyone can be above average. It’s just not possible.
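A small sketch makes the point about averages concrete. The scores below are invented for illustration: one very low score drags the mean down, so most of the group lands above the mean, yet never all of it.

```python
from statistics import mean, median

# Hypothetical test scores for seven children; one low outlier skews the group.
scores = [10, 80, 82, 85, 88, 90, 95]

average = mean(scores)
middle = median(scores)

above_mean = sum(1 for s in scores if s > average)
above_median = sum(1 for s in scores if s > middle)

print(f"mean {average:.1f}: {above_mean} of {len(scores)} above")
print(f"median {middle}: {above_median} of {len(scores)} above")
```

Six of the seven are above the mean, but only three are above the median, and in no collection can every member be above either one.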

An often-cited criticism of the U.S. educational system, particularly with respect to mathematics education, is its favoritism toward those students who possess an innate talent for a given subject. This favoritism produces a sense that you should have no interest in subjects for which you possess no talent. Nonsense. Being innately “below average” may indeed have some benefits. Arguably the best teachers of a difficult subject are those who have struggled with it and can show others a path toward comprehension.

“C = MD” is often cited among many in medical school and in part it is true, but unlike the children of Lake Wobegon, clearly there are at least some future physicians who are NOT above average among their peers. Does this matter? It can in a number of aspects, but the aspect I am most interested in is that among practicing physicians there can be a sense that we’re all equal when clearly this can’t be the case. Honest appraisal of our talents induces the humility to admit those instances where we can learn and become better, or step aside and let others with better talents take over. This sort of appraisal can get really messed up. Pride, lack of reference, remuneration, time, and sloth, among other things, can make appraising our talents difficult. Note I’m not saying physicians consciously avoid self-appraisal, but who among us, in any context, wants to hear what we need to improve? That’s like saying you want to have your teeth cleaned: you do it because you know it’s good for you, not because you want to. I’m reminded of the ancient Greek aphorism, “know thyself.” There is growing pressure for physicians to “know thy outcomes” as part of reforming health care, but for those efforts to work medicine needs to first acknowledge we all possess a need to know. Lake Wobegon is fiction and so too is the notion we in medicine are in every respect all above average…everyone needs to be working to be better.


Seeking Synthesis?

Georg Wilhelm Friedrich Hegel was one of those philosophers that nobody seemed to really understand in Philosophy 101. He appeared soon after Immanuel Kant on the syllabus and perhaps mental fatigue had set in. One of Hegel’s ideas I appreciated was the triad of thesis, antithesis, and synthesis. It’s disappointing to now learn from professor Wikipedia that some experts discredit Hegel as its author. So much for idealism? Nevertheless, according to professor Wikipedia the triad is described as follows:

  • “The thesis is an intellectual proposition.
  • The antithesis is the negation of the thesis, a reaction to the proposition.
  • The synthesis solves the conflict between the thesis and antithesis by reconciling their common truths and forming a new thesis, starting the process over.”

This idea of truth being iterative, a revolving cycle of idea, counter idea, and synthesis of them, strikes me as very appealing and sadly uncommon.

Medicine is full of theses and antitheses. This state of affairs is even truer today than 10 years ago. The problem I often encounter is there isn’t much synthesis. In this sense I’d say there is possibly less of this than 10 years ago. Error is regarded with such fear that instead of respecting and learning from it – the effect of synthesis when allowed to happen – everyone upholds their respective thesis or antithesis in their respective corners and respecting each is regarded as “tolerance.” To seek synthesis, to debate openly with passion, is the epitome of poor taste.

Thomas Jefferson, writing of the University of Virginia, which he founded (his alma mater was The College of William & Mary), penned, “For here we are not afraid to follow the truth wherever it may lead, nor to tolerate any error so long as reason is left free to combat it.” I think Jefferson might have appreciated the triad. Error shouldn’t be tolerated but it ought to be respected insofar as it can perhaps best identify where the truth may lead. Today physicians, hospital administrators, patients, everyone fears error and truth suffers. If our present care systems have rendered a poor product (high cost, mediocre outcomes), we have to set off in a new direction, and to do that we must not be afraid of truth; we need to confess, respect, and synthesize from error. Seeking synthesis will appear to some as brutishness. Truth seeking is not for the timid.



If a patient has a fixed belief that a treatment outcome includes something not regularly reported for it or commonly experienced by the treating physician, yet this is the primary reason the patient seeks the treatment, should the physician administer it? You’re considering a question of competency – not competency in the legal sense but in the ethical sense of informed choice. Is the patient’s choice for the treatment really competent? What moral responsibility does a physician have toward ensuring a patient understands what they are getting into? There is a fine line between a physician abusing the unequal power across the physician-patient relationship and paternalism.

I don’t have easy answers to this question. Fundamentally, why do we sense that a physician who acts in such a situation might be doing something wrong? And why do we sense that not acting affronts a patient’s choice?

Does the risk or cost of a treatment matter in such a question? Should how the physician is paid matter? In surgery both the costs and risks can be significant, and for years medicine has operated in a fee-for-service model that compels physicians to do more because they get paid more. It is this physician-first, patient-second sense that studies citing the Dartmouth Atlas sometimes intimate and that, when acknowledged, spawns alarming headlines.

The rates of hysterectomy across different parts of the United States are not uniform. Women are far more likely to have a hysterectomy in the South or Midwest than in the Northeast. If, for the sake of argument, women in the South are having hysterectomies without being fully competent in that treatment choice (note that an article appeared in Obstetrics and Gynecology in 2000 reporting that 70% of hysterectomies are done without sufficient medical indication; I’m guessing that percentage has improved over 14 years), are the physicians violating the categorical imperative by using persons as a means to an end and not as ends? Alternatively, if physicians do not do these hysterectomies, will “hysterectomy eligible” southern women view their physicians negatively, as not delivering good care? Ironically, given today’s emphasis on “patient experience” in physician evaluations, such a view among these women could eventually favor physicians who are less evidence-based. This sort of reasoning has been proposed for why the surgical robot is misused in gynecologic surgery.

As I said I don’t have any easy answers but perhaps what can be appreciated is the sense that these questions have moral roots. The systems at work in driving better care are rooted in tough moral questions that in a pluralistic society may never have a stable response. It should also be appreciated that there is a balance at work here that can easily tilt in favor of endpoints that are both negative. Respecting a patient’s autonomy (in my example maybe the patient really does know something the physician doesn’t) is balanced against a physician’s duty to not harm the patient and to equitably administer limited health resources (among others). Such appreciation would I hope inspire humility among us all.