Visual Displays in Evidence-Based Science: From Aristotle to Modern Graphs

Pictures at an Exhibition: The role of visual displays in an evidence-based science
Howard Wainer
National Board of Medical Examiners

We typically date the beginning of empirical science to Aristotle (384 BC - 322 BC). Both before and after Aristotle there was strong opposition to an empirical approach, for data did not always support popular opinion. Aristotle got away with it because he had a Great friend who protected him. The next big step forward took more than 1500 years and is generally credited to the work of the fabulous Bacon boys.

Modern evidence-based science probably begins with Roger Bacon (1214-1294)

Although it is more often credited to Francis Bacon (1561-1626)

Before ways to look at data could be developed, we needed the epistemology that confirmed that the path to knowing required data. It was surely not an accident that breakthroughs in looking at data appeared after the British empiricists John Locke (1632-1704) and George Berkeley (1685-1753) and the Scot David Hume (1711-1776) expanded and amplified Bacon’s work.

Language developed before science and so was not ideally suited to it. A more visual communication medium was needed to allow us to look at scientific evidence. One started to appear in the 17th century, but it achieved most of its modern-day strength with the publication of Playfair’s Atlas in 1786. By the 19th century scientific presentations were so laconic that the words almost disappeared entirely.

Much of modern science involves, to some extent, the atheoretical plotting of points and a search for suggestive patterns. “The greatest value of a graph is when it forces us to see what we were not expecting.” J. W. Tukey, 1977

The balance of this presentation is a single illustration of the marvelous breadth of possibilities and the lessons it provides.

Designer Will Burtin, whose 1951 graph of the efficacy of 3 antibiotics on 16 bacteria forms the core of this presentation.

The entries of the table are the minimum inhibitory concentration (MIC) that represents the concentration of antibiotic required to prevent growth in vitro. The covariate “Gram staining” describes the reaction of the bacteria to Gram staining. Gram-positive bacteria are those that are stained dark blue or violet; Gram-negative bacteria do not react that way.

The cause of evidence-based science requires looking at evidence to advance the science. How is this cause helped by presenting data in tables? “Getting information from a table is like extracting sunbeams from a cucumber.” Farquhar & Farquhar, 1891

Despite this warning, the table dominates the medical literature as the preferred conveyor of quantitative evidence.

Note how much more eloquent dot plots are than the more usual pie charts.

Gram staining is named after its inventor, the Danish scientist Hans Christian Gram (1853-1938), who developed the technique in 1884.

Display Challenges
• Scale - the MICs vary over six orders of magnitude; a display in the MIC metric will improperly lump together all data less than 100.
• Metaphor - a graph is a visual metaphor. Which one is chosen to represent the data is critical to understanding and to memory.
• Adaptability - a powerful display should allow us to generalize beyond these data by accommodating additional drugs and additional bacteria.
• Allow Comparisons - the display should let us compare the efficacy of the three drugs and spot similar behavior among bacteria.
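The scale challenge can be made concrete with a small sketch. The MIC values below are illustrative placeholders spanning six orders of magnitude, not the entries of Burtin's table, and the names `mics`, `linear_fraction` and `log_fraction` are assumptions introduced only for this illustration. The sketch compares where each value would land along a linear axis and along a log10 axis.

```python
import math

# Illustrative MIC values spanning six orders of magnitude
# (hypothetical placeholders, not Burtin's data).
mics = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]

def linear_fraction(value, values):
    """Position of value along a linear axis running from min(values) to max(values)."""
    lo, hi = min(values), max(values)
    return (value - lo) / (hi - lo)

def log_fraction(value, values):
    """Position of value along a log10 axis running from min(values) to max(values)."""
    lo, hi = math.log10(min(values)), math.log10(max(values))
    return (math.log10(value) - lo) / (hi - lo)

for m in mics:
    print(f"MIC {m:>8}: linear {linear_fraction(m, mics):.4f}, log {log_fraction(m, mics):.4f}")
```

With these placeholder values, everything at or below 100 is squeezed into roughly the first tenth of the linear axis, while the log axis spaces the values evenly.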

Antibacterial ranges of Neomycin, Penicillin and Streptomycin
The chart compares the in vitro sensitivities to neomycin of some of the common pathogens (Gram+ in red and Gram- in blue) with their sensitivities to penicillin and streptomycin. The effectiveness of the antibiotics is expressed as the highest dilution in μg/ml that inhibits the test organism. High concentrations are inward from the periphery; consequently the length of the colored bar is proportional to the effectiveness.

• Good displays allow us to answer the questions of interest.
• Better displays force us to see what we were never expecting.
• Great displays also form a coherent image that stays in our memory.

The Greatest Statistical Graph Ever Drawn

The four purposes of graphic display
• Exploration - there is a message in the data and the display helps us to learn what it is.
• Communication - we know something and the display helps us to tell others effectively.
• Calculation - as in a nomograph, the display automatically calculates for us.
• Decoration - the graph is pretty and enlivens the presentation.

The first display is an unusual combination of exploration and decoration. No one could doubt that this display would make a gorgeous poster -- but careful study reveals much more. More about this later.

The Gram-negative pill is 74 times larger than the Gram-positive pill.

Drop volumes are not as evocative as the number of pills, but they provide separate estimates by bacteria type. This is a hybrid display that also includes the data table.

Escaping Flatland
Burtin’s data can be thought of as three-dimensional, in which each antibiotic is a dimension and each bacterium is a point in that three-space. This approach allows us to immediately address questions about the relationships between antibiotics.
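As a rough sketch of the three-space idea, each bacterium can be stored as a triple of MICs and compared with the others by distance. Everything here is an assumption made for illustration: the bacterium labels and MIC values are hypothetical placeholders rather than Burtin's table, the coordinates are taken as log10 of the MIC, and `to_point` and `distance` are names invented for this example.

```python
import math

# Hypothetical MICs per bacterium, ordered (penicillin, streptomycin, neomycin).
# Placeholder values only, not Burtin's data.
profiles = {
    "Bacterium A": (0.001, 10.0, 10.0),
    "Bacterium B": (0.005, 10.0, 40.0),
    "Bacterium C": (1.0, 1.0, 0.1),
}

def to_point(mics):
    """Map three MICs to a point in log10 space, one coordinate per antibiotic."""
    return tuple(math.log10(m) for m in mics)

def distance(p, q):
    """Euclidean distance between two points in the three-space."""
    return math.dist(p, q)

points = {name: to_point(m) for name, m in profiles.items()}
print(distance(points["Bacterium A"], points["Bacterium B"]))
print(distance(points["Bacterium A"], points["Bacterium C"]))
```

Bacteria whose points lie close together respond to the three drugs in similar ways, which is one way to pose the question of which bacteria behave alike.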

Are there less arcane alternatives? Often simpler is better. If a display is not going to be repeated (as in an atlas) we should avoid asking the viewer to work hard to learn a new display format unless that format offers remarkable benefits unavailable elsewhere.

In cooking, any dish can be improved with either garlic or chocolate chips (never both). By the same token, almost any display can be improved by adding information. The bars take up a lot of space, but there is still room. This version has extra interpretive information and also corrects some errors (more about this later). Note that the dependent variable is 1/MIC, so that bigger is better.
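The 1/MIC transformation can be sketched in a few lines. The function names `efficacy` and `log_efficacy` and the two MIC values are assumptions chosen for illustration, not figures from Burtin's table. The second function relies on the identity log10(1/MIC) = -log10(MIC), which may be convenient if the reciprocal is drawn on a log scale.

```python
import math

def efficacy(mic):
    """Reciprocal of the MIC: a lower inhibitory concentration gives a larger score."""
    return 1.0 / mic

def log_efficacy(mic):
    """log10 of 1/MIC, computed as -log10(MIC)."""
    return -math.log10(mic)

# Hypothetical MICs for one bacterium under two drugs (placeholder values).
mic_drug_a, mic_drug_b = 0.01, 10.0

# The drug that inhibits growth at the lower concentration gets the higher score.
print(efficacy(mic_drug_a) > efficacy(mic_drug_b))
print(log_efficacy(mic_drug_a), log_efficacy(mic_drug_b))
```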

Bars are nice, but all of the information in the bar is contained in its terminal line. Why use up all of the space with a bar (decreasing the data-ink ratio)?

We’re almost never interested in ‘Alabama First.’ Or, in this case, ‘Aerobacter aerogenes’ first. Data displays should almost always be ordered by some aspect of the data, not the alphabet.

Ordered by Gram character and Penicillin efficacy
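A data-driven ordering of this kind might be sketched as a two-level sort. The rows, labels and MIC values below are hypothetical placeholders, and the direction of each sort (Gram-positive first, then most penicillin-sensitive first) is an assumption, since the slide names only the two sort keys.

```python
# Hypothetical rows: (bacterium, Gram stain, penicillin MIC). Placeholder values only.
rows = [
    ("Bacterium A", "negative", 100.0),
    ("Bacterium B", "positive", 0.005),
    ("Bacterium C", "negative", 1.0),
    ("Bacterium D", "positive", 1.0),
]

def order_for_display(rows):
    """Sort by Gram stain (positive first), then by penicillin MIC (lowest first)."""
    return sorted(rows, key=lambda r: (r[1] != "positive", r[2]))

for name, gram, mic in order_for_display(rows):
    print(f"{name:<12} Gram-{gram:<9} penicillin MIC {mic}")
```

Sorting on a tuple key keeps the Gram groups contiguous while ordering by efficacy within each group, so the alphabet plays no part in the row order.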

On Legends
Using a legend requires us to use two moments of perception to understand the graph. We must first read and memorize the legend and then look at the graph. It changes the task to reading the graph rather than seeing the graph. The latter is easier, faster, and less error prone. Avoid legends whenever possible.

Changing the plotting symbol to something more evocative allows us to eliminate the legend

Dots rarely form memorable images
We can often make a dot plot more memorable by connecting the dots. Some will carp that connecting dots from categorical variables is misleading; phooey! All that paying attention to such complaints will get you is to miss a memorable picture that you might have made.

There are two kinds of good displays. A Strongly Good Display - that tells you everything you want to know just by looking at it. A Weakly Good Display - that tells you everything you want to know just by looking at it, once you know what to look for. You can change a weakly good display into a strongly good one through the inclusion of informative labels.

Escaping Flatland through the use of multivariate icons
One way to show multivariate data on a two-dimensional surface is to invent a multivariate icon in which each feature represents one dimension of the data (remember Minard’s 6-dimensional display of Napoleon’s March).

But never waste the two dimensions of the display plane
Spatial representation is the most powerful perceptual tool. It allows us to see information instead of reading it.

“That’s funny...” The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka’ but ‘That’s funny....’ Isaac Asimov (1920 - 1992)

Why is Streptococcus fecalis so different? It would seem that its credentials as a member of the Strep family are impeccable; as Sherman, Mauer & Porter (1937) described it: “In some respects Streptococcus fecalis (Andrewes & Horder, 1906) might be considered one of the better established species of the streptococci, and certainly some of the rather unique characteristics of this organism, or the general group to which it belongs, are commonly known by bacteriologists.” Yet, in 1984, its genus was changed and its name became Enterococcus faecalis.

Perhaps, had these data been plotted in a way that allowed us to compare the profile of responses of these bacteria to antibiotics, the classification of these two bacteria would have come under scrutiny sooner.

Now that we know what to look for, it is hard to miss.

Even a table, if well-prepared, would’ve shown the phenomenon
