Edward Tufte’s Beautiful Evidence

“We must have an endless commitment to finding, showing, and telling the truth”- Edward Tufte


Edward Tufte is a professor emeritus of political science, statistics, and computer science at Yale University. He is also the closest thing we’ve got to a household name in the field of information design, devoted equally to the data and its display, its function and its form. His beautiful, self-published books are loaded with rich visuals covering all sorts of graphs, charts, and maps, many historic and some new. He has worked as a consultant to NASA and the United States government, among many others. He is also an artist and has developed a 234-acre sculpture park in Woodbury, Connecticut. He periodically gives a one-day seminar course on “Presenting Data and Information” which, though the title sounds impossibly dry, draws attendees from around the globe.

 

A ripple ran through the crowd as Edward Tufte, “The Leonardo da Vinci of data” (Shapley, 1998) and “The Galileo of Graphics” (Aston, 2009), took the stage to introduce a music visualization as the opener to his seminar in Washington, DC, on November 6, 2017. The lights went out completely and I could no longer even see the note paper in front of me. Color bars passed across a screen, pulsing in coordination with the musical notes of a recording of Chopin, part of the Music Animation Machine by Stephen Malinowski.

 

But before we even got to the beautiful music, we had quiet reading time. Edward Tufte begins all of his presentations with “Study Hall,” a dedicated segment of silent time in which the audience reads a written document of 3-4 narrative pages that describes the concepts and information about to be presented. He is known for his disdain of typical PowerPoint presentations and describes the study hall segment as the first step in a successful presentation because it places control (physically) in the hands of the audience. He treats the user as the priority when setting out to provide education, evidence, and information. His position is that viewers are best placed to scan the materials for what interests them, skip what they don’t want, and hold onto the hard copy for future reference. In this way the presentation may be customized and personalized by the viewer and not just the presenter. The meeting begins by empowering the ones who are there to learn and makes them active participants right from the start, rather than passive observers. This “silent start” study hall strategy is also used by Jeff Bezos at Amazon for all company meetings as an efficiency measure that his team swears by (Bariso, 2017). Attendees have time first to learn and to think, and then to be more engaged in the meeting.

Frédéric Chopin, Berceuse, Opus 57, depicted by Stephen Malinowski’s Music Animation Machine

Tufte describes the PowerPoint presentation as “stacked in time.” The concepts and data are dribbled out over time and there is no way for the audience to gather the whole in a cohesive context. It may be the easiest to create and show, but it is not the best way to clearly and accurately inform the viewer. When presenters subject their viewers to the passive experience of sitting through the traditional PowerPoint slide deck, the viewers are entirely at the mercy of the presenter’s pace and choices. “For all of the disruption we have seen from the tech industry, there is a complete lack of creativity.” (Tufekci, 2017) PowerPoint has enjoyed the default presentation position for years. Tufte compares it to voicemail menus where a listener must wait through a list of options to find out what to press, another interface that is stacked in time and inconvenient for the audience. Wilson describes a similar turn in information research: a “focus on the person… a shift towards a ‘person-centred’ approach, rather than a ‘system-centred’ approach. This has been accompanied by a switch from quantitative methods to qualitative methods.” (Wilson, 2000) Tufte very decisively advocates for a human- and user-centered process that addresses both quantitative and qualitative measures when sharing data and presenting information.

 

Besides the study hall beginning, Tufte describes the optimal way for users to consume information as “adjacent in space.” Using methods that allow the integration of different types of data (numbers, words, pictures, etc.) in the same space creates context for a richer, more comprehensive understanding. He shared Charles Joseph Minard’s data map tracking the French invasion of Russia in 1812. The visual creates a rich story by including numbers of troops lost over time as they relate to deployments, the terrain, and the weather. It provides lots of context to explain the devastating loss of life due to multiple factors. “Taking context seriously means finding oneself in the thick of the complexities of particular situations at particular times with particular individuals.” (Nardi, 1996, p35)

Charles Joseph Minard’s data map of Napoleon’s Russian invasion in 1812. 422,000 troops at the beginning. About 100,000 reached Moscow, and only about 10,000 returned.

Tufte had plenty of jabs for 3-D pie charts, drop shadows, bright colors, and “datajunk.” His directive is to first consider the data that you must convey, leave as much information intact as possible, and present it in as clear and uncluttered a format as possible. Visual clutter is the signature of the designer, the coder, the editor, and will only impede the learning process for the viewer. Just as Lessig describes his concept of Open Evolution as it relates to coding: “Build a platform, or set of protocols, so that it can evolve in any number of ways; don’t play god; don’t hardwire any single path of development; don’t build into it a middle that can meddle with its use” (Lessig, 1999, p110), Tufte is a strong proponent of keeping as much of the data itself available for the audience to interpret for themselves. There is no such thing as information overload, only bad design.

 

“Evidence is evidence, whether words, numbers, images, diagrams, still or moving. It is all information after all. For readers and viewers, the intellectual task remains constant regardless of the particular mode of evidence: to understand and to reason about the materials at hand, and to appraise their quality, relevance, and integrity.” (Tufte, 2006, p83)

 

The more data that is there, the more accurate and believable it will be. He raises significant concern around integrity in data analysis. Specifically, in the process of selecting data, he recognizes the broad practice of “cherry picking and lemon dropping” data to suit one’s biases. This is akin to James Moor’s Invisibility Factor in computer calculations: “Answers chosen will build certain values into the program…This becomes a significant ethical issue as the consequences grow in importance.” (Moor, 1985, p274) Gathering data appropriately and then showing it accurately takes skill and mindful discipline. Many data visualizations can skew the viewer’s understanding simply by using larger font sizes for some bullet points, and the more important the data, the higher the stakes. Tufte shared a slide presentation, drawn from his evaluation of the Space Shuttle Columbia flight, that imparted information in a way that led to dangerous decisions and ultimately may have contributed to the loss of life of seven crew members. The PowerPoint format was too simple, inappropriately edited, and did not show the important, complex data that needed to be considered. He argues that the world is complex and multivariate and that we must display the data in a way that encompasses this. Don’t dumb it down; have faith in the viewer.

 

Tufte warns us all, as creators and as users, to question the data we see and to think carefully about the relationship between the evidence and the conclusion. Start with an open question and do the research. Taking data from one project and appropriating it to something else will often lead to inaccuracy. Every step, from how the data is collected (is the scientist paddling over to a cleaner area of the lake to gather his sample for water pollutants?) to how the designer lays it out (are they removing some data to make it fit nicely in their grid?) is suspect. “In the political and philosophical sense in which I use the term here, neutrality is impossible. In any situation, there exists a distribution of power.” (Jensen, 2006, p91) He also urges us to question our own biases and to cultivate self-awareness about what we see.

 

Tufte urges us to always assume equality across a room. Give every idea a chance. Meet the challenge to see things in a neutral way rather than through the lens of our own bias. Always source alternative and divergent views. Create the environment for truth to be sought and revealed.

 

The last visual of Tufte’s talk asked:

 

How do they know that?

How do you know that?

How do I know that?

 

“When we turn over the provision of knowledge to others, we are left vulnerable to their choices, methods, and subjectivities. Sometimes this is a positive, providing expertise, editorial acumen, refined taste. But we are also wary of the intervention, of human failings and vested interests, and find ourselves with only secondary mechanisms of social trust by which to vouch for what is true and relevant.” (Gillespie, 2014, p187)

 

In the field of data analysis and visualization we must be accurate with sources, provide more information rather than less, and show it in the most logical and digestible way to our viewers. Ultimately we are seeking the truth, which Edward Tufte preaches can be found through evidence. If the evidence is shown properly, the right conclusion will be found.

 

Sources:

Aston, Adam (June 10, 2009) “Tufte’s Invisible Yet Ubiquitous Influence” Bloomberg.com https://www.bloomberg.com/news/articles/2009-06-10/tuftes-invisible-yet-ubiquitous-influencebusinessweek-business-news-stock-market-and-financial-advice (retrieved on November 9, 2017)

Bariso, Justin (2017) “Silent Start: The Brilliant and Surprising Meeting Method I learned from Jeff Bezos,”  Inc.com.  https://www.inc.com/justin-bariso/amazons-jeff-bezos-uses-a-brilliant-and-surprising.html  (retrieved on November 9, 2017)

Gillespie, T. (2014). “The relevance of algorithms” in Media Technologies: Essays on Communication, Materiality, and Society, eds. T. Gillespie, P. Boczkowski, and K. Foot. Cambridge: MIT Press, 167–194. http://www.tarletongillespie.org/essays/Gillespie%20-%20The%20Relevance%20of%20Algorithms.pdf with Shapin, Steven. 1995. “Trust, honesty, and the authority of science.” In Society’s Choices: Social and Ethical Decision Making in Biomedicine.

Jensen, R. (2006). “The myth of the neutral professional” in Questioning Library Neutrality, ed. A. Lewis. Library Juice, 89–96.

Lessig, L. (1999). “Open code and open societies: values of internet governance,” Chicago-Kent Law Review 74, 101–116. http://cyber.law.harvard.edu/works/lessig/final.PDF

Moor, J. H. (1985). “What is computer ethics?” Metaphilosophy 16(4): 266–275.

Nardi, B.A. (1996). “Studying context: a comparison of activity theory, situated action models and distributed cognition” in Nardi, B. (ed.) Context and Consciousness: Activity Theory and Human-Computer Interaction. MIT Press.

Shapley, Deborah (March 30, 1998) “The da Vinci of Data” The New York Times. (retrieved on November 11, 2017)

Tufekci, Zeynep (Nov 2, 2017) Confronting Surveillance Capitalism with Zeynep Tufekci, Civic Hall, New York City event.

Tufte, Edward (2006). Beautiful Evidence: 83-127.

Wilson, T. D. (2000). “Human information behavior.” Informing Science 3(2): 35.

 

 

 

 

 

 

Observation: Mount Sinai Department of Human Genetics

The Mount Sinai Hospital of New York City has a respected and established Genetics and Genomic Sciences department. The cancer genetics area of the department counsels patients before and after they choose to undergo genetic testing. The industry of genetic testing has expanded rapidly in the past ten years as science continues to identify specific gene mutations and their impact on individuals’ health risks. It has become more common for healthy adults to undergo genetic testing as a measure to predict cancer development based on the statistics and data related to known genetic factors. Genetic mutations can have harmful, positive, or neutral implications for a person’s risk profile. Ten years ago there were seven “significant” gene mutations that were understood and could be tested for. Today there are hundreds. Regardless of your genetic mix, there is no way to change your genes and the predictions for illness are only based on percentages in the data. There is no formula whereby an individual can know how their own health story will actually play out. Inherited mutations are believed to play a role in approximately 10% of all cancers. (National Cancer Institute, 2017)

Test Positive

Because you cannot change your genetic profile, the question is whether the knowledge of being “positive” for a harmful gene mutation is beneficial to one’s physical and psychological health. In some cases, having the information can allow for preventive surgeries and screenings that may reduce or eliminate the risk for certain cancers. The knowledge also has the potential to reinforce healthy lifestyle choices for those wishing to mitigate genetic risk factors through environmental or habit changes. The genetic counselor’s job is to spend time with patients before they have testing done, helping them determine whether the knowledge of their genetic profile will lead to a positive, healthier life (in the case of a patient who is ready and willing to take action as a result of the diagnosis) or whether the information could simply increase stress around a condition if the patient is unable or unlikely to change their situation. Counselors must also be aware of their own biases throughout the process to avoid leading the patient’s choice. After genetic testing is complete, the counselor meets again with the patient to discuss the results and work through next steps, if medical intervention is recommended.

As with the collection of other large data sets, the collection and analysis of genetic data brings with it unintended consequences. Issues of privacy, access, and discrimination exist in this domain. “Given enough data, intelligence and power, corporations and governments can connect dots in ways that only previously existed in science fiction.” (Howard, 2012) Privacy risks involved with cancer genetic testing include discrimination by insurance companies based on higher risk of illness and premature death. In 2008 the Genetic Information Nondiscrimination Act (GINA) was passed to protect those who have had testing from discrimination by health insurers and employers. However, the law does not cover disability, life, or long-term care insurance, and those entities are increasingly likely to ask potential customers whether they or anyone in their family has had genetic testing that revealed a significant mutation. Our longevity metric could land in the mix of profit-loss equations for these big businesses.

Those without good insurance coverage and those in poverty are far less likely to be able to get genetic testing at all, which means this trend could perpetuate disparities in life expectancy between wealthy and poor patients. Patients who can afford the testing will have more information and may be better able to make informed choices to extend their own health and the health of their family members.

The chart below maps the procedure for counseling, testing, results, and follow-up care. It is perfectly fine to decide “Not to Know” about one’s potential genetic risk profile. What the counselor seeks to avoid is revealing a significant mutation without anything then being done with that information to mitigate the risk. Knowing can be a burden, and it should be used as the power to change outcomes.

Gen testing

 

 

In order to set up the 3-hour observation, I contacted Karen Brown, Director of the Mount Sinai Cancer Genetic Counseling Program. She directed me first to speak with Volunteer Services at the hospital to meet the basic requirements for observing in any capacity at Mount Sinai. I was sent an “Observer Form” which needed to be signed by the counselor whom I would shadow. I also had to provide the following:

  1. Proof of HIPAA training. The Health Insurance Portability and Accountability Act is a law that applies to all healthcare industry professionals and their subcontractors and affiliates to create privacy protections around patient healthcare information (to which I would be exposed during my observation)
  2. Proof of medical clearance and toxicology screening
  3. Verification of credentials and qualifications (degree, letters of reference, CV)
  4. Flu shot
  5. Security photo ID (provided through the hospital)

 

My observation day was Friday, October 13, 2017.

First I sat with the front desk “intake” crew at the Cancer Genetics department. Calls come in at a steady pace from patients asking for genetic counseling. Most have been referred by doctors because they are exhibiting pre-cancerous symptoms at a younger-than-usual age. Others have been referred because a family member has tested positive for a known “significant” mutation and they want a diagnosis for themselves. Others call without a referral and are sent back to their primary care doctor to get one. If they have no clear medical reason for testing, they have the option to have testing due to “patient concern” and pay for the process out of pocket (they still must have a referral from a doctor to take this route). The cost is typically anywhere from $500 to $3,000. Large insurance companies generally cover some or all of the costs for patients with a family history of a known genetic mutation or if a parent or sibling has died of cancer under the age of 50 (a marker that genetics may have been at play).

Demographic information and medical history are collected over the phone, after which the patient is directed to a website where they enter extensive family history information into a family tree template (specifically any history of cancer, cancer deaths, or premature deaths due to illness that the patient is aware of).

Next I sat in with a cancer genetics counselor for a pre-testing appointment. I watched the interview process whereby a 42-year-old patient who has concerning symptoms learned about how the genetic test is done and what she may learn by submitting a DNA sample. The pre-cancer she has may be bad luck or it may be genetically inherited. The counselor goes through every person identified in the patient’s family tree and which relations died of cancer. All of this data will be entered into Mount Sinai’s database.

There is a specific panel of testing that the counselor highlights to this patient according to her symptoms. If the patient is found to have the suspected genetic mutation, the recommendation will be for her to undergo yearly screenings for several cancer types that she would then be known to have a high risk of getting. Surgeries may also be recommended to remove organs that are at high-risk. Many of the known significant mutations raise risk levels for more than one type of cancer. Anxiety understandably may increase with this extensive additional data. The patient is currently concerned about her stomach pre-cancer symptoms, but she may be recommended to have many more cancer screenings in the future if she is tested and the genetic result is positive. As more people have genetic testing more data is gathered to identify additional significant mutations and refine risk percentages for different conditions to which they are linked. The whole process ultimately leads to more data, more information, more screenings, more surgeries and these feed back into the system to create more data linked to each significant mutation. Logically it should reduce premature death rates. It also adds a lot of personal data into the healthcare industry and encourages many more surgeries and doctor visits going forward. The patients will hopefully live healthier and longer. The hospitals will generate more revenue as a result of the many added procedures.

The counselor walks this patient through the process while trying not to convince her of a particular direction. The woman has siblings and children and a positive diagnosis may have consequences for them as well. There are a couple of emotional moments and the patient fidgets nervously. She has a choice to “Not Know” or to possibly “Know.” If there is no genetic mutation identified in her cancer genetic test, she will need to continue looking for care answers with her primary doctor, but her family will not be directly at risk. She decides to move forward with the widest panel of testing because she believes she would rather know her complete cancer genetic profile. Consent forms are signed and the woman is brought down the hall for blood to be drawn. In four weeks she is scheduled to come back in for the results, at which point the next chapter of her care will begin.

Sources:

Alexander Howard as quoted in 2012 in Digital Disconnect by Robert W. McChesney, 2013

National Cancer Institute, https://www.cancer.gov/about-cancer/causes-prevention/genetics/genetic-testing-fact-sheet, retrieved on October 12, 2017

Mount Sinai Department of Genetics and Genomic Sciences

The Business of Free: Disruption or Destruction?

I have worked in the commercial photography industry for twenty years and have witnessed numerous disruptions. Stock photography disrupted assignments. The royalty-free license disrupted the rights-managed license. Digital photography disrupted traditional film photography. Internet marketing disrupted catalog marketing. Each stage and phase has raised questions and stirred angst for professional artists making their living through this creative medium. The digital revolution combined with the growth of internet commerce has created an environment of chaos for commercial content and media business models. Organizing media online so that it is effectively searchable and solving the riddle of how professionally produced content can be funded online have increasingly created obstacles for anyone who makes a living in the space of creative media (McChesney, Digital Disconnect, p82). At this point, we are witnessing not only radical disruption but the potential destruction of established business models due to a massive shift in what drives revenue in online commerce.

Photo by Coley Christine on Unsplash

Unsplash – The New Reality of Competing with Free
The company Unsplash was founded in 2013 in Montreal, Canada, and identifies itself as a “Beautiful Free Photo Community” with the subheading: “Do-whatever-you-wish HD photos. Gifted by the world’s most generous community of photographers.” (Unsplash.com) The CEO and founder, Mikael Cho, spearheaded the broad adoption of this model of copyright-free photography when he was looking for images for the web site of his company, Crew, and either did not find something he liked or found images that were pricier than what he was willing to pay. (Crew’s business model is to link graphic designers with clients through crowdsourcing: crew.co) He hired a photographer to shoot a custom photograph for the web site and then provided the outtakes in HD online, via Tumblr, at no cost, with the permission of the photographer, for anyone to use in whatever way they wished. The site experienced 20,000 downloads in its first two hours. (Cho, Medium.com)

From that beginning, the company has grown to over 250,000 images submitted by 40,000 contributing photographers, and enables over 10 million downloads per month. (Cho, Medium.com) Their downloading clients include Apple, Squarespace, Everlane, Slack, FB Workspace, to name just a few. (Unsplash.com)

Photographers from around the globe upload their pictures, which are edited by Unsplash curators, for inclusion on the site. All photographs uploaded to Unsplash enter Unsplash’s Creative Commons Zero license, equivalent to a public domain license, or copyright-free license. If a client-user clicks on a contributor’s photo and then onto their Unsplash profile, they will have access to all of the photographer’s uploaded pictures—whether they were selected by the curators for visibility on the platform or not—to download at no cost, and can copy, alter, or distribute them, or use them for products, prints, billboards, commercial advertising, editorial uses, or anything else, even to re-sell the image itself, though the company “discourages” this. They also say it would be nice to include credit for the photographer, but it isn’t necessary. (Boguslawska, Petapixel.com)

Unsplash identified and exploited an inefficiency in the marketplace for image licensing. First-time clients and those unfamiliar with licensing creative content become ensnared in the “hassle” of obtaining a license for a photograph, paying for the license, and then adhering to the demands of the particular license they have acquired. All of this becomes unnecessary when using pictures under the CC0 license. The Unsplash process is entirely friction-free. End users do not even need to register on the Unsplash site to download its high-resolution photos.

So who benefits from this model and in what ways? For any person or company wishing the freedom to use high-resolution pictures at no cost, with no licensing restrictions, the benefits are clear: zero restrictions, zero cost. However, what’s in it for the photographers who are willingly uploading their images to Unsplash? Why do they choose to offer their pictures, no-strings-attached, to be downloaded for free in perpetuity? Cho remarks: “it’s this extreme level of giving that produces the unprecedented level of connection.” (Cho, medium.com) The theory on the photographers’ benefit is tied to the idea of generating exposure and building an audience and a following. This attention, the “unprecedented level of connection” that Cho offers, may potentially lead to paid commission assignments for some photographers or to collaborative projects that generate revenue by building business relationships. While that outcome is theoretically possible, every submission also makes for a more robust free collection that effectively drains the industry of paid collaboration opportunities.

Cho does not “explain the potential impact that giving images away for free could have on the value of images.” Presumably no one will pay for an image if they can find a comparable one for free. (Risch, PDNonline.com)

If large companies with substantial budgets, such as those listed in Unsplash’s client list, are using this site to download for free instead of paying for a commercial license, won’t they continue to do so as long as they can find something usable on the platform? As Unsplash grows, more free photos will be available, making it less and less likely that any particular photographer is going to land one of these coveted paid assignments. And all the while the image downloaders are being acculturated to the normalcy and expectation of free. The traditional industry of copyrighted photography will suffer. How do you price your work to cover the cost of professional production, let alone cover living expenses in an environment that creates this level of pricing pressure? Sure – it is disruptive, but it is also destroying a creative industry’s viability.

What is conspicuously missing from interviews with Cho and blog posts about Unsplash is the business strategy that is most certainly incubating to monetize what they have built. The web site is beautiful. It carries no advertisements. Web hosting, image curating, cloud space, API capabilities, and extensive marketing are all being paid for by previously raised capital of $3.5 million. Raising capital necessarily implies that a business plan for eventual capital gain has been shared with investors. Cho explains that Unsplash was originally a loss leader for Crew and that it isn’t currently making any profit. (Calore, wired.com)

As for Mikael Cho and his team at Unsplash, though he focuses his message on love of community, there may be a significant payday in his sights. Crew was sold in 2016 (amount undisclosed). His priority is now to work on Unsplash. With the volume of traffic coming into Unsplash it is a brand that has value for anyone looking to capture eyeballs or track user data. The end game could likely be a sale to a larger company or an IPO that is perhaps at odds with the community vibe (and the notion that it is all about sharing and being generous). The product (in this case, photography) is not the business. The business is clicks and data. It is an entirely different kind of disruption.

They are peddling free, but the story may hold a twist. As we examine this pattern, is it logical to see this relationship as one more piece in the machine that drives and reinforces income inequality? The creators of the product are complicit in the arrangement, providing their intellectual property at no cost with no copyrights, for a shot at being noticed by “the audience.” Big business does not have to pay and even makes money off the backs of free creative content producers. And as the arrangement proliferates, the likelihood of making real money for photography by the small individual producer diminishes, even as that product has true value to the companies that freely use it. Ultimately the value of customer data and audience-building may eclipse the value of pictures. That trajectory has nothing to do with the quality of the image, or whether the creator is amateur or professional. It simply creates a definite benefit for the commercial users who download free content, and a perceived, but questionable, upside for the generous contributor.

Sources:
McChesney, Robert W. (2013) Digital Disconnect: How Capitalism Is Turning the Internet Against Democracy, pp 63-95.

Cho, Mikael. (2017) Medium.com, “Hello Unsplash, Inc” https://medium.com/unsplash-unfiltered/hello-unsplash-inc-ce02b1c79d23 retrieved Sept 11, 2017

Cho, Mikael. (2017) “The Future of Unsplash” https://medium.com/unsplash-unfiltered/the-future-of-photography-and-unsplash-811f114aab7a retrieved Sept 20, 2017

Risch, Conor. (2017) PDNonline.com, “Unsplash CEO Tries to Justify Copyright Grab” https://pdnpulse.pdnonline.com/2017/08/unsplash-ceo-tries-to-justify-copyright-grab.html retrieved Sept 23, 2017

Boguslawska, Aleksandra (2015) “Why Unsplash is Hurting Photographers”, Petapixel.com, https://petapixel.com/2015/01/08/unsplash-hurting-photographers/ retrieved Sept 13, 2017

Cho, Mikael. (2017) “I Started Unsplash”, https://medium.com/@mikaelcho/i-made-unsplash-and-now-i-m-making-a-book-eaa8e947ad78 retrieved Sept 18, 2017

Calore, Michael. (2017) “The Web’s Premiere Free Photo Library Opens Up Its Vaults”, Wired.com, https://www.wired.com/2017/05/unsplash-api/  retrieved Sept 23, 2017

Unsplash.com, https://unsplash.com, “About” retrieved Sept 11, 2017