Lauren Chaparro and I were honored to be among the speakers at Strata RX 2012, O’Reilly’s conference on the use of big data in health care and medicine. Our talk was called “Doing Big Data All By Yourself: Interactive Data Driven Decision Making by Non-Programmers.”
I gave the first half of the talk, delving into the stark realities of big data implementations. The second half of the talk featured Lauren (a Palantir Forward Deployed Engineer on the health applications team) giving a live demonstration of workflows in a system that we recently implemented on top of Palantir Gotham, one of our data fusion platforms.
There is a lot of excitement in the marketplace around the availability and capability of big data processing technologies. But most of these technologies are building blocks, not complete solutions. Before health care providers can realize the promise of big data, they need to overcome at least three major challenges associated with implementing these technologies:
In the health care space, the end users are clinicians and researchers. Many big data implementations, however, feature interfaces best suited to data scientists and programmers. As a result, this technology is mostly used for aggregation and static dashboards. This is a good start, but it falls short of the goal of putting the power to learn from big data into the hands of clinicians.
The punchline here is that the scarce resource in the big data domain (regardless of vertical) is talent—the talent to (a) do the complex system engineering and data science necessary to derive insights from data and (b) build the last mile of familiar, expressive, and interactive interfaces needed to truly take advantage of all that the data has to offer.
The second half of the talk focused on work we did in association with the Center for Public Integrity. We put together a Palantir Gotham instance that integrated anonymized data from Medicare and various other data sources to show the potential of a fully integrated, interactive system.
The datasets involved were:
Since patient privacy concerns are paramount with this sort of data, Lauren Chaparro used simulated rather than real data to give a live demonstration of how different subject matter experts could perform a number of different workflows inside a single system:
We believe answering questions like those addressed in this demo is key to curbing waste, fraud, and abuse in our nation’s health care system and improving health care delivery. Through the integration of a variety of datasets at massive scale, our software can empower insurers, policy-makers, and physicians to pursue these kinds of hypotheses and derive actionable insights today, without turning to data scientists.