Palantir has entered the 2008 VAST Challenge. We present an in-depth look at one of our challenge solutions as the first public example of the Palantir platform in action.
Two years ago, the IEEE began an annual conference called VAST (Visual Analytics Science and Technology). The VAST symposium focuses on fundamental research contributions and real-world applications of visual analytics. As part of the conference, the VAST Challenge lets teams compete to deliver analytic solutions against a synthetic but realistic dataset.
Each year, the organizers build a very vast (pun intended) dataset from scratch. The data is entirely fictional but mirrors real-world use cases and scenarios. This year’s dataset is about a new religious movement that started on an imaginary Caribbean island (cleverly titled Isla del Sueño, or “Island of Dreams”) situated between Florida and Cuba. There are four subsets of the synthetic data: the Wikipedia page for the movement and its associated edit and discussion pages; landing and Coast Guard interdiction records for boats leaving Isla del Sueño for Florida/Mexico; cell phone records from the island; and RFID tracking data from people in a building that was attacked with an IED.
The problem sets ask qualitative questions whose answers must be backed by data. These are questions that a machine-learning or data-mining approach alone won’t answer, and that an unassisted human can’t answer by simply inspecting the data. They require some sort of human-computer symbiosis to solve.
To solve the VAST problems, we assembled an ad-hoc team of analysts — composed of a mix of engineers, in-house professional analysts, and one senior executive — and asked them to use the Palantir Government software to extract insights from the data.
The results speak for themselves: the complete set of Palantir’s VAST solutions is available here.
Read on for an in-depth look at how we deconstructed and solved one of the problems.
Analyzing Cell Phone Networks or “Where’s Waldo Ferdinando?”
Mini-Challenge 3 asks a couple of questions about the command and control structure of the religious movement by examining cell phone data from the island:
- What is the Catalano/Vidro social network, as reflected in the cell phone call data, at the end of the time period?
- Characterize the changes in the Catalano/Vidro social structure over the ten day period.
We started out with a CSV containing 9,385 rows that look like this, logging ten days of calls:
A tabular format like this is about as far from “visual” or “analytic” as one can get; not even a social network analysis savant could glean something interesting from the data in that form. In addition to the spreadsheet, we were given the approximate geographic locations of each cell tower and the following intelligence (from the contest’s synthetic dataset):
“We have medium confidence that Ferdinando Catalano is identifier 200. Close relatives and associates he would be calling would include David Vidro, Juan Vidro, Jorge Vidro, and Estaban Catalano. We believe Ferdinando would call brother Estaban most frequently. We also believe that David Vidro coordinates high-level Paraiso activities and communications.”
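To make that hint concrete: if Ferdinando is identifier 200 and calls his brother most often, then the identifier 200 talks to most is a natural first guess for Estaban. Here is a minimal Python sketch of that tally. It is not how Palantir does it, and the column names (`caller`, `callee`, `day`) and the sample rows are made up for illustration; they are not taken from the contest data.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only. The column names and values
# are assumptions, not the contest dataset.
SAMPLE_CSV = """caller,callee,day
200,5,1
200,5,1
200,1,2
3,200,2
5,200,3
7,8,3
"""

def contact_counts(csv_text, identifier):
    """Count calls between `identifier` and each counterpart, in either direction."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        caller, callee = row["caller"], row["callee"]
        if caller == identifier:
            counts[callee] += 1
        elif callee == identifier:
            counts[caller] += 1
    return counts

counts = contact_counts(SAMPLE_CSV, "200")

# List the counterparts of identifier 200, most frequent first.
for contact, n in counts.most_common():
    print(contact, n)
```

A count like this only ranks candidates. Deciding which identifier is actually the brother still takes the kind of judgment the intelligence note calls for.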
Using nothing but that spreadsheet and the power of Palantir, we were able to uncover secretive attempts by members of the Paraiso movement’s highest echelon to conceal their cell phone traces by switching devices.
Most importantly, the analytic team did this without needing to be experts in the technical aspects of social network analysis, database queries, or Java. Instead, Palantir allows users — people who have a deep understanding of how the world works but are not technically savvy — to get answers backed by data. So answers to high-level questions such as, “Who were these people talking to during those last three days of network silence?” are actually easy to extract from the dataset. We realize that this sounds awfully hand-wavy, abstract, and vague, but the videos and explanations show exactly how it’s done.
Here’s the two-minute-long Mini Challenge 3 Video and full explanation. The ten-minute-long Palantir VAST 2008 Grand Challenge video gives a brief overview of Palantir and then presents all four solutions.
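For readers who like to see an idea in code, here is one way the device-switching pattern could be approximated outside Palantir. This is a rough Python sketch under our own assumptions, not the method shown in the video: look for an identifier that goes silent partway through the period and a new identifier that appears afterward talking to largely the same contacts. The records, the split day, and the overlap threshold below are all hypothetical.

```python
from collections import defaultdict

# Made-up (caller, callee, day) records for illustration only.
CALLS = [
    ("200", "1", 1), ("200", "5", 2), ("200", "2", 3),
    ("300", "1", 8), ("300", "5", 9), ("300", "2", 9),
    ("7", "8", 2), ("7", "8", 9),
]

def contacts_by_id(calls, first_day, last_day):
    """Map each identifier to the set of identifiers it talked to in the window."""
    contacts = defaultdict(set)
    for caller, callee, day in calls:
        if first_day <= day <= last_day:
            contacts[caller].add(callee)
            contacts[callee].add(caller)
    return contacts

def jaccard(a, b):
    """Overlap of two sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def switch_candidates(calls, split_day, last_day, threshold=0.5):
    """Pair identifiers that went silent with new ones that share their contacts."""
    early = contacts_by_id(calls, 1, split_day)
    late = contacts_by_id(calls, split_day + 1, last_day)
    went_silent = set(early) - set(late)    # active early, absent late
    newly_active = set(late) - set(early)   # absent early, active late
    pairs = []
    for old in went_silent:
        for new in newly_active:
            score = jaccard(early[old], late[new])
            if score >= threshold:
                pairs.append((old, new, score))
    # Highest overlap first.
    return sorted(pairs, key=lambda p: -p[2])

candidates = switch_candidates(CALLS, split_day=7, last_day=10)
print(candidates)
```

In this toy data, identifier 200 stops calling after day 3 and identifier 300 shows up on day 8 with the same three contacts, so the pair is flagged. On real data a match like this is only a lead for an analyst to follow up, not a conclusion.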
Palantir Finance, our product that has not yet been released, brings a similar level of power, flexibility, and high-level analytics to the realm of finance. Stay tuned for a sneak peek of those capabilities in the coming months.
If you found that video as much fun as we did, check out the other demonstrations of our ability to make data come to life. The “Boat Video” about migration data does an especially good job of showing off Palantir’s relational analysis abilities and its open-platform geospatial integration. Each of the write-ups includes additional screenshots that aren’t in the videos.
Welcome to the future of analysis!