Predictive Analytics – Rhinos, Elephants, Donkeys and Minority Report

By on December 23rd, 2015 in Blog Posts, Privacy & Security

The  IEEE Computer Society published “Saving Rhinos with Predictive Analytics” in both IEEE Intelligent Systems, and in the more widely distributed ‘Computing Edge‘ (a compendium of interesting papers taken from 13 of the CS publications and provided to members and technologists at no cost.)  The article describes how data based analysis of both rhino and poacher activity in concert with AI algorithms can focus enforcement activities in terms of timing and location and hopefully save rhinos.

Other uses of predictive analytics may not be as benign

For those outside of the U.S., the largest population of elephants (Republicans) and donkeys (Democrats) are in the U.S.– these animals being symbols for the respective political parties, and now on the brink of the 2016 presidential primaries, these critters are being aggressively hunted — ok, actually sought after for their votes.  Not surprisingly the same tools are used to locate, identify and predict the behavior of these persons.   When I was young (1964) I read a book called The 480, which described the capabilities of that time frame for computer based political analysis and targeting of “groups” required to win an election. (480 was the number of groupings of the 68 million U.S. voters in 1960 to identify which groups you needed to attract to win the election.)   21st century analytics are a bit more sophisticated — with as many as 235 million groups, or one per potential voter (and over 130 million voters likely to vote.).  A recent kerfuffle between the Sanders and Clinton campaign (NPR story) over “ownership/access” to voter records stored on a computer system operated by the Democratic National Committee reflects the importance of this data.  By cross connecting (data mining) registered voter information with external sources such as web searches, credit card purchases, etc. the candidates can mine this data for cash (donations) and later votes.  A few percentage point change in delivering voters to the polls (both figuratively, and by providing rides where needed) in key states can impact the outcome. So knowing each individual is a significant benefit.

Predictive Analytics is saving rhinos, and affecting the leadership of super powers. But wait, there’s more.  Remember the movie “Minority Report” (2002). This movie started on the surface with apparent computer technology able to predict future crimes by specific individuals — who were arrested to prevent the crimes.  (Spoiler alert) the movie actually proposes a group of psychics were the real source of insight.  This was consistent with the original story (Philip K Dick) in 1956, prior to The 480, and the emergence of the computer as a key predictive device.  Here’s the catch, we don’t need the psychics, just the data and the computers.  Just as the probability of a specific individual voting for a specific candidate or a specific rhino getting poached in a specific territory can be assigned a specific probability, we are reaching the point where aspects of the ‘Minority Report’ predictions can be realized.

Oddly, in the U.S., governmental collection and use of this level of Big Data is difficult due to privacy illusions, and probably bureaucratic stove pipes and fiefdoms.   These problems do not exist in the private sector.  Widespread data collection on everybody at every opportunity is the norm, and the only limitation on sharing is determining the price.  The result is that your bank or insurance company is more likely to be able to predict your likely hood of being a criminal, terrorist, or even a victim of a crime than the government.  Big Data super-powers like Google, Amazon, Facebook and Acxiom have even more at their virtual fingertips.

Let’s assume that sufficient data can be obtained, and robust AI techniques applied to be able to identify a specific individual with a high probability of a problematic event — initiating or victim of a crime in the next week.  And this data is implicit or even explicit in the hands of some corporate entity.  Now what?  What actions should said corporation take? What probability is needed to trigger such actions? What liability exists for failure to take such actions (or should exist)?

These are issues that the elephants, and donkeys will need to consider over the next few years — we can’t expect the rhinos to do the work for us.  We technologists may also have a significant part to play.
(2017 note — make that consider over the next few months, predictive analytics played a significant role in the 2016 U.S. elections.)

Image By EDUARDO SORTICA – “Fotografia própria”, CC BY 2.5