Big Data, Artificial Intelligence, and You

At ODI, we spend a lot of time writing about the ever-increasing flood of data that our society produces. As scientists (which many of us in ODI happen to be), we tend to focus our thinking on the rapidly growing number of sensors deployed in every imaginable setting on our planet (and beyond!), producing endless streams of data – giving us a fundamentally new window into the workings of the world.

But there are also more and more devices much closer to home, producing rich and complex streams of data about who we are and what we do. This past holiday season, for example, you couldn’t shop anywhere (in person or online!) without seeing ads for in-home artificial personal assistants like Google Home or the Amazon Echo. These devices are connected to the internet, so they can perform searches or carry out simple tasks in response to voice commands. They can even be used to control other networked devices, like audio systems or home security systems. To use one of these devices, you simply say a specific “wake up” word or phrase, like “OK Google” or “Alexa,” which alerts it to whatever command follows.

That means that these devices are always on, and always listening. In order to recognize when their services are being summoned, they are constantly analyzing the things that are being said around them. Although there are conspiracy theorists who might suggest that there are dark agents using this as surveillance, there is no evidence that this is the case. However, as initially reported in The Information and subsequently covered in this excellent NPR piece, an Amazon Echo has been seized from an Arkansas home as part of a murder investigation, in the hopes that useful evidence will be found stored in its memory. It is unclear how much information is actually stored on the device itself, versus how much is retained on the servers at Amazon where commands are actually processed. (A search warrant was issued to Amazon, which has provided account holder information for the owner, but has not shared information from its servers.) This case, sure to be the first of many, raises interesting questions about privacy and security around these kinds of data.

Another area where Artificial Intelligence (AI) technology appears to be entering people’s daily lives is the automobile industry, where the advent of self-driving cars is already upon us. Using a combination of cameras, lidar, and other sensors, these vehicles are aware of things happening in their surroundings in a way that far surpasses what a human can do. And the amount of data generated by current models has been estimated at 4 terabytes – or 4,000 gigabytes – for every 8 hours of driving. These data are used to navigate the complex and constantly changing world around the vehicle; however, it only takes a little imagination to realize that these vehicles are also incredible surveillance tools. If an accident were to occur anywhere in the vicinity of one of these cars – even if the car wasn’t involved at all – the data from its sensors could be used to produce an extremely accurate, multi-dimensional model of exactly what happened. As more and more self-driving cars take to the roads, it will only be a matter of time before they join the Amazon Echo as witnesses in court, providing critical evidence about what really happened.

It would be easy to take this information as evidence that the Terminator’s nefarious SkyNet system is just around the corner, but I don’t think that’s inevitable. However, it may be instructive to think about where our actions could lead as we begin surrounding ourselves with faster and more sophisticated sensors, and connecting the data streams they produce to artificially intelligent agents – without fully understanding how the data are being stored and analyzed, by whom, and for what purposes. We need to consider what privacy rules are in place, and what laws govern how such data can be used.

When we think about managing our personal data, we often focus on obvious things – like assigning different passwords to all our accounts, and keeping them private. We might also think about what information we are sharing (and with whom) on social media. The integration of AI technologies into our daily lives introduces a new “channel” through which our personal information can be collected, stored, and analyzed – but because we don’t necessarily log in to these devices when we go to use them, we may be less conscious of the fact that we are producing a very rich and detailed data stream. The potential for these technologies to make our lives more pleasant and efficient is unbelievable; but the potential for abuse is certainly there, and we need to be vigilant and well-informed citizens to ensure that we are creating the world we want to live in.



