Fact: we are now surrounded by more information than ever before. As billions of people create data and even more devices sense, capture, interpret, and communicate this information, the era of big data is set to get exponentially bigger as the world’s population grows, society becomes more connected, and technology becomes faster and smarter than ever before.

What Is Big Data?

Data is everywhere. Literally.

Take browsing the internet. Every decision you make, every button you click, every video you watch or article you read: pretty much every action you take online is being collected and stored as data somewhere. This is happening for every person who uses the internet, every second of every day. In 2019, the number of internet users worldwide was estimated at nearly 4.4 billion, or 58% of the global population.

It’s easy to see how data has gotten so big, so fast.

Of course, big data isn’t just about surfing the web. Everything from smartphones (currently estimated at somewhere around 2.5 billion worldwide) to connected smart home devices is constantly collecting data about our habits, activity, behaviors, and preferences. There are even devices that track our biological statistics (hey, fitness trackers 👋). The result of this 24/7 data collection is an enormous data profile on pretty much everyone and everything.

And it’s not just human-centric data that’s getting bigger.

The advent of super-fast wireless communication, increasingly intelligent machines, and the rapid growth of cloud computing, paired with sensors that can detect just about anything, means that for the first time ever we’re collecting very large amounts of real-time data on anything from the local environment to the observable universe, and everything in between.

In case you’re still wondering just how big big data is, think about the data produced by every credit card or wireless payment, every security camera that captures your image when you’re out and about, every like, dislike, comment, and share on social media, and every app you use on your phone. Even that barely scratches the surface of the big data environment that exists today.

Given the meteoric rise in smartphone ownership, it’s likely that the amount of data being created and captured isn’t going to slow down any time soon.

Data Before the Internet

Computer scientist John Mashey is widely credited with coining the term “big data” in the 1990s, using it to describe a volume of data so large that it can’t be handled by conventional means. Some 20-30 years later, it’s obvious that Mashey was on to something.

In the early days of computing, big data wasn’t really a thing. Sure, there was a potentially unlimited amount of data out there but actually getting to that data and processing it wasn’t possible with the technology of the time.

In the pre-internet days, collecting measurements and statistical information for large datasets involved a considerable amount of work. This work most likely involved physically going out and collecting information before being able to use it in a meaningful way.

Even once collected, the limitations of computing power and storage meant that data still had to be dealt with by humans or rudimentary machines until technology was able to catch up, culminating in the data and computing landscape we’re all familiar with today.

How Big Data Is Collected and Used

The data that we produce is referred to as our ‘data footprint’ or, somewhat less attractively, ‘data exhaust’. Given the ubiquitous nature of data, it can be hard to keep track of what data is being collected and where. Do you ever read a website’s Privacy Policy? Probably not, and that worries privacy advocates worldwide.

The inability to control personal data is at the crux of contemporary debates about data collection and privacy. As our lives become more digitized, the items that we interact with both in and out of our homes continue to gather data, sometimes without our knowledge. Moreover, we rarely know exactly what our data is actually being used for.

For instance, take the so-called internet of things. Although many people may not think of their domestic smart appliances, such as light bulbs, voice controllers, or smart TVs, as hungry, data-collecting machines, these devices are very much harvesting personal data for use by the companies that own the technology. In fact, any network-connected interface or device that we interact with, in our homes or in the outside world, is collecting data about our behavior, including the choices that we do or do not make, on a minute-by-minute basis.

While they may make life a little bit easier, voice-controlled assistants like Alexa or Google Home are pieces of hardware built to collect information from their environment (you and your home). They have high-precision microphones, a link to the other smart devices on your local network (allowing them to monitor your usage of those devices), and an internet connection that sends this information back to the host company, where the data is stored and analyzed. When you make a purchase through, say, Alexa, the device can collect data from its surroundings while simultaneously learning from, and influencing, your spending behavior through your buying data.

Implications of Big Data

Is big data a bad thing? Will the prolific collection of any and all personal data lead to a Big Brother-style scenario?

Maybe. It all depends on where you stand on the ‘usage of data’ debate.

On the one hand, devices that collect data to “improve the experience they provide” may do exactly that for you as the user, learning the nuanced choices you make each day in a way that streamlines your life, benefits your productivity and well-being, and generally makes things a lot easier.

On the other, there are those who view the continual collection of data from these kinds of devices as a fundamental infringement of privacy, one that needs clear and controlled legislation to prevent unethical use, unacceptable intrusions into our personal lives, and the inevitable exploitation of private information.

Many people agree that some form of tighter regulation over the collection and use of personal data is a fundamental requirement, and we now see this increasingly happening as data-protection legislation evolves and becomes more widespread. The EU’s wide-reaching GDPR is an attempt to introduce laws that control how data is used. It’s up for debate how successful this initiative has been.

Others argue that too much intervention in such matters is hypocritical (given the data routinely collected by governments themselves) and can lead to freedom-of-speech issues and draconian policies that stifle progress and cripple dissent.

Either way, one thing is certain – big data is getting bigger by the day.
