Digital Data: The Flood of Virtual Water
We all know our digital society generates a lot of data. However, it is hard to objectively imagine how incredibly massive the amount of data we're generating really is. Most generated data is unstructured and comes from seemingly disparate sources. But just as a lake exists as an ecosystem, “Big Data” can also be said to be an ecosystem – and we'll explain why.
An IBM marketing report titled "10 Key Marketing Trends for 2017" states, “Every day, we create 2.5 quintillion bytes of data. To put that into perspective, 90 percent of the data in the world today has been created in the last two years alone – and with new devices, sensors and technologies emerging, the data growth rate will likely accelerate even more.” In other words, on an average day, we're creating 2,500,000,000,000,000,000 bytes of data.
Even so, a number that large does little to convey the sheer magnitude of the data being created. So, let’s go on a little journey. We'll borrow a phrase from Albert Einstein...
A Thought Experiment
Since both water and data can be measured as a finite volume, we will merge the two through an abstraction. According to the pharmaceutical industry, one drop of water is 0.05 ml. Equivalently, then, there are 20,000 drops in one liter. In turn, one “byte” of data is a group of eight bits, each of which exists as a single 0 or 1, and it is the unit in which the figure above is counted.
Given this information, we will define one drop of water as equal to one byte of data. This gives us a new entity that is much easier to visualize, which we'll call "Virtual Water." If one drop of water equals one byte of data, then in the span of one day we are generating 125,000,000,000,000 liters of virtual water (2,500,000,000,000,000,000 bytes divided by 20,000 drops per liter).
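The conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the text: the constant and function names are made up for illustration, and the inputs are the IBM figure and the 0.05 ml drop size quoted earlier.

```python
# Figures from the text; all names here are illustrative, not from any library.
BYTES_PER_DAY = 2_500_000_000_000_000_000  # 2.5 quintillion bytes per day (IBM figure)
DROPS_PER_LITER = 20_000                   # 1,000 ml per liter / 0.05 ml per drop


def bytes_to_virtual_liters(num_bytes: int) -> int:
    """Convert bytes to liters of virtual water, assuming one drop per byte."""
    return num_bytes // DROPS_PER_LITER


liters_per_day = bytes_to_virtual_liters(BYTES_PER_DAY)
print(f"{liters_per_day:,} liters of virtual water per day")
# prints: 125,000,000,000,000 liters of virtual water per day
```

Integer division keeps the result exact, which matters little here but avoids floating-point rounding on numbers this large.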
To provide scale, let's look at the amount of water contained in the Great Lakes. Around the year 2000, the Great Lakes contained 21% of the world's surface fresh water. That's 5,472 cubic miles (22,810 km³) or 6,025,300,000,000,000 U.S. gallons (22,808,300,000,000,000 liters). If one drop of water equals one byte of data, then at 125,000,000,000,000 liters per day the world generates enough data to fill the Great Lakes in roughly 182 days, or about six months. How thrilling – yet at the same time, a bit mind-numbing.
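The six-month estimate follows from a single division. The sketch below reuses the figures stated in the text and assumes an average month of 30.44 days; the variable names are illustrative.

```python
# Figures from the text; names and the 30.44-day average month are assumptions.
GREAT_LAKES_LITERS = 22_808_300_000_000_000  # volume of the Great Lakes in liters
VIRTUAL_LITERS_PER_DAY = 125_000_000_000_000  # one drop (0.05 ml) per byte of daily data
AVG_DAYS_PER_MONTH = 30.44                    # 365.25 days / 12 months, rounded

days_to_fill = GREAT_LAKES_LITERS / VIRTUAL_LITERS_PER_DAY
months_to_fill = days_to_fill / AVG_DAYS_PER_MONTH

print(f"{days_to_fill:.1f} days, or about {months_to_fill:.1f} months")
# prints: 182.5 days, or about 6.0 months
```

The result assumes the daily data rate stays constant, whereas the IBM quote expects it to accelerate, so the real fill time would likely be shorter.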