While you sit and gloat over your 256 GB phone with its 48 MP camera, do spare a thought for the poor engineers who have to figure out how to store all this data you’re generating. At last count, it was estimated that we humans generate about 2.5 quintillion bytes of data per day. That’s 2.5 billion gigabytes.
Where does it all go? How is it all stored?
The data that’s not on your phone ends up in the “cloud”, a somewhat abstract term for networked storage that’s accessible over the internet. In practice, the cloud lives in data centres: massive warehouses that draw an enormous amount of power and generate a lot of heat. They’re packed to the brim with storage devices of all shapes and forms, and with bustling engineers.
But what if you could replace all of this with a silent, almost ascetic environment? What if a data warehouse could be shrunk down to the size of a pea? Well, Microsoft has done just that, after a fashion.
Working with the University of Washington, Microsoft has developed an automated DNA data storage and retrieval mechanism. Yes, that DNA, deoxyribonucleic acid, the structure that forms the template for virtually all life as we know it.
Eventually, this very same DNA could end up as the template for the structure of our digital lives. But we’re getting ahead of ourselves.
What Microsoft and University of Washington researchers have done is build a Rube Goldberg contraption of sorts that can write data to DNA and read it back without any human intervention.
How’d they do it, exactly? And why does it matter?
Digital data is binary, consisting of 1s and 0s. A combination of eight 1s and 0s makes up a “byte”, and each byte can represent, say, a certain character of the alphabet (a-z or A-Z), a number (0-9) or a special character (!@#$%^&*). 1,024 bytes form a kilobyte, 1,024 kilobytes form a megabyte, 1,024 megabytes form a gigabyte, and so on. Given that data is binary, digital data can literally be stored in any medium that’s capable of maintaining two states, regardless of what those states are.
These states could be voltage (high and low), temperature (hot and cold), magnetic polarity (north and south) and even varying sound levels (loud or quiet). It’s so simple to transmit digital data that you could very well clap to transmit a digital photo. All you’d need is an instrument (a mic) to hear the variation in volume level and software to interpret your clapping as binary data. Such a mechanism would be ludicrously slow and inefficient, not to mention how mind-numbingly dull it would be to share photos this way, but that’s beside the point.
Clearly, the issue then is about speed and efficiency, which is why we have chosen to store data in magnets (hard disks) and transmit it via electricity (voltage).
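To make the idea of bytes and two-state media concrete, here is a minimal Python sketch that turns a short message into 1s and 0s and then into a sequence of "loud" and "quiet" claps. The function names and the loud/quiet convention are illustrative assumptions, and the sketch assumes plain ASCII text.

```python
def text_to_bits(text: str) -> str:
    """Return the ASCII bytes of `text` as a string of 1s and 0s (8 per character)."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def bits_to_text(bits: str) -> str:
    """Rebuild the original text from a string of 1s and 0s."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return bytes(int(chunk, 2) for chunk in chunks).decode("ascii")


bits = text_to_bits("HELLO")
# Any two-state medium will do: here a 1 is a loud clap and a 0 is a quiet one.
claps = ["loud" if bit == "1" else "quiet" for bit in bits]

print(bits)                # 40 bits for a 5-character message
print(claps[:8])           # the eight claps that spell "H"
print(bits_to_text(bits))  # HELLO
```

Five characters already need 40 claps, which is why nobody shares photos this way.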
Getting back to the subject of DNA, DNA is essentially a structure for data storage in all things living. Four chemical bases, adenine (A), guanine (G), cytosine (C) and thymine (T), make up the code that gives structure to all life. As with binary digital data, the sequence of the A, C, G, T bases is the information needed for life to exist. In theory, you could pack over 200 petabytes (PB) of data into a single gram of DNA. For comparison, Facebook is estimated to be storing over 300 PB of data, including backups, in vast data centre warehouses scattered all over the world.
If one could manipulate the sequence of these bases, one could store data in DNA. Many scientists and researchers have already tried and succeeded in doing just that.
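Since there are four bases, each base can stand for two bits. The Python sketch below uses a deliberately naive mapping (00 to A, 01 to C, 10 to G, 11 to T) to show the principle. The mapping and the function names are assumptions made for illustration, not the encoding used by Microsoft or any particular research group, and real schemes are more elaborate because they also have to cope with errors in writing and reading DNA.

```python
# Hypothetical two-bits-per-base mapping, chosen only for illustration.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a strand of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a strand of A/C/G/T back into the original bytes."""
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"HELLO")
print(strand)          # CAGACACCCATACATACATT
print(decode(strand))  # b'HELLO'
```

Under this toy mapping, every byte becomes four bases, so a 5-byte message needs a strand of just 20 bases.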
However, the process essentially involves a bunch of highly trained scientists running around with flasks and pipettes and carefully measuring out quantities of chemicals. This is hardly the ideal setting for a data centre serving the needs of seven billion humans.
Microsoft’s automated approach eliminates the need for this hustle and bustle and provides a system which, to the user, is no different from storing data in the cloud. You send data, data is stored. You ask for data, you receive data. Simple.
It’s a revolution! Or is it?
Microsoft’s just solved the world’s data problems. Or has it? Sadly, the answer to that is no, it hasn’t. This fancy automated system of theirs managed to store a 5-byte message, which simply reads ‘HELLO’, and retrieve it in 21 hours.
In the 15 seconds you spent reading the above paragraph, netizens consumed 300,000,000,000,000 bytes of data. In 21 hrs, they will consume a whopping 1,512,000,000,000,000,000 bytes of data (30 percent of it is porn, in case you’re wondering).
While you’re lamenting that measly 1 MB/s that your ISP has deigned to give you, this DNA-based system, at 5 bytes in 21 hours, will only serve you at roughly 0.00000000006 MB/s.
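The throughput implied by storing and retrieving 5 bytes in 21 hours can be worked out directly. The Python sketch below does the arithmetic and prints the rate in several units; it assumes binary units (1 KB = 1,024 bytes, 1 MB = 1,024 KB) and treats the whole 21 hours as transfer time, which is a simplification.

```python
MESSAGE_BYTES = 5                 # the 'HELLO' message
ELAPSED_SECONDS = 21 * 60 * 60    # 21 hours expressed in seconds

bytes_per_second = MESSAGE_BYTES / ELAPSED_SECONDS
kb_per_second = bytes_per_second / 1024
mb_per_second = kb_per_second / 1024

print(f"{bytes_per_second:.2e} bytes/s")
print(f"{kb_per_second:.2e} KB/s")
print(f"{mb_per_second:.2e} MB/s")

# How many times faster a 1 MB/s connection is than this rate.
slowdown = 1 / mb_per_second
print(f"A 1 MB/s line is about {slowdown:.1e} times faster")
```

The result is on the order of a ten-thousandth of a byte per second, so even a slow home connection is billions of times faster.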
Microsoft’s Rube Goldberg contraption isn’t the future just yet, but it’s certainly a byte-sized teaser of an exciting one.