The Big Think

February 22, 2012

Numbers

Filed under: Technology — jasony @ 1:23 pm

Book Review: Abundance – WSJ.com: “If every image made and every word written from the earliest stirring of civilization to the year 2003 were converted to digital information, the total would come to five exabytes. An exabyte is one quintillion bytes, or one billion gigabytes—or just think of it as the number one followed by 18 zeros. That’s a lot of digital data, but it’s nothing compared with what happened from 2003 through 2010: We created five exabytes of digital information every two days. Get ready for what’s coming: By next year, we’ll be producing five exabytes every 10 minutes.”
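To put those figures on a common scale, here is a rough back-of-the-envelope sketch in Python. It only restates the arithmetic in the quote; the names (`EXABYTE`, `bytes_per_second`, and so on) are mine, and it assumes decimal units (1 exabyte = 10^18 bytes), as the quote does.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
# Assumption: decimal units, as in the quote (1 exabyte = 10**18 bytes).
EXABYTE = 10**18   # bytes
GIGABYTE = 10**9   # bytes


def bytes_per_second(total_bytes, period_seconds):
    """Average data rate over a period, in bytes per second."""
    return total_bytes / period_seconds


# "five exabytes of digital information every two days"
two_days = 2 * 24 * 60 * 60  # seconds
rate_2003_2010 = bytes_per_second(5 * EXABYTE, two_days)

# "five exabytes every 10 minutes"
ten_minutes = 10 * 60  # seconds
rate_projected = bytes_per_second(5 * EXABYTE, ten_minutes)

# How much faster the projected rate is than the 2003-2010 rate.
speedup = rate_projected / rate_2003_2010

print(f"1 exabyte = {EXABYTE // GIGABYTE:,} gigabytes")
print(f"2003-2010 rate: {rate_2003_2010 / 1e12:.1f} terabytes per second")
print(f"Projected rate: {rate_projected / 1e15:.2f} petabytes per second")
print(f"Speedup: about {speedup:.0f}x")
```

Two days is 172,800 seconds and ten minutes is 600, so the projected rate works out to 288 times the 2003 through 2010 rate.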

5 Comments »

  1. I disagree with the term “produce” in some of these uses. When you take a photograph, you don’t produce any data; you are simply recording data which was already present in the physical world.

    Comment by greg — February 22, 2012 @ 2:05 pm

  2. Good point, Greg. And the resolution of the photo could arbitrarily increase the amount of storage w/o being a meaningful representation of the data in any given system.

    Still, I don’t think that photos represent all of the increase, so I think something is going on. Reminds me of Kurzweil’s assertion that in the future most of our new knowledge will come from sifting all the data and finding previously unknown connections and correlations.

    Comment by jasony — February 22, 2012 @ 2:08 pm

  3. True. A lot of data “produced” comes from some processing/resifting/deriving/etc of previous data. But it’s not obvious to me what is “new” data vs what is recording existing data. If you were smart enough, you could predict anything about the future state of the universe from the current state of the universe plus all the rules of how every particle in the universe behaves. But calling that derivative data is a stretch.

    Regardless, 5 exabytes is a lot of data, no matter where it came from or what it is. Even producing 5 exabytes of random bits would be incredible.

    Comment by greg — February 22, 2012 @ 2:17 pm

  4. A few places, off the top of my head, that are producing a lot of data:

    the LHC (holy-cow amounts of data per second)
    Google
    emails/texts

    Hard to imagine 5 exabytes in 10 minutes, though. Maybe a lot of this is behind-the-scenes stuff that’s machine created?

    Comment by jasony — February 22, 2012 @ 2:26 pm

  5. Hm – I may disagree with Greg about this: when you take a photograph you are indeed producing a new set of data, even about an existing thing. Take Michelangelo’s David, for example, a famously multivalent work of art.

    The same statue exists immutable in time and space, and yet a photograph can convey David’s fear
    http://www.life2point0.com/WindowsLiveWriter/ExperiencingReality_E993/david280%5B3%5D.jpg
    … or his resolve…
    http://www.petergreenberg.com/wp-content/uploads/2008/02/david-michelangelo.jpeg

    And furthermore, although increasing the pixel resolution may arbitrarily increase the size without adding data, the opposite is often the case. A modestly-sized picture certainly conveys the message “this is Mich’s David.”
    http://assets.nydailynews.com/polopoly_fs/1.117480!/img/httpImage/image.jpg
    …but in a larger one you can click around and see Michelangelo’s craftsmanship, noting the veins in the neck and arm:
    http://www.playtingtime.com/wp-content/uploads/2011/11/s_David_2.jpg

    They’re just not there in the other work. These four photos are indeed four completely different sets of data, and in the future we may have readily available to us pictures that can reveal even more about this statue — it’s not difficult to imagine that ten years from now you could get images of David that reveal chiselstrokes that wouldn’t even be visible to a live visitor at the museum. All in all, I’m grateful to all the photographers whose skill and insight *produced* these four pictures, and of course a quick visit to Google shows that there are far more than four, which is a delightful turn of events in the history of data and knowledge.

    Of course, back to the original story, roughly 4.999 of those 5 exabytes every 10 minutes will be devoted to the Kardashians, so.

    Comment by barrybrake — February 28, 2012 @ 1:49 pm
