Data Proliferation and a few simple ways to improve your global “bitprint”

Data Proliferation and how NOT to be “that guy”

I’m going to (hopefully) stay on point in this entry, avoid talking about specific software solutions, and take more of a cosmic approach to the problem of data proliferation.

For those of you who are unaware, data proliferation is a big, big problem at most companies, especially in the following areas:

  • Storage
  • Backups
  • Network performance
  • TOI (Turnover of Information)
  • Litigation (especially during the discovery phase)

Here’s a pretty typical example that may surprise you (or make you roll your eyes, or even worse). A marketing person (why am I always picking on marketing people?) has a kickoff meeting to discuss a major new initiative and has prepared a very slick, 15 MB presentation (replete with huge graphics, charts, bells, whistles, hood ornaments, chrome, you name it). She calls a meeting with all 20 of her reports, PMs, etc., where she goes into detail about all the aspects of the new initiative and sums it up by saying, “I’ll send you all a copy of the presentation after the meeting…please feel free to forward it as appropriate.”

When I hear this, I literally hear the sound of tires screeching or a needle being ripped across a record.

Let’s think about this in terms of data proliferation (see, so far no mention of any specific vendor)…

  • Marketing lady has a copy of the 15 MB presentation (and probably each individual graphic she used) on her laptop (let’s say this is around 100 MB total).
  • She sends an email to each person in the meeting with the presentation as an attachment (15 MB x 20 = 300 MB).
  • Each person downloads it to their laptop (another 300 MB).
  • Half of the people forward it to a couple of peers each (another 300 MB…stay with me).

Where are we at now? 1 GB at LEAST (100 + 300 + 300 + 300 MB). Let’s talk about backups…

  • Marketing lady’s laptop gets backed up every night, so there’s another 100 MB.
  • So do all the other folks’ laptops: 300 MB.
  • The mail server where all these messages reside gets backed up nightly: 300 MB.
  • The .pst file for each person on the cc: list gets updated: 300 MB.
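
For anyone who wants to check my math, the two lists above can be tallied in a few lines of Python. This is only a sketch using the round numbers from the example: the function and variable names are made up for illustration, and the idea that each of the 10 forwarders sends the deck to 2 peers is an assumption chosen to match the 300 MB figure.

```python
# Sizes in MB, taken from the round numbers in the example.
PRESENTATION_MB = 15
ATTENDEES = 20
AUTHOR_WORKING_SET_MB = 100  # deck plus source graphics on the author's laptop
PEERS_PER_FORWARDER = 2      # assumption: makes "half forward it" add up to 300 MB


def tally_footprint():
    """Return (primary_copies, backup_copies) as dicts of label -> MB."""
    per_attendee_total = PRESENTATION_MB * ATTENDEES  # 15 x 20 = 300
    forwarded = (ATTENDEES // 2) * PEERS_PER_FORWARDER * PRESENTATION_MB

    primary = {
        "author laptop": AUTHOR_WORKING_SET_MB,
        "mail attachments": per_attendee_total,
        "attendee downloads": per_attendee_total,
        "forwarded copies": forwarded,
    }
    backups = {
        "author laptop backup": AUTHOR_WORKING_SET_MB,
        "attendee laptop backups": per_attendee_total,
        "mail server backup": per_attendee_total,
        ".pst files": per_attendee_total,
    }
    return primary, backups


if __name__ == "__main__":
    primary, backups = tally_footprint()
    for label, size_mb in {**primary, **backups}.items():
        print(f"{label:26s}{size_mb:5d} MB")
    total_mb = sum(primary.values()) + sum(backups.values())
    print(f"{'total':26s}{total_mb:5d} MB")
```

Change ATTENDEES or PRESENTATION_MB to match your own kickoff meeting and the total scales right along with it.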

So here we are at 2 GB, and I’ve probably missed a few hundred MB somewhere. So, what would I do differently?

Content Management, baby.

Take your pick…open source solutions like Joomla, Alfresco, or Nuxeo, and proprietary tools like SharePoint and Documentum, do the job nicely. Simply put the file in the CMS, send out the link after the meeting, and you drastically reduce your company “bitprint” (yes, I just coined a cheesy phrase). Your admins will love you.
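
To put a rough number on the difference, here is a small Python sketch comparing the mail-side footprint of attaching the file versus sending a link. The function names are hypothetical, and LINK_EMAIL_MB is an assumed size for a short text-only message, not a measured value.

```python
# Rough comparison of "attach the deck" vs. "send a link to the CMS copy".
# All sizes are in MB and use the round numbers from the example.
PRESENTATION_MB = 15
RECIPIENTS = 20
LINK_EMAIL_MB = 0.01  # assumption: roughly 10 KB per link-only message


def attachment_footprint_mb(recipients, file_mb):
    """Every recipient's mailbox holds its own full copy of the file."""
    return recipients * file_mb


def link_footprint_mb(recipients, file_mb, link_email_mb=LINK_EMAIL_MB):
    """One copy lives in the CMS; each mailbox holds only a small message."""
    return file_mb + recipients * link_email_mb


if __name__ == "__main__":
    attached = attachment_footprint_mb(RECIPIENTS, PRESENTATION_MB)
    linked = link_footprint_mb(RECIPIENTS, PRESENTATION_MB)
    print(f"Attachment: {attached} MB sitting in mailboxes")
    print(f"Link:       {linked:.1f} MB (CMS copy plus link emails)")
```

This only counts what lands in mailboxes. Anyone who downloads the file from the CMS still creates a local copy, so the real savings depend on how people actually work.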