Monday, June 10, 2013

Big Data 101: Thinking Beyond the National Security Agency

Federal government investments and initiatives have long moved the needle on technology innovation and adoption. Almost every case study on government-funded innovation mentions how the internet has its genesis in the Defense Department’s DARPA initiatives.

Last week, the American public woke up to the fact that the NSA, CIA and other security agencies were gathering phone records of some, most, or all phone calls to and from the United States. The news set off a large public debate, which is perhaps what the whistleblower Edward Snowden wanted in the first place. It was interesting to hear US Senators and Congressmen try to explain technical jargon like metadata and applications of big data to their constituents, and in particular that the call records turned over to federal agencies were just metadata about calls and not the actual content of calls.
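To make the metadata-versus-content distinction concrete, here is a minimal Python sketch. The `CallRecord` fields, the phone numbers, and the `calls_per_pair` helper are all made up for illustration; they are assumptions, not the format or method any agency or carrier actually uses. The point is only that a record can describe a call (who, when, how long) without holding anything that was said, and that such records can still be aggregated.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical call metadata: who called whom, when, and for how long."""
    caller: str
    callee: str
    start: str       # ISO 8601 timestamp of when the call began
    duration_s: int  # call length in seconds
    # Note: there is no field for the audio or a transcript of the call.


def calls_per_pair(records):
    """Count how many calls were placed between each (caller, callee) pair."""
    return Counter((r.caller, r.callee) for r in records)


# Invented sample records, for illustration only.
records = [
    CallRecord("555-0100", "555-0199", "2013-06-03T09:15:00", 120),
    CallRecord("555-0100", "555-0199", "2013-06-04T18:40:00", 45),
    CallRecord("555-0142", "555-0100", "2013-06-05T12:05:00", 300),
]

pair_counts = calls_per_pair(records)
total_seconds = sum(r.duration_s for r in records)

for (caller, callee), count in pair_counts.items():
    print(f"{caller} -> {callee}: {count} call(s)")
print(f"Total talk time: {total_seconds} seconds")
```

Even this toy version shows why metadata matters: patterns such as who talks to whom, and how often, fall out of the records without any call content.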

Corporate IT executives and CIOs have already begun to recognize the value that good data analysts can bring, and this incident is only bringing renewed attention to the potential of big data. An unintended consequence of this saga, and perhaps the real silver lining here, is for technologists. Now that the program is out in the open, I wonder whether there will be an argument for commercializing the “mechanics” by reverse-engineering some of the big data technologies being discussed. One can argue that technologies similar to those used by the NSA and other federal agencies to gather and analyze large volumes (“big data”) of metadata about telephone records can also be used by commercial organizations, say, to parse through the large volumes of data required for in-silico research to speed drug discovery.

Big data spells big money, and not just for corporations. Data scientists are already among the hottest categories of IT professionals in the market. Play this out against the immigration debate and the message to the younger generation of technologists in America is clear: plan a career in data analytics and science!

Other Links of interest
  • While it is news now, the media has been talking about it for a while: NSA data center front and center in debate over liberty, security and privacy (Fox News article, April 2013)
  • NSA's Big Data Platform Faces Enterprise Test: Accumulo, the data storage software developed by the National Security Agency, has taken another step toward the enterprise market. Sqrrl, the startup launched by former NSA technologists to commercialize Accumulo, has teamed up with Apache Hadoop provider Hortonworks to combine their technologies.
  • NSA Reveals Cloud Plans, May Open-Source Some of Its Software
  • Hadoop is an Open Source Revolution (Federal Computer Week interview): "Hadoop, and a handful of open-source tools that complement it, has no equal when it comes to making gigantic and diverse datasets easily available for quick analysis using clusters of inexpensive computers"