The Essentials of Big Data Analysis

June 13th, 2012 | Job Search, Press Releases and Industry News | 2 Comments »

An overview of Big Data analysis and what it can do for you!

Data is the indispensable commodity of our age. Businesses, research facilities, and government organizations all collect endless streams of data that, when correctly assessed, can mean unparalleled breakthroughs and growth. Organizations, realizing the true value of their data reservoirs, are willing to handsomely reward anyone capable of extracting refined nuggets of insight from these mounds of unrefined information. The rewards are definitely great.

For those skilled in big data analytics, the median salary is $90,000, and by the year 2018, there are expected to be 190,000 big data positions, many of which do not exist yet! So, if you are passionate about decoding hidden solutions from a treasure trove of information, you may have found your calling.

The Breakdown

The first step is to understand what it means to be working with “big data.” The term refers to the vast quantity of information that we can now access, quantities that far surpass what databases could feasibly hold only a few years ago. Less than a decade ago, a terabyte was an imposing amount of information.

Now, that same terabyte of storage is readily available to consumers through electronics retailers and e-commerce sites, while the largest organizations regularly access and digest petabytes and even exabytes of information, only a sliver of which is structured.

Erin Bartolo, the Data Science Program Manager at the Syracuse University iSchool, says that the power of big data lies in the ability to handle large quantities of unstructured data. She says that most information “isn’t going in a relational database in a neat little row” because it is more complex than that.

In addition to numeric, quantitative information, big data analysis focuses on qualitative information. The goal is to turn untapped information in the form of maps, images, weather patterns, physicians’ notes, and countless other kinds of content into previously unseen business intelligence.

Traditional business intelligence solutions, which rely heavily on structured queries, struggle to analyze this trove of information; handling even a small amount of semi-structured data requires meticulous, time-consuming adjustments. That is where the line is drawn: where traditional business intelligence leaves off, big data analysis takes over.

To achieve more rapid, adaptive capabilities, developers began building systems, applications, and frameworks such as Hadoop, NoSQL databases, Oracle’s offerings, and countless other tools to transform unstructured data into timely, fresh insight. Susan Puccinelli of Datameer, a Hadoop data analytics company, emphasizes the power that their program provides. She says, “Hadoop keeps all data in the raw form, so IT doesn’t have to go back and re-model and re-schema your data every time a business user wants to ask a new question or analyze a different set of data.”
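The idea in that quote, keeping data raw and deciding how to read it only when a question is asked, can be sketched in a few lines of plain Python. This is only an illustration of the concept and not Hadoop's API; the sample records, field names, and function names below are all made up for the example.

```python
import json

# Hypothetical raw records, stored exactly as they arrived (one JSON document per line).
# Note that the third record carries an extra "site" field the others lack.
RAW_LINES = [
    '{"patient": "A", "note": "mild headache after dose", "dose_mg": 20}',
    '{"patient": "B", "note": "no complaints", "dose_mg": 20}',
    '{"patient": "C", "note": "headache and nausea", "dose_mg": 40, "site": "Boston"}',
]


def read_with_schema(raw_lines, fields):
    """Parse raw lines at query time, keeping only the requested fields.

    Fields missing from a record come back as None, so records with
    different shapes can sit side by side in the same raw store.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {field: record.get(field) for field in fields}


def count_mentions(raw_lines, keyword):
    """Question 1: how many notes mention a given keyword?"""
    rows = read_with_schema(raw_lines, ["note"])
    return sum(1 for row in rows if keyword in (row["note"] or ""))


def patients_by_dose(raw_lines):
    """Question 2: group patients by dose, a view nobody modeled up front."""
    groups = {}
    for row in read_with_schema(raw_lines, ["patient", "dose_mg"]):
        groups.setdefault(row["dose_mg"], []).append(row["patient"])
    return groups


if __name__ == "__main__":
    print(count_mentions(RAW_LINES, "headache"))
    print(patients_by_dose(RAW_LINES))
```

Both questions run against the same untouched raw lines; asking a third question would mean writing another small reader rather than reloading the data into a new table layout.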

The speed of tools like Hadoop is phenomenal and gives savvy users complex, indispensable information. This information can identify business trends, catch drug side effects in trial medication, use statistics to better combat crime, and support countless other conclusions, all of which executives, scientists, and government officials are eager to capitalize on. Using big data, you gain a glimpse of the bigger picture, and in turn you need a broader, more ambitious mind to envision and predict the right solutions for the future.

Check out part two here!

By James Walsh

2 Responses to “The Essentials of Big Data Analysis”

  1. H.M. says:

    James, good insight. I think it is worth mentioning HPCC Systems, which provides a single platform that is easy to install, manage, and code to. Their built-in analytics libraries for machine learning and integration tools with Pentaho for great BI capabilities make it easy for users to analyze and tame their big data. I believe HPCC is better than Hadoop and commercial offerings; it has a real-time data analytics and delivery engine (Roxie) and runs on the Amazon cloud like a charm through their One-Click portal. For more info visit:

  2. jameswalsh says:

    Thanks for the response. This goes to show the variety of analytics programs and the differing capabilities of each one. That is why it is important to have a command of several programs to advance your proficiency and versatility within the world of data science. We’ll keep an eye on HPCC Systems and other programs as they come along.