Big Data

Most of them use big data software such as Hadoop, MapReduce, or Spark to achieve distributed processing.
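To give a rough feel for the MapReduce idea mentioned above, here is a minimal single-machine sketch in plain Python. It does not use Hadoop or Spark. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample sentences are made up for illustration; a real framework would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    pairs = []
    # On a cluster, each document (or block of a file) could be mapped on a different node.
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle(pairs))


# Hypothetical input, only for illustration.
counts = word_count(["big data big insights", "data beats intuition"])
print(counts)
```

The point of splitting the work this way is that the map step needs no shared state, so it can be spread across as many servers as are available.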

Data munging is the process of converting the raw form of data into a form that is easy to study, analyze, and visualize. The visualization of data and its presentation are an equally important set of skills on which a data scientist relies heavily when facilitating managerial and administrative decisions using data analysis.
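As a small illustration of what munging can look like in practice, the sketch below cleans a few messy comma-separated lines into uniform records and then totals them per product. The sample rows, the field layout (date, product, amount), and the helper name `munge_row` are assumptions invented for this example, not data from any real source.

```python
from datetime import datetime

# Hypothetical raw input: stray spaces, mixed case, thousands separators, missing values.
RAW_ROWS = [
    " 2011-03-01 , Widget ,  1,200.50 ",
    "2011-03-02,gadget,n/a",
    "2011-03-03, WIDGET ,300",
]


def munge_row(line):
    """Turn one raw line into a clean record, or None if the amount is unusable."""
    # Split on the first two commas only, so "1,200.50" stays in one piece.
    date_text, product, amount_text = [part.strip() for part in line.split(",", 2)]
    try:
        amount = float(amount_text.replace(",", ""))
    except ValueError:
        return None  # e.g. "n/a": drop the row rather than guess a value
    return {
        "date": datetime.strptime(date_text, "%Y-%m-%d").date(),
        "product": product.lower(),
        "amount": amount,
    }


clean = [row for row in map(munge_row, RAW_ROWS) if row is not None]

# With clean records, analysis becomes a simple aggregation.
totals = {}
for row in clean:
    totals[row["product"]] = totals.get(row["product"], 0.0) + row["amount"]
print(totals)
```

Dropping unparseable rows is only one possible policy; depending on the analysis, it may be better to keep them and flag the missing value.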

Search the internet for data science projects (Google quandl) and invest your time building your own forte, along with zeroing in on the areas that still require brushing up.

A data scientist is a team player, and when you are working together with a group of like-minded people, being a keen observer always helps. Learn to develop the intuition required for analyzing data and making decisions by closely following the working habits of your peers and decide what best suits you. Communication skills differentiate a great data scientist from a good data scientist. More often than not, you find yourself behind closed doors explaining the findings of your data analysis to people who matter, and the ability to have your way with words will always come in handy when tackling unforeseen situations.

Websites such as Kaggle are a great training ground for budding data scientists as they try to find teammates and compete against one another to showcase their intuitive approaches and hone their skills. With the rising credibility of the certifications provided by such sites in the industry, these competitions are fast becoming a stage to show companies how innovatively your mind works.

GOTO Metrics, a two-year-old company (in 2011), renamed itself Zettaset, raised $3 million, and relaunched. It raised this money from Draper Fisher Jurvetson and EPIC Ventures to help employees make business decisions based on real data as opposed to intuition. It still offers a software toolset that runs on top of a group of servers and unifies the existing databases so someone can mine them for insights. It’s those insights that will end up adding value to technology companies. “It’s not just about the technology anymore; it’s about the data and if you can produce more granular insights from your data.”

Zettaset, whose name is an attempt to reflect the growing amount of data available (a zettabyte consists of more than a million petabytes), helps make implementing Hadoop on top of existing databases easier and also offers a user-friendly graphical user interface so people can then play around with the data. They can also export it to more familiar programs such as Excel spreadsheets.
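The zettabyte-to-petabyte ratio mentioned above can be checked with a few lines of arithmetic. Which definition the original source had in mind is an assumption here: with decimal (SI) units the ratio is exactly one million, and with binary units it is slightly more.

```python
# Decimal (SI) units: 1 PB = 10**15 bytes, 1 ZB = 10**21 bytes.
PETABYTE = 10**15
ZETTABYTE = 10**21

# Binary units: 1 PiB = 2**50 bytes, 1 ZiB = 2**70 bytes.
PEBIBYTE = 2**50
ZEBIBYTE = 2**70

si_ratio = ZETTABYTE // PETABYTE
binary_ratio = ZEBIBYTE // PEBIBYTE

print(si_ratio)      # 1000000
print(binary_ratio)  # 1048576
```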

Executives want to know what is being said about their companies or products on Twitter, Facebook, and other social media websites. They should pay a company that already has connections with these social media companies (and so already has access to those data hoses) to analyze this data, or they should use SurveyMonkey. Is there a way for the company to push out a message or a survey to these social pages (websites) or to individual people? There is Reputation Defender, and there are other companies that focus on mining and analyzing data produced by government agencies on consumer behavior and habits. What about data that comes from inside the company (such as product or equipment usage)?

How can we make big data analytics available to smaller companies that might not even have the capability to collect intelligence data? Analyze their operations. Offer a custom solution (if necessary) to help them collect the data. Have tools that are easy to use so that a BA (Business Analyst) can analyze the data. In effect, they rent the Hadoop guy and a BA from us.

The bigger opportunity isn’t in enabling this shift; it’s in what the shift can do for businesses and society. For example, making data available to the average Joe has helped boost consumer-facing startups such as the energy-consumption startup OPower, enabling them to offer amazing insights really quickly. These insights can help change behaviors.

In the government arena, opening up data has the power to make government entities more accountable or even deliver results and insights that government can’t. There are also stories such as this one from the New Yorker, about a doctor in New Jersey who crunched data and then built a pilot program that reduced medical costs for the most expensive patients in the city by more than 50 percent. It made people healthier too.

Better analytics of bigger data should enable more people to make the leap from intuition to insight — or even see an insight without ever having the intuition that drives them to look for data to back it up. It’s the difference between looking for treasure using a map of pirate sea routes and historical storm data and looking for treasure by trying to think like a pirate.
