Are you being bombarded with emails about “big data”, weekly if not daily? Do you hear the term big data being tossed about as if everyone should know what it is? Then surely, you must be asking yourself – What is big data anyway? This whitepaper will help answer that question through a high level overview of this concept addressing:
No one is quite sure who coined the term “big data” and began using it in the context we see today. The earliest hints are attributed to John Mashey at Silicon Graphics in the early 1990s1. In its simplest form, big data is nothing more than a large collection of data residing in storage databases. The key attribute is that the amount of data stored exceeds the organization’s storage and computing capacity, overwhelming the organization and inhibiting the ability to analyze the data for decision-making purposes.
The magnitude or extent of big data is dependent on an individual organization’s computing and storage capacity. Today, typical big data magnitudes are on the order of terabytes (one trillion, or 1,000,000,000,000, or 10^12 bytes) and petabytes (10^15), with exabytes (10^18) and zettabytes (10^21) not far away. To put it in perspective, 1 terabyte equates to the data from 2,000 hours of CD-quality music while 10 terabytes equates to the data found within the entire U.S. Library of Congress’ print collection. Two petabytes could store all U.S. academic research libraries.
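For readers who want to check the arithmetic behind these magnitudes, here is a minimal sketch in Python. It assumes decimal (power-of-ten) units as given in the text. The names `UNITS`, `to_bytes` and `convert` are hypothetical helpers introduced only for illustration.

```python
# Decimal byte units named in the text, expressed as powers of ten.
# (Assumption: decimal units, matching the 10^12 / 10^15 figures above.)
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def to_bytes(amount, unit):
    """Return the number of bytes in `amount` of the given unit."""
    return amount * UNITS[unit]


def convert(amount, from_unit, to_unit):
    """Convert an amount from one unit to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


# Using the text's figure of 10 terabytes for the Library of Congress
# print collection, count how many such collections fit in 2 petabytes.
collections_in_two_petabytes = to_bytes(2, "petabyte") // to_bytes(10, "terabyte")

if __name__ == "__main__":
    print(convert(2, "petabyte", "terabyte"))
    print(collections_in_two_petabytes)
```

Each step up the scale is a factor of 1,000, so an organization that is comfortable with terabytes today may find petabytes a thousandfold harder to store and process.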
Industry expert IDC (International Data Corp) describes the growth in data as being a five dimensional phenomenon2:
Today, data is viewed quite differently than years ago. Traditionally, data had to be provided in a “structured” (numerical) format in order to be mined and analyzed. By some estimates, structured data today represents about 10 – 30% of a company’s data. The remaining 70 – 90% of data is considered “unstructured,” meaning freeform text, images, audio and video4. Unstructured data comes from websites, correspondence, customer service center records, social media, customer complaints and many other sources. It is contained in document repositories, emails, spreadsheets, audio/visual files, social media sites and texting channels. The availability of unstructured data and the ability to extract meaningful information from it is a significant difference adding to the big data/big analytics phenomenon.
Historically, because computing capacity was relatively limited, organizations have been confined to analyzing subsets of data for decision-making purposes, or they were limited to simplistic analysis because the volume of data overwhelmed their processing platforms. Organizations could be overlooking important trends because they lack the processing power, storage capacity or tools to effectively analyze the full extent of such data. Today, however, we have specialized software tools and extensive computing power to tackle this problem thanks to the emergence of big data analytics.
Big data analytics is the expedient processing of large volumes of data using new hardware and software technologies to extract meaningful information to make better decisions5. Correlations that were never possible may be developed if there is a large enough dataset, appropriate analytical tools and sufficient computing power. Hidden patterns emerge allowing for better decision making and assessment in fields such as business, healthcare, epidemiology and so on.
A number of new technological advancements that are enabling organizations to make the most of big data and big data analytics include6:
The benefit of this capability is better business decisions through the analysis of whole datasets instead of smaller subsets in a fraction of the time – in minutes or hours compared to days or weeks.
Big data and big data analytics are application independent, meaning that the data can be generated and the analytics can be used by a variety of applications.9 The vision for big data is that organizations will harness more relevant data, apply analytical tools and use the results to make the best decision. According to a survey conducted by industry expert Economist Intelligence Unit in June 2011, there is a strong link between an organization’s effective data management and its financial performance. Erik Brynjolfsson, an economist at the Sloan School of Management at the Massachusetts Institute of Technology, found that companies that adopted data-driven decision making achieved productivity boosts of 5-6%10.
Aside from potential increases in profits, efficiency and national safety, big data is giving rise to many economic opportunities such as:
As the old adage goes, “knowledge is power.” Big data and big data analytics are tools that facilitate the acquisition of knowledge, so they are clearly a powerful resource, delivering game-changing insights and competitive advantages. As big data changes the rules of the game for organizations of all sizes, it has become an industry unto itself. Companies have no choice but to participate in the opportunity or risk losing market share and falling behind.
Looking ahead, this is a trend that’s here to stay. The relative affordability and availability of such analytical tools allow organizations of all sizes to significantly improve their decision-making and business plan execution. Big data means big opportunities, and the companies willing to leverage these tools will be positioning themselves to best compete within their industry segments both in the near and long term.
To learn more about how OneBeacon Technology Insurance can help you manage online and other technology risks, please contact Dan Bauman, Vice President of Risk Control for OneBeacon Technology Insurance at firstname.lastname@example.org or 262.966.2739.
1 Lohr, Steve (February 1, 2013). “The Origins of ‘Big Data’: An Etymological Detective Story.” New York Times. Retrieved October 14, 2013. http://bits.blogs.nytimes.com/2013/02/01/the-origins-of-big-data-an-etymological-detective-story/?_r=0
2 “What is Big Data?” SAS Website. Retrieved October 2013. http://www.sas.com/en_us/insights/big-data.html
3 Ibid 2
4 Pope, David. “From Big Data to Meaningful Information”. SAS. Retrieved October, 2013. https://www.sas.com/content/dam/SAS/en_us/doc/conclusionpaper1/from-big-data-to-meaningful-information-106328.pdf
5 “Big Data Analytics – Why is it important?” SAS. Retrieved October 2013. http://www.sas.com/big-data/big-data-analytics.html
6 Ibid 2
7 Ramanathan, Deepak (2012). “Big Data Meets Big Analytics”. SAS. Retrieved October 2013. http://de.slideshare.net/deepakramanathan/big-data-meets-big-analytics
8 Ibid 4
9 Ibid 2
10 “Big Data Harnessing a game changing asset”. (September 2011). Economist Intelligence Unit. Retrieved October 2013. http://www.sas.com/resources/asset/SAS_BigData_final.pdf
11 Feldman, Susan and others (June 2012). “Unlocking the Power of Unstructured Data”. IDC Health Insights. Retrieved October 2013. http://www-01.ibm.com/software/ebusiness/jstart/downloads/unlockingUnstructuredData.pdf
12 Rathnam, Lavanya (June 7, 2013). “How Big Data was used to find the Boston Bombers”. iCrunchData News. Retrieved October 2013. http://news.icrunchdata.com/post/2013/06/07/big-data-boston-bomber
13 “Building Believers – how to expand the use of predictive analytics in claims”. SAS. Retrieved October 2013. http://www.sas.com/en_us/whitepapers/building-believers-predictive-analytics-claims-106256.html
14 Ibid 11
15 Ibid 10
16 Manyika, James (May 2011). “Big data: The next frontier for innovation, competition, and productivity”. McKinsey Global Institute. Page 8. Retrieved October 2013. http://www.mckinsey.com/business-functions/business-technology/our-insights/big-data-the-next-frontier-for-innovation