
Dark Data: What Lies Beneath

Published on Jan 18, 2019

“The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships, and direct monetizing).” - Gartner

Dark data is the data that sits unused in the log files and archives of large enterprises' storage systems. Simply put, big data is the visible tip of the iceberg, while dark data is the larger, submerged mass that we cannot see. Categorizing dark data can be a complicated and tedious process, and the approach usually varies from company to company because much of the data is unstructured. Common examples include customer information, website log files, financial information, raw survey data, account information, and email correspondence.

Using dark data to generate revenue is appealing, but it comes with a few issues that need to be addressed. If you think of dark data as clutter in a hoarder’s house, the first problem becomes clear: space. As unorganized data continues to grow, it uses up storage that could otherwise hold valuable assets. More storage means additional overhead costs, which are already a major concern in most companies, particularly in the era of big data.

Apart from increased storage costs, massive volumes of unstructured or unorganized data can create serious security risks, because dark data may contain sensitive or proprietary information. Sensitive data must be handled within the applicable legal and regulatory frameworks, as losing it can have serious financial and reputational repercussions. For example, a breach of a customer’s private records could lead to identity theft, and a leak of confidential company information, such as data on product research and development, could cost the company its competitive edge. These risks can be mitigated by auditing the data to evaluate how useful it is to the organization, classifying it under labels that carry access controls, and protecting it with strong encryption and other security measures.
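The audit-then-label step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label names (`restricted`, `internal`), the role names, the two regular expressions, and the helper functions `label_record` and `can_read` are all hypothetical, and a real audit would rely on much more robust detection than pattern matching. Encryption is left out of the sketch entirely.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for spotting personal data in raw text.
# They are deliberately simple and will miss many real-world cases.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

# Hypothetical policy: which roles may read data under each label.
ACCESS_POLICY = {
    "restricted": {"security", "compliance"},
    "internal": {"security", "compliance", "analytics"},
}


@dataclass
class AuditedRecord:
    source: str  # where the data was found, e.g. a file name
    label: str   # sensitivity label assigned by the audit


def label_record(source: str, text: str) -> AuditedRecord:
    """Label a record 'restricted' if it appears to hold personal data, else 'internal'."""
    if EMAIL_RE.search(text) or CARD_RE.search(text):
        return AuditedRecord(source, "restricted")
    return AuditedRecord(source, "internal")


def can_read(role: str, record: AuditedRecord) -> bool:
    """Check the role against the access policy for the record's label."""
    return role in ACCESS_POLICY[record.label]


if __name__ == "__main__":
    record = label_record("crm_export.csv", "contact: jane@example.com")
    print(record.label, can_read("analytics", record))
```

Keeping the policy in a plain dictionary makes the label-to-role mapping easy to review, which matters more in an audit than clever detection logic.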

At present, enterprises have explored only a minuscule portion of the dark data universe for analytical purposes. Dark data analytics takes a backseat in company priorities because of technology, process, and investment constraints. Tapping into dark data may prove costly, but the outcome may be worth the investment. Moreover, recent advances in technologies such as pattern recognition, cognitive analytics, and computer vision are making it possible to explore the business insights dark data has to offer. With dark data, an organization can derive insights from signals it is not currently using. These insights can improve decision making in production, supply chain, business processes, and sales and accounts, and in some cases may even help keep a business from going bankrupt.

