Finding a needle in a haystack is a thankless endeavour. All the more so when you keep adding hay, making the chances of recovering that elusive item even slimmer.
You can say that about data today too. Users, companies and governments are generating more data each day, even backing it up meticulously, but they often do not give enough thought to what to do with it afterwards.
Big Data tools can help plough through the data, including random and unstructured data, but even then, you need to know where to look. Indeed, you need to know that you actually have that data on hand in the first place.
Ask many large organisations whether they know exactly what data they hold and you might not get a definite answer. Many suffer from server sprawl and over-extended, under-reported IT resources.
Last year, when Sony Pictures was attacked online, hackers made off with gigabytes of data. Its IT folks had previously had difficulty keeping track of all its hardware in 30 data centres. Like many organisations, the company’s IT infrastructure had grown to a point where there was not enough visibility into what was online.
Ironically, at a time when IT budgets seem to be tightened all the time, there is no lack of public infrastructure to tap. Cloud computing has allowed many organisations to expand first, think later.
Scalability, affordability and agility are buzzwords today, but what of accountability? Or visibility of one’s IT resources?
Let’s not even bring in the shadow IT setups that employees increasingly use to run projects that cannot wait for IT departments. What happens to all those files on Dropbox, or action plans left on some hosted service somewhere in the cloud?
If you can’t find your data, you can’t mine it for information. Or for actionable intelligence, whether to market future products better or simply to help new staff get up to speed faster.
By 2020, there will be some 44 zettabytes of data created and copied in the digital universe, 10 times the 4.4 zettabytes in 2013, according to a study by research firm IDC last year.
However, only 5 per cent is actually “target-rich” data that is worth analysing, it notes. Actual high-value data? That’s even less, at just 1.5 per cent.
Finding that needle in the haystack may be even harder in the coming years, as more data gets generated, transferred and stored digitally.
The haystack analogy actually came from Veritas at its partner conference in Macau this week. The enterprise storage vendor is pitching its data management technology as a way to give organisations a better view of where their data is stored.
They can then find out who owns it and assign meaning and importance to it. With that, organisations can begin to know where to mine their data for analysis.
Essentially, Veritas’ tools show where to look so there’s a better chance of finding meaningful information. Tagging one’s information, for starters, makes the important stuff easier to find.
It is a compelling sell, especially for organisations that have pushed their data out to the cloud over the past few years. They will be trying to make sure the data is not just available but available for the right uses.
That’s a strategy that could apply to a number of scenarios. Singapore’s push to be a smart nation, for example, would do well to make sure that data collected by an increasing array of sensors is not just sent to some servers on the Net and forgotten.
Besides analytics, another area that would benefit from improved visibility is privacy. A company collecting information on customers has to make sure it complies with data privacy laws governing the storage of that data.
The financial industry has traditionally had strict rules; now other companies need to take heed of such regulations too.
A hotel, for example, will collect information on guests who stay with it, but it may have to erase data that is no longer needed, if local regulations demand that. To do so, it has to know where that particular portion of its massive store of data is kept.
It’s true everyone on the planet is creating too much data to make sense of, thanks to the proliferation of cameras, phones and PCs. Cheap storage has allowed users to keep that data – photos, blogs or accounts – forever now.
Yet the importance of managing that data is only now dawning on many. Without such management, you’ll be hard pressed to find the needle in a haystack that is filling up with more hay all the time.