The data deluge refers to the increasingly large and complex data sets generated by researchers that must be managed by their creators with “industrial-scale data centres and cutting-edge networking technology” (Nature 455) in order to provide for use and re-use of the data.
The lack of standards and infrastructure to appropriately manage this (often tax-payer funded) data requires data creators, data scientists, data managers, and data librarians to collaborate in order to create and acquire the technology required to provide for data use and re-use.
This blog is my way of sorting through the technology, management, research and development that have come together to successfully solve the data deluge. I will post and discuss both current and past R&D in this area. I welcome any comments.
Do you have any additional definitions of data deluge?