The authors of a 2005 National Science Foundation (NSF) report defined five actors in data management: data users, data authors, data managers, data scientists, and funding agencies. Today, I will compare the data scientist with the data manager.
First, what are the shared goals of the five actors in data management?
- ensure that all legal obligations and community expectations for protecting privacy, security, and intellectual property are fully met;
- participate in the development of community standards for data collection, deposition, use, maintenance, and migration;
- work towards interoperability between communities and encourage cross-disciplinary data integration;
- ensure that community decisions about data collections take into account the needs of users outside the community;
- encourage free and open access wherever feasible; and
- provide incentives, rewards, and recognition for scientists who share and archive data (NSF, 2005).
To meet these goals, an organization will need one or more individuals who can fill the roles of data scientist and data manager. I say “one or more” because I believe that, at one time or another, a researcher may find him- or herself acting as the sole data user, author, manager, and scientist.