First, he sees a Data Scientist as someone with an amalgamation of skills that used to be reserved only for academic institutions or large companies. Now, he sees that even small companies like Twitter and LinkedIn are hiring Data Scientists. Among the skills required to be a Data Scientist, he cites understanding/knowing:
- natural language processing
- what makes data big (unstructured text, unstructured pieces of data like videos, audio)
- how to summarize
- how to categorize
- how to communicate all this easily, and,
- machine learning/artificial intelligence.
This is certainly a more granular list than the NSF’s high-level definition. I’m curious whether private industry will require the skills listed in the NSF definition and, therefore, whether the role will include librarians and information scientists. Magoulas’ list seems biased towards hard Computer Science. That leads me to speculate that “Data Scientist” may eventually be “owned” solely by Computer Scientists. (I admit to some selfish concerns here, as I was just starting to think of myself as a Data Scientist. Yet, while I have many of the skills Magoulas cited above, I am by no means a hard-core Computer Scientist; I am very much an Information Scientist.)
On the other hand, Magoulas mentions that you have to “manage Big Data” (the “Data Manager” from my second post) but that you then have to “make sense of it” [it being the data] (which is the Data Scientist). The human factor is still very much a “part of the process” (of decision making), in his view. That leads me to believe that librarians and information scientists will be a part of this new world of data sense-making.
What do you think the future holds for the Data Scientist role?