In the last post, I wrote about how good it is to see OR linked as a skillset to data science. However, note that OR is only one part of the data science skillset: OR ≠ Data Science. So how does an Operations Researcher transition to a Data Scientist?
The O'Reilly book I talked about in the last post briefly suggests a few things for an OR person to learn more about: some of the new Bayesian / Monte Carlo statistics methods, broad programming skills, data warehouse architecture for big data technology, and business skills "to be able to intelligently collaborate with (or lead) others on a data science team".
For those looking to upgrade, here are my quick thoughts on where to start.
Bayesian Data Analysis: Andrew Gelman from Columbia is running a course on Bayesian Data Analysis *right now*, with Google+ Hangout sessions. Looks very interesting.
Programming skills: see my previous post on learning R and Python - the languages of data science.
Big data architecture: in my experience, first understand the layers of a normal data warehouse architecture, then broaden to the enterprise BI architecture stack, then learn about the new bits that address the "big" aspect. I was fortunate to have led a fairly big project in this area, and had the opportunity to work with some great data warehouse architects and enterprise BI architects and learn a ton from them. I'm not sure what the best self-learning material is, other than the typical read-a-lot. Wikipedia doesn't seem to cut it, and the best material that helped me isn't publicly available. Hmm...I will have to think about this - a topic for another post perhaps. In the meantime, Pivotal seems to do a fairly good job on their blog of explaining the "big" data technology bits in simple, practical terms.
Business skills: I think this gap only applies to academics (sorry for the generalisation). For practitioners, i.e. OR people working in and with businesses, business skills are already a fundamental part of our jobs.