Python and Machine Learning

Today, I read a little more of the HDInsight eBook and ran a word count MapReduce program on the HDInsight emulator. The canned approach presented in the book wasn't holding my interest. I've been chomping at the bit to get back to doing some Python, as it (along with R) is heavily used in data science.
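For anyone who hasn't seen the word count example, the idea boils down to a map step that emits a (word, 1) pair for each word and a reduce step that sums the pairs for each word. Here is a rough sketch of that pattern in plain Python. It is not the code from the eBook and it doesn't touch HDInsight or Hadoop at all; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are made up for illustration, and everything runs in memory on a single machine.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted counts by word (a real framework does this between map and reduce)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_counts(word, counts):
    """Reduce step: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "The lazy dog"]))
```

On a real cluster the map and reduce steps would run on different nodes over chunks of a much larger file, but the shape of the computation is the same.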

I discovered that the Python Tools for Visual Studio (PTVS) have evolved very well, supporting IntelliSense and debugging and in general making Python a first-class language in Visual Studio. Visual Studio has been my IDE of choice for years, and I'm feeling less inclined to go the Linux way these days as Microsoft has taken great steps toward open source. I've now got PTVS installed and am ready to go.

The other part of learning is knowing what path to follow. The web provides a candy store of options for someone wanting to get into data science in Python. One site I learned about, through someone's tweet, I think, is Machine Learning Mastery, which is maintained by Jason Brownlee, a professional programmer who studied data science in a graduate degree program. Jason is the perfect colleague for anyone interested in getting into ML. He has written a guide for getting into the field, and he blogs regularly about ML and what beginners and practitioners should do. And he encourages noobs to reach out to him. Awesome!

Last year, I read Melanie Mitchell's book, Complexity: A Guided Tour, which was a great introduction to the possibilities of machine learning. So, this may be a great way to get my feet wet in data science. I really love the idea of writing a program that learns from experience.