math Calculus and History I’ve been reading Steven Strogatz’s [http://www.stevenstrogatz.com/] book, Infinite Powers: How Calculus Reveals the Secrets of the Universe [https://amzn.to/2K89pVM], and I’m marveling at how exciting it is to learn about how calculus came about, from Archimedes’ brilliant methods more than two millennia
math The Mathematics of Security Originally written on May 25, 2019. I’ve been reading Joshua Holden’s The Mathematics of Security [https://amzn.to/2K71Caz], subtitled Cryptography from Caesar Ciphers to Digital Encryption. The book begins with definitions of ciphers, plaintext, ciphertext, cryptanalysis and other common terms used in the general field of cryptography,
Pandas Cheatsheet helps memory retention A cheatsheet is like a reference page, with the difference that it provides a quick lookup of the most common features. I really like this pandas cheatsheet [https://www.dataquest.io/blog/pandas-cheat-sheet/] for that reason. When I haven't done data-wrangling or cleaning in a while, it helps me to
data science Diving into Data Science: My Last 2 Weeks of Attending Conference, Meetup & Workshop It's been a somewhat turbulent but interesting last couple of weeks. In my quest for education and knowledge in the data science field, I attended ODSC West 2017 [https://odscwest.pathable.com/] in San Francisco from November 4th through the 6th, and two local data science meet-ups, one a Discussion
Hypothesis Generation vs. Hypothesis Confirmation I'm just starting to get back to data science and am glad to have found R for Data Science [http://r4ds.had.co.nz/], the book by Hadley Wickham [http://hadley.nz/] and Garrett Grolemund [https://github.com/garrettgman]. It looks to be a good refresher for me. In reading the
Biodiversity and Data Science From a quick review of measurements of biodiversity that I found on Wikipedia, there appear to be at least a couple of mathematical formulas for calculating it. Yet I don't think of biodiversity in such an abstract way. To me the term relates more to how natural our earth environment
TIL: recodes and imputations When I get a data set to explore, the first thing I think about is the cleanliness of the data. If it's survey data, there may be inconsistencies in the collection method, especially if the data spans years of surveys. From reading the freely-downloadable textbook, Think Stats [http://greenteapress.com/
Posting Style I've been getting back into exercising my data science muscles, and as I continue, I'd like to put together some interesting blog posts. But I don't want to wait until I've got a full data story together. So I'm going to start blogging shorter learnings in addition to longer
Musings on where I am and career plans It's been about two years since I started taking the Coursera data science specialization courses. I completed, I think, six of those courses. I still have the Regression Models, Practical Machine Learning and Developing Data Products courses to complete before doing a Capstone project. I'm not sure if I can
Services development Over the past six to nine months I've been gravitating toward services development in Java. Having worked in .NET/C# for most of the past dozen years, I have had to enable my frustration shielding as I learned not just a new language but mostly a new set of development,
Brainstorming a data science project I've finished the R programming course [https://www.coursera.org/course/rprog] and have now moved on to the next course in the sequence, Getting and Cleaning Data [https://www.coursera.org/course/getdata]. In this one, I'll have the chance to further practice with R, reading formats, slicing and
Data Science with R The Johns Hopkins Coursera course on R Programming [https://class.coursera.org/rprog-005] started two weeks ago, and I thought I'd share some of my thoughts on the course and my growing knowledge of R [http://www.r-project.org/]. The programming assignments in this course are, thankfully, challenging. I was
Data Skeptic I was able to get in a couple of workouts at the gym while on vacation this week, giving me an opportunity to catch up on some podcast listens. The Cloud of Data [http://cloudofdata.com/category/podcast/] podcast didn't have any new shows but I discovered a nice new
My Week in Review: Podcast, Resources and Warming up to R I had an interesting week in the world of data science. I started learning some R programming, continued to find new and interesting data science learning resources, and had an interesting coincidence occur relating to data science as practiced at my company. Predictive Analytics and Infer On Tuesday
Enrolling in a Data Science Specialization Program As I said in my first post [http://knowledge-from-data.ghost.io/learningdatascience/] on this blog, I want to write about my personal experience learning and figuring out data science. Over the past week I have had a breakthrough in clarity that was fueled in part by watching a colleague discuss
Python and Machine Learning Today, I read a little more in the HDInsight eBook [http://blogs.msdn.com/b/microsoft_press/archive/2014/05/27/free-ebook-introducing-microsoft-azure-hdinsight.aspx] and ran a word count MapReduce program on the HDInsight emulator. The really canned approach as presented in the book was not keeping my interest. I've been
Azure HDInsight As I thought about how I would get started digging into data science this past week, my idea was to lay some groundwork for some playful programming that would allow me to get some data analysis experience with a skill (programming) that I am already comfortable with. Then I received
Getting Started Learning Data Science I've been working at New Relic [http://www.newrelic.com] for over two years, where I've been developing the .NET agent [http://newrelic.com/.NET] which monitors performance of web applications. This year the company has released an analytics product called Insights [http://newrelic.com/insights], a data analysis tool