“I have had my results for a long time: but I do not yet know how I am to arrive at them.” - Carl Friedrich Gauss
During the winter of 2018, I decided to work through the SKLearn library in Python and try to rigorously understand most of the algorithms implemented in it. To my surprise, I found that it was rather difficult to find accessible but comprehensive explorations of the algorithms; most blogs skimmed over the mathematical underpinnings of machine learning, and most papers presupposed great familiarity with the field.
This section of my blog is devoted to exploring ML in a way that is rigorous but still practical and accessible to a relatively broad audience. Obviously, not every reader will be interested in every aspect of each post. You might want to simply gain a practical understanding of when to use a certain clustering algorithm; or you might want to learn why expectation-maximization optimization really works. However, I’m hoping that most people will find something interesting in these posts.
I think there are at least two reasons that it’s worth deeply understanding ML algorithms:
- First, it’s fun! The math behind statistical inference and machine learning is really cool. The people who developed it were remarkably smart and creative.
- More practically, it will allow you to write more effective code. It’s much easier to figure out why your model isn’t working if you actually understand how your model works.
What Will The Posts Look Like?
In order to let readers easily jump around, I will loosely divide each post into the following four sections:
- Motivation: I’ll probably start with a mix of examples to understand when and why an algorithm is useful, and then describe how the algorithm works intuitively.
- Derivation: We’ll do some math to prove the algorithm works!
- Implementation: To understand techniques from top to bottom, I’ll implement them from scratch (i.e. without using pre-existing Machine Learning libraries).
- Practical Use: I’ll discuss how you might practically go about using these algorithms (e.g. using libraries like SKLearn, TensorFlow, PyTorch).
Throughout, I’ll try to focus extensively on use cases of each of the algorithms, i.e. the kinds of situations in which you’d use a Naive Bayes classifier over an SVM or vice versa. Admittedly, not every post will follow this exact structure, but each post should generally cover this kind of information.
My (tentative) plan for this section of my blog is to work through a diverse selection of machine learning algorithms in order of complexity.
- I’ll start with a couple of quick refreshers on probability theory and probabilistic modeling. If you’re just starting to learn about Data Science, these might be worth reviewing.
- I’ll spend a lot more time discussing canonical techniques in statistical inference, including Maximum Likelihood Estimation & Information Theory, Bayes Estimators, Generalized Linear Models, and more. (If you don’t know what those terms mean, that’s okay - in fact, that’s the point!)
- I’m thinking that I’ll then talk about some non-parametric methods related to kernel density estimators - if you’ve heard of k-nearest neighbor classification/regression, that’s a good example.
- Lastly, I’ll spend a while talking about more sophisticated deep learning techniques.
I’m always learning, and I’m sure I will make mistakes in the blog. If you find inaccuracies in my posts, please let me know either by opening an issue on GitHub (preferred) or emailing me at firstname.lastname@example.org.