Studied Mathematics, graduated in Cryptanalysis, working as a Data Scientist. Interested in algorithms, probability theory, and machine learning. Python user.

There exists a vast number of great articles describing how Bagging methods like Random Forests work on an algorithmic level and why Bagging is a good thing to do. Usually, the essence is the following:

“You train a lot of Decision Trees on different parts of the training set and average their predictions into a final prediction. The prediction gets better, because the variance of the Random Forest is smaller compared to the variance of a single Decision Tree. (dartboard.png)”

— some article

Of course, I am paraphrasing here. The articles include great pictures, code, and many more thoughts. But…
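The variance claim in the paraphrased quote can be checked with a small simulation. This is only a sketch under a strong assumption: each "model" here is the true value plus independent Gaussian noise, which is not how real Decision Trees behave (bagged trees are correlated, so the reduction is smaller in practice). The function names and the numbers `truth`, `noise`, and `n_models` are made up for illustration.

```python
import random
import statistics


def noisy_prediction(rng, truth=1.0, noise=1.0):
    """One high-variance 'model': the true value plus Gaussian noise (assumption)."""
    return rng.gauss(truth, noise)


def averaged_prediction(rng, n_models, truth=1.0, noise=1.0):
    """Average the predictions of n_models independent noisy models."""
    return statistics.fmean(
        noisy_prediction(rng, truth, noise) for _ in range(n_models)
    )


rng = random.Random(0)  # fixed seed so the run is reproducible

# Repeat the experiment many times to estimate the variance of each predictor.
single = [noisy_prediction(rng) for _ in range(5000)]
ensemble = [averaged_prediction(rng, n_models=25) for _ in range(5000)]

var_single = statistics.variance(single)
var_ensemble = statistics.variance(ensemble)

print(f"variance of a single model:    {var_single:.3f}")
print(f"variance of the 25-model mean: {var_ensemble:.3f}")
```

With independent models, the variance of the average shrinks roughly by a factor of `n_models`. How much of that survives when the models are correlated is exactly the kind of question the quote glosses over.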

Imagine that, for whatever reason, you want to follow a diet consisting of apples and strawberries only. You don’t really favor one fruit over the other, but you want to make sure that you…

- get enough vitamin C and
- get enough calories.

In addition to having a different amount of vitamin C and calories, apples and strawberries also come with a different price tag. So it’s natural to ask for the **cheapest** diet that fulfills both of your *constraints*.
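Written out with symbols, this is a small linear program. All symbols below are assumptions introduced for illustration, not values from the article: $a$ and $s$ are the amounts of apples and strawberries, $p_a, p_s$ their prices per unit, $v_a, v_s$ their vitamin C content, $c_a, c_s$ their calories, and $V_{\min}, C_{\min}$ the required minimums.

$$
\begin{aligned}
\min_{a,\,s}\quad & p_a a + p_s s \\
\text{subject to}\quad & v_a a + v_s s \ge V_{\min}, \\
& c_a a + c_s s \ge C_{\min}, \\
& a \ge 0,\; s \ge 0.
\end{aligned}
$$

The objective is the total price, and each requirement becomes one linear inequality.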

Let’s say that you want to consume at least **300 mg of vitamin C** (which is way too much, but it…

From time to time, you have to choose between two options. This might be in completely **uninformed** situations where you can pick the better option with a mere probability of 50%. In some of these uninformed cases, you can even boost this probability with a simple trick, as demonstrated in my other article.

However, usually, you are able to **gather some information** that helps you pick the better option with high probability. One easy, yet smart method to do this is *A/B testing*, which you have probably heard of or even used already.

When working on regression problems, you often have target values that are continuously and evenly distributed in some range. Let me illustrate what I mean by this. Consider the following 2-dimensional dataset:

In the machine learning community, I often hear and read about the notions of *interpretability* and *accuracy*, and how there is a trade-off between them. Usually, it is depicted somewhat like this:

A classic task for us data scientists is building a classification model for some problem. In a perfect world, data samples, including their corresponding labels, are handed to us on a silver platter. We then do our machine learning tricks and *mathemagic* to come to some useful insights that we derived from the data. So far so good.

However, what often happens in our imperfect yet beautiful world is one of the following:

In this article, I want to introduce you to a simple problem with an easy-to-apply, yet awfully unintuitive solution. It is one kind of *envelope problem* and goes like this:

There are two envelopes with some different amounts x and y of money in them. The envelopes look exactly the same and are randomly shuffled before they reach your hands.

Recursion is an important concept in mathematics and computer science that comes in many flavors. The essence of them is the following:

There is an object that consists of smaller versions of itself. Usually there is a smallest, atomic object — this is where the recursion ends.

We are especially interested in solving problems using recursions. For example, sorting numbers or other elements, i.e. turning an input array like `[1, 4, 5, 2, 6, 3, 6]` into `[1, 2, 3, 4, 5, 6, 6]`.

This is a fundamental problem in computer science and has been extensively studied by many…
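To make the recursive idea concrete, here is a minimal sketch of one recursive sorting algorithm, merge sort. It is chosen here purely as an illustration; the article may well discuss a different algorithm, and the function names `merge_sort` and `merge` are my own. The list is split into two smaller lists, each half is sorted by the same procedure, and the recursion ends at the atomic case of a list with at most one element.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element of the two lists.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the lists still has elements left; append them.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list using recursive merge sort."""
    # Base case: a list with zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    # Recursive case: sort the two smaller versions of the problem.
    left = merge_sort(items[:middle])
    right = merge_sort(items[middle:])
    return merge(left, right)


print(merge_sort([1, 4, 5, 2, 6, 3, 6]))  # [1, 2, 3, 4, 5, 6, 6]
```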

In mathematics, there are thousands of theorems to be proven. Often, we tailor unique proofs for one of these theorems. This can be beautiful, but extremely difficult at the same time. Think about proofs of theorems that involve **constructing** a desired object.

As a small example, consider the following “theorem”:

Another day, another classic algorithm: *k*-nearest neighbors. Like the naive Bayes classifier, it’s a rather simple method to solve classification problems. The algorithm is intuitive and has an unbeatable training time, which makes it a great candidate to learn when you just start off your machine learning career. Having said this, making predictions is painfully slow, especially for large datasets. The performance for datasets with many features might also not be overwhelming, due to the curse of dimensionality.

In this article, you will learn

- how the *k*-nearest neighbors classifier works
- why it was designed like this
- why it has these…
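As a rough sketch of the idea, a *k*-nearest neighbors prediction can be written in a few lines of plain Python. This is an illustrative version under simple assumptions (Euclidean distance, majority vote, no tie-breaking rule beyond sort order), not the article's implementation; `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # "Training" is just storing the data; all the work happens at prediction time.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated clusters (made up for illustration).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

The sketch also shows where the cost sits: there is no training step to speak of, but every prediction computes the distance to every stored point, which is why prediction becomes slow on large datasets.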