Applying math to solve different problems.

Approximate Bayesian Computation with Wrocław trams


Introduction

Recently, I presented some results from the field of Simulation-Based Inference at the Statistical Journal Club at the Astronomical Observatory. While modern neural networks enable some incredible models for Likelihood-Free Inference (LFI), there are also some more old-school examples from the field of statistics, usually grouped under the label of Approximate Bayesian Computation (ABC). Here, I would like to write about a classical implementation and apply it to one of my favorite problems, known as the German tank problem (although here there will be trams).
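As a taste of what the post covers, here is a minimal sketch of classical rejection ABC applied to the tank/tram problem. Everything here is an illustrative assumption of mine, not the post's actual setup: trams are numbered 1..N, we observe a handful of numbers, the summary statistic is the sample maximum, and the prior, fleet size, and tolerance are made up for the example.

```python
import random

random.seed(0)

TRUE_N = 400  # hypothetical true fleet size (illustration only)
observed = random.sample(range(1, TRUE_N + 1), 15)  # tram numbers we "saw"
obs_max = max(observed)

def simulate(n, k):
    """Draw k tram numbers without replacement from a fleet of size n."""
    return random.sample(range(1, n + 1), k)

accepted = []
for _ in range(30_000):
    n = random.randint(obs_max, 2000)  # flat prior on the fleet size
    sim = simulate(n, len(observed))
    # Rejection step: keep n only if the simulated summary statistic
    # (the maximum) lands close to the observed one.
    if abs(max(sim) - obs_max) <= 5:
        accepted.append(n)

posterior_mean = sum(accepted) / len(accepted)
```

The accepted values of `n` form an approximate posterior sample; with the maximum as the summary statistic, its mean should land close to the classical `obs_max * (k + 1) / k` estimate.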

Read more ⟶

GMVAE clustering applied to RNA sequencing


Introduction

One of the classical tasks of machine learning is clustering: given data, one would like to group similar observations into clusters. There are many classical approaches to clustering; the most notable ones include K-means, GMM (Gaussian Mixture Model) trained with the EM algorithm, and DBSCAN. While each of those algorithms is widely used in research and industry, all of them try to cluster the data using its original representation, and they mostly fail when the similarity between observations is hidden in that representation.

In this post, I present a GMM+VAE deep learning architecture, which will be used to cluster the data based on a learned embedding. The clustering model is based on this paper, while the application to RNA sequencing data is based on work that I performed during my studies at the University of Warsaw. The original project with a solution can be found in a related GitHub repository. The dataset used in the study was taken from the NeurIPS 2021 competition. In this blog, we will tackle the joint embedding part of the competition: we will try to embed biological information in an unsupervised manner and, at the same time, reduce the impact of the batch effect on the model's performance.
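To make the classical baseline concrete, here is a tiny sketch of K-means (the first algorithm mentioned above) in pure Python, on made-up 1-D "expression" data; this is just the baseline for contrast, not the GMVAE model from the post, and the data and parameters are my own assumptions.

```python
import random

random.seed(1)

# Two well-separated synthetic 1-D clusters (stand-ins for expression values).
data = [random.gauss(0.0, 0.5) for _ in range(50)] + \
       [random.gauss(5.0, 0.5) for _ in range(50)]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm on scalars: assign to nearest center, re-average."""
    centers = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: abs(p - centers[c]))
            groups[j].append(p)
        # Update step: each center becomes the mean of its group.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

centers = sorted(kmeans(data, 2))
```

On data like this, where clusters are separated in the original representation, the baseline works fine; the post is about the harder case where it does not.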

Read more ⟶

LSTM vs Tolkien


Introduction

This notebook is greatly inspired by this notebook by G. Negrini. Please check out his website, as he makes really good content. In the original notebook, the Keras library was used; this approach is instead based on the Jax + Haiku duo from DeepMind. We will try to distinguish between drug names and Tolkien characters using a simple LSTM model; surprisingly, it isn't as easy as one might think. If you want to challenge yourself, here is a popular website with a great quiz.
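For readers who haven't met an LSTM before, here is a sketch of a single LSTM cell step in plain NumPy, showing the gating equations that the post's Haiku model implements under the hood. The sizes, random weights, and the random vectors standing in for character embeddings are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; W, U, b pack the input, forget, cell, and output gates."""
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)            # slice out the four gates
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c_new = f * c + i * g                  # update the cell state
    h_new = o * np.tanh(c_new)             # new hidden state
    return h_new, c_new

hidden, inp = 8, 4
W = rng.normal(size=(4 * hidden, inp)) * 0.1
U = rng.normal(size=(4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)

h = c = np.zeros(hidden)
for ch in "Galadriel":                     # feed a character sequence
    x = rng.normal(size=inp)               # stand-in for a character embedding
    h, c = lstm_step(x, h, c, W, U, b)
```

In the actual post, Haiku's LSTM module plays this role, with learned character embeddings and a classification head on top of the final hidden state.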

Read more ⟶