Variational Autoencoder (VAE)

December 31, 2022

Here I discuss one of the two most popular classes of generative models for creating images.

Manim (3blue1brown library for math animations) basics

February 20, 2022

A long time ago I dreamed of a website where mathematicians could exchange their ideas in a visual form, not as "canned" formulae. At that point I was basically broke and did not have enough money or passion to create a decent library myself. Years later a friend of mine showed me the 3blue1brown YouTube channel, where a guy was creating beautiful, understandable videos about mathematics (a shame that I had already learnt everything they taught, though). Recently I found out that he had actually open-sourced the library he created for making those animations. This post is about it.

Lasso regression implementation analysis

February 15, 2022

Implementing the Lasso regression algorithm is not as trivial as it might seem. In this post I investigate the exact algorithm implemented in Scikit-learn, as well as its later improvements.

How does DeepMind AlphaFold2 work?

December 25, 2021

I believe that DeepMind AlphaFold2 and GitHub Copilot were among the most significant technological advances of 2021. Two years after their initial breakthrough, DeepMind released the second version of their revolutionary system for protein 3D structure prediction. This time they basically solved the 3D structure prediction problem, which had stood open for more than 50 years. These are the notes from my detailed talk on the DeepMind AlphaFold2 system.

Overview of consensus algorithms in distributed systems - Paxos, Zab, Raft, PBFT

October 03, 2021

The field of consensus in distributed systems emerged in the late 1970s and early 1980s. An understanding of consensus algorithms is required for working with fault-tolerant systems, such as blockchains, various cloud and container environments, distributed file systems and message queues. To me, consensus algorithms feel like a rather pseudo-scientific and needlessly overcomplicated area of computer science research. There is definitely more fuss about consensus algorithms than there should be, and many explanations lack the motivation part. In this post I will consider some of the most popular consensus algorithms of the 2020s.

A popular-science article on distributed computing systems, blockchain and cryptography in the context of elections

September 25, 2021

In the wake of the recent elections, many people without a technical background became interested in blockchain, cryptography and related topics. I was asked to write a popular article on the subject.

How to configure a private OpenVPN server (+ client)

September 18, 2021

Late September is the time of parliamentary elections in Russia. Two weeks before the elections, the Russian government banned several major VPN providers to prevent users from accessing banned opposition websites, through which dissidents coordinate countermeasures against the divide-and-conquer tactics employed by the regime. In this post I'll explain how to set up a basic OpenVPN server and configure a client to overcome this obstacle.

Data structures for efficient NGS read mapping - suffix tree, suffix array, BWT, FM-index

June 10, 2021

In Next-Generation Sequencing bioinformatics there is the problem of mapping so-called reads (short sequences of ~100 nucleotides) onto the full string that contains them, the reference genome. There are a number of clever optimizations to this process, which I consider in this post.

Why do Huffman trees require a bottom-up walk to be optimal?

June 08, 2021

Why wouldn't a top-down greedy algorithm work for Huffman trees?

A case study of 20PiB Ceph cluster with 100GB/s throughput

March 15, 2021

Recently we deployed a Ceph cluster that might be one of the more powerful in Russia in terms of both throughput and storage capacity. I'd like to discuss the nuts and bolts of that system in this post.

Blog version 4

July 13, 2019

I just released a new version of my personal blog http://borisburkov.net, this time powered by Gatsby.js.

Asyncio ecosystem

March 29, 2019

I have had a very bad developer experience with Asyncio. It is such a messy and overcomplicated system that I have had to study it all over again at least 3 times now. I figured it's time to cut my losses and write a post about it!

DeepMind - AlphaFold presentation at EBI

February 07, 2019

Two months ago the news spread around the world that DeepMind had won CASP, the well-known protein 3D structure prediction competition, beating all the bioinformaticians by an impressive margin. Many people in the biotech world are now trying to make sense of 'what just happened'. Revolution or evolution, science or engineering, talent or funding? As fate would have it, I once found myself quite close to this field of science, so I spent a few days digging into the details; meanwhile, Andrew Senior, the project's lead engineer from DeepMind, came to EBI to build bridges.

Amazon Alexa

February 07, 2019

I listened to two guys from Amazon's Cambridge office who work on Alexa, and formed a general impression of what it is like to work at Amazon.

Focal loss and Average Precision

November 12, 2018

A simple loss function for multiclass classification that beautifully deals with class imbalance.

A career in the data empire - a lecture by a data engineer from Facebook

October 23, 2018

This spring, at an ML/AI conference at Microsoft Research, I briefly discussed how to build a data scientist's career in IT companies with Zoubin Ghahramani, a professor at Cambridge's top-tier engineering department and director of the artificial intelligence labs at Uber. Zoubin explained back then that your academic credentials usually determine the position you are hired for and your role in the company. And this Sunday I got confirmation of his words from Marek Romanowicz, a data engineer at Facebook in New York.

Postgres roles

October 09, 2018

The Postgres authentication and permission system sometimes feels like a total mess to me. This is a recap of how it works.

Docker users and user namespaces

October 09, 2018

Whenever I take a break from DevOps for a few months and switch to other fields, I forget the details of how users within a Docker container map to users on the host machine. This is a condensed recap of user mappings that should save me time when switching contexts.

OpenStack, Kubernetes and OpenShift crash course for the impatient - Kubernetes

January 20, 2018

Kubernetes is a system for orchestrating containerized applications that can be used to deploy your microservice-based websites to the cloud. Kubernetes was created by Google, based on their internal orchestration system Borg (although the codebase was rewritten completely from scratch). Kubernetes is written mostly in the Go programming language and is open-source.

OpenStack, Kubernetes and OpenShift crash course for the impatient - OpenStack

January 19, 2018

OpenStack is a pretty old standard for describing cloud resources and interacting with them. Most of its APIs were suggested around 2012. It is "Open" because multiple vendors that provide cloud services (including Rackspace and Red Hat) agreed to use the same API for interaction with them and called it OpenStack.

OpenStack, Kubernetes and OpenShift crash course for the impatient - introduction

January 18, 2018

Much like the junkie from a Russian joke, who started shouting "Jiggers, cops!" when they brought him to the police station, EBI in 2018 suddenly discovered the existence of cloud technologies.

Traction

December 17, 2017

MOST STARTUPS DON'T FAIL BECAUSE THEY CAN'T BUILD THE PRODUCT. MOST STARTUPS FAIL BECAUSE THEY CAN'T GET TRACTION.

BurkovBA.github.io is online!

December 14, 2017

I've been procrastinating over my blog for almost a year. Initially I wrote it in Angular in early 2017, and I re-wrote everything in React in the last couple of weeks. At last, following GitHub's "ship early - ship often" motto, I shipped it today. Probably the most challenging aspect of the whole work was making GitHub Pages play nicely with a React SPA - I'll tell you how in this post.

Enigma, part 5 - the "Bismarck" and the "debutante"

November 30, 2017

During the "Battle of the Atlantic" in 1941, the German navy tried to cut Great Britain off from sea communication with the continent and the States. The Germans had naval superiority, and for a while they even managed to establish a naval blockade around the islands.

Enigma, part 1 - What is "Enigma"?

November 01, 2017

What exactly is this famous "Enigma" that everyone was so eager to break, and what was it for?

Enigma, part 0 - Britain in World War II

October 25, 2017

Before getting to the actual subject of the story, cryptography and Bletchley Park, I wanted to say a few words about Britain's part in the war, to give some context.

Enigma. Announcement

October 21, 2017

Has everyone seen "The Imitation Game"? Cumberbatch is wonderful, of course, and in real life, of course, everything was different. This post is about the mathematicians and engineers of GC&CS (Government Code and Cypher School), led by Alan Turing, who found vulnerabilities in the German cipher machines "Enigma" and "Lorenz" during World War II and thereby saved tens or even hundreds of thousands of their compatriots.

Facebook license

September 25, 2017

A few days ago Facebook changed the licenses of several of its most popular open-source libraries, React, Flow, Jest and Immutable.js, to the standard MIT license.