TechLunch #21: Distributed systems (30/05/2018)

Warning: these notes are published raw, without any rewriting.

Talk #1: Distributed architecture of Algolia Search API

Numbers

CAP theorem: you can only have 2 of the 3 guarantees (consistency, availability, partition tolerance).

Search is eventually consistent: replicas converge, but results can differ between replicas at a specific moment.

A cluster = 3 servers

Read: replicas. Every machine can answer. Local operation only.

Write: distributed consensus

Redirected via DNS
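To make the read/write split concrete, here is a minimal sketch, assuming a write is accepted only when a majority of the 3 servers is reachable, while a read is served from whichever replica receives it. It is a deliberate simplification, not a real consensus protocol and not Algolia's implementation; `Replica`, `write` and the keys are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One server of the cluster (hypothetical model)."""
    name: str
    up: bool = True
    index: dict = field(default_factory=dict)

    def read(self, key):
        # Reads are purely local: no coordination with the other replicas.
        return self.index.get(key)


def write(cluster, key, value):
    """Apply the write only if a majority of replicas is reachable."""
    reachable = [replica for replica in cluster if replica.up]
    if len(reachable) <= len(cluster) // 2:
        return False  # no majority: the write is rejected
    for replica in reachable:
        replica.index[key] = value
    return True


cluster = [Replica("a"), Replica("b"), Replica("c", up=False)]
accepted = write(cluster, "doc1", "v1")  # 2 of 3 replicas reachable
```

With one server down the write still goes through, and the replica that missed it answers reads with stale data until it catches up, which is the "different results at a specific moment" situation noted above.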

If one machine is not enough to hold all the data, move to a multi-cluster structure.

Sharding data by key (userId). Mapping between userId and cluster. Ability to move a user to another cluster without resharding.
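A minimal sketch of the idea, assuming the mapping is an explicit lookup table rather than a hash of the key: repointing one entry moves that user and leaves every other assignment untouched. `ClusterDirectory` and the cluster names are hypothetical, not Algolia's API.

```python
class ClusterDirectory:
    """Explicit userId -> cluster mapping (hypothetical helper)."""

    def __init__(self):
        self._assignment = {}

    def assign(self, user_id, cluster):
        self._assignment[user_id] = cluster

    def cluster_for(self, user_id):
        # Every request for this user is routed to the returned cluster.
        return self._assignment[user_id]

    def move(self, user_id, new_cluster):
        # Only this entry changes; no other key is rehashed or relocated.
        if user_id not in self._assignment:
            raise KeyError(user_id)
        self._assignment[user_id] = new_cluster


directory = ClusterDirectory()
directory.assign("user-1", "cluster-a")
directory.assign("user-2", "cluster-a")
directory.move("user-1", "cluster-b")
```

With hash-based placement, adding a cluster changes the target of many keys at once; the lookup table trades that for one extra indirection per request.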

Sound system. Different sounds in different places. Or pair 2 to make a stereo. Or play the same sound in all the house.

Micro-services. Sources and Renderer. In-host.

Lighter processes. Process isolation. No SPOF. Flexible design.

RPC methods + notifications + properties propagation

Service discovery. Very dynamic topology. Unreliable home network. Custom UDP broadcast protocol. Local registrar.
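The notes do not describe the protocol itself. As an illustration only, the sketch below assumes that services announce themselves periodically and that the local registrar forgets any service it has not heard from within a time-to-live. The UDP transport is left out, and `LocalRegistrar`, the service names and the addresses are hypothetical. Time is passed in explicitly so that a local monotonic counter can be used, since there is no reliable wall clock.

```python
class LocalRegistrar:
    """In-memory registry whose entries expire when announcements stop."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._last_seen = {}  # (service, address) -> time of last announcement

    def announce(self, service, address, now):
        # Called whenever an announcement for this service is received.
        self._last_seen[(service, address)] = now

    def lookup(self, service, now):
        # Only return addresses heard from within the last `ttl` time units.
        return sorted(
            address
            for (name, address), seen in self._last_seen.items()
            if name == service and now - seen <= self.ttl
        )


registrar = LocalRegistrar(ttl=3)
registrar.announce("renderer", "192.168.1.10", now=0)
registrar.announce("renderer", "192.168.1.11", now=2)
```

A device that disappears from the network simply stops announcing and drops out of `lookup` results after `ttl`, which suits a topology that changes often.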

Aggregation. Adjust the behavior to the real world situation. System state is transient. Except system topology.

Flash storage on each device. No reliable clock. No guarantee on the topology. No backend. No Internet.

They added a storage box to store the topology configuration, etc. PLC coordinator. Ethernet access.

C/C++

Ancient times. F5 load balancer in front of servers.

Migration to HAProxy.

Optimisation: link aggregation on the hardware side, compression on the software side. HAProxy became a SPOF.

Solution: client side load balancing.

Test 1: random subset round-robin. Big differences in load between servers.
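A small simulation can show why this happens, assuming each client picks its own random subset of servers once and then round-robins inside it. The function name, parameters and sizes below are hypothetical and not taken from the talk.

```python
import random
from collections import Counter


def simulate_random_subset(num_clients, num_servers, subset_size,
                           requests_per_client, seed=0):
    """Return the number of requests received by each server."""
    rng = random.Random(seed)
    load = Counter({server: 0 for server in range(num_servers)})
    for _ in range(num_clients):
        # Each client keeps a fixed random subset of the servers...
        subset = rng.sample(range(num_servers), subset_size)
        # ...and cycles through that subset for all its requests.
        for i in range(requests_per_client):
            load[subset[i % subset_size]] += 1
    return load


load = simulate_random_subset(num_clients=50, num_servers=20,
                              subset_size=3, requests_per_client=300)
spread = max(load.values()) - min(load.values())
```

Because subsets are drawn independently, some servers end up in many subsets and others in few, so `spread` grows with the number of requests per client.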

Test 2: windowed round-robin. Good but not perfect.

Test 3: full mesh round-robin. Problem with RHP cache.

Different server generations -> align weights with server capacities.
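One way to align weights with capacities is a smooth weighted round-robin, sketched below. This is an assumed technique chosen for illustration, not necessarily what the speaker deployed; the server names and weights are hypothetical.

```python
def smooth_weighted_round_robin(weights, n):
    """Return n picks where each server is chosen in proportion to its weight."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every server gains its weight on each round...
        for server, weight in weights.items():
            current[server] += weight
        # ...the server with the highest score is picked and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Assumption: the newer generation handles three times the load of the older one.
picks = smooth_weighted_round_robin({"old-gen": 1, "new-gen": 3}, 8)
```

Over any full cycle of `sum(weights)` picks each server is selected exactly `weight` times, and the picks for a heavy server are interleaved rather than sent in a burst.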

Migration to Consul to avoid multiple HTTP calls.