TechLunch #21: Distributed systems (30/05/2018)


Warning: theses notes are published raw, without any rewriting.
Attention: ces notes sont publiées telles quelles, sans retraitement particulier.

Talk #1: Distributed architecture of Algolia Search API


  • 160B total api call / month
  • 40B search api call / month

Cap theorem: you can only have 2

  • Consistency
  • availability
  • partition tolerance

The search is consistent in the end, but the results can be different at a specific moment.

Architecture : a mix of multi+mono tenant

A cluster = 3 servers


Read: replicas. Every machine can answer. Local operation only.

Write: distributed consensus


Redirected via DNS


If a machine is not enough to contain all the datas, go to a multi clusters structure.

Sharing datas with a key (userId). Mapping between userId and cluster. Ability to move a cluster without resharding.


Consensus algorithm:

Talk #2: Devialet phantom, a multi-room distributed system

Sound system. Different sounds in different places. Or pair 2 to make a stereo. Or play the same sound in all the house.

Micro-services. Sources and Renderer. In-host.

Lighter processes. Process isolation. No SPOF. Flexible design.

RPC methods + notifications + properties propagation

Service discovery. Very dynamic topology. Unreliable home network. Custom UDP broadcast protocol. Local registrar.

Aggregation. Adjust the behavior to the real world situation. System state is transient. Except system topology.

Flash storage on each device. No reliable clock. No guarantee on the topology. No backend. No Internet.

They added a storage box, to store configuration of topology, etc. Plc coordinator. Ethernet access.


Key learnings

  • Design for failure
  • Use standards as much as possible: Bonjour for host discovery
  • IPv6 is great
  • Multiplexing TCP connexions
  • UDP for time critical operations
  • timeout management is hard

Talk #3: Client side load balancing

Ancient times. F5 load balancer in front of servers.

Migration to HAProxy.

Optimisation: link aggregation on hardware side, compression on software side. HAProxy became SPOF.

Solution: client side load balancing.

Test 1: random subset round-Robin. Big difference between servers load.

Test 2: windowed round-robin. Good but not perfect.

Test 3: full mesh round-robin. Problem with RHP cache.

Different server generation -> align weight to server capacities.

Health checks

Migration to Consul to avoid multiple http call