System Design Interview — Study Notes VI — Designing Data Intensive Applications Book Notes
Notes on Concepts & Components to be used in a System Design Interview
Part II — Distributed Data
Scalability: if data volume, read load, or write load grows bigger than a single machine can handle, spread the load across multiple machines.
Fault tolerance / high availability: the system keeps working even if one machine (or several machines, the network, or an entire datacenter) goes down.
Latency: each user can be served from a datacenter that is geographically close to them.
Scaling to Higher Load
Vertical scaling / scaling up: buy a more powerful machine. Many CPUs, many RAM chips, and many disks can be joined under one operating system (a shared-memory architecture). Cost grows faster than linearly, and a machine twice the size cannot necessarily handle twice the load, partly because of nonuniform memory access (NUMA): some banks of memory are closer to one CPU than to others. A single large machine is also limited to one geographic location.
Shared-disk architecture: several machines with independent CPUs and RAM store data on a shared array of disks, connected via a fast network (network attached storage, NAS, or storage area network, SAN).
Horizontal scaling / scaling out (shared-nothing architecture): each machine or virtual machine running the database software is called a node; each node uses its own CPUs, RAM, and disks independently. Any coordination between nodes is done at the software level, using a conventional network. Data can be distributed across multiple geographic regions, so the system can survive the loss of an entire datacenter; shared-nothing also works well with cloud deployments of virtual machines.
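A minimal sketch of "coordination at the software level" (an assumed setup, not any specific database's API): a client-side routing function hashes the key and picks which node in a shared-nothing cluster should handle the request. The node addresses and key names are hypothetical.

```python
import hashlib

# Hypothetical shared-nothing cluster: three independent nodes.
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def node_for(key: str) -> str:
    """Route a key to its owning node via naive hash-mod routing."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]

# The same key always routes to the same node.
print(node_for("user:42"))
```

Note that plain hash-mod routing reassigns most keys whenever a node is added or removed; real systems mitigate this with schemes such as consistent hashing or fixed partition maps.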
In some cases, a simple single-threaded program can perform significantly better than a cluster with over 100 CPU cores.
Replication Versus Partitioning
Two common ways data is distributed across multiple nodes:
Replication: Keeping a copy of the same data on several different nodes, in different locations.
Partitioning: Splitting a big database into smaller subsets called partitions, so that different partitions can be assigned to different nodes — also known as sharding.
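The two mechanisms above are usually combined: copies of each partition are stored on multiple nodes. A sketch under assumed parameters (the node names, partition count, and replication factor are made up; the consecutive-node placement is one simple strategy, not DDIA's prescribed one):

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]
NUM_PARTITIONS = 8        # keyspace is split into 8 partitions
REPLICATION_FACTOR = 3    # each partition is copied to 3 nodes

def partition_of(key: str) -> int:
    """Map a key to a partition using a stable hash of the key."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

def replicas_for(partition: int) -> list[str]:
    """Place each partition on REPLICATION_FACTOR consecutive nodes."""
    start = partition % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]

key = "user:42"
p = partition_of(key)
print(f"{key!r} -> partition {p} -> replicas {replicas_for(p)}")
```

Partitioning spreads load (each node holds only a subset of the data), while replication provides fault tolerance (the partition survives as long as one of its replicas does).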