System Design Series: Sharding vs Replication
As someone who is studying system design, I’m constantly looking for ways to test and improve my knowledge. That’s why I’ve decided to write a series of articles on topics related to system design, such as database scalability and availability, as well as other related topics like load balancing, caching, and more. In doing so, I hope to not only solidify my own understanding of these concepts, but also provide helpful insights for others who are interested in system design.
I welcome any suggestions for future topics or critiques of my articles. My goal is to create content that is informative, accurate, and accessible to anyone who wants to learn more about system design. So please don’t hesitate to leave feedback or ask questions in the comments section.

Replication and sharding are two widely used techniques for handling the scalability and availability of large-scale databases. Both techniques involve distributing data across multiple servers, but there are significant differences in how they work and in which cases they are more appropriate.
Replication copies the same data from one database server (the primary) to one or more other servers (the replicas). Each replica is kept synchronized with the primary through a replication mechanism such as MySQL’s binary log (binlog) or PostgreSQL’s streaming replication. Replication improves availability because the application can continue functioning even if the primary server fails, typically by promoting a replica to take its place. It can also reduce read latency and improve read throughput, because read queries can be distributed among the replicas, which reduces the load on the primary server. However, every replica stores a full copy of the data, so replication requires additional hardware resources and does not help once the dataset or the write load outgrows a single server.
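To make the read/write split concrete, here is a minimal in-memory sketch. It is not how MySQL or PostgreSQL implement replication: the `Node` and `ReplicatedStore` classes are hypothetical names invented for this example, and the copy to the replicas is done synchronously for simplicity, whereas real systems usually ship a change log and may lag behind the primary.

```python
import itertools


class Node:
    """A toy database server holding its data in a dict (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Sends writes to the primary and spreads reads across the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin iterator over the replicas.
        self._next_replica = itertools.cycle(self.replicas)

    def write(self, key, value):
        # Every write goes to the primary first...
        self.primary.data[key] = value
        # ...and is then copied to each live replica (assumption: synchronous
        # copy; a replica that is down misses the write and would be stale).
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value

    def read(self, key):
        # Try each replica at most once, skipping the ones that are down.
        for _ in range(len(self.replicas)):
            replica = next(self._next_replica)
            if replica.alive:
                return replica.name, replica.data.get(key)
        # No replica is available, so fall back to the primary.
        return self.primary.name, self.primary.data.get(key)


store = ReplicatedStore(Node("primary"), [Node("replica-1"), Node("replica-2")])
store.write("user:1", "Ada")
served_by, value = store.read("user:1")
```

Notice that each node ends up holding every key. That is why this pattern scales reads but not storage.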
On the other hand, sharding involves dividing data into multiple partitions and distributing those partitions across different database servers. Each server is responsible for a specific set of partitions, and when a query is made, it is sent only to the servers that have the necessary data. This allows large volumes of data to be stored and queried efficiently, as each server handles only a portion of the data. Sharding can also improve availability, as the failure of a database server will affect only a portion of the data, and redundancy can be added using replicas for each partition. However, sharding can increase latency, as it may be necessary to access multiple servers for a single query, and partitioning may need to be reorganized as data grows or changes significantly.
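A small sketch can illustrate how a query finds its shard. This one assumes hash-based partitioning, which is only one of several strategies (range-based and directory-based partitioning are common alternatives). The `ShardedStore` class and its method names are hypothetical and exist only for this example.

```python
import hashlib


class ShardedStore:
    """Toy hash-sharded key-value store; each shard is a plain dict."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        self.shards = {name: {} for name in self.shard_names}

    def shard_for(self, key):
        # Use a stable hash: Python's built-in hash() for strings changes
        # between processes, which would send keys to different shards.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]

    def put(self, key, value):
        # A write touches exactly one shard.
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        # A lookup by key also touches exactly one shard.
        return self.shards[self.shard_for(key)].get(key)

    def scan(self, predicate):
        # A query that cannot be answered from the key alone must visit
        # every shard (scatter-gather), which is where extra latency appears.
        results = []
        for name in self.shard_names:
            for key, value in self.shards[name].items():
                if predicate(key, value):
                    results.append((key, value))
        return results


store = ShardedStore(["shard-a", "shard-b", "shard-c"])
store.put("user:1", {"name": "Ada", "country": "UK"})
user = store.get("user:1")
uk_users = store.scan(lambda key, value: value["country"] == "UK")
```

With simple modulo hashing like this, changing the number of shards changes where most keys live, which is one reason partitioning may need to be reorganized as data grows. Techniques such as consistent hashing exist to reduce that movement.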
To further improve availability and performance, it is common to use a reverse proxy to distribute queries across multiple database servers or database replicas. The reverse proxy acts as an intermediary between the application and the database servers, routing queries to the appropriate servers and maintaining an active connection with each server so that it can quickly detect failures or disconnections.
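The failure-detection part of a proxy can be sketched as follows. This is a simplified illustration, not the behavior of any particular proxy product: the `Backend` and `Proxy` classes, the `probe` callable, and the `max_failures` threshold are all assumptions made up for the example. A real proxy would probe over the network on a timer.

```python
class Backend:
    """A database server as seen by the proxy (illustrative only)."""

    def __init__(self, name, probe):
        self.name = name
        self.probe = probe  # hypothetical callable: True if the server answers
        self.failures = 0   # consecutive failed probes
        self.healthy = True


class Proxy:
    """Routes queries round-robin to the backends currently marked healthy."""

    def __init__(self, backends, max_failures=2):
        self.backends = list(backends)
        self.max_failures = max_failures
        self._counter = 0

    def check_health(self):
        # Probe every backend once. A backend is marked down only after
        # max_failures consecutive failed probes, and is marked up again
        # as soon as a probe succeeds.
        for backend in self.backends:
            if backend.probe():
                backend.failures = 0
                backend.healthy = True
            else:
                backend.failures += 1
                if backend.failures >= self.max_failures:
                    backend.healthy = False

    def route(self):
        # Return the name of the backend that should receive the next query.
        healthy = [b for b in self.backends if b.healthy]
        if not healthy:
            raise RuntimeError("no healthy database backend available")
        backend = healthy[self._counter % len(healthy)]
        self._counter += 1
        return backend.name


# Simulated server status, standing in for real network probes.
status = {"db-1": True, "db-2": True}
proxy = Proxy([
    Backend("db-1", lambda: status["db-1"]),
    Backend("db-2", lambda: status["db-2"]),
])
status["db-2"] = False
proxy.check_health()
proxy.check_health()
target = proxy.route()
```

Requiring more than one failed probe before marking a server as down is a design choice that avoids removing a server because of a single slow response, at the cost of detecting real failures slightly later.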
In summary, replication is better suited to improving availability and read performance, while sharding is better suited to handling volumes of data that a single server cannot store or query efficiently. Both techniques can be used together to provide a complete solution for the scalability and availability of large-scale databases, and a reverse proxy in front of the database servers can further improve the availability and performance of the system.