Dec 12, 2025
6 min read

The Anatomy of Scalable Systems: Architecture, Access Patterns, and Resilience

An in-depth analysis of the differences between Software Architecture and System Design, focusing on data flow, read/write patterns, and strategies for building resilient software beyond just choosing tools.

There is a common confusion in our industry around the terms “Software Architecture” and “System Design.” They are often used interchangeably, but they are not the same thing.

Software Architecture is the high-level strategy. It is deciding the foundations and the rules of the game: will it be a monolith or microservices? How will the pieces communicate? These are the decisions that are extremely expensive to change once made. System Design, on the other hand, is the tactics and logistics: the art of defining the components, interfaces, and specific technologies (databases, caches, load balancers) that satisfy concrete scalability and reliability requirements. Architecture is the city plan; system design is deciding where the pipes go and what they are made of.

Many developers believe that designing a system is choosing tools: “We will use Kubernetes, MongoDB, and Kafka.” But when you see a real senior engineer working, you notice something curious: tools are the last thing they mention. Before drawing a single box on the whiteboard, they stop at a fundamental, almost philosophical question: How do users interact with the data?

The question that defines the system’s destiny

This question is what reveals the “Access Pattern.” Understanding the answer is what differentiates a system that scales indefinitely from one that collapses. Kleppmann and other experts in distributed systems agree that scalability is not magic, but the management of trade-offs. Fundamentally, we face a fork in the road: systems limited by reading (Read-Heavy) or systems limited by writing (Write-Heavy).

The challenge of massive reading (Fan-Out)

Let’s think about a content-distribution architecture like the Instagram feed. The challenge is Fan-Out: delivering the same data to millions of users. The standard solution involves CDNs and distributed caches. However, a naive cache implementation is dangerous.

In a well-known 2013 technical paper, “Scaling Memcache at Facebook,” the company’s engineers described a critical phenomenon: the “Thundering Herd.” It occurs when a highly requested key in the cache expires; in that instant, thousands of requests pass through the empty cache and hit the database simultaneously, potentially causing a cascading failure. To mitigate this, Meta and other giants implement “Leasing” strategies or add “Jitter” (random variation) to expiration times, ensuring that not all data expires at once.
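The jitter idea is simple enough to sketch in a few lines. This is an illustrative helper (the names `ttl_with_jitter` and `BASE_TTL_SECONDS` are my own, not from any specific cache library): instead of caching every hot key with the same fixed TTL, spread the expirations randomly around the base value so they cannot all expire in the same instant.

```python
import random

BASE_TTL_SECONDS = 300  # nominal cache lifetime for a hot key

def ttl_with_jitter(base=BASE_TTL_SECONDS, spread=0.1):
    """Return a TTL randomized within +/- spread of the base value,
    so keys cached at the same moment do not all expire together."""
    return base * (1 + random.uniform(-spread, spread))

# Each key cached "at the same time" now expires at a slightly
# different moment, e.g. somewhere between 270 and 330 seconds.
ttl = ttl_with_jitter()
```

Leasing is the heavier-weight complement: instead of spreading expirations, the cache hands a short-lived token to exactly one client, which alone is allowed to recompute the missing value while everyone else waits or serves stale data.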

Furthermore, scaling reads forces us to confront the CAP Theorem, formulated by Eric Brewer in the year 2000. This theorem postulates that, in the presence of a network partition (connection failure), we must choose between Consistency (C) and Availability (A). When using read replicas, most modern systems opt for “eventual consistency,” prioritizing that the system responds (Availability) even if the data is a few milliseconds old.

The writing funnel (Fan-In)

The opposite scenario is equally fascinating and requires totally different tools. Let’s imagine IoT sensors sending telemetry every second, or a real-time voting platform. Here we have a Fan-In problem: many writers trying to squeeze through a narrow door. The database becomes the bottleneck.

In these cases, attempting to process everything synchronously is a mistake. The robust solution is decoupling via message queues (like Kafka or RabbitMQ). We change the promise to the user: we don’t say “done,” we say “received, I’ll do it soon.” This flattens the load curve and protects the database.
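The shape of that decoupling can be sketched with Python’s standard library standing in for the broker (a real system would use Kafka or RabbitMQ; the names `handle_request` and `worker` are illustrative). The API path only enqueues and acknowledges; a background consumer drains the queue at the database’s own pace.

```python
import queue
import threading

events = queue.Queue()  # stand-in for Kafka / RabbitMQ
processed = []          # stand-in for rows written to the database

def handle_request(payload):
    """Synchronous API path: enqueue and acknowledge immediately.
    The promise to the user is 'received', not 'done'."""
    events.put(payload)
    return {"status": "accepted"}

def worker():
    """Background consumer: drains the queue at its own pace,
    flattening the write spike before it reaches the database."""
    while True:
        item = events.get()
        if item is None:        # sentinel used here to stop the demo
            break
        processed.append(item)  # the actual (slow) DB write would go here

t = threading.Thread(target=worker)
t.start()

# A burst of writes arrives; every caller gets an instant acknowledgment.
for i in range(5):
    handle_request({"sensor_reading": i})

events.put(None)  # shut down the demo worker
t.join()
```

The load curve the database sees is now the worker’s steady drain rate, not the burst rate of incoming requests.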

However, queues bring their own poison if not handled with care. A critical concept that is often ignored is Idempotency. Networks fail and sometimes a queue delivers the same message twice. If that message is “charge €50,” you have a serious problem. Designing “idempotent” systems means your system must be capable of processing the same message ten times and only executing the action once. The robustness of a write system lies more in its ability to handle duplicates than in its raw speed.
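A minimal sketch of the idempotency idea, using an idempotency key to deduplicate redeliveries (the names and the in-memory set are illustrative; in production the seen-keys store would be a database table or Redis, and the check-and-apply step must be atomic, which this toy version is not):

```python
processed_keys = set()      # in production: a DB table or Redis set
balance = {"acct": 100}     # toy account balance, in euros

def apply_charge(message):
    """Apply a charge exactly once, keyed by an idempotency key.
    Redelivered duplicates are recognized and ignored."""
    key = message["idempotency_key"]
    if key in processed_keys:
        return "duplicate-ignored"
    processed_keys.add(key)
    balance["acct"] -= message["amount"]
    return "charged"

msg = {"idempotency_key": "tx-42", "amount": 50}
apply_charge(msg)   # the real delivery: balance drops by 50
apply_charge(msg)   # the queue redelivers it: no double charge
```

The key design point is that the deduplication key travels with the message from the moment it is produced, so any consumer, on any retry, can recognize work it has already done.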

And when even queues are not enough, we reach the final frontier: Sharding or fragmentation. This implies breaking the database into pieces (users A-M on one server, N-Z on another). It sounds great for scaling, but the operational complexity it adds is brutal. Suddenly, making a query that crosses data from both servers becomes a logistical nightmare. That is why the recurring advice from experts is to postpone sharding until it is absolutely inevitable.
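The A-M / N-Z split above is range-based sharding, which can be sketched as a simple routing function (the names are illustrative, and the lists stand in for separate database servers). Note that range-based schemes like this suffer from hotspots when names cluster; hash-based sharding is the common alternative, at the cost of losing range queries.

```python
SHARDS = {
    "shard_a_m": [],  # stand-in for the server holding users A-M
    "shard_n_z": [],  # stand-in for the server holding users N-Z
}

def shard_for(username):
    """Route a user to a shard by the first letter of the name."""
    return "shard_a_m" if username[0].lower() <= "m" else "shard_n_z"

def save_user(username):
    SHARDS[shard_for(username)].append(username)

for name in ["alice", "nina", "bob", "zoe"]:
    save_user(name)

# The "logistical nightmare": any query spanning both letter ranges
# (e.g. "all users who signed up today") must now scatter to every
# shard and gather the partial results in application code.
all_users = SHARDS["shard_a_m"] + SHARDS["shard_n_z"]
```

That final scatter-gather line is exactly the cost the experts warn about: what used to be one SQL query is now a distributed operation you own.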

Resilience and Observability: Operating in chaos

Finally, there is an aspect that separates academic design from the real world: the assumption of failure. A junior engineer designs for the “happy path”; a senior designs for catastrophe.

Here enter patterns like the Circuit Breaker. If a microservice starts failing or running very slowly, continuing to send it requests is counterproductive. The Circuit Breaker detects the failure and “cuts the power,” returning immediate errors without attempting to connect. It is a technique of systemic empathy: letting the fallen component breathe so it has a chance to recover, instead of drowning it in retries.
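A bare-bones version of the pattern fits in one small class (this is an illustrative sketch, not a production library; tools like resilience4j or Polly add half-open probe limits, metrics, and thread safety). The breaker counts consecutive failures, opens after a threshold, fails fast while open, and allows a probe call after a cooldown.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after N consecutive failures,
    fails fast while open, and allows one probe after a cooldown."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after   # seconds the circuit stays open
        self.failures = 0
        self.opened_at = None            # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                # Fail fast: do not even touch the struggling service.
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None        # half-open: let one probe through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0                # success resets the count
        return result
```

The “systemic empathy” lives in the fail-fast branch: while the circuit is open, callers get an instant error and the downstream service receives zero traffic, which is exactly the breathing room it needs.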

But to apply all this, you need eyes. Error logs are not enough. Modern systems depend on Observability and “Distributed Tracing.” You need to be able to see the X-ray of a request as it travels through the load balancer, crosses three microservices, writes to Kafka, and ends in the database. Without that visibility, optimizing or fixing a distributed system is like operating blindly.
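The core mechanism behind distributed tracing is small: generate one trace id at the edge and propagate it through every hop, so log lines from different services can be correlated afterwards. A toy sketch (function names are my own; real systems use standards like W3C Trace Context and tools like OpenTelemetry, which also record spans, timings, and parent/child relationships):

```python
import uuid

def new_trace_id():
    """Minted once, at the edge of the system (e.g. the load balancer)."""
    return uuid.uuid4().hex

def log_span(trace_id, service, event):
    """Every service tags its log lines with the same trace id,
    so one request can be followed end to end across machines."""
    return f"trace={trace_id} service={service} event={event}"

tid = new_trace_id()
journey = [
    log_span(tid, "load-balancer", "forwarded"),
    log_span(tid, "orders-service", "validated"),
    log_span(tid, "kafka-producer", "enqueued"),
    log_span(tid, "billing-service", "db-write"),
]
```

Grepping the aggregated logs for that single `trace=` value reconstructs the request’s X-ray across all four hops.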

Conclusion

Designing systems and defining an architecture is not about memorizing names of trendy technologies. It is about managing trade-offs. It is understanding that if you optimize for reading, you might complicate writing. It is accepting that total consistency is sometimes the enemy of user experience. The next time you face a design, do not start by installing libraries. Start by drawing the data flow and asking yourself: “Who reads, who writes, and what happens when all this fails?” That is the true essence of software engineering.

References and Recommended Reading:

  1. Kleppmann, M. (2017). Designing Data-Intensive Applications. O’Reilly Media.
  2. Nygard, M. (2018). Release It!: Design and Deploy Production-Ready Software. Pragmatic Bookshelf.
  3. BettaTech (2025)
  4. Nishtala, R., et al. (2013). “Scaling Memcache at Facebook”. USENIX NSDI.
  5. Brewer, E. (2000). “Towards Robust Distributed Systems”. PODC Keynote.
  6. Hohpe, G., & Woolf, B. (2003). Enterprise Integration Patterns. Addison-Wesley.
  7. Amazon Builders’ Library. “Making retries safe with idempotent APIs”.