Microservices Interview Questions
Microservices come up constantly in backend, cloud, and system-design interviews, where interviewers probe trade-offs, communication patterns, and failure handling. These are the questions they actually ask, with concise answers you can speak confidently.
17 questions with concise, interview-ready answers.
1. What are microservices, and how do they differ from a monolith?
Microservices are an architectural style where an application is built as a suite of small, independently deployable services, each owning a single business capability and communicating over the network. A monolith packages all functionality into one deployable unit, typically with a single codebase and a shared database. Microservices allow independent deployment, scaling, and technology choices per service, but they add network calls, operational complexity, and distributed-systems challenges that a monolith avoids.
2. What are the main benefits and drawbacks of microservices?
The benefits are independent deployability, the ability to scale individual services, fault isolation, smaller focused teams, and freedom to pick the right technology per service. The drawbacks are significant: distributed-systems complexity, network latency and partial failures, harder testing and debugging, eventual consistency across data stores, and heavier operational and monitoring overhead. They pay off for large, evolving systems but are usually overkill for small applications.
3. When should you NOT use microservices?
Avoid microservices for small applications, early-stage products, or small teams where the operational overhead outweighs the benefits. A common recommendation is to start with a well-structured monolith and split out services only when you hit real scaling, team-coordination, or deployment-independence pressures. Premature decomposition forces you to pay the distributed-systems tax — network failures, data consistency, and complex deployments — before you get any return.
4. What is an API gateway, and why use one?
An API gateway is a single entry point that sits in front of the microservices and routes incoming client requests to the appropriate service. It centralizes cross-cutting concerns such as authentication, rate limiting, TLS termination, request routing, and sometimes response aggregation. This shields clients from the internal service topology and avoids each service having to reimplement these concerns, though it can become a bottleneck or single point of failure if not made highly available.
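The routing and authentication roles described above can be sketched in a few lines. This is a minimal in-process illustration, not a production gateway: the ApiGateway class, its token check, and the route prefixes are all hypothetical names chosen for the example, and real gateways also handle TLS, rate limiting, and network forwarding.

```python
from __future__ import annotations

from typing import Callable, Dict, Set, Tuple

# A backend is modelled as a plain function from request path to response body.
Handler = Callable[[str], str]


class ApiGateway:
    """Hypothetical single entry point: authenticate first, then route by path prefix."""

    def __init__(self, valid_tokens: Set[str]) -> None:
        self._valid_tokens = valid_tokens
        self._routes: Dict[str, Handler] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        self._routes[prefix] = handler

    def handle(self, path: str, token: str) -> Tuple[int, str]:
        # Cross-cutting concern handled once, so backends need not repeat it.
        if token not in self._valid_tokens:
            return 401, "unauthorized"
        # Longest matching prefix wins, so "/orders/export" can override "/orders".
        for prefix in sorted(self._routes, key=len, reverse=True):
            if path.startswith(prefix):
                return 200, self._routes[prefix](path)
        return 404, "no route"


gateway = ApiGateway(valid_tokens={"secret-token"})
gateway.register("/orders", lambda path: "orders service handled " + path)
gateway.register("/users", lambda path: "users service handled " + path)
```

Clients only ever call gateway.handle, so services can be moved or split without the client noticing.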
5. What is service discovery, and how does it work?
Service discovery lets services find the network locations of other services dynamically, since instances come and go and their IP addresses change with scaling and restarts. Instances are recorded in a registry, either by registering themselves (as with Consul or Eureka) or by being registered by the platform (as in Kubernetes), and callers query that registry to resolve a service name to a healthy instance. Discovery can be client-side, where the caller queries the registry and picks an instance, or server-side, where a load balancer or proxy does the resolution.
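As a rough sketch of the registry idea, the following in-memory class tracks instances by heartbeat and resolves a name to a live address. The ServiceRegistry name, the TTL-based health rule, and the addresses are assumptions made for illustration; real registries such as Consul or Eureka have their own APIs and health-check mechanisms.

```python
import random
import time


class ServiceRegistry:
    """Hypothetical in-memory registry: an instance is healthy if it sent a heartbeat within the TTL."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the example can be tested without sleeping
        self._instances = {}  # service name -> {address: time of last heartbeat}

    def register(self, name, address):
        self._instances.setdefault(name, {})[address] = self._clock()

    def heartbeat(self, name, address):
        # A heartbeat simply refreshes the registration timestamp.
        self.register(name, address)

    def healthy_instances(self, name):
        now = self._clock()
        entries = self._instances.get(name, {})
        return sorted(addr for addr, seen in entries.items() if now - seen <= self._ttl)

    def resolve(self, name):
        # Client-side discovery: the caller picks one of the healthy instances.
        candidates = self.healthy_instances(name)
        if not candidates:
            raise LookupError("no healthy instance for " + name)
        return random.choice(candidates)
```

An instance that stops sending heartbeats silently drops out of healthy_instances once its TTL expires, which is how callers avoid dead addresses.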
6. What are the ways services communicate with each other?
Services communicate either synchronously or asynchronously. Synchronous communication uses request/response calls such as REST over HTTP or gRPC, where the caller waits for a reply. This is simple, but it couples services in time and propagates failures. Asynchronous communication uses messaging or events through a broker such as Kafka or RabbitMQ: a service publishes a message and continues without waiting, which gives looser coupling and better resilience at the cost of eventual consistency and added complexity.
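To make the asynchronous style concrete, here is a toy in-memory broker. It is only a sketch under the assumption of a single process: InMemoryBroker, its methods, and the "order.placed" topic are hypothetical, and a real broker such as Kafka or RabbitMQ adds durability, delivery guarantees, and network transport.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Hypothetical broker: publish() only enqueues, delivery happens later."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handler callables
        self._queue = deque()                  # pending (topic, message) pairs

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher returns immediately; it never waits on a consumer.
        self._queue.append((topic, message))

    def deliver_pending(self):
        """Deliver every queued message to its subscribers; returns the number of messages delivered."""
        delivered = 0
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(message)
            delivered += 1
        return delivered


broker = InMemoryBroker()
shipping_inbox = []
broker.subscribe("order.placed", shipping_inbox.append)
broker.publish("order.placed", {"order_id": 1})
# At this point the order service has moved on, but shipping has not seen the event yet.
```

The gap between publish and deliver_pending is the window in which the two services temporarily disagree.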
7. What is the difference between REST and gRPC for inter-service calls?
REST typically uses HTTP with JSON, is human-readable, widely supported, and easy to debug, which makes it a good fit for public-facing and browser-friendly APIs. gRPC uses HTTP/2 with Protocol Buffers, a compact binary format defined by a strict schema, which typically gives lower latency and smaller payloads, along with streaming support and generated client code. gRPC is often preferred for high-performance internal service-to-service communication, while REST remains common at the edge.
8. What is the database-per-service pattern?
Database-per-service means each microservice owns its own private database, and no other service reads or writes it directly. Other services must go through the owning service's API. This preserves loose coupling and lets each service choose the data store that best fits its needs. The trade-off is that you lose simple cross-service joins and ACID transactions across services, so you must handle queries and consistency with patterns like API composition, CQRS, and the saga pattern.
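API composition, one of the patterns mentioned above, can be illustrated with two toy services that each keep private data. Everything here is hypothetical (class names, fields, sample records); the point is only that the "join" happens in the composer through each service's public method, never by reading another service's store.

```python
class OrderService:
    """Owns order data; other code may only use get_order()."""

    def __init__(self):
        self._orders = {1: {"order_id": 1, "customer_id": 7, "total_cents": 2500}}

    def get_order(self, order_id):
        return dict(self._orders[order_id])  # return a copy, never the internal record


class CustomerService:
    """Owns customer data; other code may only use get_customer()."""

    def __init__(self):
        self._customers = {7: {"customer_id": 7, "name": "Ada"}}

    def get_customer(self, customer_id):
        return dict(self._customers[customer_id])


def order_summary(order_id, orders, customers):
    """Compose a view from two services instead of a cross-database join."""
    order = orders.get_order(order_id)
    customer = customers.get_customer(order["customer_id"])
    return {
        "order_id": order["order_id"],
        "customer_name": customer["name"],
        "total_cents": order["total_cents"],
    }
```

In a real system each call would be a network request, so the composer also has to cope with one of the two services being slow or unavailable.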
9. What is the saga pattern?
A saga manages a transaction that spans multiple services by breaking it into a sequence of local transactions, each in its own service, where the completion of each step triggers the next through an event or command. If a step fails, the saga runs compensating transactions to undo the steps that already completed, since you cannot do a single distributed rollback. Sagas come in two flavors: choreography, where services react to each other's events, and orchestration, where a central coordinator tells each service what to do.
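The orchestration flavor can be sketched as a coordinator that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. SagaStep, run_saga, and the step names are hypothetical; a real orchestrator would also persist its progress so it can resume after a crash.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # the local transaction in one service
    compensation: Callable[[], None]  # undoes the action if a later step fails


def run_saga(steps: List[SagaStep], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse order."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            log.append("failed:" + step.name)
            for done in reversed(completed):
                done.compensation()
                log.append("compensated:" + done.name)
            return False
        log.append("done:" + step.name)
        completed.append(step)
    return True


def charge_payment():
    # Simulated failure in the second service.
    raise RuntimeError("card declined")


saga_log: List[str] = []
succeeded = run_saga(
    [
        SagaStep("reserve_stock", lambda: None, lambda: None),
        SagaStep("charge_payment", charge_payment, lambda: None),
        SagaStep("schedule_shipping", lambda: None, lambda: None),
    ],
    saga_log,
)
```

Here the payment step fails, so the stock reservation is compensated and shipping is never attempted.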
10. What is the circuit breaker pattern?
A circuit breaker prevents a service from repeatedly calling a downstream dependency that is failing, which would otherwise pile up requests and cause cascading failures. It tracks failures and, once they cross a threshold, "opens" the circuit so calls fail fast or return a fallback instead of waiting on timeouts. After a cooldown it moves to a half-open state and lets a few trial requests through to check if the dependency has recovered before closing again.
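The closed, open, and half-open states described above map onto a small state machine. This sketch makes a few simplifying assumptions: it counts consecutive failures, it allows one trial call at a time in the half-open state rather than a configurable batch, and the CircuitBreaker and CircuitOpenError names are hypothetical rather than taken from any library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker fails fast instead of calling the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_seconds=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the example can be tested without sleeping
        self._failures = 0
        self._opened_at = 0.0
        self.state = "closed"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self._clock() - self._opened_at < self._cooldown:
                raise CircuitOpenError("dependency unavailable, failing fast")
            self.state = "half_open"  # cooldown elapsed: allow a trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record_failure()
            raise
        # Any success resets the consecutive-failure count and closes the circuit.
        self._failures = 0
        self.state = "closed"
        return result

    def _record_failure(self):
        self._failures += 1
        # A failed trial call reopens immediately; otherwise open at the threshold.
        if self.state == "half_open" or self._failures >= self._threshold:
            self.state = "open"
            self._opened_at = self._clock()
```

Callers typically catch CircuitOpenError and return a cached value or a degraded response instead of surfacing the failure.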
11. What is eventual consistency, and why does it matter in microservices?
Eventual consistency means that after an update, the different copies of data across services will become consistent over time rather than instantly. Because each service owns its own database and they communicate asynchronously, you usually cannot guarantee immediate, strong consistency across the whole system. Designs must tolerate temporary disagreement between services, which affects how you handle reads, user expectations, and patterns like sagas and event-driven updates.
12. How is load balancing handled in a microservices system?
Load balancing distributes requests across the multiple instances of a service to spread load and improve availability. It can be server-side, using a dedicated load balancer or reverse proxy that fronts the instances, or client-side, where the caller retrieves the list of healthy instances from service discovery and chooses one using a strategy like round-robin or least-connections. In Kubernetes, this is handled by Services, with kube-proxy distributing traffic across the backing pods.
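The two strategies named above are simple enough to sketch directly. Both classes are hypothetical illustrations of client-side selection over a fixed instance list; a real balancer would refresh that list from service discovery and handle concurrency.

```python
import itertools


class RoundRobinBalancer:
    """Hands out instances in a fixed rotation."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(list(instances))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Picks the instance with the fewest in-flight requests."""

    def __init__(self, instances):
        self._active = {instance: 0 for instance in instances}

    def acquire(self):
        # On a tie, min() returns the first instance in insertion order.
        instance = min(self._active, key=self._active.get)
        self._active[instance] += 1
        return instance

    def release(self, instance):
        # Call when the request finishes so the count reflects in-flight work only.
        self._active[instance] -= 1
```

Round-robin suits requests of similar cost, while least-connections adapts better when some requests run much longer than others.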
13. What role do containers and orchestration play in microservices?
Containers package each service with its dependencies into a lightweight, portable, and consistent unit, which makes services easy to deploy and run identically across environments. Because microservices produce many such containers, an orchestrator like Kubernetes is used to schedule them, scale them up and down, restart unhealthy ones, manage networking, and roll out updates. Containers and orchestration are what make running dozens or hundreds of services operationally practical.
14. What is distributed tracing, and why is observability important?
Distributed tracing follows a single request as it travels across multiple services by attaching a shared trace ID and recording timing for each step, so you can see the full path and pinpoint where latency or errors occur. Observability, commonly built on logs, metrics, and traces, is critical because a request now spans many services, making failures far harder to diagnose than in a monolith. OpenTelemetry standardizes how this telemetry is generated and collected, while backends like Jaeger and Zipkin store and visualize the traces.
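The core mechanism, propagating one trace ID and recording a timed span per service, can be shown without any tracing library. This is a sketch only: the "x-trace-id" header name, the SPANS list, and the two service functions are hypothetical, and real systems would use OpenTelemetry instrumentation and a standard propagation header instead.

```python
import time
import uuid

TRACE_HEADER = "x-trace-id"  # hypothetical header name chosen for this example
SPANS = []                   # stand-in for a tracing backend


def record_span(service, headers, work):
    """Reuse the incoming trace ID (or start a new trace), time the work, record a span."""
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    outgoing = {**headers, TRACE_HEADER: trace_id}  # passed on to downstream calls
    start = time.perf_counter()
    try:
        return work(outgoing)
    finally:
        SPANS.append({
            "trace_id": trace_id,
            "service": service,
            "duration_s": time.perf_counter() - start,
        })


def inventory_service(headers):
    return record_span("inventory", headers, lambda h: "in stock")


def order_service(headers):
    # The order service calls inventory and forwards the trace header.
    return record_span("orders", headers, lambda h: inventory_service(h))


result = order_service({})  # no incoming trace ID, so a new trace starts here
```

Because both spans share one trace_id, a backend can stitch them into a single request timeline.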
15. What is idempotency, and why does it matter in microservices?
An operation is idempotent if performing it multiple times has the same effect as performing it once. This matters because networks are unreliable, so clients and message brokers often retry requests, and a retried "create payment" or "place order" call must not charge or order twice. You achieve it by designing operations to be naturally idempotent or by using idempotency keys and deduplication so repeated requests are recognized and safely ignored.
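Idempotency keys can be sketched as a lookup performed before the side effect. The PaymentService class, its in-memory stores, and the receipt shape are hypothetical; a real service would keep the key-to-result mapping in durable storage and guard against concurrent duplicates.

```python
class PaymentService:
    """Hypothetical service that deduplicates retries by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> receipt returned the first time
        self.charges = []   # side effects actually performed

    def create_payment(self, idempotency_key, amount_cents):
        # A retry with a known key returns the stored receipt and charges nothing.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append(amount_cents)
        receipt = {"payment_id": len(self.charges), "amount_cents": amount_cents}
        self._results[idempotency_key] = receipt
        return receipt


payments = PaymentService()
first = payments.create_payment("order-42-attempt", 500)
retry = payments.create_payment("order-42-attempt", 500)  # e.g. client timed out and retried
```

The client generates the key once per logical operation and reuses it on every retry, so the customer is charged a single time.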
16. How does the CAP theorem relate to microservices?
The CAP theorem states that a distributed system cannot simultaneously guarantee all three of consistency, availability, and partition tolerance. Since network partitions are unavoidable in a distributed microservices system, partition tolerance is effectively mandatory, so the real choice is between consistency and availability while a partition lasts. This is why many microservices favor availability and eventual consistency, accepting that data may be briefly out of sync to keep the system responsive.
17. What are the biggest challenges and trade-offs when adopting microservices?
The biggest challenges are distributed-systems complexity, handling partial failures and network latency, maintaining data consistency without distributed transactions, and the operational burden of deploying and monitoring many services. You also take on harder integration testing, service versioning, and the need for strong DevOps, observability, and automation. The core trade-off is exchanging the simplicity of a monolith for independent scalability and deployability, which only pays off when the system is large and the team is mature enough to absorb that complexity.