By: Ankit Gubrani / @ankitgubrani90


name: Ankit Gubrani;
title: Sr. Software Engineer;
working at: PlayStation;
email-id: [email protected];
twitter: @ankitgubrani90;



  • Let’s build a microservice application!!!
  • Cascade Failures
  • Fault Tolerant Systems
  • Circuit Breaker
  • Circuit breaker implementation using Resilience4j
  • Demo

Building a microservice!!!


Immediate failure


How to handle immediate failure


But what’s the disadvantage?



Failing immediately is good, but if we already know that the last 10 requests to a service have failed, why should we make an 11th request?

Remote service goes Unresponsive


How to handle unresponsiveness


Downsides of timeout failures


Cascade failures

A cascading failure occurs when the failure of one part of an interconnected system triggers the failure of other parts, and eventually of the whole system.

Fault tolerant systems

  • Fault tolerance is the ability of a system (a microservice application in our case) to continue operating without interruption when one or more of its components fail.
  • The objective of a fault tolerant system is to remove single points of failure.
  • Fault tolerant systems ensure high availability.

How to make fault tolerant systems

Avoiding Cascade failures

  • Timeouts
  • Retry
  • Circuit Breaker
  • Bulkhead
  • Cache Optimizations
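
The first two techniques in the list, timeouts and retries, can be sketched in plain Java as follows. This is a minimal illustration and not a production client: the class name `TimeoutRetrySketch`, both method names, and the fallback-value approach are assumptions made for this example, and `orTimeout` requires Java 9 or later.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, written only to illustrate timeouts and retries.
final class TimeoutRetrySketch {

    /** Runs the call on another thread and stops waiting after the timeout. */
    static <T> T callWithTimeout(Supplier<T> remoteCall, Duration timeout) {
        return CompletableFuture.supplyAsync(remoteCall)
                .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
                // join() throws CompletionException if the call failed or timed out.
                // Note: the abandoned task is not cancelled; it keeps running.
                .join();
    }

    /** Tries the call up to maxAttempts times, then returns the fallback. */
    static <T> T callWithRetry(Supplier<T> remoteCall, Duration timeout,
                               int maxAttempts, T fallback) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return callWithTimeout(remoteCall, timeout);
            } catch (CompletionException e) {
                // Failed or timed out: fall through and try again.
            }
        }
        return fallback;
    }
}
```

Every retry still costs the caller a full timeout when the supplier is down, which is the weakness the circuit breaker addresses.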

Circuit breaker pattern

  • The circuit breaker acts as a GATEKEEPER for all requests
  • It monitors the responses from the supplier service for consecutive failures
  • When the number of consecutive failures crosses a threshold, the circuit breaker trips
  • As intermittent failures are common, we do not open the circuit breaker on the first failed request
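
The gatekeeper idea can be sketched as a counter of consecutive failures. The class `ConsecutiveFailureGate` and its fallback-value approach are hypothetical, and the sketch assumes the breaker trips once the count reaches the threshold. It deliberately has no recovery logic: once tripped it stays tripped, because recovery belongs to the HALF-OPEN state covered in the slides that follow.

```java
import java.util.function.Supplier;

// Hypothetical gatekeeper: counts consecutive failures and trips at a threshold.
final class ConsecutiveFailureGate {
    private final int threshold;
    private int consecutiveFailures = 0;

    ConsecutiveFailureGate(int threshold) {
        this.threshold = threshold;
    }

    synchronized boolean isTripped() {
        return consecutiveFailures >= threshold;
    }

    /** Passes the call to the supplier unless the gate has tripped. */
    synchronized <T> T call(Supplier<T> supplier, T fallback) {
        if (isTripped()) {
            return fallback; // tripped: the supplier is not contacted at all
        }
        try {
            T result = supplier.get();
            consecutiveFailures = 0; // any success resets the streak
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++; // one failure alone does not trip the gate
            return fallback;
        }
    }
}
```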

Designing Circuit breaker


States of Circuit breaker




When everything is normal, the circuit breaker remains in the CLOSED state and all calls pass through to the supplier (Payment Service).

States of Circuit breaker




When the number of failures exceeds a threshold, the breaker trips and goes into the OPEN state. While open, the circuit breaker returns an error immediately, without executing the call.

States of Circuit breaker




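In the HALF-OPEN state, the circuit breaker periodically lets trial calls through to the supplier (Payment Service) to check whether it is returning results successfully again. If the trial calls succeed, the breaker closes; if they fail, it opens again.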
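
The three states can be put together in a small state machine. This is a simplified sketch under stated assumptions, not how any particular library implements the pattern: the class `SimpleCircuitBreaker`, the injectable millisecond clock, the single trial call in HALF-OPEN, and tripping when the failure count reaches the threshold are all choices made for this example.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical three-state breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED or OPEN.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injectable, e.g. System::currentTimeMillis
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, Duration openDuration, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openDuration.toMillis();
        this.clock = clock;
    }

    /** Returns the current state, moving OPEN to HALF_OPEN once the wait has elapsed. */
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Holding the lock during the call serializes callers; acceptable for a sketch. */
    synchronized <T> T call(Supplier<T> supplier) {
        if (state() == State.OPEN) {
            throw new IllegalStateException("circuit open: call rejected");
        }
        try {
            T result = supplier.get();
            state = State.CLOSED; // success in CLOSED or HALF_OPEN
            failures = 0;
            return result;
        } catch (RuntimeException e) {
            failures++;
            // A failed trial call, or too many failures while closed, opens the breaker.
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }
}
```

Injecting the clock keeps the OPEN to HALF-OPEN transition testable without sleeping.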

Why do we need Circuit breaker?

  • FAIL FASTER : The circuit breaker helps services FAIL FASTER, which conserves resources and keeps the system alive
  • PREVENTS SERVICE FAILURES : Prevents consumer services from failing along with the supplier
  • AUTOMATIC RECOVERY : The circuit breaker periodically checks whether the supplier is working again, and the whole system recovers once it is


  • Resilience4j is a library for building fault tolerant applications
  • Resilience4j provides modules for implementing:
    • Circuit Breaker
    • Bulkhead
    • RateLimiter
  • Resilience4j is inspired by Hystrix, a fault tolerance library developed by Netflix
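
A configuration sketch for the Circuit Breaker module is shown below. It needs the `resilience4j-circuitbreaker` dependency on the classpath. The breaker name `paymentService`, the stand-in `chargeCard` method, and every threshold value are assumptions chosen for illustration, not recommended settings.

```java
import io.github.resilience4j.circuitbreaker.CallNotPermittedException;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig.SlidingWindowType;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import java.time.Duration;
import java.util.function.Supplier;

public class PaymentClientSketch {

    // Hypothetical remote call; stands in for a real Payment Service client.
    static String chargeCard() {
        throw new IllegalStateException("Payment Service unavailable");
    }

    public static void main(String[] args) {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .slidingWindowType(SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(10)                           // look at the last 10 calls
                .minimumNumberOfCalls(5)                         // no verdict before 5 calls
                .failureRateThreshold(50)                        // open at 50% failures or more
                .waitDurationInOpenState(Duration.ofSeconds(10)) // stay OPEN for 10 seconds
                .permittedNumberOfCallsInHalfOpenState(3)        // trial calls in HALF-OPEN
                .build();

        CircuitBreaker breaker =
                CircuitBreakerRegistry.of(config).circuitBreaker("paymentService");
        Supplier<String> guarded =
                CircuitBreaker.decorateSupplier(breaker, PaymentClientSketch::chargeCard);

        for (int i = 1; i <= 8; i++) {
            try {
                System.out.println(guarded.get());
            } catch (CallNotPermittedException e) {
                // Thrown by the breaker itself while it is OPEN.
                System.out.println("call " + i + ": rejected, breaker is " + breaker.getState());
            } catch (RuntimeException e) {
                System.out.println("call " + i + ": failed, breaker is " + breaker.getState());
            }
        }
    }
}
```

With these settings the breaker should open after the fifth failed call, and the remaining calls should be rejected without reaching `chargeCard`.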


  • Resilience4j uses a sliding window to store and aggregate the outcomes of calls for the circuit breaker pattern
  • Count-based sliding window
    • It checks how many of the last N calls succeeded and how many failed
    • The circuit breaker state is maintained based on that information
  • Time-based sliding window
    • It checks how many of the calls made in the last N seconds succeeded and how many failed
    • The circuit breaker state is maintained based on that information
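
To make the count-based window concrete, here is a plain Java sketch that tracks the last N outcomes in a ring buffer and reports a failure rate. It illustrates the idea only and is not Resilience4j's internal implementation; the class `CountBasedWindow` and its method names are hypothetical.

```java
// Hypothetical count-based sliding window over the last N call outcomes.
final class CountBasedWindow {
    private final boolean[] failed; // ring buffer: true means the call failed
    private int next = 0;           // slot that the next outcome overwrites
    private int recorded = 0;       // outcomes currently held (at most N)
    private int failureCount = 0;   // failures among the held outcomes

    CountBasedWindow(int size) {
        this.failed = new boolean[size];
    }

    synchronized void record(boolean callFailed) {
        if (recorded == failed.length) {
            if (failed[next]) {
                failureCount--; // the oldest outcome drops out of the window
            }
        } else {
            recorded++;
        }
        failed[next] = callFailed;
        if (callFailed) {
            failureCount++;
        }
        next = (next + 1) % failed.length;
    }

    /** Failure rate over the held outcomes, as a percentage. */
    synchronized double failureRatePercent() {
        return recorded == 0 ? 0.0 : 100.0 * failureCount / recorded;
    }
}
```

A time-based window follows the same idea, but drops outcomes by age (older than N seconds) instead of by position.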


Thank You


Contact Me :
LinkedIn : Ankit Gubrani
Twitter : @ankitgubrani90
Email-ID : [email protected]
Blog :