I’d contend that you’re optimizing for initial deployment speed rather than overall resilience. Backing off with increasing delays before retrying a dependent service call is a best practice for production services. The fact that you’re seeing this behavior on initial rollout is inconvenient, but it’s also self healing. It might take a bit longer than you like, but if you’re really that impatient, there are workarounds like the one you described.