Queue-based load leveling pattern
The load on an application cannot always be predicted. Although demand is consistent and predictable most of the time, there are periods when the load spikes sharply, which can cause the service to fail, perform poorly, or become unavailable. The queue-based load leveling pattern helps in such scenarios. In this pattern, a queue is maintained and every request for the service is stored as a message in this queue. The queue acts as highly available, durable temporary storage from which messages reach the service at a controlled rate, reducing disruption at the service end. This is shown in the next image: multiple tasks send messages to the message queue, and the queue stores the messages and ensures that the service receives them at a rate consistent with the resources available at the service end.
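The flow described above can be sketched in a few lines of Python. This is only a minimal illustration, not a production implementation: the in-memory `queue.Queue` stands in for a durable, highly available message queue service, and the names `run_load_leveling`, `task`, `service`, and `service_delay` are hypothetical, chosen for this example. Several tasks push their messages in a burst, while a single service worker drains the queue at its own fixed pace.

```python
import queue
import threading
import time


def run_load_leveling(num_tasks=3, messages_per_task=5, service_delay=0.01):
    """Simulate bursty tasks feeding a service through a buffering queue.

    Returns the messages in the order the service processed them.
    """
    # Assumption: an in-memory queue stands in for a durable message queue.
    buffer = queue.Queue()
    processed = []

    def task(task_id):
        # Each task sends its messages as fast as it can (the load spike).
        for i in range(messages_per_task):
            buffer.put((task_id, i))

    def service():
        # The service takes one message at a time, at a rate it can sustain.
        while True:
            message = buffer.get()
            if message is None:  # sentinel: no more work will arrive
                buffer.task_done()
                break
            time.sleep(service_delay)  # simulated processing cost
            processed.append(message)
            buffer.task_done()

    consumer = threading.Thread(target=service)
    consumer.start()

    producers = [threading.Thread(target=task, args=(t,)) for t in range(num_tasks)]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    buffer.put(None)  # tell the service to stop once the backlog is drained
    buffer.join()     # wait until every queued message has been handled
    consumer.join()
    return processed


if __name__ == "__main__":
    handled = run_load_leveling()
    print(f"service handled {len(handled)} messages")
```

The tasks finish almost immediately because they only enqueue work. The service is never asked to handle more than one message at a time, so the spike is absorbed by the queue rather than by the service.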
This pattern avoids unnecessary scaling up and out of resources, that is, provisioning more instances only to meet peaks in service demand. It also has a direct impact on cost, because resource usage and the number of instances remain predictable.
High availability and better scalability are further advantages of implementing this pattern.