Introducing the microservices architecture
Docker and Docker Hub enable development using the microservices architecture. This architecture emphasizes building and running containers that each focus on a single aspect of the overall application. When all the containers are running, you have your complete backend application. The containers can be complex, such as a full-blown database server, or simple, such as a short shell script. Ideally, the containers you implement for your application will be simple, short, and focused. Because each microservice you write needs only a small amount of code, it should be simple to debug.
Suppose we want to develop a backend application that uses MongoDB and Redis and whose application code is written using Node.js. We have the option to create a Dockerfile and start with the MongoDB image. We would then add Redis by installing it using apt, and then add our program to it as we did with the Debian image in Chapter 2, Using VirtualBox and Docker Containers for Development. The problem with creating the application using this method is that when you stop the container for development reasons, you're also stopping the running MongoDB and Redis servers.
Instead of a monolithic container with everything installed, you can run MongoDB, Redis, and your custom application containers separately. You can even divide your custom application into multiple containers. All you need is a mechanism to communicate between your application containers.
Note
It is far better to avoid using monolithic containers in your design! While it might seem that a large and complex program such as MongoDB is a monolithic sort of thing, it's just one dedicated service you can use as a microservice.
Now that we have a brief understanding of microservices architecture, we can examine some of the benefits and requirements of containers as microservices.
Scalability
Scalability is almost always a major consideration for backend implementations. For example, a simple HTTP/WWW (web page) server can grind to a halt if enough people try to fetch pages from it at the same time. For this reason, server farms exist: you deploy two or more HTTP/WWW servers that duplicate the functionality of serving your pages. A two-server farm can serve roughly twice as many simultaneous visitors as a single server. As traffic grows (for example, if the site gains in popularity), you can add a third server, then a fourth, and so on. The capacity of the backend to serve pages grows as you need it.
In a microservices architecture, we scale in a similar way. We can run multiple instances of our MongoDB container to gain more capacity for database operations. The only trick is to configure the MongoDB instances to work together (as a replica set or as a sharded cluster) and to configure the application containers to use this database setup.
Inter-container communication
Inter-container communication usually involves some technology that allows one container to send messages to another and to receive responses or statuses in return. Running containers can communicate using several technologies, including the following:
- Sockets
- The filesystem
- Database records
- HTTP
- MQTT
Let's discuss each of them now.
Using sockets
Using sockets is a non-trivial way to communicate between containers. If you have five containers, you might have five sockets per container (one listening socket plus a connection to each of the other four) to provide communication paths between them all. As you scale, more sockets need to be created in each container, and you really want to automate this. There's quite a bit of business logic involved in opening, tracking, and reconnecting them.
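To see how quickly the bookkeeping grows, here is a minimal sketch in Node.js that counts the sockets in a full mesh. It assumes that every container keeps one connection to each peer plus one listening socket of its own; the meshStats function is a name made up for this illustration.

```javascript
// Count the sockets needed for a full mesh of containers.
// Assumption: each container holds one connection per peer plus one
// listening socket of its own.
function meshStats(containers) {
  const connections = (containers * (containers - 1)) / 2; // unique pairs
  const socketsPerContainer = (containers - 1) + 1;        // peers + listener
  return { connections, socketsPerContainer };
}

for (const n of [2, 5, 10, 50]) {
  const { connections, socketsPerContainer } = meshStats(n);
  console.log(
    `${n} containers: ${connections} connections, ` +
    `${socketsPerContainer} sockets in each container`
  );
}
```

The number of connections grows with the square of the number of containers, which is why a hand-maintained mesh becomes hard to manage as you scale.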
Using the filesystem
Using the filesystem involves sharing something such as a network drive among all the containers. To send a message, a container writes to a file in the filesystem. To receive a message, the container reads from a file in the filesystem. The receiver needs to poll, or repeatedly check, the filesystem to detect when the file is written to. This is not ideal because we don't really want to share a network drive like this—the performance is going to be on the slow side.
Note
Polling is a programming technique where you repeatedly check some state (such as whether a file has changed) instead of being notified when it changes.
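The filesystem approach can be sketched in a few lines of Node.js. This is an illustration only, under some assumptions: a temporary directory stands in for the shared drive, and the names mailbox, sendMessage, and pollOnce are hypothetical. The sender writes to a temporary file and renames it into place so that the receiver never reads a half-written message.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical shared directory. In a real setup, this would be a volume
// mounted into both containers; here a temp directory stands in for it.
const sharedDir = fs.mkdtempSync(path.join(os.tmpdir(), 'shared-'));
const mailbox = path.join(sharedDir, 'message.txt');

// Sender: write to a temporary file, then rename it into place so the
// receiver never sees a partially written message.
function sendMessage(text) {
  const tmp = `${mailbox}.tmp`;
  fs.writeFileSync(tmp, text);
  fs.renameSync(tmp, mailbox);
}

// Receiver: a single polling step. Returns the message if one is waiting,
// or null if nothing has arrived. The file is deleted after reading so the
// same message is not delivered twice.
function pollOnce() {
  if (!fs.existsSync(mailbox)) return null;
  const text = fs.readFileSync(mailbox, 'utf8');
  fs.unlinkSync(mailbox);
  return text;
}

console.log(pollOnce());           // null: nothing has been sent yet
sendMessage('hello from sender');
console.log(pollOnce());           // hello from sender
console.log(pollOnce());           // null: the message was already consumed
```

A real receiver would call pollOnce() on a timer. Each call touches the filesystem whether or not a message is waiting, which is where the cost of polling comes from.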
Using database records
Using database records is similar to the filesystem method, except the messages to be sent are simply written to records in the database and the receivers only need to poll the database records for changes. Some databases provide a notification mechanism (for example, change streams in MongoDB or keyspace notifications in Redis) to tell a client (receiver) that the data has changed, which removes the need to poll.
Both filesystem and database schemes require a good amount of business logic and debugging. You have to consider the order in which messages are sent and received, and you must avoid losing a message because a newer message overwrote it in the database or filesystem before it was read.
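The overwrite problem is easy to demonstrate. The following sketch uses in-memory classes as stand-ins for database records (no real database is involved, and SlotMailbox and LogMailbox are hypothetical names). The first mailbox keeps a single record that each send overwrites; the second appends records with sequence numbers, so the receiver can read them in order and remember how far it has got.

```javascript
// Single-slot mailbox: each send overwrites the previous record.
class SlotMailbox {
  constructor() { this.record = null; }
  send(text) { this.record = text; }
  // Return the waiting message (if any) as an array and clear the slot.
  poll() {
    const waiting = this.record === null ? [] : [this.record];
    this.record = null;
    return waiting;
  }
}

// Append-only mailbox: every record gets a sequence number, and the
// receiver asks for everything newer than the last one it processed.
class LogMailbox {
  constructor() { this.records = []; this.nextSeq = 1; }
  send(text) { this.records.push({ seq: this.nextSeq++, text }); }
  pollSince(lastSeq) { return this.records.filter((r) => r.seq > lastSeq); }
}

// Two messages are sent before the receiver gets a chance to poll.
const slot = new SlotMailbox();
slot.send('start job');
slot.send('cancel job');
console.log(slot.poll()); // only 'cancel job' is left; 'start job' was lost

const log = new LogMailbox();
log.send('start job');
log.send('cancel job');
let lastSeq = 0;
for (const record of log.pollSince(lastSeq)) {
  console.log(record.seq, record.text); // both messages, in the order sent
  lastSeq = record.seq;
}
console.log(log.pollSince(lastSeq).length); // 0: nothing new since lastSeq
```

The append-only version costs more storage and needs a cleanup policy for old records, which is part of the business logic mentioned above.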
Using HTTP
HTTP is a stateless protocol, so you don't have to maintain a mesh of open sockets for communication. The protocol is well-defined and, in its HTTP/1.1 form, human-readable (messages are plain text). To send a message, you send an HTTP request to the container you want to communicate with and wait for the response. You can close the connection after each exchange or keep it alive for reuse, as HTTP permits. Additionally, to avoid polling for messages or state changes over HTTP, you can use WebSockets.
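As a rough sketch of HTTP between containers, the following Node.js code defines a tiny service and the client call another container might make. Everything specific here is an assumption made for illustration: the /status path, port 3000, the worker hostname (which presumes both containers share a Docker network), and the route and fetchStatus names. The server is created but not started, so the snippet exits immediately when run.

```javascript
const http = require('node:http');

// Hypothetical service state, for illustration only.
const state = { status: 'idle' };

// Routing logic kept separate from the network code so it is easy to test.
// Maps a request method and URL to a status code and a JSON body.
function route(method, url) {
  if (method === 'GET' && url === '/status') {
    return { code: 200, body: { status: state.status } };
  }
  return { code: 404, body: { error: 'not found' } };
}

// Server side: every request is answered on its own, with no session
// state kept between requests.
const server = http.createServer((req, res) => {
  const { code, body } = route(req.method, req.url);
  res.writeHead(code, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify(body));
});

// Client side, as another container would run it. 'worker' is a
// hypothetical hostname; fetch is built into Node.js 18 and later.
async function fetchStatus(baseUrl = 'http://worker:3000') {
  const response = await fetch(`${baseUrl}/status`);
  return response.json();
}

// To actually serve requests, you would call: server.listen(3000);
console.log(route('GET', '/status'));
```

The client sends one request and waits for one response, which is the request/response pattern described above.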
Using MQTT
MQTT is a well-designed message bus. It works much like IRC or Slack in that you have rooms (topics) and people in rooms (subscribers). Messages sent to a room (topic) are received by the people (subscribers). The people (subscribers) can join multiple rooms (topics) and they receive the messages for those rooms (topics).
For an MQTT application, there must be one MQTT server (broker) container that is accessible from the other containers. The other containers do not have to know about one another, only the address of the MQTT broker.
The MQTT broker accepts connections from one or more clients. The clients can subscribe to one or more topics. The topics are as arbitrary as the channel/room names are in IRC or Slack; they are strings, often written as slash-separated paths such as sensors/temperature. When a message is sent to the MQTT broker for a specific topic, the broker sends the message to all the clients who are subscribed to that topic.
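The routing behavior of a broker can be modeled in a few lines. The sketch below is a toy, in-memory stand-in written for this explanation; it is not an MQTT implementation and does not use the network, wildcards, or quality-of-service levels. The ToyBroker class and the topic names are made up for the example.

```javascript
// Toy broker: exact-match topics only, no wildcards, no network.
class ToyBroker {
  constructor() {
    this.subscribers = new Map(); // topic -> Set of callbacks
  }
  subscribe(topic, onMessage) {
    if (!this.subscribers.has(topic)) this.subscribers.set(topic, new Set());
    this.subscribers.get(topic).add(onMessage);
  }
  // Deliver the payload to every client subscribed to this topic.
  publish(topic, payload) {
    for (const onMessage of this.subscribers.get(topic) ?? []) {
      onMessage(topic, payload);
    }
  }
}

const broker = new ToyBroker();
const received = { logger: [], dashboard: [] };

// Two clients: the logger joins two topics, the dashboard only one.
broker.subscribe('sensors/temperature', (t, p) => received.logger.push(`${t}=${p}`));
broker.subscribe('sensors/humidity', (t, p) => received.logger.push(`${t}=${p}`));
broker.subscribe('sensors/temperature', (t, p) => received.dashboard.push(`${t}=${p}`));

broker.publish('sensors/temperature', '21.5'); // reaches both clients
broker.publish('sensors/humidity', '40');      // reaches the logger only
broker.publish('sensors/pressure', '1013');    // no subscribers, so it is dropped

console.log(received);
```

Note that the publishers never refer to the logger or the dashboard. They only know the broker and a topic name, which is what keeps the containers decoupled.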
Mosca (https://hub.docker.com/r/matteocollina/mosca) is an MQTT broker written in JavaScript. You can run it in a container, as you do with MongoDB or Redis.
There are several other MQTT brokers to choose from, as well—you can find them on Docker Hub.
HTTP versus MQTT
MQTT is a protocol specifically designed for passing messages, which you can think of as key/value pairs: the topic is the key and the message payload is the value. Its strength is in its broadcast capability. Each client subscribes to the specific keys it cares about and is notified when their values change. A client that publishes an update can count on the broker to forward it to every other client subscribed to that key. MQTT also has the capability to retain specific key/value pairs, so when a new client subscribes, it can be notified of the current key/value pair (the most recently retained one).
MQTT does not provide a request/response mechanism out of the box, although it is simple to implement one on top of topics. The downside of using MQTT for request/response-type transactions is that the response arrives as a separate message routed through the broker, so there is no guarantee of how quickly it arrives.
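Retained messages are easiest to understand with a late subscriber. The following is a toy, in-memory model written for this explanation, not a real MQTT broker; the RetainingBroker class, its retain option, and the topic names are assumptions made for the example.

```javascript
// Toy broker that remembers the last retained payload for each topic.
class RetainingBroker {
  constructor() {
    this.subscribers = new Map(); // topic -> array of callbacks
    this.retained = new Map();    // topic -> last retained payload
  }
  subscribe(topic, onMessage) {
    if (!this.subscribers.has(topic)) this.subscribers.set(topic, []);
    this.subscribers.get(topic).push(onMessage);
    // A late subscriber immediately receives the retained value, if any.
    if (this.retained.has(topic)) onMessage(topic, this.retained.get(topic));
  }
  publish(topic, payload, { retain = false } = {}) {
    if (retain) this.retained.set(topic, payload);
    for (const onMessage of this.subscribers.get(topic) ?? []) {
      onMessage(topic, payload);
    }
  }
}

const broker = new RetainingBroker();

// Both messages are published before anyone has subscribed.
broker.publish('lights/kitchen', 'on', { retain: true });  // kept by the broker
broker.publish('lights/kitchen/events', 'switch pressed'); // not retained: lost

const seen = [];
broker.subscribe('lights/kitchen', (t, p) => seen.push(`${t}=${p}`));
broker.subscribe('lights/kitchen/events', (t, p) => seen.push(`${t}=${p}`));
broker.publish('lights/kitchen', 'off', { retain: true }); // live update

console.log(seen); // the retained 'on' value first, then the live 'off' update
```

The new subscriber learns the current state of lights/kitchen without polling and without the publisher having to resend it.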
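One common way to layer request/response on top of publish/subscribe is to put a reply topic and a correlation ID in each request. The sketch below shows the idea using a toy, in-memory broker with synchronous delivery; it is not MQTT itself, and the ToyBroker class, the topic names, and the message fields (id, replyTo) are all assumptions made for the example.

```javascript
// Toy broker: exact-match topics, synchronous delivery, no network.
class ToyBroker {
  constructor() { this.subscribers = new Map(); } // topic -> array of callbacks
  subscribe(topic, onMessage) {
    if (!this.subscribers.has(topic)) this.subscribers.set(topic, []);
    this.subscribers.get(topic).push(onMessage);
  }
  publish(topic, payload) {
    for (const onMessage of this.subscribers.get(topic) ?? []) onMessage(payload);
  }
}

const broker = new ToyBroker();

// Responder: listens on the request topic and publishes its answer to the
// reply topic named in the request, echoing the correlation ID.
broker.subscribe('math/add/request', (raw) => {
  const { id, replyTo, a, b } = JSON.parse(raw);
  broker.publish(replyTo, JSON.stringify({ id, result: a + b }));
});

// Requester: tags each request with a correlation ID and matches incoming
// replies against the requests that are still pending.
const replyTopic = 'math/add/reply/client-1';
const pending = new Map(); // correlation ID -> callback
let nextId = 1;

broker.subscribe(replyTopic, (raw) => {
  const { id, result } = JSON.parse(raw);
  const done = pending.get(id);
  if (done) {
    pending.delete(id);
    done(result);
  }
});

function requestAdd(a, b, done) {
  const id = nextId++;
  pending.set(id, done);
  broker.publish('math/add/request', JSON.stringify({ id, replyTo: replyTopic, a, b }));
}

const results = [];
requestAdd(2, 3, (sum) => results.push(sum));
requestAdd(10, 5, (sum) => results.push(sum));
console.log(results);
```

With a real broker, the reply travels over the network and may be delayed or never arrive, so a production version would also remove entries from pending after a timeout.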
HTTP requires custom programming to provide the message-passing services that MQTT provides. You could implement a message bus that mimics MQTT's functionality, but that means more programming work for you and additional maintenance costs down the line. HTTP's strength is that it is a request/response protocol, so you can typically expect a response right away. The downside is that if the server is maintaining a set of key/value pairs, the clients have to poll the server to see whether the values have changed and post to the server to update the values. Polling causes the server to burn CPU, even when values haven't changed, and this can add up in a way that grinds your server to a halt if enough clients are polling frequently enough. You could use WebSockets, but in the end, you've reinvented MQTT.
HTTP is a good choice if you need more than what MQTT provides. It is also well supported by backend platforms such as PHP and Node.js (and others).
It's possible to combine HTTP and MQTT. Use HTTP for request/response-type transactions and MQTT for state updates.
MQTT is a good choice for our purposes.
The chapter3/ directory in the companion GitHub repository for this book (https://github.com/PacktPublishing/Docker-for-Developers) contains a simple microservices-based backend demonstration application. It uses MongoDB, Redis, and MQTT, along with some publisher and subscriber applications. Later in this chapter, we'll learn how to share our subscriber and publisher containers via Docker Hub.