MQTT Codec
2024-01-01
RDME currently only supports MQTT v3.1.1 for messaging. MQTT and codecs are both lengthy subjects, so we are going to stick to decoding primitives and packet structures, and take a closer look at a couple of packet types.
# Encoded length
| - - - - - - - - | - - - - - - - - | - - - - - - - - |
    Fixed Header     Variable Header       Payload
     (required)         (optional)        (optional)
Fixed Headers
Every packet in MQTT starts with what is called a fixed header. The fixed header is 2 to 5 bytes long and consists of three parts: the packet type, flag bits that carry metadata about the packet, and the remaining length of the packet. Bits are numbered from the least significant bit, bit 0. In the first byte, bits 0 - 3 are the flags and bits 4 - 7 are the packet type. The length occupies the 1 to 4 bytes that follow.
  7 6 5 4     3 2 1 0       7 6 5 4 3 2 1 ... ( 3 more possible bytes )
| - - - - - | | - - - - - | | - - - - - - - - - - - |
    type       flag bits            length
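To make the split of the first byte concrete, here is a minimal Python sketch. The function name `split_first_byte` is made up for this post and is not taken from RDME.

```python
from typing import Tuple


def split_first_byte(first_byte: int) -> Tuple[int, int]:
    """Return (packet_type, flags) from the first byte of a fixed header."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte (0-255)")
    packet_type = first_byte >> 4   # bits 7-4
    flags = first_byte & 0x0F       # bits 3-0
    return packet_type, flags


# 0x10 is the first byte of a CONNECT packet: type 1, no flag bits set.
packet_type, flags = split_first_byte(0x10)
```

A shift and a mask are all it takes, which is a large part of why the type and flags share one byte.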
For every packet except the PUBLISH packet, the flag bits are fixed for that packet type. We'll talk about the flag bits for publish packets in a... bit... dammit...
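To decode the length, we keep reading bytes until the bit in position 7 is unset or we have read 4 bytes. Bit 7 of each byte is a continuation flag; the lower seven bits carry data, least significant group first. Each byte's seven data bits are multiplied by a growing power of 128 (1, 128, 128*128, 128*128*128) and the results are added together. The sum is the number of bytes remaining in the packet after the fixed header, that is, the variable header plus the payload. Four bytes of seven bits each give a maximum of 268,435,455 bytes, just under 256 MB.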
multiplier = 1
value = 0
do
    encodedByte = 'next byte from stream'
    value += (encodedByte AND 127) * multiplier
    # check before scaling, so that a valid fourth byte is not rejected
    if (multiplier > 128*128*128)
        throw Error(Malformed Remaining Length)
    multiplier *= 128
while ((encodedByte AND 128) != 0)
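The same loop can be sketched in Python. This is an illustration rather than RDME's actual decoder. The name `decode_remaining_length` and the idea of passing the whole buffer plus an offset are assumptions made for the example.

```python
from typing import Tuple


def decode_remaining_length(data: bytes, offset: int = 1) -> Tuple[int, int]:
    """Decode the length field that starts at `offset`.

    Returns (value, bytes_consumed). `offset` defaults to 1 because the
    length follows the single type/flags byte.
    """
    multiplier = 1
    value = 0
    for consumed in range(1, 5):            # the field is at most 4 bytes long
        index = offset + consumed - 1
        if index >= len(data):
            raise ValueError("ran out of bytes while reading the length")
        encoded_byte = data[index]
        value += (encoded_byte & 0x7F) * multiplier
        if (encoded_byte & 0x80) == 0:      # continuation bit unset: done
            return value, consumed
        multiplier *= 128
    raise ValueError("malformed length: continuation bit set on 4th byte")


# 0xC1 0x02 -> 65 + 2 * 128 = 321, read from two bytes.
value, consumed = decode_remaining_length(bytes([0x30, 0xC1, 0x02]))
```

Returning the number of bytes consumed lets the caller know where the variable header starts without re-scanning the length.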
Variable Headers
The next part of a packet is the variable header. Not all packets contain a variable header, and its contents differ between packet types. The contents of the Variable Header are included in the length given by the Fixed Header.
# Encoded length
| - - - - - - - - | - - - - - - - - | - - - - - - - - |
    Fixed Header     Variable Header       Payload
          | - - - |
           length ----------------------------------->
Packet Identifiers
The packet Id lives inside the variable header, but only for packets that take part in an acknowledgement exchange. It is two bytes long, most significant byte first, and zero is not a valid value, which leaves 65,535 available ids. Ids are not global to the network; they are scoped to a single session between one client and the broker.
  7 6 5 4 3 2 1 0               7 6 5 4 3 2 1 0
| - - - - - - - - - - - - - | | - - - - - - - - - - - - - |
                      packet identifier
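Reading a packet identifier is a two byte big-endian read. A small Python sketch follows; the helper name `decode_packet_id` is made up for this example.

```python
def decode_packet_id(data: bytes, offset: int = 0) -> int:
    """Read a two byte, big-endian packet identifier at `offset`."""
    if offset + 2 > len(data):
        raise ValueError("not enough bytes for a packet identifier")
    packet_id = int.from_bytes(data[offset:offset + 2], "big")
    if packet_id == 0:
        # MQTT v3.1.1 does not allow a packet identifier of zero.
        raise ValueError("packet identifier must be non-zero")
    return packet_id


packet_id = decode_packet_id(b"\x00\x0a")   # the identifier 10
```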
UTF-8 encoded Strings
If you're familiar with C, you know that C strings have a termination character, \0. Forget that. We want stability. UTF-8 encoded strings have a predefined size, dictated by the first two bytes of the encoded string, most significant byte first. The bytes following the encoded length are the string itself, so a string can be at most 65,535 bytes long.
  7 6 5 4 3 2 1 0               7 6 5 4 3 2 1 0
| - - - - - - - - - - - - - | | - - - - - - - - - - - - - | ... string bytes
                        string length
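A length-prefixed string decodes in two steps: read the two byte length, then slice out exactly that many bytes. Here is a rough Python sketch. The name `decode_utf8_string` and the choice to return the next offset are assumptions for the example.

```python
from typing import Tuple


def decode_utf8_string(data: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode a length-prefixed UTF-8 string starting at `offset`.

    Returns (text, next_offset), where next_offset points just past the string.
    """
    if offset + 2 > len(data):
        raise ValueError("not enough bytes for the string length")
    length = int.from_bytes(data[offset:offset + 2], "big")
    start = offset + 2
    end = start + length
    if end > len(data):
        raise ValueError("string length runs past the end of the buffer")
    # bytes.decode raises UnicodeDecodeError (a ValueError) on invalid UTF-8.
    return data[start:end].decode("utf-8"), end


text, next_offset = decode_utf8_string(b"\x00\x04MQTT")   # "MQTT", then offset 6
```

Because the length is known up front, the decoder never has to scan for a terminator and never reads past what the sender declared.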
QoS
MQTT uses what's called a Quality of Service, or QoS, to determine how to send packets. The QoS is represented by two consecutive bits, giving a total of four possible values. Only three of them are used, one for each level of message assurance: At Most Once, At Least Once, and Exactly Once. If a recipient obtains a QoS value of 3 (both bits set), it is treated as a protocol violation and the connection is closed.
0b00 represents At Most Once, or QoS 0: fire and forget
# At Most Once packet exchange
Sender                           Receiver
Pub(data) ------------------------------> N bytes
0b01 indicates At Least Once, or QoS 1: requires the recipient to acknowledge
# At Least Once packet exchange
Sender                           Receiver
Pub(data) ------------------------------> N bytes
<-------------------- Acknowledge Receipt 4 Bytes
0b10 indicates Exactly Once, or QoS 2: consists of a four packet chain of events
# Exactly Once packet exchange.
# The sender is responsible for ensuring delivery.
# Therefore, only the sender sends retries.
Sender                           Receiver
Pub(data) ------------------------------> N bytes
<--------------- Received acknowledgement 4 Bytes
Release Message ------------------------> 4 Bytes
<--------------------- Completion Message 4 Bytes
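In code, the three QoS levels map naturally onto an enum, with the fourth bit pattern rejected. A minimal Python sketch follows; the `QoS` class and `decode_qos` function are illustrative names, not part of any library.

```python
from enum import IntEnum


class QoS(IntEnum):
    AT_MOST_ONCE = 0b00    # QoS 0
    AT_LEAST_ONCE = 0b01   # QoS 1
    EXACTLY_ONCE = 0b10    # QoS 2


def decode_qos(bits: int) -> QoS:
    """Turn a two bit value into a QoS level, rejecting the unused pattern."""
    if bits not in (0b00, 0b01, 0b10):
        # 0b11 is a protocol violation; the caller should close the connection.
        raise ValueError(f"invalid QoS value: {bits}")
    return QoS(bits)


level = decode_qos(0b10)   # QoS.EXACTLY_ONCE
```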
Publish Packet
Fixed Header
The publish packet is the only packet where the Fixed header's Flag Bits are variable. For a refresher, here is the first byte of the Fixed Header:
  7 6 5 4     3 2 1 0
| - - - - - | | - - - - - |
    type       flag bits
# PUBLISH packet type bits
  0 0 1 1
The Flag Bits are broken down into three pieces of information, listed here from bit 0 up to bit 3:
- Retain Flag (bit 0)
- QoS Level (bits 1 and 2)
- Dup Flag (bit 3)
If the Retain Flag is set, the broker delivers this message to all current subscribers of the topic and also stores it, so that any new subscriber receives it as soon as it subscribes.
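If the Dup Flag is set, it means that the sender has already attempted to send this packet and is sending it again. The re-sent packet keeps its original packet Id. The Dup Flag is only meaningful for QoS 1 and 2; it must be 0 on a QoS 0 publish.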
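Pulling the three fields out of a PUBLISH first byte is a matter of masks and shifts. Below is a hedged Python sketch. `PublishFlags` and `decode_publish_flags` are names invented for this post, and the bit positions are the ones MQTT v3.1.1 assigns (Dup in bit 3, QoS in bits 2 and 1, Retain in bit 0).

```python
from typing import NamedTuple


class PublishFlags(NamedTuple):
    dup: bool
    qos: int
    retain: bool


def decode_publish_flags(first_byte: int) -> PublishFlags:
    """Decode the flag bits from the first byte of a PUBLISH packet."""
    if first_byte >> 4 != 3:                  # PUBLISH is packet type 3 (0011)
        raise ValueError("not a PUBLISH packet")
    dup = bool(first_byte & 0b1000)           # bit 3
    qos = (first_byte >> 1) & 0b11            # bits 2 and 1
    retain = bool(first_byte & 0b0001)        # bit 0
    if qos == 0b11:
        raise ValueError("QoS 3 is a protocol violation")
    if qos == 0 and dup:
        raise ValueError("Dup must be 0 when QoS is 0")
    return PublishFlags(dup, qos, retain)


flags = decode_publish_flags(0x3B)   # 0011 1011: dup set, QoS 1, retain set
```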
Variable Header
The variable header consists of the Topic Name, a UTF-8 encoded string, followed by the Packet Id. The Packet Id is only present when the QoS level is 1 or 2.
Payload
Whatever you want to send.
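Putting the variable header and payload together, the bytes after a PUBLISH fixed header could be decoded roughly like this. This is a sketch that assumes the caller has already worked out the QoS level from the flag bits. `Publish` and `decode_publish_body` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Publish(NamedTuple):
    topic: str
    packet_id: Optional[int]   # None for QoS 0, which carries no packet Id
    payload: bytes


def decode_publish_body(body: bytes, qos: int) -> Publish:
    """Decode everything after the fixed header of a PUBLISH packet."""
    if len(body) < 2:
        raise ValueError("missing topic length")
    topic_len = int.from_bytes(body[0:2], "big")
    pos = 2 + topic_len
    if pos > len(body):
        raise ValueError("topic runs past the end of the packet")
    topic = body[2:pos].decode("utf-8")

    packet_id = None
    if qos > 0:                                # only QoS 1 and 2 carry an Id
        if pos + 2 > len(body):
            raise ValueError("missing packet identifier")
        packet_id = int.from_bytes(body[pos:pos + 2], "big")
        pos += 2

    # Whatever is left is the payload; it has no length prefix of its own.
    return Publish(topic, packet_id, body[pos:])


message = decode_publish_body(b"\x00\x03a/b\x00\x0ahi", qos=1)
```

The payload length is never sent explicitly: it is whatever remains of the fixed header's length once the variable header has been consumed.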
Will
A Will is sent to the broker by the client inside the CONNECT packet. The broker publishes the Will to other clients in the event that the client that registered it disconnects without first sending a DISCONNECT packet. The Will itself consists of a UTF-8 encoded string indicating the topic, followed by two bytes indicating the payload length and finally the payload itself.
Connect Packet
Connect packets are — you can tell by the way it is...
Fixed Header
The fixed header of a CONNECT packet is always the same: the packet type, followed by four unset flag bits.
  7 6 5 4     3 2 1 0
| - - - - - | | - - - - - |
    type       flag bits
# CONNECT packet fixed header byte.
  0 0 0 1     0 0 0 0
Variable Header
The variable header of a CONNECT packet starts with the UTF-8 encoded string MQTT, followed by a single byte that tells what version of MQTT is being used. For MQTT v3.1.1 the byte value will be... 4. Intuitive. For future reference, if you're introducing SEMVER into protocol versioning, use multiple bytes. After the version byte come the Connect Flags byte and a two byte Keep Alive value.
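Checking the protocol name and version byte could look something like the following Python sketch. `decode_protocol` is a made-up helper, and it only accepts MQTT v3.1.1.

```python
PROTOCOL_NAME = b"MQTT"
PROTOCOL_LEVEL_3_1_1 = 4


def decode_protocol(variable_header: bytes) -> int:
    """Validate the protocol name and return the protocol level byte."""
    if len(variable_header) < 7:
        raise ValueError("variable header too short")
    name_len = int.from_bytes(variable_header[0:2], "big")
    name = variable_header[2:2 + name_len]
    if name_len != len(PROTOCOL_NAME) or name != PROTOCOL_NAME:
        raise ValueError("unexpected protocol name")
    level = variable_header[6]        # the byte right after the 4 byte name
    if level != PROTOCOL_LEVEL_3_1_1:
        raise ValueError(f"unsupported protocol level: {level}")
    return level


level = decode_protocol(b"\x00\x04MQTT\x04")   # 4, meaning MQTT v3.1.1
```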
Now that we know the protocol and version, we are going to decode the Connect Flags. Connect Flags in MQTT are contained in a single byte. From bit 7 down to bit 0:
- Username Flag (bit 7): Tells whether a username is present. If it is, it will be a UTF-8 encoded string.
- Password Flag (bit 6): Tells whether a password is present. If it is, it will be a length-prefixed run of bytes.
- Will Retain (bit 5): Tells whether the Will should be a Retained message for the topic.
- Will QoS (bits 4 and 3): Valid values are 0, 1 or 2. 3 is a protocol violation.
- Will Flag (bit 2): Tells the broker if the client is sending a Will.
- Clean Session (bit 1): Tells the broker if it should discard any prior session state for this client.
- Reserved (bit 0): Nothing... (but it is a protocol violation if it is set).
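The Connect Flags byte can be unpacked with a handful of masks. Here is a minimal Python sketch. `ConnectFlags` and `decode_connect_flags` are illustrative names, and the bit positions are the ones MQTT v3.1.1 assigns (Username in bit 7 down to Reserved in bit 0).

```python
from typing import NamedTuple


class ConnectFlags(NamedTuple):
    username: bool
    password: bool
    will_retain: bool
    will_qos: int
    will: bool
    clean_session: bool


def decode_connect_flags(flags: int) -> ConnectFlags:
    """Decode the single Connect Flags byte of a CONNECT packet."""
    if flags & 0b0000_0001:
        raise ValueError("reserved bit is set")
    will_qos = (flags >> 3) & 0b11            # bits 4 and 3
    if will_qos == 0b11:
        raise ValueError("Will QoS 3 is a protocol violation")
    return ConnectFlags(
        username=bool(flags & 0b1000_0000),       # bit 7
        password=bool(flags & 0b0100_0000),       # bit 6
        will_retain=bool(flags & 0b0010_0000),    # bit 5
        will_qos=will_qos,
        will=bool(flags & 0b0000_0100),           # bit 2
        clean_session=bool(flags & 0b0000_0010),  # bit 1
    )


flags = decode_connect_flags(0xC2)   # username, password and clean session set
```

The decoded flags tell the payload decoder which optional fields to expect, so this byte has to be parsed before anything in the payload.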
Payload
Client Id
The client Id is a required field and is a UTF-8 encoded string. The broker uses it to identify the client and persist its session. A broker may accept a client Id that has a length of zero bytes, but if it does it MUST assign the client a unique Id. Additionally, if the client sends a zero byte Id, it MUST set the CleanSession bit to 1.
Will Topic
Present when the Will Flag is set to 1. It is a regular MQTT topic, encoded as a UTF-8 string.
Will Message
Present when the Will Flag is set. Contains a 2 byte length field followed by the payload of the message that will be published if the client disconnects unexpectedly.
User Name
Present when the Username Flag is set. It is a regular UTF-8 encoded string.
Password
Present when the Password Flag is set. Contains a 2 byte length field followed by the password. Because setting arbitrary requirements, like UTF-8 characters, could weaken passwords, passwords can contain any bytes.
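Since every payload field is length-prefixed and appears in a fixed order, decoding the CONNECT payload is a walk through the buffer guided by the flags. Below is a hedged Python sketch. The names `ConnectPayload`, `read_prefixed` and `decode_connect_payload` are invented for this post, and the three booleans are assumed to come from the already-decoded Connect Flags.

```python
from typing import NamedTuple, Optional, Tuple


class ConnectPayload(NamedTuple):
    client_id: str
    will_topic: Optional[str]
    will_message: Optional[bytes]
    username: Optional[str]
    password: Optional[bytes]


def read_prefixed(data: bytes, pos: int) -> Tuple[bytes, int]:
    """Read a 2 byte big-endian length, then that many raw bytes."""
    if pos + 2 > len(data):
        raise ValueError("truncated length prefix")
    end = pos + 2 + int.from_bytes(data[pos:pos + 2], "big")
    if end > len(data):
        raise ValueError("field runs past the end of the payload")
    return data[pos + 2:end], end


def decode_connect_payload(payload: bytes, will: bool,
                           username: bool, password: bool) -> ConnectPayload:
    """Decode the CONNECT payload in the order the fields appear on the wire."""
    raw, pos = read_prefixed(payload, 0)
    client_id = raw.decode("utf-8")
    will_topic = will_message = user = secret = None
    if will:
        raw, pos = read_prefixed(payload, pos)
        will_topic = raw.decode("utf-8")
        will_message, pos = read_prefixed(payload, pos)   # raw bytes
    if username:
        raw, pos = read_prefixed(payload, pos)
        user = raw.decode("utf-8")
    if password:
        secret, pos = read_prefixed(payload, pos)         # raw bytes, not UTF-8
    if pos != len(payload):
        raise ValueError("unexpected trailing bytes in payload")
    return ConnectPayload(client_id, will_topic, will_message, user, secret)


connect = decode_connect_payload(b"\x00\x02c1", will=False,
                                 username=False, password=False)
```

Only the client Id, Will Topic and User Name are decoded as UTF-8; the Will Message and Password are left as bytes, matching the description above.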
Summary
If you made it this far, hopefully I was able to teach you something about writing network protocols. We talked about how to obtain primitives from a byte stream, general MQTT packet architecture, and some packet types. If you see some issues or have questions that you think could be better addressed in this post, please reach out to me on my "contact me" form.