MQTT Codec

2024-01-01

RDME currently only supports MQTT v3.1.1 for messaging. MQTT and codecs are lengthy subjects, so we are going to cover decoding primitives and packet structure, then take a closer look at a couple of packet types.

# Encoded length
| - - - - - - - - | - - - - - - - - | - - - - - - - - |
    Fixed Header    Variable Header       Payload
     (required)       (optional)        (optional)

Fixed Headers

Every packet in MQTT starts with what is called a fixed header. The fixed header is 2 - 5 bytes long and consists of three major parts: the packet type, metadata flags for the packet, and the packet length. The first byte holds the type and the flags. The least significant bit is bit 0; bits 0 - 3 are the flags for metadata, while bits 4 - 7 are the packet type. The remaining 1 - 4 bytes encode the length.

7   6   5   4   3   2   1   0   7   6   5   4   3   2   1 ... ( 3 more possible bytes )
| - - - - - |   | - - - - - |   | - - - - - - - - - - - |
    type          flag bits             length

For every packet except the PUBLISH packet, the flag bits are fixed for that packet type. We'll talk about the flag bits for publish packets in a... bit... dammit...
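As a minimal sketch of the layout above, the first byte can be split with a shift and a mask. The function name `split_first_byte` is made up for this post and is not part of any library.

```python
def split_first_byte(first_byte: int) -> tuple[int, int]:
    """Return (packet_type, flags) from the first fixed-header byte."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte (0-255)")
    packet_type = first_byte >> 4   # bits 7-4
    flags = first_byte & 0x0F       # bits 3-0
    return packet_type, flags


# 0x10 is a CONNECT first byte: type 1 with all flag bits unset.
print(split_first_byte(0x10))  # prints (1, 0)
```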

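To decode the length, we keep reading bytes until the bit in position 7 is unset or we have read 4 bytes. Bit 7 of each byte is a continuation flag and the lower 7 bits carry the value, least significant group first. We multiply each byte's 7-bit value by 1, 128, 128*128 or 128*128*128 depending on its position and add the results together. That sum is the remaining length: the number of bytes that follow the fixed header. The largest value that fits is 268,435,455, so the max packet size is just under 256 MB.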

multiplier = 1
value = 0
do
    encodedByte = 'next byte from stream'
    value += (encodedByte AND 127) * multiplier
    # check before scaling, so a valid 4th byte is accepted and a 5th is rejected
    if (multiplier > 128*128*128)
        throw Error(Malformed Remaining Length)
    multiplier *= 128
while ((encodedByte AND 128) != 0)
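For readers who prefer something runnable, here is a rough Python sketch of the same idea. It assumes the bytes are already in a buffer rather than arriving from a stream, and the name `decode_remaining_length` is hypothetical.

```python
def decode_remaining_length(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (remaining_length, bytes_consumed) starting at data[offset]."""
    value = 0
    multiplier = 1
    for consumed in range(1, 5):  # the length field is at most 4 bytes
        if offset + consumed > len(data):
            raise ValueError("incomplete remaining length")
        encoded = data[offset + consumed - 1]
        value += (encoded & 0x7F) * multiplier  # lower 7 bits carry the value
        if (encoded & 0x80) == 0:               # continuation bit unset: done
            return value, consumed
        multiplier *= 128
    raise ValueError("malformed remaining length: more than 4 bytes")


print(decode_remaining_length(b"\x80\x01"))  # prints (128, 2)
```

Returning the number of bytes consumed lets the caller know where the variable header starts.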

Variable Headers

The next part of a packet is the variable header. However, not all packets contain a variable header and the information they contain is different between packet types. The contents of the Variable Header are included in the length given by the Fixed Header.

# Encoded length
| - - - - - - - - | - - - - - - - - | - - - - - - - - |
    Fixed Header    Variable Header       Payload
          | - - - |
           length ----------------------------------->

Packet Identifiers

The packet Id is contained inside the variable header, but only for packets that require acknowledgements. It is two bytes long, most significant byte first, and must be non-zero, for a total of 65,535 available ids. Ids are not specific to the network, but only to the broker/client's session.

7   6   5   4   3   2   1   0   7   6   5   4   3   2   1   0
| - - - - - - - - - - - - - |   | - - - - - - - - - - - - - |
                      packet identifier
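Reading the packet Id could look something like this. It is a sketch using the standard `struct` module, and `decode_packet_id` is a name invented for the example.

```python
import struct


def decode_packet_id(data: bytes, offset: int = 0) -> int:
    """Read a big-endian unsigned 16-bit packet identifier."""
    # ">H" means big-endian unsigned short; raises struct.error if too short.
    (packet_id,) = struct.unpack_from(">H", data, offset)
    if packet_id == 0:
        raise ValueError("packet identifier must be non-zero")
    return packet_id


print(decode_packet_id(b"\x00\x0a"))  # prints 10
```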

UTF-8 encoded Strings

If you're familiar with C, you know that C strings have a termination character \0. Forget that. We want stability. UTF-8 encoded strings have a predefined size, dictated by the first two bytes in an encoded string, so a string can be at most 65,535 bytes long. The bytes following the encoded length represent the string itself.

7   6   5   4   3   2   1   0   7   6   5   4   3   2   1   0
| - - - - - - - - - - - - - |   | - - - - - - - - - - - - - |   ... string bytes
                        string length
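A length-prefixed string decoder might be sketched as follows. The helper name `decode_utf8_string` is an assumption for this post; it returns the decoded text together with the offset of the next unread byte.

```python
def decode_utf8_string(data: bytes, offset: int = 0) -> tuple[str, int]:
    """Return (text, next_offset) for a 2-byte length-prefixed UTF-8 string."""
    if offset + 2 > len(data):
        raise ValueError("missing string length")
    length = int.from_bytes(data[offset:offset + 2], "big")
    start = offset + 2
    end = start + length
    if end > len(data):
        raise ValueError("string runs past the end of the buffer")
    # bytes.decode raises UnicodeDecodeError on invalid UTF-8.
    return data[start:end].decode("utf-8"), end


print(decode_utf8_string(b"\x00\x04MQTT"))  # prints ('MQTT', 6)
```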

QoS

MQTT uses what's called a Quality of Service, or QoS, to determine how to send packets. The QoS is represented by two consecutive bits, giving a total of four possible values, but there are only 3 levels of message assurance: At Most Once, At Least Once, and Exactly Once. If a recipient obtains a QoS value of 3 (both bits set), it is treated as a protocol violation and the connection is closed.

bx00 represents At Most Once or QoS 0 — Fire and forget

# At Most Once packet exchange

Sender                                   Receiver
    Pub(data) ------------------------------> N bytes

bx01 indicates At Least Once or QoS 1 — Requires the recipient to acknowledge

# At Least Once packet exchange

Sender                                   Receiver
    Pub(data) ------------------------------> N bytes

    <-------------------- Acknowledge Receipt 4 Bytes

bx10 indicates Exactly Once or QoS 2: consists of a 4 packet chain of events

# Exactly Once packet exchange.

# Sender is responsible for ensuring delivery.
# Therefore, only the sender sends retries.

Sender                                   Receiver
    Pub(data) ------------------------------> N bytes

    <--------------- Received acknowledgement 4 Bytes

    Release Message ------------------------> 4 Bytes

    <--------------------- Completion Message 4 Bytes

Publish Packet

Fixed Header

The publish packet is the only packet where the Fixed header's Flag Bits are variable. For a refresher, here is the first byte of the Fixed Header:

7   6   5   4   3   2   1   0
| - - - - - |   | - - - - - |
    type          flag bits

# PUBLISH packet type bits
0   0   1   1

The Flag Bits are broken down into three pieces of information, listed here from bit 0 up to bit 3:

  1. Retain Flag (bit 0)
  2. QoS Level (bits 1 and 2)
  3. Dup Flag (bit 3)

If the retain flag is set, the broker stores the message and delivers it to all current subscribers of the topic, as well as to any new subscribers when they subscribe.

If the Dup Flag is set, it means that the sender has already attempted to send this packet and is sending it again with the same packet Id. The Dup Flag is always unset for QoS 0 messages.
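To make the bit positions concrete, here is a small sketch that pulls the three values out of a PUBLISH first byte. `PublishFlags` and `decode_publish_flags` are illustrative names, not from any library.

```python
from typing import NamedTuple


class PublishFlags(NamedTuple):
    dup: bool
    qos: int
    retain: bool


def decode_publish_flags(first_byte: int) -> PublishFlags:
    """Decode the flag bits of a PUBLISH packet's first byte."""
    if first_byte >> 4 != 0b0011:
        raise ValueError("not a PUBLISH packet")
    retain = bool(first_byte & 0b0001)  # bit 0
    qos = (first_byte >> 1) & 0b11      # bits 2-1
    dup = bool(first_byte & 0b1000)     # bit 3
    if qos == 3:
        raise ValueError("QoS 3 is a protocol violation")
    return PublishFlags(dup, qos, retain)


print(decode_publish_flags(0x3B))  # prints PublishFlags(dup=True, qos=1, retain=True)
```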

Variable Header

The variable header consists of the Topic Name, followed by the Packet Id. The Packet Id is only present when the QoS level is 1 or 2.

Payload

Whatever you want to send.
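Putting the PUBLISH pieces together, the bytes after the fixed header could be split roughly like this. The sketch assumes the caller has already decoded the QoS from the flag bits and sliced out exactly the remaining-length bytes; `decode_publish_body` is a hypothetical name.

```python
from typing import Optional


def decode_publish_body(body: bytes, qos: int) -> tuple[str, Optional[int], bytes]:
    """Return (topic, packet_id, payload) from a PUBLISH packet body."""
    if len(body) < 2:
        raise ValueError("missing topic length")
    topic_len = int.from_bytes(body[0:2], "big")
    pos = 2 + topic_len
    if pos > len(body):
        raise ValueError("topic runs past the end of the packet")
    topic = body[2:pos].decode("utf-8")
    packet_id = None
    if qos > 0:  # the packet Id only exists for QoS 1 and 2
        if pos + 2 > len(body):
            raise ValueError("missing packet id")
        packet_id = int.from_bytes(body[pos:pos + 2], "big")
        pos += 2
    return topic, packet_id, body[pos:]  # everything left is the payload


print(decode_publish_body(b"\x00\x03a/b\x00\x0ahi", qos=1))  # prints ('a/b', 10, b'hi')
```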

Will

A Will is sent to the broker by the client inside the CONNECT packet. The broker publishes the Will to other clients in the event that the client disconnects unexpectedly, that is, without sending a DISCONNECT packet first. The Will itself consists of a UTF-8 encoded string indicating the topic, followed by two bytes indicating the payload length and finally the payload itself.

Connect Packet

Connect packets are — you can tell by the way it is...

Fixed Header

The first byte of a Connect packet's fixed header is always the same: the packet type, followed by four unset bits.

7   6   5   4   3   2   1   0
| - - - - - |   | - - - - - |
    type          flag bits

# CONNECT packet fixed header byte.
0   0   0   1   0   0   0   0

Variable Header

The variable header of a CONNECT packet starts with the UTF-8 encoded string MQTT, followed by a single byte that tells what version of MQTT is being used. For MQTT v3.1.1 the byte value will be... 4. Intuitive. For future reference, if you're introducing SEMVER into protocol versioning, use multiple bytes.
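A quick sketch of checking the protocol name and version byte, assuming the buffer starts at the first byte of the variable header. `decode_protocol_level` is a name made up for this example.

```python
def decode_protocol_level(data: bytes) -> int:
    """Validate the protocol name and return the protocol level byte."""
    expected_name = b"\x00\x04MQTT"  # 2-byte length (4) followed by "MQTT"
    if data[:6] != expected_name:
        raise ValueError("unexpected protocol name")
    if len(data) < 7:
        raise ValueError("missing protocol level")
    level = data[6]
    if level != 4:  # 4 means MQTT v3.1.1
        raise ValueError(f"unsupported protocol level: {level}")
    return level


print(decode_protocol_level(b"\x00\x04MQTT\x04"))  # prints 4
```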

Now that we know the protocol and version, we are going to decode the Connect Flags. Connect Flags in MQTT are contained in a single byte.

  1. Username Flag (bit 7): Tells whether a username is present. If it is, it will be a UTF-8 encoded string.
  2. Password Flag (bit 6): Tells whether a password is present. If it is, it will be length-prefixed binary data.
  3. Will Retain (bit 5): Tells whether the Will should be a Retained message for the topic.
  4. Will QoS (bits 4 and 3): Valid values are 0, 1 or 2. 3 is a protocol violation.
  5. Will Flag (bit 2): Tells the broker if the client is sending a Will.
  6. Clean Session (bit 1): Tells the broker if it should discard prior client session history.
  7. Reserved (bit 0): Nothing... (but it is a protocol violation if it is set).
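The Connect Flags byte can be unpacked with a handful of masks. This is only a sketch: `ConnectFlags` and `decode_connect_flags` are invented names, and it checks just the two violations mentioned above (reserved bit set, Will QoS of 3).

```python
from typing import NamedTuple


class ConnectFlags(NamedTuple):
    username: bool
    password: bool
    will_retain: bool
    will_qos: int
    will: bool
    clean_session: bool


def decode_connect_flags(flags: int) -> ConnectFlags:
    """Decode the single Connect Flags byte, bit 7 down to bit 0."""
    if flags & 0x01:
        raise ValueError("reserved bit is set")
    will_qos = (flags >> 3) & 0b11  # bits 4-3
    if will_qos == 3:
        raise ValueError("Will QoS 3 is a protocol violation")
    return ConnectFlags(
        username=bool(flags & 0x80),       # bit 7
        password=bool(flags & 0x40),       # bit 6
        will_retain=bool(flags & 0x20),    # bit 5
        will_qos=will_qos,
        will=bool(flags & 0x04),           # bit 2
        clean_session=bool(flags & 0x02),  # bit 1
    )


print(decode_connect_flags(0xC2).clean_session)  # prints True
```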

Payload

Client Id

The client Id is a required field and is a UTF-8 encoded string. The client Id is used by the broker to persist sessions. A client can send a client Id that has a length of zero bytes; if the broker accepts it, the broker MUST assign the client an Id. Additionally, if the client sends a zero byte Id, it MUST set the CleanSession bit to 1.

Will Topic

Present when the Will Flag is set to 1. It is a regular MQTT topic, sent as a UTF-8 encoded string.

Will Message

Present when the Will Flag is set. Contains a two byte length followed by the payload of the message that will be published if the client disconnects unexpectedly.

User Name

Present when the Username Flag is set. It is a regular UTF-8 encoded string.

Password

Present when the Password Flag is set. Contains a 2 byte length field followed by the password. Because setting arbitrary requirements, like UTF-8 characters, could weaken passwords, passwords can contain any bytes.
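Since every payload field uses the same two byte length prefix, one helper can read all of them in order. The following is a sketch under the assumption that the three flags were already decoded from the Connect Flags byte; `read_field` and `decode_connect_payload` are hypothetical names.

```python
def read_field(data: bytes, pos: int) -> tuple[bytes, int]:
    """Read a 2-byte big-endian length and that many bytes; return (raw, next_pos)."""
    if pos + 2 > len(data):
        raise ValueError("missing field length")
    length = int.from_bytes(data[pos:pos + 2], "big")
    end = pos + 2 + length
    if end > len(data):
        raise ValueError("field runs past the end of the payload")
    return data[pos + 2:end], end


def decode_connect_payload(payload: bytes, will: bool, username: bool, password: bool) -> dict:
    """Decode CONNECT payload fields in order: client id, will, username, password."""
    fields = {}
    raw, pos = read_field(payload, 0)
    fields["client_id"] = raw.decode("utf-8")
    if will:
        raw, pos = read_field(payload, pos)
        fields["will_topic"] = raw.decode("utf-8")
        fields["will_message"], pos = read_field(payload, pos)  # raw bytes
    if username:
        raw, pos = read_field(payload, pos)
        fields["username"] = raw.decode("utf-8")
    if password:
        fields["password"], pos = read_field(payload, pos)  # any bytes allowed
    if pos != len(payload):
        raise ValueError("unexpected trailing bytes")
    return fields


example = b"\x00\x02id" + b"\x00\x01u" + b"\x00\x02\xff\x00"
print(decode_connect_payload(example, will=False, username=True, password=True))
```

The Will message and the password stay as bytes because neither has to be valid UTF-8.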

Summary

If you made it this far, hopefully I was able to teach you something about writing network protocols. We talked about how to obtain primitives from a byte stream, general MQTT packet architecture, and some packet types. If you see some issues or have questions that you think could be better addressed in this post, please reach out to me on my "contact me" form.

MQTT v3.1.1 specification