The MQTT modules (MQTT Transmission, MQTT Distributor and MQTT Engine) all provide status tags within Ignition that can be used to monitor the overall health of the MQTT pipeline.

MQTT Engine

The tags automatically created for MQTT Engine are documented here

Engine Info - Edge Nodes


Use the tags below to verify the expected number of Edge Nodes have come online and remain online:

NodesOffline | Integer | The number of Sparkplug Edge Nodes offline. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH.
NodesOnline | Integer | The number of Sparkplug Edge Nodes online. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH.
NodeUnitCount | Integer | The total number of Sparkplug Edge Nodes as determined by the received NBIRTH messages.


Use the tags below to identify the Sparkplug ID and last timestamp for the offline and online nodes:

Offline Nodes | Dataset | A dataset containing the Sparkplug ID and timestamp for all offline Sparkplug Edge Nodes.
Online Nodes | Dataset | A dataset containing the Sparkplug ID and timestamp for all online Sparkplug Edge Nodes.


Use the script below to parse the Offline and Online datasets:

# Read the node counts and node datasets in a single blocking call
basePath = "[MQTT Engine]Engine Info/Edge Nodes/"
paths = [basePath + name for name in ["NodesOnline", "Online Nodes", "NodesOffline", "Offline Nodes"]]
onlineCount, onlineNodes, offlineCount, offlineNodes = [qv.value for qv in system.tag.readBlocking(paths)]

# Print the online count, then the Sparkplug ID and timestamp of each online node
print("Online Sparkplug EdgeNodes: " + str(onlineCount))
if onlineCount > 0:
    for row in system.dataset.toPyDataSet(onlineNodes):
        data = []
        data.append(["Sparkplug EdgeNode Descriptor", row[0]])
        data.append(["Last Connect Date", row["Date"]])
        print(data)

# Print the offline count, then the Sparkplug ID and timestamp of each offline node
print("Offline Sparkplug EdgeNodes: " + str(offlineCount))
if offlineCount > 0:
    for row in system.dataset.toPyDataSet(offlineNodes):
        data = []
        data.append(["Sparkplug EdgeNode Descriptor", row[0]])
        data.append(["Last Connect Date", row["Date"]])
        print(data)


Engine Info - MQTT Clients



Server Latency (ms) | Integer | The amount of time that it takes for a test MQTT message to be sent and received back by MQTT Engine.

A long latency time can indicate network issues.


Edge Node - Node Info Tags

For each connected Edge Node, Node Control and Node Info folders containing tags are created, along with a Device Info folder for each connected device.


Using these tags we can look into the health of each Edge Node connection: 

MQTT Engine Sparkplug Data Latency

A long latency time can indicate network issues.

Data Latency (ms) | Long | The time in milliseconds between the payload's reported time and the time MQTT Engine received the last message.

Note: For this value to be accurate, the Edge Node's clock and the clock of the system running MQTT Engine should be synchronized.
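
The arithmetic behind the Data Latency value, and the reason for the clock note above, can be illustrated with a minimal sketch. It assumes both timestamps are epoch milliseconds; the function name and sample values are illustrative only and are not part of MQTT Engine.

```python
def data_latency_ms(received_ms, payload_ms):
    """Latency in ms: time of receipt minus the payload's reported time."""
    return received_ms - payload_ms

# Clocks in sync: a payload stamped at time t arrives 250 ms later.
in_sync = data_latency_ms(1700000000250, 1700000000000)

# Edge clock running 400 ms fast: the same 250 ms trip carries a payload
# timestamp that is 400 ms too late, so the result no longer reflects
# the network delay.
skewed = data_latency_ms(1700000000250, 1700000000400)

print(in_sync)
print(skewed)
```

With synced clocks the result is 250 ms. With the skewed edge clock the same trip computes to -150 ms, which is why unsynchronized clocks make the value hard to interpret.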

MQTT Engine Sparkplug Edge Node Birth Count
MQTT Engine keeps track of the number of Sparkplug Birth messages it receives from a Sparkplug edge node and/or Sparkplug device. This count is tracked in an Ignition tag under the MQTT Engine tag provider on a per-edge node basis. Monitoring the Birth Count tag across all edge nodes will provide insight into how often each Sparkplug edge node sends Birth messages, whether because of rebirth requests, configuration changes at the edge, network issues, etc.

A high Birth Count can indicate issues at the edge gateway, and repeated Birth messages put additional load on MQTT Engine and the gateway hosting it.

Birth Count | Long | The number of NBIRTH messages since the last time the info metrics were reset via the Node Info/Reset Info tag.
Death Count | Long | The number of NDEATH messages since the last time the info metrics were reset via the Node Info/Reset Info tag.
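
One way to act on the Birth Count values is to flag the edge nodes whose count exceeds a chosen limit. The sketch below is a minimal illustration: the function name, the threshold of 10, and the sample node names and counts are all hypothetical. In Ignition, the counts would be read from each edge node's Node Info/Birth Count tag.

```python
def nodes_with_high_birth_count(birth_counts, threshold):
    """Return the edge nodes whose Birth Count exceeds the threshold, sorted by name."""
    return sorted(node for node, count in birth_counts.items() if count > threshold)

# Hypothetical sample: Birth Count values keyed by Sparkplug Edge Node Descriptor.
sample_counts = {
    "Plant A/Node 1": 2,
    "Plant A/Node 2": 57,
    "Plant B/Node 1": 1,
    "Plant B/Node 2": 14,
}

# The threshold is an arbitrary example value; choose one that suits your system.
flagged = nodes_with_high_birth_count(sample_counts, 10)
print(flagged)
```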

MQTT Engine Sparkplug Edge Node Connection Status
MQTT Engine keeps track of the state of the MQTT connection for each Sparkplug edge node. This connection status is tracked in the Online tag, an Ignition tag under the MQTT Engine tag provider, on a per-edge node basis.

Monitoring the Online tag across all edge nodes will provide insight into how often each Sparkplug edge node goes online and offline. Repeated online/offline cycles can indicate network or connectivity issues at the edge gateway.

Offline DateTime | DateTime | The time at which the last NDEATH message was received by MQTT Engine.
Online | Boolean | Whether or not the Edge Node is online. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH.
Online DateTime | DateTime | The time at which the first NBIRTH message for a connection was received by MQTT Engine.
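
Repeated online/offline cycles can be counted from a time-ordered series of Online values, for example values retrieved from tag history. The sketch below is a minimal illustration that assumes the samples have already been collected into a list of booleans. The function name and sample data are hypothetical.

```python
def count_offline_transitions(samples):
    """Count online -> offline changes in a time-ordered list of Online values."""
    transitions = 0
    for previous, current in zip(samples, samples[1:]):
        if previous and not current:
            transitions += 1
    return transitions

# Hypothetical history of one node's Online tag, oldest value first.
online_history = [True, True, False, True, False, False, True]

drops = count_offline_transitions(online_history)
print(drops)
```

In this sample the node dropped offline twice. A count that keeps growing for one node while others stay stable points at that node's connectivity rather than at MQTT Engine.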

MQTT Engine Rebirth Requests

MQTT Engine can ask the edge client, MQTT Transmission, to publish a new Birth message at any time; this is a rebirth request. Engine requests a rebirth from the edge when it encounters an error that requires “resetting” the Sparkplug session. Monitoring the Rebirth tags in the Node Info folders under [MQTT Engine]Edge Nodes will provide insight into the overall health of your MQTT data pipeline. A high rebirth count generally means there is a problem between the edge gateway and the central gateway. If you historize the tags, you can track the reasons for the rebirth requests over time and use this data to find the root cause of issues with infrastructure, network, configuration, etc.

Rebirth Count | Integer | The count of rebirth requests issued by MQTT Engine (available 4.0.22 onward).
Rebirth (Last DateTime) | DateTime | The time of the last rebirth request issued by MQTT Engine (available 4.0.22 onward).
Rebirth (Last) Cause | String | The reason for the last rebirth request (available 4.0.22 onward).
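
If the Rebirth (Last) Cause tag is historized, the recorded values can be tallied to see which cause dominates. The sketch below is a minimal illustration that assumes the historized values have already been collected into a list of strings. The function name and the sample list are hypothetical, and the exact cause strings written to the tag may differ from those shown.

```python
from collections import Counter

def tally_rebirth_causes(causes):
    """Return (cause, count) pairs, most frequent cause first."""
    return Counter(causes).most_common()

# Hypothetical historized values of one node's Rebirth (Last) Cause tag.
history = [
    "Message sequence number error",
    "Triggered by user",
    "Message sequence number error",
    "Unknown metric",
    "Message sequence number error",
    "Unknown metric",
]

tally = tally_rebirth_causes(history)
for cause, count in tally:
    print("%s: %d" % (cause, count))
```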


Reasons for rebirth requests are:

  • Triggered by user
    • A rebirth request for the Edge Node was manually triggered at MQTT Engine
  • Message sequence number error
    • The message sequence number received was not in order 

      Common cause:

      • In a Sparkplug compliant system, the combination of Group ID and Edge Node ID (the Sparkplug Edge Node Descriptor) that identifies the Edge Node must be unique. If there are two or more Transmitters with the same Sparkplug Edge Node Descriptor, data from these transmitters will be sent on the same topic, so the next message sequence number expected by the MQTT client will be incorrect. As a result, the MQTT client will mark the data as stale and request a rebirth from the transmitter. If you have multiple MQTT clients subscribing to the namespace, this will likely also create a firestorm of rebirth requests across the system.
  • Received a message for an edge node that is offline
    • An NDATA message was received from an Edge Node that is marked as Offline at MQTT Engine.

      Common causes:

      • A BIRTH message was not published by the Edge Node or was not received by MQTT Engine after marking the Edge Node Offline
      • There are two or more Transmitters with the same Sparkplug Edge Node Descriptor. A DEATH message has been received for one Edge Node, marking the Edge Node Offline, and subsequent data is received from another Edge Node with the same Sparkplug Edge Node Descriptor.
  • Reordering sequence numbers
    • After receiving an out-of-order message, MQTT Engine waited the specified number of milliseconds for the message with the expected sequence number, but it did not arrive.

      Common cause:

      • The reordering timeout at MQTT Engine, configured to support clustered MQTT servers which do not support guaranteed in order delivery of QoS 0 messages, is not long enough for MQTT Engine to receive a message with the next sequence number. 
  • Unable to set Edge Node online
    • Engine failed to set the Edge Node online - review logs for exact cause
  • Failed to find metric name from alias
    • The alias in the DATA message did not match any alias previously published in the BIRTH message
  • UDT tag doesn't exist
    • UDT tag received in DATA message not found in previously published BIRTH messages
  • Unknown metric
    • A DDATA message contains a metric not previously included in a BIRTH message
  • DDATA before BIRTH
    • A DDATA message was received from an Edge Node Device that is marked as Offline at MQTT Engine

      Common causes:

      • A DBIRTH message was not published by the Edge Node or was not received by MQTT Engine after marking the Edge Node Device Offline
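
The sequence number errors described above can be illustrated with a minimal sketch. It relies on the Sparkplug rule that sequence numbers run from 0 to 255 and then wrap back to 0. The function names and sample sequences are hypothetical, and the sketch only mimics the ordering check. Unlike MQTT Engine, it keeps checking after the first error instead of requesting a rebirth.

```python
def next_expected(seq):
    """Sparkplug sequence numbers run from 0 to 255 and then wrap to 0."""
    return (seq + 1) % 256

def find_sequence_errors(seqs):
    """Return the indices of messages whose sequence number is not the expected one."""
    errors = []
    expected = None
    for index, seq in enumerate(seqs):
        if expected is not None and seq != expected:
            errors.append(index)
        expected = next_expected(seq)
    return errors

# One transmitter: the wrap from 255 to 0 is in order.
single = find_sequence_errors([254, 255, 0, 1])

# Two transmitters sharing one Edge Node Descriptor: their independent
# sequences (5, 6, 7 and 40, 41) interleave on the same topic.
duplicated = find_sequence_errors([5, 40, 6, 41, 7])

print(single)
print(duplicated)
```

The single transmitter produces no errors. With a duplicated descriptor, every message after the first is out of order, which is the pattern that leads to repeated rebirth requests.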

MQTT Engine Message Queues
MQTT Engine queues messages to a set of internal queues fronting thread pools, with one pool/queue per Sparkplug edge node under typical conditions and configuration. Since Sparkplug messages must be processed in order, each thread pool contains only a single thread. Under high load or message volume, these thread pools can get backed up, which is visible in the queue size: a high queue size means messages are waiting to be processed. Monitoring these tags will help to identify any backup in MQTT Engine Sparkplug message processing.

Tags to monitor - MQTT Engine message queues

  • [MQTT Engine]Engine Info/Queued Messages
    • A dataset showing the current message count for each message queue
  • [MQTT Engine]Engine Info/Queued Messages Total
    • The count of all current queued messages
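
A backup can be spotted by comparing each queue's message count against a limit. The sketch below is a minimal illustration that assumes the Queued Messages dataset has already been converted to (queue name, message count) pairs. The column layout, function name, threshold, and sample values are all hypothetical.

```python
def backed_up_queues(queue_counts, threshold):
    """Return the (queue name, message count) pairs whose count exceeds the threshold."""
    return [(name, count) for name, count in queue_counts if count > threshold]

# Hypothetical rows standing in for the Queued Messages dataset.
sample_queues = [
    ("Plant A/Node 1", 0),
    ("Plant A/Node 2", 1250),
    ("Plant B/Node 1", 3),
]

# Sum of all queues, comparable to the Queued Messages Total tag.
total = sum(count for _, count in sample_queues)

# The threshold of 100 is an arbitrary example value.
backlog = backed_up_queues(sample_queues, 100)

print(total)
print(backlog)
```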

