The MQTT modules (MQTT Transmission, MQTT Distributor and MQTT Engine) all provide status tags within Ignition that can be used to monitor the overall health of the MQTT pipeline.

MQTT Engine

The tags automatically created for MQTT Engine are documented here

Engine Info - Edge Nodes


Use the tags below to verify that the expected number of Edge Nodes have come online and remain online:

Tag | Data Type | Description
NodesOffline | Integer | The number of Sparkplug Edge Nodes offline. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH.
NodesOnline | Integer | The number of Sparkplug Edge Nodes online. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH.
NodeUnitCount | Integer | The total number of Sparkplug Edge Nodes as determined by the received NBIRTH messages.
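As a minimal sketch of how these counts could be checked from a script, the helper below compares the two tag values against an expected node count. The function name `check_node_counts` and the `expected_total` parameter are hypothetical and not part of MQTT Engine. In Ignition, the two inputs would come from reading the NodesOnline and NodesOffline tags with `system.tag.readBlocking`.

```python
def check_node_counts(nodes_online, nodes_offline, expected_total):
    """Return a list of human-readable problems; an empty list means healthy.

    nodes_online / nodes_offline are the values of the NodesOnline and
    NodesOffline tags. expected_total is a site-specific number that you
    supply (an assumption of this sketch, not an MQTT Engine setting).
    """
    problems = []
    if nodes_online < expected_total:
        problems.append("%d of %d expected Edge Nodes are online"
                        % (nodes_online, expected_total))
    if nodes_offline > 0:
        problems.append("%d Edge Node(s) reported offline" % nodes_offline)
    return problems


# Example with literal values standing in for the tag reads.
print(check_node_counts(nodes_online=8, nodes_offline=2, expected_total=10))
```

Returning a list rather than a single boolean keeps the result usable both for alarming (is the list empty?) and for display.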


Use the tags below to identify the Sparkplug ID and last timestamp for the Offline and Online nodes:

Tag | Data Type | Description
Offline Nodes | Dataset | A dataset containing the Sparkplug ID and timestamp for all offline Sparkplug edge nodes.
Online Nodes | Dataset | A dataset containing the Sparkplug ID and timestamp for all online Sparkplug edge nodes.


Use the script below to parse the Offline and Online datasets:

# Read both counts and both datasets in a single blocking call.
base = "[MQTT Engine]Engine Info/Edge Nodes/"
paths = [base + "NodesOnline", base + "Online Nodes", base + "NodesOffline", base + "Offline Nodes"]
onlineCount, onlineNodes, offlineCount, offlineNodes = [qv.value for qv in system.tag.readBlocking(paths)]

# For each state, print the node count followed by one entry per node:
# the Sparkplug descriptor (first column) and the timestamp ("Date" column).
for state, count, nodes in [("Online", onlineCount, onlineNodes), ("Offline", offlineCount, offlineNodes)]:
    print("%s Sparkplug Edge Nodes: %s" % (state, count))
    if count > 0:
        for row in system.dataset.toPyDataSet(nodes):
            print([["Sparkplug Edge Node Descriptor", row[0]], ["Date", row["Date"]]])
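Once the Offline Nodes dataset has been parsed, a common follow-up is to pick out the nodes that have been offline for longer than is acceptable. The sketch below assumes each dataset row has already been reduced to a (descriptor, timestamp in epoch milliseconds) pair, for example with `system.date.toMillis(row["Date"])` in Ignition. The function name, the pair layout, and the threshold are all hypothetical.

```python
def long_offline_nodes(rows, now_ms, max_offline_ms):
    """Return descriptors of nodes whose timestamp is older than max_offline_ms.

    rows is a list of (descriptor, timestamp_ms) pairs built from the
    Offline Nodes dataset; both the pair layout and the threshold are
    assumptions of this sketch.
    """
    return [descriptor for descriptor, timestamp_ms in rows
            if now_ms - timestamp_ms > max_offline_ms]


# Example: two offline nodes (placeholder descriptors), checked at
# t = 600,000 ms with a 5 minute limit.
rows = [("Group1/NodeA", 100000), ("Group1/NodeB", 550000)]
print(long_offline_nodes(rows, now_ms=600000, max_offline_ms=5 * 60 * 1000))
```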



Engine Info - MQTT Clients



Tag | Data Type | Description
Server Latency (ms) | Integer | The amount of time that it takes for a test MQTT message to be sent and received back by MQTT Engine.


A long latency time can indicate network issues.


Edge Node - Node Info Tags

For each connected Edge Node, Node Control and Node Info folders containing tags are created, along with a Device Info folder for each connected device.


Using these tags we can look into the health of each Edge Node connection: 

MQTT Engine Sparkplug Data Latency

A long latency time can indicate network issues.

Tag | Data Type | Description
Data Latency (ms) | Long | The time in milliseconds between the timestamp reported in the payload of the last message and the time at which MQTT Engine received that message.

Note: For this value to be accurate, the clock on the edge node and the clock on the system running MQTT Engine should be synchronized.
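To see why clock synchronization matters, the following sketch reproduces the arithmetic described above: the receive time minus the payload's reported time. The function name `data_latency_ms` and the sample numbers are illustrative only and do not show how MQTT Engine computes or reports the tag internally.

```python
def data_latency_ms(received_ms, payload_timestamp_ms):
    """Latency as described above: receive time minus the payload's timestamp."""
    return received_ms - payload_timestamp_ms


# Synced clocks: the message was stamped at 1,000 ms and received at 1,250 ms.
print(data_latency_ms(1250, 1000))

# Same message, but the edge clock runs 400 ms ahead of the Engine clock,
# so the payload is stamped 1,400 ms and the computed difference is skewed.
print(data_latency_ms(1250, 1400))
```

In the second case the computed difference no longer reflects the real 250 ms transit time, which is why an unexpected latency value should first be checked against clock drift before blaming the network.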


MQTT Engine Sparkplug Edge Node Birth Count
MQTT Engine keeps track of the number of Sparkplug Birth messages it receives from a Sparkplug edge node and/or Sparkplug device. This count is tracked in an Ignition tag under the MQTT Engine tag provider on a per-edge-node basis. Monitoring the Birth Count tag across all edge nodes will provide insight into how often each Sparkplug edge node is sending Birth messages, whatever the reason: rebirth requests, configuration changes at the edge, network issues, etc.

A high Birth count can indicate issues at the edge gateway, and repeated Birth messages put additional load on MQTT Engine and on the gateway hosting it.

Tag | Data Type | Description
Birth Count | Long | The number of NBIRTH messages since the last time the info metrics were reset via the Node Info/Reset Info tag.
Death Count | Long | The number of NDEATH messages since the last time the info metrics were reset via the Node Info/Reset Info tag.
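Because these counters only grow until they are reset, the raw value is less informative than its rate of change. As a rough sketch, the helper below turns two samples of Birth Count into an average births-per-hour figure. The function name and the (timestamp, count) sample layout are hypothetical; the samples could come from the tag historian or from a periodic script.

```python
def births_per_hour(first_sample, last_sample):
    """Average NBIRTH rate between two (timestamp_ms, birth_count) samples.

    Returns None when the rate cannot be computed: the samples span no time,
    or the counter went down (for example after a Reset Info).
    """
    (t0, c0), (t1, c1) = first_sample, last_sample
    if t1 <= t0 or c1 < c0:
        return None
    hours = (t1 - t0) / 3600000.0
    return (c1 - c0) / hours


# Example: Birth Count went from 3 to 9 over two hours.
print(births_per_hour((0, 3), (7200000, 9)))
```

The same calculation applies to Death Count. What counts as a "high" rate depends on the site, so any alarm threshold built on this value is a local choice.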


MQTT Engine Sparkplug Edge Node Connection Status
MQTT Engine keeps track of the state of the MQTT connection for each Sparkplug edge node. This connection status is tracked in the Online tag, an Ignition tag under the MQTT Engine tag provider, on a per-edge-node basis.

Monitoring the Online tag across all edge nodes will provide insight into how often each Sparkplug edge node is going online and offline. Repeated online/offline cycles can indicate network or connectivity issues at the edge gateway.

Tag | Data Type | Description
Offline DateTime | DateTime | The time at which the last NDEATH message was received by MQTT Engine.
Online | Boolean | Whether or not the Edge Node is online. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH.
Online DateTime | DateTime | The time at which the first NBIRTH message for a connection was received by MQTT Engine.
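One way to quantify repeated online/offline cycles is to count how often the Online tag dropped from true to false over a window. The sketch below assumes a chronological list of Online values is already available, for example from the tag historian. The function name and input layout are hypothetical.

```python
def count_offline_transitions(online_history):
    """Count how many times the Online tag changed from True to False.

    online_history is a chronological list of Online tag values, for
    example pulled from the tag historian (an assumption of this sketch).
    """
    transitions = 0
    for previous, current in zip(online_history, online_history[1:]):
        if previous and not current:
            transitions += 1
    return transitions


# Example: the node dropped offline twice in this window.
history = [True, True, False, True, False, False, True]
print(count_offline_transitions(history))
```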


MQTT Engine Rebirth Requests

MQTT Engine can ask the edge client, MQTT Transmission, to publish a new Birth message at any time. This is a rebirth request. Engine will request a rebirth from the edge when it encounters any error that requires “resetting” the Sparkplug session. Monitoring the Rebirth tags in the Node Info folders under [MQTT Engine]Edge Nodes will provide insight into the overall health of your MQTT data pipeline. If the rebirth count is high, that generally means there is a problem between the edge gateway and the central gateway. If you historize the tags, you will be able to track the reasons for the rebirth requests over time and use this data to find the root cause of issues with infrastructure, network, configuration, etc.

Tag | Data Type | Description
Rebirth Count | Integer | The count of rebirth requests issued by MQTT Engine (available 4.0.22 onward).
Rebirth (Last DateTime) | DateTime | The time of the last rebirth request issued by MQTT Engine (available 4.0.22 onward).
Rebirth (Last) Cause | String | The reason for the last rebirth request (available 4.0.22 onward).
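If the Rebirth (Last) Cause tag is historized, the stored values can be tallied to see which cause dominates. The sketch below is one possible approach. The function name is hypothetical, and the cause strings in the example are placeholders rather than actual MQTT Engine messages.

```python
from collections import Counter


def tally_rebirth_causes(cause_history):
    """Return (cause, count) pairs, most frequent first.

    cause_history is a list of historized Rebirth (Last) Cause values.
    Empty or missing values are skipped.
    """
    counts = Counter(cause for cause in cause_history if cause)
    return counts.most_common()


# The cause strings below are placeholders, not real MQTT Engine messages.
history = ["cause A", "cause B", "cause A", None, "cause A"]
print(tally_rebirth_causes(history))
```

Because the tag holds only the most recent cause, consecutive rebirths with the same cause may be stored as a single historized value depending on the historian settings, so the tally is best read alongside Rebirth Count.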


Reasons for rebirth requests are:

MQTT Engine Message Queues
MQTT Engine places incoming messages on a set of internal queues, each fronting a thread pool. Under typical conditions and configuration there is one queue and pool per Sparkplug edge node. Since Sparkplug messages must be processed in order, each of these thread pools contains only a single thread. Under high load or message volume, these thread pools can get backed up, and this is visible in the queue size. If the queue size is high, messages are backed up waiting to be processed. Monitoring these tags will help to identify any backup in MQTT Engine Sparkplug message processing.

Tags to monitor - MQTT Engine message queues