...

Using these tags we can look into the health of each Edge Node connection: 

MQTT Engine Sparkplug Data Latency

A long latency time can indicate network issues.

Data Latency (ms) (Long): The time in milliseconds between MQTT Engine receiving the last message and the timestamp reported in that message's payload.

Note: For this value to be accurate, the edge node's clock and the system clock of the machine running MQTT Engine should be synchronized.
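The effect of unsynchronized clocks can be illustrated with a minimal sketch. It assumes the latency is simply the receive time minus the payload timestamp, both in epoch milliseconds; the function name and the sample values are hypothetical and are not part of MQTT Engine.

```python
def data_latency_ms(received_at_ms, payload_timestamp_ms):
    """Apparent latency: receive time minus the time reported in the payload."""
    return received_at_ms - payload_timestamp_ms


# Clocks in sync: the payload was stamped at t = 1000000 ms and
# received 250 ms later.
synced = data_latency_ms(1000250, 1000000)

# Same message, but the edge node's clock runs 5 seconds fast, so the
# payload timestamp is 5000 ms too large.
skewed = data_latency_ms(1000250, 1000000 + 5000)

print(synced)  # 250
print(skewed)  # -4750
```

A negative or implausibly large value is therefore more likely to point at clock drift than at the network.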

...


MQTT Engine Sparkplug Edge Node Birth Count
MQTT Engine keeps track of the number of Sparkplug Birth messages it receives from a Sparkplug edge node and/or Sparkplug device. This count is tracked in an Ignition tag under the MQTT Engine tag provider on a per-edge node basis. Monitoring the Birth Count tag across all edge nodes will provide insight into how often each Sparkplug edge node is sending Birth messages, whatever the reason: rebirth requests, configuration changes at the edge, network issues, etc.

A high Birth count can be indicative of issues at the edge gateway (GW), and repeated Birth messages can put additional load/stress on MQTT Engine and the GW hosting Engine.

...

Birth Count (Long): The number of NBIRTH messages since the last time the info metrics were reset via the Node Info/Reset Info tag

...

Death Count (Long): The number of NDEATH messages since the last time the info metrics were reset via the Node Info/Reset Info tag
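One way to turn the raw Birth Count into something comparable across edge nodes is a rate. The sketch below is only an illustration: it assumes you already have two readings of the count and the elapsed time between them (for example from tag history), and the function names, node names, and threshold are hypothetical.

```python
def births_per_hour(count_start, count_end, elapsed_seconds):
    """Birth rate between two readings of a Birth Count tag."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = count_end - count_start
    if delta < 0:
        # The counter went backwards, so assume the info metrics were
        # reset in between and count only the births since the reset.
        delta = count_end
    return delta * 3600.0 / elapsed_seconds


def noisy_nodes(samples, threshold_per_hour):
    """Return the edge nodes whose birth rate exceeds the threshold.

    samples maps an edge node name to (count_start, count_end, elapsed_seconds).
    """
    return sorted(
        node
        for node, (start, end, elapsed) in samples.items()
        if births_per_hour(start, end, elapsed) > threshold_per_hour
    )


samples = {
    "Group A/Node 1": (10, 16, 1800),  # 6 births in 30 minutes
    "Group A/Node 2": (4, 4, 1800),    # no births
}
print(noisy_nodes(samples, 5))  # ['Group A/Node 1']
```

The same calculation applies to the Death Count tag.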


MQTT Engine Sparkplug Edge Node Connection Status
MQTT Engine keeps track of the state of the MQTT connection for each Sparkplug edge node. This connection status is tracked on a per-edge node basis in an Ignition tag under the MQTT Engine tag provider: the Online tag.

Monitoring the Online tag across all edge nodes will provide insight into how often each Sparkplug edge node is going online and offline. Repeated online/offline cycles can indicate network/connectivity issues at the edge GW.

Offline DateTime (DateTime): The time at which the last NDEATH message was received by MQTT Engine

Online (Boolean): Whether or not the Edge Node is online. This is determined by whether the last lifecycle message was an NBIRTH or NDEATH

Online DateTime (DateTime): The time at which the first NBIRTH message for a connection was received by MQTT Engine
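Repeated online/offline cycles can be quantified by counting state changes in a series of Online values. This is a minimal sketch that assumes the values have already been collected in time order (for example from historized tag data); the function name and the sample history are hypothetical.

```python
def count_transitions(online_history):
    """Count how many times the Online value changed between samples."""
    values = [bool(v) for v in online_history]
    return sum(1 for prev, cur in zip(values, values[1:]) if prev != cur)


# Online samples for one edge node, oldest first.
history = [True, True, False, True, False, True]
print(count_transitions(history))  # 4
```

A node that stays online produces zero transitions, so any count well above zero over a short period is worth investigating.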


MQTT Engine Rebirth Requests

MQTT Engine can ask the edge client, MQTT Transmission, to publish a new Birth message at any time; this is a rebirth request. Engine will request a rebirth from the edge when it encounters any error that requires “resetting” the Sparkplug session. Monitoring the Rebirth tags in the Node Info folders under [MQTT Engine]Edge Nodes will provide insight into the overall health of your MQTT data pipeline. If the rebirth count is high, that generally means there is a problem between the edge GW and the central GW. If you historize the tags, you will be able to track the reasons for the rebirth requests over time and use this data to find the root cause of various issues with infrastructure, network, configuration, etc.

Rebirth Count (Integer): The count of rebirth requests issued by MQTT Engine (available 4.0.22 onward)

Tip
A high rebirth count within a short time window is a clear indicator of issues at the Edge(s).


Rebirth (Last DateTime) (DateTime): The time of the last rebirth request issued by MQTT Engine (available 4.0.22 onward)
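If the Rebirth (Last DateTime) tag is historized, counting the rebirth requests inside a sliding time window is straightforward. The sketch below assumes the historized timestamps are already available as a list of datetime values; the function name, window length, and sample times are hypothetical.

```python
from datetime import datetime, timedelta


def rebirths_in_window(rebirth_times, now, window):
    """Count rebirth requests whose timestamp falls in (now - window, now]."""
    start = now - window
    return sum(1 for t in rebirth_times if start < t <= now)


now = datetime(2024, 1, 1, 12, 0, 0)
rebirth_times = [
    datetime(2024, 1, 1, 11, 10, 0),
    datetime(2024, 1, 1, 11, 50, 0),
    datetime(2024, 1, 1, 11, 55, 0),
    datetime(2024, 1, 1, 11, 59, 0),
]
print(rebirths_in_window(rebirth_times, now, timedelta(minutes=15)))  # 3
```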

...

  • Triggered by user
    • A rebirth request for the Edge Node was manually triggered at MQTT Engine
  • Message sequence number error
    • The message sequence number received was not in order 

      Tip

      Common cause:

      • In a Sparkplug compliant system, the combination of Group ID and Edge Node ID (the Sparkplug Edge Node Descriptor) that identifies the Edge Node must be unique. If there are two or more Transmitters with the same Sparkplug Edge Node Descriptor, data from these transmitters will be sent with the same topic, so the next message sequence number expected by the MQTT client will be incorrect. As a result, the MQTT Client will mark the data as stale and request a rebirth from the transmitter. If you have multiple MQTT Clients subscribing to the namespace, this will also likely create a firestorm of rebirth requests across the system.


  • Received a message for an edge node that is offline
    • An NDATA message was received from an Edge Node that is marked as Offline at MQTT Engine.

      Tip

      Common causes:

      • A BIRTH message was not published by the Edge Node or was not received by MQTT Engine after marking the Edge Node Offline
      • There are two or more Transmitters with the same Sparkplug Edge Node Descriptor. A DEATH message has been received for one Edge Node, marking the Edge Node Offline, and subsequent data is received from another Edge Node with the same Sparkplug Edge Node Descriptor.


  • Reordering sequence numbers
    • After receiving an out-of-order message, MQTT Engine waited the specified number of milliseconds for the expected message to arrive, but the message with the expected sequence number was still not received.

      Tip

      Common cause:

      • The reordering timeout at MQTT Engine, configured to support clustered MQTT servers that do not guarantee in-order delivery of QoS 0 messages, is not long enough for MQTT Engine to receive the message with the next sequence number.


  • Unable to set Edge Node online
    • MQTT Engine failed to set the Edge Node online. Review the logs for the exact cause.
  • Failed to find metric name from alias
    • The alias in the DATA message did not match any alias previously published in the BIRTH message
  • UDT tag doesn't exist
    • A UDT tag received in a DATA message was not found in the previously published BIRTH messages
  • Unknown metric
    • A DDATA message contains a metric not previously included in a BIRTH message
  • DDATA before BIRTH
    • A DDATA message was received from an Edge Node Device that is marked as Offline at MQTT Engine

      Tip

      Common causes:

      • A DBIRTH message was not published by the Edge Node or was not received by MQTT Engine after marking the Edge Node Device Offline
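Two of the reasons above, the sequence number error and data arriving for an offline edge node, can be reproduced with a small model of a Sparkplug session. This is a simplified sketch and not MQTT Engine's implementation: it assumes sequence numbers run from 0 to 255 and then wrap, as defined by the Sparkplug specification, and it ignores the reordering timeout. The class and method names are hypothetical.

```python
SEQ_MODULUS = 256  # Sparkplug sequence numbers run 0..255, then wrap to 0


class EdgeNodeSession(object):
    """Tracks one Sparkplug Edge Node Descriptor (Group ID + Edge Node ID)."""

    def __init__(self):
        self.online = False
        self.expected_seq = None

    def on_nbirth(self, seq):
        self.online = True
        self.expected_seq = (seq + 1) % SEQ_MODULUS

    def on_ndeath(self):
        self.online = False

    def on_ndata(self, seq):
        """Return None if the message is accepted, else a rebirth reason."""
        if not self.online:
            return "Received a message for an edge node that is offline"
        if seq != self.expected_seq:
            return "Message sequence number error"
        self.expected_seq = (seq + 1) % SEQ_MODULUS
        return None


# Two transmitters, A and B, misconfigured with the same descriptor.
session = EdgeNodeSession()
session.on_nbirth(0)            # A publishes NBIRTH
first = session.on_ndata(1)     # A publishes NDATA: accepted
session.on_nbirth(0)            # B publishes NBIRTH on the same topic
second = session.on_ndata(2)    # A continues with seq 2, but 1 is expected
session.on_ndeath()             # B disconnects
third = session.on_ndata(3)     # A is still publishing

print(first)   # None
print(second)  # Message sequence number error
print(third)   # Received a message for an edge node that is offline
```

In this model each returned reason would correspond to one rebirth request, which is why a duplicated Sparkplug Edge Node Descriptor tends to show up as a steadily climbing Rebirth Count.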


MQTT Engine Message Queues
MQTT Engine queues messages to a set of internal queues fronting thread pools. Under typical conditions and configuration, there is one pool/queue per Sparkplug edge node. Since Sparkplug messages must be processed in order, each of these thread pools contains only a single thread. Under high load or message volume, these thread pools can get backed up, and this is visible in the queue size: if the queue size is high, messages are backed up waiting to be processed. Monitoring these tags will help to identify any backup in MQTT Engine Sparkplug message processing.

Tags to monitor - MQTT Engine message queues

  • [MQTT Engine]Engine Info/Queued Messages
    • A dataset showing the current message count for each message queue
  • [MQTT Engine]Engine Info/Queued Messages Total
    • The count of all current queued messages
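A simple check on these values might look like the sketch below. It assumes the rows of the Queued Messages dataset have already been converted into a dictionary mapping a queue name to its current message count; the dataset's actual column layout is not described here, so the dictionary shape, the queue names, and the threshold are all hypothetical.

```python
def summarize_queues(queue_sizes, warn_threshold):
    """Return (total queued messages, names of queues at or above the threshold)."""
    total = sum(queue_sizes.values())
    backed_up = sorted(
        name for name, size in queue_sizes.items() if size >= warn_threshold
    )
    return total, backed_up


queue_sizes = {
    "Group A/Node 1": 0,
    "Group A/Node 2": 1250,
    "Group B/Node 7": 3,
}
total, backed_up = summarize_queues(queue_sizes, 1000)
print(total)      # 1253
print(backed_up)  # ['Group A/Node 2']
```

Because each queue is drained by a single thread, one backed-up queue usually points at a single busy edge node, while a high total spread across many queues suggests overall load on the gateway.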