PCE Health
The Public Experimental Health Check API displays health information about a 4x2 Supercluster or a PCE virtual appliance.
This API is only available for Illumio Core PCE installed on-premises and is not available for Illumio Cloud customers.
About the PCE Health API
With this API, you can see the following health information:
- How long the PCE has been running, its runlevel, and overall health (normal, warning, or error).
- Each node hostname, IP address, uptime, runlevel, and whether the PCE software is running properly.
- Each node type (core or data), and which data node is the database slave and which is the master. The replication delay for the database slave is also displayed.
- Information about PCE service alerts, such as the number of degraded or failed services in the cluster, so you can see where service failures have occurred.
The new health API schema (health_definitions_schema.json) is designed to be consumed by the UI as well as the API end user. Metrics are listed in two sections: one for each individual node, and a general metrics section.
Health Metrics
Application-level metrics have been added to the PCE Health API to allow for proactive monitoring and to provide insight into system performance. These metrics cover all core PCE subsystems: core application (VEN heartbeats, policy), PCE platform (database health and disk latency), and data (traffic pipeline and storage).
You can monitor the PCE health and performance by looking at nodes, clusters, database replication, and other services. Monitoring can be performed in different ways, such as using the Health page in the PCE's web console UI, messages in the PCE syslog, and the Illumio REST API.
While the periodic syslog messages can be used for historical monitoring (time series), the API uses predefined and customizable thresholds to flag metric values as warning or critical.
The following metric properties are used by the Health API:
- metric (name, value, and units)
  - An example for metric: { metric: "Disk Usage", last_updated: "2020-03-12T08:46:25-07:00", entries: [...] }
  - An example for values: [{ status: "normal", name: "usage", type: "percent", value: 12 }, { name: "disk", value: "persistent" }]
- last_updated timestamp (not available in the UI)
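As a rough illustration of how these properties fit together, the following Python sketch scans a list of metric objects and reports every value whose status is not normal. The first entry reuses the Disk Usage example above. The second entry, the nesting of each value list under a `values` key, and the helper name `find_abnormal` are assumptions made for this example, not part of the documented schema.

```python
from typing import Any

# Sample data modeled on the Disk Usage example. The "values" key and the
# second (warning) entry are hypothetical and exist only for illustration.
SAMPLE_METRICS = [
    {
        "metric": "Disk Usage",
        "last_updated": "2020-03-12T08:46:25-07:00",
        "entries": [
            {"values": [
                {"status": "normal", "name": "usage", "type": "percent", "value": 12},
                {"name": "disk", "value": "persistent"},
            ]},
            {"values": [
                {"status": "warning", "name": "usage", "type": "percent", "value": 91},
                {"name": "disk", "value": "ephemeral"},
            ]},
        ],
    }
]


def find_abnormal(metrics: list[dict[str, Any]]) -> list[tuple[str, str, Any, str]]:
    """Return (metric, value name, value, status) for each non-normal value."""
    findings = []
    for metric in metrics:
        for entry in metric.get("entries", []):
            for item in entry.get("values", []):
                status = item.get("status")
                # Values without a status (such as the disk label) are descriptive only.
                if status is not None and status != "normal":
                    findings.append((metric["metric"], item["name"], item["value"], status))
    return findings


if __name__ == "__main__":
    for name, value_name, value, status in find_abnormal(SAMPLE_METRICS):
        print(f"{name}: {value_name}={value} is {status}")
```

If the real response nests values differently, only the innermost loop needs to change.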
If you want to enable or disable individual metrics, use the CLI commands described in Configurable Thresholds for Health Metrics. You can also use the configurable thresholds technique to turn off all metrics.
Health metric schema
The existing UI schema is extended to allow generic metrics in two sections: the node section and general section.
The overall metric schema may look like this:
{
  "metric": {
    "description": "One or more entries encompassing the metric.",
    "type": "object",
    "required": [
      "metric"
    ],
    "properties": {
      "metric": {
        "type": "string"
      },
      "entries": {
        "type": "array",
        "items": {
          "anyOf": [
            {
              "$ref": "#/definitions/cluster"
            },
            {
              "$ref": "#/definitions/entry"
            }
          ]
        }
      },
      "last_updated": {
        "type": "string",
        "format": "date-time"
      },
      "display": {
        "description": "An optional hint for the UI to display the data in a specific form.",
        "type": "string",
        "enum": [
          "table",
          "join"
        ]
      }
    }
  }
}
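To show what the schema constrains in practice, here is a minimal Python sketch that checks a metric object against the top-level rules visible above: `metric` is required and must be a string, `entries` must be an array, `last_updated` must be a date-time string, and `display` must be `table` or `join`. The function name `validate_metric` is hypothetical. The sketch does not validate the items in `entries`, because the `cluster` and `entry` definitions are not shown in this section.

```python
from datetime import datetime

# Allowed values for the optional "display" hint, taken from the schema's enum.
ALLOWED_DISPLAY = ("table", "join")


def validate_metric(obj: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if "metric" not in obj:
        problems.append("missing required property 'metric'")
    elif not isinstance(obj["metric"], str):
        problems.append("'metric' must be a string")
    if "entries" in obj and not isinstance(obj["entries"], list):
        problems.append("'entries' must be an array")
    if "last_updated" in obj:
        try:
            # fromisoformat accepts offsets such as -07:00; it is an
            # approximation of the JSON Schema date-time format, not a full check.
            datetime.fromisoformat(obj["last_updated"])
        except (TypeError, ValueError):
            problems.append("'last_updated' must be a date-time string")
    if "display" in obj and obj["display"] not in ALLOWED_DISPLAY:
        problems.append("'display' must be 'table' or 'join'")
    return problems
```

For production use, a full JSON Schema validator run against health_definitions_schema.json would be a more complete choice than hand-written checks like these.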
Configure Thresholds for Health Metrics
Configure the thresholds that define the normal, warning, and critical status for each health metric.
Use the new command illumio-pce-env metrics --write to adjust these thresholds.
See the PCE Administration Guide for more information.
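To make the idea of threshold-driven status concrete, here is a minimal Python sketch that maps a metric value to normal, warning, or critical. The function name `classify`, the threshold numbers in the usage lines, and the comparison rule (a value at or above a threshold triggers that status) are all assumptions made for illustration; this document does not describe how the PCE evaluates thresholds internally.

```python
def classify(value: float, warning: float, critical: float) -> str:
    """Map a metric value to a status using two ascending thresholds.

    Assumes that higher values are worse, as with disk usage percentages.
    """
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "normal"


if __name__ == "__main__":
    # Hypothetical thresholds: warn at 80 percent, critical at 95 percent.
    for usage in (12, 85, 97):
        print(usage, classify(usage, warning=80, critical=95))
```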
PCE Health API Method
Functionality | HTTP | URI |
---|---|---|
Get the health information of the PCE Cluster and its nodes | GET | [api_version]/health |
Check PCE Health
URI to Check PCE Health
GET [api_version]/health
Curl Command to Check PCE Health
curl -i -X GET https://pce.my-company.com:8443/api/v2/health -H 'Accept: application/json' -u $KEY:'TOKEN'
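The same request can be sketched in Python using only the standard library. The host name and the /api/v2/health path come from the curl example above; the function names `build_health_request` and `fetch_health` are hypothetical, and the sketch assumes the API key and token are sent as HTTP Basic credentials, which is what curl's -u option does.

```python
import base64
import json
import urllib.request


def build_health_request(base_url: str, api_key: str, token: str) -> urllib.request.Request:
    """Build a GET request for the health endpoint with Basic authentication."""
    credentials = base64.b64encode(f"{api_key}:{token}".encode()).decode()
    request = urllib.request.Request(f"{base_url}/api/v2/health", method="GET")
    request.add_header("Accept", "application/json")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


def fetch_health(request: urllib.request.Request):
    """Send the request and decode the JSON body (requires network access)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `fetch_health(build_health_request("https://pce.my-company.com:8443", key, token))` would then return the decoded response, provided the PCE is reachable and its certificate is trusted by the client.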
PCE Health Response Properties
Property | Description | Type | Required |
---|---|---|---|
status | Current health status of the PCE. Possible values: | String | Yes |
type | Type of the PCE: | String | |
fqdn | The fully qualified domain name (FQDN) of the PCE. | String | |
available_seconds | The length of time that this PCE has been available, measured in seconds. | Number | |
notifications | Health warnings related to the PCE, which contain the following properties: | Array, String, String, String | Yes, Yes, No |
listen_only_mode_enabled_at | Timestamp at which PCE Listen Only Mode was enabled. For information about enabling or disabling listen-only mode for a PCE, see the PCE Administration Guide. | String or Null | |
upgrade_pending | | | |
nodes | The nodes that comprise your PCE cluster. For each node of your PCE, this API call returns the following properties: | Array, String or Null, String or Null, Number or Null, Number or Null, Number or Null, Object, String, Number, Array, String, Array | Yes, Yes, No, No, No, No, Yes, Yes, Yes, Yes, Yes, Yes, No, Yes |
network | PCE 2x2 or 4x2 deployment: For a PCE 2x2 or 4x2 deployment, this property also indicates which data node in your PCE is the database master and which is the database slave. Sub-properties include: Supercluster deployment: If you have deployed a PCE Supercluster, the PCE health call also returns information about the database replication between the PCE you are currently logged into and all other PCEs in the Supercluster. In a Supercluster deployment, the security policy provisioned on the leader is replicated to all other PCEs in the Supercluster. Additionally, all PCEs in the Supercluster (leader and members) replicate copies of each workload's context, such as IP addresses, to all other PCEs in the Supercluster. Other properties include: | Object, Object, String, Object, String, String, String, Object, String, Number | Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes |
metric | One or more entries encompassing the metric. | Object, String, Array, String, String | No |
generated_at | The timestamp of when the PCE information was generated. Format: date-time | String or Null | Yes |
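As an example of consuming these properties, the following Python sketch condenses one PCE health object into a single status line. It reads only the top-level properties listed in the table (`status`, `fqdn`, `available_seconds`, `nodes`, `notifications`). The function name `summarize_health` and the sample values are hypothetical, and the sketch assumes the response has already been decoded into a dictionary describing a single PCE.

```python
from datetime import timedelta


def summarize_health(pce: dict) -> str:
    """Build a one-line summary from the top-level health properties."""
    # available_seconds is documented as a number of seconds of availability.
    uptime = timedelta(seconds=pce.get("available_seconds", 0))
    nodes = pce.get("nodes", [])
    notifications = pce.get("notifications", [])
    return (
        f"{pce.get('fqdn', 'unknown')}: status={pce['status']}, "
        f"uptime={uptime}, nodes={len(nodes)}, notifications={len(notifications)}"
    )


if __name__ == "__main__":
    # Hypothetical sample; node details are left empty because their
    # sub-properties are not listed in this section.
    sample = {
        "status": "normal",
        "fqdn": "pce.example.com",
        "available_seconds": 90061,
        "nodes": [{}, {}, {}, {}],
        "notifications": [],
    }
    print(summarize_health(sample))
```

`status` is read with direct indexing because the table marks it as required, while the other properties fall back to defaults when absent.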
Configuring Health Metrics
Turn off health metrics
If you want to enable or disable individual metrics, use the CLI commands described in Configurable Thresholds for Health Metrics. You can also use the configurable thresholds technique to turn off all metrics.
Configure thresholds for health metrics
Configure the thresholds that define the normal, warning, and critical status for each health metric.
Use the command illumio-pce-env metrics --write to adjust these thresholds.
See the PCE Administration Guide for more information.