VEN-to-PCE Communication

This topic discusses how the VEN communicates with the PCE for both Illumio Core Cloud customers and Illumio Core On-Premises customers.

Details about VEN-to-PCE Communication

By default, the VEN communicates with a PCE installed in the customer's data center (On-Premises) over the following ports:

  • Port 8443 using HTTPS for REST calls.
  • Port 8444 using TLS-over-TCP for the lightning-bolt channel.

The VEN, by default, communicates with the Illumio Core Cloud PCE over the following ports:

  • Port 443 for REST calls.
  • Ports 443/444 for lightning-bolt channels.
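
For quick reference, the default ports listed above can be captured in a small lookup, for example when preparing firewall allow-lists between workloads and the PCE. The following is a minimal sketch only: the names DEFAULT_PORTS and ports_for are hypothetical and are not part of any Illumio interface.

```python
# Default VEN-to-PCE ports as listed in this topic. The dictionary layout and
# the function name are illustrative assumptions, not an Illumio API.
DEFAULT_PORTS = {
    "on_premises": {"rest": (8443,), "lightning_bolt": (8444,)},
    "cloud": {"rest": (443,), "lightning_bolt": (443, 444)},
}


def ports_for(deployment: str) -> set[int]:
    """Return every default port a VEN must reach for a deployment type."""
    channels = DEFAULT_PORTS[deployment]
    return {port for ports in channels.values() for port in ports}


if __name__ == "__main__":
    for name in DEFAULT_PORTS:
        print(name, sorted(ports_for(name)))
```

These are defaults only; a deployment that uses non-default ports would need its own values.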

The VEN uses Transport Layer Security (TLS) to connect to the PCE. The PCE certificate must be trusted by the VEN before communication can occur.

The VEN sends the following details to the PCE:

  • Regular heartbeat with the latest hostname and other properties of the workload
  • Traffic log
  • Network interfaces
  • Processes
  • Open ports
  • Interactive users (Windows only)
  • Container workload information (C-VEN only)

The VEN receives the following details from the PCE:

  • Firewall policy
  • Lightning bolts/heartbeat responses with an action to perform, such as sending a support report

VEN Connectivity

  • Online: The workload is connected to the network and can communicate with the PCE.
  • Offline: The workload is not connected to the network and cannot communicate with the PCE.
  • Suspended: The VEN is in the suspended state, and any rules programmed into the workload's iptables (including custom iptables rules) or Windows Filtering Platform firewalls are removed completely. No Illumio-related processes are running on the workload.

The VEN offers limited IPv6 policy support. On a per-organization basis, the PCE can send an allow-all-ipv6 or block-all-ipv6 policy that the VEN enforces.

Communication Frequency

The following table shows the frequency of communications to the PCE for common VEN operations. The PCE Administration Guide includes more details about these intervals and their effects.

Function | Frequency | Notes
Firewall policy updates | Real-time if lightning bolts are enabled. | If lightning bolts are disabled or the channel is not functional, policy updates are communicated to the VEN by a heartbeat action.
Active service reporting | See note. | AgentManager performs all active service reporting tasks. At start-up, a snapshot of processes and ports is sent to the PCE. Every 24 hours, a snapshot of all listening processes is taken and sent to the PCE.
Interface reports and changes | Event driven. | Only if there are changes to the interfaces; otherwise, no data is sent.
Traffic flow log | Every 10 minutes. | The VEN checks if there are logs, and if so, sends them to the PCE. If the PCE is inaccessible, the VEN retains flow summaries for the previous 24 hours and purges logs older than 24 hours, starting with the oldest, at every 24-hour mark. When logs are purged, the VEN locally logs an alert, which is posted to the PCE as an event when connectivity is restored.
Heartbeat | Every 5 minutes. | If the PCE does not receive three consecutive heartbeats, an event is written to the PCE's event log. See also VEN Heartbeats and Lost Agents.
Dead-peer interval | Configurable. | Default is 60 minutes (or 12 heartbeats). See also VEN Offline Timers and Isolation.
VEN tampering detection | Within a few seconds on Windows and Linux. | For more information, see Host Firewall Tampering Protection.
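
The 24-hour retention behavior described for the traffic flow log can be illustrated with a short sketch. This is not the VEN's implementation: the helper name purge_old_logs and the (timestamp, record) tuple layout are assumptions made for illustration, since the VEN's local storage format is not described in this topic.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=24)  # retention window stated for the traffic flow log


def purge_old_logs(logs, now):
    """Split (timestamp, record) pairs into kept and purged lists.

    Hypothetical helper: entries older than the retention window are purged,
    everything else is kept for delivery once the PCE is reachable again.
    """
    cutoff = now - RETENTION
    kept = [entry for entry in logs if entry[0] >= cutoff]
    purged = [entry for entry in logs if entry[0] < cutoff]
    return kept, purged


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, 0)
    logs = [(now - timedelta(hours=25), "old"), (now - timedelta(hours=1), "recent")]
    kept, purged = purge_old_logs(logs, now)
    print(len(kept), "kept;", len(purged), "purged")
```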

VEN Heartbeats and Lost Agents

The VEN sends a heartbeat message every five minutes to the PCE to inform the PCE that it is up and running. If the VEN fails to send a heartbeat, check the workload where the VEN is installed and investigate any connectivity issues. If the VEN continues to fail to send a heartbeat, it eventually is marked Offline, which means it can no longer communicate with the PCE or other managed workloads.

PCE down or network issue and the VEN degraded state

  • If the VEN cannot connect to the PCE either because the PCE is down or because of a network issue, the VEN continues to enforce the last-known-good policy while it tries to reconnect with the PCE.
  • After missing three heartbeats, the VEN enters the degraded state. In the degraded state, the VEN ignores all the asynchronous commands received as lightning bolts from the PCE, except the commands for software upgrades and support reports.
  • After connectivity to the PCE is restored, the VEN comes out of the degraded state after three successful heartbeats.
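
The entry and exit rules for the degraded state can be pictured as a small counter-based state tracker. The sketch below is illustrative only: the class name HeartbeatTracker is hypothetical, and it assumes that both the three missed heartbeats and the three successful heartbeats are counted consecutively, which this topic does not state explicitly.

```python
class HeartbeatTracker:
    """Tracks heartbeat results and a degraded flag (illustrative sketch)."""

    THRESHOLD = 3  # three misses to enter, three successes to exit

    def __init__(self):
        self.degraded = False
        self._missed = 0
        self._succeeded = 0

    def record(self, success: bool) -> bool:
        """Record one heartbeat attempt and return the degraded flag."""
        if success:
            self._succeeded += 1
            self._missed = 0  # assumption: a success resets the miss count
            if self.degraded and self._succeeded >= self.THRESHOLD:
                self.degraded = False
        else:
            self._missed += 1
            self._succeeded = 0  # assumption: a miss resets the success count
            if self._missed >= self.THRESHOLD:
                self.degraded = True
        return self.degraded


if __name__ == "__main__":
    tracker = HeartbeatTracker()
    for ok in (False, False, False, True, True, True):
        print(ok, tracker.record(ok))
```

With the default five-minute heartbeat, three misses correspond to roughly 15 minutes without contact before the degraded state begins.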

Failed authentication and the VEN minimal state

  • If the VEN enters the degraded state because of failed authentications, the VEN enters a state called minimal. In the minimal state, the VEN only attempts to connect with the PCE every four hours through a heartbeat.
  • If the authentication failure was temporary, the VEN exits the minimal state after its first successful connection to the PCE. Whenever the VEN enters the minimal state, it stops the VTAP service. VTAP is then restarted when the VEN exits the minimal state.
  • If Kerberos authentication is used, the VEN attempts to refresh the agent token with a new Kerberos ticket before sending a heartbeat. If the authentication error is not recovered after four hours, the VEN sends a lost-agent message to the PCE, which then logs a message in the Organization Events. The message informs the user that the VEN needs to be uninstalled or reinstalled manually on this workload.

VEN Offline Timers and Isolation

When the VEN on a workload is stopped, the VEN makes a "best effort" REST API goodbye call to the PCE. After a delay, specified by the "workload goodbye timer" (a default of 15 minutes), the PCE marks the workload as offline and removes it from the policy.

If the REST API call (goodbye) fails, or if the workload goes offline abruptly (for example, due to a power outage), the PCE stops receiving heartbeats from the workload. After the length of time specified by the value configured in the PCE web console Settings > Offline Timers, the PCE marks the workload as offline and recomputes policies for the peer workloads to isolate the offline workload. If this value has not been set, the default is 60 minutes, or 12 heartbeats.
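
The relationship between an offline timer and the heartbeat count is simple arithmetic: the timer divided by the five-minute heartbeat interval. The following sketch shows it; the function name heartbeats_in is hypothetical.

```python
HEARTBEAT_INTERVAL_MIN = 5  # VEN heartbeat interval stated in this topic


def heartbeats_in(timer_minutes: int) -> int:
    """Number of whole heartbeat intervals that fit in an offline timer."""
    return timer_minutes // HEARTBEAT_INTERVAL_MIN


if __name__ == "__main__":
    print(heartbeats_in(60))  # default offline timer of 60 minutes
```

For the default 60-minute timer this yields the 12 heartbeats mentioned above.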

Sampling Mode for VENs

If the VEN receives a sustained high rate of traffic per second from many individual connections, the VEN enters Sampling Mode to reduce the load. Sampling Mode is a protection mechanism that keeps the VEN from contributing to excessive CPU consumption. In Sampling Mode, not every flow is reported. Instead, flows are periodically sampled and logged.

After CPU usage on the VEN decreases, Sampling Mode is disabled and each connection is again reported to the PCE. The VEN enters and exits Sampling Mode automatically, depending on its load.

Linux nf_conntrack_tcp_timeout_established

For VENs installed on Linux workloads, the VEN relies on conntrack to manage the nf_conntrack_tcp_timeout_established variable.

By default, as soon as the VEN is installed, it sets the nf_conntrack_tcp_timeout_established value to eight hours (28,800 seconds). This timeout helps manage workload memory by removing unused connections from the conntrack table, which improves performance.

If you change this setting via sysctl, it is reverted the next time the workload is rebooted or the next time the VEN's configuration file is read.
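
To check the value currently in effect on a Linux workload, the setting can be read from procfs. The following is a minimal sketch: it assumes the conntrack module is loaded so that the standard procfs entry exists, and the helper names read_timeout and matches_ven_default are hypothetical.

```python
from pathlib import Path

# Standard procfs location of this sysctl on Linux when conntrack is loaded.
SYSCTL_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established")
VEN_DEFAULT_SECONDS = 8 * 60 * 60  # eight hours = 28,800 seconds


def read_timeout(path: Path = SYSCTL_PATH) -> int:
    """Read the current timeout, in seconds, from the given sysctl file."""
    return int(path.read_text().strip())


def matches_ven_default(path: Path = SYSCTL_PATH) -> bool:
    """Return True if the current value equals the VEN default described above."""
    return read_timeout(path) == VEN_DEFAULT_SECONDS


if __name__ == "__main__":
    if SYSCTL_PATH.exists():
        print(read_timeout(), matches_ven_default())
    else:
        print("conntrack sysctl not available on this host")
```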

Wireless Connections and VPNs

The VEN supports wireless connections when it is installed on endpoints in Illumio Core.

For more information about installing the VEN on an endpoint, and supporting a wireless network connection, see Single Pane of Glass Guide.

NOTE: Wireless network support is only available for endpoints in Illumio Core. It is not available for other supported server types, such as bare-metal servers, virtual machines (VMs), or container hosts.

Show Amount of Data Transfer

The 'show amount of data transfer' capability on the PCE is a preview feature available with the 20.2.0 release. The PCE reports the amount of data transferred into and out of workloads and applications in a data center. The number of bytes sent and the number of bytes received by the provider of an application are reported separately. These values can be seen in traffic flow summaries streamed out of the PCE. This capability can be enabled on a per-workload basis on the Workload page. It can also be enabled in the pairing profile so that workloads are paired directly into this mode.

After the feature is enabled, the VEN starts reporting the number of bytes transferred over the connections. The PCE collects this data, adds relevant information such as labels, and sends the traffic flow summaries out of the PCE.

The direction reported in the flow summary is from the viewpoint of the provider of the flow.

  • Destination Total Bytes Out (dst_tbo): Number of bytes transferred out of provider (Connection Responder)
  • Destination Total Bytes In (dst_tbi): Number of bytes transferred in to provider (Connection Responder)

The number of bytes includes:

  1. L3 and L4 header sizes of each packet (IP Header and TCP Header)
  2. Sizes of multiple headers that may be included in communication (when SecureConnect is enabled)
  3. Retransmitted packets.

    The bytes transferred in the packets of a connection are included in the measurement. This is similar to how networking products such as firewalls, span-port measurement tools, and other network traffic measurement tools measure traffic.

Term | Name | Description
dst_tbi | Destination Total Bytes In | Total bytes received till now by the destination over the flows included in this flow-summary in the latest sampled interval. This is the same as bytes sent by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder.
dst_tbo | Destination Total Bytes Out | Total bytes sent till now by the destination over the flows included in this flow-summary in the latest sampled interval. This is the same as bytes received by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder.
dst_dbi | Destination Delta Bytes In | Number of bytes received by the destination in the latest sampled interval, over the flows included in this flow-summary. This is the same as bytes sent by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder.
dst_dbo | Destination Delta Bytes Out | Number of bytes sent by the destination in the latest sampled interval, over the flows included in this flow-summary. This is the same as bytes received by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder.
interval_sec | Time Interval in Seconds | Duration of latest sampled interval over which the above metrics are valid.
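
Because the delta fields cover exactly the latest sampled interval, dividing a delta by interval_sec gives an average transfer rate for that interval. The sketch below shows the arithmetic; the function name average_rate is hypothetical, and the record values are invented purely for illustration rather than taken from a real flow summary.

```python
def average_rate(delta_bytes: int, interval_sec: float) -> float:
    """Average bytes per second over the latest sampled interval."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return delta_bytes / interval_sec


# Hypothetical flow-summary fields; the values are made up for illustration.
summary = {"dst_dbi": 6000, "dst_dbo": 120000, "interval_sec": 600}

inbound = average_rate(summary["dst_dbi"], summary["interval_sec"])
outbound = average_rate(summary["dst_dbo"], summary["interval_sec"])
print(f"in: {inbound} B/s, out: {outbound} B/s")
```

Both rates are from the provider's viewpoint, matching the direction convention of the flow summary.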

Connection State | Description
A | Active: The connection is still active at the time the record was posted. Typically observed with long-lived flows on source and destination side of communication.
T | Timed Out: Flow does not exist any more. It has timed out. Typically observed on destination side of communication.
C | Closed: Flow does not exist any more. It has been closed. Typically observed on source side of communication.
S | Snapshot: Connection was active at the time VEN sampled the flow. Typically observed when the VEN is in Idle state.
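
When consuming flow summaries, it can help to map the single-letter state codes to names and to note which states are documented as carrying byte counters. The sketch below is illustrative: the names ConnectionState and has_byte_counts are hypothetical. Treating 'S' summaries as lacking byte counters is an inference from the field descriptions, which list only 'A', 'C', and 'T'.

```python
from enum import Enum


class ConnectionState(Enum):
    """Single-letter connection state codes from the table above."""

    ACTIVE = "A"
    TIMED_OUT = "T"
    CLOSED = "C"
    SNAPSHOT = "S"


# The byte counter fields are described as present in 'A', 'C', and 'T'
# flow summaries. Excluding 'S' here is an inference, not a documented rule.
STATES_WITH_BYTES = {
    ConnectionState.ACTIVE,
    ConnectionState.CLOSED,
    ConnectionState.TIMED_OUT,
}


def has_byte_counts(code: str) -> bool:
    """Return True if a summary with this state code should carry byte counters."""
    return ConnectionState(code) in STATES_WITH_BYTES


if __name__ == "__main__":
    for state in ConnectionState:
        print(state.value, state.name, has_byte_counts(state.value))
```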