VEN-to-PCE Communication

This topic discusses how the VEN communicates with the PCE for Illumio Xpress customers.

Details about VEN-to-PCE Communication

The VEN, by default, communicates with the Illumio Xpress PCE over the following ports:

  • Port 443 for REST calls.
  • Ports 443/444 for lightning-bolt channels.

The VEN uses Transport Layer Security (TLS) to connect to the PCE. The VEN must trust the PCE certificate before any communication can occur.
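The trust requirement above can be checked from a workload with a short script. This is a minimal sketch, not Illumio tooling: the helper names are hypothetical, the host name in the usage note is a placeholder, and it assumes the PCE certificate chains to a CA in the workload's system trust store.

```python
import socket
import ssl


def build_context() -> ssl.SSLContext:
    """Return a TLS context that verifies the server certificate and host name."""
    # create_default_context() loads the system CA store and turns on
    # certificate verification and host name checking.
    return ssl.create_default_context()


def pce_is_trusted(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a verified TLS handshake with host:port succeeds."""
    context = build_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except (ssl.SSLError, OSError):
        # Covers untrusted certificates as well as unreachable hosts.
        return False
```

Calling `pce_is_trusted("pce.example.com")` returns False both when the certificate is not trusted and when the host cannot be reached, so a False result still needs a separate connectivity check.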

The VEN sends the following details to the PCE:

  • Regular heartbeat with the latest hostname and other properties of the workload
  • Traffic log
  • Network interfaces
  • Processes
  • Open ports
  • Interactive users (Windows only)

The VEN receives the following details from the PCE:

  • Firewall policy
  • Lightning bolts/heartbeat responses with action to perform, such as sending a support report

VEN Connectivity

  • Online: The workload is connected to the network and can communicate with the PCE.
  • Offline: The workload is not connected to the network and cannot communicate with the PCE.
  • Suspended: The VEN is in the suspended state. Any rules programmed into the workload's iptables (including custom iptables rules) or Windows Filtering Platform (WFP) firewall are removed completely, and no Illumio Xpress-related processes run on the workload.

The VEN offers limited IPv6 policy support. On a per-organization basis, the PCE can send an allow-all-ipv6 or block-all-ipv6 policy for the VEN to enforce.

Communication Frequency

The following table shows the frequency of communications to the PCE for common VEN operations.

| Function | Frequency | Notes |
|---|---|---|
| Firewall policy updates | Real-time if lightning bolts are enabled. | If lightning bolts are disabled or the channel is not functional, policy updates are communicated to the VEN by a heartbeat action. |
| Active service reporting | See note. | AgentManager performs all active service reporting tasks. At start-up, a snapshot of processes and ports is sent to the PCE. Every 24 hours, a snapshot of all listening processes is taken and sent to the PCE. |
| Interface reports and changes | Event driven. | Sent only if there are changes to the interfaces; otherwise, no data is sent. |
| Traffic flow log | Every 10 minutes. | The VEN checks if there are logs, and if so, sends them to the PCE. If the PCE is inaccessible, the VEN retains flow summaries for the previous 24 hours and, at every 24-hour mark, purges logs that are older than 24 hours, oldest first. When logs are purged, the VEN locally logs an alert, which is posted to the PCE as an event when connectivity is restored. |
| Heartbeat | Every 5 minutes. | If the PCE does not receive three consecutive heartbeats, an event is written to the PCE's event log. See also VEN Heartbeats and Lost Agents. |
| Dead-peer interval | Configurable. | Default is 60 minutes (or 12 heartbeats). See also VEN Offline Timers and Isolation. |
| VEN tampering detection | Within a few seconds on Windows and Linux. | For more information, see Host Firewall Tampering Protection. |
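The intervals in the table above relate to each other by simple arithmetic. The following sketch works through it; the constant and function names are illustrative only and are not VEN or PCE settings.

```python
from datetime import timedelta

# Intervals taken from the table above.
HEARTBEAT_INTERVAL = timedelta(minutes=5)
FLOW_LOG_INTERVAL = timedelta(minutes=10)
DEAD_PEER_DEFAULT = timedelta(minutes=60)
MISSED_HEARTBEATS_FOR_EVENT = 3


def per_day(interval: timedelta) -> int:
    """Number of times an operation with the given interval runs in 24 hours."""
    return timedelta(days=1) // interval


heartbeats_per_day = per_day(HEARTBEAT_INTERVAL)        # 288
flow_log_checks_per_day = per_day(FLOW_LOG_INTERVAL)    # 144

# The default dead-peer interval expressed in heartbeats: 60 / 5 = 12.
dead_peer_in_heartbeats = DEAD_PEER_DEFAULT // HEARTBEAT_INTERVAL

# Three consecutive missed heartbeats span 15 minutes.
time_to_missed_heartbeat_event = MISSED_HEARTBEATS_FOR_EVENT * HEARTBEAT_INTERVAL
```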

VEN Heartbeats and Lost Agents

The VEN sends a heartbeat message to the PCE every five minutes to inform the PCE that it is up and running. If the VEN fails to send a heartbeat, check the workload where the VEN is installed and investigate any connectivity issues. If the VEN continues to miss heartbeats, it is eventually marked Offline, which means it can no longer communicate with the PCE or other managed workloads.

PCE down or network issue and the VEN degraded state

  • If the VEN cannot connect to the PCE either because the PCE is down or because of a network issue, the VEN continues to enforce the last-known-good policy while it tries to reconnect with the PCE.
  • After missing three heartbeats, the VEN enters the degraded state. In the degraded state, the VEN ignores all the asynchronous commands received as lightning bolts from the PCE, except the commands for software upgrades and support reports.
  • After connectivity to the PCE is restored, the VEN comes out of the degraded state after three successful heartbeats.
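The entry and exit rules for the degraded state can be modeled as a small counter. This is a sketch of the behavior described above, not the VEN's implementation; the class name and the command names in `ALLOWED_WHEN_DEGRADED` are hypothetical.

```python
from dataclasses import dataclass

THRESHOLD = 3  # consecutive heartbeats, per the text above

# Hypothetical labels for the two commands still honored when degraded.
ALLOWED_WHEN_DEGRADED = {"software_upgrade", "support_report"}


@dataclass
class HeartbeatTracker:
    degraded: bool = False
    consecutive_failures: int = 0
    consecutive_successes: int = 0

    def record(self, success: bool) -> bool:
        """Record one heartbeat result and return whether the VEN is degraded."""
        if success:
            self.consecutive_successes += 1
            self.consecutive_failures = 0
            # Leave the degraded state after three successful heartbeats.
            if self.degraded and self.consecutive_successes >= THRESHOLD:
                self.degraded = False
        else:
            self.consecutive_failures += 1
            self.consecutive_successes = 0
            # Enter the degraded state after three missed heartbeats.
            if self.consecutive_failures >= THRESHOLD:
                self.degraded = True
        return self.degraded

    def accepts(self, command: str) -> bool:
        """Return True if a lightning-bolt command would be acted on."""
        return not self.degraded or command in ALLOWED_WHEN_DEGRADED
```

For example, three `record(False)` calls in a row return False, False, True, and the tracker then needs three `record(True)` calls before it reports False again.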

Failed authentication and the VEN minimal state

  • If the degraded state is caused by failed authentication, the VEN enters a state called minimal. In the minimal state, the VEN attempts to connect with the PCE only once every four hours, through a heartbeat.
  • If the authentication failure was temporary, the VEN exits the minimal state after its first successful connection to the PCE. Whenever the VEN enters the minimal state, it stops the VTAP service. VTAP is restarted when the VEN exits the minimal state.
  • If Kerberos authentication is used, the VEN attempts to refresh the agent token with a new Kerberos ticket before sending a heartbeat. If the authentication error is not recovered after four hours, the VEN sends a lost-agent message to the PCE, which then logs a message in the Organization Events. The message informs the user that the VEN needs to be uninstalled or reinstalled manually on this workload.
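The minimal state changes two things: how often the VEN tries to reach the PCE, and whether VTAP runs. The sketch below models only those two effects; the class and method names are hypothetical, and Kerberos token refresh and the lost-agent message are left out.

```python
from dataclasses import dataclass
from datetime import timedelta

NORMAL_INTERVAL = timedelta(minutes=5)
MINIMAL_INTERVAL = timedelta(hours=4)


@dataclass
class AuthState:
    minimal: bool = False
    vtap_running: bool = True

    def enter_minimal(self) -> None:
        """Called when the degraded state was caused by failed authentication."""
        self.minimal = True
        self.vtap_running = False  # VTAP is stopped in the minimal state

    def on_heartbeat(self, authenticated: bool) -> timedelta:
        """Process one heartbeat attempt and return the delay before the next."""
        if self.minimal and authenticated:
            # The first successful connection ends the minimal state.
            self.minimal = False
            self.vtap_running = True
        return MINIMAL_INTERVAL if self.minimal else NORMAL_INTERVAL
```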

VEN Offline Timers and Isolation

When the VEN on a workload is stopped, the VEN makes a "best effort" REST API goodbye call to the PCE. After a delay, specified by the "workload goodbye timer" (a default of 15 minutes), the PCE marks the workload as offline and removes it from the policy.

If the REST API call (goodbye) fails, or if the workload goes offline abruptly (for example, due to a power outage), the PCE stops receiving heartbeats from the workload. After the length of time specified by the value configured in the PCE web console Settings > Offline Timers, the PCE marks the workload as offline and recomputes policies for the peer workloads to isolate the offline workload. If this value has not been set, the default is 60 minutes, or 12 heartbeats.
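The two timers above give the PCE two different deadlines for marking a workload offline. The following sketch shows the calculation with the default values; the function and constant names are illustrative and do not correspond to PCE settings or API fields.

```python
from datetime import datetime, timedelta
from typing import Optional

GOODBYE_TIMER = timedelta(minutes=15)   # default "workload goodbye timer"
OFFLINE_TIMER = timedelta(minutes=60)   # default offline timer (12 heartbeats)


def offline_deadline(last_heartbeat: datetime,
                     goodbye_at: Optional[datetime] = None) -> datetime:
    """Return when the PCE would mark the workload offline.

    If the goodbye call reached the PCE, the goodbye timer applies.
    Otherwise the offline timer runs from the last heartbeat received.
    """
    if goodbye_at is not None:
        return goodbye_at + GOODBYE_TIMER
    return last_heartbeat + OFFLINE_TIMER
```

With a last heartbeat at 12:00, a clean stop at 12:02 gives a deadline of 12:17, while an abrupt power loss gives 13:00.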

Sampling Mode for VENs

If the VEN receives a sustained high rate of traffic from many individual connections, the VEN enters Sampling Mode to reduce the load. Sampling Mode is a protection mechanism that keeps the VEN from adding to CPU consumption on a busy workload. In Sampling Mode, not every flow is reported. Instead, flows are periodically sampled and logged.

After CPU usage on the VEN decreases, Sampling Mode is disabled and each connection is reported to the PCE again. The VEN enters and exits Sampling Mode automatically, depending on its load.
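The general idea of load-based sampling can be sketched as follows. The source does not state the VEN's actual thresholds or sampling method, so everything here is an assumption made for illustration: the load thresholds, the 1-in-N sampling rule, and all names.

```python
class FlowReporter:
    """Decide which flows to report, sampling 1 in N while under load."""

    def __init__(self, enter_load: float = 0.9, exit_load: float = 0.5,
                 sample_every: int = 10) -> None:
        # All three values are assumed, not documented VEN settings.
        self.enter_load = enter_load
        self.exit_load = exit_load
        self.sample_every = sample_every
        self.sampling = False
        self._seen = 0

    def update_load(self, load: float) -> None:
        """Enter or leave sampling based on the current load (0.0 to 1.0)."""
        if not self.sampling and load >= self.enter_load:
            self.sampling = True
            self._seen = 0
        elif self.sampling and load <= self.exit_load:
            self.sampling = False

    def should_report(self) -> bool:
        """Return True if the current flow should be reported."""
        if not self.sampling:
            return True
        self._seen += 1
        return self._seen % self.sample_every == 0
```

Using separate entry and exit thresholds keeps the reporter from flapping between modes when the load hovers near a single cutoff.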

Linux nf_conntrack_tcp_timeout_established

For VENs installed on Linux workloads, the VEN relies on conntrack to manage the nf_conntrack_tcp_timeout_established variable.

By default, as soon as the VEN is installed, it sets the nf_conntrack_tcp_timeout_established value to eight hours (28,800 seconds). This timeout helps manage workload memory by removing unused connections from the conntrack table, which improves performance.

If you change this setting via sysctl, it is reverted the next time the workload is rebooted or the next time the VEN's configuration file is read.
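To see whether a workload currently has the VEN default, you can read the value from the standard Linux procfs location. This is a minimal sketch with hypothetical helper names; it assumes the nf_conntrack module is loaded so that the file exists.

```python
from pathlib import Path

# Standard procfs location of the kernel setting on Linux.
SYSCTL_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established")

# Eight hours, the value the text above says the VEN sets.
VEN_DEFAULT_SECONDS = 8 * 60 * 60  # 28,800


def parse_timeout(text: str) -> int:
    """Parse the file contents (a decimal number and a newline) into seconds."""
    return int(text.strip())


def is_ven_default(seconds: int) -> bool:
    """Return True if the timeout equals the eight-hour VEN default."""
    return seconds == VEN_DEFAULT_SECONDS


def current_timeout(path: Path = SYSCTL_PATH) -> int:
    """Read the live value; raises FileNotFoundError if conntrack is not loaded."""
    return parse_timeout(path.read_text())
```

On a Linux workload, `is_ven_default(current_timeout())` returning False after a manual sysctl change is expected until the next reboot or the next read of the VEN configuration file.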

Wireless Connections and VPNs

Security policy is not enforced on wireless connections or VPNs on any of the supported platforms.