Production Network Monitoring

This chapter describes the dashboards on the Production Network tab, where you can view traffic and events that occur on the production network interfaces connected to the DANZ Monitoring Fabric.

sFlow

The sFlow dashboard is displayed by default when you click the Fabric option. It summarizes information from the sFlow messages sent to the Arista Analytics server from the DANZ Monitoring Fabric controller or other sFlow agents. This dashboard provides the following panels:
  • Top Sources
  • Source Port
  • Top Destinations
  • Destination Port
  • Traffic over time
  • Flow by Filter Interface
  • Flow by Device & IF
  • Count sFlow vs. Last Wk
  • Flow QoS PHB
  • Flow Source
  • Flow Destination
  • sFlow MTU Distribution
  • Flows by Time

sFlow and VXLAN

The sFlow dashboard shows both the outer and inner flows of VXLAN packets, based on the VNI number of the VXLAN packet. To see all the inner flows of a particular VXLAN packet, first filter on VXLAN packets in the App L4 Port window to display all VXLAN packets. Identify the VXLAN packet of interest in the Flows by Time window, expand the row, and note the packet's VNI number. Then remove the VXLAN filter and filter on the VNI number instead. This shows both the outer flow of the VXLAN packet and all the inner flows associated with it.
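As an illustration, the VNI filter can be applied from the Kibana search bar or as a filter pill. A minimal sketch of the equivalent ElasticSearch term query, assuming the index stores the VNI in a field named vni (the actual field name may differ in your deployment):

{
  "query": {
    "term": { "vni": 4096 }
  }
}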

NetFlow and IPFIX

When you click NetFlow, the system displays the following dashboard:
Figure 1. Production Network > NetFlow Dashboard

To obtain NetFlow packets, you must configure the NetFlow collector interface on the Arista Analytics Node, as described in the Setting up the NetFlow Collector on the Analytics Node section.

The NetFlow dashboard summarizes information from the NetFlow messages sent to the Arista Analytics Node from the DANZ Monitoring Fabric controller or other NetFlow exporters, and provides the following panels:
  • nFlow Source IP (inner) Destination IP (outer)
  • NF over Time
  • nFlow Live L4 Ports
  • nFlow by Filter Interface
  • nFlow by Production Device & IF
  • NF by QoS PHB
  • NF by DPI App Name
  • NF Top Talkers by Flow
  • NF Detail
Note: To display the fields in the nFlow by Filter Interface panel for NetFlow V5 and IPFIX generated by the DMF Service Node appliance, the records-per-interface and records-per-dmf-interface knobs must be configured on the DANZ Monitoring Fabric controller.
Starting from the BMF-7.2.1 release, the Arista Analytics Node can also handle NetFlow V5/V9 and IPFIX traffic. All such flows are represented with a NetFlow index. From the NetFlow dashboard, apply filter rules to display specific flow information.
Figure 2. NetFlow Version 5
Figure 3. NetFlow Version 9
Figure 4. NetFlow Version 10
Note:
  1. The Arista Analytics Node cluster listens for NetFlow v9 and IPFIX traffic on UDP port 4739 and for NetFlow v5 traffic on UDP port 2055.
  2. Refer to the DANZ Monitoring Fabric 8.4 User Guide for NetFlow and IPFIX service configuration.
  3. Starting from the DMF-8.1.0 release, the Analytics Node adds support for the following Arista Enterprise-Specific Information Element IDs:
    • 1036 - AristaBscanExportReason
    • 1038 - AristaBscanTsFlowStart
    • 1039 - AristaBscanTsFlowEnd
    • 1040 - AristaBscanTsNewLearn
    • 1042 - AristaBscanTagControl
    • 1043 - AristaBscanFlowGroupId

Consolidating NetFlow V9/IPFIX Records

You can consolidate NetFlow V9 and IPFIX records by grouping records that share identifying characteristics within a configurable time window.

This reduces the number of documents published to ElasticSearch, which reduces disk usage and increases efficiency, especially for long flows, where a 40:1 consolidation ratio has been observed.

When the packet flow rate is low, Arista Networks recommends not enabling consolidation, because it may delay the publication of documents.

The following configuration sets the load-balancing policy of NetFlow/IPFIX traffic among nodes in a DMF Analytics cluster:
analytics# config
analytics(config)# analytics-service netflow-v9-ipfix
analytics(config-controller-service)# load-balancing policy source-hashing
The two policies are:
  • Source hashing: forwards packets to nodes based on a hash of their source IP address. Arista Networks recommends this policy because consolidation operations are performed on each node independently.
  • Round-robin: distributes packets equally among the nodes; use it if source hashing results in a significantly unbalanced traffic distribution. Round-robin is the default policy (a configuration sketch follows the notes below).
Note: In a cluster setup where the flow rate is higher than 10k flows/sec, Arista Networks recommends configuring round-robin to lighten the load on the leader node.
Note: This configuration doesn't apply to single-node deployments.
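If round-robin needs to be selected explicitly, a sketch follows, assuming the round-robin keyword mirrors the source-hashing form shown above:
analytics# config
analytics(config)# analytics-service netflow-v9-ipfix
analytics(config-controller-service)# load-balancing policy round-robin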

Kibana Setup

To perform the Kibana configuration, select the System > Configuration tab on the Fabric page and open the Analytics Configuration > netflow_stream panel:

Figure 5. Kibana setup
To edit the netflow stream, go to the following tab:
Figure 6. Edit the netflow stream
There are three required settings:
  • enable: enables or disables the consolidation.
  • window_size_ms: the consolidation window size in milliseconds. Adjust it according to the rate of NetFlow V9/IPFIX packets per second received by the Analytics Node. By default, the window is set to 30 seconds (30000 ms).
  • mode: There are three supported modes:
    • ip-port: consolidates records with the same source IP address, destination IP address, IP protocol number, and lower numerical value of source or destination Layer 4 port number.
    • dmf-ip-port-switch: consolidates records from common DMF Filter switches that also meet "ip-port" criteria.
    • src-dst-mac: consolidates records with the same source and destination MAC address.
      Note: Only use this mode with NetFlow V9/IPFIX templates that collect Layer 2 fields.
Starting in DMF-8.5.0, the configuration mentioned above is set under a “consolidation” JSON object as follows:
Figure 7. Consolidating Netflow
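For reference, a minimal sketch of this consolidation object, using the three settings described above (the 30000 value reflects the default 30-second window, and the mode value is one of the three supported modes):

{
  "consolidation": {
    "enable": true,
    "window_size_ms": 30000,
    "mode": "ip-port"
  }
}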

Consolidation Troubleshooting

If consolidation is enabled but does not occur, Arista Networks recommends creating a support bundle and contacting Arista TAC.

Load-balancing Troubleshooting

If there are any issues related to load-balancing, Arista Networks recommends creating a support bundle and contacting Arista TAC.

NetFlow and IPFIX Flow with Application Information

This section describes a new feature of Arista Analytics that combines NetFlow and IPFIX records containing application information with NetFlow and IPFIX records containing flow information.

This feature improves the visibility of data per application by correlating flow records with applications identified by the flow exporter.

This release supports only applications exported from Arista Networks Service Nodes. In a multi-node cluster, the load-balancing policy must be configured using the Analytics Node CLI, as shown below.

Configuration

In a multi-node Analytics cluster, set the load-balancing policy for NetFlow/IPFIX traffic to source-hashing; the round-robin policy may cause application information to be missing from the resulting flow documents in ElasticSearch.
analytics# config
analytics(config)# analytics-service netflow-v9-ipfix
analytics(config-an-service)# load-balancing policy source-hashing
Note: This configuration doesn’t apply to single-node deployments.

Kibana Configuration

To perform the Kibana configuration, select the System > Configuration tab on the Fabric page and open the Analytics Configuration > netflow_stream visualization.
Figure 8. Dashboard - Netflow stream configuration
Add the app_id configuration object.
Figure 9. Edit - Netflow stream
In the app_id configuration object, one setting is required:
  • add_to_flows: Enables or disables the merging feature.
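For reference, a minimal sketch of the app_id object with merging enabled (the surrounding structure of the netflow_stream configuration is not shown and may differ in your deployment):

{
  "app_id": {
    "add_to_flows": true
  }
}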

ElasticSearch Documents

Three fields display the application information in the final NetFlow/IPFIX document stored in ElasticSearch:

  • appScope: Name of the NetFlow/IPFIX exporter.
  • appName: Name of the application. This field is only populated if the exporter is NTOP.
  • appID: Unique application identifier assigned by the exporter.
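For illustration, a fragment of a hypothetical flow document with these fields populated; the values shown are examples only, not output from a real system:

{
  "appScope": "ntopng-exporter",
  "appName": "TLS",
  "appID": 91
}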

Troubleshooting

If merging is enabled but does not occur, Arista Networks recommends creating a support bundle and contacting Arista TAC.

Limitations
  • Some flow records may not include the expected application information when round-robin load balancing of NetFlow/IPFIX traffic is configured. Arista Networks recommends configuring the source-hashing load-balancing policy and sending all NetFlow/IPFIX traffic to the Analytics Node from the same source IP address.
  • Application information and flow records are correlated only if the application record is received before the flow record.
  • Arista Networks supports collecting application information only from the following NetFlow/IPFIX exporters: NTOP, Palo Alto Networks firewalls, and the Arista Networks Service Node.

NetFlow and sFlow Traffic Volume Upsampling

Arista Analytics can upsample traffic volumes sampled by NetFlow V9/IPFIX and sFlow. This feature improves visibility into traffic volumes by approximating the total number of bytes and packets from the samples collected by the NetFlow V9/IPFIX or sFlow sampling protocols, and it stores those approximations alongside the other statistics in ElasticSearch. The approximations are based on the flow exporter's sampling rate or on a user-provided fixed factor.
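For example, if an exporter samples at a rate of 1 in 1,000 and 15 samples totaling 18,000 bytes are collected for a flow, upsampling approximates the flow as roughly 15,000 packets and 18,000,000 bytes.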

Note: When the rate of flow packets is low or for short flows, the approximations will be inaccurate.

The DMF 8.5.0 release does not support automated approximation of total bytes and packets for NetFlow V9/IPFIX. If upsampling is needed, Arista Networks recommends configuring a fixed upsampling factor.

NetFlow/IPFIX Configuration

To perform the Kibana configuration, select the System > Configuration tab on the Fabric page and open the Analytics Configuration > netflow_stream visualization.

Figure 10. Dashboard - Netflow IPFIX configuration
Figure 11. Edit - Netflow IPFIX
There is one required setting, upsample_byte_packet_factor, with two possible options:
  • Auto: This is the default option. DMF 8.5.0 does not support automated upsampling for NetFlow V9/IPFIX; Arista Networks recommends configuring an integer if upsampling is needed.
  • Integer: Multiply the number of bytes and packets for each collected sample by this configured number.
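For reference, a minimal sketch of this setting with a fixed factor (the factor of 1000 is purely an example, and the surrounding netflow_stream structure is not shown):

{
  "upsample_byte_packet_factor": 1000
}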

sFlow Configuration

To perform the Kibana configuration, select the System > Configuration tab on the Fabric page and open the Analytics Configuration > sFlow visualization.

Figure 12. Dashboard - sFlow configuration
Figure 13. Edit - sFlow
There is one required setting, upsample_byte_packet_factor, with two possible options:
  • Auto: Approximate the number of bytes and packets for each collected sample based on the collector’s sampling rate. Auto is the default option.
  • Integer: Multiply the number of bytes and packets for each collected sample by this configured number.
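For reference, a minimal sketch of this setting with the default automated behavior (how the Auto option is represented in the JSON is an assumption and may differ in your deployment):

{
  "upsample_byte_packet_factor": "auto"
}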

Dashboards

NetFlow Dashboard
The NetFlow dashboard is on the Production Network > NetFlow tab on the Fabric page. The following visualizations display upsampled statistics:
  • NF over Time
  • NF Top Talkers by Flow
Figure 14. NF Detail visualization
The DMF 8.5.0 release adds two new columns to the NF Detail visualization:
  • upsampledPacketCount: Approximate total count of packets for a flow.
  • upsampledByteCount: Approximate total count of bytes for a flow.
Note: In DMF 8.5.0, when upsampling is set to Auto, the upsampledByteCount and upsampledPacketCount columns copy the values of the bytes and packets columns, and the graphs and tables of this dashboard display those values.
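For illustration, with a fixed upsampling factor of 1,000, a flow for which 12 sampled packets totaling 9,000 bytes were collected would appear with hypothetical values such as:

{
  "upsampledPacketCount": 12000,
  "upsampledByteCount": 9000000
}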
sFlow Dashboard

The sFlow dashboard is on the Production Network > sFlow tab on the Fabric page. The Traffic over Time visualization displays upsampled statistics.

Figure 15. Flow by Time visualization

The newly added upsampledByteCount field represents a flow's approximate total byte count.

Troubleshooting

Arista Networks recommends creating a support bundle and contacting Arista Networks TAC if upsampling isn’t working correctly.

TCPFlow

When you click the TCPFlow tab, the system displays the following dashboard.
Figure 16. Production Network > TCPFlow Dashboard

The information on the TCPFlow dashboard is based on TCP handshake signals and is deduplicated. The Filter Interface visualization indicates the filter switch port where data is received. The switch description is specified in the Description attribute of each switch, configured on the DANZ Monitoring Fabric controller. Device & IF on this dashboard refers to the end device and depends on the LLDP packets received.

Flows

When you click the Flows tab, the system displays the following dashboard.
Figure 17. Production Network > Flows Dashboard
The Flows Dashboard summarizes information from sFlow and NetFlow messages and provides the following panels:
  • All Flows Type
  • All Flows Overtime
  • All Flows Details

Filters & Flows

When you click the Filters & Flows tab, the system displays the following dashboard.
Figure 18. Production Network > Filters & Flows Dashboard

ARP

When you click the ARP tab, the system displays the following dashboard. This data is correlated using the tracked-host feature on the DANZ Monitoring Fabric controller. You can view all ARP data over time by filter interface and production device.
Figure 19. Production Network > ARP Dashboard

DHCP

When you click the DHCP tab, the system displays the following dashboard.
Figure 20. Production Network > DHCP Dashboard
Note: This dashboard shows information about the operating systems on the network, as well as data by filter interface and production device.
The DHCP Dashboard summarizes information from analyzing DHCP activity and provides the following panels:
  • DHCP OS Fingerprinted
  • DHCP Messages by Filter Interface
  • DHCP Messages by Production Switch
  • Non-whitelist DHCP Servers
  • DHCP Messages Over Time
  • DHCP Messages by Type
  • DHCP Messages

DNS

When you click the DNS tab, the system displays the following dashboard.
Figure 21. Production Network > DNS Dashboard
The DNS Dashboard summarizes information from analyzing DNS activity and provides the following panels:
  • DNS Top Servers
  • DNS Top Clients
  • DNS By Filter Interface
  • DNS by Production Device & IF
  • DNS Messages Over Time
  • Unauthorized DNS Servers
  • DNS RTT
  • DNS All Messages
  • DNS RCode Distro
  • DNS QType Description
  • DNS Top QNames
Note: The DNS RTT value is computed from the query and response packet timestamps. If a query packet is not answered by a response packet within 180 seconds, the RTT value is set to -1.

ICMP

When you click the ICMP tab, the system displays the following dashboard.
Figure 22. Production Network > ICMP Dashboard
The ICMP Dashboard summarizes information from analyzing ICMP activity and provides the following panels:
  • Top ICMP Message Source
  • ICMP by Filter Interface
  • Top ICMP Message Dest
  • ICMP by Error Description
  • ICMP by Production Switch
  • ICMP Top Err Dest IPs
  • ICMP Top Err Dest Port Apps
  • ICMP Messages Over Time
  • ICMP Table