This chapter describes how to customize FlowCollector operation using thread, filter, and protocol definitions, lists of port and autonomous system numbers, and other FlowCollector configuration parameters.
This chapter includes information on the following topics:
The process of customizing FlowCollector operation involves changes and additions to one or more of the following FlowCollector configuration and resource files:
For more information about creating or modifying threads and filters, refer to the section "Understanding FlowCollector Data Collection and Aggregation," later in this chapter.
For more information about creating or modifying protocols, refer to the section "Defining Protocols," later in this chapter.
Flow records having source ports that match defined values in this file are aggregated together. Flow records from source ports not defined in this file are aggregated under "Others."
For more information about creating or modifying source port numbers, refer to the section "Defining Source and Destination Port Numbers," later in this chapter.
Flow records having destination ports that match defined values in this file are aggregated together. Flow records from destination ports not defined in this file are aggregated under "Others."
For more information about creating or modifying destination port numbers, refer to the section "Defining Source and Destination Port Numbers," later in this chapter.
Flow records having source autonomous system numbers that match defined values in this file are aggregated together. Flow records from source autonomous system numbers not defined in this file are aggregated under "Others."
For more information about creating or modifying source autonomous system numbers, refer to the section "Defining Source and Destination Autonomous System Numbers," later in this chapter.
Flow records having destination autonomous system numbers that match defined values in this file are aggregated together. Flow records from destination autonomous system numbers not defined in this file are aggregated under "Others."
For more information about creating or modifying destination autonomous system numbers, refer to the section "Defining Source and Destination Autonomous System Numbers," later in this chapter.
FlowCollector collects and summarizes (aggregates) data into data files based on user-defined criteria specified in a FlowCollector thread. A thread is an aggregation task defined by a set of user-configurable attributes that specify how FlowCollector will aggregate the traffic flows stored on the workstation. Two key thread attributes are:
Figure 6-1 shows an example of how FlowCollector uses threads and filters. In this example, threadA uses filterA and the SourceNode aggregation scheme; threadB uses both filterA and filterB (filters can be shared among threads) and the DestPort aggregation scheme; threadC doesn't use any filters, but it also uses the DestPort aggregation scheme.

The following nf.config file example contains the thread definitions to accomplish the general data aggregation scheme shown in Figure 6-1. The file contains two filter definitions (filterA and filterB) and three thread definitions (threadA, threadB, and threadC) to accomplish the aggregation. NetFlow export traffic arrives on FlowCollector UDP port 9991. Data is to be aggregated by the following aggregation schemes:
Filter filterA
permit nexthop 172.16.23.65 0.0.0.0

Filter filterB
deny srcaddr 172.16.0.0 0.0.255.255
permit srcaddr 0.0.0.0 0.0.0.0

Thread threadA
Filter filterA
Aggregation SourceNode
Period 10
Port 9991
State Active
DataSetPath /opt/CSCOnfc/Data
DiskSpaceLimit 0
FileRetain 0

Thread threadB
Filter filterA
Filter filterB
Aggregation DestPort
Period 10
Port 9991
State Active
DataSetPath /opt/CSCOnfc/Data
DiskSpaceLimit 0
FileRetain 0

Thread threadC
Aggregation DestPort
Period 10
Port 9991
State Active
DataSetPath /opt/CSCOnfc/Data
DiskSpaceLimit 0
FileRetain 0

In the example, threadA uses filterA to include only the traffic passing through the export device 172.16.23.65. threadB uses both filters: filterA to include only the traffic passing through the export device 172.16.23.65, and filterB to exclude traffic from network 172.16.0.0. All three threads flush their aggregated data every 10 minutes into data files saved in the /opt/CSCOnfc/Data directory.
The syntax for a filter definition is as follows:
filter filter-name
{permit | deny} type value mask
.
.
.
{permit | deny} type value mask
| filter | The keyword that identifies the definition as a filter. |
| filter-name | The unique, user-specified name of the filter. The name can be up to 14 alphanumeric characters. |
| permit | The permit keyword keeps the data that matches the specified filter type and value. |
| deny | The deny keyword rejects the data that matches the specified filter type and value (matching flow data is ignored and not aggregated). |
| type | The filter type. See Table 6-1 for a description of filter types. |
| value | The value associated with the filter type. All filter types require a value. See Table 6-1 for a description of filter types and values. |
| mask | If the filter uses the srcaddr, dstaddr, or nexthop type, you must provide the IP netmask that qualifies the IP address used as value. See Table 6-1 for a description. |
Filter keyword and variable entries are not case-sensitive. Table 6-1 describes the default filter types provided with FlowCollector, the type of input required for the value, and whether the value requires a mask.
| Type | Value | Mask Required | Description |
|---|---|---|---|
| srcaddr | Source IP address | Yes | Filter the input data based on the source IP address. If you use this type, you must provide the IP netmask that qualifies the source IP address. |
| dstaddr | Destination IP address | Yes | Filter the input data based on the destination IP address. If you use this type, you must provide the IP netmask that qualifies the destination IP address. |
| srcport | Source port number | No | Filter the input data based on the source port number. |
| destport | Destination port number | No | Filter the input data based on the destination port number. |
| srcinterface | Source interface number | No | Filter the input data based on the source interface number. |
| dstinterface | Destination interface number | No | Filter the input data based on the destination interface number. |
| nexthop | Next hop IP address | Yes | Filter the input data based on the next hop IP address. If you use this type, you must provide the IP netmask that qualifies the next hop IP address. |
| Protocol | Protocol name | No | Filter the input data based on the protocol definitions in the nfknown.protocols file. For more information on protocol definitions, refer to the section "Defining Protocols." |
| Prot | Protocol number | No | Filter the input data based on the protocol number in the flow record, where the protocol number corresponds to a protocol specified in the /etc/protocols file of your workstation. |
| TOS | Type of service | No | Filter the input data based on the type of service (ToS). |
| srcas | Source AS | No | Filter the input data based on the autonomous system number of the source, either origin or peer. |
| dstas | Destination AS | No | Filter the input data based on the autonomous system number of the destination, either origin or peer. |
When defining a filter, keep in mind the following qualifications:
A thread is a set of defined attributes that tells FlowCollector how to aggregate the traffic flows stored on the workstation.
The syntax for a thread definition is as follows:
Thread thread-name
[Filter filter-name]
.
.
.
[Filter filter-name]
Aggregation scheme
Period minutes
Port value
DataSetPath directory-path
[DiskSpaceLimit Megabytes]
State active|inactive
FileRetain number
The keywords and their arguments are listed on separate lines for legibility. Keyword and argument entries are not case-sensitive. Table 6-2 lists thread attributes and variables.
| Attribute | Variable | Definition |
|---|---|---|
| Thread | thread-name | Unique, user-defined name of the thread. Can be up to 18 alphanumeric characters. |
| Filter | filter-name | (Optional.) Unique name of a previously defined filter. You can specify one or more filters in a thread definition. When more than one filter is specified in a thread, the result is a logical AND of the functions defined in the filters. Filters can be shared among threads. For more information on filters, refer to the section "Creating a Filter," later in this chapter. |
| Aggregation | scheme | A way to summarize data collected by FlowCollector. For more information about aggregation schemes, refer to the section "Aggregation Schemes," later in this chapter. |
| Period | minutes | The frequency, in minutes, for how often FlowCollector writes aggregated data from its memory buffers into a data file. Data received in each period is written into a separate file. For example, setting the period to 30 minutes generates two data files every hour. |
| Port | value | The UDP port number on which FlowCollector is expecting NetFlow data from NetFlow export devices. The valid range of ports is between 1024 and 65535. In a default FlowCollector installation, UDP ports 9995 and 9996 are automatically configured as the UDP ports FlowCollector uses to receive NetFlow exported data. These numbers are defined in the default set of threads provided as part of the FlowCollector installation. You can define other UDP port numbers by selecting a number in the range 1024 to 65535, and using that number as the value in the Port attribute of an active thread definition. |
| DataSetPath | directory-path | Directory path used for storing the aggregated data (data files). If FlowCollector does not have write permission to the directory specified by a DataSetPath attribute in a thread definition, it uses $NFC_DIR as the root directory for the data files. For more information on data files, refer to the chapter "Understanding the FlowCollector Data File Format," earlier in this guide. |
| DiskSpaceLimit | Megabytes | (Optional.) Defines a limit of the total disk usage for the disk partition where DataSetPath resides, beyond which this thread will no longer write data to the disk. This attribute essentially allows you to reserve disk space for other threads by limiting the amount of disk space consumed by this thread, and can help prevent disk space exhaustion as well. The default for DiskSpaceLimit is 0 MB, which means DiskSpaceLimit checking is disabled. If the thread uses RawFlows as the aggregation scheme, the DiskSpaceLimit attribute has no effect. For more information on using the DiskSpaceLimit attribute, refer to the sections "DiskSpaceLimit Thread Attribute" and "Managing Disk Space," later in this chapter. |
| State | active or inactive | When a thread is active, FlowCollector aggregates data according to the attributes defined for the thread and produces data files; when the thread is inactive, FlowCollector does not aggregate data according to the attributes defined for the thread and does not produce data files. You can have a maximum of 10 active threads at any time. |
| FileRetain | number | Specifies the maximum number of data files to retain for each export device per day for this thread. The default for FileRetain is 0, which means the FileRetain feature is disabled. The FileRetain attribute must be the last attribute of a thread. |
In the following example, thread Alpha uses the SourceNode aggregation scheme. FlowCollector creates a data file in the directory /opt/CSCOnfc/Data every 30 minutes and keeps the last 24 data files per day:
Thread Alpha
Aggregation SourceNode
Period 30
Port 9991
State Active
DataSetPath /opt/CSCOnfc/Data
DiskSpaceLimit 0
FileRetain 24
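The rules stated for thread attributes can be checked mechanically. The following Python sketch is illustrative only and is not part of FlowCollector: the function names `parse_thread`, `check_thread`, and `files_per_day` are hypothetical, and the sketch assumes one keyword and one argument per line. It checks only rules stated in this section (name length, UDP port range, State values, and FileRetain placement).

```python
# Hypothetical helper, not a FlowCollector tool. Assumes one "keyword value" pair per line.
ALPHA = """\
Thread Alpha
Aggregation SourceNode
Period 30
Port 9991
State Active
DataSetPath /opt/CSCOnfc/Data
DiskSpaceLimit 0
FileRetain 24
"""

def parse_thread(text):
    """Parse one thread definition into a dict; keywords are treated as case-insensitive."""
    thread = {"filters": [], "order": []}
    for line in text.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        if len(parts) != 2:
            raise ValueError(f"expected 'keyword value': {line!r}")
        keyword, value = parts[0].lower(), parts[1]
        thread["order"].append(keyword)  # remember attribute order for the FileRetain check
        if keyword == "filter":
            thread["filters"].append(value)
        else:
            thread[keyword] = value
    return thread

def check_thread(thread):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    name = thread.get("thread", "")
    if not (name.isalnum() and len(name) <= 18):
        problems.append("thread name must be 1-18 alphanumeric characters")
    port = thread.get("port")
    if port is not None and not (port.isdigit() and 1024 <= int(port) <= 65535):
        problems.append("port must be between 1024 and 65535")
    state = thread.get("state")
    if state is not None and state.lower() not in ("active", "inactive"):
        problems.append("state must be active or inactive")
    if "fileretain" in thread["order"] and thread["order"][-1] != "fileretain":
        problems.append("FileRetain must be the last attribute")
    return problems

def files_per_day(period_minutes):
    """Number of data files one thread writes per export device per day."""
    return (24 * 60) // period_minutes

alpha = parse_thread(ALPHA)
print(check_thread(alpha), files_per_day(int(alpha["period"])))
```

With Period 30, a thread writes 48 data files per export device per day, so a FileRetain value of 24 retains at most half of them.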
FlowCollector provides a library of predefined aggregation schemes (see Table 6-3) that you can use to determine the type of information that is aggregated and stored in the data files.
Each aggregation scheme consists of one or more key fields, which tell FlowCollector what to look for in the exported NetFlow datagram, and one or more value fields, which contain statistical information pulled from the exported NetFlow datagram. The key fields and value fields shown in Table 6-3 correspond to the fields found in Version 1, Version 5, and Version 7 NetFlow export datagrams. Table 6-4 provides brief definitions of each of the key and value fields. For more information about these three versions of the NetFlow export datagram format, refer to the appendix "NetFlow Export Datagram Format," later in this guide.
For example, the SourceNode aggregation scheme uses just one key field, srcaddr (source address), and returns data for three value fields, the total number of packets sent, the total number of bytes sent, and the total number of flows aggregated into this record. Other aggregation schemes offer different combinations of key and value fields (see Table 6-3), and are described individually in the sections that follow.
| Aggregation Scheme | Key Fields | Value Fields |
|---|---|---|
| RawFlows | None | None |
| SourceNode | srcaddr | packet count, byte count, flow count |
| DestNode | dstaddr | packet count, byte count, flow count |
| HostMatrix | srcaddr, dstaddr | packet count, byte count, flow count |
| SourcePort | srcport | packet count, byte count, flow count |
| DestPort | dstport | packet count, byte count, flow count |
| Protocol | protocol | packet count, byte count, flow count |
| DetailDestNode | dstaddr, srcport, dstport, protocol | packet count, byte count, flow count |
| DetailHostMatrix | srcaddr, dstaddr, srcport, dstport, protocol | packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, totalActiveTime |
| DetailInterface | srcaddr, dstaddr, input interface, output interface, nexthop | packet count, byte count, flow count |
| CallRecord | srcaddr, dstaddr, srcport, dstport, protocol byte (prot), ToS | packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, totalActiveTime |
| ASMatrix | src_as, dst_as | packet count, byte count, flow count |
| NetMatrix | input interface, output interface, masked srcaddr, masked dstaddr, src_mask, dst_mask | packet count, byte count, flow count |
| DetailSourceNode | srcaddr, srcport, dstport, protocol | packet count, byte count, flow count |
| DetailASMatrix | srcaddr, dstaddr, srcport, dstport, protocol, input interface, output interface, src_as, dst_as | packet count, byte count, flow count |
| Field | Description |
|---|---|
| srcaddr | Source IP address |
| dstaddr | Destination IP address |
| srcport | TCP/UDP source port number or equivalent |
| dstport | TCP/UDP destination port number or equivalent |
| protocol | The name or label assigned to a protocol definition in the nfknown.protocols file. |
| protocol byte (prot) | IP protocol type (for example, TCP=6; UDP=17) |
| ToS | IP type-of-service |
| input interface | SNMP index of input interface |
| output interface | SNMP index of output interface |
| nexthop | IP address of next hop export device |
| src_as | Autonomous system number of the source, either origin or peer |
| dst_as | Autonomous system number of the destination, either origin or peer |
| masked srcaddr | srcaddr masked with the source netmask (src_mask) |
| masked dstaddr | dstaddr masked with the destination netmask (dst_mask) |
| src_mask | Source IP address prefix mask bits |
| dst_mask | Destination IP address prefix mask bits |
| packet count | Packets counted as part of this record |
| byte count | Total number of Layer 3 bytes counted as part of this record |
| flow count | Total number of flows aggregated into this record |
| firstTimeStamp | The time, in UTC seconds, of the first packet summarized into this record |
| lastTimeStamp | The time, in UTC seconds, of the last packet summarized into this record |
| totalActiveTime | The sum of individual activetime for all the flows summarized into the current record |
The output of the RawFlows aggregation scheme is an exact image of the NetFlow export datagram without aggregation. It is stored in binary data files, each holding n minutes of data, where n is the value of the Period attribute in the thread definition.
The output of the SourceNode aggregation scheme consists of one record for each unique source IP address present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key field: | srcaddr |
| Value fields: | packet count, byte count, and flow count |
The output of the DestNode aggregation scheme consists of one record for each unique destination IP address present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key field: | dstaddr |
| Value fields: | packet count, byte count, and flow count |
The output of the HostMatrix aggregation scheme consists of one record for each unique source and destination IP address pair present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key fields: | srcaddr, dstaddr |
| Value fields: | packet count, byte count, and flow count |
The output of the SourcePort aggregation scheme consists of one record for each unique source port present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key field: | srcport |
| Value fields: | packet count, byte count, and flow count |
Known source ports are defined in the nfknown.srcports file. Undefined source ports are aggregated as "Others" in the data file.
The output of the DestPort aggregation scheme consists of one record for each unique destination port present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key field: | dstport |
| Value fields: | packet count, byte count, and flow count |
Known destination ports are defined in the nfknown.dstports file. Undefined destination ports are aggregated as "Others" in the data file.
The output of the Protocol aggregation scheme consists of one record for each unique protocol (as defined in the nfknown.protocols file) present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key field: | protocol |
| Value fields: | packet count, byte count, and flow count |
Known protocols are defined in the nfknown.protocols file. Undefined protocols are aggregated as "Others" in the data file.
The output of the DetailDestNode aggregation scheme consists of one record for each unique combination of destination IP address, source port, destination port, and protocol present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key fields: | dstaddr, srcport, dstport, protocol |
| Value fields: | packet count, byte count, and flow count |
Each output record of the DetailHostMatrix aggregation scheme contains the following fields:
| Key fields: | srcaddr, dstaddr, srcport, dstport, protocol |
| Value fields: | packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, and totalActiveTime |
The output of the DetailInterface aggregation scheme consists of one record for each unique combination of source IP address, destination IP address, input interface, output interface, and next hop present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key fields: | srcaddr, dstaddr, input interface, output interface, nexthop |
| Value fields: | packet count, byte count, and flow count |
Each output record of the CallRecord aggregation scheme contains the following fields:
| Key fields: | srcaddr, dstaddr, srcport, dstport, protocol, ToS |
| Value fields: | packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, and totalActiveTime |
The output of the ASMatrix aggregation scheme consists of one record for each unique source and destination autonomous system number pair present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key fields: | src_as, dst_as |
| Value fields: | packet count, byte count, and flow count |
Each output record of the NetMatrix aggregation scheme contains the following fields:
| Key fields: | input interface, output interface, masked srcaddr, masked dstaddr, src_mask, dst_mask |
| Value fields: | packet count, byte count, and flow count |
The output of the DetailSourceNode aggregation scheme consists of one record for each unique combination of source IP address, source port, destination port, and protocol present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:
| Key fields: | srcaddr, srcport, dstport, protocol |
| Value fields: | packet count, byte count, and flow count |
Each output record of the DetailASMatrix aggregation scheme contains the following fields:
| Key fields: | srcaddr, dstaddr, srcport, dstport, protocol, input interface, output interface, src_as, dst_as |
| Value fields: | packet count, byte count, and flow count |
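The schemes above share one pattern: flow records are grouped by the key fields, and the value fields are summed per group. The following Python sketch illustrates that pattern for a few schemes. It is a conceptual model, not FlowCollector code: the `aggregate` function, the dictionary layout, and the sample addresses are hypothetical, and the sketch assumes that each input record represents exactly one flow.

```python
from collections import defaultdict

# Key fields for a few schemes, as listed in the scheme descriptions.
SCHEMES = {
    "SourceNode": ("srcaddr",),
    "DestNode": ("dstaddr",),
    "HostMatrix": ("srcaddr", "dstaddr"),
    "ASMatrix": ("src_as", "dst_as"),
}

def aggregate(flows, scheme):
    """Group flow records by the scheme's key fields and sum the value fields."""
    key_fields = SCHEMES[scheme]
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0, "flows": 0})
    for flow in flows:
        key = tuple(flow[field] for field in key_fields)
        record = totals[key]
        record["packets"] += flow["packets"]
        record["bytes"] += flow["bytes"]
        record["flows"] += 1  # assumption: one input record equals one flow
    return dict(totals)

# Hypothetical sample data for one collection period.
flows = [
    {"srcaddr": "10.0.0.1", "dstaddr": "10.0.0.2", "src_as": 1, "dst_as": 2,
     "packets": 20, "bytes": 2000},
    {"srcaddr": "10.0.0.1", "dstaddr": "10.0.0.3", "src_as": 1, "dst_as": 2,
     "packets": 5, "bytes": 400},
    {"srcaddr": "10.0.0.2", "dstaddr": "10.0.0.1", "src_as": 2, "dst_as": 1,
     "packets": 30, "bytes": 1000},
]

by_source = aggregate(flows, "SourceNode")
by_pair = aggregate(flows, "HostMatrix")
print(by_source[("10.0.0.1",)])
```

In this sample, SourceNode yields two records (one per source address), while HostMatrix yields three, because every source and destination pair is distinct.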
Use the information in this section to define the protocols that you want FlowCollector to recognize as it aggregates data. The protocols FlowCollector recognizes are defined in the nfknown.protocols file, located in the $NFC_DIR/config directory.
FlowCollector will recognize the protocol and aggregate traffic statistics associated with the protocol only when the following conditions are met:
If you remove the protocol from the nfknown.protocols file, information for that protocol is no longer recognized and is aggregated under "Others."
Figure 6-2 shows an example of a typical communication session between host A and host B. This example assumes that NetFlow data export is enabled for the export device interfaces to both host A and host B so that exported NetFlow data gives FlowCollector statistics for communication in both directions (A to B; B to A). In this example, FlowCollector aggregates data for two protocols, Telnet and FTP, between host A (the Telnet server, using port 23) and host B (the Telnet client, using port 9001).
Whether the aggregated data is stored in data files for later retrieval depends on how FlowCollector is customized.

Table 6-5 lists the raw data received in the data collection example shown in Figure 6-2.
| Source Address | Destination Address | Source Port | Destination Port | Protocol Byte | Packets | Bytes |
|---|---|---|---|---|---|---|
| A | B | 23 | 9001 | 6 | 20 | 2000 |
| B | A | 9001 | 23 | 6 | 30 | 1000 |
| A | B | 20 | 9002 | 6 | 20 | 200 |
| B | A | 9002 | 20 | 6 | 50 | 300 |
When you add the Telnet protocol definition to the nfknown.protocols file, the data file produced by the Protocol aggregation scheme will contain a row with 50 packets (20 plus 30) and 3000 bytes (2000 plus 1000).
In this example, no FTP protocol definition was added to the nfknown.protocols file, so the data file will also have another row for "Others" (including the FTP data) containing 70 packets (20 plus 50) and 500 bytes (200 plus 300).
The protocols listed in the nfknown.protocols file are used by the aggregation schemes and protocol filters you define in the nf.config file. To configure the protocols that FlowCollector recognizes, you must edit the nfknown.protocols file and add a definition that includes the following information:
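The arithmetic in this example can be reproduced with a short Python sketch. The rows come from Table 6-5; the function names and the hard-coded Telnet test (source or destination port 23 with protocol byte 6) are illustrative assumptions, not FlowCollector internals.

```python
# Rows of Table 6-5: (source, destination, srcport, dstport, prot, packets, bytes)
FLOWS = [
    ("A", "B", 23, 9001, 6, 20, 2000),
    ("B", "A", 9001, 23, 6, 30, 1000),
    ("A", "B", 20, 9002, 6, 20, 200),
    ("B", "A", 9002, 20, 6, 50, 300),
]

def is_telnet(srcport, dstport, prot):
    """Telnet definition: source or destination port 23, protocol byte 6 (TCP)."""
    return prot == 6 and 23 in (srcport, dstport)

def protocol_rows(flows):
    """Sum packets and bytes per protocol label; unmatched flows go to 'Others'."""
    rows = {}
    for _src, _dst, srcport, dstport, prot, packets, nbytes in flows:
        label = "Telnet" if is_telnet(srcport, dstport, prot) else "Others"
        total = rows.setdefault(label, [0, 0])
        total[0] += packets
        total[1] += nbytes
    return {label: tuple(total) for label, total in rows.items()}

ROWS = protocol_rows(FLOWS)
print(ROWS)  # {'Telnet': (50, 3000), 'Others': (70, 500)}
```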
The command syntax for a protocol definition is
protocol name [[srcport|dstport] number [OR [srcport|dstport] number]] prot value
where:
| protocol | The keyword that identifies the definition as a protocol. |
| name | Unique, user-specified name of the protocol definition. Can be up to 14 alphanumeric characters. |
| srcport | Source port |
| dstport | Destination port |
| number | Port number |
| OR | (Optional) Provides a Boolean OR functionality when you have more than one srcport or dstport. |
| prot | Protocol byte |
| value | Protocol type (similar to those specified in the /etc/protocols file on a UNIX workstation; for example, TCP=6, UDP=17). |
The known protocols (such as WWW, Telnet, FTP) listed in the nfknown.protocols file are similar to the definitions specified in the /etc/services file of a UNIX workstation. For information about the protocols and protocol types supported on your workstation, refer to the protocols file in the /etc directory on your workstation.
The protocol definitions in the nfknown.protocols file cause FlowCollector to recognize the protocols originating from or terminating on the specified ports. For example, in the sample protocol list shown below, the first protocol definition uses the OR option to cause FlowCollector to recognize traffic flows for all Telnet sessions originating from or terminating on port 23.
Protocol TCP-Telnet Dstport 23 OR Srcport 23 Prot 6
Protocol TCP-FTP Srcport 20 OR Srcport 21 OR Dstport 20 OR Dstport 21 Prot 6
Protocol TCP-WWW Dstport 80 OR Srcport 80 Prot 6
Protocol TCP-SMTP Srcport 25 OR Dstport 25 Prot 6
Protocol TCP-Other Prot 6
Protocol UDP-TFTP Srcport 69 OR Dstport 69 Prot 17
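To make the matching rule concrete, the following Python sketch parses definitions written in the syntax above and classifies a flow. It is an illustration under stated assumptions, not FlowCollector's implementation: `parse_protocol` and `classify` are hypothetical names, each definition is assumed to occupy one line, keywords are assumed to be case-insensitive, and definitions are assumed to be tried in file order with the first match winning (which would explain why TCP-Other follows the more specific TCP entries).

```python
SAMPLE = [
    "Protocol TCP-Telnet Dstport 23 OR Srcport 23 Prot 6",
    "Protocol TCP-FTP Srcport 20 OR Srcport 21 OR Dstport 20 OR Dstport 21 Prot 6",
    "Protocol TCP-WWW Dstport 80 OR Srcport 80 Prot 6",
    "Protocol TCP-SMTP Srcport 25 OR Dstport 25 Prot 6",
    "Protocol TCP-Other Prot 6",
    "Protocol UDP-TFTP Srcport 69 OR Dstport 69 Prot 17",
]

def parse_protocol(line):
    """Parse one definition into (name, [(port_field, number), ...], prot)."""
    tokens = line.split()
    if len(tokens) < 4 or tokens[0].lower() != "protocol":
        raise ValueError(f"not a protocol definition: {line!r}")
    name = tokens[1]
    rest = [t for t in tokens[2:] if t.lower() != "or"]  # OR only separates port terms
    if len(rest) % 2:
        raise ValueError(f"keyword without a value: {line!r}")
    ports, prot = [], None
    for keyword, number in zip(rest[0::2], rest[1::2]):
        keyword = keyword.lower()
        if keyword in ("srcport", "dstport"):
            ports.append((keyword, int(number)))
        elif keyword == "prot":
            prot = int(number)
        else:
            raise ValueError(f"unknown keyword {keyword!r}")
    if prot is None:
        raise ValueError(f"missing prot value: {line!r}")
    return name, ports, prot

def classify(flow, definitions):
    """Return the first matching protocol name (assumed ordering), else 'Others'."""
    for name, ports, prot in definitions:
        if flow["prot"] != prot:
            continue
        # A definition without port terms matches on the protocol byte alone.
        if not ports or any(flow[field] == number for field, number in ports):
            return name
    return "Others"

DEFINITIONS = [parse_protocol(line) for line in SAMPLE]
print(classify({"srcport": 9001, "dstport": 23, "prot": 6}, DEFINITIONS))
```

If FlowCollector resolves overlapping definitions differently, only the loop order in `classify` would need to change.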
Use the information in this section to specify the source and destination port numbers from which FlowCollector will collect and aggregate data. The port numbers FlowCollector recognizes are defined in the following files, located in the $NFC_DIR/config directory:
FlowCollector uses the contents of the nfknown.srcports and nfknown.dstports files in any aggregation scheme that uses SourcePort and DestPort fields. When you add a port definition to either of these files, traffic to or from the defined port is counted separately in the data file. Unrecognized ports (ports not defined in their respective files) are aggregated as "Others" in the data file.
The command syntax for a port number, range of port numbers, or a range grouped under an assigned label is:
value[,value[:label]]
.
.
.
value[,value[:label]]
where:
| value | A number between 0 and 65535 |
| label | (Optional.) An alphanumeric ASCII string of up to 14 characters |
A range of ports is defined by using a comma to separate two numbers (an optional space can be added for legibility). A range can span any set of ports up to the maximum number of ports available on the system (currently 65,535). The following example shows a range of ports:
50, 100
You can also define a range of source or destination ports to be treated as one logical port, and assign a label to represent that range of ports. The following example shows a range of ports to be treated as the logical port named 10K_19K_Port_Range.
10000, 19999: 10K_19K_Port_Range
In this case, traffic is aggregated and reported for the logical port 10K_19K_Port_Range, rather than for each of the individual port numbers in the range.
The following example shows the contents of a sample nfknown.srcports file:
21:ftp
88
50, 100
10000, 19999: 10K_19K_Port_Range
20000, 29999: My_Range
40000, 49999: My_Range
In the example above, the entry:
| 21:ftp | Indicates that a flow with port number of 21 is aggregated under the label ftp in the data file. |
| 88 | Indicates that a flow with a port number of 88 is aggregated under the label 88 in the data file. |
| 50, 100 | Indicates that a flow with a port number in the range from 50 to 100 is aggregated under a label that is the same as its port number. For example, if a flow has the port number 75, the label of the flow in the data file is 75. Flows within the range that have different port numbers are aggregated individually in the data file. |
| 10000, 19999: 10K_19K_Port_Range | Indicates that the port number of any port in the range is replaced by the label 10K_19K_Port_Range, and that flows within the range are aggregated together under the label 10K_19K_Port_Range in the data file. |
| 20000, 29999:My_Range 40000, 49999:My_Range | Indicate that the port number of any port in the two ranges is replaced by the label My_Range, and that flows within the specified ranges are aggregated together under the label My_Range in the data file. |
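The labeling rules in this table can be expressed as a small lookup. The Python sketch below is a hypothetical illustration, not FlowCollector code: `parse_port_entries` and `port_label` are invented names, and the sketch assumes one entry per line in the file.

```python
SRCPORTS = """\
21:ftp
88
50, 100
10000, 19999: 10K_19K_Port_Range
20000, 29999: My_Range
40000, 49999: My_Range
"""

def parse_port_entries(text):
    """Parse entries of the form value[,value[:label]] into (low, high, label) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        numbers, _, label = line.partition(":")
        bounds = [int(n) for n in numbers.split(",")]  # int() tolerates the optional space
        low, high = bounds[0], bounds[-1]              # a single value is a one-port range
        entries.append((low, high, label.strip() or None))
    return entries

def port_label(port, entries):
    """Return the label a flow with this port is aggregated under in the data file."""
    for low, high, label in entries:
        if low <= port <= high:
            # Unlabeled entries keep the port number itself as the label.
            return label if label else str(port)
    return "Others"

ENTRIES = parse_port_entries(SRCPORTS)
for port in (21, 88, 75, 15000, 45000, 30000):
    print(port, "->", port_label(port, ENTRIES))
```

Two ranges that share a label, such as the My_Range entries, resolve to the same label and are therefore aggregated together.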
Use the information in this section to specify the source and destination autonomous systems from which FlowCollector will collect and aggregate data. The autonomous systems FlowCollector recognizes are defined in the following files, located in the $NFC_DIR/config directory:
FlowCollector uses the contents of the nfknown.srcasns and nfknown.dstasns files in aggregation schemes that make use of source and/or destination autonomous system numbers. When you add an autonomous system definition to either of these files, traffic to or from the autonomous system is counted separately in the data file. Any unrecognized autonomous system numbers (autonomous system numbers not defined in their respective files) are aggregated together and appear as "Others" in the data file.
The following example shows the contents of an nfknown.srcasns file.
1:Your_Network
2
10, 15
20, 30:My_Network
35, 40:My_Network
In this example, the entry:
| 1:Your_Network | Indicates that a flow with an autonomous system number of 1 is aggregated under the label Your_Network in the data file. |
| 2 | Indicates that a flow with an autonomous system number of 2 is aggregated under the label 2 in the data file. |
| 10, 15 | Indicates that a flow with an autonomous system number in the range from 10 to 15 is aggregated under a label that is the same as its autonomous system number. For example, if a flow has the autonomous system number 13, the label of the flow in the data file is 13. Flows within the range that have different autonomous system numbers are aggregated individually in the data file. |
| 20, 30:My_Network 35, 40:My_Network | Indicate that the autonomous system number is replaced by the label My_Network, and that flows within the specified ranges are aggregated together under the label My_Network in the data file. |
Table 6-6 describes the available configuration parameters and their values.
| Flag | Possible Values | Description | Default Value |
|---|---|---|---|
| OUTPUT_DOTTEDADDRESS | Yes, No | Yes: Writes the IP address to the data files in dotted decimal format, for example, 172.16.3.100. No: Writes the IP address to the data files in network address format, for example, 8557414940. | Yes |
| CSV_FORMAT | Yes, No | Yes: Uses a comma (,) as the delimiter in writing aggregation output. No: Uses a vertical bar (\|) as the delimiter. | No |
| LONG_OUTPUTFILE_SUFFIX | Yes, No | Yes: Sets the output file extension to add the year, month, and date to the hour and minute, for example, _YYYY_MM_DD.HHMM. No: Sets the output file extension to add HHMM only. | No |
| GMT_FLAG | Yes, No | Yes: Uses the Greenwich Mean Time reference to set date and time. No: Uses local time. This attribute affects the date and time used in naming the data file directory structure, names of data files, headers in data files, and messages in the log files. | Yes |
| DEVICE_DOTTEDADDRESS | Yes, No | Yes: Uses the IP address of the sending export device for storage. No: Attempts to get DNS name first. If a ROUTER_GROUPNAME label has been defined using the ROUTER_GROUPNAME configuration parameter, that label is used; otherwise, the IP address or the DNS name is used, depending on the setting of the DEVICE_DOTTEDADDRESS configuration parameter. For more information on the ROUTER_GROUPNAME configuration parameter, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," later in this section. | Yes |
| SOURCENODE_BUCSIZE, DESTNODE_BUCSIZE, HOSTMATRIX_BUCSIZE, NETMATRIX_BUCSIZE, DETAILSOURCENODE_BUCSIZE, DETAILDESTNODE_BUCSIZE, DETAILHOSTMATRIX_BUCSIZE, DETAILINTERFACE_BUCSIZE, CALLRECORD_BUCSIZE, ASMATRIX_BUCSIZE, DETAILASMATRIX_BUCSIZE | Varies according to configuration | The BUCSIZE parameters dictate FlowCollector performance. The term BUCSIZE refers to the number of buffer pages set aside to hold aggregated data for a given aggregation scheme. The general rule of thumb for BUCSIZE values: If an aggregation scheme produces n records in a collection interval, the corresponding BUCSIZE value should lie (approximately) between n/20 and n. The best approach is to sample NetFlow traffic, and then determine whether changes are required. | SOURCENODE_BUCSIZE: 2000; DESTNODE_BUCSIZE: 2000; HOSTMATRIX_BUCSIZE: 2000; NETMATRIX_BUCSIZE: 2000; DETAILSOURCENODE_BUCSIZE: 2000; DETAILDESTNODE_BUCSIZE: 2000; DETAILHOSTMATRIX_BUCSIZE: 6000; DETAILINTERFACE_BUCSIZE: 6000; CALLRECORD_BUCSIZE: 50000; ASMATRIX_BUCSIZE: 2000; DETAILASMATRIX_BUCSIZE: 50000 |
| SOCKET_BUFSIZE | Buffer size (in bytes) | Size of the UDP socket receive buffer. | 900000 |
| ROUTER_GROUPNAME | List of IP addresses or labels | Allows a user-specified IP address or label to be substituted for a list of IP addresses from which FlowCollector can receive NetFlow export datagrams. For more information, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," later in this chapter. | Disabled |
| ACCEPT_PACKETS_FROM | List of IP addresses or labels | Allows packets to be filtered by source address (or by defined ROUTER_GROUPNAME label). For more information, refer to the section "Preventing FlowCollector from Accepting Unsolicited Packets," later in this chapter. | Disabled |
| USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP | Yes No | Yes: Uses the address of the router being bypassed (short-cut) as the source of the corresponding flow. No: Uses the address of the export device being bypassed (short-cut) as the source of the corresponding flow. For more information, refer to the section "Retaining Router IP Addresses for Switched Export Packets," later in this chapter. | No |
| USER_SCRIPT_LOCATION | Path and file name | Specifies the location of a user-supplied script. For more information, refer to the section "Using a User-Defined Script to Process FlowCollector Data Files," later in this chapter. | Disabled |
| OUTPUT_BUFFER_SIZE | 1, 2, 4, 8, 16 | The size (in megabytes) of the memory buffer FlowCollector uses for I/O operations. For more information, refer to the section "Changing the Output Buffer Size," later in this chapter. | 4 |
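
The BUCSIZE rule of thumb given in the table (a value between roughly n/20 and n for an aggregation scheme that produces n records per collection interval) can be worked through with a small calculation. The following Python sketch is illustrative only; the function name `bucsize_range` is hypothetical and is not part of FlowCollector.

```python
def bucsize_range(records_per_interval):
    """Return the (low, high) BUCSIZE range suggested by the n/20..n rule of thumb.

    records_per_interval is n, the number of records an aggregation scheme
    produces in one collection interval (measured by sampling your traffic).
    """
    if records_per_interval < 0:
        raise ValueError("records_per_interval must not be negative")
    low = max(1, records_per_interval // 20)   # approximately n/20
    high = max(low, records_per_interval)      # approximately n
    return low, high


if __name__ == "__main__":
    # A scheme that produces about 40,000 records per interval.
    print(bucsize_range(40000))
```

Under this rule, a scheme that produces about 40,000 records per interval would take a BUCSIZE somewhere between 2000 and 40000; sampling real NetFlow traffic remains the best way to choose the final value.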
Due to the high volume of NetFlow data export traffic, you might have to increase the normal buffer size associated with the UDP socket on which data is received. To do so, edit the value (in bytes) of the SOCKET_BUFSIZE parameter in the $NFC_DIR/config/nf.resources file.
You can substitute a user-specified IP address or label for a set of IP addresses from which FlowCollector receives NetFlow export datagrams. For example, you can specify the label "blab-gateway" as the label representing packets coming from three separate IP addresses: 171.69.1.172, 171.69.1.173, and 191.71.1.25.
To do this, you must edit the ROUTER_GROUPNAME parameter in the nf.resources file. The syntax is
ROUTER_GROUPNAME label {
a.b.c.d
.
.
.
w.x.y.z
}
where label is either an IP address or an ASCII word. Each of the IP addresses in the body of the ROUTER_GROUPNAME block must be on a separate line. An example of a ROUTER_GROUPNAME definition follows:
ROUTER_GROUPNAME blab-gateway {
171.69.1.172
171.69.1.173
191.71.1.25
}
If applicable, the mapped ROUTER_GROUPNAME will be used with all aggregation schemes, but FlowCollector uses the real IP address to report errors involving receipt of an invalid or unsolicited NetFlow export packet.
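To make the mapping concrete, the following Python sketch reads ROUTER_GROUPNAME blocks in the syntax shown above and resolves a source address to its label. It is an illustration of the mapping only, not FlowCollector's own parser; the function names `parse_router_groups` and `storage_name` are hypothetical, and comment lines in nf.resources are not handled.

```python
import re


def parse_router_groups(text):
    """Map each member IP address to the label of its ROUTER_GROUPNAME block."""
    mapping = {}
    # Each block has the form: ROUTER_GROUPNAME label { one address per line }
    pattern = r"ROUTER_GROUPNAME\s+(\S+)\s*\{([^}]*)\}"
    for label, body in re.findall(pattern, text):
        for line in body.splitlines():
            address = line.strip()
            if address:
                mapping[address] = label
    return mapping


def storage_name(source_ip, mapping):
    """Return the group label for source_ip, or the address itself if unmapped."""
    return mapping.get(source_ip, source_ip)


CONFIG = """
ROUTER_GROUPNAME blab-gateway {
171.69.1.172
171.69.1.173
191.71.1.25
}
"""

if __name__ == "__main__":
    groups = parse_router_groups(CONFIG)
    print(storage_name("171.69.1.173", groups))  # mapped to the group label
    print(storage_name("10.0.0.1", groups))      # unmapped, address is kept
```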
In its default configuration, FlowCollector accepts NetFlow export packets from any IP address. If necessary, you can specify the source IP addresses or defined ROUTER_GROUPNAME labels from which FlowCollector should receive NetFlow export packets, thus preventing FlowCollector from accepting packets from any unspecified sources.
To do this, you must remove the comment character from the beginning of each line in the ACCEPT_PACKETS_FROM parameter in the nf.resources file and edit the parameter to include the source IP addresses or ROUTER_GROUPNAME labels. The syntax of the parameter is
ACCEPT_PACKETS_FROM {
a.b.c.d
.
.
.
w.x.y.z
}
where each of the IP addresses (or ROUTER_GROUPNAME labels) defined in the body of the ACCEPT_PACKETS_FROM block must be on a separate line. An example of an ACCEPT_PACKETS_FROM definition follows:
ACCEPT_PACKETS_FROM {
131.108.2.1
131.108.2.2
131.108.2.3
blab-gateway
}
For information on ROUTER_GROUPNAME labels, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," earlier in this chapter.
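The acceptance rule described above can be sketched as a simple membership test: a packet is accepted if the parameter is disabled, if its source address is listed, or if the address belongs to a listed ROUTER_GROUPNAME label. The Python below is a sketch of that logic under those assumptions; `is_accepted` and its arguments are hypothetical names, not FlowCollector interfaces.

```python
def is_accepted(source_ip, accept_list, groups):
    """Decide whether a packet from source_ip would be accepted.

    accept_list: entries of the ACCEPT_PACKETS_FROM block (addresses or labels);
                 an empty collection models the default, disabled parameter.
    groups:      mapping of IP address -> ROUTER_GROUPNAME label.
    """
    if not accept_list:
        return True                      # default: accept from any address
    if source_ip in accept_list:
        return True                      # address listed directly
    label = groups.get(source_ip)
    return label is not None and label in accept_list  # listed via its label


if __name__ == "__main__":
    accept = {"131.108.2.1", "131.108.2.2", "131.108.2.3", "blab-gateway"}
    groups = {"171.69.1.172": "blab-gateway"}
    print(is_accepted("131.108.2.2", accept, groups))   # listed address
    print(is_accepted("171.69.1.172", accept, groups))  # member of listed label
    print(is_accepted("10.1.1.1", accept, groups))      # unspecified source
```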
If your network includes switching devices that support Version 7 NetFlow export datagrams, you can configure FlowCollector to retain the IP address of the short-cut router as the source of data switched through a Cisco Catalyst 5000 series switch. To do this, you must edit the USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP parameter in the nf.resources file. The syntax of the parameter is
USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP value
where value is either yes or no. The default setting is no. If you change the setting to yes, FlowCollector uses the IP address of the bypassed router as the source of the corresponding flow.
You can specify the location of a script file that FlowCollector will execute after it has written a new data file. This capability makes it easier for your client applications to process a new data file without having to poll for it. FlowCollector invokes the script with the absolute path name of the newly written FlowCollector data file. FlowCollector expects the location of your user-supplied script to be defined by the USER_SCRIPT_LOCATION parameter in the nf.resources file. This parameter is read only at startup.
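As an illustration of what such a script might look like, the following Python sketch takes the absolute path name of the new data file as its argument and prints a one-line summary. The processing step is a placeholder and the function name `describe_data_file` is hypothetical; a real script would hand the file to your client application instead.

```python
#!/usr/bin/env python3
import os
import sys


def describe_data_file(path):
    """Return a one-line summary of a newly written data file."""
    size = os.path.getsize(path)
    return f"{path}: {size} bytes"


if __name__ == "__main__":
    # FlowCollector is described as invoking the script with the absolute
    # path name of the newly written data file as its argument.
    for data_file in sys.argv[1:]:
        print(describe_data_file(data_file))
```

Because the script is started once per data file, client applications do not need to poll the data directory for new files.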
To use the USER_SCRIPT_LOCATION parameter, perform the following steps:
Step 1 Remove the comment character from the beginning of the USER_SCRIPT_LOCATION entry in the nf.resources file, so that it looks like this:
Step 2 Replace the existing path name with the path name for your script.
For example, if the path name for your script is /opt/CSCOnfc/my_script.sh, the revised parameter should read:
FlowCollector transfers output data in blocks to optimize performance and ensure the most efficient handling of data files as they are generated and written as disk files. The size of a block is user-configurable, and defined by the OUTPUT_BUFFER_SIZE parameter in the nf.resources file. The syntax of the parameter is
OUTPUT_BUFFER_SIZE size
where size is the new block size in megabytes (MB). The valid sizes are 1, 2, 4, 8, and 16 MB. The recommended setting is approximately 1/32 of the physical memory installed in the FlowCollector workstation. For example, if the physical memory of your FlowCollector workstation is 128 MB, the best setting is 4. If you inadvertently enter an invalid number, FlowCollector uses the next smaller valid number. For example, if your system is equipped with 128 MB and you enter the number 6, FlowCollector uses the next smaller valid number, 4, as the output buffer size.
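The sizing rule can be expressed as a short calculation. The following Python sketch assumes the behavior described above (an invalid value falls back to the next smaller valid size, and the recommended size is about 1/32 of physical memory); the function names are hypothetical, and the handling of values below 1 is an assumption, because the text does not specify it.

```python
VALID_SIZES_MB = (1, 2, 4, 8, 16)


def effective_buffer_size(requested_mb):
    """Return the valid size that would be used for a requested value."""
    candidates = [size for size in VALID_SIZES_MB if size <= requested_mb]
    if candidates:
        return max(candidates)       # next smaller (or equal) valid size
    return min(VALID_SIZES_MB)       # assumption: values below 1 fall back to 1


def recommended_buffer_size(physical_memory_mb):
    """Return the valid size closest to, but not above, 1/32 of physical memory."""
    return effective_buffer_size(physical_memory_mb // 32)


if __name__ == "__main__":
    print(recommended_buffer_size(128))  # 128 MB of memory
    print(effective_buffer_size(6))      # 6 is not a valid size
```

Both calls yield 4, which matches the 128 MB example in the text.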
Depending on the volume of flow data being exported from the export devices, as well as the FlowCollector thread attribute settings you use, FlowCollector can consume large amounts of disk space in a short period of time. FlowCollector provides several thread attributes and features that can help you manage your disk space usage:
As described earlier, a filter can help you discard any flow data that is not of interest to you. By using filters to ensure you are storing only data of interest, you can potentially reduce the amount of disk space used by FlowCollector.
Table 6-7 shows an example of some aggregation schemes with example file sizes and arrival rate.
| Aggregation Scheme | Aggregation Period | Output File Size | Flows per Second |
|---|---|---|---|
| DestNode | 10 minutes | 240 KB | 12.5 |
| HostMatrix | 10 minutes | 1.2 MB | 12.5 |
| HostMatrix | 2 minutes | 200 KB | 12.5 |
| DetailInterface | 5 minutes | 950 KB | 12.5 |
| DetailHostMatrix | 5 minutes | 3.8 MB | 30 |
| DetailHostMatrix | 10 minutes | 15.6 MB | 30 |
You can estimate the amount of UDP traffic that an export device generates when NetFlow Data Export is enabled. To do this, you must understand the characteristics of the traffic in your network, including the average packets per second of switching throughput and the average number of packets per flow.
For example, if the average throughput on a NetFlow-enabled export device is 150 kpps and the average number of packets per flow is 100, you can expect approximately 1500 flow records per second (150,000 / 100) to be exported by the export device. Assuming NetFlow data export format Version 5 datagrams, you should expect approximately 50 NetFlow export datagrams per second (1500 flows / 30 flows per export datagram), or about 75 KB per second (50 datagrams x 1500 bytes per datagram), from the export device.
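The estimate can be reproduced with a few lines of arithmetic. In the Python sketch below, the flow rate is the packet rate divided by the packets per flow, the datagram rate is the flow rate divided by the flows carried per datagram, and the bandwidth is the datagram rate multiplied by the datagram size. The figures of 30 flows and 1500 bytes per Version 5 datagram are the approximations used in the example; the function name is hypothetical.

```python
def estimate_export_load(packets_per_second, packets_per_flow,
                         flows_per_datagram=30, bytes_per_datagram=1500):
    """Estimate NetFlow export load from average traffic characteristics.

    Returns (flows per second, datagrams per second, bytes per second).
    """
    flows_per_second = packets_per_second / packets_per_flow
    datagrams_per_second = flows_per_second / flows_per_datagram
    bytes_per_second = datagrams_per_second * bytes_per_datagram
    return flows_per_second, datagrams_per_second, bytes_per_second


if __name__ == "__main__":
    flows, datagrams, rate = estimate_export_load(150_000, 100)
    print(f"{flows:.0f} flows/s, {datagrams:.0f} datagrams/s, {rate / 1000:.0f} KB/s")
```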
For example, if the total usage of the partition specified by the DataSetPath attribute is currently at 400 MB, and the DiskSpaceLimit attribute is set to 405 MB, data buffered in FlowCollector for the given thread is written to disk only if the resulting files are less than 5 MB in size. If the resulting files are 5 MB or larger, they are not written to disk, the buffered data is discarded, and an error message is written to the log. This condition persists until you make additional disk space available.
Another example involves a case where two threads write their data files in the same dedicated data file partition. The DiskSpaceLimit attribute allows you to reserve disk space for one thread by limiting the disk space a second thread can consume. For example, if the data file partition shared by two threads is a 2000 MB space, and you want to ensure that one thread always has priority to disk space, you might set the DiskSpaceLimit attribute of the favored thread to 0 (no limit), and set the DiskSpaceLimit of the other thread to some adequate number, such as 750 MB. That way, if the combined total of disk space used by both threads is 750 MB, the favored thread will be given access to disk space, while the other thread is held off until more disk space becomes available.
Assuming you have other techniques to prevent disk space exhaustion, setting the DiskSpaceLimit attribute to 0 produces optimal performance, because FlowCollector can then bypass expensive disk space checks.
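The DiskSpaceLimit behavior described above amounts to a comparison made before each write. The following Python sketch models it under the assumptions stated in the examples (a limit of 0 disables the check; otherwise the files are written only if current usage plus the new files stays below the limit). The function name `may_write` is hypothetical, and the sketch does not model schemes to which the attribute does not apply.

```python
def may_write(partition_usage_mb, file_size_mb, disk_space_limit_mb):
    """Return True if a thread's buffered data would be written to disk."""
    if disk_space_limit_mb == 0:
        return True  # no limit: the disk space check is bypassed entirely
    # Written only if the new files still fit below the configured limit.
    return partition_usage_mb + file_size_mb < disk_space_limit_mb


if __name__ == "__main__":
    print(may_write(400, 4, 405))    # fits below the 405 MB limit
    print(may_write(400, 5, 405))    # 5 MB or larger: data is discarded
    print(may_write(400, 500, 0))    # limit of 0 means no check
```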
Finally, the DiskSpaceLimit attribute does not apply to the RawFlow aggregation scheme, because this scheme cannot tolerate the performance overhead associated with disk space checks.
The features and thread attributes mentioned above will help you to manage disk space usage by FlowCollector. You may need to use your own file archival and deletion techniques to save older data files and prevent disk space exhaustion.
The appendixes in this guide provide information on the following topics:
| For more information on... | Refer to ... |
|---|---|
| Helpful information and procedures in case you encounter problems while using FlowCollector | "Troubleshooting" |
| NetFlow export datagram formats | "NetFlow Export Datagram Format" |