Customizing FlowCollector

This chapter describes how to customize FlowCollector operation using thread, filter, and protocol definitions, lists of port and autonomous system numbers, and other FlowCollector configuration parameters.

This chapter includes information on the following topics:

Before You Begin: FlowCollector Configuration and Resource Files

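The process of customizing FlowCollector operation involves changes and additions to one or more of the following FlowCollector configuration and resource files, each described in this section:

nf.config
nfknown.protocols
nfknown.srcports
nfknown.dstports
nfknown.srcasns
nfknown.dstasns
nf.resources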


Note You can change any of these files by using a text editor to edit the appropriate file.

You can also use the interactive features of the NFUI to add, modify, or delete thread definitions (nf.config file) and add or delete filter definitions (nf.config file) or protocol definitions (nfknown.protocols file). For information on the use of the NFUI, refer to the chapter "Using the NetFlow FlowCollector User Interface," earlier in this guide.

nf.config

The nf.config file contains definitions of the aggregation tasks that collect and aggregate data exported from NetFlow export devices in your network. These aggregation tasks, defined in terms of threads and filters, tell FlowCollector how to collect and aggregate the incoming NetFlow export data. Each aggregation task must have a thread defined for it (filters are optional).

For more information about creating or modifying threads and filters, refer to the section "Understanding FlowCollector Data Collection and Aggregation," later in this chapter.

nfknown.protocols

The nfknown.protocols file contains definitions of recognized Application layer protocols (FTP, Telnet, etc.) for use in aggregating data. These definitions are also used for protocol filters. You edit this file to add or remove protocol definitions. FlowCollector scans this file and maintains a list of protocols it finds. FlowCollector searches the protocols in the order they are defined in nfknown.protocols.


Note To increase FlowCollector performance, put the most often used protocols at the beginning of the file.

For more information about creating or modifying protocols, refer to the section "Defining Protocols," later in this chapter.

nfknown.srcports

The nfknown.srcports file contains Transport layer source TCP or UDP port numbers used in the SourcePort aggregation scheme (or any other aggregation scheme using source port numbers as part of its key). These TCP or UDP port numbers correspond to the port numbers defined in RFC 1700, for example, Telnet = 23, FTP = 20 or 21.

Flow records having source ports that match defined values in this file are aggregated together. Flow records from source ports not defined in this file are aggregated under "Others."

For more information about creating or modifying source port numbers, refer to the section "Defining Source and Destination Port Numbers," later in this chapter.

nfknown.dstports

The nfknown.dstports file contains destination port numbers used in the DestPort aggregation scheme (or any other aggregation scheme using destination port numbers as part of its key).

Flow records having destination ports that match defined values in this file are aggregated together. Flow records from destination ports not defined in this file are aggregated under "Others."

For more information about creating or modifying destination port numbers, refer to the section "Defining Source and Destination Port Numbers," later in this chapter.

nfknown.srcasns

The nfknown.srcasns file contains source autonomous system numbers, either origin or peer, used in the ASMatrix and DetailASMatrix aggregation schemes (or any other aggregation scheme using source autonomous system numbers as part of its key).

Flow records having source autonomous system numbers that match defined values in this file are aggregated together. Flow records from source autonomous system numbers not defined in this file are aggregated under "Others."

For more information about creating or modifying source autonomous system numbers, refer to the section "Defining Source and Destination Autonomous System Numbers," later in this chapter.

nfknown.dstasns

The nfknown.dstasns file contains destination autonomous system numbers, either origin or peer, used in the ASMatrix and DetailASMatrix aggregation schemes (or any other aggregation scheme using destination autonomous system numbers as part of its key).

Flow records having destination autonomous system numbers that match defined values in this file are aggregated together. Flow records from destination autonomous system numbers not defined in this file are aggregated under "Others."

For more information about creating or modifying destination autonomous system numbers, refer to the section "Defining Source and Destination Autonomous System Numbers," later in this chapter.

nf.resources

The nf.resources file contains the variables and corresponding directory file path names used to configure your startup FlowCollector environment. Besides the path names, the nf.resources file also includes a number of configuration parameters for tuning FlowCollector performance and behavior.

Understanding FlowCollector Data Collection and Aggregation

FlowCollector collects and summarizes (aggregates) data into data files based on user-defined criteria specified in a FlowCollector thread. A thread is an aggregation task defined by a set of user-configurable attributes that specify how FlowCollector will aggregate the traffic flows stored on the workstation. Two key thread attributes are the aggregation scheme, which determines how the data is summarized, and the optional filters, which determine which flow data is included or excluded.

Figure 6-1 shows an example of how FlowCollector uses threads and filters. In this example, threadA uses filterA and the SourceNode aggregation scheme; threadB uses both filterA and filterB (filters can be shared among threads) and the DestPort aggregation scheme; threadC doesn't use any filters, but it also uses the DestPort aggregation scheme.


Figure 6-1: NetFlow FlowCollector Data Aggregation Example



The following nf.config file example contains the thread definitions to accomplish the general data aggregation scheme shown in Figure 6-1. The file contains two filter definitions (filterA and filterB) and three thread definitions (threadA, threadB, and threadC) to accomplish the aggregation. NetFlow export traffic arrives on FlowCollector UDP port 9991. Data is aggregated by the SourceNode and DestPort aggregation schemes:

Filter filterA
       permit     nexthop    172.16.23.65        0.0.0.0
Filter filterB
       deny       srcaddr    172.16.0.0          0.0.255.255
       permit     srcaddr    0.0.0.0             255.255.255.255
Thread threadA
       Filter filterA
       Aggregation SourceNode
       Period 10
       Port 9991
       State Active
       DataSetPath /opt/CSCOnfc/Data
       DiskSpaceLimit 0
       FileRetain 0
Thread threadB
       Filter filterA
       Filter filterB
       Aggregation DestPort
       Period 10
       Port 9991
       State Active
       DataSetPath /opt/CSCOnfc/Data
       DiskSpaceLimit 0
       FileRetain 0
Thread threadC
       Aggregation DestPort
       Period 10
       Port 9991
       State Active
       DataSetPath /opt/CSCOnfc/Data
       DiskSpaceLimit 0
       FileRetain 0

In the example, threadA uses filterA to include only the traffic passing through the export device 172.16.23.65. threadB uses both filters: filterA to include only the traffic passing through the export device 172.16.23.65, and filterB to exclude traffic from network 172.16.0.0. threadC uses no filters. All three threads flush their aggregated data every 10 minutes into data files saved in the /opt/CSCOnfc/Data directory.

Creating a Filter

A filter defines which flow data is to be included or excluded as FlowCollector aggregates data. The default condition for a filter is to deny (exclude) the flow.

The syntax for a filter definition is as follows:

filter filter-name
       {permit|deny} type value mask
                  .
                  .
                  .
       {permit|deny} type value mask

where:

filter The keyword that identifies the definition as a filter.
filter-name The unique, user-specified name of the filter. The name can be up to 14 alphanumeric characters.
permit The permit keyword keeps the data that matches the specified filter type and value.
deny The deny keyword rejects the data that matches the specified filter type and value (matching flow data is ignored and not aggregated).
type The filter type. See Table 6-1 for a description of filter types.
value The value associated with the filter type. All filter types require a value. See Table 6-1 for a description of filter types and values.
mask If the filter uses the srcaddr, dstaddr, or nexthop type, you must provide the IP netmask that qualifies the IP address used as value. See Table 6-1 for a description.

Filter keyword and variable entries are not case-sensitive. Table 6-1 describes the default filter types provided with FlowCollector, the type of input required for the value, and whether the value requires a mask.


Table 6-1: Filter Types, Values, and Their Descriptions
Type | Value | Mask Required | Description
srcaddr | Source IP address | Yes | Filter the input data based on the source IP address. If you use this type, you must provide the IP netmask that qualifies the source IP address.
dstaddr | Destination IP address | Yes | Filter the input data based on the destination IP address. If you use this type, you must provide the IP netmask that qualifies the destination IP address.
srcport | Source port number | No | Filter the input data based on the source port number.
dstport | Destination port number | No | Filter the input data based on the destination port number.
srcinterface | Source interface number | No | Filter the input data based on the source interface number.
dstinterface | Destination interface number | No | Filter the input data based on the destination interface number.
nexthop | Next hop IP address | Yes | Filter the input data based on the next hop IP address. If you use this type, you must provide the IP netmask that qualifies the next hop IP address.
Protocol | Protocol name | No | Filter the input data based on the protocol definitions in the nfknown.protocols file. For more information on protocol definitions, refer to the section "Defining Protocols."
Prot | Protocol number | No | Filter the input data based on the protocol number in the flow record, where the protocol number corresponds to a protocol specified in the /etc/protocols file of your workstation.
TOS | Type of service | No | Filter the input data based on the type of service (ToS).
srcas | Source AS | No | Filter the input data based on the autonomous system number of the source, either origin or peer.
dstas | Destination AS | No | Filter the input data based on the autonomous system number of the destination, either origin or peer.

When defining a filter, keep in mind the following qualifications:

Because the default condition is deny, a filter that contains only a deny condition for port 80 denies all flows going to port 80 and all other flows as well. If you want to deny flows to port 80 only, but permit all other flows, you need an explicit wildcard entry to permit the other flows. For example:
    filter kill-www
           deny Dstport 80
           permit Dstaddr 0.0.0.0 255.255.255.255

The order of the filter conditions matters. For example, the filter named filterA contains four filter conditions. The first condition states that all flows from network 171.69.1.0 are permitted. Because of the order in which the filter conditions appear in this example, filterA allows a flow from Srcaddr = 171.69.1.24 with Srcport = 53, even though the third condition denies flows from port 53.
    Filter filterA
           permit     Srcaddr        171.69.1.24       0.0.0.255
           deny       Srcaddr        204.233.0.0       0.0.255.255
           deny       Srcport        53
           permit     Dstaddr        0.0.0.0           255.255.255.255

If you want to permit traffic from network 171.69.1.0, but deny traffic coming from port 53, you should change the order of the filter conditions as follows:
    Filter filterA
           deny       Srcaddr        204.233.0.0       0.0.255.255
           deny       Srcport        53
           permit     Srcaddr        171.69.1.24       0.0.0.255
           permit     Srcaddr        0.0.0.0           255.255.255.255

The last filter condition overrides the default behavior, which calls for denying all flows that do not match any of the first three filter conditions.
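
The ordering behavior described above can be sketched in a few lines of Python. This is an illustration only, not FlowCollector code. It assumes that conditions are checked in order and the first match decides, and that the mask marks the bits to ignore (which is how the examples above read: 0.0.0.255 covers network 171.69.1.0, and 255.255.255.255 matches any address). The function names, the flow dictionary, and the destination address 10.0.0.1 are hypothetical.

```python
import ipaddress

def addr_matches(addr, value, mask):
    """True if addr equals value on every bit the mask leaves as 0.

    Assumption: 1 bits in the mask are ignored, as the examples suggest.
    """
    a = int(ipaddress.IPv4Address(addr))
    v = int(ipaddress.IPv4Address(value))
    care = ~int(ipaddress.IPv4Address(mask)) & 0xFFFFFFFF
    return (a & care) == (v & care)

def evaluate(conditions, flow):
    """Return 'permit' or 'deny' for a flow (a dict keyed by lowercase filter type)."""
    for action, ftype, value, mask in conditions:
        key = ftype.lower()                 # filter entries are not case-sensitive
        if mask is not None:
            hit = addr_matches(flow[key], value, mask)
        else:
            hit = flow[key] == value
        if hit:
            return action                   # assumed: the first matching condition decides
    return "deny"                           # default condition of a filter

original = [
    ("permit", "Srcaddr", "171.69.1.24", "0.0.0.255"),
    ("deny",   "Srcaddr", "204.233.0.0", "0.0.255.255"),
    ("deny",   "Srcport", 53, None),
    ("permit", "Dstaddr", "0.0.0.0", "255.255.255.255"),
]
reordered = [
    ("deny",   "Srcaddr", "204.233.0.0", "0.0.255.255"),
    ("deny",   "Srcport", 53, None),
    ("permit", "Srcaddr", "171.69.1.24", "0.0.0.255"),
    ("permit", "Srcaddr", "0.0.0.0", "255.255.255.255"),
]

flow = {"srcaddr": "171.69.1.24", "dstaddr": "10.0.0.1", "srcport": 53}
print(evaluate(original, flow))   # permit
print(evaluate(reordered, flow))  # deny
```

With the original order the flow from port 53 is permitted by the first condition; with the reordered conditions the deny for port 53 is reached first.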

Creating a Thread

A thread is a set of defined attributes that tells FlowCollector how to aggregate the traffic flows stored on the workstation.


Note You can create as many threads as required to meet your needs, but no more than 10 threads can be active at a time. Use the State attribute (described below) to make a thread active or inactive.

The syntax for a thread definition is as follows:

Thread thread-name
[Filter filter-name]
          .
          .
          .
[Filter filter-name]
Aggregation scheme 
Period minutes
Port value
DataSetPath directory-path
[DiskSpaceLimit Megabytes]
State active|inactive
FileRetain number

Note The FileRetain attribute must be the last attribute of a thread.

The keywords and their arguments are listed on separate lines for legibility. Keyword and argument entries are not case-sensitive. Table 6-2 lists thread attributes and variables.


Table 6-2: Attributes and Variables for Creating a Thread
Attribute | Variable | Definition
Thread | thread-name | Unique, user-defined name of the thread. Can be up to 18 alphanumeric characters.
Filter | filter-name | (Optional.) Unique name of a previously defined filter. You can specify one or more filters in a thread definition. When more than one filter is specified in a thread, the result is a logical AND of the functions defined in the filters. Filters can be shared among threads. For more information on filters, refer to the section "Creating a Filter," earlier in this chapter.
Aggregation | scheme | A way to summarize data collected by FlowCollector. For more information about aggregation schemes, refer to the section "Aggregation Schemes," later in this chapter.
Period | minutes | The interval, in minutes, at which FlowCollector writes aggregated data from its memory buffers into a data file. Data received in each period is written into a separate file. For example, setting the period to 30 minutes generates two data files every hour.
Port | value | The UDP port number on which FlowCollector is expecting NetFlow data from NetFlow export devices. The valid range of ports is between 1024 and 65535. In a default FlowCollector installation, UDP ports 9995 and 9996 are automatically configured as the UDP ports FlowCollector uses to receive NetFlow exported data. These numbers are defined in the default set of threads provided as part of the FlowCollector installation. You can define other UDP port numbers by selecting a number in the range 1024 to 65535, and using that number as the value in the Port attribute of an active thread definition.
DataSetPath | directory-path | Directory path used for storing the aggregated data (data files). If FlowCollector does not have write permission to the directory specified by a DataSetPath attribute in a thread definition, it uses $NFC_DIR as the root directory for the data files. For more information on data files, refer to the chapter "Understanding the FlowCollector Data File Format," earlier in this guide.
DiskSpaceLimit | Megabytes | (Optional.) Defines a limit on the total disk usage for the disk partition where DataSetPath resides, beyond which this thread will no longer write data to the disk. This attribute allows you to reserve disk space for other threads by limiting the amount of disk space consumed by this thread, and can help prevent disk space exhaustion as well. The default for DiskSpaceLimit is 0 MB, which means DiskSpaceLimit checking is disabled. If the thread uses RawFlows as the aggregation scheme, the DiskSpaceLimit attribute has no effect. For more information on using the DiskSpaceLimit attribute, refer to the sections "DiskSpaceLimit Thread Attribute" and "Managing Disk Space," later in this chapter.
State | active or inactive | When a thread is active, FlowCollector aggregates data according to the attributes defined for the thread and produces data files; when the thread is inactive, FlowCollector does not aggregate data according to the attributes defined for the thread and does not produce data files. You can have a maximum of 10 active threads at any time.
FileRetain | number | Specifies the maximum number of data files to retain for each export device per day for this thread. The default for FileRetain is 0, which means the FileRetain feature is disabled. The FileRetain attribute must be the last attribute of a thread.


Note You should not define two active threads that use the same aggregation scheme and DataSetPath. Doing this causes FlowCollector to produce an unusable data file.

In the following example, thread Alpha uses the SourceNode aggregation scheme. FlowCollector creates a data file in the directory /opt/CSCOnfc/Data every 30 minutes and keeps the last 24 data files per day:

Thread Alpha
       Aggregation SourceNode 
       Period 30
       Port 9991
       State Active
       DataSetPath /opt/CSCOnfc/Data
       DiskSpaceLimit 0
       FileRetain 24
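
The constraints stated in this section (the valid port range, the limit of 10 active threads, and the warning about two active threads sharing an aggregation scheme and DataSetPath) can be checked before editing nf.config. The following Python sketch is one way to do that under stated assumptions: the dictionary layout, the function names, and the thread named Beta are hypothetical, and nothing here is part of FlowCollector itself.

```python
from collections import Counter

MAX_ACTIVE_THREADS = 10           # limit stated in this section
PORT_RANGE = range(1024, 65536)   # valid UDP ports per Table 6-2

def check_threads(threads):
    """Return a list of problems found in a list of thread dicts (hypothetical layout)."""
    problems = []
    active = [t for t in threads if t["state"].lower() == "active"]
    if len(active) > MAX_ACTIVE_THREADS:
        problems.append(f"{len(active)} active threads; at most {MAX_ACTIVE_THREADS} allowed")
    for t in threads:
        if t["port"] not in PORT_RANGE:
            problems.append(f"{t['name']}: port {t['port']} outside 1024-65535")
    # Two active threads should not share an aggregation scheme and DataSetPath.
    pairs = Counter((t["aggregation"].lower(), t["path"]) for t in active)
    for (scheme, path), count in pairs.items():
        if count > 1:
            problems.append(f"{count} active threads share {scheme} and {path}")
    return problems

def files_per_day(period_minutes):
    """Data files written per export device per day at the given Period."""
    return (24 * 60) // period_minutes

alpha = {"name": "Alpha", "aggregation": "SourceNode", "port": 9991,
         "state": "Active", "path": "/opt/CSCOnfc/Data"}
beta = dict(alpha, name="Beta")   # hypothetical thread that conflicts with Alpha

print(check_threads([alpha]))        # []
print(check_threads([alpha, beta]))  # one problem: shared scheme and path
print(files_per_day(30))             # 48
```

For thread Alpha, a Period of 30 produces 48 data files per export device per day, so a FileRetain value of 24 keeps at most half of them.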

Aggregation Schemes

FlowCollector provides a library of predefined aggregation schemes (see Table 6-3) that you can use to determine the type of information that is aggregated and stored in the data files.


Note You can specify only one aggregation scheme per thread.

Each aggregation scheme consists of one or more key fields, which tell FlowCollector what to look for in the exported NetFlow datagram, and one or more value fields, which contain statistical information pulled from the exported NetFlow datagram. The key fields and value fields shown in Table 6-3 correspond to the fields found in Version 1, Version 5, and Version 7 NetFlow export datagrams. Table 6-4 provides brief definitions of each of the key and value fields. For more information about these three versions of the NetFlow export datagram format, refer to the appendix "NetFlow Export Datagram Format," later in this guide.

For example, the SourceNode aggregation scheme uses just one key field, srcaddr (source address), and returns data for three value fields, the total number of packets sent, the total number of bytes sent, and the total number of flows aggregated into this record. Other aggregation schemes offer different combinations of key and value fields (see Table 6-3), and are described individually in the sections that follow.
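
The relationship between key fields and value fields can be sketched as a group-and-sum operation. The Python below is a simplified illustration, not the FlowCollector implementation; the `aggregate` function and the flow records are made up for the example.

```python
from collections import defaultdict

def aggregate(flows, key_fields):
    """Group flow records by the key fields and sum the value fields."""
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0, "flows": 0})
    for f in flows:
        key = tuple(f[k] for k in key_fields)
        rec = totals[key]
        rec["packets"] += f["packets"]
        rec["bytes"] += f["bytes"]
        rec["flows"] += 1            # each input flow adds one to the flow count
    return dict(totals)

flows = [  # made-up flow records
    {"srcaddr": "10.1.1.1", "dstaddr": "10.2.2.2", "packets": 5, "bytes": 500},
    {"srcaddr": "10.1.1.1", "dstaddr": "10.3.3.3", "packets": 2, "bytes": 120},
    {"srcaddr": "10.4.4.4", "dstaddr": "10.2.2.2", "packets": 1, "bytes": 40},
]

source_node = aggregate(flows, ["srcaddr"])             # SourceNode-style key
host_matrix = aggregate(flows, ["srcaddr", "dstaddr"])  # HostMatrix-style key

print(source_node[("10.1.1.1",)])  # {'packets': 7, 'bytes': 620, 'flows': 2}
print(len(host_matrix))            # 3
```

Adding key fields produces more, finer-grained records: the same three flows collapse into two SourceNode-style records but remain three HostMatrix-style records.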


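Table 6-3: FlowCollector Aggregation Schemes, Key Fields, and Value Fields
Aggregation Scheme | Key Fields | Value Fields
RawFlows | (none) | (none)
SourceNode | srcaddr | packet count, byte count, flow count
DestNode | dstaddr | packet count, byte count, flow count
HostMatrix | srcaddr, dstaddr | packet count, byte count, flow count
SourcePort | srcport | packet count, byte count, flow count
DestPort | dstport | packet count, byte count, flow count
Protocol | protocol | packet count, byte count, flow count
DetailDestNode | dstaddr, srcport, dstport, protocol | packet count, byte count, flow count
DetailHostMatrix | srcaddr, dstaddr, srcport, dstport, protocol | packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, totalActiveTime
DetailInterface | srcaddr, dstaddr, input interface, output interface, nexthop | packet count, byte count, flow count
CallRecord | srcaddr, dstaddr, srcport, dstport, protocol, ToS | packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, totalActiveTime
ASMatrix | src_as, dst_as | packet count, byte count, flow count
NetMatrix | input interface, output interface, masked srcaddr, masked dstaddr, src_mask, dst_mask | packet count, byte count, flow count
DetailSourceNode | srcaddr, srcport, dstport, protocol | packet count, byte count, flow count
DetailASMatrix | srcaddr, dstaddr, srcport, dstport, protocol, input interface, output interface, src_as, dst_as | packet count, byte count, flow count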

Table 6-4: Key and Value Field Definitions
Field | Description
srcaddr | Source IP address
dstaddr | Destination IP address
srcport | TCP/UDP source port number or equivalent
dstport | TCP/UDP destination port number or equivalent
protocol | The name or label assigned to a protocol definition in the nfknown.protocols file
protocol byte (prot) | IP protocol type (for example, TCP=6; UDP=17)
ToS | IP type-of-service
input interface | SNMP index of input interface
output interface | SNMP index of output interface
nexthop | IP address of next hop export device
src_as | Autonomous system number of the source, either origin or peer
dst_as | Autonomous system number of the destination, either origin or peer
masked srcaddr | srcaddr masked with the source netmask (src_mask)
masked dstaddr | dstaddr masked with the destination netmask (dst_mask)
src_mask | Source IP address prefix mask bits
dst_mask | Destination IP address prefix mask bits
packet count | Packets counted as part of this record
byte count | Total number of Layer 3 bytes counted as part of this record
flow count | Total number of flows aggregated into this record
firstTimeStamp | The time, in UTC seconds, of the first packet summarized into this record
lastTimeStamp | The time, in UTC seconds, of the last packet summarized into this record
totalActiveTime | The sum of individual activetime for all the flows summarized into the current record

RawFlows

The output from this aggregation scheme is an exact image of the NetFlow export datagram, without aggregation. It is stored in binary data files, each holding n minutes' worth of data, where n is specified by the Period attribute in the thread definition.


Note You cannot use filters or the DiskSpaceLimit thread attribute with the RawFlows aggregation scheme.

SourceNode

The output of this aggregation scheme consists of one record for each unique source IP address present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key field: srcaddr
Value fields: packet count, byte count, and flow count

DestNode

The output of this aggregation scheme consists of one record for each unique destination IP address present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key field: dstaddr
Value fields: packet count, byte count, and flow count

HostMatrix

The output of this aggregation scheme consists of one record for each unique source and destination IP address pair present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: srcaddr, dstaddr
Value fields: packet count, byte count, and flow count

SourcePort

The output of this aggregation scheme consists of one record for each unique source port present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key field: srcport
Value fields: packet count, byte count, and flow count

Known source ports are defined in the nfknown.srcports file. Undefined source ports are aggregated as "Others" in the data file.

DestPort

The output of this aggregation scheme consists of one record for each unique destination port present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key field: dstport
Value fields: packet count, byte count, and flow count

Known destination ports are defined in the nfknown.dstports file. Undefined destination ports are aggregated as "Others" in the data file.

Protocol

The output of this aggregation scheme consists of one record for each unique protocol (as defined in the nfknown.protocols file) present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key field: protocol
Value fields: packet count, byte count, and flow count

Known protocols are defined in the nfknown.protocols file. Undefined protocols are aggregated as "Others" in the data file.

DetailDestNode

The output of this aggregation scheme consists of one record for each unique combination of destination IP address, source port, destination port, and protocol present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: dstaddr, srcport, dstport, protocol
Value fields: packet count, byte count, and flow count

DetailHostMatrix

The output of this aggregation scheme consists of one record for each unique combination of source IP address, destination IP address, source port, destination port, and protocol present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: srcaddr, dstaddr, srcport, dstport, protocol
Value fields: packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, and totalActiveTime
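
The three timing value fields can be read as follows, based on the definitions in Table 6-4. This Python sketch is an interpretation, not FlowCollector code: it assumes each flow carries a first and last packet time in UTC seconds and that a flow's active time is the difference between them. The function name and the sample times are hypothetical.

```python
def merge_timing(flows):
    """Combine per-flow (first, last) times into record-level timing values."""
    first_time_stamp = min(first for first, _ in flows)
    last_time_stamp = max(last for _, last in flows)
    # Assumption: one flow's active time is last - first.
    total_active_time = sum(last - first for first, last in flows)
    return first_time_stamp, last_time_stamp, total_active_time

# Three made-up flows that share the same key fields.
print(merge_timing([(100, 160), (130, 150), (400, 410)]))  # (100, 410, 90)
```

Note that totalActiveTime (90) is smaller than the span between the two time stamps (310), because idle time between flows is not counted.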

DetailInterface

The output of this aggregation scheme consists of one record for each unique combination of source IP address, destination IP address, input interface, output interface, and next hop present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: srcaddr, dstaddr, input interface, output interface, nexthop
Value fields: packet count, byte count, and flow count

CallRecord

The output of this aggregation scheme consists of one record for each unique combination of source IP address, destination IP address, source port, destination port, protocol, and type of service present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: srcaddr, dstaddr, srcport, dstport, protocol, ToS
Value fields: packet count, byte count, flow count, firstTimeStamp, lastTimeStamp, and totalActiveTime

ASMatrix

The output of this aggregation scheme consists of one record for each unique source and destination autonomous system number pair present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: src_as, dst_as
Value fields: packet count, byte count, and flow count

Note This aggregation scheme is only valid when used with Version 5 or later export data because it requires data that is only present in Version 5 or later datagrams.

NetMatrix

The output of this aggregation scheme consists of one record for each unique combination of input interface, output interface, masked source IP address, masked destination IP address, source mask, and destination mask present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: input interface, output interface, masked srcaddr, masked dstaddr, src_mask, dst_mask
Value fields: packet count, byte count, and flow count

Note This aggregation scheme is only valid when used with Version 5 export data because it requires data that is only present in Version 5 or later NetFlow datagrams.
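
The masked srcaddr and masked dstaddr key fields are the addresses with their host bits cleared. The Python sketch below illustrates the idea, assuming that src_mask and dst_mask hold the number of prefix bits, as Table 6-4 describes. The `masked` function is a hypothetical helper and the prefix lengths are chosen for illustration.

```python
import ipaddress

def masked(addr, mask_bits):
    """Keep the first mask_bits bits of addr and zero the rest."""
    network = ipaddress.ip_network(f"{addr}/{mask_bits}", strict=False)
    return str(network.network_address)

print(masked("172.16.23.65", 16))  # 172.16.0.0
print(masked("172.16.23.65", 24))  # 172.16.23.0
```

Flows from 172.16.23.65 and 172.16.40.9 with a 16-bit source mask would therefore share the same masked srcaddr, 172.16.0.0.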

DetailSourceNode

The output of this aggregation scheme consists of one record for each unique combination of source IP address, source port, destination port, and protocol present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: srcaddr, srcport, dstport, protocol
Value fields: packet count, byte count, and flow count

DetailASMatrix

The output of this aggregation scheme consists of one record for each unique combination of source IP address, destination IP address, source port, destination port, protocol, input interface, output interface, source autonomous system number, and destination autonomous system number present in the flow data received by FlowCollector during the current collection period. Each output record contains the following fields:

Key fields: srcaddr, dstaddr, srcport, dstport, protocol, input interface, output interface, src_as, dst_as
Value fields: packet count, byte count, and flow count

Note This aggregation scheme is only valid when used with Version 5 export data because it requires data that is only present in Version 5 or later NetFlow datagrams.

Defining Protocols

Use the information in this section to define the protocols that you want FlowCollector to recognize as it aggregates data. The protocols FlowCollector recognizes are defined in the nfknown.protocols file, located in the $NFC_DIR/config directory.


Note The path name to the nfknown.protocols file is defined in the nf.resources file.

FlowCollector will recognize the protocol and aggregate traffic statistics associated with the protocol only when the following conditions are met:

If you remove the protocol from the nfknown.protocols file, information for that protocol is no longer recognized and is aggregated under "Others."

Figure 6-2 shows an example of a typical communication session between host A and host B. This example assumes that NetFlow data export is enabled for the export device interfaces to both host A and host B so that exported NetFlow data gives FlowCollector statistics for communication in both directions (A to B; B to A). In this example, FlowCollector aggregates data for two protocols, Telnet and FTP, between host A (the Telnet server, using port 23) and host B (the Telnet client, using port 9001).

Whether the aggregated data is stored in data files for later retrieval depends on how FlowCollector is customized.


Figure 6-2: Data Collection Example



Table 6-5 lists the raw data received in the data collection example shown in Figure 6-2.


Table 6-5: Data Received Example
Source Address | Destination Address | Source Port | Destination Port | Protocol Byte | Packets | Bytes
A | B | 23 | 9001 | 6 | 20 | 2000
B | A | 9001 | 23 | 6 | 30 | 1000
A | B | 20 | 9002 | 6 | 20 | 200
B | A | 9002 | 20 | 6 | 50 | 300

When you add the Telnet protocol definition to the nfknown.protocols file, the data file produced by the Protocol aggregation scheme will contain a row with 50 packets (20 plus 30) and 3000 bytes (2000 plus 1000).

In this example, no FTP protocol definition was added to the nfknown.protocols file, so the data file will also have another row for "Others" (including the FTP data) containing 70 packets (20 plus 50) and 500 bytes (200 plus 300).
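
The totals above can be reproduced with a short Python sketch over the rows of Table 6-5. It is an illustration of the arithmetic only; the `is_telnet` function stands in for a Telnet protocol definition (TCP with port 23 on either side) and is not how FlowCollector is implemented.

```python
# Rows of Table 6-5: (source, destination, srcport, dstport, prot, packets, bytes)
records = [
    ("A", "B", 23, 9001, 6, 20, 2000),
    ("B", "A", 9001, 23, 6, 30, 1000),
    ("A", "B", 20, 9002, 6, 20, 200),
    ("B", "A", 9002, 20, 6, 50, 300),
]

def is_telnet(srcport, dstport, prot):
    """Stand-in for a Telnet definition: TCP (prot 6) with port 23 on either side."""
    return prot == 6 and (srcport == 23 or dstport == 23)

totals = {}
for _src, _dst, srcport, dstport, prot, packets, nbytes in records:
    name = "Telnet" if is_telnet(srcport, dstport, prot) else "Others"
    pkt_sum, byte_sum = totals.get(name, (0, 0))
    totals[name] = (pkt_sum + packets, byte_sum + nbytes)

print(totals)  # {'Telnet': (50, 3000), 'Others': (70, 500)}
```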

The protocols listed in the nfknown.protocols file are used by the aggregation schemes and protocol filters you define in the nf.config file. To configure the protocols that FlowCollector recognizes, you must edit the nfknown.protocols file and add a definition that includes the following information:

The command syntax for a protocol definition is

protocol name
       [[srcport|dstport] number [OR [srcport|dstport] number]]
         prot value

where:

protocol The keyword that identifies the definition as a protocol.
name Unique, user-specified name of the protocol definition. Can be up to 14 alphanumeric characters.
srcport Source port
dstport Destination port
number Port number
OR (Optional) Provides a Boolean OR functionality when you have more than one srcport or dstport.
prot Protocol byte
value Protocol type (similar to those specified in the /etc/protocols file on a UNIX workstation; for example, TCP=6, UDP=17).

Note Protocol keyword and variable entries are not case-sensitive.

The known protocols (such as WWW, Telnet, FTP) listed in the nfknown.protocols file are similar to the definitions specified in the /etc/services file of a UNIX workstation. For information about the protocols and protocol types supported on your workstation, refer to the protocols file in the /etc directory on your workstation.

The protocol definitions in the nfknown.protocols file cause FlowCollector to recognize the protocols originating from, or terminating on, the specified ports. For example, in the sample protocol list shown below, the first protocol definition uses the OR option to cause FlowCollector to recognize traffic flows for all Telnet sessions originating from or terminating on port 23.

Protocol TCP-Telnet
         Dstport 23 OR Srcport 23
         Prot 6
Protocol TCP-FTP
         Srcport 20 OR Srcport 21 OR Dstport 20 OR Dstport 21
         Prot 6
Protocol TCP-WWW
         Dstport 80 OR Srcport 80
         Prot 6
Protocol TCP-SMTP
         Srcport 25 OR Dstport 25
         Prot 6
Protocol TCP-Other
         Prot 6
Protocol UDP-TFTP
         Srcport 69 OR Dstport 69
         Prot 17
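
The following Python sketch illustrates how definitions in this format could be read and applied to a flow. It is not FlowCollector code: the names ProtocolDef, parse_protocols, and classify are hypothetical, and returning the first matching definition is an assumption based on the statement that FlowCollector searches the protocols in the order they are defined.

```python
from dataclasses import dataclass, field
from typing import Optional

SAMPLE = """
Protocol TCP-Telnet
         Dstport 23 OR Srcport 23
         Prot 6
Protocol TCP-Other
         Prot 6
Protocol UDP-TFTP
         Srcport 69 OR Dstport 69
         Prot 17
"""


@dataclass
class ProtocolDef:
    name: str
    prot: Optional[int] = None
    # Each entry is ("srcport" or "dstport", port number); entries are OR-ed.
    ports: list = field(default_factory=list)


def parse_protocols(text):
    """Parse protocol definitions; keywords are matched case-insensitively."""
    defs = []
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        keyword = tokens[0].lower()
        if keyword == "protocol":
            defs.append(ProtocolDef(name=tokens[1]))
        elif keyword == "prot":
            defs[-1].prot = int(tokens[1])
        elif keyword in ("srcport", "dstport"):
            # Drop the OR separators, then read (keyword, number) pairs.
            terms = [t for t in tokens if t.lower() != "or"]
            for kind, number in zip(terms[0::2], terms[1::2]):
                defs[-1].ports.append((kind.lower(), int(number)))
    return defs


def classify(defs, prot, srcport, dstport):
    """Return the name of the first definition matching the flow, or None."""
    flow_ports = {"srcport": srcport, "dstport": dstport}
    for d in defs:
        if d.prot != prot:
            continue
        # A definition with no port terms matches any flow of that protocol.
        if not d.ports or any(flow_ports[k] == n for k, n in d.ports):
            return d.name
    return None


defs = parse_protocols(SAMPLE)
print(classify(defs, prot=6, srcport=40000, dstport=23))
print(classify(defs, prot=6, srcport=40000, dstport=9999))
```

Under the first-match assumption, a catch-all definition such as TCP-Other has to follow the more specific TCP definitions, as it does in the sample list.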

Defining Source and Destination Port Numbers

Use the information in this section to specify the source and destination port numbers from which FlowCollector will collect and aggregate data. The port numbers FlowCollector recognizes are defined in the following files, located in the $NFC_DIR/config directory:


Note The path names to these files are defined in the nf.resources file.

FlowCollector uses the contents of the nfknown.srcports and nfknown.dstports files in any aggregation scheme that uses SourcePort and DestPort fields. When you add a port definition to either of these files, traffic to or from the defined port is counted separately in the data file. Unrecognized ports (ports not defined in their respective files) are aggregated as "Others" in the data file.


Note The nfknown.dstports file uses the same format and syntax conventions as the nfknown.srcports file.

The command syntax for a port number, range of port numbers, or a range grouped under an assigned label is:

value[,value][:label]
           .
           .
           .
value[,value][:label]

where:

value A number between 0 and 65535
label (Optional.) An alphanumeric ASCII string of up to 14 characters

A range of ports is defined by using a comma to separate two numbers (an optional space can be added for legibility). A range can span any set of ports up to the maximum number of ports available on the system (currently 65,535). The following example shows a range of ports:

50, 100

You can also define a range of source or destination ports to be treated as one logical port, and assign a label to represent that range of ports. The following example shows a range of ports to be treated as the logical port named 10K_19K_Port_Range.

10000, 19999: 10K_19K_Port_Range

In this case, traffic is aggregated and reported for the logical port 10K_19K_Port_Range, rather than for each of the individual port numbers in the range.

The following example shows the contents of a sample nfknown.srcports file:

21:ftp
88
50, 100
10000, 19999: 10K_19K_Port_Range
20000, 29999: My_Range
40000, 49999: My_Range

In the example above, the entry:

21:ftp Indicates that a flow with port number of 21 is aggregated under the label ftp in the data file.
88 Indicates that a flow with a port number of 88 is aggregated under the label 88 in the data file.
50, 100 Indicates that a flow with a port number in the range from 50 to 100 is aggregated under a label that is the same as its port number. For example, if a flow has the port number 75, the label of the flow in the data file is 75. Flows within the range but with different port numbers are aggregated individually in the data file.
10000, 19999: 10K_19K_Port_Range Indicates that the port number of any port in the range is replaced by the label 10K_19K_Port_Range, and that flows within the range are aggregated together under the label 10K_19K_Port_Range in the data file.
20000, 29999:My_Range
40000, 49999:My_Range
Indicate that the port number of any port in the two ranges is replaced by the label My_Range, and that flows within the specified ranges are aggregated together under the label My_Range in the data file.
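
The labeling rules described above can be sketched in Python as follows. This is an illustration under stated assumptions, not FlowCollector code: the function names are hypothetical, and the parser assumes one entry per line with no comment lines.

```python
SAMPLE_SRCPORTS = """
21:ftp
88
50, 100
10000, 19999: 10K_19K_Port_Range
20000, 29999: My_Range
40000, 49999: My_Range
"""


def parse_known_ports(text):
    """Return a list of (low, high, label) entries; label is None if absent."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        spec, _, label = line.partition(":")
        bounds = [int(v) for v in spec.split(",")]
        # A single value is treated as a range of one port.
        entries.append((bounds[0], bounds[-1], label.strip() or None))
    return entries


def label_for(entries, port):
    """Return the label under which a flow with this port is aggregated."""
    for low, high, label in entries:
        if low <= port <= high:
            # Unlabeled entries keep the individual port number as the label.
            return label if label else str(port)
    return "Others"


entries = parse_known_ports(SAMPLE_SRCPORTS)
for port in (21, 75, 15000, 45000, 30000):
    print(port, "->", label_for(entries, port))
```

Port 30000 falls outside every entry in the sample file, so it is reported under "Others".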

Defining Source and Destination Autonomous System Numbers

Use the information in this section to specify the source and destination autonomous systems from which FlowCollector will collect and aggregate data. The autonomous systems FlowCollector recognizes are defined in the following files, located in the $NFC_DIR/config directory:


Note The path names to these files are built into the nf.resources file.

FlowCollector uses the contents of the nfknown.srcasns and nfknown.dstasns files in aggregation schemes that make use of source and/or destination autonomous system numbers. When you add an autonomous system definition to either of these files, traffic to or from the autonomous system is counted separately in the data file. Any unrecognized autonomous system numbers (autonomous system numbers not defined in their respective files) are aggregated together and appear as "Others" in the data file.


Note The nfknown.srcasns and nfknown.dstasns files use the same format and syntax conventions as the nfknown.srcports file.

The following example shows the contents of an nfknown.srcasns file.

1:Your_Network
2
10, 15
20, 30:My_Network
35, 40:My_Network

In this example, the entry:

1:Your_Network Indicates that a flow with an autonomous system number of 1 is aggregated under the label Your_Network in the data file.
2 Indicates that a flow with an autonomous system number of 2 is aggregated under the label 2 in the data file.
10, 15 Indicates that a flow with an autonomous system number in the range from 10 to 15 is aggregated under a label that is the same as its autonomous system number. For example, if a flow has the autonomous system number 13, the label of the flow in the data file is 13. Flows within the range but with different autonomous system numbers are aggregated individually in the data file.
20, 30:My_Network
35, 40:My_Network
Indicate that the autonomous system number of any autonomous system in the two ranges is replaced by the label My_Network, and that flows within the specified ranges are aggregated together under the label My_Network in the data file.

Modifying FlowCollector Resources

The nf.resources file contains the configuration parameter settings and directory file path names used to configure your startup FlowCollector environment. Besides the path name definitions, the nf.resources file also includes a set of parameters for tuning FlowCollector performance. The nf.resources file is located in $NFC_DIR/config.

Table 6-6 describes the available configuration parameters and their values.


Table 6-6: Configuration Parameters

Flag: OUTPUT_DOTTEDADDRESS
Possible values: Yes, No
Default value: Yes
Description:
Yes: Writes the IP address to the data files in dotted decimal format, for example, 172.16.3.100.
No: Writes the IP address to the data files in network address format, for example, 8557414940.

Flag: CSV_FORMAT
Possible values: Yes, No
Default value: No
Description:
Yes: Uses a comma (,) as the delimiter in writing aggregation output.
No: Uses a vertical bar ( | ) as the delimiter.

Flag: LONG_OUTPUTFILE_SUFFIX
Possible values: Yes, No
Default value: No
Description:
Yes: Sets the output file extension to add the year, month, and date to the hour and minute, for example, _YYYY_MM_DD.HHMM suffix.
No: Sets the output file extension to add HHMM.

Flag: GMT_FLAG
Possible values: Yes, No
Default value: Yes
Description:
Yes: Uses the Greenwich Mean Time reference to set date and time.
No: Uses local time.
This attribute affects the date and time used in naming the data file directory structure, names of data files, headers in data files, and messages in the log files.

Flag: DEVICE_DOTTEDADDRESS
Possible values: Yes, No
Default value: Yes
Description:
Yes: Uses the IP address of the sending export device for storage.
No: Attempts to get DNS name first.
If a ROUTER_GROUPNAME label has been defined using the ROUTER_GROUPNAME configuration parameter, that label is used; otherwise, the IP address or the DNS name is used, depending on the setting of the DEVICE_DOTTEDADDRESS configuration parameter. For more information on the ROUTER_GROUPNAME configuration parameter, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," later in this section.

Flags and default values:
SOURCENODE_BUCSIZE 2000
DESTNODE_BUCSIZE 2000
HOSTMATRIX_BUCSIZE 2000
NETMATRIX_BUCSIZE 2000
DETAILSOURCENODE_BUCSIZE 2000
DETAILDESTNODE_BUCSIZE 2000
DETAILHOSTMATRIX_BUCSIZE 6000
DETAILINTERFACE_BUCSIZE 6000
CALLRECORD_BUCSIZE 50000
ASMATRIX_BUCSIZE 2000
DETAILASMATRIX_BUCSIZE 50000
Possible values: Varies according to configuration
Description:
The BUCSIZE parameters dictate FlowCollector performance. The term BUCSIZE refers to the number of buffer pages set aside to hold aggregated data for a given aggregation scheme.
The general rule of thumb for BUCSIZE values: If an aggregation scheme produces n records in a collection interval, the corresponding BUCSIZE value should lie (approximately) between n/20 and n.
The best approach is to sample NetFlow traffic, and then determine whether changes are required.

Flag: SOCKET_BUFSIZE
Possible values: Buffer size (in bytes)
Default value: 900000
Description:
Size of the UDP socket receive buffer.

Flag: ROUTER_GROUPNAME
Possible values: List of IP addresses or labels
Default value: Disabled
Description:
Allows a user-specified IP address or label to be substituted for a list of IP addresses from which FlowCollector can receive NetFlow export datagrams.
For more information, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," later in this section.

Flag: ACCEPT_PACKETS_FROM
Possible values: List of IP addresses or labels
Default value: Disabled
Description:
Allows packets to be filtered by source address (or by defined ROUTER_GROUPNAME label). For more information, refer to the section "Preventing FlowCollector from Accepting Unsolicited Packets," later in this chapter.

Flag: USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP
Possible values: Yes, No
Default value: No
Description:
Yes: Uses the address of the router being bypassed (short-cut) as the source of the corresponding flow.
No: Uses the address of the export device being bypassed (short-cut) as the source of the corresponding flow.
For more information, refer to the section "Retaining Router IP Addresses for Switched Export Packets," later in this chapter.

Flag: USER_SCRIPT_LOCATION
Possible values: Path and file name
Default value: Disabled
Description:
Specifies the location of a user-supplied script.
For more information, refer to the section "Using a User-Defined Script to Process FlowCollector Data Files," later in this chapter.

Flag: OUTPUT_BUFFER_SIZE
Possible values: 1, 2, 4, 8, 16
Default value: 4
Description:
The size (in megabytes) of the memory buffer FlowCollector uses for I/O operations.
For more information, refer to the section "Changing the Output Buffer Size," later in this chapter.
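
The BUCSIZE rule of thumb given in Table 6-6 (a value between approximately n/20 and n, where n is the number of records an aggregation scheme produces in a collection interval) can be checked with a short Python sketch. The function names are hypothetical, and the record counts used below are illustrative rather than measured values.

```python
def bucsize_range(records_per_interval):
    """Return the (low, high) BUCSIZE bounds suggested by the n/20 to n rule."""
    n = records_per_interval
    return max(1, n // 20), max(1, n)


def bucsize_in_range(bucsize, records_per_interval):
    """Report whether a BUCSIZE setting lies within the suggested bounds."""
    low, high = bucsize_range(records_per_interval)
    return low <= bucsize <= high


# A scheme producing 40,000 records per interval: 2000 is at the low bound.
print(bucsize_range(40000), bucsize_in_range(2000, 40000))
# A scheme producing 200,000 records per interval: 2000 is too small.
print(bucsize_range(200000), bucsize_in_range(2000, 200000))
```

The record count n comes from sampling your own NetFlow traffic, as the table recommends.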

Increasing UDP Socket Receive Buffer Size

Due to the high volume of NetFlow data export traffic, you might have to increase the normal buffer size associated with the UDP socket on which data is received. To do so, edit the value (in bytes) of the SOCKET_BUFSIZE parameter in the $NFC_DIR/config/nf.resources file.
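
Operating systems commonly cap the receive buffer size a process can request, so a large SOCKET_BUFSIZE value may not take full effect until the system limit is raised. The following Python sketch, which is independent of FlowCollector, requests a UDP receive buffer of a given size and reports what the operating system actually grants. The function name is hypothetical. Some systems report a larger value than requested to account for bookkeeping overhead, and some reject an oversized request with an error instead of reducing it.

```python
import socket


def granted_receive_buffer(requested_bytes):
    """Request a UDP receive buffer size and return what the OS reports."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        sock.close()


requested = 900000  # the default SOCKET_BUFSIZE value
print("requested:", requested, "granted:", granted_receive_buffer(requested))
```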

Mapping a List of IP Addresses to One IP Address or Label

You can substitute a user-specified IP address or label for a set of IP addresses from which FlowCollector receives NetFlow export datagrams. For example, you can specify the label "blab-gateway" as the label representing packets coming from three separate IP addresses: 171.69.1.172, 171.69.1.173, and 191.71.1.25.

To do this, you must edit the ROUTER_GROUPNAME parameter in the nf.resources file. The syntax is

ROUTER_GROUPNAME label {
     a.b.c.d
        .
        .
        .
     w.x.y.z
}

where label is either an IP address or an ASCII word. Each of the IP addresses in the body of the ROUTER_GROUPNAME block must be on a separate line. An example of a ROUTER_GROUPNAME definition follows:

ROUTER_GROUPNAME blab-gateway {
     171.69.1.172
     171.69.1.173
     191.71.1.25
}

If applicable, the mapped ROUTER_GROUPNAME will be used with all aggregation schemes, but FlowCollector uses the real IP address to report errors involving receipt of an invalid or unsolicited NetFlow export packet.
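
As an illustration of the mapping, the following Python sketch reads ROUTER_GROUPNAME blocks and substitutes the label for any listed source address. The function names are hypothetical, and the parser assumes the layout shown above, with no comment lines.

```python
SAMPLE_RESOURCES = """
ROUTER_GROUPNAME blab-gateway {
     171.69.1.172
     171.69.1.173
     191.71.1.25
}
"""


def parse_router_groups(text):
    """Return {ip_address: label} built from ROUTER_GROUPNAME blocks."""
    mapping = {}
    label = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("ROUTER_GROUPNAME"):
            # Header line has the form: ROUTER_GROUPNAME label {
            label = line.split()[1]
        elif line == "}":
            label = None
        elif label is not None:
            mapping[line] = label
    return mapping


def display_name(mapping, source_ip):
    """Return the group label for a source address, or the address itself."""
    return mapping.get(source_ip, source_ip)


groups = parse_router_groups(SAMPLE_RESOURCES)
print(display_name(groups, "171.69.1.173"))
print(display_name(groups, "10.0.0.1"))
```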

Preventing FlowCollector from Accepting Unsolicited Packets

In its default configuration, FlowCollector accepts NetFlow export packets from any IP address. If necessary, you can specify the source IP addresses or defined ROUTER_GROUPNAME labels from which FlowCollector should receive NetFlow export packets, thus preventing FlowCollector from accepting packets from any unspecified sources.

To do this, you must remove the comment character from the beginning of each line in the ACCEPT_PACKETS_FROM parameter in the nf.resources file and edit the parameter to include the source IP addresses or ROUTER_GROUPNAME labels. The syntax of the parameter is

ACCEPT_PACKETS_FROM {
     a.b.c.d
        .
        .
        .
     w.x.y.z
}

where each of the IP addresses (or ROUTER_GROUPNAME labels) defined in the body of the ACCEPT_PACKETS_FROM block must be on a separate line. An example of an ACCEPT_PACKETS_FROM definition follows:

ACCEPT_PACKETS_FROM {
     131.108.2.1
     131.108.2.2
     131.108.2.3
     blab-gateway
}

For information on ROUTER_GROUPNAME labels, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," earlier in this chapter.


Note By default, FlowCollector accepts packets from all sources.
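
The acceptance rule can be sketched in Python as follows. The function name is hypothetical, and the group label and its addresses are taken from the ROUTER_GROUPNAME example earlier in this chapter. An empty list models the default behavior of accepting packets from all sources.

```python
def is_accepted(source_ip, accept_list, router_groups):
    """Decide whether a packet from source_ip should be accepted.

    accept_list: entries from ACCEPT_PACKETS_FROM (addresses or labels).
    router_groups: {label: set of addresses} from ROUTER_GROUPNAME blocks.
    """
    if not accept_list:
        return True  # default: accept packets from any source
    for entry in accept_list:
        if entry == source_ip:
            return True
        if source_ip in router_groups.get(entry, ()):
            return True
    return False


router_groups = {
    "blab-gateway": {"171.69.1.172", "171.69.1.173", "191.71.1.25"},
}
accept_list = ["131.108.2.1", "131.108.2.2", "131.108.2.3", "blab-gateway"]

for ip in ("131.108.2.1", "171.69.1.172", "10.1.1.1"):
    print(ip, is_accepted(ip, accept_list, router_groups))
```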

Retaining Router IP Addresses for Switched Export Packets

If your network includes switching devices that support Version 7 NetFlow export datagrams, you can configure FlowCollector to retain the IP address of the short-cut router as the source of data switched through a Cisco Catalyst 5000 series switch. To do this, you must edit the USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP parameter in the nf.resources file. The syntax of the parameter is

USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP value

where value is either yes or no. The default setting is no. If you change the setting to yes, FlowCollector uses the IP address of the bypassed router as the source of the corresponding flow.


Note When the USE_SHORT_CUT_ADDRESS_AS_SOURCE_IP parameter is set to yes, FlowCollector is not able to show the missed records count in the header of the data files, because it is impossible to predict the IP address of the bypassed router for a lost flow record.

Using a User-Defined Script to Process FlowCollector Data Files

You can specify the location of a script file that FlowCollector will execute after it has written a new data file. This capability makes it easier for your client applications to process a new data file without having to poll for it. FlowCollector invokes the script with the absolute path name of the newly written FlowCollector data file. FlowCollector expects the location of your user-supplied script to be defined by the USER_SCRIPT_LOCATION parameter in the nf.resources file. This parameter is read only at startup.
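
A user-supplied script can be as simple as the following Python sketch, which reports the name, size, and line count of the new data file. It assumes that FlowCollector passes the absolute path name as the first command-line argument; the file name my_script.py and the function name summarize are hypothetical.

```python
#!/usr/bin/env python3
import os
import sys


def summarize(data_file_path):
    """Return (file name, size in bytes, line count) for a data file."""
    with open(data_file_path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    size = os.path.getsize(data_file_path)
    return os.path.basename(data_file_path), size, line_count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: argument 1 is the absolute path of the new data file.
    name, size, lines = summarize(sys.argv[1])
    print(f"{name}: {size} bytes, {lines} lines")
```

A script of this kind would typically hand the file to a client application or copy it elsewhere, so that the application does not need to poll the data directory.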

To use the USER_SCRIPT_LOCATION parameter, perform the following steps:

Step 1 Remove the comment character from the beginning of the USER_SCRIPT_LOCATION entry in the nf.resources file, so that it looks like this:

Step 2 Replace the existing path name with the path name for your script.

For example, if the path name for your script is /opt/CSCOnfc/my_script.sh, the revised parameter should read:


Changing the Output Buffer Size

FlowCollector transfers output data in blocks to optimize performance and ensure the most efficient handling of data files as they are generated and written as disk files. The size of a block is user-configurable, and defined by the OUTPUT_BUFFER_SIZE parameter in the nf.resources file. The syntax of the parameter is

OUTPUT_BUFFER_SIZE size

where size is the new block size in megabytes (MB). The valid sizes are 1, 2, 4, 8, and 16 MB. The recommended setting is approximately 1/32 of the physical memory installed in the FlowCollector workstation. For example, if the physical memory of your FlowCollector workstation is 128 MB, the best setting is 4. If you inadvertently enter an invalid number, FlowCollector uses the next smaller valid number. For example, if your system is equipped with 128 MB and you enter the number 6, FlowCollector uses the next smaller valid number, 4, as the output buffer size.
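
The selection rule can be expressed as a short Python sketch. The function names are hypothetical. Two details are assumptions that the text does not state: a request smaller than 1 falls back to the smallest valid size, and the recommended setting is obtained by rounding 1/32 of physical memory down to a valid size.

```python
VALID_SIZES_MB = (1, 2, 4, 8, 16)


def effective_output_buffer_size(requested_mb):
    """Return the requested size, or the next smaller valid size."""
    candidates = [s for s in VALID_SIZES_MB if s <= requested_mb]
    # Assumption: requests below 1 MB fall back to the smallest valid size.
    return max(candidates) if candidates else VALID_SIZES_MB[0]


def recommended_output_buffer_size(physical_memory_mb):
    """Return roughly 1/32 of physical memory, as a valid size."""
    return effective_output_buffer_size(physical_memory_mb // 32)


print(effective_output_buffer_size(6))
print(recommended_output_buffer_size(128))
```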

Managing Disk Space

Depending on the volume of flow data being exported from the export devices, as well as the FlowCollector thread attribute settings you use, FlowCollector can consume large amounts of disk space in a short period of time. FlowCollector provides several thread attributes and features that can help you manage your disk space usage:

Filters

As described earlier, a filter can help you discard any flow data that is not of interest to you. By using filters to ensure you are storing only data of interest, you can potentially reduce the amount of disk space used by FlowCollector.

Aggregation Schemes

Aggregation schemes are used to define how you want FlowCollector to summarize the flow data being exported from your export devices. By using only those aggregation schemes required for your application and, when possible, by selecting the aggregation schemes that generate the least amount of data on disk, you can reduce the amount of disk space used by FlowCollector. For example, using the HostMatrix aggregation scheme will result in less disk space usage than would the DetailHostMatrix scheme. Of course, which aggregation schemes you use will be determined primarily by the data you are interested in and how you want to summarize that data. It is important to realize, however, that the different aggregation schemes can greatly affect the amount of disk space used by FlowCollector.

Table 6-7 shows example output file sizes and flow arrival rates for several aggregation schemes.


Table 6-7: Aggregation Scheme Examples
Aggregation Scheme   Aggregation Period   Output File Size   Flows per Second
DestNode             10 minutes           240 KB             12.5
HostMatrix           10 minutes           1.2 MB             12.5
HostMatrix           2 minutes            200 KB             12.5
DetailInterface      5 minutes            950 KB             12.5
DetailHostMatrix     5 minutes            3.8 MB             30
DetailHostMatrix     10 minutes           15.6 MB            30

You can estimate the amount of UDP traffic that an export device generates when NetFlow Data Export is enabled. To do this you must understand the characteristics of the traffic in your network, including the average packets per second of switching throughput and the average number of packets per flow.

For example, if the average throughput on a NetFlow-enabled export device is 150 kpps and the average number of packets per flow is 100, the export device may have approximately 1500 flow records per second (150,000 / 100) to export. Assuming NetFlow data export format Version 5 datagrams, you should expect approximately 50 NetFlow export datagrams per second (1500 flows / 30 flows per export datagram), or about 75 KB per second (50 datagrams x 1500 bytes per datagram), from the export device.
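
The estimate can be worked through step by step in Python. The function name is hypothetical, and the defaults of 30 flow records and roughly 1500 bytes per Version 5 export datagram are the round figures used in the example, not exact protocol constants. Flows per second is throughput divided by packets per flow, datagrams per second is flows per second divided by records per datagram, and bytes per second is datagrams per second multiplied by the datagram size.

```python
def estimate_export_load(packets_per_second, packets_per_flow,
                         records_per_datagram=30, bytes_per_datagram=1500):
    """Return (flows/s, export datagrams/s, export bytes/s)."""
    flows_per_second = packets_per_second / packets_per_flow
    datagrams_per_second = flows_per_second / records_per_datagram
    bytes_per_second = datagrams_per_second * bytes_per_datagram
    return flows_per_second, datagrams_per_second, bytes_per_second


flows, datagrams, load = estimate_export_load(150_000, 100)
print(f"{flows:.0f} flows/s, {datagrams:.0f} datagrams/s, {load:.0f} bytes/s")
```

For 150 kpps and 100 packets per flow, this works out to 1500 flows, 50 datagrams, and 75,000 bytes per second.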

FileRetain Thread Attribute

The FileRetain thread attribute allows you to limit the number of files FlowCollector will retain on disk per day for the given thread. By setting the FileRetain attribute to a lower value, you can reduce the amount of disk space FlowCollector uses by reducing the number of retained data files.

DiskSpaceLimit Thread Attribute

The DiskSpaceLimit thread attribute allows you to limit the total amount of disk space a thread can use in the partition specified by the DataSetPath attribute. Beyond the limit specified by the DiskSpaceLimit attribute, FlowCollector will no longer write data to the disk for the given thread.

For example, if the total usage of the partition specified by the DataSetPath attribute is currently at 400 MB, and the DiskSpaceLimit attribute is set to 405 MB, data buffered in FlowCollector for the given thread is written to disk only if the resulting files are less than 5 MB in size. If the resulting files are 5 MB or larger, they are not written to disk, the buffered data is discarded, and an error message is written to the log. This condition persists until you make additional disk space available.

Another example involves a case where two threads write their data files in the same dedicated data file partition. The DiskSpaceLimit attribute allows you to reserve disk space for one thread by limiting the disk space a second thread can consume. For example, if the data file partition shared by two threads is a 2000 MB space, and you want to ensure that one thread always has priority to disk space, you might set the DiskSpaceLimit attribute of the favored thread to 0 (no limit), and set the DiskSpaceLimit of the other thread to some adequate number, such as 750 MB. That way, if the combined total of disk space used by both threads is 750 MB, the favored thread will be given access to disk space, while the other thread is held off until more disk space becomes available.

Assuming you have other techniques to prevent disk space exhaustion, setting the DiskSpaceLimit attribute to 0 produces optimal performance, because FlowCollector can then bypass expensive disk space checks.
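
The write decision described in the examples above can be summarized in a short Python sketch. The function name is hypothetical, and the sketch models only the behavior stated in the text: a limit of 0 means no limit, and otherwise files are written only if the partition usage plus the size of the new files stays below the limit.

```python
def may_write(partition_usage_mb, disk_space_limit_mb, new_files_mb):
    """Report whether a thread's buffered data would be written to disk."""
    if disk_space_limit_mb == 0:
        return True  # 0 means no limit, so no disk space check is needed
    return partition_usage_mb + new_files_mb < disk_space_limit_mb


print(may_write(400, 405, 4.9))  # files smaller than 5 MB are written
print(may_write(400, 405, 5))    # files of 5 MB or larger are discarded
print(may_write(400, 0, 500))    # no limit configured
```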

Finally, the DiskSpaceLimit attribute does not apply to the RawFlow aggregation scheme, because this scheme cannot tolerate the performance overhead associated with disk space checks.

The features and thread attributes mentioned above will help you to manage disk space usage by FlowCollector. You may need to use your own file archival and deletion techniques to save older data files and prevent disk space exhaustion.

Where to Go from Here

The appendixes in this guide provide information on the following topics:

For more information on... Refer to ...
Helpful information and procedures in case you encounter problems while using FlowCollector "Troubleshooting"
NetFlow export datagram formats "NetFlow Export Datagram Format"

Copyright 1989-1998 © Cisco Systems Inc.