

Understanding the FlowCollector Data File Format

This chapter tells you how to interpret the data collected and saved in FlowCollector data files. The chapter includes information on the following topics:

Data File Directory Structure
Data File Names
Data File Format
Partial Data Files
Using the filesready File to Track Data Files
Where to Go from Here

Data File Directory Structure

Once you start FlowCollector, it begins collecting data based on your aggregation schemes and storing the collected data in data files.

If you specified a custom data file directory path as the DataSetPath attribute for a thread, the data files are stored in the directory you specified; otherwise, FlowCollector uses the default path, which is /opt/CSCOnfc/Data, and the default data file directory structure shown in Figure 5-1.


Figure 5-1: Default Data File Directory Structure



Starting with the root directory specified in the DataSetPath attribute of the thread (see Figure 5-2 for an example), a directory is created for each day (for example 1998_05_19). Under the date directory, a subdirectory is created for each export device (for example, gw.router) or ROUTER_GROUPNAME label, and under the export device, there is a subdirectory for each aggregation scheme (for example, CallRecord, SourcePort, etc.). The data files are stored by file name under the aggregation scheme subdirectory.


Figure 5-2: Data File Directory Structure Example



For information on how file names are formed, refer to the section "Data File Names," later in this chapter.
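
To illustrate how the pieces of the default directory structure fit together, the following Python sketch composes a data file path from the DataSetPath root, the date directory, the export device, and the aggregation scheme subdirectory. This sketch is not part of the FlowCollector software; the export device name, aggregation scheme, and file name shown are hypothetical example values.

import os
from datetime import date

# Compose a data file path under the default directory structure.
data_set_path = "/opt/CSCOnfc/Data"                # default DataSetPath
day_dir = date(1998, 5, 19).strftime("%Y_%m_%d")   # date directory, e.g. 1998_05_19
export_device = "gw.router"                        # export device or ROUTER_GROUPNAME label
aggregation = "CallRecord"                         # aggregation scheme subdirectory
file_name = "gw-router.1530"                       # short-form data file name (hypothetical)

full_path = os.path.join(data_set_path, day_dir, export_device, aggregation, file_name)
print(full_path)
# /opt/CSCOnfc/Data/1998_05_19/gw.router/CallRecord/gw-router.1530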

Data File Names

The name given to a data file takes either a long form (export-resource-name_yyyy_mm_dd.hhmm) or a short form (export-resource-name.hhmm), depending on the setting specified for the LONG_OUTPUTFILE_SUFFIX configuration parameter in the nf.resources file.


Note The long form of the name is the default. To use the short form, you must edit the nf.resources file and set the LONG_OUTPUTFILE_SUFFIX parameter to NO. For more information, refer to Table 6-6 in the chapter "Customizing FlowCollector," later in this guide.

Table 5-1 describes the fields of the data file name format.


Table 5-1: Data File Name Format Fields and Descriptions
Field Description
export-resource-name If this export device is named in the ROUTER_GROUPNAME configuration parameter, this value is the label defined by the ROUTER_GROUPNAME parameter; otherwise, this value is the Domain Name System (DNS) name of the export device that is the source of the data. If the DNS name is not available, the IP address of the export device is used.

For information on the ROUTER_GROUPNAME configuration parameter, refer to the section "Mapping a List of IP Addresses to One IP Address or Label," in the chapter "Customizing FlowCollector," later in this guide.

yyyy_mm_dd Date in year, month, and day format (used only in the long form of the name).
hhmm File creation time in hours and minutes, expressed in the configured time zone (local time or Greenwich Mean Time [GMT]). FlowCollector uses GMT as the default time zone.

The following are examples of short and long data file names:

gw-router.1530
gw-router_1996_03_15.1530
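
The following Python sketch splits a data file name into its fields. It handles both the long form (export-resource-name_yyyy_mm_dd.hhmm) and the short form (export-resource-name.hhmm); the regular expressions are an assumption derived from the formats described above, not part of FlowCollector itself.

import re

# Long and short data file name forms described in this section.
LONG_FORM = re.compile(r"^(?P<name>.+)_(?P<date>\d{4}_\d{2}_\d{2})\.(?P<hhmm>\d{4})$")
SHORT_FORM = re.compile(r"^(?P<name>.+)\.(?P<hhmm>\d{4})$")

def parse_data_file_name(file_name):
    """Return the export resource name and timestamp fields of a data file name."""
    match = LONG_FORM.match(file_name) or SHORT_FORM.match(file_name)
    if match is None:
        raise ValueError("not a recognizable FlowCollector data file name")
    return match.groupdict()

print(parse_data_file_name("gw-router_1996_03_15.1530"))
# {'name': 'gw-router', 'date': '1996_03_15', 'hhmm': '1530'}
print(parse_data_file_name("gw-router.1530"))
# {'name': 'gw-router', 'hhmm': '1530'}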

Data File Format

The data file consists of a header and one or more data records. As an example, Figure 5-3 shows an abbreviated CallRecord data file.


Figure 5-3: Abbreviated CallRecord Data File



Data File Header Format

The data file header consists of nine field pairs, each made up of a keyword (in uppercase letters) and its corresponding value. The keyword and value pairs are separated by either a vertical bar (|) or a comma (,). (You configure the choice of vertical bar or comma by using the CSV_FORMAT parameter in the nf.resources file.)

SOURCE source|FORMAT format|AGGREGATION aggregation|PERIOD period|STARTTIME time|ENDTIME time|FLOWS flows|MISSED missed|RECORDS records

Note All the aggregation schemes use the same data file header format.

Table 5-2: Data File Keywords, Values, and Descriptions
Keyword Value Description
SOURCE source The label that identifies the source of the NetFlow export traffic summarized in this data file. The label can be the IP address, in dotted decimal format, or an ASCII name. If you are using the ROUTER_GROUPNAME feature, the label will be the group name specified in the ROUTER_GROUPNAME configuration parameter.
FORMAT format An alphabetical tag to track the version of the data file generated by FlowCollector. Currently, the format tag is "A."
AGGREGATION aggregation The name of a valid, predefined aggregation scheme that was used to create this data file (refer to the section "Aggregation Schemes" in the chapter "Customizing FlowCollector.")
PERIOD period Data collection period, specified in minutes. Under some circumstances, FlowCollector might generate a data file before the current data collection period expires. In such a case, FlowCollector adds the keyword PARTIAL to the file name, and the PERIOD field in the header is identified as PERIOD PARTIAL. For more information on partial data files, refer to the section "Partial Data Files," later in this chapter.
STARTTIME time The time, in Coordinated Universal Time (UTC) seconds, when this data collection period began.
ENDTIME time The time, in UTC seconds, when this data collection period ended.
FLOWS flows The total number of NetFlow export records that are aggregated in this data file.
MISSED missed The number of flow records that FlowCollector should have received, but did not. The MISSED value is derived from the sequence numbers (where present) in each packet.

If the only data aggregated into a data file is from a Version 1 NetFlow export datagram, or a Version 7 NetFlow export datagram with short-cut mode turned on, the MISSED field in the header will contain -1 as the value.

If any data is aggregated from datagrams in which sequence numbers are available (Version 5, or Version 7 without short-cut mode), the MISSED field in the header will contain the actual count of missed flow records, even when there is a mix of Version 1, Version 5, and Version 7 NetFlow export datagrams.

RECORDS records The count of the aggregation records present in this data file.

The following example is a typical data file header:

SOURCE 171.71.34.37|FORMAT A|AGGREGATION ASMatrix|PERIOD 15|STARTTIME 881972378|
ENDTIME 881973278|FLOWS 39757|MISSED 0|RECORDS 5
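
The following Python sketch parses a header such as the example above into a dictionary of keyword and value strings. The delimiter ('|' or ',') depends on the CSV_FORMAT setting in nf.resources, so it is passed in as a parameter; the parse_header() function itself is a hypothetical helper for illustration, not part of FlowCollector.

def parse_header(header_line, delimiter="|"):
    """Split a data file header into a dict keyed by the header keywords."""
    fields = {}
    for pair in header_line.strip().split(delimiter):
        keyword, _, value = pair.partition(" ")
        fields[keyword] = value
    return fields

header = ("SOURCE 171.71.34.37|FORMAT A|AGGREGATION ASMatrix|PERIOD 15|"
          "STARTTIME 881972378|ENDTIME 881973278|FLOWS 39757|MISSED 0|RECORDS 5")
parsed = parse_header(header)
print(parsed["AGGREGATION"], parsed["RECORDS"])   # ASMatrix 5
# Per Table 5-2, a MISSED value of -1 means sequence numbers were not available.
if parsed["MISSED"] == "-1":
    print("missed-flow count unavailable for this file")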

Data Record Format

The body of a data file consists of one or more data records. Each data record consists of a key portion (one or more key fields) and a value portion (one or more value fields). For example, in a data record for the CallRecord aggregation scheme, the key portion is the first six fields and consists of the following aggregation scheme keys:

srcaddr|dstaddr|srcport|dstport|protocol|tos

The value portion is the last six fields and consists of the following aggregation values:

packets|bytes|flows|firstFlowStamp|lastFlowStamp|totalActiveTime

In the data record, as in the header, the fields are separated from each other by a vertical bar ( | ) with no space before or after. You can optionally set the delimiter to a comma ( , ) with no space before or after. (You configure the choice of vertical bar or comma by using the CSV_FORMAT parameter in the nf.resources file.)


Note Depending on the aggregation scheme you select, the data record will contain a different combination of fields than shown in the CallRecord data record example.

CallRecord Aggregation Scheme Data File Example

The following CallRecord data file example shows the data file header and the first two data records.

SOURCE 192.1.134.7|FORMAT A|AGGREGATION CallRecord|PERIOD 15|STARTTIME 881972378|
ENDTIME 881973278|FLOWS 59709|MISSED 0|RECORDS 2345
171.69.1.17|172.23.34.36|2963|6000|6|114|2|176|1|768550628|768550628|0
171.69.1.23|171.69.25.133|2972|6500|17|0|3|172|1|768520516|768520520|4135
.
.
.

In the CallRecord aggregation scheme, the key portion of the data record is the first six fields and consists of the following aggregation fields:

srcaddr|dstaddr|srcport|dstport|protocol|tos

For example, the first six fields in the second data record from the example above are

171.69.1.23|171.69.25.133|2972|6500|17|0

where the fields are described as follows:

Field Description
171.69.1.23 Source IP address (srcaddr)
171.69.25.133 Destination IP address (dstaddr)
2972 Source port (srcport)
6500 Destination port (dstport)
17 Protocol byte
0 Type of Service (ToS)

The value portion is the last six fields and consists of the following aggregation values:

packets|bytes|flows|firstFlowStamp|lastFlowStamp|totalActiveTime

For example, the last six fields in the second data record from the example above are

3|172|1|768520516|768520520|4135

where the fields are described as follows:

Field Description
3 Packet count
172 Byte count
1 Flow count
768520516 Time in UTC seconds of the first packet that is summarized in this record
768520520 Time in UTC seconds of the last packet that is summarized in this record
4135 Total active time, in milliseconds; defined as the sum of individual active time calculations for all the flows summarized into the current record.

Note The totalActiveTime value can be 0 if one single-packet flow produces this record. In this case, the value of the flows field is 1, and the values for firstFlowStamp and lastFlowStamp are identical.
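
The following Python sketch splits a CallRecord data record into its named key and value fields. The field names follow the CallRecord layout described above; other aggregation schemes use different fields, and the parse_callrecord() helper is an assumption for illustration only.

# Key fields (first six) followed by value fields (last six) of a CallRecord record.
CALLRECORD_FIELDS = ("srcaddr", "dstaddr", "srcport", "dstport", "protocol", "tos",
                     "packets", "bytes", "flows",
                     "firstFlowStamp", "lastFlowStamp", "totalActiveTime")

def parse_callrecord(line, delimiter="|"):
    values = line.strip().split(delimiter)
    if len(values) != len(CALLRECORD_FIELDS):
        raise ValueError("unexpected number of fields for a CallRecord record")
    return dict(zip(CALLRECORD_FIELDS, values))

record = parse_callrecord(
    "171.69.1.23|171.69.25.133|2972|6500|17|0|3|172|1|768520516|768520520|4135")
print(record["dstport"], record["totalActiveTime"])   # 6500 4135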

Partial Data Files

Under normal circumstances, FlowCollector generates a data file every n minutes, as specified by the Period attribute in the thread definition (refer to the section "Creating a Thread," in the chapter "Customizing FlowCollector," later in this guide). For example, if the Period attribute in a thread is set to 10 minutes, FlowCollector collects data for 10 minutes, writes that data into a data file, and then starts over in a new data collection period.

Under certain circumstances, FlowCollector might be forced to generate a data file before the current data collection period expires. Such a data file is referred to as a partial data file because it does not represent data collected for the entire defined collection period.

The data in a partial data file is valid data; the file simply does not represent a full data collection period, and it is marked as partial to prevent statistics from being distorted when data from full and partial periods is compared.

If FlowCollector has data in its internal aggregation buffers when it must generate a file early, it writes the data into one or more data files. Because the current data collection period has not expired, FlowCollector generates and marks the data files differently: the keyword "PARTIAL" is added to the data file name as a suffix, and the PERIOD field in the header is identified as "PERIOD PARTIAL."

In the following two data file examples, the first example shows a complete data file named gw-router.1530, while the second example shows a partial data file (using a different aggregation scheme) named gw-router.1535.PARTIAL.

SOURCE gw.router|FORMAT A|AGGREGATION Protocol|PERIOD 10|STARTTIME 855794894|
ENDTIME 855795494
smtp|14|959|1
dns-udp-server|19|4975|9
nfs-udp-server|399|124992|24
www-tcp-server|111|63895|17
Others|10989|1689379|123
udp-bin|2955|2667629|82
SOURCE gw.router|FORMAT A|AGGREGATION DestPort|PERIOD PARTIAL|
STARTTIME 855795254|ENDTIME 855795554
1022|456|29061|5
0|53|73728|1
513|1183|48352|3
771|5|376|2
69|2|154|1
:
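
The following Python sketch recognizes partial data files using either cue described above: the PARTIAL suffix on the file name or the PERIOD PARTIAL field in the header. The function name and the header strings shown are illustrative assumptions.

def is_partial(file_name, header_line):
    """Return True if a data file is a partial data file."""
    return file_name.endswith(".PARTIAL") or "PERIOD PARTIAL" in header_line

partial_header = ("SOURCE gw.router|FORMAT A|AGGREGATION DestPort|PERIOD PARTIAL|"
                  "STARTTIME 855795254|ENDTIME 855795554")
full_header = ("SOURCE gw.router|FORMAT A|AGGREGATION Protocol|PERIOD 10|"
               "STARTTIME 855794894|ENDTIME 855795494")
print(is_partial("gw-router.1535.PARTIAL", partial_header))   # True
print(is_partial("gw-router.1530", full_header))              # False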

Using the filesready File to Track Data Files

FlowCollector periodically appends the absolute path names of data files that it has generated to a list in a log file named filesready.YYYY_MM_DD, where YYYY_MM_DD represents the year, month, and day timestamp used to identify the file. The filesready file is located with the other log files in the $NFC_DIR/logs directory. There is one such file per DataSetPath setting per day.

Typically, a client application would read this file every n minutes, process it to determine the names of any newly added data files, and then retrieve those new data files.

After it finishes writing a new data file, FlowCollector appends the absolute path name of the new data file onto the list in the filesready file. If FlowCollector deletes some data files as instructed by the FileRetain setting in its thread definitions, it updates the corresponding filesready file. The following example shows the partial contents and organization of a typical filesready file.

/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/Protocol/171.71.34.79.2135
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/DetailASMatrix/171.71.34.79.2136
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/Protocol/171.71.34.79.2136
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/DetailASMatrix/171.71.34.79.2137
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/Protocol/171.71.34.79.2137
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/DetailASMatrix/171.71.34.79.2138
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/Protocol/171.71.34.79.2138
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/DetailASMatrix/171.71.34.79.2139
/opt/CSCOnfc/Data/1998_02_11/171.71.34.79/Protocol/171.71.34.79.2139
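
The following Python sketch shows the polling pattern described above: a client reads the day's filesready file, compares its contents against the set of paths already processed, and hands any new data files to a processing function. The filesready path shown (which assumes $NFC_DIR is /opt/CSCOnfc), the set of seen paths, and process_data_file() are all assumptions for illustration only.

import os

def poll_filesready(filesready_path, already_seen, process_data_file):
    """Process any data files newly listed in the filesready file."""
    if not os.path.exists(filesready_path):
        return already_seen
    with open(filesready_path) as f:
        listed = [line.strip() for line in f if line.strip()]
    for data_file in listed:
        if data_file not in already_seen:
            process_data_file(data_file)      # e.g. retrieve or parse the data file
            already_seen.add(data_file)
    return already_seen

seen = set()
# Run this every n minutes; here the new paths are simply printed.
seen = poll_filesready("/opt/CSCOnfc/logs/filesready.1998_02_11", seen, print)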

Where to Go from Here

The remaining chapters and appendixes in this guide provide information on the following topics:

For more information on... Refer to ...
Customizing FlowCollector operation using thread, filter, and protocol definitions, lists of port and autonomous system numbers, and other FlowCollector configuration parameters "Customizing FlowCollector"
Helpful information and procedures in case you encounter problems while using FlowCollector "Troubleshooting"
NetFlow export datagram formats "NetFlow Export Datagram Format"

Copyright © 1989-1998 Cisco Systems, Inc.