@ CSV Version of the Network Intrusion Dataset

To make the network intrusion dataset easier to import into database systems, and easier for non-network experts to analyse, we have converted it to CSV files. The files are:

Dataset       Records
base.csv.gz   358760
net1.csv.gz   628775
net2.csv.gz   481851
net3.csv.gz   509263
net4.csv.gz   632036

Description of fields:

time Converted to floating-point seconds: hr*3600 + min*60 + secs
addr and port The first two fields (x.y) of the source and destination addresses make up the fake address, so each address was converted to a single number: x + y*256
(you may want to get rid of x.y.256.256.port)
flag Added a "U" for UDP data (only has ulen).
X - means the packet was a DNS name server request or response. The ID# and the rest of the data are in the "op" field (see the tcpdump description).
XPE - means there were no ports (from "fragmented packets").
seq1 The data sequence number of the packet
seq2 The data sequence number of the data expected in return
buf The number of bytes of receive buffer space available
ack The sequence number of the next data expected from the other direction on this connection
win The number of bytes of receive buffer space available from the other direction on this connection
ulen The length, if the packet is a UDP packet
op Optional info such as (DF), meaning "do not fragment"
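
The time and address conversions described above can be sketched in Python as follows. This is only an illustration of the two formulas, not the code used to build the files. The function names are hypothetical, and the inverse address conversion assumes that x lies in the range 0-255, which the description does not state.

```python
def time_to_seconds(hours, minutes, seconds):
    """Apply the documented formula: hr*3600 + min*60 + secs."""
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_time(total):
    """Invert the formula: split seconds-of-day into (hr, min, secs)."""
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return int(hours), int(minutes), seconds


def encode_addr(x, y):
    """Apply the documented formula: x + y*256."""
    return x + y * 256


def decode_addr(addr):
    """Recover (x, y) from a converted address.

    Assumption: 0 <= x <= 255, otherwise the mapping is not invertible.
    """
    y, x = divmod(addr, 256)
    return x, y


print(time_to_seconds(10, 35, 41.5))  # 38141.5
print(seconds_to_time(38141.5))       # (10, 35, 41.5)
print(encode_addr(4, 1))              # 260
print(decode_addr(260))               # (4, 1)
```

For instance, a time value near 38141.5 corresponds to roughly 10:35:41 in the original hour, minute, second form.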


Example line from the baseline data:

38141.516172,4,80,2,2609,.,,,438528422,9112,,," (DF)"

In the CSV files, a field may be empty (no value). Two fields, flag and op, hold string values, which means they are enclosed in double quotes. The rest are numeric (one of them, time, is floating point).
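
A minimal sketch of reading these files with Python's standard csv and gzip modules is shown here. It is one possible approach, not a reference loader. The helper names (convert, read_rows, read_gz) are hypothetical. Because the sample line has its flag value (.) unquoted, the sketch does not rely on quoting to tell strings from numbers; it maps empty fields to None, tries int and then float, and keeps anything else as a string. It also returns plain lists rather than named columns, since the exact column order is not spelled out here.

```python
import csv
import gzip
import io


def convert(field):
    """Map an empty field to None, numeric text to int/float, else keep str."""
    if field == "":
        return None
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field


def read_rows(lines):
    """Yield one list of converted values per CSV line."""
    for row in csv.reader(lines):
        yield [convert(field) for field in row]


def read_gz(path):
    """Yield converted rows from a gzip-compressed CSV file such as base.csv.gz."""
    with gzip.open(path, "rt", newline="") as handle:
        yield from read_rows(handle)


# The sample line shown above, parsed from an in-memory string.
sample = '38141.516172,4,80,2,2609,.,,,438528422,9112,,," (DF)"\n'
rows = list(read_rows(io.StringIO(sample)))
print(len(rows[0]))  # 13
print(rows[0])
```

One caveat of this heuristic typing: a string field whose text happens to look like a number would be converted to a number, so a stricter loader would convert by column position instead.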

