1. Choosing Traffic Tools/Techniques

The tools Skaion provides can be (and usually are) used in combination to build full cyber environments. To help choose which tools to include, the roles and advantages of each are discussed below. Some other options are also considered.

The types of traffic we will consider are:

Live Traffic/Data (Not available from Skaion)
Packet Generation (Mostly not available from Skaion)
Modeling Networks (Available as MPTGS)
Modeling Users (Available as ConsoleUser)

1.1. Live Traffic/Data

The most realistic data possible is, of course, the actual data in the real environment. Using real data from a live environment brings both policy and technical challenges.

Pros	Cons
Most realistic for that environment at that time	Often difficult to acquire
Contains the fullest set of data	May contain unexpected or unwanted traffic, such as novel malware
Captures the full complexity of user interactions (with services and other users)	May contain PII or copyrighted material
Generally easy to start using	Can be difficult to establish ground truth in the data
	Cannot easily change the environment or the captured data in a controlled way to experiment
	Repeating the capture to get more volume repeats the data itself, which can produce more regularity than would actually be expected
	Limited ability to vary the data brings a risk of overfitting or overtraining if the capture is not large or rich enough

Example use case: searching for anomalies in a real environment.
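The regularity that comes from repeating captured data can be made concrete with a small sketch. It assumes a capture can be reduced to a list of payloads (the three payloads below are made up for illustration) and measures what fraction of them are exact repeats of an earlier payload.

```python
import hashlib
from collections import Counter


def duplicate_fraction(payloads):
    """Fraction of payloads that exactly repeat an earlier payload."""
    digests = Counter(hashlib.sha256(p).hexdigest() for p in payloads)
    repeats = sum(count - 1 for count in digests.values())
    return repeats / len(payloads)


# Hypothetical capture: three distinct payloads.
capture = [b"GET /index.html", b"GET /news", b"POST /login user=alice"]

# Replaying the capture four times to get more volume.
replayed = capture * 4

print(duplicate_fraction(capture))   # 0.0
print(duplicate_fraction(replayed))  # 0.75
```

In this toy case, quadrupling the volume means three quarters of the data is an exact copy of something already seen, which a tool under test (or a model being trained) can pick up on.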

1.2. Packet Generation

This type of traffic uses tools that, often at very high speeds, craft packets that meet some criteria and dump them on the wire. Many of these tools are focused on the load they can achieve and do not bother to try to maintain any realism.
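As a rough illustration of the idea, the sketch below crafts a single IPv4/UDP packet with a random payload using only the Python standard library. The function names, addresses, and ports are hypothetical, and the packet is only built in memory (actually sending it would need a raw socket and elevated privileges). It is not meant to reflect how any particular generator is implemented.

```python
import os
import socket
import struct


def checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071) over a byte string."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def craft_udp_packet(src: str, dst: str, sport: int, dport: int, size: int) -> bytes:
    """Build an IPv4/UDP packet whose payload is just random bytes."""
    payload = os.urandom(size)
    udp_len = 8 + size
    # A UDP checksum of 0 means "not computed", which IPv4 permits.
    udp = struct.pack("!HHHH", sport, dport, udp_len, 0) + payload
    total_len = 20 + udp_len
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len,       # version/IHL, DSCP/ECN, total length
        0, 0,                     # identification, flags/fragment offset
        64, socket.IPPROTO_UDP,   # TTL, protocol
        0,                        # header checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    # Fill in the header checksum now that the other fields are known.
    header = header[:10] + struct.pack("!H", checksum(header)) + header[12:]
    return header + udp


packet = craft_udp_packet("10.0.0.1", "10.0.0.2", 1234, 53, 32)
print(len(packet))  # 60
```

Because the payload comes from `os.urandom`, two packets with identical headers carry unrelated contents, and nothing ties one packet to the next. That keeps generation cheap and fast, but it is also why such traffic is of little use for stateful or content-aware testing.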

Pros	Cons
Able to generate large volume	Limited realism of stateful traffic
Often easy to set up	Packets are often filled with random data, which makes this approach ill-suited to deep packet inspection or content-aware use cases

Example use case: load testing a new switch/router.

1.3. Modeling Networks

With this approach, general statistics about a network of interest are specified, and tools attempt to generate traffic that reproduces those statistics. Skaion’s MPTGS tool largely uses this approach.

In MPTGS, an operator specifies how much of each activity they expect to see in each time slice, and the traffic generator starts and stops instances of “users” performing that activity to meet those targets.
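The start/stop behavior described above can be sketched as a simple reconciliation loop. This is only an illustration of the general idea, not MPTGS's actual configuration format or implementation; the activity names, the `reconcile` and `run_schedule` helpers, and the target numbers are all hypothetical.

```python
from collections import Counter


def reconcile(running: Counter, targets: dict) -> dict:
    """Return how many instances of each activity to start (+) or stop (-)."""
    activities = set(running) | set(targets)
    return {a: targets.get(a, 0) - running[a] for a in sorted(activities)}


def run_schedule(schedule: list) -> list:
    """Apply each time slice's targets in order and record the changes made."""
    running = Counter()
    log = []
    for targets in schedule:
        delta = reconcile(running, targets)
        for activity, change in delta.items():
            running[activity] += change
        log.append(delta)
    return log


# Hypothetical targets: number of simulated users per activity in each slice.
schedule = [
    {"web": 5, "email": 2},  # slice 0
    {"web": 8, "email": 2},  # slice 1
    {"web": 3, "ssh": 1},    # slice 2
]

for index, delta in enumerate(run_schedule(schedule)):
    print(index, delta)
```

Each slice only describes the desired level of activity; the generator works out the difference from what is currently running. A real generator would launch live, stateful clients rather than adjust counters.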

Pros	Cons
Generates live, stateful traffic matching desired properties	Requires enough resources to produce the right amounts of traffic from each part of the test environment
Random choices create unique though similar data for each run	Can be time-consuming to configure and set up
Produces moderate levels of traffic from each source	Traffic types are limited to supported types
Can change user profiles/models and rerun the test

Example use case: testing a cyber defense tool in a lab environment.

1.4. Modeling Users

All of the above strategies assume, among other things, that there is no direct monitoring of the hosts that users are on. This doesn’t hold in many situations, where either host-based sensors are in use or there will be live interactions with the hosts (as when a red team actually compromises a host). While real hosts can be added to environments with the other traffic generation options, if those hosts are to be more than passive landing points they will also need activity generation.

It is also worth noting here that not all host activity results in network activity. For example, editing a document may not produce any associated network traffic, but pasting plagiarized or malicious content into it might be important to monitor.

Skaion’s ConsoleUser tool provides a way to do this. By interacting with a target host through the keyboard and mouse while monitoring the screen (unless other control structures like the accessibility layer are used), it can carry out human-like activity. Here we attempt to model users, and the traffic produced is a side effect of user activity, just as in the real world.
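To give a feel for what modeling a user can mean, the sketch below generates a timeline of actions separated by random "think time". It does not control a keyboard, mouse, or screen, and it is not ConsoleUser's API; the action names, the `simulate_user` function, and the 20-second mean think time are assumptions made purely for illustration.

```python
import random

# Hypothetical actions; the flag marks whether network traffic is expected.
ACTIONS = {
    "browse_web": True,
    "read_email": True,
    "edit_document": False,  # host activity with no associated traffic
}


def simulate_user(seed: int, duration_s: float, mean_think_s: float = 20.0):
    """Return a list of (time_s, action, makes_traffic) events for one user."""
    rng = random.Random(seed)
    events = []
    clock = rng.expovariate(1.0 / mean_think_s)
    while clock < duration_s:
        action = rng.choice(sorted(ACTIONS))
        events.append((round(clock, 1), action, ACTIONS[action]))
        # Wait a human-scale, randomly varying time before the next action.
        clock += rng.expovariate(1.0 / mean_think_s)
    return events


for event in simulate_user(seed=1, duration_s=120):
    print(event)
```

Changing the seed gives a different but statistically similar session, while reusing a seed reproduces the same one. The timeline also shows why this approach runs at human speed: events are spaced seconds apart, not microseconds.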

Pros	Cons
Supports host activities	Requires a real host endpoint for each host observed in the resulting traffic
Can enact human activities in controlled ways without being exact replays	Works at human speed, so waiting for random events can be slow
Captures all real interactions between the software/platform and all other computers in the environment (like SMB discovery and attempts to update software)	Sensitive to visual changes on the screen (with the typical connector)
Can change user profiles/models and rerun the test

Example use case: testing a cyber defense tool that includes host-based sensors.
