Choosing Traffic Tools/Techniques
=================================

The tools Skaion provides can be (and usually are) used in combination to
build full cyber environments. To help choose which tools to include, the
roles and advantages of each are discussed below. Some other options are
also considered.

The types of traffic we will consider are:

- :ref:`live_traffic` (Not available from Skaion)
- :ref:`packet_generation` (Mostly not available from Skaion)
- :ref:`network_modeling` (Available as MPTGS_)
- :ref:`user_modeling` (Available as ConsoleUser_)

.. _live_traffic:

Live Traffic/Data
-----------------

The most realistic data possible is, of course, the actual data in the real
environment. There are both policy and technical challenges when using real
data from a live environment.

+-----------------------------------+-----------------------------------+
| Pros                              | Cons                              |
+===================================+===================================+
| * Most realistic for that         | * Often difficult to acquire      |
|   environment at that time        |                                   |
+-----------------------------------+-----------------------------------+
| * Contains the fullest set of data| * May contain unexpected or       |
|                                   |   unwanted traffic like novel     |
|                                   |   malware                         |
+-----------------------------------+-----------------------------------+
| * Captures full complexity of     | * May contain PII or copyrighted  |
|   user interactions (with services|   material                        |
|   and other users)                |                                   |
+-----------------------------------+-----------------------------------+
| * Generally easy to start using   | * Can be difficult to establish   |
|                                   |   ground truth in the data        |
+-----------------------------------+-----------------------------------+
|                                   | * Cannot change the environment   |
|                                   |   or the captured data easily or  |
|                                   |   in a controlled way to          |
|                                   |   experiment                      |
+-----------------------------------+-----------------------------------+
|                                   | * When it is repeated to get more |
|                                   |   volume, the data itself is      |
|                                   |   repeated, which can result in   |
|                                   |   more regularity than would      |
|                                   |   actually be expected            |
+-----------------------------------+-----------------------------------+
|                                   | * Limited ability to vary the     |
|                                   |   data brings a risk of           |
|                                   |   overfitting or overtraining     |
|                                   |   if the captured data is too     |
|                                   |   small or not rich enough        |
+-----------------------------------+-----------------------------------+

Example use case: searching for anomalies in a real environment.

.. _packet_generation:

Packet Generation
-----------------

This type of traffic uses tools that, often at very high speeds, craft
packets that meet some criteria and dump them on the wire. Many of these
tools focus on the load they can achieve and make no attempt to maintain
realism.

+-----------------------------------+-----------------------------------+
| Pros                              | Cons                              |
+===================================+===================================+
| * Able to generate large volumes  | * Limited realism of stateful     |
|                                   |   traffic                         |
+-----------------------------------+-----------------------------------+
| * Often easy to set up            | * Packets are often filled with   |
|                                   |   random data, which makes them   |
|                                   |   ill suited to deep packet       |
|                                   |   inspection or content-aware     |
|                                   |   use cases                       |
+-----------------------------------+-----------------------------------+

Example use case: load testing a new switch/router.

.. _network_modeling:

Modeling Networks
-----------------

With this approach, general statistics about a network of interest are
specified, and tools attempt to generate traffic that reproduces those
statistics. Skaion's MPTGS_ tool largely uses this approach. In MPTGS, an
operator specifies how much of each activity they expect to see in each time
slice, and the traffic generator starts and stops instances of "users"
performing that activity to meet those targets.

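
The start/stop behavior described above can be illustrated with a minimal
sketch. This is not MPTGS code or its configuration format: the ``targets``
structure, the activity labels, and the ``reconcile`` and ``plan`` functions
are all hypothetical names chosen for illustration. The sketch assumes each
time slice simply lists how many simulated users should be performing each
activity, and computes how many instances to start or stop at each slice
boundary.

```python
from collections import Counter

# Hypothetical illustration only: this is not MPTGS code or its config format.
# targets[slice_index][activity] = number of simulated users that should be
# performing that activity during the time slice.
targets = [
    {"web": 3, "email": 1},
    {"web": 5, "email": 2, "ssh": 1},
    {"web": 2},
]


def reconcile(running, wanted):
    """Return (to_start, to_stop) counts that move `running` to `wanted`."""
    to_start, to_stop = {}, {}
    for activity in sorted(set(running) | set(wanted)):
        diff = wanted.get(activity, 0) - running.get(activity, 0)
        if diff > 0:
            to_start[activity] = diff
        elif diff < 0:
            to_stop[activity] = -diff
    return to_start, to_stop


def plan(slices):
    """Walk the time slices and record the start/stop actions for each one."""
    running = Counter()
    actions = []
    for wanted in slices:
        to_start, to_stop = reconcile(running, wanted)
        actions.append((to_start, to_stop))
        running = Counter(wanted)  # after acting, running matches the targets
    return actions


for index, (to_start, to_stop) in enumerate(plan(targets)):
    print(f"slice {index}: start {to_start}, stop {to_stop}")
```

A real generator would also have to decide which particular instances to
stop and how to spread the starts across sources; the sketch only shows the
per-slice bookkeeping.
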

+-----------------------------------+-----------------------------------+
| Pros                              | Cons                              |
+===================================+===================================+
| * Generate live, stateful traffic | * Requires enough resources to    |
|   matching desired properties     |   produce the right amounts of    |
|                                   |   traffic from each part of the   |
|                                   |   test environment                |
+-----------------------------------+-----------------------------------+
| * Random choices create unique    | * Can be time consuming to        |
|   though similar data for each    |   configure/set up                |
|   run                             |                                   |
+-----------------------------------+-----------------------------------+
| * Produce moderate levels of      | * Traffic types are limited to    |
|   traffic from each source        |   supported types                 |
+-----------------------------------+-----------------------------------+
| * Can change user profiles/models |                                   |
|   to rerun test with different    |                                   |
+-----------------------------------+-----------------------------------+

Example use case: testing a cyber defense tool in a lab environment.

.. _user_modeling:

Modeling Users
--------------

All of the above strategies assume, among other things, that there is no
direct monitoring of the hosts that users are working on. This doesn't hold
in many situations where either host-based sensors are being used or there
will be live interactions with the hosts (like when a red team will actually
compromise a host). While real hosts can be added to environments with the
other traffic generation options, if those hosts are to be more than passive
landing points they will also need activity generation.

It is also worth noting here that not all host activity results in network
activity. For example, editing a document may not produce any associated
network traffic, but pasting plagiarized or malicious content into it might
be important to monitor.

Skaion's ConsoleUser_ tool provides a way to do this.
By interacting with a target host through the keyboard and mouse and
monitoring the screen (unless other control structures like the
accessibility layer are used), human-like activity can be carried out. With
this approach we attempt to model users, and the resulting traffic is a side
effect of user activity, just as in the real world.

+-----------------------------------+-----------------------------------+
| Pros                              | Cons                              |
+===================================+===================================+
| * Supports host activities        | * Requires a real host endpoint   |
|                                   |   for each one that is observed   |
|                                   |   in the resulting traffic        |
+-----------------------------------+-----------------------------------+
| * Can enact human activities in   | * Works at human speed, so waiting|
|   controlled ways without being   |   for random events can be slow   |
|   exact replays                   |                                   |
+-----------------------------------+-----------------------------------+
| * Get all real interactions       | * Sensitive to visual changes on  |
|   between software/platform and   |   the screen (with typical        |
|   all other computers in the      |   connector)                      |
|   environment (like SMB discovery |                                   |
|   and attempts to update software)|                                   |
+-----------------------------------+-----------------------------------+
| * Can change user profiles/models |                                   |
|   to rerun test with different    |                                   |
+-----------------------------------+-----------------------------------+

Example use case: testing a cyber defense tool that includes host-based
sensors.

.. _MPTGS: https://docs.skaion.com/mptgs/index.html
.. _ConsoleUser: https://docs.skaion.com/cu/index.html