1.1. Style and Guidelines

Skaion mostly follows PEP8. In general, if pylint complains about it, we fix it.

Skaion uses Sphinx for documentation, which relies on its own form of markup in docstrings. Sphinx can parse Python files and generate documentation for the module and all of the classes, methods, functions, etc. defined within (see The ConsoleUser API), assuming those things are documented in a suitable way.

Each module should begin with a docstring that describes that module. The name of the module should appear as a heading (e.g., it should appear on a line with an equal length line of - or = below it). Public classes, functions, and methods should also have docstrings.
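As a sketch of these conventions, a module might be laid out as follows. The module name ExampleModule, its class and method, and the wording of the docstrings are illustrative assumptions, not part of ConsoleUser.

```python
"""
ExampleModule
=============

One-paragraph summary of what this module provides.
"""


class ExampleModule:
    """Drive a hypothetical example application.

    :param name: label used when describing this instance
    """

    def __init__(self, name):
        self.name = name

    def describe(self):
        """Return a short, human-readable description.

        :returns: the description string
        """
        return "ExampleModule({})".format(self.name)
```

The heading underline has the same length as the module name, and every public class and method carries its own docstring.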

A module that defines a class loaded by ConsoleUser must have the same name as that class. For example, the file CmdWinXP.py defines a class named CmdWinXP. ConsoleUser automatically loads modules based on names from config files and expects to find a class with the matching name in each one.
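The naming rule exists because a loader can then go from one configured name to both the module and the class. The following is a minimal sketch of that idea using the standard importlib module; the helper load_class and the in-memory stand-in for CmdWinXP.py are assumptions made so the example runs on its own, and ConsoleUser's actual loader may differ.

```python
import importlib
import sys
import types


def load_class(name):
    """Import the module called ``name`` and return the class of the same name."""
    module = importlib.import_module(name)
    try:
        return getattr(module, name)
    except AttributeError:
        raise ImportError(
            "module {0!r} does not define a class named {0!r}".format(name)
        )


# Stand-in for a real CmdWinXP.py on the import path, so the sketch runs alone.
_stub = types.ModuleType("CmdWinXP")
exec("class CmdWinXP:\n    pass\n", _stub.__dict__)
sys.modules["CmdWinXP"] = _stub

cls = load_class("CmdWinXP")  # the name would normally come from a config file
print(cls.__name__)
```

If the module and class names diverge, the lookup fails, which is why the convention has to be followed exactly.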

1.2. ConsoleUser Overview

ConsoleUser consists of several layers of abstraction. Together, the layers make it possible to enact human-like activities on a target host. The different layers make it possible to provide different levels of realism, from modeling typing and mouse movement characteristics to favorite websites to visit. Extending ConsoleUser therefore starts with recognizing which abstractions the change will touch.

The different parts are outlined below, roughly ordered from the most abstract to the least.

1.2.1. Brains

Brains are the programs that make use of the interactions to create the behaviors that should be observable. A Brain can be a very specific implementation that performs the exact same sequence every time, a program that makes random decisions about what to do next, or any other program that models a set of activities.

The most commonly used Brain is the MarkovBrain, which chooses the next activity based on a series of Markov models. The main part of the MarkovBrain chooses which application/module should be used, and for how long it stays in control. When that time is up, a Markov matrix is consulted to determine the next module, which may be the same as the current one.
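To make the matrix lookup concrete, here is a minimal sketch of choosing the next module from a row of transition probabilities. The module names, the probabilities, and the function next_module are all hypothetical; the real MarkovBrain reads its matrix from configuration and may represent it differently.

```python
import random

# Hypothetical transition matrix: each row maps the current module to the
# probability of each possible next module. Each row sums to 1.
TRANSITIONS = {
    "WebBrowser": {"WebBrowser": 0.6, "EmailClient": 0.3, "FtpClient": 0.1},
    "EmailClient": {"WebBrowser": 0.5, "EmailClient": 0.4, "FtpClient": 0.1},
    "FtpClient": {"WebBrowser": 0.7, "EmailClient": 0.3, "FtpClient": 0.0},
}


def next_module(current, transitions=TRANSITIONS, rng=random):
    """Pick the next module using the row for ``current`` as weights."""
    row = transitions[current]
    names = list(row)
    weights = [row[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only so repeated runs behave the same
module = "WebBrowser"
for _ in range(5):
    module = next_module(module, rng=rng)
    print(module)
```

Because a row may assign a nonzero probability to the current module, the same module can be selected again, as described above.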

Each module is implemented by its own class that determines which activities should be done next. Those activities are often user-level activities that the module exposes, like “go to a particular URL” or “send this email to this recipient”.

These modules are responsible for deciding the particulars of the activity, like the target(s) (e.g., the files to download via FTP or the text to use as the subject and body of an email message). If the task requires any setup or teardown, the module brain is also responsible for that.

The brains, as we see, decide what to do. They do not decide how to do it. That is passed to lower levels of abstraction.

1.2.2. Users

A collection of configuration files together specify the properties of a single notional user. That User is associated with a particular computer to use, which has a certain operating system and set of applications installed on it. This User knows how to use certain types of applications, and knows how to connect to the target computer. The User also has a name, and knows login credentials for appropriate sites.

The User instance exposes high level mappings of application archetypes to the controlling program (i.e., a Brain). Each User instance has its own set of applications it uses, so Users with different configurations may not be interchangeable to all Brains. It is important to make sure that the Brain’s Markov matrix of applications matches those that are exposed by the User.
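One way to catch a mismatch early is to compare the application names in the matrix with those the User exposes before the Brain starts. The sketch below assumes the matrix is a dictionary of rows keyed by application name and that the User's applications are available as a list of names; both representations and the function missing_applications are hypothetical.

```python
def missing_applications(transitions, user_applications):
    """Return the applications named in the matrix that the User lacks."""
    needed = set(transitions)
    for row in transitions.values():
        needed.update(row)
    return sorted(needed - set(user_applications))


matrix = {
    "WebBrowser": {"WebBrowser": 0.5, "EmailClient": 0.5},
    "EmailClient": {"WebBrowser": 1.0},
}
print(missing_applications(matrix, ["WebBrowser"]))
```

An empty result means every application the Brain may select is available to that User.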

The User also exposes low level tasks not associated with any particular activity. For system interactions that support it, the User allows a Brain to direct a string to be typed (or just specific keyboard events) or the mouse to be moved or clicked. Most Brains will use the abstractions that express a task to be performed and let the library take the appropriate steps to accomplish that task.

1.2.3. Application Archetypes

The next level of abstraction is the application type. This is at the level of “Web Browser”, not “Chrome 73 on Windows 10”. That is, it is an abstract layer that encompasses the activities that a User might want to do with any of a number of specific applications. These will expose high level tasks that are likely to be common to many or most of the applications of that type.

For example, the WebBrowser archetype exposes methods to “click a random link” or “go to a particular URI”, which are activities that every browser is likely to support. These tasks are expected to be independent of the details of the particular application and the platform it is running on.

1.2.4. Concrete Applications

These represent ways to accomplish the tasks exposed by the application archetype given a specific implementation of the application on a particular platform. This is where things go from a general EmailClient to the particular mail reader being used, often including the version of that application. Thunderbird 45 on Windows 7 works differently from Thunderbird 3 on Fedora 14, so the User will load a concrete class that implements the exposed archetype.
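The split between archetype and concrete application can be pictured as an abstract base class with platform-specific subclasses. In this sketch only the name WebBrowser comes from the text; the method go_to, the class ExampleBrowserLinux, and the recorded keystrokes are assumptions for illustration and do not reflect ConsoleUser's actual classes.

```python
from abc import ABC, abstractmethod


class WebBrowser(ABC):
    """Archetype: tasks any browser is expected to support."""

    @abstractmethod
    def go_to(self, uri):
        """Navigate to ``uri``."""


class ExampleBrowserLinux(WebBrowser):
    """Concrete application: how one browser on one platform does it."""

    def __init__(self):
        self.steps = []  # records low-level steps instead of sending them

    def go_to(self, uri):
        self.steps.append(("key", "ctrl+l"))  # focus the address bar
        self.steps.append(("type", uri))
        self.steps.append(("key", "enter"))


def visit(browser, uri):
    """Brain-side code depends only on the archetype's interface."""
    browser.go_to(uri)


browser = ExampleBrowserLinux()
visit(browser, "http://example.com/")
print(browser.steps)
```

A different version or platform would supply another subclass with different steps, while code written against the archetype stays unchanged.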

1.2.5. Low Level Actions

The applications expose high level tasks that a User may want to perform. It is also possible to interact with the target system using low level actions such as moving the mouse or finding something on the screen. The application-defined tasks compose many of these to achieve some goal for the user; however, the User also exposes these actions directly, outside any Application context. Brains should generally go through the application layers, but direct access is provided for cases where it is necessary.

1.2.5.1. Control Actions

These types of actions can be simple or composed of several steps themselves. Examples include moving the mouse, pressing a mouse button down, releasing a mouse button, or combining those into a click or drag. Similarly, the keyboard can be used to type a string, which has its own escape sequences (based on the SendKeys library) to indicate meta keys or repeated sequences. Typing involves pressing a key down and releasing it, each of which is its own action that can be called.
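The composition described here can be sketched with a small class that records primitive events rather than sending them to a target host. The class RecordingControls and all of its method names are hypothetical and are not ConsoleUser's API.

```python
class RecordingControls:
    """Record primitive input events and build compound actions from them."""

    def __init__(self):
        self.events = []

    # Primitive actions
    def mouse_move(self, x, y):
        self.events.append(("move", x, y))

    def mouse_down(self, button=1):
        self.events.append(("down", button))

    def mouse_up(self, button=1):
        self.events.append(("up", button))

    def key_down(self, key):
        self.events.append(("key_down", key))

    def key_up(self, key):
        self.events.append(("key_up", key))

    # Compound actions composed from the primitives above
    def click(self, x, y, button=1):
        self.mouse_move(x, y)
        self.mouse_down(button)
        self.mouse_up(button)

    def drag(self, start, end, button=1):
        self.mouse_move(*start)
        self.mouse_down(button)
        self.mouse_move(*end)
        self.mouse_up(button)

    def press_key(self, key):
        self.key_down(key)
        self.key_up(key)


controls = RecordingControls()
controls.click(10, 20)
controls.press_key("a")
print(controls.events)
```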

1.2.5.2. Pattern Matching

In order to know where to move the mouse, for example, being able to find interesting spots on the screen is very important. Interpreting and understanding the screen is a very difficult problem. Instead of tackling that head-on, ConsoleUser lets a human tell it what to look for, just like the patterns to match in an expect script.

The most common RFB-based configuration for ConsoleUser treats the screen and the patterns to match as PNG files. ConsoleUser can be told to match a PNG against the screen, and it will find all of the places where that match occurs. The ability to do this is also exposed through the User object.

Do not write code that assumes that this is the way patterns are matched. Other connection types do not give an image of the screen back, and so use other search strategies. ConsoleUser provides a Window class to encapsulate all of the searching and interacting. That class should handle all of these types of activities for most situations. Using coordinates or assuming that one strategy is in use will interfere with portability to other system types (like AT-SPI).

Because the connection and pattern types can be changed via config file, the code does not include the file extension when naming a pattern. The undecorated name is used, and the proper extension (e.g., .png or .json) is added at run time by the relevant class loading the files.
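A minimal sketch of this run-time resolution is shown below. The pattern type labels, the EXTENSIONS table, and the function pattern_path are assumptions for illustration; in ConsoleUser the extension is chosen by whichever class loads the pattern files.

```python
import os

# Hypothetical mapping from the configured pattern type to a file extension.
EXTENSIONS = {"image": ".png", "accessibility": ".json"}


def pattern_path(directory, name, pattern_type):
    """Build the file path for an undecorated pattern name."""
    return os.path.join(directory, name + EXTENSIONS[pattern_type])


# Calling code only ever mentions "ok_button", never "ok_button.png".
print(pattern_path("patterns", "ok_button", "image"))
print(pattern_path("patterns", "ok_button", "accessibility"))
```

Code written this way keeps working when the config file switches to a connection type that uses a different kind of pattern.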

1.2.6. Logging

ConsoleUser uses a custom logging module, which is generally available to every instance of every class. Usually it is exposed via the self.logger instance. It’s documented in the Logger module, but there are some conventions that should be followed.

First, every method (with rare exceptions) should add log messages when it starts and ends, in a particular format. For instance methods, which make up the vast majority of such calls, the Logger module provides a decorator to facilitate this. You can import and then use the @trace_instance decorator for each method. This will log basic starting and ending messages in the expected format.
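To illustrate what such a decorator does, here is a sketch built on Python's standard logging module. It is not the Logger module's trace_instance; the name trace_instance_sketch, the use of the DEBUG level, and the Downloader class are assumptions made for this example.

```python
import functools
import logging


def trace_instance_sketch(method):
    """Log 'starting' and 'ending' messages around an instance method."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        name = "{}.{}".format(type(self).__name__, method.__name__)
        self.logger.debug("starting %s", name)
        result = method(self, *args, **kwargs)
        self.logger.debug("ending %s: %r", name, result)
        return result

    return wrapper


class Downloader:
    """Hypothetical class with a logger exposed as self.logger."""

    def __init__(self):
        self.logger = logging.getLogger("Downloader")

    @trace_instance_sketch
    def fetch(self, path):
        return len(path)
```

With debug output enabled, calling Downloader().fetch("abc") in this sketch logs "starting Downloader.fetch" followed by "ending Downloader.fetch: 3". Note that this simplified version logs no ending message when the method raises an exception.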

The decorator does not allow adding any extra information, which is not necessary but can sometimes be helpful. Most of the time any extra information can be added with extra debug log messages, but if a custom trace message is desired it can be done manually. To do so, call self.logging.trace at the start of the method and at every place it can be exited. The starting message should always begin with starting {class}.{method} and may optionally include other details after that. The message at each exit point should begin with ending {class}.{method}: {return value}. This convention facilitates debugging by giving each method a predictable start and end marker, and it makes clear where the log message originates.
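A manually traced method with two exit points might look like the sketch below. The ListTraceLog class is a stand-in that only collects messages, and the assumption that trace takes a single message string is ours; FtpModule and pick_file are hypothetical names.

```python
class ListTraceLog:
    """Stand-in for the real logging object; it only collects messages."""

    def __init__(self):
        self.messages = []

    def trace(self, message):
        self.messages.append(message)


class FtpModule:
    def __init__(self):
        self.logging = ListTraceLog()

    def pick_file(self, candidates):
        # Starting message: fixed prefix first, optional details after it.
        self.logging.trace(
            "starting FtpModule.pick_file with {} candidates".format(len(candidates))
        )
        if not candidates:
            # Every exit point logs its own ending message.
            self.logging.trace("ending FtpModule.pick_file: None")
            return None
        choice = candidates[0]
        self.logging.trace("ending FtpModule.pick_file: {}".format(choice))
        return choice


module = FtpModule()
module.pick_file(["a.txt", "b.txt"])
print(module.logging.messages)
```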

Other log messages can be presented at one of several log levels to ease troubleshooting without necessarily taking too much disk space.

The Logger module will handle opening and managing its output to the correct destination, which could be file(s) or syslog or other potential locations. The underlying destination and mechanism are transparent to the callers.