Tutorial topics:

Getting Started: What's the big idea?

BehaviorSearch is a software tool designed to help with automating the exploration of agent-based models (ABMs), by using genetic algorithms and other heuristic techniques to search the parameter-space.

BehaviorSearch interfaces with the popular NetLogo ABM development platform, to provide a low-threshold way to search for combinations of model parameter settings that will result in a specified target behavior.

How to explore your model in four steps:

  1. Choose model parameters to vary and what ranges are allowed.
  2. Design a quantitative measure for a behavior of interest.
  3. Choose a search algorithm to find parameters that maximize (or minimize) your measure.
  4. Run the search and examine the results.

Note: In this tutorial, we will assume that you already have a functioning NetLogo model that you are interested in exploring/analyzing.

Figure 1: BehaviorSearch Experiment Editor GUI,
color coded by the different functional areas of the interface.

When you first open BehaviorSearch, the window that appears is the BehaviorSearch Experiment Editor. BehaviorSearch is centered around the paradigm of an experiment (or search protocol), which contains all of the information necessary to specify how the search should be performed on a model. The BehaviorSearch GUI helps you create, open, modify, and save these experiments (stored as files with the ".bsearch" extension).

The experiment editor may feel daunting at first, with so many choices and options to be filled in. However, it is organized in logical subsections, as shown in Figure 1.

We will now discuss each of these regions in more detail, and how the various options affect the search process.

Specifying a parameter space

First, choose the model you are interested in exploring (with the "Browse for model" button, or by typing in the path). Let's assume you choose the Flocking model from NetLogo's models library (in the Biology category).

The next step is to specify settings, or ranges of settings, for each of the model's parameters. The easiest way to get started is to click the "Load param ranges from model interface" button, which will automatically extract the parameters/ranges that are included in your model's interface tab (i.e. SLIDERS, CHOOSERS, and SWITCHES).

Figure 2: An example search space specification for
the Flocking model from NetLogo's models library.

One search space specification for the Flocking model consists of the following definitions (also shown in Figure 2):

["population" 50]
["vision" [0 0.25 10]]
["minimum-separation" [0 0.25 5]]
["max-align-turn" [0 0.25 20]]
["max-separate-turn" [0 0.25 20]]
["max-cohere-turn" [0 0.25 20]]

In this case, the population parameter (number of birds) is being held constant, whereas each of the other parameters are allowed to range. For instance, the vision parameter ranges from 0 up to 10, by increments of 0.25.

The total size of this search space is 41×21×81×81×81 = 457570701 possible combinations of parameter settings.

You may notice that the syntax for specifying parameter ranges is very similar to that for BehaviorSpace. However, in BehaviorSearch it is also possible to specify a continuous range for a parameter, by using "C" for the increment -- see examples below.

Note: With BehaviorSearch you are specifying only the potential range of variables for the search space, unlike with BehaviorSpace, where every combination will be run. The size of the search space will often be very large, and the search algorithms you choose will only examine a small fraction of it. If you want to perform an exhaustive (factorial-type) search, you should use the BehaviorSpace tool that's built-in to NetLogo. BehaviorSearch is useful when you have a parameter space that's too large too enumerate, and you're willing to use heuristic search methods to try to find parameters that yield behavior that you're interested in.
Technical note: Although you may specify "C" for a continuous ranged parameter, the range may still be discretized if the search space encoding representation that you choose does not support continuous/ real-valued numbers. For instance, the StandardBinaryChromosome and GrayBinaryChromosome encodings will use only 16 bits for a continuous parameter, which allows the parameter to take on up to 65,536 values. For encodings that do allow continous parameters (such as MixedTypeChromosome), the "continuous" variable will technically be limited to 64-bit floating-point precision.

Designing a behavioral measure

Designing a quantative measure that captures the extent to which a model exhibits some type of behavior is often the most challenging part of the process. This is particularly the case if the behavior of interest is particularly qualitative in nature, or involves complex relationships between groups of agents. While it can sometimes be difficult to design an appropriate measure for your particular ABM and behavior, once you have done so it becomes possible to automate the task of searching for that behavior in your model. For the sake of this tutorial, we will only discuss reasonably simple and straightforward behavioral measures. However, in general the measures can be quite complex and powerful, and can address a variety of model exploration and analysis tasks.

Note: Proper methodology for the design of quantitative measures for ABM parameter search is an area of future research -- we plan to publish a more complete discussion of this topic, including a range of richer examples in the future. At present, some preliminary work discussing these ideas can be found on the papers page.)

Options for model running and data collection

For the Flocking model, one potentially interesting question is about the convergence of the flock of birds. In particular, let's investigate directional convergence (as opposed to positional convergence). In the Flocking model, the birds all start facing random directions, but (depending on the parameter settings) they may all converge on the same heading eventually. In this example, our search task will be to find what parameters cause the most convergence to happen quickly.

There are several things that we must specify, regarding how the model should be run, so that we can collect data that will help us search for the model conditions that cause convergence. Specifications for our example are shown in Figure 3; we will discuss them each in turn.

Figure 3: Specifying how the model should be run,
what data to collect, and when to record it.

For our convergence example, the measure is:

standard-deviation [dx] of turtles + standard-deviation [dy] of turtles

The measure can consist of any numeric NetLogo expression - what's important is that the measure is correlated with the behavior we would like to elicit from the model. In this case, we are measuring the sum of the variation of turtles headings in both the X and Y directions. When this measure is high, turtles are pointed all over. When this measure is 0, all of the turtles have exactly the same heading. Finding convergence is the same as minimizing this measure of flock heading variation. We will start recording this measure after 75 ticks have passed, and we are only running the model for 100 steps in total, meaning that we will be recording data between tick 75 and tick 100.

Designing the objective function:

Above, we specified how to collect the data we need from the model in order to perform our search. However, there are several more things that must be specified before our search goal is completely defined. We know how to collect the data, but now we need to turn it into an objective function (a.k.a. "fitness function"). For our flocking convergence example, the additional information we must fill in is shown in Figure 4.

Figure 4: Specifying the search goal/task
(the objective/fitness function)

Choosing a search algorithm

Once the objective function is fully specified, the final choice is about how the parameter space should be represented and explored. This involves choosing a search encoding representation (for the search space), as well as choosing a search algorithm and setting the necessary parameters for that search algorithm. The choices for the Flocking convergence example are shown in Figure 5.

Making choices: Choosing the search algorithm, search algorithm parameters, and search space representation can feel very daunting. The terminology is rather abstruse, and it's not necessarily clear what the best choice is. Part of our ongoing research agenda involves comparing the performance of various algorithms and developing guidelines for choosing efficient/appropriate search techniques. Eventually standards and guidelines will emerge, to help with this decision-making process. Until then, we offer a few general guidelines in this tutorial, and encourage you to explore several options and see what works best for you.

Poor choices may cause the search to progress more slowly (or totally stall). Thus, negative search results may mean that the behavior you're interested in never occurs in the model, or it may mean that you need to try other search methods. On the other hand, positive results tell you that the behavior does occur, and also what parameter settings can elicit it.

Figure 5: Search algorithm options
and search space representation.

Search Space Encoding Representation

The search space consists of all allowable combinations of settings for the model parameters, as you specified above. BehaviorSearch currently supports four different search space encodings (and its extensible architecture makes it simple to plug in new ones in the future):

Search Algorithms and Options

The search algorithm determines what order the different parameter settings will be tried in, in order to find the behavior that you quantified above.

BehaviorSearch currently includes 4 search algorithms (though additional algorithms will be added in future releases). Here is a description of each:

Note on mutation: The mutation-rate affects different search space representations differently. In the StandardBinary and GrayBinary encodings, the mutation-rate is the probability of each bit mutating, whereas for the MixedType and RealHypercube, it is the probability of each whole parameter mutating. Thus, you should generally use much higher mutation-rates (~ 0.5) for the MixedType and RealHypercube representations, and much lower rates (~ 0.02) for binary encodings.

Fitness Caching

There is also a checkbox labeled Use fitness caching. This controls whether the search algorithm caches (memorizes) the result of the objective (fitness) function every time it gets evaluated, so that it doesn't have to recompute it if the search returns to those exact same parameter settings again. Since running ABM simulations can be time-consuming (especially when dealing with large agent populations for many ticks), turning on "fitness caching" can potentially be a considerable time-saver. However, because ABMs are usually stochastic, each time a point in the space is re-evaluated, the search process would get a new independent estimation of the value at that location. The trade-offs in the interaction between the effects of fitness caching and "noisy" fitness evaluation have not been fully investigated -- at present it is unclear precisely when fitness caching will improve overall search performance.

For this tutorial example on Flocking convergence, we are using fitness caching and a GrayBinaryChromosome search space encoding representation, together with a standard generational GA with a population size of 30, a 5% mutation rate, and a crossover rate of 70%, using tournament selection with tournament size 3. For this model and search task, these choices were found to be effective (and far superior to RandomSearch). However, we have not exhaustively tested combinations of search algorithms, search algorithm parameters, and encodings, so other choices might prove to be more efficient.

Running the search and examining the results

Figure 6: Run options dialog.

After setting all of those options for how to perform the search, we are almost ready to hit the Run BehaviorSearch button, down in the lower right corner of the window. However, first we should save this "search protocol" file that we've been editing, using the File->Save menu item.

Setting Up the Search Experiment:

Clicking Run BehaviorSearch brings up a dialog for choosing some additional options relating to the search running configuration, as shown in Figure 6.

Figure 7: Search progress dialog.

Running the Search:

While the search is running, you will see a progress dialog (similar to Figure 7).

The most significant part of this dialog is the plot showing the searches' progress. On the x-axis is the number of model runs that the search has performed (again, note that this number does not count model runs due to BestChecking). On the y-axis is the current best fitness (objective) function value that the search has discovered so far. If BestChecking replicates are used, the plot shows the independently "re-checked" fitness values. If not, then the plot will show the fitness values that the search algorithm observed.

Note: Again, we must emphasize that BestChecking does not affect the search process, but it can show you when the search algorithm is failing to make true progress toward the goal, even when the search keeps finding better fitness values due to noise. So although BestChecking slows down the search process somewhat, it is usually a good idea, especially when you are first running experiments on a model.

The progress dialog shows the "best fitness" histories from the searches that have already been completed, with the currently running search shown as a thicker line. Also, a progress bar shows what fraction of the way you are through the currently running search.

Tip: Searching the parameter-space of ABMs is a time-intensive operation, and it is quite easy to construct searches that will take a long, long time to complete. (On a modern dual-core laptop, this tutorial's experiment about flocking convergence, took about an hour to complete the two repetitions of the search.) In general, you should first design search experiments that only run the model a small number of times, or only run the model a few ticks with a small number of agents. After a small preliminary run to make sure no errors occur, and things appear to be working, you can scale it up to larger runs, and you will probably have a better idea of how long large searches will take to complete.

Examining the results:

While future versions of BehaviorSearch may include some tools/scripts for visualizing the search results, currently the output data is given in several files, using the straightforward CSV (comma separated value) file format, and the methods of processing or visualizing it are up to you (Excel, R, SciPy/matplotlib, etc). Here is the list of output files that were generated for our Flocking convergence example:

The most basic usage of search results is to look at the parameter settings in the final best files, and then open your agent-based model back up in NetLogo and run the model with those parameters, to see how the model behaves under these conditions. Doing this can help uncover unexpected or interesting behavior occurring in the model, and it can also help you identify whether the quantitative measure of model behavior that you created is doing a sufficiently good job of identifying the type of behavior that you are interested in. For further discussion of the type of graphs that one might make, or inferences that one might draw from search results, you might find it useful to read “Finding Forms of Flocking” or one of the other papers linked to on the papers page. This paper discusses results for a Flocking convergence search experiment very similar to the one presented here, as well as several examples of searching for other types of flocking behavior.

Figure 8: Terminal displaying command line arguments usage.
(click to enlarge)

Command Line Usage

BehaviorSearch can also run searches from the command line, using behaviorsearch_headless.bat (behaviorsearch_headless.sh on Mac/Linux). The "headless" version of BehaviorSearch does not support designing new search experiments -- just running search protocols (*.bsearch) that have already been created using the BehaviorSearch GUI. (Search protocol files can also be created manually -- .bsearch files are simply valid XML files based on the document type definition found in "resources/behaviorsearch.dtd".)

This mode of running BehaviorSearch without a GUI is particularly useful for long batch runs, or running searches on high performance computing clusters.

To see the command line argument options, just run it without any arguments, as shown in Figure 8.

The most basic usage is:

behaviorsearch.sh -p myexperiment.bsearch -o myoutput

In this case, BehaviorSearch will run one search, the protocol for which is specified in "myexperiment.bsearch", and the output will be written to "myoutput.xxx.csv", for each of the output files created.

Additional help, feedback, bug reports, etc.

Additional help:

At present, this tutorial is the only documentation available for BehaviorSearch, apart from some context-sensitive help in the software (in the form of mouse-over tooltips and help buttons). However, if you have questions that were not covered in this documentation, please contact us, and we will do our best to respond promptly with answers.

Feedback and bugs:

Both the BehaviorSearch software and the accompanying documentation are in an active state of development. The most recent version of the software can be obtained from www.behaviorsearch.org. If you find errors in this tutorial, or bugs in the software itself, please contact us and report the issue!

Likewise, we welcome general comments and feature requests.

Note: BehaviorSearch is currently a separately maintained project, independent from NetLogo, so please do not send your bug reports or support requests to the NetLogo development team.