## Tutorial topics:

### Getting Started: What's the big idea?

BehaviorSearch is a software tool designed to help with automating the exploration of agent-based models (ABMs), by using genetic algorithms and other heuristic techniques to search the parameter-space.

BehaviorSearch interfaces with the popular NetLogo ABM development platform, to provide a *low-threshold* way to search for combinations of model parameter settings that will result in a specified target behavior.

#### How to explore your model in four steps:

- Choose model parameters to vary and what ranges are allowed.
- Design a quantitative measure for a behavior of interest.
- Choose a search algorithm to find parameters that maximize (or minimize) your measure.
- Run the search and examine the results.

**Note:**In this tutorial, we will assume that you already have a functioning NetLogo model that you are interested in exploring/analyzing.

Figure 1: BehaviorSearch Experiment Editor GUI,

color coded by the different functional areas of the interface.

When you first open BehaviorSearch, the window that appears is the BehaviorSearch Experiment Editor.
BehaviorSearch is centered around the paradigm of an *experiment* (or *search protocol*), which contains all of the information necessary to specify how the search should be performed on a model.
The BehaviorSearch GUI helps you create, open, modify, and save these experiments (stored as files with the ".bsearch" extension).

The experiment editor may feel daunting at first, with so many choices and options to be filled in. However, it is organized in logical subsections, as shown in Figure 1.

- The thin purple region at top specifies the NetLogo model you're working with.
- The blue region (upper left) specifies the parameters to vary, and valid ranges for each parameter.
- The green region (upper right) specifies how to run the model, what measure to collect, and when.
- The red region (lower right) specifies how many times to run the model, as well as how to take the data collected from the model runs and turn it into an objective function for the search method to minimize or maximize.
- The yellow region (lower left) specifies what search algorithm and search space encoding to use, as well as any additional parameters needed to configure the search algorithm.

### Specifying a parameter space

First, choose the model you are interested in exploring (with the "Browse for model" button, or by typing in the path). Let's assume you choose the Flocking model from NetLogo's models library (in the Biology category).

The next step is to specify settings, or ranges of settings, for each of the model's parameters. The easiest way to get started is to click the "Load param ranges from model interface" button, which will automatically extract the parameters/ranges that are included in your model's interface tab (i.e. SLIDERS, CHOOSERS, and SWITCHES).

Figure 2: An example search space specification for

the Flocking model from NetLogo's models library.

One search space specification for the Flocking model consists of the following definitions (also shown in Figure 2):

`
["population" 50]
["vision" [0 0.25 10]]
["minimum-separation" [0 0.25 5]]
["max-align-turn" [0 0.25 20]]
["max-separate-turn" [0 0.25 20]]
["max-cohere-turn" [0 0.25 20]]
`

In this case, the `population` parameter (number of birds) is being held constant, whereas each of the other parameters are allowed to range.
For instance, the `vision` parameter ranges from 0 up to 10, by increments of 0.25.

The total size of this search space is 41×21×81×81×81 = 457570701 possible combinations of parameter settings.

You may notice that the syntax for specifying parameter ranges is *very* similar to that for *BehaviorSpace*. However, in *BehaviorSearch* it is also possible to specify a continuous range for a parameter, by using "C" for the increment -- see examples below.

- Boolean and discrete-choice parameters (i.e. switches and choosers):
`["variable-name" choice1 choice2 ...]`- Ex. 1)
`["wind?" true false]` - Ex. 2)
`["neighborhood-mode" "Moore" "Von Neumann"]`

- Ex. 1)
- Integer and numeric ranges:
`["variable-name" [start increment stop]]`- Ex. 1)
`["population" [100 10 200]]`→ (100,110,120,...,200) - Ex. 2)
`["death-rate" [0.2 0.01 0.4]]`→ (0.2, 0.21, 0.22, ... 0.4) - Ex. 3)
`["death-rate" [0.2 "C" 0.4]]`→ (all numbers between 0.2 and 0.4)

- Ex. 1)

**Note:**With

*BehaviorSearch*you are specifying only the potential range of variables for the search space, unlike with

*BehaviorSpace*, where every combination will be run. The size of the search space will often be very large, and the search algorithms you choose will only examine a small fraction of it. If you want to perform an exhaustive (factorial-type) search, you should use the BehaviorSpace tool that's built-in to NetLogo. BehaviorSearch is useful when you have a parameter space that's too large too enumerate, and you're willing to use heuristic search methods to try to find parameters that yield behavior that you're interested in.

**Technical note:**Although you may specify "C" for a continuous ranged parameter, the range may still be discretized if the

*search space encoding representation*that you choose does not support continuous/ real-valued numbers. For instance, the StandardBinaryChromosome and GrayBinaryChromosome encodings will use only 16 bits for a continuous parameter, which allows the parameter to take on up to 65,536 values. For encodings that do allow continous parameters (such as MixedTypeChromosome), the "continuous" variable will technically be limited to 64-bit floating-point precision.

### Designing a behavioral measure

Designing a quantative measure that captures the extent to which a model exhibits some type of behavior is often the most challenging part of the process. This is particularly the case if the behavior of interest is particularly qualitative in nature, or involves complex relationships between groups of agents. While it can sometimes be difficult to design an appropriate measure for your particular ABM and behavior, once you have done so it becomes possible to automate the task of searching for that behavior in your model. For the sake of this tutorial, we will only discuss reasonably simple and straightforward behavioral measures. However, in general the measures can be quite complex and powerful, and can address a variety of model exploration and analysis tasks.

**Note:**Proper methodology for the design of quantitative measures for ABM parameter search is an area of future research -- we plan to publish a more complete discussion of this topic, including a range of richer examples in the future. At present, some preliminary work discussing these ideas can be found on the papers page.)

#### Options for model running and data collection

For the Flocking model, one potentially interesting question is about the convergence of the flock of birds. In particular, let's investigate directional convergence (as opposed to positional convergence). In the Flocking model, the birds all start facing random directions, but (depending on the parameter settings) they may all converge on the same heading eventually. In this example, our search task will be to find what parameters cause the most convergence to happen quickly.

There are several things that we must specify, regarding how the model should be run, so that we can collect data that will help us search for the model conditions that cause convergence. Specifications for our example are shown in Figure 3; we will discuss them each in turn.

Figure 3: Specifying how the model should be run,

what data to collect, and when to record it.

**Setup:**NetLogo commands to run to setup the model. (typically, just call the SETUP procedure.)**Step:**NetLogo commands to be run over and over again. (typically, just call the GO procedure.)**Measure:**this is a NetLogo expression, which somehow quantifies the behavior that we are interested in searching for. (discussed more below)**Measure if:**a condition controlling on which steps the**measure**takes place. (in this case, only after 75 ticks have passed.)**Stop if:**a stop condition for the model (optional)**Step limit:**a limit on the number of times the**step**commands will be run.

For our convergence example, the measure is:

`standard-deviation [dx] of turtles + standard-deviation [dy] of turtles`

The measure can consist of any numeric NetLogo expression - what's important is that the measure is correlated with the behavior we would like to elicit from the model. In this case, we are measuring the sum of the variation of turtles headings in both the X and Y directions. When this measure is high, turtles are pointed all over. When this measure is 0, all of the turtles have exactly the same heading. Finding convergence is the same as minimizing this measure of flock heading variation. We will start recording this measure after 75 ticks have passed, and we are only running the model for 100 steps in total, meaning that we will be recording data between tick 75 and tick 100.

#### Designing the objective function:

Above, we specified how to collect the data we need from the model in order to perform our search. However, there are several more things that must be specified before our search goal is completely defined. We know how to collect the data, but now we need to turn it into an objective function (a.k.a. "fitness function"). For our flocking convergence example, the additional information we must fill in is shown in Figure 4.

Figure 4: Specifying the search goal/task

(the objective/fitness function)

**Goal:**Are we trying to minimize or maximize something?**Collected measure:**During one model run, we may have collected the measure multiple times. How can we condense that into one number?- In this case, MEAN_ACROSS_STEPS will be taking the mean (average) of the 26 measures (taken from tick 75 to tick 100).
- (there are other choices for min/max/median/variance)
- Also, the AT_FINAL_STEP choice is useful if you are only interested in the last measure that was recorded.

**Sampling:**How many times should the model be run? Running the model once may not give representative results, so you may want to perform multiple replicate runs (with different initial random seeds), and collect behavioral measures from each of them.- For this example, we will run the Flocking model 5 times.
- (Technical note: BehaviorSearch currently only supports fixed sampling, though in the future more advanced search techniques might use adaptive sampling methods.)

**Combine replicates:**If you are doing multiple replicate runs of the model, how do you combine those results, to get a single number for our objective function?- For this example, we will be taking the MEAN (average) result of the 5 model runs.
- MEDIAN may be a better choice if your measure occasionally yields ridiculously high or low outlier values, which you'd like to ignore.
- On the other hand, MIN/MAX may be useful choices if you want to search for parameters that cause extreme behavior, while ignoring average behavior.
- The VARIANCE choices may be useful for finding parameters for which there is volatility in whether the model exhibits a behavior or not.
Such volatility
*might*indicate a phase transition between two regimes of model behavior. STDEV is the same, except that the fitness function values will be in the same units of the original parameter, which may be preferable for human interpretation.

**Take derivative?**Sometimes you would like to find a point in the parameter space where the change in your behavioral measure is maximized (or minimized) with respect to a small change in some parameter. Such places may indicate a*phase transition*,*critical point*, or*leverage point*. The**Take derivative?**option allows you to maximize/minimize the approximate derivative of your fitness function with respect to a specified parameter and a specified*delta*(change amount).- If
**Use ABS value?**is checked, then the reported difference is always positive. - If you choose the special value “@MUTATE@” then finds a neighboring point in the search space using mutation from the parameter settings being evaluated, with the mutation rate specified by
*delta*. - This Flocking example does not use the derivative functionality, but the
*Example Fire Derivative*example that comes with BehaviorSearch does.

- If
**Evaluation limit:**How many total model runs should the search process perform, before stopping?**BestChecking replicates:**the number of additional replicate model runs that should be performed to get an unbiased estimate of the true objective function value, each time the search algorithm finds a new set of parameters that it thinks is "better" than any previous set.- The motivation for this is that ABMs are usually stochastic, and when sampling a measure a small/finite number of times (such as 5, in our example here), there is likely to still be some "noise" in the objective function. Thus a search algorithm may appear to be making progress, finding better and better parameter settings, when in fact the better results are due to random noise. Using
*BestChecking*replicates can help you identify when this is the case. - Also, since new "bests" are found relatively infrequently, you can usually afford to specify a higher number of BestChecking replicates than you can for normal sampling, yielding more statistically significant reading of the objective function as the best parameters that the search found.
- BestChecking replicates are not counted against the total "model run" limit for the search. These replicates are extrinsic to the search process, but are included in the output results to evaluate the search performance, and verify the objective function values that are obtained.

- The motivation for this is that ABMs are usually stochastic, and when sampling a measure a small/finite number of times (such as 5, in our example here), there is likely to still be some "noise" in the objective function. Thus a search algorithm may appear to be making progress, finding better and better parameter settings, when in fact the better results are due to random noise. Using

### Choosing a search algorithm

Once the objective function is fully specified, the final choice is about how the parameter space should be represented and explored. This involves choosing a search encoding representation (for the search space), as well as choosing a search algorithm and setting the necessary parameters for that search algorithm. The choices for the Flocking convergence example are shown in Figure 5.

**Making choices:**Choosing the search algorithm, search algorithm parameters, and search space representation can feel very daunting. The terminology is rather abstruse, and it's not necessarily clear what the best choice is. Part of our ongoing research agenda involves comparing the performance of various algorithms and developing guidelines for choosing efficient/appropriate search techniques. Eventually standards and guidelines will emerge, to help with this decision-making process. Until then, we offer a few general guidelines in this tutorial, and encourage you to explore several options and see what works best for you.

Poor choices may cause the search to progress more slowly (or totally stall). Thus, negative search results may mean that the behavior you're interested in never occurs in the model, or it may mean that you need to try other search methods. On the other hand, positive results tell you that the behavior does occur, and also what parameter settings can elicit it.

Figure 5: Search algorithm options

and search space representation.

#### Search Space Encoding Representation

The *search space* consists of all allowable combinations of settings for the model parameters, as you specified above.
BehaviorSearch currently supports four different search space encodings (and its extensible architecture makes it simple to plug in new ones in the future):

**MixedTypeChromosome**This encoding most closely matches the way that one commonly thinks of the ABM parameters. Each parameter is stored separately with its own data type (discrete numeric, continuous numeric, categorical, boolean, etc). Mutation applies to each parameter separately (e.g. continous parameters use Gaussian mutation, boolean parameters get flipped).**StandardBinaryChromosome**In this encoding, every parameter is converted into a string of binary digits, and these sequences are concatenated together into one large bit array. Mutation and crossover then occur on a per-bit basis.**GrayBinaryChromosome**Similar to StandardBinaryChromosome, except that numeric values are encoded to binary strings using a Gray code, instead of the standard "high order" bit ordering. Gray codes have generally been found to give better performance for search representations, since numeric values that are close together are more likely to be fewer mutations away from each other.**RealHypercubeChromosome**In this encoding, every parameter (numeric or not) is represented by a "real-valued" continuous variable. This encoding exists mainly to facilitate the (future) use of algorithms that assume a continuous numeric space (such as Particle Swarm Optimization), and allow them to be applied even when some of the model parameters are not numeric.

#### Search Algorithms and Options

The search algorithm determines what order the different parameter settings will be tried in, in order to find the behavior that you quantified above.

BehaviorSearch currently includes 4 search algorithms (though additional algorithms will be added in future releases). Here is a description of each:

**RandomSearch**This is a naive baseline search algorithm, which simply randomly generates one set of parameters after another, computing the objective function for each in turn, and at the end it returns the best settings it found. RandomSearch is unlikely to be the most efficient search choice, but it is very straightforward method, and it performs an unbiased exploration of the search space.**MutationHillClimber**This search algorithm starts with a random location in the search space (i.e. set of parameters), and then repeatedly generates a neighboring location (using a mutation operator) and moves to that new location only if it is better than the old location.**mutation-rate**controls the probability of mutation**restart-after-stall-count:**if the hill climber makes some number (*restart-after-stall-count*) of unsuccessful attempts to move to a random neighbor, it assumes it is trapped at a local optimum in the space, so it restarts by jumping to a new random location anywhere in the search space.

**SimualtedAnnealing**This search algorithm is similar to a hill climbing approach, except that a downhill (inferior) move may also occur, but only with a certain probability based on the*temperature*of the system, which decreases over time. Simulated annealing is inspired by the physical annealing process in metallurgy: heating followed by the controlled cooling of a material in order to increase crystal size.**mutation-rate**how much mutation occurs when choosing a candidate location for moving.**restart-after-stall-count:**if it doesn't manage to move to a new location after X attempts, reset the temperature, jump to a random location in the search space and try again.**initial-temperature**the system's initial 'temperature' (a reasonable choice would be the average expected difference in the fitness function's value for two random points in the search space)**temperature-change-factor**the system's current 'temperature' is multiplied by this factor (which needs to be less than 1!) after each move. (Using this exponential temperature decay means that temperature will approach 0 over time. Unfortunately, the optimal rate for the temperature to decrease varies between problems.)

**StandardGA**This is a simple (classic) Genetic Algorithm, which can operate on any of the available search representations.**mutation-rate:**controls the probability of mutation**crossover-rate:**the probability of using two parents when creating a child (otherwise the child is created asexually)**population-size:**the number of individuals in each generation.**tournament-size:**the number of individuals that compete to be selected for reproduction via*tournament selection*. Higher values cause more*selection pressure*which will push the GA population to converge more quickly. Usually 2 or 3 is a good value.**population-model:**'generational', 'steady-state-replace-random', or 'steady-state-replace-worst'*generational*means the whole population is replaced at once*steady-state*means that only one single individual is replaced by reproduction each iteration. The individual being replaced may be randomly-chosen, or the current worst.

**Note on mutation:**The mutation-rate affects different search space representations differently. In the StandardBinary and GrayBinary encodings, the mutation-rate is the probability of each bit mutating, whereas for the MixedType and RealHypercube, it is the probability of each whole parameter mutating. Thus, you should generally use much higher mutation-rates (~ 0.5) for the MixedType and RealHypercube representations, and much lower rates (~ 0.02) for binary encodings.

#### Fitness Caching

There is also a checkbox labeled **Use fitness caching**.
This controls whether the search algorithm caches (memorizes) the result of the objective (fitness) function every time it gets evaluated, so that it doesn't have to recompute it if the search returns to those exact same parameter settings again.
Since running ABM simulations can be time-consuming (especially when dealing with large agent populations for many ticks), turning on "fitness caching" can potentially be a considerable time-saver.
*However*, because ABMs are usually stochastic, each time a point in the space is re-evaluated, the search process would get a new independent estimation of the value at that location.
The trade-offs in the interaction between the effects of fitness caching and "noisy" fitness evaluation have not been fully investigated -- at present it is unclear precisely when fitness caching will improve overall search performance.

For this tutorial example on Flocking convergence, we are using fitness caching and a GrayBinaryChromosome search space encoding representation, together with a standard generational GA with a population size of 30, a 5% mutation rate, and a crossover rate of 70%, using tournament selection with tournament size 3. For this model and search task, these choices were found to be effective (and far superior to RandomSearch). However, we have not exhaustively tested combinations of search algorithms, search algorithm parameters, and encodings, so other choices might prove to be more efficient.

### Running the search and examining the results

Figure 6: Run options dialog.

After setting all of those options for how to perform the search, we are almost ready to hit the **Run BehaviorSearch** button, down in the lower right corner of the window.
However, first we should save this "search protocol" file that we've been editing, using the File->Save menu item.

#### Setting Up the Search Experiment:

Clicking **Run BehaviorSearch** brings up a dialog for choosing some additional options relating to the search running configuration, as shown in Figure 6.

**Output file stem:**where to save the output data from the search. Specifically, a number of files will be created, each starting with this same filename "stem" (e.g. "stem.finalBests.csv", "stem.bestHistory", etc). The output file details will be discussed below.**Number of searches:**How many times should the whole search process be repeated? A single search may not find the best parameter; additional searches improve confidence.**Starting at search ID:**This option only affects the ID numbering in the output files. (One day you might run searches numbered 1...5, and the next day run searches 6...10).**Initial random seed:**Random seed for the pseudo-random number generator, to start the first search (additional searches will be seeded with following consecutive numbers). This is useful for reproducing the exact same search later.**Number of threads:**Using multiple threads can significantly speed up the search process (on multi-processor/multi-core computers).*(Different numbers of threads should only affect running-time, and not the results obtained)*.**Brief Output?**BehaviorSearch's default behavior is to create a variety of output data files (discussed below), some of which can be quite large (containig the results of all model runs and all objective function evaluations). Selecting this checkbox suppresses the creation of the two largest output files.

#### Running the Search:

While the search is running, you will see a progress dialog (similar to Figure 7).

The most significant part of this dialog is the plot showing the searches' progress. On the x-axis is the number of model runs that the search has performed (again, note that this number does not count model runs due to BestChecking). On the y-axis is the current best fitness (objective) function value that the search has discovered so far. If BestChecking replicates are used, the plot shows the independently "re-checked" fitness values. If not, then the plot will show the fitness values that the search algorithm observed.

**Note:**Again, we must emphasize that BestChecking

*does not affect*the search process, but it can show you when the search algorithm is failing to make true progress toward the goal, even when the search keeps finding better fitness values due to noise. So although BestChecking slows down the search process somewhat, it is usually a good idea, especially when you are first running experiments on a model.

The progress dialog shows the "best fitness" histories from the searches that have already been completed, with the currently running search shown as a thicker line. Also, a progress bar shows what fraction of the way you are through the currently running search.

**Tip:**Searching the parameter-space of ABMs is a time-intensive operation, and it is quite easy to construct searches that will take a long, long time to complete. (On a modern dual-core laptop, this tutorial's experiment about flocking convergence, took about an hour to complete the two repetitions of the search.) In general, you should first design search experiments that only run the model a small number of times, or only run the model a few ticks with a small number of agents. After a small preliminary run to make sure no errors occur, and things appear to be working, you can scale it up to larger runs, and you will probably have a better idea of how long large searches will take to complete.

#### Examining the results:

While future versions of BehaviorSearch may include some tools/scripts for visualizing the search results, currently the output data is given in several files, using the straightforward CSV (comma separated value) file format, and the methods of processing or visualizing it are up to you (Excel, R, SciPy/matplotlib, etc). Here is the list of output files that were generated for our Flocking convergence example:

**example_flocking.searchConfig.xml -**this XML file isn't really output data -- it just contains all of the settings that you used for running the searches. This is useful to keep with your output data, so that if you come back to it later and can't remember what search options you chose, it will all be stored here. Additionally, since this XML file is in the same format as the .BSEARCH files, you can also load this XML file into BehaviorSearch GUI to examine it, or run further similar searches.**example_flocking.modelRunHistory.csv -**this file contains a row for each time the ABM was run by the search progress, showing the parameter settings that the model was run with, as well as the value of your "measure". It also gives the random seed that the ABM was setup with, so if you like, you can go back and run it in NetLogo with exactly the same settings.**example_flocking.objectiveFunctionHistory.csv -**this file contains a row for each time the objective function is evaluated. (Remember, the objective function may require doing N replicate model runs, and then taking the average, or combining the data from those runs in some other way.)**example_flocking.bestHistory.csv -**this file contains a row for each objective function evaluation that was better than all previous ones.**example_flocking.finalBests.csv -**This file is a short list of only the single best parameter settings found by each of the searches. If you only look at one output file, this is the one.**example_flocking.finalCheckedBests.csv -**This file is similar to the .finalBests.csv file, but rather than listing the best parameter settings found*by*each search process, it lists the parameter settings that (using the extrinsic BestChecking), were found to have the best independently verified objective function value. Parameter settings listed in this file are the most likely to be the best (*although if the objective function is noisy, the objective function values reported here may be an overestimate of the real objective function value, as a result of the bias incurred by selecting the best out of the set*).

## Command Line Usage

BehaviorSearch can also run searches from the command line, using behaviorsearch_headless.bat (behaviorsearch_headless.sh on Mac/Linux).
The "headless" version of BehaviorSearch does not support designing new search experiments -- just running search protocols (*.bsearch) that have already been created using the BehaviorSearch GUI. *(Search protocol files can also be created manually -- .bsearch files are simply valid XML files based on the document type definition found in "resources/behaviorsearch.dtd".)*

This mode of running BehaviorSearch without a GUI is particularly useful for long batch runs, or running searches on high performance computing clusters.

To see the command line argument options, just run it without any arguments, as shown in Figure 8.

The most basic usage is:

`behaviorsearch.sh -p myexperiment.bsearch -o myoutput`

In this case, BehaviorSearch will run one search, the protocol for which is specified in "myexperiment.bsearch", and the output will be written to "myoutput.xxx.csv", for each of the output files created.

## Additional help, feedback, bug reports, etc.

#### Additional help:

At present, this tutorial is the only documentation available for BehaviorSearch, apart from some context-sensitive help in the software (in the form of mouse-over tooltips and help buttons). However, if you have questions that were not covered in this documentation, please contact us, and we will do our best to respond promptly with answers.

#### Feedback and bugs:

Both the BehaviorSearch software and the accompanying documentation are in an active state of development. The most recent version of the software can be obtained from www.behaviorsearch.org.
If you find **errors** in this tutorial, or **bugs** in the software itself, please contact us and report the issue!

Likewise, we welcome general comments and feature requests.

**Note:**BehaviorSearch is currently a separately maintained project, independent from NetLogo, so please do not send your bug reports or support requests to the NetLogo development team.