Getting Started: What's the big idea?
BehaviorSearch is a software tool designed to help with automating the exploration of agent-based models (ABMs), by using genetic algorithms and other heuristic techniques to search the parameter-space.
BehaviorSearch interfaces with the popular NetLogo ABM development platform, to provide a low-threshold way to search for combinations of model parameter settings that will result in a specified target behavior.
How to explore your model in four steps:
- Choose model parameters to vary and what ranges are allowed.
- Design a quantitative measure for a behavior of interest.
- Choose a search algorithm to find parameters that maximize (or minimize) your measure.
- Run the search and examine the results.
Figure 1: BehaviorSearch Experiment Editor GUI,
color coded by the different functional areas of the interface.
When you first open BehaviorSearch, the window that appears is the BehaviorSearch Experiment Editor. BehaviorSearch is centered around the paradigm of an experiment (or search protocol), which contains all of the information necessary to specify how the search should be performed on a model. The BehaviorSearch GUI helps you create, open, modify, and save these experiments (stored as files with the ".bsearch" extension).
The experiment editor may feel daunting at first, with so many choices and options to be filled in. However, it is organized in logical subsections, as shown in Figure 1.
- The thin purple region at top specifies the NetLogo model you're working with.
- The blue region (upper left) specifies the parameters to vary, and valid ranges for each parameter.
- The green region (upper right) specifies how to run the model, what measure to collect, and when.
- The red region (lower right) specifies how many times to run the model, as well as how to take the data collected from the model runs and turn it into an objective function for the search method to minimize or maximize.
- The yellow region (lower left) specifies what search algorithm and search space encoding to use, as well as any additional parameters needed to configure the search algorithm.
Specifying a parameter space
First, choose the model you are interested in exploring (with the "Browse for model" button, or by typing in the path). Let's assume you choose the Flocking model from NetLogo's models library (in the Biology category).
The next step is to specify settings, or ranges of settings, for each of the model's parameters. The easiest way to get started is to click the "Load param ranges from model interface" button, which will automatically extract the parameters/ranges that are included in your model's interface tab (i.e. SLIDERS, CHOOSERS, and SWITCHES).
Figure 2: An example search space specification for
the Flocking model from NetLogo's models library.
One search space specification for the Flocking model consists of the following definitions (also shown in Figure 2):
["vision" [0 0.25 10]]
["minimum-separation" [0 0.25 5]]
["max-align-turn" [0 0.25 20]]
["max-separate-turn" [0 0.25 20]]
["max-cohere-turn" [0 0.25 20]]
In this case, the population parameter (number of birds) is being held constant, whereas each of the other parameters are allowed to range. For instance, the vision parameter ranges from 0 up to 10, by increments of 0.25.
The total size of this search space is 41×21×81×81×81 = 457570701 possible combinations of parameter settings.
You may notice that the syntax for specifying parameter ranges is very similar to that for BehaviorSpace. However, in BehaviorSearch it is also possible to specify a continuous range for a parameter, by using "C" for the increment -- see examples below.
- Boolean and discrete-choice parameters (i.e. switches and choosers): ["variable-name" choice1 choice2 ...]
- Ex. 1) ["wind?" true false]
- Ex. 2) ["neighborhood-mode" "Moore" "Von Neumann"]
- Integer and numeric ranges: ["variable-name" [start increment stop]]
- Ex. 1) ["population" [100 10 200]] → (100,110,120,...,200)
- Ex. 2) ["death-rate" [0.2 0.01 0.4]] → (0.2, 0.21, 0.22, ... 0.4)
- Ex. 3) ["death-rate" [0.2 "C" 0.4]] → (all numbers between 0.2 and 0.4)
Designing a behavioral measure
Designing a quantative measure that captures the extent to which a model exhibits some type of behavior is often the most challenging part of the process. This is particularly the case if the behavior of interest is particularly qualitative in nature, or involves complex relationships between groups of agents. While it can sometimes be difficult to design an appropriate measure for your particular ABM and behavior, once you have done so it becomes possible to automate the task of searching for that behavior in your model. For the sake of this tutorial, we will only discuss reasonably simple and straightforward behavioral measures. However, in general the measures can be quite complex and powerful, and can address a variety of model exploration and analysis tasks.
Options for model running and data collection
For the Flocking model, one potentially interesting question is about the convergence of the flock of birds. In particular, let's investigate directional convergence (as opposed to positional convergence). In the Flocking model, the birds all start facing random directions, but (depending on the parameter settings) they may all converge on the same heading eventually. In this example, our search task will be to find what parameters cause the most convergence to happen quickly.
There are several things that we must specify, regarding how the model should be run, so that we can collect data that will help us search for the model conditions that cause convergence. Specifications for our example are shown in Figure 3; we will discuss them each in turn.
Figure 3: Specifying how the model should be run,
what data to collect, and when to record it.
- Setup: NetLogo commands to run to setup the model. (typically, just call the SETUP procedure.)
- Step: NetLogo commands to be run over and over again. (typically, just call the GO procedure.)
- Measure: this is a NetLogo expression, which somehow quantifies the behavior that we are interested in searching for. (discussed more below)
- Measure if: a condition controlling on which steps the measure takes place. (in this case, only after 75 ticks have passed.)
- Stop if: a stop condition for the model (optional)
- Step limit: a limit on the number of times the step commands will be run.
For our convergence example, the measure is:
standard-deviation [dx] of turtles + standard-deviation [dy] of turtles
The measure can consist of any numeric NetLogo expression - what's important is that the measure is correlated with the behavior we would like to elicit from the model. In this case, we are measuring the sum of the variation of turtles headings in both the X and Y directions. When this measure is high, turtles are pointed all over. When this measure is 0, all of the turtles have exactly the same heading. Finding convergence is the same as minimizing this measure of flock heading variation. We will start recording this measure after 75 ticks have passed, and we are only running the model for 100 steps in total, meaning that we will be recording data between tick 75 and tick 100.
Designing the objective function:
Above, we specified how to collect the data we need from the model in order to perform our search. However, there are several more things that must be specified before our search goal is completely defined. We know how to collect the data, but now we need to turn it into an objective function (a.k.a. "fitness function"). For our flocking convergence example, the additional information we must fill in is shown in Figure 4.
Figure 4: Specifying the search goal/task
(the objective/fitness function)
- Goal: Are we trying to minimize or maximize something?
- Collected measure: During one model run, we may have collected the measure multiple times. How can we condense that into one number?
- In this case, MEAN_ACROSS_STEPS will be taking the mean (average) of the 26 measures (taken from tick 75 to tick 100).
- (there are other choices for min/max/median/variance)
- Also, the AT_FINAL_STEP choice is useful if you are only interested in the last measure that was recorded.
- Sampling: How many times should the model be run? Running the model once may not give representative results, so you may want to perform multiple replicate runs (with different initial random seeds), and collect behavioral measures from each of them.
- For this example, we will run the Flocking model 5 times.
- (Technical note: BehaviorSearch currently only supports fixed sampling, though in the future more advanced search techniques might use adaptive sampling methods.)
- Combine replicates: If you are doing multiple replicate runs of the model, how do you combine those results, to get a single number for our objective function?
- For this example, we will be taking the MEAN (average) result of the 5 model runs.
- MEDIAN may be a better choice if your measure occasionally yields ridiculously high or low outlier values, which you'd like to ignore.
- On the other hand, MIN/MAX may be useful choices if you want to search for parameters that cause extreme behavior, while ignoring average behavior.
- The VARIANCE choices may be useful for finding parameters for which there is volatility in whether the model exhibits a behavior or not. Such volatility might indicate a phase transition between two regimes of model behavior. STDEV is the same, except that the fitness function values will be in the same units of the original parameter, which may be preferable for human interpretation.
- Take derivative? Sometimes you would like to find a point in the parameter space where the change in your behavioral measure is maximized (or minimized) with respect to a small change in some parameter.
Such places may indicate a phase transition, critical point, or leverage point.
The Take derivative? option allows you to maximize/minimize the approximate derivative of your fitness function with respect to a specified parameter and a specified delta (change amount).
- If Use ABS value? is checked, then the reported difference is always positive.
- If you choose the special value “@MUTATE@” then finds a neighboring point in the search space using mutation from the parameter settings being evaluated, with the mutation rate specified by delta.
- This Flocking example does not use the derivative functionality, but the Example Fire Derivative example that comes with BehaviorSearch does.
- Evaluation limit: How many total model runs should the search process perform, before stopping?
- BestChecking replicates: the number of additional replicate model runs that should be performed to get an unbiased estimate of the true objective function value, each time the search algorithm finds a new set of parameters that it thinks is "better" than any previous set.
- The motivation for this is that ABMs are usually stochastic, and when sampling a measure a small/finite number of times (such as 5, in our example here), there is likely to still be some "noise" in the objective function. Thus a search algorithm may appear to be making progress, finding better and better parameter settings, when in fact the better results are due to random noise. Using BestChecking replicates can help you identify when this is the case.
- Also, since new "bests" are found relatively infrequently, you can usually afford to specify a higher number of BestChecking replicates than you can for normal sampling, yielding more statistically significant reading of the objective function as the best parameters that the search found.
- BestChecking replicates are not counted against the total "model run" limit for the search. These replicates are extrinsic to the search process, but are included in the output results to evaluate the search performance, and verify the objective function values that are obtained.
Choosing a search algorithm
Once the objective function is fully specified, the final choice is about how the parameter space should be represented and explored. This involves choosing a search encoding representation (for the search space), as well as choosing a search algorithm and setting the necessary parameters for that search algorithm. The choices for the Flocking convergence example are shown in Figure 5.
Poor choices may cause the search to progress more slowly (or totally stall). Thus, negative search results may mean that the behavior you're interested in never occurs in the model, or it may mean that you need to try other search methods. On the other hand, positive results tell you that the behavior does occur, and also what parameter settings can elicit it.
Figure 5: Search algorithm options
and search space representation.
Search Space Encoding Representation
The search space consists of all allowable combinations of settings for the model parameters, as you specified above. BehaviorSearch currently supports four different search space encodings (and its extensible architecture makes it simple to plug in new ones in the future):
- MixedTypeChromosome This encoding most closely matches the way that one commonly thinks of the ABM parameters. Each parameter is stored separately with its own data type (discrete numeric, continuous numeric, categorical, boolean, etc). Mutation applies to each parameter separately (e.g. continous parameters use Gaussian mutation, boolean parameters get flipped).
- StandardBinaryChromosome In this encoding, every parameter is converted into a string of binary digits, and these sequences are concatenated together into one large bit array. Mutation and crossover then occur on a per-bit basis.
- GrayBinaryChromosome Similar to StandardBinaryChromosome, except that numeric values are encoded to binary strings using a Gray code, instead of the standard "high order" bit ordering. Gray codes have generally been found to give better performance for search representations, since numeric values that are close together are more likely to be fewer mutations away from each other.
- RealHypercubeChromosome In this encoding, every parameter (numeric or not) is represented by a "real-valued" continuous variable. This encoding exists mainly to facilitate the (future) use of algorithms that assume a continuous numeric space (such as Particle Swarm Optimization), and allow them to be applied even when some of the model parameters are not numeric.
Search Algorithms and Options
The search algorithm determines what order the different parameter settings will be tried in, in order to find the behavior that you quantified above.
BehaviorSearch currently includes 4 search algorithms (though additional algorithms will be added in future releases). Here is a description of each:
- RandomSearch This is a naive baseline search algorithm, which simply randomly generates one set of parameters after another, computing the objective function for each in turn, and at the end it returns the best settings it found. RandomSearch is unlikely to be the most efficient search choice, but it is very straightforward method, and it performs an unbiased exploration of the search space.
- MutationHillClimber This search algorithm starts with a random location in the search space (i.e. set of parameters), and then repeatedly generates a neighboring location (using a mutation operator) and moves to that new location only if it is better than the old location.
- mutation-rate controls the probability of mutation
- restart-after-stall-count: if the hill climber makes some number (restart-after-stall-count) of unsuccessful attempts to move to a random neighbor, it assumes it is trapped at a local optimum in the space, so it restarts by jumping to a new random location anywhere in the search space.
- SimualtedAnnealing This search algorithm is similar to a hill climbing approach, except that a downhill (inferior) move may also occur, but only with a certain probability based on the temperature of the system, which decreases over time. Simulated annealing is inspired by the physical annealing process in metallurgy: heating followed by the controlled cooling of a material in order to increase crystal size.
- mutation-rate how much mutation occurs when choosing a candidate location for moving.
- restart-after-stall-count: if it doesn't manage to move to a new location after X attempts, reset the temperature, jump to a random location in the search space and try again.
- initial-temperature the system's initial 'temperature' (a reasonable choice would be the average expected difference in the fitness function's value for two random points in the search space)
- temperature-change-factor the system's current 'temperature' is multiplied by this factor (which needs to be less than 1!) after each move. (Using this exponential temperature decay means that temperature will approach 0 over time. Unfortunately, the optimal rate for the temperature to decrease varies between problems.)
- StandardGA This is a simple (classic) Genetic Algorithm, which can operate on any of the available search representations.
- mutation-rate: controls the probability of mutation
- crossover-rate: the probability of using two parents when creating a child (otherwise the child is created asexually)
- population-size: the number of individuals in each generation.
- tournament-size: the number of individuals that compete to be selected for reproduction via tournament selection. Higher values cause more selection pressure which will push the GA population to converge more quickly. Usually 2 or 3 is a good value.
- population-model: 'generational', 'steady-state-replace-random', or 'steady-state-replace-worst'
- generational means the whole population is replaced at once
- steady-state means that only one single individual is replaced by reproduction each iteration. The individual being replaced may be randomly-chosen, or the current worst.
There is also a checkbox labeled Use fitness caching. This controls whether the search algorithm caches (memorizes) the result of the objective (fitness) function every time it gets evaluated, so that it doesn't have to recompute it if the search returns to those exact same parameter settings again. Since running ABM simulations can be time-consuming (especially when dealing with large agent populations for many ticks), turning on "fitness caching" can potentially be a considerable time-saver. However, because ABMs are usually stochastic, each time a point in the space is re-evaluated, the search process would get a new independent estimation of the value at that location. The trade-offs in the interaction between the effects of fitness caching and "noisy" fitness evaluation have not been fully investigated -- at present it is unclear precisely when fitness caching will improve overall search performance.
For this tutorial example on Flocking convergence, we are using fitness caching and a GrayBinaryChromosome search space encoding representation, together with a standard generational GA with a population size of 30, a 5% mutation rate, and a crossover rate of 70%, using tournament selection with tournament size 3. For this model and search task, these choices were found to be effective (and far superior to RandomSearch). However, we have not exhaustively tested combinations of search algorithms, search algorithm parameters, and encodings, so other choices might prove to be more efficient.
Running the search and examining the results
Figure 6: Run options dialog.
After setting all of those options for how to perform the search, we are almost ready to hit the Run BehaviorSearch button, down in the lower right corner of the window. However, first we should save this "search protocol" file that we've been editing, using the File->Save menu item.
Setting Up the Search Experiment:
Clicking Run BehaviorSearch brings up a dialog for choosing some additional options relating to the search running configuration, as shown in Figure 6.
- Output file stem: where to save the output data from the search. Specifically, a number of files will be created, each starting with this same filename "stem" (e.g. "stem.finalBests.csv", "stem.bestHistory", etc). The output file details will be discussed below.
- Number of searches: How many times should the whole search process be repeated? A single search may not find the best parameter; additional searches improve confidence.
- Starting at search ID: This option only affects the ID numbering in the output files. (One day you might run searches numbered 1...5, and the next day run searches 6...10).
- Initial random seed: Random seed for the pseudo-random number generator, to start the first search (additional searches will be seeded with following consecutive numbers). This is useful for reproducing the exact same search later.
- Number of threads: Using multiple threads can significantly speed up the search process (on multi-processor/multi-core computers). (Different numbers of threads should only affect running-time, and not the results obtained).
- Brief Output? BehaviorSearch's default behavior is to create a variety of output data files (discussed below), some of which can be quite large (containig the results of all model runs and all objective function evaluations). Selecting this checkbox suppresses the creation of the two largest output files.
Running the Search:
While the search is running, you will see a progress dialog (similar to Figure 7).
The most significant part of this dialog is the plot showing the searches' progress. On the x-axis is the number of model runs that the search has performed (again, note that this number does not count model runs due to BestChecking). On the y-axis is the current best fitness (objective) function value that the search has discovered so far. If BestChecking replicates are used, the plot shows the independently "re-checked" fitness values. If not, then the plot will show the fitness values that the search algorithm observed.
The progress dialog shows the "best fitness" histories from the searches that have already been completed, with the currently running search shown as a thicker line. Also, a progress bar shows what fraction of the way you are through the currently running search.
Examining the results:
While future versions of BehaviorSearch may include some tools/scripts for visualizing the search results, currently the output data is given in several files, using the straightforward CSV (comma separated value) file format, and the methods of processing or visualizing it are up to you (Excel, R, SciPy/matplotlib, etc). Here is the list of output files that were generated for our Flocking convergence example:
- example_flocking.searchConfig.xml - this XML file isn't really output data -- it just contains all of the settings that you used for running the searches. This is useful to keep with your output data, so that if you come back to it later and can't remember what search options you chose, it will all be stored here. Additionally, since this XML file is in the same format as the .BSEARCH files, you can also load this XML file into BehaviorSearch GUI to examine it, or run further similar searches.
- example_flocking.modelRunHistory.csv - this file contains a row for each time the ABM was run by the search progress, showing the parameter settings that the model was run with, as well as the value of your "measure". It also gives the random seed that the ABM was setup with, so if you like, you can go back and run it in NetLogo with exactly the same settings.
- example_flocking.objectiveFunctionHistory.csv - this file contains a row for each time the objective function is evaluated. (Remember, the objective function may require doing N replicate model runs, and then taking the average, or combining the data from those runs in some other way.)
- example_flocking.bestHistory.csv - this file contains a row for each objective function evaluation that was better than all previous ones.
- example_flocking.finalBests.csv - This file is a short list of only the single best parameter settings found by each of the searches. If you only look at one output file, this is the one.
- example_flocking.finalCheckedBests.csv - This file is similar to the .finalBests.csv file, but rather than listing the best parameter settings found by each search process, it lists the parameter settings that (using the extrinsic BestChecking), were found to have the best independently verified objective function value. Parameter settings listed in this file are the most likely to be the best (although if the objective function is noisy, the objective function values reported here may be an overestimate of the real objective function value, as a result of the bias incurred by selecting the best out of the set).
Command Line Usage
BehaviorSearch can also run searches from the command line, using behaviorsearch_headless.bat (behaviorsearch_headless.sh on Mac/Linux). The "headless" version of BehaviorSearch does not support designing new search experiments -- just running search protocols (*.bsearch) that have already been created using the BehaviorSearch GUI. (Search protocol files can also be created manually -- .bsearch files are simply valid XML files based on the document type definition found in "resources/behaviorsearch.dtd".)
This mode of running BehaviorSearch without a GUI is particularly useful for long batch runs, or running searches on high performance computing clusters.
To see the command line argument options, just run it without any arguments, as shown in Figure 8.
The most basic usage is:behaviorsearch.sh -p myexperiment.bsearch -o myoutput
In this case, BehaviorSearch will run one search, the protocol for which is specified in "myexperiment.bsearch", and the output will be written to "myoutput.xxx.csv", for each of the output files created.
Additional help, feedback, bug reports, etc.
At present, this tutorial is the only documentation available for BehaviorSearch, apart from some context-sensitive help in the software (in the form of mouse-over tooltips and help buttons). However, if you have questions that were not covered in this documentation, please contact us, and we will do our best to respond promptly with answers.
Feedback and bugs:
Both the BehaviorSearch software and the accompanying documentation are in an active state of development. The most recent version of the software can be obtained from www.behaviorsearch.org. If you find errors in this tutorial, or bugs in the software itself, please contact us and report the issue!
Likewise, we welcome general comments and feature requests.