Optimising for Performance
This appendix lists some suggestions on how to get the best possible performance from your
evolutionary Java programs. Much of the advice here applies whether or not you are using the
Watchmaker Framework to develop your evolutionary programs.
As with all optimisations in software development, the golden rule is don't do it unless you have
a demonstrable need for improved performance. Optimisations often introduce complexity and make
code harder to maintain. Before starting on any optimisations, always use a profiler to identify
the bottlenecks in your application. This will pinpoint the areas where optimisations are most
likely to be beneficial. It is pointless to expend effort speeding up a routine that accounts
for only 0.1% of the CPU time.
Optimising the Fitness Evaluator
For most non-trivial evolutionary algorithms, the bulk of the work is the evaluation of
candidate solutions. For this reason the fitness function is often the obvious place to
make improvements. A fitness evaluator should do no more work than is absolutely
necessary on each invocation. If there is some initialisation that is repeated unnecessarily,
consider moving it to the constructor. If similar calculations are performed every time,
consider pre-computing the possible results and using a look-up table. When you consider
that the evaluator may be invoked millions of times in a single run, it is clear that even
small optimisations to the fitness function may add up to substantial reductions in running
time.
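As a minimal sketch of the look-up table idea, the hypothetical evaluator below scores a byte array by counting its set bits. The class name and the scoring rule are assumptions made for illustration, and the bit count merely stands in for a more expensive calculation. The table is built once in the constructor, so each invocation only performs array look-ups. In a Watchmaker program the same logic would sit inside an implementation of the framework's FitnessEvaluator interface, whose getFitness method also receives the population.

```java
/**
 * Hypothetical fitness evaluator that rewards candidates with many set bits.
 * The per-byte results are pre-computed once instead of on every call.
 */
final class BitCountEvaluator {
    // Look-up table: number of set bits for every possible byte value.
    private final int[] onesPerByte = new int[256];

    BitCountEvaluator() {
        // One-off initialisation, moved out of the evaluation method.
        for (int i = 0; i < onesPerByte.length; i++) {
            onesPerByte[i] = Integer.bitCount(i);
        }
    }

    /** Returns the total number of set bits in the candidate. */
    double getFitness(byte[] candidate) {
        int score = 0;
        for (byte b : candidate) {
            score += onesPerByte[b & 0xFF]; // mask to get an index in 0..255
        }
        return score;
    }
}
```

Whether a table actually beats recomputation depends on the calculation being replaced, so measure with a profiler before and after.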
The Caching Fitness Evaluator
In some evolutionary programs individuals can survive from generation to generation unmodified.
The most obvious example of this is elitism. Individuals that are preserved through elitism
will appear unaltered in the next generation and may survive for many generations. Individuals
may also survive without modification if the evolutionary operators in use are probabilistic and
don't always affect every candidate.
If fitness evaluations are expensive, it is wasteful to repeatedly recalculate fitness values
for unaltered individuals. The Watchmaker Framework provides the
org.uncommons.watchmaker.framework.CachingFitnessEvaluator class to
address this problem. It acts as a wrapper for your fitness evaluator and caches the results
of fitness calculations. If the same candidate is evaluated twice, the cached value is returned
the second time, thus avoiding the cost of recalculating the fitness score. The cache uses Java's
weak references to avoid memory leakage (if the candidate does not survive, the associated cache
entry will also be garbage collected).
Caching of fitness scores is provided as an option rather than as the default Watchmaker
Framework behaviour because caching is only valid when fitness evaluations are isolated
and repeatable. An isolated fitness evaluation is one where the result depends only upon
the candidate being evaluated. This is not the case when candidates are evaluated against
the other members of the population. A repeatable fitness evaluation is one that always
returns the same score for the same candidate. Caching should not be used if it is possible
for multiple evaluations of the same candidate to return different scores.
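The caching idea can be sketched without the framework as follows. This is not the CachingFitnessEvaluator source, only an illustration of the technique under stated assumptions: the class name is hypothetical, the wrapped evaluator is modelled as a plain ToDoubleFunction, and a WeakHashMap stands in for whatever weak-reference cache the framework uses internally.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.ToDoubleFunction;

/**
 * Hypothetical wrapper that remembers fitness scores for candidates it has
 * already seen. Only valid for isolated, repeatable evaluations.
 */
final class CachingEvaluatorSketch<T> {
    private final ToDoubleFunction<T> delegate;

    // Weak keys: an entry becomes collectable once its candidate is
    // no longer referenced anywhere else.
    private final Map<T, Double> cache =
        Collections.synchronizedMap(new WeakHashMap<T, Double>());

    CachingEvaluatorSketch(ToDoubleFunction<T> delegate) {
        this.delegate = delegate;
    }

    double getFitness(T candidate) {
        Double cached = cache.get(candidate);
        if (cached == null) {
            // First sighting: do the expensive work and remember the result.
            // Two threads may occasionally both compute the same score;
            // that is harmless when evaluations are repeatable.
            cached = delegate.applyAsDouble(candidate);
            cache.put(candidate, cached);
        }
        return cached;
    }
}
```

WeakHashMap compares keys with equals and hashCode, so this sketch assumes that candidates either implement both consistently or rely on the default identity-based versions.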
Minimising the Search Space
An evolutionary algorithm is a type of non-deterministic search. The algorithm is
searching the space of all possible solutions to find one that is good enough. The
larger the search space, the longer it is likely to take to converge on a
satisfactory solution.
For this reason, anything we can do to constrain the search space, without
handicapping the algorithm, is likely to be beneficial. This includes choosing
an efficient candidate representation and using evolutionary operators that
avoid generating useless or invalid solutions.
A little intelligent design can go a long way.
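As a hedged illustration, suppose (hypothetically) that each candidate is a route that must visit every city exactly once. A mutation that overwrites a random position with a random city would frequently produce invalid routes containing duplicates. The sketch below instead swaps two positions, so every offspring is still a valid ordering and the search never leaves the space of legal routes. The class and method names are assumptions for the example, not part of the framework.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical mutation for candidates that are orderings of a fixed set of items. */
final class SwapMutation {
    private SwapMutation() { }

    /**
     * Returns a copy of the route with two randomly chosen positions swapped.
     * The result always contains exactly the same items as the input.
     */
    static <T> List<T> mutate(List<T> route, Random rng) {
        List<T> copy = new ArrayList<T>(route); // leave the parent untouched
        if (copy.size() > 1) {
            int i = rng.nextInt(copy.size());
            int j = rng.nextInt(copy.size());
            Collections.swap(copy, i, j); // i == j simply leaves the copy unchanged
        }
        return copy;
    }
}
```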
Random Number Generators
The random number generator (RNG) is a core component of any evolutionary simulation. It is
used for selection, for cross-over and for mutation. A slow random number generator can be
a bottleneck. Most programming languages provide a mechanism to generate random numbers.
Unfortunately, few of them are ideal. The Java standard library includes two RNGs,
java.util.Random and java.security.SecureRandom.
The former should be avoided for statistical reasons and the latter for performance reasons.
The Watchmaker Framework comes bundled with three high-quality RNGs provided by the Uncommons
Maths project. Of these, the org.uncommons.maths.random.MersenneTwisterRNG
is the most suitable for the majority of evolutionary programs. Alternatively, you can use any
third party RNG that is a sub-class of java.util.Random.
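To show what plugging in a sub-class of java.util.Random involves, here is a minimal sketch of a generator that overrides the protected next(int) method, from which nextInt, nextDouble and the other public methods draw their bits. The class name is hypothetical and the simple 64-bit xorshift step is used only to keep the example short. It is not one of the Uncommons Maths generators and is not offered as a recommendation on statistical quality.

```java
import java.util.Random;

/**
 * Hypothetical drop-in replacement for java.util.Random based on a
 * 64-bit xorshift step. Not thread-safe, and setSeed() does not reset it.
 */
final class XorShiftRandom extends Random {
    private static final long serialVersionUID = 1L;

    private long state;

    XorShiftRandom(long seed) {
        super(seed);
        // A xorshift generator must never be seeded with zero.
        this.state = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    /** Supplies the random bits used by nextInt(), nextDouble() and friends. */
    @Override
    protected int next(int bits) {
        long x = state;
        x ^= x << 13;
        x ^= x >>> 7;
        x ^= x << 17;
        state = x;
        // Return the requested number of high-order bits.
        return (int) (x >>> (64 - bits));
    }
}
```

Because the class is a Random, it can be passed to any code that expects one, for example `new XorShiftRandom(42L).nextInt(10)`.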
JVM Options
The Java Virtual Machine (JVM) is a complex piece of software. It is designed to run a huge
variety of different programs. As such, its default configuration is not optimised for the
particular needs of evolutionary computation. This section lists some of the JVM options
that you can tweak to try to achieve better performance.
Server VM
The Sun JVM provides two modes of operation, one optimised for client applications (the
default) and one for server applications. The server VM takes marginally longer to start
up but provides substantially better performance for long-running processes and is therefore
a better choice for most evolutionary algorithms. The server VM is enabled using the
-server switch.
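One way to check which virtual machine a program is actually running on is to read the standard java.vm.name and java.vm.version system properties. The helper class below is a hypothetical sketch. The exact text returned depends on the JVM vendor and version, so treat the output as a hint rather than a guarantee.

```java
/** Hypothetical helper that reports which virtual machine is in use. */
final class VmInfo {
    private VmInfo() { }

    /** Returns the VM name and version as reported by the standard system properties. */
    static String describeVm() {
        return System.getProperty("java.vm.name")
            + " " + System.getProperty("java.vm.version");
    }

    public static void main(String[] args) {
        // Run with and without the -server switch to compare the output.
        System.out.println(describeVm());
    }
}
```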
Garbage Collection
Evolutionary algorithms create many short-lived objects. Modern JVMs, with their generational
garbage collectors, are typically well tuned for this usage pattern. However, you may find
that by modifying the settings you are able to improve throughput.
Garbage collectors make a trade-off between overall throughput and pause time. For
evolutionary algorithms we typically want to maximise throughput, even at the expense of
introducing noticeable pauses in the program's execution. What is most important is how soon
the program completes, not how smoothly it runs.
You can get information on what the garbage collector is doing by starting the JVM with the
-verbose:gc switch. If you find that the program is spending a lot of time
collecting garbage, it may be because it is short of memory. If you have sufficient RAM,
increasing the maximum size of the Java heap (using the -Xmx switch) may
improve things.
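The same information can also be read from inside the program through the standard java.lang.management API. The sketch below uses a hypothetical class name. It sums the collection time reported by each garbage collector and reads the maximum heap size, so that a run can log how much of its elapsed time went on collection and confirm the heap limit that is in force.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper for logging garbage collection overhead at the end of a run. */
final class GcReport {
    private GcReport() { }

    /** Approximate total time, in milliseconds, spent in garbage collection so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Maximum heap size, in bytes, that the JVM will attempt to use. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("GC time (ms): " + totalGcMillis());
        System.out.println("Max heap (MB): " + maxHeapBytes() / (1024 * 1024));
    }
}
```

Comparing totalGcMillis() with the total elapsed time of an evolution run gives a rough idea of whether a larger heap is worth trying.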
Alternative JVMs
Sun Microsystems is not the only provider of virtual machines for Java. If your platform is
supported, you may also have the option of using a JVM from BEA, IBM or some other third party.
These virtual machines have different performance characteristics and different garbage
collector implementations. If you have tried everything else and still need something faster,
you may find that a different JVM will perform better. Then again, it may not.