Optimising for Performance This appendix lists some suggestions on how to get the best possible performance from your evolutionary Java programs. Much of the advice here applies whether or not you are using the Watchmaker Framework to develop your evolutionary programs. As with all optimisations in software development, the golden rule is don't do it unless you have a demonstrable need for improved performance. Optimisations often introduce complexity and make code harder to maintain. Before starting on any optimisations, always use a profiler to identify the bottlenecks in your application. This will pinpoint the areas where optimisations are most likely to beneficial. It is pointless to expend effort to try to speed up a routine that accounts for only 0.1% of the CPU time.
Optimising the Fitness Evaluator fitness functionoptimisation of For most non-trivial evolutionary algorithms, the bulk of the work is the evaluation of candidate solutions. For this reason the fitness function is often the obvious place to make improvements. A fitness evaluator should do no more work than is absolutely necessary on each invocation. If there is some initialisation that is repeated unnecessarily, consider moving it to the constructor. If similar calculations are performed every time, consider pre-computing the possible results and using a look-up table. When you consider that the evaluator may be invoked millions of times in a single run, it is clear that even small optimisations to the fitness function may add up to substantial reductions in running time.
The Caching Fitness Evaluator CachingFitnessEvaluator fitness functioncaching elitism In some evolutionary programs individuals can survive from generation to generation unmodified. The most obvious example of this is elitism. Individuals that are preserved through elitism will appear unaltered in the next generation and may survive for many generations. Individuals may also survive without modification if the evolutionary operators in use are probabilistic and don't always affect every candidate. If fitness evaluations are expensive, it is wasteful to repeatedly recalculate fitness values for unaltered individuals. The Watchmaker Framework provides the org.uncommons.watchmaker.framework.CachingFitnessEvaluator class to address this problem. It acts as a wrapper for your fitness evaluator and caches the results of fitness calculations. If the same candidate is evaluated twice, the cached value is returned the second time thus avoiding the cost of recalculating the fitness score. The cache uses Java's weak references to avoid memory leakage (if the candidate does not survive, the associated cache entry will also be garbage collected). Caching of fitness scores is provided as an option rather than as the default Watchmaker Framework behaviour because caching is only valid when fitness evaluations are isolated and repeatable. An isolated fitness evaluation is one where the result depends only upon the candidate being evaluated. This is not the case when candidates are evaluated against the other members of the population. Caching should not be used if it is possible for multiple evaluations of the same candidate to return different scores.
Minimising the Search Space An evolutionary algorithm is a type of non-deterministic search. The algorithm is searching the space of all possible solutions to find one that is good enough. The larger the search space, the longer it is likely to take to converge on a satisfactory solution. For this reason, anything we can do to constrain the search space, without handicapping the algorithm, is likely to be beneficial. This includes choosing an efficient candidate representation and using evolutionary operators that avoid generating useless or invalid solutions. A little intelligent design can go a long way.
Random Number Generators random number generator Random RNG SecureRandom The random number generator (RNG) is a core component of any evolutionary simulation. It is used for selection, for cross-over and for mutation. A slow random number generator can be a bottleneck. Most programming languages provide a mechanism to generate random numbers. Unfortunately, few of them are ideal. The Java standard library includes two RNGs, java.util.Random and java.security.SecureRandom. These should be avoided for statistical and performance reasons respectively. MersenneTwisterRNG The Watchmaker Framework comes bundled with three high-quality RNGs provided by the Uncommons Maths project. Of these, the org.uncommons.maths.random.MersenneTwisterRNG is the most suitable for the majority of evolutionary programs. Alternatively, you can use any third party RNG that is a sub-class of java.util.Random.
JVM Options Java Virtual Machine JVM The Java Virtual Machine (JVM) is a complex piece of software. It is designed to run a huge variety different programs. As such, its default configuration is not optimised for the particular needs of evolutionary computation. This section lists some of the JVM options that you can tweak to try to achieve better performance.
Server VM server VM The Sun JVM provides two modes of operation, one optimised for client applications (the default) and one for server applications. The server VM takes marginally longer to start up but provides substantially better performance for long-running processes and is therefore a better choice for most evolutionary algorithms. The server VM is enabled using the -server switch.
Garbage Collection garbage collection Evolutionary algorithms create many short-lived objects. Modern JVMs, with their generational garbage collectors, are typically well tuned for this usage pattern. However, you may find that by modifying the settings you are able to improve throughput. Garbage collectors make a trade-off between overall throughput and pause time. For evolutionary algorithms we typically want to maximise throughput, even at the expense of introducing noticeable pauses in the program's execution. What is most important is how soon the program completes, not how smoothly it runs. You can get information on what the garbage collector is doing by starting the JVM with the -verbosegc switch. If you find that the program is spending a lot of time collecting garbage, it may be because it is short of memory. If you have sufficient RAM, increasing the maximum size of the Java heap (using the -Xmx switch) may improve things.
Alternative JVMs Sun Microsystems is not the only provider of virtual machines for Java. If your platform is supported, you may also have the option of using a JVM from BEA, IBM or some other third party. These virtual machines have different performance characteristics and different garbage collector implementations. If you have tried everything else and still need something faster, you may find that a different JVM will perform better. Then again, it may not.