Quantcast aiQUANT » Biologically Inspired

Archive for the 'Biologically Inspired' Category

GA and GP: Similarities and differences.

Genetic Algorithm, Genetic Programming 1 Comment »

This paper argues that the Genetic Algorithm and Genetic Programming are pretty much the same - at least in as far as the differentiating features are considered:

GA or GP? That is not the question
Woodward, J.R.
Evolutionary Computation, 2003. CEC apos;03. The 2003 Congress on
Volume 2, Issue , 8-12 Dec. 2003 Page(s): 1056 - 1063 Vol.2

GaGpOrNot.gif

Key points emerging from the paper:

  • The greatest distinction between GAs and GP is that of fixed or variable length. In some cases, the size of the required solution sought may be known beforehand, but in other cases, such as evolution of trading rules, imposing a fixed size could reduce the search space hence neglecting optimal solutions.
  • GAs do not necessarily represent the evolution of algorithms, but rather a subset of all possible algorithms. GP typically does not evolve computer programs as we would expect, but typically logical or mathematical expressions are evolved, or if then type rules of a classifier system.
  • In GAs, the mutation of bit strings takes the form of flipping a single bit. Two main crossover operators have been used; uniform crossover uniformly selects a bit from each position from one of the two parents, one point crossover selects a point at random along the genome of two parents
    and takes the left part of parent one and combines this with the right part parent two. In GP, mutation takes the form of replacing a randomly selected sub tree in a program with a randomly generated sub tree. Crossover swaps randomly selected sub trees of two programs.

Essentially the paper states that the differences are unimportant because whether we are considering GAs or GP, whatever the high level representation is, it will ultimately be stored in the computer’s memory as a bit string. I think it’s a fair enough argument but really does not really our day to day understanding and application of the of the two algorithm variants. Because even more “artificial” forms of the algorithm such as Grammatical Evolution (GE) use bit string at lower levels, but this is irrelevant to the problem solver.

Trees in GP

It might be worth hinting at tree representation in GP here.  I have shown two examples of tree representations with their equivalent expression:

gpTree1.png

gpTree2.png

You might want to try drawing trees for the following two expressions. Note that there is a comma at the end of the first line of each expression:

1.
times(times(times(X1,plus(cos(cos(X1)),X1)), sin(plus(cos(X1),sin(plus(cos(X1),X1))))),plus(cos(minus(X1,X1)),X1))

2.
nor(nor(nor(or(X2,X2),or(X2,X3)),and(nand(X2,X3),X1)), and(nand(and(X1,X1),X3),and(nand(X2,X1),or(X3,X1))))

You can find the tree representations for 1 here and for 2 here. 

I shall be using such representations in future posts so this post serves as a reference for any unfamiliarities with tree representations/expressions that may emerge.

Statistical Arbitrage and Genetic Programming

Genetic Programming, Statistical Arbitrage 1 Comment »

The main idea in this presentation is that of co-evolving 2-branch type trees, where one branch represents buy rules and the other represents sell rules for trading. The intuition is that when the branches are evolved together, your final genetic program ends up with buy and sell rules that are duals of each other. Hence an optimum buy rule can be paired with an optimum sell rule - which forms a dual. This is unlike the approach where a sell rule is triggered only if certain criteria for a buy are not met (which may not necessarily be optimum condition for the sell). Although the results in the presentation are somewhat ambiguous, the author points out that it is possible to discover profitable trading rules in the presence of transaction costs under a statistical arbitrage framework. The author also cites a few papers which have applied GP techniques to Statistical Arbitrage. I shall dig up these papers to investigate the approach taken.

Immune System video

Artificial Immune Systems, Eye Candy No Comments »

I found this excellent video explaining very casually how the Immune System works. I thought it would be a great addition to one of my previous posts “Financial applications of Artificial Immune Systems“. I’ve copied it below for easy view.

Neural Net sensitivity to Lag

Neural Networks 10 Comments »

Smoothing data prior to training a neural net is important because it provides a way of filtering unwanted noise that may be present in the inputs. In addition to removing noise smoothing inevitably introduces time lag which results in degradation of any correlation the input may have with the target variable. If we over smooth then there is a possibility that the correlation deminishes so much that the neural net is unable to learn important relationships between the input and target.

Let’s consider the following example:

We want to predict the next value of a noisy sine wave. For this we shall use a 2:3:1 neural net

nnTrylr.png

Our noisy input data consists of 400 samples with a further 100 samples used for testing the neural net’s prediction.

sinNoisy.png

Experiment 1:

We first investigate the predictive quality with both input1 and input2 set to 15-bar moving averages of the training data.

1515.png

1515corr.png

MSE = 0.2342

Experiment 2:

We now investigate with input1 set to 5-bar and input2 set to 15-bar moving averages of the training data.

515.png

corr515.png

MSE = 0.1335

Key Points:

  • Excessive filtering of inputs in experiment 1 resulted in noise removal as well as degradation of correlation which shows up in the quality of forecasted output.
  • In experiment 2 we changed one of the inputs to a 5 bar MA and left the other at 15 bar MA. Because the 5-bar MA input suffered less correlation degradation the forecast quality was much better than that of experiment 1.
  • Essentially what the neural net does during training is to identify correlations between the inputs and the target data. If for instance we have reason to believe that the target is influenced by an N-bar moving average then supplying an N-bar smoothed input will do no harm. But if our purpose is solely to filter noise in the input then we have to be aware about change in correlation and it’s consequence.

Financial applications of Artificial Immune Systems

Artificial Immune Systems 9 Comments »

Isystems.pngAlgorithms inspired by biological immune systems also find their application in Finance. Although not as popular as the Neural Net or GA and their derivatives, recent research has shown that these algorithms can provide optimisation and classification capabilities comparable to conventional methods [Gonzalez, Dasgupta]. The aim in designing artificial immune systems (AIS) is not to produce exact models of the natural immune system, but rather extract ideas and metaphors that can help solve real-world problems. There are two main categories of the Artificial Immune system namely Negative Selection and Clonal Selection which lend their inspiration to Classification and Optimisation respectively.

immuApplic.png

Background

The natural immune system is comprised of a network of specialised cells and chemical molecules which work together to defend the host organism from foreign objects (Pathogens) such as bacteria, fungi, chemical compound etc. The three main capabilities of the Natural Immune system are:

  1. Recognise a foreign body (pathogens) such as bacteria, fungi, chemical compound etc.
  2. Destroy the pathogen.
  3. Remember an almost unlimited number of pathogens.

ImmuneImage.jpg

There are two types of cells comprising the immune system: phagocytes and lymphocytes. The first group belongs to the innate immune system while the latter group mediates adaptive immunity. A major component of the population of lymphocytes is made up of B and T cells, whereas that of phagocytes comprises of B cells only. These cells are capable of recognising and responding to certain antigen (foreign body) patterns present on the surface of pathogens. The picture above tries to illustrate this.

Although the real workings of the immune system is complicated, the immunity process can be summarised as follows:

  1. Antigen-secreting pathogen enters the body
  2. B cells are activated by the foreign antigen
  3. With the help of T cells, B cells undergo cloning and mutation
  4. Plasma B cells secrete immunoglobulins which attach to the antigen
  5. Marked antigens are attacked by the immune system
  6. Memory of the antigen is maintained by B memory cells

The key point is that immune systems demonstrate self-organisation and adaptive behavior. From a modelling perspective, it can be considered as a sophisticated information processing system which possesses powerful pattern recognition and classification abilities. It also has the capability to adapt to new circumstances (problems), and can remember solutions to problems it has previously encountered. In designing artificial immune algorithms we wish to draw metamorphical inspiration from the workings of the natural immune system to design algorithms to solve computational problems. The negative selection algorithm and clonal expansion and selection algorithm are examples of applications of such metaphors.

Negative Selection Algorithm

The basis of the negative selection algorithm is the ability of the immune system to distinguish between two system states i.e. normal and abnormal. [Forrest] developed the negative selection algorithm and has been modified since by a number of researchers. The negative selection algorithm can be summarised as:

  1. Detector set D is empty
  2. Repeat
  3. Create a random vector x drawn from the valid range
  4. For every self element s in the self vector
  5. Calculate the euclidean distance between s and x
  6. If distance < threshold go to step 2
  7. Else add x (a valid non-self detector) to set D.
  8. Until set D contains the required number of detectors (assume N)

The flowchart below shows the above steps

detCreaTrain.PNG

To understand this let us assume we would like to classify corporate bonds so as to differentiate investment grade from junk grade. Assume that the diagnosis will be made on the basis of ratio information drawn from the financial statements of the companies and that we also have a data-set containing example ratios for financially healthy companies and also for companies which later defaulted on their bonds.

Self is defined as financially healthy companies. Next a set of detectors (of size D) is randomly created. Each of these detectors consists of a vector of real numbers, corresponding to financial data obtained from the financial statements (such as accounting ratios). The negative selection process is then applied, whereby a series of vectors of ratios corresponding to healthy companies is presented to the detectors. Any detector which is identical or similar (the degree of similarity could be measured using the euclidean distance) to a data vector corresponding to a healthy company is discarded. As detectors are discarded, new detectors are created to to build up the size of the population D, which are subject to the same negative selection process. If a detector is created which fails to match any vector in the training set of healthy companies, it is a potentially useful detector of an unhealthy company. This process is repeated until all the detectors in the out-of-sample set is exhausted.

NIBRcomp2.png

The main challenges in applying this algorithm are:

  • It ignores the fact that examples of past failing corporate bonds exist
  • The task of generating a population of valid detectors will grow rapidly as the size of self increases.

These issues depend on on whether a good seed is used in generating the random number which depend on historical values of the accounting ratios. A good seed inevitably leads to early convergence which improves classification at steady state.

Clonal Expansion and Selection Algorithm

Unlike the negative selection algorithm, the clonal selection algorithm aims to mimic the creation of a large number of antibodies which will bind strongly to a specific antigen. This metaphor is adopted to design an optimisation algorithm in which the antibody can be considered as a potential solution, where the degree of the binding or fit between the antibody and the antigen represents the quality of the solution. The objective, therefore, is to start from an initial population of solutions, rest them against the antigens, using the algorithm iteratively. The steps in the clonal expansion and selection algorithm can be summarised as:

  1. Create an initial random population P of solution vectors (antibodies)
  2. Select a subset F of the solutions from P, biasing the selection process towards better solutions
  3. For each member of F (parents), create a set of clones, with better members of F producing more clones
  4. Mutate each of these clones, in inverse proportion to the parent’s fitness. Better solutions are mutated less (hypermutation)
  5. Select a subset of newly generated solutions S
  6. Create a number of newly created random solutions R
  7. Replace poorer members of P with better solutions from S and R
  8. Repeat steps 2 - 7 until termination condition is reached

The key difference between the clonal selection algorithm and other evolutionary algorithms is that this method in which variety of candidate solutions is generated when seeking to iteratively improve solutions.

Analogy.png

As an example I apply the clonal selection algorithm to finding the maximum value of a two dimensional multi-modal function. The multi-modal function could for instance describe the fitness of an asset allocation problem. I set my parameters as follows:

  • Number of iterations: 25
  • Number of antibodies: 100
  • Number of clones per antibody: 10
  • Clone Mutation rate: 0.010

Our initial population of 2-dimensional solution antibodies occupies different locations of the function:

initiPopuCS.png

After 25 iterations we see the clonal algorithm has managed to converge to the maximum points on the function

finalPopuCS.png

We have shown that the clonal expansion and selection can be used to optimise a 2-dimensional multi-modal function. This example can be extended to problems of n-dimension such as optimising trading rules or tactically allocating a portfolio of stocks to meet certain risk /return requirements. We will hopefully discuss such problems in due course. I hope this post gives the reader a rough idea of how Artificial Immune Systems can be applied to finance. Off-course complete understanding of these algorithms comes from the workings of the actual biological process from which the metaphors are drawn, which the reader may be keen to explore further before attempting to apply these algorithms.

    Forecasting via Genetic Algorithm: Part 6

    Genetic Algorithm No Comments »

    Part 6: Simulation Results and wrap-up

    A fitness landscape is a 3-D (or sometimes 4-D) plot which shows how the fitness value varies with respect to other variables that it depends on. To obtain the fitness landscape for our problem I altered the population size from 2 to 20 for mutation probabilities within the range 0 to 1. I then ran the GA for 25 generations and recorded the average fitness of the entire population so as to ascertain optimum mutation rate and population size to use. Alternatively we could have recorded the fitness of the fittest organism within the population, but this would not have given us an indication of the overall quality of solutions evolved by the algorithm. Hence the reason for taking the overall population fitness into account. The plot below shows how the average fitness changes for different population size and mutation rate.

    fLandSurf.png

    We can also view the 3-D plot in the form of a contour plot:

    fLandContour.png

    We are interested in the dark regions because the value of the fitness function in these locations is closest to zero. It is important to point out that ideally the fittest organism in our problem should have a fitness of 0 as this represents that there in no difference between the predicted value and the actual value. We see that a mutation rate of alteast 0.5 and population size greater than 9 will yield fittest solutions within the first 25 generations.

    mRate2.pngUsing the minimum mutation rate and population size (0.5 and 10 respectively) we run the GA for mating rates varying from 0.1 to 1 for 25 generations. This time we are interested in seeing how the fitness value depends on the mating rate The graph shows that for mating rates upto about 0.7, the algorithm produces fit organisms. For rates above 0.8 we see poor solutions as the overall fitness rapidly increases. The explanation for this is that organisms that high mating rates almost always results in child genes being dominated by that of the parents which gives little room for population diversity. Hence the population remains generally unfit for many generations into the future.

    Hence for our experiment we use the following parameters:-

    Population size: 10
    Mutation rate: 0.5
    Mating rate: 0.5

    We let the GA run for 500 generations and get the fittest genome which we use in our prediction function. Recall that our prediction function is:

    eq1.PNG

    The paremeters of the fittest genome evolved by the GA after 500 generations are:

    \alpha=0.1,\hspace{1}\beta=-0.2,\hspace{1}\gamma=0.3,\hspace{1}\delta=-0.2,\hspace{1}\theta=0.1,\hspace{1}\tau=-0.1,\hspace{1}\mu=6.6

    a=25,\hspace{1}b=22,\hspace{1}c=7,\hspace{1}d=11,\hspace{1}e=55,\hspace{1}f=47,\hspace{1}

    We can now apply our evolved prediction function to the 3 data sets of the FTSE 100 time series. However, to give more meaning to our results we can compare with the Random Walk model of an asset price, which takes the form:

    P_{t+1}=P_{t}+\mu+Z_{t}

    where \small Z_{t} is a normally distributed random variable and \small\mu is drift.

    Comparative results are summarised below:

    results.png

    Data set 2 is the true out-of sample range with we intended to forecast via our prediction function. Clearly it under-performs relative to the random walk model suggesting that our prediction function provides a poor approximation of the FTSE 100 index. Note that it is not the GA which is at fault here, but the prediction function.

    Key Points:

    • We have shown that the GA can evolve optimum parameters for a pre-defined prediction function
    • The performance of the GA is totally independent to the forecasting ability of the prediction function. The GA will almost always find the best solutions for the prediction function, which might be a poor approximator of the asset price as in our case.
    • GAs are very powerful tools for global search and optimisation. However, there is a trade-off between search coverage and simulation time. Analysing fitness landscapes based on a small number of evolved generations can help estimate suitable population size as well as mutation rate and mating rate hence reducing simulation time for larger generations.

    Possible improvements:

    • Let the algorithm search for its own prediction function. This leads to Genetic Programming which helps generate very sophisticated prediction functions.
    • Use a range of other fundamental and technical indicators as inputs to the prediction function.
    • Apply filtering so as to reduce noise over-fitting.

    Forecasting via Genetic Algorithm: Part 5

    Evolutionary Computation, Genetic Algorithm, Matlab No Comments »

    Part 5: Programming Issues

    I wrote code for the Genetic Algorithm in MATLAB and it is terribly slow.  A run for 50 Generations averages to about 1 second per generation.  Here are some performance measures obtained using Matlab Profiler:

    prof1.png

    The FitnessFunction function is the biggest culprit and takes almost 100% of the total simulation time.  Evaluation of the fitness function is the most time consuming part in the Genetic Algorithm.  This is generally the case even with simpler fitness functions because on a relative scale, operators such as crossover and mutation require very little algebraic manipulation.  Looking at the function itself we get more detail:

    prof2.png

    We find that evaluations for alpha and beta take the most time.  I shall attempt to speed up the algorithm by writing a MEX function.  In doing so we expect the simulation time to drop by many orders of magnitude as compiled MEX code would not need to run under Matlab’s interpreted mode.

    Forecasting via Genetic Algorithm: Part 4

    Forecasting, Genetic Algorithm, Model Development No Comments »

    Part 4: Roulette Selection

    So we have initialised our population of chromosomes each of which is made up of 13 genes.  We can represent our population in the from of a matrix where individual chromosomes are represented vertically and it’s genes horizontally.  We have something like:

    matrixFitn.png

    Each member of the population is assigned a fitness value based on the fitness function, which in our case is defined as

    eq2.PNG

     

    Why Roulette Selection?

    Roulette wheel sampling was first suggested by Goldberg and provides a method of global selection when applied to members of a population.  In order for genes of fit chromosomes to be passed on to future generations, there must be some form of selection method to decide which members of the population will be selected to breed and which ones will be replaced.  It seems sensible to breed from genes belonging to the fittest chromosomes and replace the least fit ones, but this has a draw back: the GA can converge to local minimum as the search space is restricted to the genes covered by the fittest chromosomes only.  One of the mechanisms that enable genes of weaker organism to be passed on while at the same time favouring fitter chromosomes is Roulette Selection.  To understand this consider the following example:

    Assume we have a population of 4 chromosomes each of which has their fitness value evaluated. To construct a Roulette wheel for this population we calculate the Roulette probability of each chromosome as a ratio of its fitness to the fitness of the entire population. That is
    RouletteProbability=\frac{chromosomeFitness}{populationFitness}
    The illustration below shows the Roulette wheel for our hypothetical 4-chromosome population:

    rouWeeEg.png

    Extending the method to the problem at hand, we have:

    rouWee.png

     Note that our Roulette wheel represents a population size of 10.  Chromosome 2 for instance has a higher probability of being picked compared to to Chromosome 6.  This off-course doesn’t mean that Chromosome 2 will always be picked.  It all depends on where the Roulette wheel “stops”.  Also note that different population size would have a different representation to its Roulette wheel because every chromosome is allocated a section corresponding to its fitness value.

    Main Points:

    • Roulette Selection is an efficient way of allowing excellent genes embedded in a poor performing chromosome to be passed on to future generations.
    • Roulette Selection enables convergence to global minimum.
    • Think of this method as chromosomes represented on a Roulette wheel which is “spun” whenever a parent needs to be selected for crossover/breeding.

    Forecasting via Genetic Algorithm: Part 3

    Forecasting, Genetic Algorithm, Model Development No Comments »

    Part 3: Constraining the Search Space

     We noted previously that \small\displaystyle\alpha,\beta,\gamma,\delta,\theta,\tau,\mu,a,b,c,d,e,f are integer values which will be evolved by the algorithm.  If we are to assuming that the range of possible values for each one of these parameters range from -infinity to +infinity then our algorithm will take an eternity to evaluate each and every permutation.  In order to improve convergence to an optimum solution it is useful to specify a range of values that encompasses optimum parameter values.  For instance \small\displaystyle\\a,b,c,d,e,f can only have a minimum value of 2 and a maximum value less than or equal to 800.  But clearly numbers closer to 800 seem way off because it is very unlikely that patterns will persist that far back in time given the nature of our prediction function.  Parameters \small\displaystyle\alpha,\beta,\gamma,\delta,\theta,\tau,\mu will be decimal numbers with a resolution of one decimal place and will fall over a wider range.

    • \small\displaystyle\\a,b,c,d,e,f  fall in the integer range 2 to 65  (We will refer to these as Integer Genes)
    • \small\displaystyle\alpha,\beta,\gamma,\delta,\theta,\tau,\mu fall in the decimal range -1000 to 1000  (We will refer to these as Decimal Genes)

    \large\displaystyle\overbrace{\alpha,\beta,\gamma,\delta,\theta,\tau,\mu}^{\text{Decimal\hspace{1}Genes}},\large\underbrace{a,b,c,d,e,f}_{\text{Integer\hspace{1}Genes}}

    Other constants shall be set to

    • Population Size = 20
    • Mutation Probability = 0.7
    • Mate Probability = 0.8

    We don’t need to limit the algorithm to a fixed set of constants.  In-fact we will see later on the effect of altering each of these constants with respect to other parameters and constants.

     1.  Initialising the Population

    We know that there are 13 genes in total that form a chromosome and we also know the boundary values individual genes may take.  What is the best way to create the required population of 20 chromosomes in-order to get our algorithm started?  We shall do so by selecting random numbers and scaling them to the permitted range

    \large\displaystyle\overbrace{\alpha,\beta,\gamma,\delta,\theta,\tau,\mu}^{\text{\small\\initialisation=[1000-(-1000)]*randomNumber+(-1000)}},\large\underbrace{a,b,c,d,e,f}_{\text{\small\\initialisation= round([65-(2)]*randomNumber+(2))}} 

    Note that \small\displaystyle\\randomNumber is a decimal number between 0 and one.  \small\displaystyle\\round() means that the number is rounded to the nearest integer value since the corresponding parameters are Integer Genes.

    Essentially what we are saying is that Integer genes will be random numbers from the range 2 to 65 and Decimal genes will be random numbers from the range -1000 to 1000.

    Forecasting via Genetic Algorithm: Part 2

    Forecasting, Genetic Algorithm, Model Development No Comments »

    Part 2:  Algorithm Overview

     The Genetic Algorithm is a subset of Evolutionary Algorithms that model biological processes to optimise highly non-linear cost functions.  They require multiple initialisation points on the cost surface hence giving the algorithm a global outlook towards the minimisation or maximisation task.  A few other advantages of these algorithms over traditional local search algorithms include

    • Don’t need to compute the derivative of the function
    • They can escape local minima
    • They can provide a list of suitable solutions and not just one
    • Simultaneous searching due to wide sampling of the cost surface
    • etc etc

      The figure below shows the main steps in the Genetic Algorithm.

    genAlgo.png

    1.  Set Constants

    Constants are the parameters that drive the rate and quality of evolution in the Genetic Algorithm.  In out example we have three constants namely

    1. Population Size
    2. Mutation Probability
    3. Mating Probability

    There is no fixed rule on how to choose the best values for these constants since they can differ from problem to problem.  The Population Size is the number of organisms or sampling points on the cost surface.  The fittest organisms of this population, which is calculated via a fitness function are allowed to breed via crossover with the premise that they will produce fitter organisms.  Mutation adds random diversity to future generations of the population.  Mating Probability ensures certain genes during crossover are unchanged hence adding more diversity.

    2.  Initialise Population

    In our example the population is composed of different samplings of the parameters we want to evolve.  A single member organism or chromosome is shown below: 

    chromo.PNG

    A gene refers to an individual entity or parameter belonging to the the prediction function.  Recall that our prediction function is:

    eq1.PNG

     Note that the words organism and chromosome are used interchangeably.

    3.  Evaluate Fitness

    The fitness function determines how fit individual members of the population are with regards to the prediction function.  In our case we shall use a fitness function that compares the predicted price movement with the actial price movement, averaging the absolute value of the difference over a time period of n days.   

    eq2.PNG

    I shall explain later the variable n.

    4.  Sort by Fitness

    Each member of the poplation is assigned a fitness value.  If the fitness is above a pre-defined threshold the algorithm is terminates.  Otherwise the algorithm proceeds to inter-breed the population so as to produce fitter organisms.

    5.  Select Parents

    There are many ways of selecting parents, which for instance can be based on their fitness ranking within the population.  However the actual selection mechanism may be random such as Roulette Selection.  We will use Roulette Selection in our example and I shall talk more about this later on.

    6.  Breed

    The genes of the selected parents are crossed over with the aim of producing a fitter organism/chromosome.  The Mating Probability ensures that certain genes are unaltered, hence adding diversification to the resulting generation.

    7.  Mutate

    The downside of cross over during breeding is that values are exchanged between parents but no new ones are introduced.  Mutation allows new genes to show up, hence adding even more diversification to the population.   We are emphasising on diversification because this prevents the algorithm from getting stuck at a local minimum as different members of the population represent a wider range of sampling points on the cost surface.

    8.  Remove Twins

    The existance of two organisms with identical genomes has limited utility in a Genetic Algorithm.  In our case, we search for two similar organisms and replace the genes of one of the organism with random values.

    Close
    E-mail It