aiQUANT » Financial Mathematics

Archive for the 'Financial Mathematics' Category

Signal processing with heavy-tailed distributions

Fat Tails, Financial Mathematics, Signal Processing

While reading about hard-to-forecast events, I decided to investigate what the Signal Processing domain has to offer in terms of models that account for “fat-tailed” distributions. Thankfully, I came across a set of papers that specifically address the nature of such distributions in their modeling approach, which I think is insightful and worth a look despite being off-topic for quant work.

The papers were published together in a special journal issue under the theme “Signal processing with heavy-tailed distributions” and can be found here. I have uploaded the preface to the issue, which gives a good overview of applications. You can download it from here.

Using Simulink to implement and test models

Hilbert Transform, Matlab, Signal Processing, Transform Algebra

In addition to Matlab, The Mathworks offer a product called Simulink, which is a design platform for implementing and testing model-based systems. Most of my experience with Simulink is in modelling Control Systems and DSP architectures. However, as the Mathworks have pointed out in a recent webinar, Simulink can be extended to develop and test finance-based models such as a trading system, or sub-systems that form part of a larger model.

I decided to implement a Hilbert Transformer model for price data using Simulink, which is shown below:

HilbertDiagram.jpg

The model has an input “simin”, which takes the price data from the Matlab workspace as discrete points one at a time. It applies unit delays (represented by “1/Z”) and gains, a gain being simply multiplication by the number shown. The individual real and imaginary parts are combined to create complex numbers, which are then output to the workspace.

Simply put, the Hilbert Transform converts price data into complex number form so that one may go about calculating different measures that are not possible to calculate using just the price itself. My previous post mentions measures that are calculated using the Hilbert transformed price. The relationship between the actual price and the Hilbert transformed price is

$\text{Actual Price} = \sqrt{a^2 + b^2}$ for a Hilbert price $a + ib$.

As one would expect, the model implemented above suffers from lag. This is confirmed by comparing the actual price with the price reconstructed from the Hilbert complex numbers:

HilbertSimulinkModel.png

The model does not induce any loss or gain in magnitude, but there is a lag of 4 bars. This implies that the Hilbert price for the current bar actually corresponds to the price 4 bars ago, something we should consider when using Hilbert prices to deduce other measures.
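Where the lag comes from can be illustrated outside Simulink. The following is a minimal Python sketch and not the model in the diagram: it assumes a 9-tap truncated ideal Hilbert transformer, because the tap count and gains of the Simulink model are not stated in this post, and the names `hilbert_fir_taps` and `causal_hilbert` are hypothetical. A causal FIR filter with 2M + 1 symmetric-length taps delays its output by M bars, so 9 taps give a 4-bar delay.

```python
import math

def hilbert_fir_taps(half_len=4):
    """Taps h[k], k = -half_len..half_len, of a truncated ideal Hilbert transformer."""
    # Ideal impulse response: 2/(pi*k) for odd k, zero for even k (including k = 0).
    return [2.0 / (math.pi * k) if k % 2 else 0.0
            for k in range(-half_len, half_len + 1)]

def causal_hilbert(prices, half_len=4):
    """Return (in_phase, quadrature); both refer to the bar half_len bars back."""
    taps = hilbert_fir_taps(half_len)
    in_phase, quadrature = [], []
    for n in range(len(taps) - 1, len(prices)):
        # Causal convolution: only the current and past bars are used.
        q = sum(taps[j] * prices[n - j] for j in range(len(taps)))
        quadrature.append(q)
        in_phase.append(prices[n - half_len])  # delay the price to match the filter
    return in_phase, quadrature

# Synthetic 20-bar cycle standing in for price data.
w = 2 * math.pi / 20
prices = [math.cos(w * n) for n in range(60)]
in_phase, quadrature = causal_hilbert(prices)
```

Unlike the model described above, a filter this short does change the magnitude of the quadrature component, so treat the sketch only as a demonstration of the delay.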

Other Links:

  1. The Wikipedia page covers the basics of the Hilbert transform.
  2. The Mathworks have a webinar which has a section on using Simulink for designing and testing Algorithmic Trading models. You can view this webinar here. The section on Simulink begins about 40 minutes into the presentation.
  3. A Simulink tutorial can be found here.
  4. There are a number of webinars and example models at the Mathworks website for one to get started using Simulink.

Transform Algebra in Mathematical Finance

Transform Algebra

I have hinted at Z-transforms and Hilbert transforms and how they are applied to financial time series. But generally speaking, there are a number of other mathematical transforms which find their use in financial modeling problems.

Mathematical Transforms are important because they help convert functions into other domains thereby making them easier to understand and solve.  Here I give a qualitative description of common transforms and some examples of how they are applied to problems in finance.

traformTable2.png

Obviously the choice for invoking a particular transform depends on the problem being solved.

1. Fisher Transform

The Fisher Transform converts any data set into a modified data set whose probability density function is approximately Gaussian. An immediate benefit is that one can then analyze the transformed data set in terms of its deviation from the mean, something which might not have been possible prior to the transformation. Consider for instance absolute prices on a bar chart. Are they normally distributed (i.e. bell shaped)? No. The returns density of the time series is usually near bell shaped, but the actual prices are nowhere close to bell shaped. This means attaching Gaussian confidence intervals to absolute price data is not meaningful, whereas Fisher transformed data are approximately bell shaped, allowing the majority of Gaussian statistical functions to be applied. The benefit is seen in modified technical indicators that are similar to, but more responsive than, conventional oscillators such as the Commodity Channel Index (CCI) or Moving Average Convergence Divergence (MACD).
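As a minimal sketch of the idea, the Fisher transform of a value x in the open interval (-1, 1) is y = 0.5 ln((1 + x) / (1 - x)). The min-max scaling step, the clip value of 0.999 and the sample prices below are assumptions made for illustration; they are not taken from this post.

```python
import math

def fisher_transform(x):
    """Fisher transform of a value already scaled into the open interval (-1, 1)."""
    return 0.5 * math.log((1.0 + x) / (1.0 - x))

def normalise(prices, clip=0.999):
    """Min-max scale prices to [-clip, clip] so the logarithm stays finite."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        return [0.0 for _ in prices]
    return [clip * (2.0 * (p - lo) / (hi - lo) - 1.0) for p in prices]

# Made-up prices, used only to exercise the functions.
prices = [101.0, 102.5, 104.0, 103.0, 105.5, 107.0, 106.0]
fisher = [fisher_transform(x) for x in normalise(prices)]
```

The transform stretches values near the edges of the range far more than values near the middle, which is what produces the sharper turning points in the transformed indicator.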

2. Fourier Transform

The Fourier Transform enables conversion of functions or data sets from the time domain to the frequency domain and vice versa. In Signal Processing this is essentially what describes the relationship between the time domain and the frequency domain. In terms of option pricing, the Fourier Transform provides a framework for fast price calculation compared to Monte Carlo methods. As shown in [Szymon et al.], the Fast Fourier Transform (FFT) algorithm is about 3000 times faster than Monte Carlo simulation.

fftSimTime.png

It is worth noting that FFTs are inappropriate for direct time series analysis because they deliver poor resolution in terms of cycle length. They are only capable of recording an integer number of cycles across the data window, thereby missing any cycle whose length falls in between two integer boundaries. A good algorithm which performs better than the FFT for measuring cycle period in financial time series is the Pisarenko harmonic decomposition.
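The resolution problem is easy to see by listing the cycle lengths a transform of a given window can represent. This is a small Python sketch with a hypothetical helper name, `dft_cycle_periods`, and an arbitrary 64-bar window.

```python
def dft_cycle_periods(n_bars):
    """Cycle lengths (in bars) that an n_bars-point DFT can represent exactly."""
    # Bin k holds k whole cycles across the window, i.e. a period of n_bars / k bars.
    return [n_bars / k for k in range(1, n_bars // 2 + 1)]

periods = dft_cycle_periods(64)
print(periods[:5])
```

For a 64-bar window the longest representable periods are 64, 32, about 21.3 and 16 bars, so a 25-bar cycle, for example, falls between two bins.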

3. Hilbert Transform

The Hilbert Transform is a procedure to create complex signals from the real price data that is plotted on the bar chart. With complex signals available one can compute more accurate and responsive indicators as well as create other indicators which cannot be calculated without the Hilbert Transform. Computations such as Signal-to-Noise ratio, Power Spectral Density and Cycle Period measurement can only be achieved by calculating the Hilbert Transform.

4. Laplace Transform

Laplace Transforms are useful for solving transient state systems that are described by differential equations. The transform helps simplify complex differential equations; notably in the case of partial differential equations used in option pricing formulas. Simplified equations in the Laplace domain are solved algebraically before the inverse Laplace transform is applied to return the solution to the original domain.

5. Wavelet Transform

The Wavelet Transform is similar to the Fourier Transform in that it converts time domain data into frequency domain representations. The main difference is that wavelets provide a number of ways of doing the conversion through a series of wavelet families. Each wavelet family has its own properties, which are exploited for more specific decomposition tasks such as feature extraction, noise removal or cycle period estimation of time series. Decomposed time series can be analyzed in several ways to provide more detail on their intrinsic properties.

6. Z-Transform

The Z Transform converts discrete time-domain data points into a complex frequency-domain representation. Unlike the Laplace and Fourier Transforms which are applicable to differential equations, the Z Transform provides an excellent framework for working with difference equations in the frequency domain. One can go about describing filters in terms of their transfer functions and apply a host of algebraic techniques found in Digital Signal Processing to design better filters for time series analysis.
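As a minimal sketch of what working in the Z domain looks like, take the exponential moving average, whose difference equation y[n] = alpha*x[n] + (1 - alpha)*y[n-1] has the transfer function H(z) = alpha / (1 - (1 - alpha) z^-1). Evaluating H on the unit circle gives the gain and lag at any cycle period. The EMA is chosen here only as a familiar example, and `ema_response` is a hypothetical helper name.

```python
import cmath
import math

def ema_response(alpha, period_bars):
    """Gain and lag (bars) of y[n] = alpha*x[n] + (1-alpha)*y[n-1] at one cycle period."""
    w = 2.0 * math.pi / period_bars             # angular frequency in radians per bar
    z_inv = cmath.exp(-1j * w)                  # z^-1 evaluated on the unit circle
    h = alpha / (1.0 - (1.0 - alpha) * z_inv)   # transfer function H(z)
    gain = abs(h)
    lag_bars = -cmath.phase(h) / w              # phase delay converted to bars
    return gain, lag_bars

for period in (10, 20, 40):
    gain, lag = ema_response(alpha=0.2, period_bars=period)
    print(f"period={period:>2} bars  gain={gain:.3f}  lag={lag:.2f} bars")
```

The same two numbers, gain and lag per cycle period, can be computed for any indicator that can be written as a difference equation.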

Summary

  • Transforms are performed to make problems easier to understand and solve.
  • Any Transform has an accompanying Inverse Transform algorithm so that a dual relationship is maintained between two domains.
  • The Z, Wavelet, Fourier and Laplace transforms are related in that they maintain a relationship between the time domain and the frequency domain.
  • Laplace Transforms are useful for analyzing transient systems, whereas Fourier Transforms are more suitable for steady state systems.
  • aiQUANT thinks technical indicators and oscillators should be designed in the Z domain. This gives greater understanding of the frequency characteristics - something which is not obvious with just the time domain representation of the indicator.

Rescaled Range Analysis

Fractal Analysis, Statistics

In my previous post I mentioned the Hurst Exponent as a measure of the predictability of a time series, and I drew a data flow diagram showing the Rescaled Range algorithm. However, thanks to foquant, it has come to our attention that the paper from which I took the description of the algorithm does not state clearly that the algorithm needs to iterate over different sub-periods. foquant attempted to recreate the results in Excel and found values that differ from the ones shown in the paper [Rasheed et al.]. After a quick Google search I came across this excellent presentation, which confirms what really needs to be done in order to calculate the Hurst Exponent.

I have re-done the data flow diagram for the algorithm and the steps that need to be performed are shown below:

hurstAlgo2prop.png

I have provided an example below which steps through each stage of the algorithm. Hopefully this will make the algorithm easier to understand.

Algorithm Steps

Here I use a returns time series X = X1, X2, …, X1023, X1024 as an example. We shall operate in the log2 domain for the regression. Matrix dimensions are shown in square brackets [m x n], where m is the number of rows and n is the number of columns.

1. Calculate number of data points

We do this step so that we have a time series with an integer number of sub-periods over which to calculate the rescaled range.

dataPoints = floor(log2(length(data)));

In this example dataPoints = 10. Hence we have 10 sets of sub-periods over which to iterate the rescaled range algorithm. I shall explain what sub-periods are below.

2. Calculate sub-period boundaries

Since we have 10 data points all we need to do to calculate sub-period boundaries is to raise the base to the power of each of the data point numbers. So in this example the sub-period boundaries are 2^(1 through to 10) = 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. These points are essentially the t values on the regression graph. So taking the log of these values reverses the procedure and gives us values 1 through to 10 which shall be the x-axis points on the regression graph.

3. Select sub-period boundary

From this point onwards we iterate through each of the sub-period boundaries. In this case we shall iterate 10 times. Select only 1 sub-period at a time.

4. Calculate the sub-period

Having selected a sub-period boundary we now calculate the sub-period.

subPeriod=floor(N/selectedBoundary);

Here N = 1024 and the selectedBoundary will be from step 2. So if selectedBoundary = 2 then subPeriod = 512. If selectedBoundary = 64 then subPeriod = 16. The table below shows this.

subPers.png
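Steps 1 to 4 can be reproduced with a few lines of Python (a sketch written for this walkthrough rather than the Matlab used above; `sub_period_table` is a hypothetical name). It prints the selectedBoundary and subPeriod pairs for the 1024-point example.

```python
import math

def sub_period_table(n_points):
    """Return (selectedBoundary, subPeriod) pairs for a series of n_points values."""
    data_points = int(math.floor(math.log2(n_points)))        # step 1
    boundaries = [2 ** p for p in range(1, data_points + 1)]  # step 2
    return [(b, n_points // b) for b in boundaries]           # step 4

for boundary, sub_period in sub_period_table(1024):
    print(f"selectedBoundary={boundary:>4}  subPeriod={sub_period:>3}")
```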

5. Calculate sub-period matrix.

Having calculated the sub-period we now evaluate the sub-period matrix. This is hard to explain using the entire time series X = X1, X2, …, X1023, X1024. But the illustration below tries to explain this using a twelve element time series.

subPeriod.png

An n sub-period matrix has n columns. So in our first iteration we will create a 512 sub-period matrix. In the second we create a 256 sub-period matrix. Notice that the table shown in step 4 is basically the dimensions of the sub-period matrix, with selectedBoundary being the number of rows and subPeriod being the number of columns. Rows are horizontal, columns are vertical. So sub-period matrix is of dimension [selectedBoundary x subPeriod].
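The following Python sketch builds the sub-period matrix for a twelve element series. It assumes that each column holds one contiguous run of the series, which is the layout a column-major reshape in Matlab would give; the original illustration is not reproduced in this text, so that layout is an assumption. `sub_period_matrix` is a hypothetical name.

```python
def sub_period_matrix(series, selected_boundary):
    """Build a [selected_boundary x sub_period] matrix as a list of rows."""
    sub_period = len(series) // selected_boundary   # number of columns
    # Row r, column c holds element c * selected_boundary + r, so each column
    # is one contiguous sub-period. Leftover elements at the end are dropped.
    return [[series[c * selected_boundary + r] for c in range(sub_period)]
            for r in range(selected_boundary)]

series = list(range(1, 13))            # stand-in for X1 .. X12
matrix = sub_period_matrix(series, 4)  # 4 rows, 3 columns
for row in matrix:
    print(row)
```

With selectedBoundary = 4 this gives three columns, holding X1 to X4, X5 to X8 and X9 to X12 respectively.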

6. Calculate mean of each sub period

For the matrix created in step 5, we calculate the mean of each column, i.e. the mean of each sub-period. This gives us a matrix of size [1 x subPeriod].

7. Remove mean from each sub-period

Subtract the calculated mean of each column from every row under that column. Do this for all columns. This procedure results in a matrix of size [selectedBoundary x subPeriod].

8. Calculate cumulative deviation from sub-period mean.

It’s hard to explain, but here is an example with a 3×6 matrix:

cumsum.png

This procedure does not alter the size of the matrix, which remains [selectedBoundary x subPeriod].
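Steps 6 to 8 can be sketched in Python as follows. The sketch assumes the cumulative sum runs down each column, one sub-period at a time; the small matrix is made up for illustration and is not the 3×6 example from the image. `cumulative_deviation` is a hypothetical name.

```python
def cumulative_deviation(matrix):
    """Steps 6-8: per-column mean removal followed by a running sum down each column."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[c] for row in matrix) / n_rows for c in range(n_cols)]  # step 6
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        running = 0.0
        for r in range(n_rows):
            running += matrix[r][c] - means[c]   # step 7, accumulated for step 8
            result[r][c] = running
    return result

matrix = [[1.0, 4.0],
          [2.0, 0.0],
          [6.0, 2.0]]
for row in cumulative_deviation(matrix):
    print(row)
```

Because the deviations in each column sum to zero, the last row of the result is always zero (up to rounding).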

9. Calculate min and max of cumulative deviation

Take the min of each column of the cumulative deviation and subtract it from the max of that column. This gives a matrix of size [1 x subPeriod].

10. Calculate standard deviation of each sub-period.

Here we calculate the standard deviation of each column of the sub-period matrix. We end up with a matrix of size [1 x subPeriod].

11. Calculate rescaled range

Divide the matrix from step 9 element by element by the standard deviation matrix from step 10. Note that the matrices from steps 9 and 10 have the same dimensions.

12. Take logarithm of mean of rescaled range

To get the y-axis log2(R/S) point for the selected sub-period boundary, we average the rescaled range matrix from step 11 and take log2 of the calculated average.

13. Take logarithm of sub-period boundary

To complete the iteration we take the log2 of the selected sub-period boundary from step 3. This is the x-axis log2(t) point. We now have a log2(R/S) and log2(t) pair to plot on a graph. We go back to step 3 and repeat for a newly selected sub-period boundary.

Having iterated through all sub-period boundary values and plotted them, we perform a regression to get the line of best fit. The slope of this line corresponds to the Hurst Exponent of the returns time series.
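Putting the steps together, here is a minimal end-to-end Python sketch of the procedure. It rests on a few assumptions that the steps above leave open: the standard deviation uses the population form (divide by the number of points), sub-periods with zero standard deviation are skipped, and the regression is ordinary least squares. The function names are hypothetical, and the Gaussian test series is synthetic.

```python
import math
import random

def rescaled_range(column):
    """Steps 6-11 for one sub-period; returns None when the standard deviation is zero."""
    n = len(column)
    mean = sum(column) / n
    running, cumulative = 0.0, []
    for value in column:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)  # population form (assumed)
    return spread / std if std > 0 else None

def hurst_exponent(returns):
    """Slope of log2(mean R/S) against log2(sub-period length)."""
    n_points = len(returns)
    xs, ys = [], []
    for power in range(1, int(math.floor(math.log2(n_points))) + 1):
        length = 2 ** power                                   # selectedBoundary
        chunks = [returns[i * length:(i + 1) * length]
                  for i in range(n_points // length)]
        values = [rs for rs in map(rescaled_range, chunks) if rs is not None]
        if values:
            xs.append(float(power))                           # log2(length)
            ys.append(math.log2(sum(values) / len(values)))
    # Ordinary least-squares slope.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

random.seed(0)
returns = [random.gauss(0.0, 1.0) for _ in range(1024)]
print(round(hurst_exponent(returns), 3))
```

On independent Gaussian returns the estimate should land in the neighbourhood of 0.5, although small sub-periods are known to bias R/S estimates, so some deviation is expected.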

I’m not too sure how this procedure would be implemented in Excel, but I imagine it requiring some custom VBA code, particularly for evaluating the sub-period matrix.

Measuring the Predictability of a time series

Fractal Analysis, Statistics

It is reasonable to ask whether a given financial time series is predictable before going about creating a model to predict it. One of the techniques available to do this is the Rescaled Range (R/S) algorithm, which provides a numerical estimate, known as the Hurst Exponent, of the predictability of a time series. The reason the Hurst Exponent is an estimate and not a definitive measure is that the algorithm operates under the assumption that the time series is a pure fractal, which of course is not entirely true for most financial time series. This, however, is of low importance; what really makes the Hurst Exponent appealing is that it provides a means of classifying time series. This is a very useful statistical measure for comparing a model’s performance across different sections of financial data. Here I describe the R/S algorithm and provide an example of a time series whose Hurst Exponent we calculate.

In terms of data flow, the Rescaled Range algorithm applied to a financial time series is as follows:

hurstAlgo3.png

The value of the Hurst Exponent lies in the range 0 to 1; loosely, the closer the value is to 1, the easier the returns time series is to predict. It is worth pointing out that the Hurst Exponent is calculated for the returns time series, and not the actual time series showing absolute prices. More formally we have

hurstTable2.png

  • H < 0.5 indicates an anti-persistent returns time series, which means future values tend to return to a longer term mean value. The strength of this mean reversion increases as H approaches 0.
  • H = 0.5 indicates a random walk, so future return values are equally likely to go up or down.
  • H > 0.5 indicates the returns time series is trending, the strength of which increases as H approaches 1. Series of this type are easier to predict than series falling in the other two categories.
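In code, the classification above might look like the following Python sketch. An estimated exponent will almost never be exactly 0.5, so the sketch uses a tolerance band around 0.5; the band width of 0.05 is an arbitrary assumption, as is the helper name `classify_hurst`.

```python
def classify_hurst(h, tolerance=0.05):
    """Label a Hurst estimate; the tolerance band around 0.5 is an assumption."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("Hurst exponent is expected to lie in [0, 1]")
    if h < 0.5 - tolerance:
        return "anti-persistent (mean reverting)"
    if h > 0.5 + tolerance:
        return "persistent (trending)"
    return "close to a random walk"

for h in (0.30, 0.52, 0.80):
    print(h, classify_hurst(h))
```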

Calculating the Hurst Exponent

As outlined in [Rasheed, K. et al], for a returns time series

_1.png

We apply the following steps with

0.png

1. Calculate the mean

1.png

2. Calculate the mean adjusted series

2.png

3. Calculate the cumulative deviate series

3.png

4. Calculate the range series

4.png

5. Calculate the standard deviation series

5.png

6. Calculate the rescaled range series

6.png

7. The rescaled range scales by a power-law as time increases, such that

7.png

8. In order to calculate the Hurst exponent we can take the base-n logarithm of the above equation, which gives

last.png

Recall that the form of the above equation is that of a straight line. We can obtain the value of H by calculating the slope of the linear relationship between log(R/S) and log(t). This requires a regression, as the points will not always lie on a perfectly straight line.
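The equations in this section were images and are not reproduced in this text. As a hedged reconstruction, the conventional rescaled range definitions that match the step names above are given here for a returns series X_1, …, X_n with t = 1, …, n. The symbols m, Y, Z, R, S, u and c are assumed notation and may differ from the paper's.

```latex
\begin{aligned}
m &= \frac{1}{n}\sum_{i=1}^{n} X_i
  && \text{(1) mean} \\
Y_t &= X_t - m
  && \text{(2) mean adjusted series} \\
Z_t &= \sum_{i=1}^{t} Y_i
  && \text{(3) cumulative deviate series} \\
R_t &= \max(Z_1,\dots,Z_t) - \min(Z_1,\dots,Z_t)
  && \text{(4) range series} \\
S_t &= \sqrt{\frac{1}{t}\sum_{i=1}^{t}\left(X_i - u_t\right)^2},
  \quad u_t = \frac{1}{t}\sum_{i=1}^{t} X_i
  && \text{(5) standard deviation series} \\
(R/S)_t &= \frac{R_t}{S_t}
  && \text{(6) rescaled range series} \\
(R/S)_t &= c\,t^{H}
  && \text{(7) power law} \\
\log (R/S)_t &= \log c + H \log t
  && \text{(8) straight line with slope } H
\end{aligned}
```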

Example

Let’s consider the following 1024 bar time series for an equity index. This is a reproduction of results from [Rasheed, K. et al].

actual.png

In order to calculate the Hurst Exponent for this interval, we first need to obtain the returns time series:

returns.png

Applying the algorithm to the returns time series, we find that the individual points align in a nearly straight line, the slope of which gives us the Hurst exponent. Note that we chose 1024 points so that we can easily apply the base 2 logarithm. A different base can be used instead, but base 2 allows calculation with the fewest data points.

hurstRegress.png

Practical Applications of the Hurst Exponent

  1. The Hurst exponent provides a method of classifying time series, which can be beneficial in identifying, for instance, which stocks have greater short term predictability. We could create a portfolio consisting of stocks with particular Hurst values and investigate their profit generating characteristics.
  2. An application involving automated trading could be something like this: if a particular asset has its Hurst exponent drop below a threshold value, all investment positions in this asset could be closed in response to a “regime shift”.
  3. In conjunction with biologically inspired algorithms, Hurst classification can help determine which assets to forecast and which ones to ignore. This can be particularly useful in neural nets, where models can focus more on time series with higher predictability.

In a future post I will show that Evolutionary Algorithms generate greater profit when applied to time series with a Hurst exponent greater than 0.5 than to time series with a Hurst exponent below 0.5.

Oscillator design update

Filter Design, Statistics

I’ve been struggling to derive a suitable zero-lag high-pass filter. It is not as easy as deriving a zero-lag low-pass indicator. The design approach is the reverse of what we did in the previous case, and there comes a stage where you need to reduce the order of the filter whilst maintaining lag at high frequencies. The algebra is overly complicated, and I’m not sure if I’m doing the right thing. This is the best I could manage, but it is far from perfect.

HPlagTryOne.png

There are two things I could do here: either give up on this task or consult someone specialised in filter design. Maybe I shall leave it for now and focus on other things.

Forecasting a Brownian path

Neural Networks, Stochastic Processes

A Neural Network can help forecast future values of a Brownian path. For this purpose we use a multi-layer perceptron (MLP) network, which consists of multiple layers of interconnected computing units called nodes. The MLP is feed-forward, as the pattern of activation of the network flows in one direction only, from the input to the output layer. During training, the network aims to learn patterns that are present in the input data by minimising a mathematical function describing the quality of network performance. MLPs are trained using a supervised learning paradigm. In supervised learning, a set of input data vectors for which the output is already known is presented to the MLP. An algorithm maps the input vectors to the known output vector by systematically adjusting network weights, hence minimising the network's cost function.

trainSetBrowniian.png

So for our experiment I divide the time series into a training set and a test set. As inputs within the training set I use moving average vectors which are derived from values of the Brownian time series. For solving real-world problems of this nature I would not recommend using moving averages, because they have lag and would result in your output having lag. Lag is a major issue with neural networks. If a network is not properly trained, future forecasted values may appear offset towards the right side of the actual time series. The real challenge is to ensure your inputs are not delayed with reference to your target vector. At least this is true for the case where inputs are derived from the target. I could have used methods other than the MA equations to derive input vectors, but here I shall stick with moving averages because our aim is not to develop a reliable forecasting system, but to show that a time series can be learned.

nnTry.png

The MLP takes a 4:5:1 structure: there are 4 inputs, 5 hidden nodes and 1 output. There is no rule as to the number of hidden nodes to use. I have not found the number of hidden nodes particularly concerning, although it must be pointed out that using too many may result in the network memorising rather than learning input-output relationships. Each node is related to nodes in the previous layer via a weight value, which changes during network training. A popular method called Network Pruning improves the functionality of the NN by removing memorised patterns embedded within the network structure. For this task I shall not prune the network, as there are specific reasons for doing so which do not crop up here.
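For readers who want to experiment, here is a minimal Python sketch of a 4:5:1 network trained by batch gradient descent. It is not the network used for the graphs in this post: the tanh hidden nodes, the linear output node, the learning rate, the moving average lengths (2, 4, 8 and 16) and the synthetic random walk are all assumptions made for illustration.

```python
import math
import random

random.seed(1)
N_IN, N_HID = 4, 5

# Weights: each hidden row ends with a bias term, as does the output row.
w_hid = [[random.uniform(-0.5, 0.5) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(N_HID + 1)]

def forward(x):
    """Return (hidden activations, network output) for one 4-element input."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hid]
    output = sum(w * h for w, h in zip(w_out, hidden + [1.0]))  # linear output node
    return hidden, output

def mse(inputs, targets):
    return sum((forward(x)[1] - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)

def train_epoch(inputs, targets, rate=0.05):
    """One pass of batch gradient descent on the mean squared error."""
    g_hid = [[0.0] * (N_IN + 1) for _ in range(N_HID)]
    g_out = [0.0] * (N_HID + 1)
    for x, t in zip(inputs, targets):
        hidden, y = forward(x)
        err = 2.0 * (y - t) / len(inputs)
        for j, h in enumerate(hidden + [1.0]):
            g_out[j] += err * h
        for j, h in enumerate(hidden):
            delta = err * w_out[j] * (1.0 - h * h)   # tanh derivative
            for i, v in enumerate(x + [1.0]):
                g_hid[j][i] += delta * v
    for j in range(N_HID + 1):
        w_out[j] -= rate * g_out[j]
    for j in range(N_HID):
        for i in range(N_IN + 1):
            w_hid[j][i] -= rate * g_hid[j][i]

# Toy data: a random walk scaled to [0, 1]. Inputs are moving averages of the
# previous 2, 4, 8 and 16 values; the target is the current value.
path = [0.0]
for _ in range(300):
    path.append(path[-1] + random.gauss(0.0, 1.0))
lo, hi = min(path), max(path)
path = [(p - lo) / (hi - lo) for p in path]
inputs = [[sum(path[n - k:n]) / k for k in (2, 4, 8, 16)] for n in range(16, len(path))]
targets = path[16:]

before = mse(inputs, targets)
for _ in range(50):
    train_epoch(inputs, targets)
after = mse(inputs, targets)
print(f"MSE before: {before:.4f}, after 50 epochs: {after:.4f}")
```

The sketch has no test set and no early stopping, so it only shows that the training error falls; it says nothing about how well the network generalises.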

learnCurveBrowniian.png

The graph shows the network learning curve over 50 epochs. The MSE drop in the earlier parts of the training process is far greater than the drop many epochs later. The challenge is choosing a number of epochs that allows the network to learn generalisations rather than specific relationships. The graphs below show the network forecasting ability after being trained over a different number of epochs.

epochVariatBrowniian.png

The use of NNs for forecasting a Brownian motion requires a lot of care, particularly when deciding the nature of the input vectors to use and the number of training epochs to stop at. This experiment hopefully highlights some of these issues.

Let’s get started…

Stochastic Processes

Okay… so this is the first experiment I shall perform using biologically inspired algorithms (BIAs). The purpose of this experiment is to talk about a method of forecasting a non-linear time series using a BIA. I have simulated a 500-step Brownian motion with drift = 10 and diffusion coefficient = 10. The graph below shows the time series we will be dealing with.

brownianMotion.png

As you can see, this time series exhibits a fair mix of trend and cyclic components, aspects that will enable me to explain a few things later on.
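A path of this kind can be simulated with a few lines of Python. This is a sketch under stated assumptions: the time horizon of 1, the starting value of 0 and the random seed are not given in this post, so the path it produces will not match the graph above. `brownian_path` is a hypothetical name.

```python
import math
import random

def brownian_path(n_steps=500, drift=10.0, diffusion=10.0, horizon=1.0, seed=42):
    """Simulate X(t + dt) = X(t) + drift*dt + diffusion*sqrt(dt)*Z with Z ~ N(0, 1)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [0.0]                      # starting value (assumed)
    for _ in range(n_steps):
        shock = rng.gauss(0.0, 1.0)
        path.append(path[-1] + drift * dt + diffusion * math.sqrt(dt) * shock)
    return path

path = brownian_path()
print(len(path), round(path[-1], 2))
```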

In my next post I shall explain exactly what I intend to do. For now let’s just admire our nice little Brownian motion!
