Quantcast aiQUANT

Fourier Transforms and financial time series data

Fourier Transform, Signal Processing 2 Comments »

The purpose of this post is two fold:

  1. Discuss aspects of the Fourier Transform that will form necessary grounding for subsequent understanding of the Wavelet Transform.
  2. State why Fourier Transforms are inappropriate for analyzing financial time series data.

The Fourier Transform

In the previous post I hinted at the relationship between the time domain and the frequency domain representations of a signal. The figure below gives a clearer picture of that

fourierTdFd.png

A signal that oscillates rapidly with respect to time has a higher frequency relative to one that oscillates slowly. A pure sine wave has only one frequency component, but not all signals are sinusoidal - some may be composed of a number of different frequency signals added together, others such as stock prices can have signal frequency components that change with time. In order to determine the frequency content of a signal, one needs to perform a conversion into the frequency domain, there by giving us a view in the direction shown by the yellow arrow in the figure above (the spectrum of the signal).

The Fourier transform is one of a number of algorithms that provide a way evaluating the frequency content of a signal. The transformation into the frequency domain follows the equation

[tex]X(f)=\int_{\small-\infty}^\infty\text{x(t)}e^{-2j\p\text{ft}}dt[/tex]

and the inverse Fourier transform is

[tex]\text{x(t)}=\int_{\small-\infty}^\infty\text{X(f)}e^{2j\p\text{ft}}df[/tex]

Now the equations listed above belong to one of the many variants of the Fourier transform - the Continuous Time Continuous Frequency Fourier transform. There are few general points to note:

  1. The equation of the Fourier transform tells us that the integration needs to occur for all times within the range [tex]\small-\infty[/tex] to [tex]\small\infty[/tex]. The implication of this is that the occurrence of time domain signal [tex]\small\text{x(t)}[/tex] needs to exist for any value of time within the said range.
  2. For the given integration bounds, frequency [tex]\small\text{f}[/tex] needs to be a constant value. This implies that signal [tex]\small\text{x(t)}[/tex] needs to be time-invariant i.e. stationary.

ftFamFigure.png

I have drawn up a chart showing the different variants of the FT and their relationship with each other. The diagram is work in progress so please feel free to recommend any changes that can be made. I intend to extend the family tree to show other transforms relevant to our discussions.

The choice of invoking a particular transform depends on the nature of the signal undergoing the transformation (continuous or discrete in time) and the level of detail we wish to obtain in the frequency domain (continuous or discrete frequency bins). This also applies to the inverse Transform.

I do not wish to talk any more than this about the FT as I have provided a few good resources at the end of this post to complement what has already been discussed.

Application to financial time series data

There are two main reasons why FTs are inappropriate for analyzing financial data:

  1. Bullet points (1.) and (2.) above do not hold for stock prices or any other asset price. Particularly with regard to bullet point (2.) stock prices do not have a constant frequency with respect to time. This points to another short-coming of the algorithm - it tells us nothing about the points at which a frequency change occurs in the time domain, which is essentially one of the things we want to know.
  2. The other reason is that FTs deliver poor resolution for cycle period measurement. Suppose we perform a 128-point FFT. The largest cycle the algorithm can record is 128 bars. Cycles larger than 128 cannot be recorded because the window length is not large enough for it to be captured fully. One may wish to increase the window length, but it is likely that stationarity is not maintained. And also, only cycles that are an integer power of two (2, 4, 8, 16, 32, 64) can be recorded (in the case of the DTDF transform). Anything that falls between these boundaries cannot be measured accurately as it is mathematically impossible for the algorithm to do so.

Further Reading:

Waves and Wavelets

Fourier Transform, Signal Processing, Wavelet Transform 1 Comment »

Here I write about things that may aid those who are new to Signal Processing to grasp a few concepts without necessarily referring to a textbook.

The table below shows some the differences between a Wave and a Wavelet.

waveLetComp.png

Essentially a Wavelet is a small wave with pretty much the opposite characteristics to that of a wave. This is exactly what makes their application appealing to time variant signals such as stock prices. Accompanying wavelets is the Wavelet transform algorithm, of which there are many variants, but all essentially do the same thing using different ways. The table below shows the difference between two contrasting analytical algorithms.

comp2.png

So what are these “domains” we are referring to? Any signal that is a function of time is in the time domain. Stock prices are in the time domain because the each point in time has a price associated with it. Now consider another time domain signal such as a sine wave. Lets assume that this signal has amplitude 2 and takes 5 seconds before it begins to repeat itself i.e. it’s period is 5 seconds. Now the frequency of this signal is the inverse of the period which is 0.2 hertz in our case. Also the amplitude of the signal is the in the frequency domain our sine wave will look like a single spike of height 2 at the point 0.2 hertz. This is because the frequency domain shows the amplitude or power of the signal (y-axis) for different frequencies (x-axis), which is referred to as the spectrum. The spectrum of a stock price takes a more complicated form and is more difficult to determine, but the wavelet transform takes care of that.

Now it is worth pointing out that wavelets are usually combined with other signal processing algorithms to form “systems” that enable the transformation from the time domain into the frequency domain to happen in particular ways, depending on what you want to achieve. For instance instead of determining the spectrum of the stock price one may want to develop a filtering scheme or a trend/cycle estimation scheme. We shall see more of that later.

Notes on Sampling rate conversion

Fourier Transform, Signal Processing, Wavelet Transform 5 Comments »

For those not too familiar with wavelet theory I have attached some notes here for easy view. The idea of a non-fixed sampling rate is fundamental to the Wavelet Transform and it is very important to understand these concepts before delving into actual applications. The newbie may find some of the material heavy going in terms of the maths involved, but I believe understanding ideas are more important than being able to evaluate intagrals.

The first set of notes presents basic multirate concepts. The notion of upsampling and downsampling are discussed in terms of time and frequency domain characteristics.

The second set of notes gives a very friendly introduction to the Wavelet Transform and shows it’s advantages over the Fourier Transform.

I have also attached a very readable paper titled “Wavelets for kids“. You can download it from here

Over the next few posts we shall look at some wavelet decomposition models for financial time series.

Wavelet textbooks

Signal Processing, Wavelet Transform 2 Comments »

I wish to start discussing wavelets, which is yet another branch of signal processing that has had success in feature extraction and volatility forecasting for time series. I suggest the following three excellent books for the reader who is interested in wavelets:

Additionally there are numerous web resources on wavelets and the like, with about two million hits via google search.

Using Simulink to implement and test models

Hilbert Transform, Matlab, Signal Processing, Transform Algebra 4 Comments »

In addition to Matlab, The Mathworks offer a product called Simulink which is a design platform for implementing and testing model-based systems. Most of my experience using Simulink borders on modelling Control Systems and DSP architectures. However, as the Mathworks have pointed out in a recent webinar, Simulink can be extended to develop and test finance based models such as a trading system or sub-systems that form part of a larger model.

I decided to implemented a Hilbert Transformer model for price data using Simulink which is shown below:

HilbertDiagram.jpg

The model has input “simin” which takes the price data from the matlab workspace as discrete points one at a time and applies unit delay (represented by “1/Z”) and gain which is basically multiplication by the shown number. Individual Real and Complex parts are combined to create complex numbers which is then outputted to the workspace.

Simply put, the Hilbert Transform converts price data into complex number form so that one may go about calculating different measures that are not possible to calculate using just the price itself. My previous post mentions measures that are calculated using hilbert transformed price. The relationship between the actual price and the hilbert transformed price is

[tex]\small \text{Actual Price = sqrt{a^2 + b^2}}[/tex] for a hilbert price [tex]\small\text{a + ib}[/tex].

As one would expect the model implemented above suffers lag. This is confirmed by comparing the Actual price with the price reconstructed from the hilbert complex numbers:

HilbertSimulinkModel.png

The model does not induce any loss or gain in magnitude, but there is a lag of 4 bars. This implies that the hilbert price for the current bar actually corresponds to the price 4 bars ago - something we should consider when using hilbert prices to deduce other measures.

Other Links:

  1. The wikipedia page covers basics of the hilbert transform.
  2. The Mathworks have a webinar which has a section on using Simulink for designing and testing Algorithmic Trading models. You can view this webinar here. Its round about 40:00mins into the presentation where the section on Simulink begins.
  3. Simulink totorial can be found here.
  4. There a number of webinars and example models at the Mathworks website for one to get started using simulink.

Links for 31-01-2008

Links 1 Comment »

I have added a few links to my blogroll:

  1. Tom at Neural Market Trends shows how develop AI based financial models using an open source tool called Rapid Miner. He provides a number of tutorials and even has a member’s section where he shares his models through videos and examples. I think this is an excellent way for one to get started developing AI financial models. From what I hear the Rapid Miner tool has a steep learning curve, but Tom alleviates this a great deal through his easy to understand tutorials.
  2. Ali’s blog talks about mostly quant stuff with occasional digressions. I recommend this site for your daily food for thought.
  3. foquant runs a quant blog. He recently developed an SMS order confirmation tool for trading which can be downloaded from the models page at his site. He provides a number of other quant finance models and is always adding newer ones.
  4. Two fish’s blog covers a wide variety of topics ranging from quant finance to ghosts and immigration. Fascinating stuff!
  5. Visit the One Unified Blog for your dose on programming and IT related issues. The blog shares a model for Trade Optimization by Genetic Programming, written for the QuantDeveloper platform.
  6. The Econophysics blog provides a rare blend philosophical issues surrounding quantitative finance and quantitative social science. A must read!

Thoughts on GAs and the like

Artificial Intelligence 8 Comments »

Every once in a while I receive an email from someone somewhere asking me how to develop killer Genetic Algorithm trading model. I find there is a some confusion surrounding the GA, NNs and the like so in this post I wish to clear misconceptions that may lead one to believe that Evolutionary Algorithms will give answers straight away.

  • Lets be clear about it, Genetic Algorithms may sound sexy but under certain cases they are no silver bullet. I say this based experience from applying them to solve different kinds of problems. Under such cases it is not the optimization algorithm which fails you - its a simple case that there is no good solution to the problem at all.
  • You cannot use the GA or its derivatives directly to predict the market, but rather you use it to optimize a pre-defined, or nearly defined model in order to improve its performance. In other words one aims to find the best parameters to the model, and it’s the emerging model which goes about doing the predicting or trading and not the GA itself.
  • On training Neural nets you want to be careful not to use a local search algorithm e.g. the Back propagation algorithm, but rather a global search mechanism such as swarm, simulated annealing, the GA or its derivatives. The reasoning behind this is that local search algorithms will converge to any minimum (which may be local or global). A global search algorithms on the other hand will have a near accurate view of the optimization surface and as a result are more likely to converge to global solutions.
  • Neural Nets have received some bad press about over fitting problems. There are a number of ways of dealing with this, the most notable one being filtering. If the data you are fitting to is clean then your solution will not over fit. If your data is noisy then filtering the noise will ensure that over fitting does not occur.

If there are good solutions to find, then Evolutionary Algorithms are good at discovering these solutions, but if there are no good solutions (which may be an inherent characteristic of the problem), then it is impossible for Evolutionary Algorithm to discover any. The message to take away from here is that one should focus on problems where there is a strong likelihood of something to be discovered, rather than believe that a solution will be discovered out of the blue for any arbitrary problem.

A filter with negative lag

Filter Design, Technical Indicators 6 Comments »

The problem with technical indicators is that they suffer lag, resulting in signals appearing after events have occurred in time series context. Recently I’ve been investigating ways of reducing lag in these filters, building upon the results from a previous post which shows the workings of a zero-lag indicator derived from first principles. It seems something can be done, although the technique as it stands needs further refining. The graph below shows how it compares with the Exponential Moving Average (EMA) and the Moving Average (MA):

predictFil.png

The thing with this filter is that it has negative lag when the market is trending but has high gain when the market is in cycle mode. The reason for this is that in the filtering algorithm there is a part where I multiply the actual price by itself which results in noise components being amplified. To understand this let us consider the following:

Price data can be seen as comprising of two components: a signal part which is useful and a noise part which is not useful.

[tex]\small \text{Price=Signal+Noise}[/tex]

The effect of multiplying Price by itself is a quadratic equation with signal and noise terms:

[tex]\small \text{Price*Price}=\left\{\text{(S+N)(S+N)} \\ \text{S^{2}+SN+NS+N^{2}}\right.[/tex]

We have three noise terms, two of which affect our signal. I think this is the cause of the laggish behavior when the market is cyclic because the high frequency noise components are amplified.

Now there are two ways I can think of improving the performance of this filter

  1. Find a means of modifying the value of alpha when the filter detects that the market is no longer trending. This will require an additional algorithm to estimate whether the market is in trend mode or cycle mode - something which the Hilbert Transform can take care of.
  2. Find an alternative to multiplying the price by itself so that additional noise terms are not introduced.

I am biased towards 1 so shall spend more time on that. Because I consider this work in progress, I am tempted to discuss the actual workings of the filtering algorithm at a future date when we have some acceptable performance.

Transform Algebra in Mathematical Finance

Transform Algebra 2 Comments »

I have hinted at Z-transforms and Hilbert transforms and how they are applied to financial time series. But generally speaking, there are a number of other mathematical transforms which find their use in financial modeling problems.

Mathematical Transforms are important because they help convert functions into other domains thereby making them easier to understand and solve.  Here I give a qualitative description of common transforms and some examples of how they are applied to problems in finance.

traformTable2.png

Obviously the choice for invoking a particular transform depends on the problem being solved.

1. Fischer Transform

The Fischer Transform converts any data set into a modified data set whose probability density function is approximately gaussian. An immediate benefit is that one can then analyze the transformed data set in terms of its deviation from the mean - something which might not have been possible prior to the transformation. Consider for instance absolute prices on a bar chart. Are they normally distributed (i.e. bell shaped)? No. The returns density of the time series are usually near bell shaped, but the actual prices are no where close to bell shaped. This means attaching Gaussian confidence intervals to absolute price data is impossible, but Fischer transformed data will modify the data so that they become bell shaped hence allowing the majority of Gaussian statistical functions to be applied. The benefit is seen in modified technical indicators that are similar to, but more responsive than conventional oscillators such as the Commodity Channel Index (CCI) or Moving Average Convergence Divergence (MACD).

2. Fourier Transform

The Fourier Transform enables conversion of functions or data sets from the time-domain to the frequency domain and vice-versa. In Signal Processing this is essentially what describes the relationship between the time domain and the frequency domain. In terms of option pricing, the Fourier Transform provides a framework for fast price calculation compared to Monte Carlo methods. As shown in [Szymon et all] the Fast Fourier Transform (FFT) algorithm is about 3000 times faster than Monte Carlo simulation.

fftSimTime.png

It is worth noting that FFTs are inappropriate for direct time series analysis because they deliver poor resolution in terms of cycle length. They are only capable of recording integer number of cycles, thereby missing any cycles that have a length falling in between two integer boundaries. A good algorithm which performs better than the FFT for measuring cycle period in financial time series is the Pisarenko harmonic decomposition.

3. Hilbert Transform

The Hilbert Transform is a procedure to create complex signals from the real price data that is plotted on the bar chart. With complex signals available one can compute more accurate and responsive indicators as well as create other indicators which cannot be calculated without the Hilbert Transform. Computations such as Signal-to-Noise ratio, Power Spectral Density and Cycle Period measurement can only be achieved by calculating the Hilbert Transform.

4. Laplace Transform

Laplace Transforms are useful for solving transient state systems that are described by differential equations. The transform helps simplify complex differential equations; notably in the case of partial differential equations used in option pricing formulas. Simplified equations in the Laplace domain are solved algebraically before the inverse Laplace transform is applied to return the solution to the original domain.

5. Wavelet Transform

The Wavelet Transform is similar to the Fourier Transform in that it converts time domain data into frequency domain representations. The main difference is that wavelets provide a number of ways for doing the conversion through a series of wavelet families. Each wavelet family has its own property which is exploited for doing more specific decomposition tasks such feature extraction, noise removal or cycle period estimation of time series. Decomposed time series can be analyzed in several ways to provide more detain into it’s intrinsic properties.

6. Z - Transform

The Z Transform converts discrete time-domain data points into a complex frequency-domain representation. Unlike the Laplace and Fourier Transforms which are applicable to differential equations, the Z Transform provides an excellent framework for working with difference equations in the frequency domain. One can go about describing filters in terms of their transfer functions and apply a host of algebraic techniques found in Digital Signal Processing to design better filters for time series analysis.

Summary

  • Transforms are performed to make problems easier to understand and solve.
  • Any Transform has an accompanying Inverse Transform algorithm so that a dual relationship is maintained between two domains.
  • The Z, Wavelet, Fourier and Laplace transforms are related in that they maintain a relationship between the time domain and the frequency domain.
  • Laplace Transforms are useful for analyzing transient systems where as Fourier Transforms are more suitable for steady state systems.
  • aiQUANT thinks technical indicators and oscillators should be designed in the Z domain. This gives greater understanding of the frequency characteristics - something which is not obvious with just the time domain representation of the indicator.

Matlab Compiler notes

Matlab No Comments »

This is a collection of notes made when using the Matlab compiler under Matlab 6 (2003) and Matlab 7 (2006) which may be of use to Matlab compiler users.

Matlab 6 notes

The Matlab compiler is a useful utility which enables you to turn your Matlab code into C/C++ and compile it. The compiler doesn’t really compile your code at all… it translates your Matlab code into C line for line. The C/C++ code produced is unreadable do don’t count on being able to modify it or even understand it! Mex files can be included, watch out for relative paths names though, some documentation on this on Matlab homepage.

Don’t use the remez filter generation command in Matlab as the compiler can’t compile it. There is a fix but it is complicated, see this.

Other functions such as eval and keyboard are unsupported by the compiler. The save command only supports .mat format, not ascii format.

Make sure you test the executable you’ve produced on other machines especially if you intend to deploy the code for other users! I have found that code would work fine on the machine it was compiled on but produce run-time errors on other machines due to missing dlls & incorrect relative paths.

When you supply a command line argument to the executable when running from the DOS command line, it is passed to the internal code as a string even if it is a number. You need to use a command such as

matComp6.PNG

to convert these into numbers. Of course now if you run this code under Matlab, it doesn’t work… I used a compile variable to execute different parts of code depending on whether we were running under Matlab or in a compiled version of the code:

matComp7.PNG

If you run the compiler command (mcc) from a DOS prompt or Perl script etc, you need to make sure you’ve got the Matlab path saved as a Windows variable or else the compiler can’t find any libraries such as the wavelet toolbox. Use the mccsavepath utility to do this.

When giving executable models to other users, you also need to supply the Matlab run-time libraries. These are supplied with matlab compiler, but can also be downloaded from the mathworks homepage.

Matlab 7 notes

For Matlab 7, the compiler has been revised extensively. It now encrypts your code along with the built-in Matlab functions you’ve used. It presumably decrypts these at run-time.

A few more issues:

You need to be VERY CAREFUL here as the compiler makes no effort to change either directory structures or filenames. If you are using some bespoke algorithm/ method in your system which you regard as proprietary and you name your filename after this, e.g. 16TapRLS.m then the one who uses your deployed application is going to see it!

Compiler doesn’t like working over networked file stores, sometimes it fails for strange reasons.

C files with a mex interface used to be called .dll files are now called .mexw32. Matlab 7 only uses .mexw32 file and will helpfully rename your .dll files to .dll.old when you recompile your .c files rendering them useless under Matlab 6.

Some things have been improved since version 6:

Introduction of isdeployed function to indicate if you are running in Matlab or compiled environment. The following code can be used to handle this whilst retaining backwards run compatability to Matlab 6.

matComp1.PNG

Compiled code still can’t handle numerical inputs from command line.

Compiled code can now load .mat files in same way as native code i.e.

matComp2.PNG

The mex function has the following format:

mexInt.jpg

The Matlab mex documentation gives a full explanation of how to access arguments and return values. The mex function allows multiple values to be returned from a C-function. However, the function arguments are passed by address and can therefore be changed within the C-code; these changes are reflected in the matlab environment.
An example shows a mex function called product, written in a file called product.c as:

matComp3.PNG

The mex call in matlab could be:

matComp4.PNG

The value of c will be 200, b will be 20 but a will now be 200 also. The example seems obvious but can be easily overlooked when dealing with more complex functions.

Other notes:

  • If the mex function creates static variables, these are created dynamically the first time the function is called and remain persistent for future calls.
  • The matlab clear function will clear only matlab variables from the workspace; clear all will properly clear any variables, such as static variables, allocated by the mex function.
  • The mex compiler error LINK : fatal error LNK1168: cannot open your_mexfunction.dll for writing is caused by static variables persisting after your_mexfunction has been called in the Matlab environment. The clear all command will remedy this.
  • The compiled mex function has a .dll extension under Matlab 6 (Matlab 7 uses .mexw32). If the function has been called, the DLL file cannot be deleted until clear your_mexfunction or clear all is used.
  • Creating a mex build script is difficult since matlab string variables cannot be referenced within the mex command. However, the mex command can be simplified by using the triple dot ‘…’ directive to separate the command. Particularly useful for very large build commands, e.g.

matComp5.PNG