How to avoid "overfitting" in programmed trading?


Many people run into overfitting when building a quantitative trading model. Overfitting is a concept from machine learning and statistics: a model performs very well on the data used to build and test it, but its results in practice fall short of expectations.

In many traditional machine learning applications the impact of overfitting is less severe, but financial data are time series with a great deal of noise, which makes the impact of overfitting far larger. We must therefore be rigorous about avoiding it when building models.

Causes of "Overfitting"

The design of a systematic trading system has two parts, and both can lead to overfitting.

The first part is forming a complete set of trading rules. There are generally two ways to do this: top-down and bottom-up. The top-down method derives rules from long-term observation and summarization of market conditions, and then builds quantitative trading strategies on those rules. This process requires a long accumulation of trading experience.


The bottom-up method starts from market data, conducts statistical analysis to identify market characteristics, and forms trading strategies.

When traders backtest a trading system on historical data, they often revise the trading rules based on the test results, or combine several rules, until the results look good. Each round of revision tunes the rules more closely to that particular history, which can easily lead to overfitting the market data.

At the same time, in the quantitative implementation of the trading system, parameters are generally used to describe the system. Designers will increase the number of parameters and optimize these parameters to find the best trading system.

If there are too many parameters, or the parameters are optimized excessively, the system often ends up fitting historical market conditions almost perfectly, while its future performance is greatly reduced.

How to Avoid "Overfitting"?

The goal of designing a trading system is to generate profits in the future real market conditions, rather than pursuing a beautiful historical test curve. An overfitted trading system is a "beautiful trap." How can we escape this trap? We believe that we can start from two aspects: the formation of trading rules and the development of the trading system.

Mathematical analysis of financial market data suggests that a price time series consists of two parts:

The first part is the deterministic term, from which certain patterns can be identified;

The second part is the random term, which has no identifiable pattern; any given phenomenon in it occurs only with some probability.

When we extract trading rules from historical market conditions, we need to analyze the logic behind the rules. A trading rule should reflect a genuine regularity of the market, that is, the deterministic part rather than the random part, and should have a sound rationale.

After traders form trading rules through various means, in the specific process of trading system design, the following issues should be noted:

First, increase the sample size of historical test data to avoid too few transactions.

Anyone who trades futures knows that if you backtest contract by contract, an inactive contract may produce very few trades in a year, and even after several years the total may not reach 100. A sample that small is close to meaningless and very prone to overfitting. This is especially true for arbitrage strategies, whose holding periods are longer still: with only a few trades a year, the results are even less reliable.
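To make the sample-size point concrete, here is a minimal sketch that computes a t-statistic for the average profit per trade. The function name and the profit figures are hypothetical, and the calculation assumes trades are roughly independent, which real trades often are not.

```python
import math
import statistics


def mean_profit_t_stat(profits):
    """Return the t-statistic of the mean per-trade profit against zero.

    Assumes trades are roughly independent (an illustrative simplification).
    """
    n = len(profits)
    if n < 2:
        raise ValueError("need at least two trades")
    mean = statistics.fmean(profits)
    stdev = statistics.stdev(profits)  # sample standard deviation
    if stdev == 0:
        raise ValueError("profits have zero variance")
    # Standard error of the mean shrinks with the square root of n.
    return mean / (stdev / math.sqrt(n))


# Hypothetical per-trade profits: the same win/loss pattern, repeated.
few_trades = [3.0, -2.0] * 5      # 10 trades
many_trades = [3.0, -2.0] * 150   # 300 trades

print(mean_profit_t_stat(few_trades))
print(mean_profit_t_stat(many_trades))
```

Both samples have the same average profit and the same spread, yet the statistic grows roughly with the square root of the number of trades. The 10-trade result is hard to distinguish from luck; the 300-trade result is much harder to explain away.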

So when we backtest a strategy, we should make sure it generates enough trades. As a rule of thumb, a backtest with more than 300 trades gives reasonable grounds for believing the strategy is effective.

Secondly, the test data should be divided into in-sample and out-of-sample data.

When designing the system, use the in-sample data, and then test the system with out-of-sample data. If the performance is greatly reduced, then this system is very likely to be overfitted.
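A minimal sketch of this split might look like the following. The function names, the 70/30 default, and the idea of a single "score" per backtest are assumptions made for illustration; the split is chronological so that the out-of-sample data always comes after the in-sample data.

```python
def split_in_out_of_sample(bars, in_sample_fraction=0.7):
    """Split time-ordered data chronologically, without shuffling."""
    if not 0 < in_sample_fraction < 1:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = int(len(bars) * in_sample_fraction)
    return bars[:cut], bars[cut:]


def performance_decay(in_sample_score, out_of_sample_score):
    """Fraction of in-sample performance lost on out-of-sample data."""
    if in_sample_score <= 0:
        raise ValueError("in-sample score must be positive")
    return 1 - out_of_sample_score / in_sample_score


# Hypothetical usage: 'bars' stands in for any time-ordered price history.
bars = list(range(8))
in_sample, out_of_sample = split_in_out_of_sample(bars, 0.75)
print(in_sample, out_of_sample)

# Hypothetical scores (for example, annualized return) from each segment.
print(performance_decay(in_sample_score=2.0, out_of_sample_score=0.5))
```

A decay close to 0 means the system held up on unseen data; a decay close to 1 (or above) is the warning sign described above. The out-of-sample segment only keeps its value if it is not used repeatedly to re-tune the system.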

Thirdly, the core parameters should not be too many.

A system with too many parameters has many degrees of freedom. Optimizing that many parameters will almost always produce a beautiful-looking system, but the reliability of such a system is questionable.
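One way to see the problem is to count how many parameter combinations an optimizer gets to try. The sketch below assumes a plain grid search and a hypothetical 20 candidate values per parameter.

```python
import math


def grid_size(values_per_parameter):
    """Number of combinations in a full grid search."""
    return math.prod(values_per_parameter)


# Hypothetical grids: 20 candidate values for each parameter.
print(grid_size([20, 20]))   # two parameters
print(grid_size([20] * 5))   # five parameters
```

Two parameters give 400 candidate systems; five give 3.2 million. The more candidates are tested against the same history, the more likely it is that the best one looks good by chance alone.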

Fourthly, when optimizing the parameters of the trading system, we need to examine the parameters near the optimal parameters.

If the system performs far worse with nearby parameter values than with the optimal value, the optimum is probably a result of overfitting. Mathematically this is called a singular solution, and it is unstable: if the characteristics of the market change even slightly, the optimal parameter may become the worst one.
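The check can be sketched for a single parameter as follows. The function name, the radius, and the scores are hypothetical; the scores are assumed to be backtest results listed in order of increasing parameter value.

```python
def neighborhood_ratio(scores, radius=1):
    """Compare the best score with the average of its neighbors.

    Returns (index of best score, neighbor average / best score).
    """
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    lo = max(0, best_idx - radius)
    hi = min(len(scores), best_idx + radius + 1)
    neighbors = [scores[i] for i in range(lo, hi) if i != best_idx]
    if not neighbors or scores[best_idx] <= 0:
        raise ValueError("need neighbors and a positive best score")
    return best_idx, (sum(neighbors) / len(neighbors)) / scores[best_idx]


# Hypothetical backtest scores across five adjacent parameter values.
spike = [1.0, 1.1, 5.0, 0.9, 1.0]      # isolated peak
plateau = [1.0, 3.8, 4.0, 3.9, 1.2]    # broad, stable region

print(neighborhood_ratio(spike))
print(neighborhood_ratio(plateau))
```

A ratio near 1 means the optimum sits on a plateau and small changes in the parameter barely matter. A ratio far below 1, as in the spike example, marks the kind of isolated optimum described above.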

Fifth, ensure a sufficient average profit per trade.

Some strategies, after verification, turn out to have a large number of trades and good overall performance, but a very low average profit per trade. You might think a low average profit does not matter as long as the strategy makes money. However, besides profitability you also need to account for slippage. If the average profit per trade is too low, slippage can easily consume it, and a strategy that is steadily profitable in the backtest becomes a steadily losing strategy in live trading.
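The effect of slippage on a thin edge can be sketched as below. The function name, the cost model (slippage and commission paid once on entry and once on exit), and all figures are assumptions for illustration, expressed in the same units as the per-trade profit.

```python
def net_average_profit(gross_profits, slippage_per_side, commission_per_side=0.0):
    """Average profit per trade after round-trip trading costs."""
    if not gross_profits:
        raise ValueError("need at least one trade")
    avg_gross = sum(gross_profits) / len(gross_profits)
    # Costs are paid twice per trade: once on entry, once on exit.
    round_trip_cost = 2 * (slippage_per_side + commission_per_side)
    return avg_gross - round_trip_cost


# Hypothetical gross profits per trade, before any costs.
gross = [10, -5, 10, -5, 10]

print(net_average_profit(gross, slippage_per_side=1))
print(net_average_profit(gross, slippage_per_side=3))
```

The gross average here is 4 per trade. With 1 unit of slippage per side the strategy still nets 2 per trade; with 3 units per side it loses 2 per trade, even though the backtest without costs looked profitable.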
