N2NQuant User's Manual

Factor Research Platform

Created by Unknown; last edited by n2nadm; viewed by 2848 users

Factor research

In financial investment, factor research is an important component of quantitative investing. It is a key means of studying and analyzing the returns and risk of financial assets such as stocks and bonds, in order to reveal the underlying factors that drive investment returns.

The core value of factor research lies in its ability to reveal variables that have a sustained impact on investment returns, such as market value, quality, momentum, low volatility, and returns. These factors have historically had a significant impact on investment returns, so understanding and applying them well is crucial for building quantitative investment strategies.

Through quantitative methods such as statistics and mathematical models, factor research helps investors better understand the returns and risk of assets, so that they can optimize their portfolios and balance risk against return. The results of factor research can also help investors anticipate market trends to a certain extent and make more systematic, rational investment decisions.

Factor research also supports multi-factor investment strategies. By studying how multiple factors affect investment returns, investors can construct portfolios that combine different factors to achieve better risk-adjusted returns.

Overall, the importance of factor research in quantitative investment is self-evident. By conducting in-depth research on various factors that affect investment returns, investors can better understand the market, develop effective investment strategies, and ultimately achieve their investment goals. Therefore, whether you are a beginner or an experienced investor, factor research should be an important component of your quantitative investment toolbox.

In-depth usage guide

Submitting factor analysis through the API

N2NQuant lets researchers analyze, output, and display data and charts according to their own factor analysis framework. The resulting data and charts are submitted to the front-end page for display through the factors.submit_factor interface.

In the aistudio development environment, use the following code to submit factors to the factor library:

import dai
import bigcharts
from bigalpha import factors
import uuid

# Data source table and column that hold the factor values
datasource_id = "cn_stock_factors_test"
datasource_column = "volume"

factors.submit_factor(
    id="test_factor_0001",
    performance_index={"sharpe": 0.25, "xxx": 1.24},  # "xxx" is a placeholder metric name
    performance_report="abcd",                        # placeholder report content
    metadata={},
    docs={},
    name="Factor submission example",
    desc="This is a simple test factor",
    datasource_id=datasource_id,
    datasource_column=datasource_column,
    space_id="",
)
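
The performance_index argument above is a flat dictionary of metric names and values. As a minimal sketch of how such a dictionary might be assembled from a daily return series before submission: the helper name build_performance_index, the key annual_return, and the 242-day year are assumptions for illustration, and the source does not confirm which keys the platform recognizes beyond the "sharpe" key shown above.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 242  # assumption: trading days per year


def build_performance_index(daily_returns: pd.Series) -> dict:
    """Summarize a daily return series as a flat dict of floats (hypothetical helper)."""
    mean = daily_returns.mean()
    std = daily_returns.std()  # sample standard deviation (ddof=1)
    sharpe = float(mean / std * np.sqrt(TRADING_DAYS)) if std > 0 else float("nan")
    return {
        "sharpe": round(sharpe, 4),
        "annual_return": round(float(mean * TRADING_DAYS), 4),  # hypothetical key name
    }


returns = pd.Series([0.01, -0.005, 0.002, 0.007, -0.003])
performance_index = build_performance_index(returns)
print(performance_index)
```

The resulting dictionary could then be passed as performance_index in the call above.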

My Factors

Factors submitted through the factors.submit_factor interface in the development environment can be viewed in "My Factors".

You can select one or more factors to submit. Researchers can submit them to My Factor Library or to a team factor library.

Create factor library

  • On the right side of "My Factor Library" there is a three-dot button. Click it to create a new factor library.

  • To create a new factor library, enter its name and description.

Factor library management

Click the three-dot button on the right side of a factor library to edit the library, manage its members, or delete it.

Edit factor library

Member Management

Click the "Add Member" button to add other researchers on the platform to the factor library.

Once added, members can see the factor libraries they have been granted access to on the platform's "Team Factor Library" page.

Delete factor library

Factor library listing

Click the "Add" button to upload the factor library to the specified directory.

Factor Dashboard

After listing, you can view the listed factor library in the corresponding directory.

Click the name of a factor library to view the factors submitted to it in detail.

Click on the factor name to view the detailed factor analysis for that factor.

Note: The logic and presentation output of factor analysis are entirely determined by the researcher's own code.

Factor library review

Click the "Factor Library Review" button on the left to open the review page. Click the "Review" button to approve a factor library that has applied for listing; this publishes it so that others on the platform can follow it. Click the "Remove" button to take a listed factor library down.

Factor Analysis in Practice: A Bollinger Bands Example

Principle of Factor Analysis:

Factor analysis is based on the idea of dimensionality reduction: it aggregates a large number of complex variables into a few independent common factors while losing as little of the original information as possible. These common factors capture the main information in the original variables, reduce the number of variables, and reflect the inherent relationships among them.

Uses of factor analysis:

Factor analysis usually serves three purposes: reducing dimensionality, calculating factor weights, and calculating a weighted composite score.

Factor dimensionality reduction: Reducing many observed variables to a few common factors makes data processing more efficient. For example, many questionnaire items can be reduced to a few common factors that describe users' impressions of and attitudes towards a product.

Calculating factor weights: Factor analysis converts multiple observed variables into a few common factors and assigns each a weight, which helps explain the relationships between the observed variables, for example when analyzing what drives stock prices.

Calculating a weighted composite score: Factor analysis converts multiple observed variables into a few common factors and uses the factor loadings to compute a weighted score, for example to evaluate the overall risk level of an enterprise.
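
The three uses above can be sketched with NumPy. This sketch swaps in principal component analysis on the correlation matrix as a simple stand-in for a full factor analysis model. The data are synthetic, and the choice of one component and variance-share weights is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: 200 observations of 4 observed variables driven by one common factor
common = rng.normal(size=(200, 1))
X = common @ np.array([[0.9, 0.8, 0.7, 0.6]]) + 0.3 * rng.normal(size=(200, 4))

# Standardize each variable, then decompose the correlation matrix
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)        # eigenvalues come back in ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 1                                          # 1) dimensionality reduction: keep k components
explained = eigvals[:k] / eigvals.sum()        # share of total variance per kept component
scores = Z @ eigvecs[:, :k]                    # component scores, shape (200, k)
weights = explained / explained.sum()          # 2) weights from the variance shares
composite = scores @ weights                   # 3) weighted composite score per observation

print(explained, composite[:3])
```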

Factor analysis process:

Step 1: Import the required libraries

import pandas as pd
import numpy as np
import warnings
import empyrical
import dai
import bigcharts
import time
warnings.filterwarnings('ignore')
print('Packages imported!')

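Step 2: Build the factor data. High-frequency factors are generally built with Python, while other factors are generally built with SQL. The example factor for this walkthrough is the Bollinger Bands lower band, boll_lower; the sample query below computes a 60-day moving average of amount (ma_amount_60) in its place, and factor_field must match whichever column the query returns.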

  • Factor data import (currently only a single factor is supported); see the DAI Data Platform for more factors
params = {'group_num':10, 'factor_field':'ma_amount_60', 'instruments':'全市场', 'factor_direction':1, 'benchmark':'中证500', 'data_process':True} # instruments options: 沪深300 (CSI 300), 中证500 (CSI 500), 中证1000 (CSI 1000), 全市场 (whole market); benchmark options: 沪深300, 中证500, 中证1000

sql = """
SELECT
    date, 
    instrument, 
    m_AVG(amount, 60) AS ma_amount_60
FROM
    cn_stock_bar1d
ORDER BY
    date, instrument;
"""

start_date = '2018-01-01'
end_date =  '2024-01-01'
factor_data = dai.query(sql, filters={"date": [start_date, end_date]}).df()
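  • Set parameters
# params: a dict.
# factor_field: name of the column that holds the factor in the table. Type: str
# instruments: instrument universe; options: 沪深300 (CSI 300), 中证500 (CSI 500), 中证1000 (CSI 1000), 全市场 (whole market). Type: str
# factor_direction: factor direction, 1 or -1; 1 means the direction is positive (larger factor values are better), -1 means the direction is negative (smaller factor values are better). Type: int
# benchmark: benchmark index for comparison; options: 沪深300, 中证500, 中证1000. Type: str
# data_process: whether to preprocess the data (winsorization, standardization, neutralization). Type: bool

# factor_field must name a column that exists in factor_data; the query above returns ma_amount_60
params = {'group_num':10, 'factor_field':'ma_amount_60', 'instruments':'全市场', 'factor_direction':1, 'benchmark':'中证500', 'data_process':True}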
  • Factor data preprocessing
# Drop rows with a missing factor value and keep only the required columns
factor_data.dropna(subset=[params['factor_field']], inplace=True)
factor_data = factor_data[['instrument', 'date', params['factor_field']]]
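
The data_process option is described as covering winsorization, standardization, and neutralization. How the platform implements these is not shown in this manual. The following is only a sketch of the first two steps applied per date, with quantile clipping at 1% and 99% as an assumed winsorization rule; the function name is hypothetical and neutralization is omitted.

```python
import pandas as pd


def winsorize_and_standardize(df: pd.DataFrame, field: str,
                              lower: float = 0.01, upper: float = 0.99) -> pd.DataFrame:
    """Clip the factor to per-date quantiles, then z-score it within each date."""
    out = df.copy()
    by_date = out.groupby("date")[field]
    lo = by_date.transform(lambda s: s.quantile(lower))   # per-date lower bound
    hi = by_date.transform(lambda s: s.quantile(upper))   # per-date upper bound
    out[field] = out[field].clip(lower=lo, upper=hi)
    by_date = out.groupby("date")[field]                  # regroup after clipping
    out[field] = (out[field] - by_date.transform("mean")) / by_date.transform("std")
    return out


demo = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02"] * 5 + ["2024-01-03"] * 5),
    "instrument": ["A.SH", "B.SH", "C.SZ", "D.SZ", "E.BJ"] * 2,
    "ma_amount_60": [1.0, 2.0, 3.0, 4.0, 100.0, 5.0, 6.0, 7.0, 8.0, 9.0],
})
processed = winsorize_and_standardize(demo, "ma_amount_60")
print(processed)
```

After processing, each date's cross-section has mean 0 and standard deviation 1, and the outlier of 100.0 on the first date has been pulled in before standardization.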
  • Data viewing
# factor_data: a pandas.DataFrame that must satisfy the following:
# instrument: str, stock code + .SH (Shanghai), .SZ (Shenzhen), or .BJ (Beijing Stock Exchange)
# date: datetime64
# factor: float64

factor_data
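
The format requirements above can be checked before calling the analysis module. The helper below is a hypothetical sketch, not part of the platform API; it only tests the three conditions listed in the comments.

```python
import pandas as pd


def validate_factor_data(df: pd.DataFrame, factor_field: str) -> list:
    """Return a list of problems; an empty list means the frame meets the stated format."""
    problems = [f"missing column: {col}"
                for col in ("instrument", "date", factor_field)
                if col not in df.columns]
    if problems:
        return problems
    # Instrument codes must carry an exchange suffix
    if not df["instrument"].astype(str).str.contains(r"\.(?:SH|SZ|BJ)$").all():
        problems.append("instrument must end with .SH, .SZ or .BJ")
    if not pd.api.types.is_datetime64_any_dtype(df["date"]):
        problems.append("date must be datetime64")
    if df[factor_field].dtype != "float64":
        problems.append(f"{factor_field} must be float64")
    return problems


good = pd.DataFrame({
    "instrument": ["600000.SH", "000001.SZ"],
    "date": pd.to_datetime(["2024-01-02", "2024-01-02"]),
    "ma_amount_60": [1.5, 2.5],
})
print(validate_factor_data(good, "ma_amount_60"))
```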

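Step 3: Call the factor analysis module

# AlphaMiner must already be available in the environment; its import is not shown in this manual
alpha_instance = AlphaMiner(params=params, factor_data=factor_data)
report_html = alpha_instance.render()  # render charts of the factor analysis results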

Step 4: Evaluate the factor analysis results

  • Parameter descriptions and evaluation indicators

portfolio

The long portfolio is the group with the largest factor values in the factor analysis, usually the first group.

The short portfolio is the group with the smallest factor values in the factor analysis, usually the last group.

The long-short portfolio is generally the long portfolio minus the short portfolio, that is, the group with the largest factor values minus the group with the smallest.
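
To make the three portfolios concrete, here is a sketch of per-date grouping on synthetic data. It is not AlphaMiner's implementation, which this manual does not show. The column name fwd_ret, the equal-weight group returns, and the ranking rule are all assumptions; group 1 holds the largest factor values, matching the description above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2024-01-02", periods=3, freq="B")
panel = pd.DataFrame({
    "date": dates.repeat(20),
    "instrument": [f"{i:06d}.SZ" for i in range(20)] * 3,
    "factor": rng.normal(size=60),
})
# Hypothetical next-period return, loosely tied to the factor for illustration
panel["fwd_ret"] = 0.01 * panel["factor"] + 0.02 * rng.normal(size=60)

group_num = 5
# Rank within each date (1 = largest factor value), then cut ranks into equal-sized groups
rank_desc = panel.groupby("date")["factor"].rank(method="first", ascending=False)
per_date_n = panel.groupby("date")["factor"].transform("count")
panel["group"] = np.ceil(rank_desc * group_num / per_date_n).astype(int)

# Equal-weight mean return of each group on each date
group_ret = panel.groupby(["date", "group"])["fwd_ret"].mean().unstack("group")
long_ret = group_ret[1]              # group with the largest factor values
short_ret = group_ret[group_num]     # group with the smallest factor values
long_short_ret = long_ret - short_ret
print(long_short_ret)
```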

ic: Information Coefficient

The information coefficient (IC) is the correlation between predicted and realized values, and is typically used to evaluate a factor's ability to predict stock returns.

IC ∈ [-1, 1]. The larger the absolute value, the stronger the predictive ability.

There are two ways to calculate it: normal IC and rank IC. Normal IC assumes that the data are normally distributed, which often does not hold in practice, so rank IC (the rank correlation coefficient) is more commonly used to judge a factor's effectiveness. The two correspond to the Pearson and Spearman correlation coefficients, respectively.
# Code example:
df['daily_ret'].corr(df['factor'], method='spearman')
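
In a backtest the IC is usually computed once per date across all instruments. The following sketch does that on synthetic data for both variants. Rank IC is computed as the Pearson correlation of the ranks, which is equivalent to the Spearman coefficient; the column names ic and rank_ic are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
dates = pd.date_range("2024-01-02", periods=4, freq="B")
df = pd.DataFrame({
    "date": dates.repeat(30),
    "factor": rng.normal(size=120),
})
df["daily_ret"] = 0.5 * df["factor"] + rng.normal(size=120)   # synthetic returns


def ic_for_one_date(g: pd.DataFrame) -> pd.Series:
    """Cross-sectional IC for a single date."""
    normal_ic = g["factor"].corr(g["daily_ret"])                # Pearson
    rank_ic = g["factor"].rank().corr(g["daily_ret"].rank())    # Spearman = Pearson on ranks
    return pd.Series({"ic": normal_ic, "rank_ic": rank_ic})


# One row per date, one column per IC variant
IC_data = df.groupby("date")[["factor", "daily_ret"]].apply(ic_for_one_date)
print(IC_data)
```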

ir: Information Ratio (factor)

The information ratio (IR) represents a factor's ability to deliver stable alpha.
The backtest period consists of multiple rebalancing periods, each of which yields its own IC value.
IR equals the mean of the ICs across rebalancing periods divided by the standard deviation of those ICs.
IR therefore accounts for both the factor's stock selection ability (the mean IC) and the stability of that ability (the reciprocal of the IC's standard deviation).

# Code example:
ic_mean = np.nanmean(IC_data['g_ic'])
ir = ic_mean / np.nanstd(IC_data['g_ic'])
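
A small worked example of the same calculation, using made-up IC values, shows how the NaN-aware functions skip periods without a valid IC. Note that np.nanstd uses the population standard deviation (ddof=0) by default.

```python
import numpy as np

# Hypothetical ICs from five rebalancing periods; NaN marks a period with no valid IC
ic_values = np.array([0.04, 0.06, np.nan, 0.02, 0.08])

ic_mean = np.nanmean(ic_values)   # mean of the four valid ICs
ic_std = np.nanstd(ic_values)     # population standard deviation (ddof=0)
ir = ic_mean / ic_std

print(round(ic_mean, 4), round(ic_std, 4), round(ir, 4))
```

Here the mean IC is 0.05 and the standard deviation is about 0.0224, so the IR is about 2.236.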

return_ratio

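# Code example:
# series is the daily return series
return_ratio = series.sum()  # total return (simple sum of daily returns)
annual_return_ratio = series.sum() * 242 / len(series)  # annualized return, using 242 trading days per year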
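
The code above adds daily returns together, which ignores compounding. As an illustration with made-up numbers, the sketch below compares that simple sum with the compounded total. The compounded variant is an alternative convention, not the one this manual uses.

```python
import pandas as pd

series = pd.Series([0.10, -0.10, 0.05])       # hypothetical daily returns

simple_total = series.sum()                   # the manual's definition: 0.05
compounded_total = (1 + series).prod() - 1    # 1.10 * 0.90 * 1.05 - 1 = 0.0395

annual_simple = simple_total * 242 / len(series)
annual_compounded = (1 + compounded_total) ** (242 / len(series)) - 1

print(simple_total, compounded_total)
```

The two totals are close when daily returns are small and diverge as returns grow larger or more volatile.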

ex_return_ratio

The excess return, also called the excess rate of return, measures the performance of an investment or asset relative to a benchmark, such as the market average or a specific index.

It is the difference between the investment return and the benchmark return, expressed as a percentage.
# Code example:
# series is the daily return series; bm_series is the benchmark's daily return series
ex_return_ratio = (series - bm_series).sum()  # total excess return
ex_annual_return_ratio = (series - bm_series).sum() * 242 / len(series - bm_series)  # annualized excess return
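
When both series are pandas Series, the subtraction aligns on the index, so dates present in only one series produce NaN. The sketch below, with made-up values, drops those dates before summing so that the annualization divides by the number of days actually compared; dropping them is an assumption about the intended handling.

```python
import pandas as pd

idx = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"])
series = pd.Series([0.010, -0.004, 0.006], index=idx)    # strategy daily returns
bm_series = pd.Series([0.005, -0.001], index=idx[:2])    # benchmark lacks the last day

excess = series - bm_series     # aligned on the index; the unmatched day becomes NaN
excess = excess.dropna()        # keep only days present in both series

ex_return_ratio = excess.sum()
ex_annual_return_ratio = ex_return_ratio * 242 / len(excess)
print(ex_return_ratio, ex_annual_return_ratio)
```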

sharp_ratio

The Sharpe ratio is the excess return earned per unit of total risk taken, so it accounts for both the return and the risk of a strategy.
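
As a minimal sketch of that definition, the function below computes an annualized Sharpe ratio directly with NumPy. The 3.5% annual risk-free rate and 242 periods per year mirror the values used in this manual's empyrical call; the use of the sample standard deviation is an assumption, and the result is not guaranteed to match any particular library.

```python
import numpy as np


def sharpe_ratio(daily_returns, annual_risk_free=0.035, periods_per_year=242):
    """Annualized Sharpe ratio: mean excess return divided by its standard deviation."""
    r = np.asarray(daily_returns, dtype=float)
    excess = r - annual_risk_free / periods_per_year   # per-period excess return
    std = excess.std(ddof=1)                           # sample standard deviation
    if np.isclose(std, 0.0):
        return float("nan")                            # undefined when returns do not vary
    return float(excess.mean() / std * np.sqrt(periods_per_year))


print(sharpe_ratio([0.01, -0.005, 0.002, 0.007, -0.003]))
```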

return_volatility

There are several ways to calculate return volatility; common ones include the standard deviation method, the mean absolute deviation method, and the GARCH model.
The standard deviation method works as follows:
a. Calculate the rate of return for each period, which is the change from the previous period's price to the current period's price, divided by the previous period's price.
b. Calculate the mean of the return series, which is the average of all returns.
c. Calculate the square of the difference between each period's rate of return and the mean.
d. Calculate the average of the squared differences.
e. Take the square root of the average to obtain the return volatility.
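
The steps above can be sketched in plain Python as follows. The prices are made up, and the final annualization by the square root of 242 is an added assumption that follows the 242-day year used elsewhere in this manual. Step d divides by the number of returns, which gives the population variance; some libraries divide by one less.

```python
import math

prices = [100.0, 102.0, 101.0, 103.0, 104.0]   # hypothetical closing prices

# a. per-period simple returns: (current - previous) / previous
returns = [(cur - prev) / prev for prev, cur in zip(prices[:-1], prices[1:])]
# b. mean return
mean_ret = sum(returns) / len(returns)
# c. squared deviations from the mean
sq_dev = [(r - mean_ret) ** 2 for r in returns]
# d. average squared deviation (population variance)
variance = sum(sq_dev) / len(sq_dev)
# e. per-period volatility, then annualized with 242 periods per year
volatility = math.sqrt(variance)
annual_volatility = volatility * math.sqrt(242)

print(volatility, annual_volatility)
```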

information_ratio

The mean of the return series divided by its standard deviation, which measures the return earned per unit of volatility.

max_drawdown

Describes the worst-case scenario of the strategy: the largest peak-to-trough loss over the period.
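
A minimal sketch of the maximum drawdown calculation on made-up daily returns is given below. It compounds the returns into a wealth curve and treats the starting value of 1 as the first peak; both are assumptions about convention.

```python
import pandas as pd

series = pd.Series([0.10, -0.20, 0.05, -0.10, 0.30])   # hypothetical daily returns

wealth = (1 + series).cumprod()                    # value of 1 unit invested at the start
running_peak = wealth.cummax().clip(lower=1.0)     # highest value so far, including the start
drawdown = wealth / running_peak - 1               # 0 at a peak, negative below it
max_drawdown = drawdown.min()

print(max_drawdown)
```

In this example the curve peaks at 1.10 after the first day and falls to 0.8316 on the fourth, so the maximum drawdown is 0.8316 / 1.10 - 1 = -0.244.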

win_percent

The proportion of profitable periods; in the code below it is the share of days with a positive return.
# Code example:
# series is the daily return series
sharp_ratio = empyrical.sharpe_ratio(series, 0.035/242)  # second argument is the per-period risk-free rate (3.5% a year over 242 days)
return_volatility = empyrical.annual_volatility(series)
max_drawdown = empyrical.max_drawdown(series)
information_ratio = series.mean() / series.std()
win_percent = len(series[series > 0]) / len(series)
  • Overall performance indicators

  • Annual performance indicators (long portfolio)

  • Cumulative return curves for the factor portfolios (including the long-short portfolio and the benchmark)

  • IC curve

  • Instruments with the highest/lowest factor values
