Look-ahead bias ("future function") in stock_data_handle.py #21

Open
sysy007uuu opened this issue Dec 5, 2024 · 2 comments

Comments

@sysy007uuu

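Standardizing over the entire dataset introduces a mild look-ahead bias: adding new data changes the standardized values of the existing features.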

@weituo2002

Isn't this outright data leakage? Standardization uses the mean and standard deviation, so it peeks directly at all of the later data.
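
To make the point in the two comments above concrete, here is a minimal sketch in pure Python. It is not the code from `stock_data_handle.py`; the function names `zscore_full` and `zscore_expanding` and the sample prices are assumptions made up for illustration. It shows that full-sample z-scores of old bars shift when a new bar arrives, whereas an expanding-window z-score (one possible alternative, using only data up to each point) does not.

```python
from statistics import mean, pstdev


def zscore_full(values):
    """Standardize with the mean/std of the WHOLE series (uses future data)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def zscore_expanding(values, min_periods=2):
    """Standardize each point using only data up to and including that point."""
    out = []
    for i, v in enumerate(values):
        window = values[: i + 1]
        if len(window) < min_periods:
            out.append(None)  # not enough history yet
            continue
        m, s = mean(window), pstdev(window)
        out.append((v - m) / s if s > 0 else 0.0)
    return out


# Hypothetical closing prices; a fifth bar arrives later.
prices = [10.0, 11.0, 12.0, 11.0]
extended = prices + [20.0]

# Full-sample z-scores of the first four bars, before and after the new bar.
full_before = zscore_full(prices)
full_after = zscore_full(extended)[: len(prices)]

# Expanding z-scores of the first four bars, before and after the new bar.
exp_before = zscore_expanding(prices)
exp_after = zscore_expanding(extended)[: len(prices)]

print("full-sample changed:", full_before != full_after)
print("expanding changed:  ", exp_before != exp_after)
```

Fitting the mean and standard deviation on the training split only, then reusing them on later data, would be another common way to avoid the same problem.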

@Nitasurin

Nitasurin commented Dec 12, 2024

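Someone described the data processing in another issue as dividing the current value by the maximum of the OCHL values, and I recall verifying earlier that this is indeed what happens. With this approach, how much future information leaks depends on the historical data, which is counterintuitive: the more certain the model is that a price close to 1 does not appear in an instrument's past, the more certain it is that such a price lies in its future. Taken to the extreme, under this scheme you would only need to inspect each instrument's history, go long the ones whose prices are lower, and short the ones whose prices are higher.
How severe this is would need to be checked against the code, but on this point alone I think this thing has no value whatsoever.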
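
The max-scaling concern above can be sketched as follows. This assumes, as the comment describes, that each price is divided by the maximum over the full series; the repository code has not been checked here, and the names `scale_by_global_max`, `scale_by_running_max`, and the sample prices are hypothetical. The running-maximum variant is shown only as one leak-free alternative, not as the project's method.

```python
from itertools import accumulate


def scale_by_global_max(prices):
    """Divide every price by the maximum of the WHOLE series (uses future data)."""
    top = max(prices)
    return [p / top for p in prices]


def scale_by_running_max(prices):
    """Divide every price by the maximum seen up to and including that bar."""
    running = accumulate(prices, max)
    return [p / m for p, m in zip(prices, running)]


# Hypothetical prices; the all-time high (20.0) occurs at the fourth bar.
prices = [10.0, 12.0, 11.0, 20.0, 18.0]
history = 3  # the model has only seen the first three bars

global_scaled = scale_by_global_max(prices)[:history]
running_scaled = scale_by_running_max(prices)[:history]

# With global-max scaling, no visible value reaches 1.0, which by itself
# reveals that a higher price must occur later in the series.
future_high_implied = max(global_scaled) < 1.0

print("visible values, global max: ", global_scaled)
print("visible values, running max:", running_scaled)
print("future high implied:", future_high_implied)
```

With running-maximum scaling, the visible values are identical whether or not the later bars exist, so they carry no information about the future high.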


3 participants