Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series
ArXiv ID: 2311.13326
Authors: Unknown
Abstract
Curriculum learning and imitation learning have been leveraged extensively in the robotics domain. However, minimal research has been done on applying these ideas to control tasks over highly stochastic time-series data. Here, we theoretically and empirically explore these approaches in a representative control task over complex time-series data. We implement the fundamental ideas of curriculum learning via data augmentation, while imitation learning is implemented via policy distillation from an oracle. Our findings reveal that curriculum learning should be considered a novel direction for improving control-task performance over complex time-series. Our out-of-sample experiments across many random seeds, together with our ablation studies, are highly encouraging for curriculum learning in time-series control. These findings are especially encouraging because we tune all overlapping hyperparameters on the baseline, which gives the baseline an advantage. On the other hand, we find that imitation learning should be used with caution.
Keywords: Curriculum Learning, Imitation Learning, Reinforcement Learning, Time-Series Control, Policy Distillation
Complexity vs Empirical Score
- Math Complexity: 6.5/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper utilizes advanced mathematical frameworks including Markov Decision Processes, signal/noise decomposition, and reinforcement learning theory, while also featuring substantial empirical validation through out-of-sample testing, ablation studies, and hyperparameter tuning on multiple datasets.
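The abstract says curriculum learning is implemented via data augmentation, and the note above mentions a signal/noise decomposition. The sketch below shows one way those two ideas could fit together; it is an illustration under stated assumptions, not the paper's procedure. The trailing moving average used as the "signal", the `noise_levels` schedule, and the function names `moving_average` and `curriculum_series` are all hypothetical choices made for this example.

```python
import numpy as np

def moving_average(x, window):
    """Trailing moving average; early points average over the data seen so far."""
    x = np.asarray(x, dtype=float)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out = np.empty_like(x, dtype=float)
    for t in range(len(x)):
        start = max(0, t - window + 1)
        out[t] = (csum[t + 1] - csum[start]) / (t + 1 - start)
    return out

def curriculum_series(returns, noise_levels=(0.0, 0.5, 1.0), window=20):
    """Build progressively harder versions of a return series.

    Assumption (not from the source): the series is split into a smooth
    'signal' and a residual 'noise', and each curriculum stage adds back a
    larger fraction of the noise. The last stage (level 1.0) is the raw data.
    """
    returns = np.asarray(returns, dtype=float)
    signal = moving_average(returns, window)
    noise = returns - signal
    return [signal + level * noise for level in noise_levels]

# Synthetic returns, used only to exercise the functions.
rng = np.random.default_rng(0)
raw = rng.normal(0.0, 0.01, size=200)
stages = curriculum_series(raw)
```

Under this scheme an agent would be trained on `stages[0]` first and on the raw series last, so that the easiest stage contains only the smoothed component and the final stage matches the original data exactly.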
flowchart TD
A["Research Goal<br>Model-Free Control on Financial Time-Series<br>Using Curriculum & Imitation Learning"] --> B["Methodology<br>Data Augmentation (Curriculum)<br>Policy Distillation (Imitation)"]
B --> C["Input Data<br>Complex Financial Time-Series"]
C --> D["Computational Process<br>Model-Free Control via RL"]
D --> E["Comparative Analysis<br>Curriculum vs. Imitation vs. Baseline"]
E --> F["Key Findings<br>Curriculum Learning improves performance<br>Imitation Learning requires caution"]
F --> G["Outcome<br>Novel Direction for Time-Series Control"]
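The flowchart above lists policy distillation as the imitation-learning component. As a rough illustration of what distilling from an oracle can look like, the sketch below fits a small logistic "student" policy to the actions of a look-ahead "oracle". Everything specific here is an assumption for the example and is not taken from the paper: the oracle rule (go long when the next return is positive), the lagged-return features, the logistic student, and the names `oracle_actions`, `lagged_features`, and `distill`.

```python
import numpy as np

def oracle_actions(returns):
    """Hypothetical oracle: peeks at the NEXT return and picks long (1) or flat (0)."""
    returns = np.asarray(returns, dtype=float)
    return (returns[1:] > 0).astype(float)  # entry t uses the return at t + 1

def lagged_features(returns, n_lags=3):
    """Student's observation at time t: the last n_lags returns up to and including t."""
    returns = np.asarray(returns, dtype=float)
    rows = [returns[t - n_lags + 1 : t + 1]
            for t in range(n_lags - 1, len(returns) - 1)]
    return np.stack(rows)

def distill(features, targets, lr=0.5, steps=200):
    """Fit a logistic student to the oracle's actions by minimizing cross-entropy."""
    w = np.zeros(features.shape[1])
    b = 0.0
    eps = 1e-12  # guards the logarithms against log(0)
    losses = []
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        losses.append(-np.mean(targets * np.log(p + eps)
                               + (1.0 - targets) * np.log(1.0 - p + eps)))
        grad = p - targets  # gradient of the cross-entropy w.r.t. the logit
        w -= lr * features.T @ grad / len(targets)
        b -= lr * grad.mean()
    return w, b, losses

# Synthetic returns, used only to exercise the functions.
rng = np.random.default_rng(1)
raw = rng.normal(0.0, 0.01, size=300)
n_lags = 3
X = lagged_features(raw, n_lags)
y = oracle_actions(raw)[n_lags - 1:]  # align oracle actions with feature rows
w, b, losses = distill(X, y)
```

Note that the oracle here sees information the student never observes, so on data like this synthetic noise the student cannot be expected to match it closely; the example only shows the mechanics of the distillation loss.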