CNN-DRL for Scalable Actions in Finance

ArXiv ID: 2401.06179

Authors: Unknown

Abstract

Published MLP-based DRL agents in finance have difficulty learning the dynamics of the environment when the action scale increases. If buying and selling grow to one thousand shares per action, an MLP agent cannot adapt to the environment effectively. To address this, we design a CNN agent that concatenates the daily feature vectors from the last ninety days to form the CNN input matrix. Our extensive experiments demonstrate that the MLP-based agent incurs a loss corresponding to the initial environment setup, while the designed CNN agent remains stable, learns the environment effectively, and increases its reward.
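
The summary does not give the exact preprocessing or network sizes, so the following is a minimal sketch under assumptions: each trading day yields a feature vector of assumed length 16, the last 90 days are stacked into a 90-column input matrix for a small 1-D CNN, and the MLP baseline flattens the same window. All layer widths, the feature count, and the three-way action head are illustrative, not the authors' implementation.

```python
# Illustrative sketch only: window length (90 days), feature count, layer
# sizes, and the 3-action head are assumptions, not the paper's exact setup.
import torch
import torch.nn as nn

WINDOW, N_FEATURES, N_ACTIONS = 90, 16, 3  # assumed dimensions


def make_window(daily_features: torch.Tensor) -> torch.Tensor:
    """Stack the last 90 daily feature vectors into a (N_FEATURES, WINDOW) matrix."""
    # daily_features: (T, N_FEATURES) with T >= WINDOW; channels-first for Conv1d
    return daily_features[-WINDOW:].T


class CNNAgent(nn.Module):
    """Small 1-D CNN over the 90-day window (hypothetical layer sizes)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(N_FEATURES, 32, kernel_size=5), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, x):  # x: (batch, N_FEATURES, WINDOW)
        return self.net(x)


class MLPAgent(nn.Module):
    """MLP baseline that flattens the same window (hypothetical sizes)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(N_FEATURES * WINDOW, 256), nn.ReLU(),
            nn.Linear(256, N_ACTIONS),
        )

    def forward(self, x):  # x: (batch, N_FEATURES, WINDOW)
        return self.net(x)
```

The design intent described in the abstract is that the convolution sees the 90-day window as a structured matrix, whereas the MLP sees only a flat vector, which is where the reported stability gap appears.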

Keywords: Deep Reinforcement Learning, CNN vs MLP, Market Making, Action Scaling, Time Series Analysis, Equities

Complexity vs Empirical Score

  • Math Complexity: 3.0/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Street Traders
  • Why: The paper relies on standard deep learning architectures (MLP/CNN) and basic RL formulations (MDP) with minimal complex derivations, but it details a full trading environment, specific datasets, and reports performance metrics, indicating a high degree of empirical testing.
Pipeline (Mermaid flowchart):

flowchart TD
    A["Research Goal:<br>Address Action Scaling<br>in Finance DRL"] --> B["Input Data:<br>90-day Market Features"]
    B --> C["Methodology:<br>CNN vs MLP Agents"]
    C --> D["Computation:<br>Deep Reinforcement Learning<br>with Scaled Actions"]
    D --> E{"Comparison"}
    E --> F["Outcome: MLP Agent<br>Loss & Instability"]
    E --> G["Outcome: CNN Agent<br>Stability & Reward Increase"]
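
The summary does not specify how "scaled actions" are encoded. One plausible reading, assumed here, is that a discrete sell/hold/buy action is multiplied by a share scale (up to the one thousand shares mentioned in the abstract), so each step moves a much larger position than unit trades do. The mapping below is hypothetical.

```python
# Hypothetical action-scaling sketch: maps a discrete action to a share order.
# The 1,000-share figure comes from the abstract; the encoding is assumed.
def action_to_shares(action: int, scale: int = 1_000) -> int:
    """0 = sell, 1 = hold, 2 = buy; returns a signed share quantity."""
    return (action - 1) * scale


# With scale=1_000 a single step moves 1,000 shares, so per-step reward swings
# are roughly 1,000x larger than with unit trades, which is the regime where
# the MLP agent reportedly fails to adapt.
assert action_to_shares(2) == 1_000 and action_to_shares(0) == -1_000
```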