Accounting statement analysis at industry level. A gentle introduction to the compositional approach
ArXiv ID: 2305.16842
Authors: Unknown
Abstract
Compositional data are contemporarily defined as positive vectors, the ratios among whose elements are of interest to the researcher. Financial statement analysis by means of accounting ratios a.k.a. financial ratios fulfils this definition to the letter. Compositional data analysis solves the major problems in statistical analysis of standard financial ratios at industry level, such as skewness, non-normality, non-linearity, outliers, and dependence of the results on the choice of which accounting figure goes to the numerator and to the denominator of the ratio. Despite this, compositional applications to financial statement analysis are still rare. In this article, we present some transformations within compositional data analysis that are particularly useful for financial statement analysis. We show how to compute industry or sub-industry means of standard financial ratios from a compositional perspective by means of geometric means. We show how to visualise firms in an industry with a compositional principal-component-analysis biplot; how to classify them into homogeneous financial performance profiles with compositional cluster analysis; and how to introduce financial ratios as variables in a statistical model, for instance to relate financial performance and firm characteristics with compositional regression models. We show an application to the accounting statements of Spanish wineries using the decomposition of return on equity by means of DuPont analysis, and a step-by-step tutorial to the compositional freeware CoDaPack.
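The abstract's point about industry means can be illustrated with a small sketch. It uses three hypothetical firms (the figures are invented for illustration and are not from the Spanish winery data) and compares the arithmetic and geometric means of a ratio and of its inverse. Only the geometric mean gives the same answer whichever accounting figure is placed in the numerator.

```python
import math
from statistics import geometric_mean, mean  # geometric_mean needs Python 3.8+

# Hypothetical figures for three firms (illustrative only, not the paper's data).
revenue = [120.0, 80.0, 400.0]
assets = [100.0, 160.0, 100.0]


def firm_ratios(numerators, denominators):
    """Firm-level ratios numerator / denominator."""
    return [n / d for n, d in zip(numerators, denominators)]


turnover = firm_ratios(revenue, assets)  # revenue / assets
inverse = firm_ratios(assets, revenue)   # assets / revenue

# Geometric industry means: the mean of the inverse ratio is exactly
# the inverse of the mean, so the choice of numerator does not matter.
turnover_gm = geometric_mean(turnover)
inverse_gm = geometric_mean(inverse)

# Arithmetic industry means: the two averages are not reciprocal.
turnover_am = mean(turnover)
inverse_am = mean(inverse)

# The geometric mean of the ratios equals the ratio of the geometric means
# of the underlying accounting figures.
ratio_of_gms = geometric_mean(revenue) / geometric_mean(assets)

print(f"geometric:  {turnover_gm:.4f} x {inverse_gm:.4f} = {turnover_gm * inverse_gm:.4f}")
print(f"arithmetic: {turnover_am:.4f} x {inverse_am:.4f} = {turnover_am * inverse_am:.4f}")
```

The last identity is what allows an industry mean of a ratio to be computed directly from the geometric means of the accounting figures themselves.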
Keywords: Compositional Data Analysis, Financial Ratios, Accounting Ratios, DuPont Analysis, Financial Statement Analysis, Corporate Equity (Equities)
Complexity vs Empirical Score
- Math Complexity: 4.0/10
- Empirical Rigor: 2.5/10
- Quadrant: Philosophers
- Why: The paper introduces compositional data analysis (CoDa) methods such as log-ratios and geometric means, which are mathematically moderate and presented pedagogically as a ‘gentle introduction’ without heavy derivations. Empirical rigor is low because the paper focuses on conceptual application and a tutorial with existing software (CoDaPack) rather than on extensive backtesting, real-time data processing, or performance metrics on live financial datasets.
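To make the log-ratio idea concrete, here is a minimal sketch of the centred log-ratio (clr) transformation, one standard CoDa transformation that combines logarithms with the geometric mean. It is offered as an illustration only: the paper may rely on other log-ratio transformations, and the function name `clr` and the firm figures below are assumptions made for this example.

```python
import math


def clr(parts):
    """Centred log-ratio: log of each part over the geometric mean of all parts."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr needs strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Hypothetical firm: revenue, assets, equity (illustrative values only).
firm = [120.0, 100.0, 60.0]
coords = clr(firm)

# Rescaling every figure (e.g. changing currency units) leaves clr unchanged,
# because only the ratios among the parts carry information.
rescaled = clr([10.0 * p for p in firm])

# A difference of two clr coordinates is the log of the corresponding ratio.
log_turnover = coords[0] - coords[1]  # log(revenue / assets)

print([round(c, 4) for c in coords])
```

The clr coordinates sum to zero by construction, which is worth keeping in mind before passing them to methods that assume a full-rank covariance matrix.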
flowchart TD
A["Research Goal<br>Accounting Statement Analysis<br>at Industry Level"] --> B["Data Preparation<br>Spanish Wineries Financial Statements"]
B --> C["Core Methodology<br>Compositional Data Analysis CoDa"]
C --> D["Computational Processes<br>1. Geometric Means<br>2. Compositional PCA Biplot<br>3. Cluster Analysis<br>4. Compositional Regression<br>5. DuPont ROE Decomposition"]
D --> E["Key Outcomes<br>Industry Averages<br>Visual Classification<br>Homogeneous Profiles<br>Performance Drivers<br>Tutorial via CoDaPack"]
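The DuPont decomposition of return on equity mentioned in the abstract can be sketched as the textbook three-factor identity ROE = margin × turnover × leverage. The helper name `dupont` and the figures are hypothetical, and this sketch does not reproduce the specific compositional parts the paper builds from the winery statements.

```python
import math


def dupont(net_income, revenue, assets, equity):
    """Return the three DuPont factors: margin, turnover, leverage."""
    margin = net_income / revenue  # profit per unit of revenue
    turnover = revenue / assets    # revenue generated per unit of assets
    leverage = assets / equity     # assets financed per unit of equity
    return margin, turnover, leverage


# Hypothetical firm (illustrative values only).
net_income, revenue, assets, equity = 12.0, 120.0, 100.0, 60.0
margin, turnover, leverage = dupont(net_income, revenue, assets, equity)

# The product of the three factors recovers return on equity.
roe = margin * turnover * leverage

# In logs the decomposition is additive, which is what makes log-ratios a
# natural scale for it. This step assumes positive net income; loss-making
# firms would need separate handling.
log_roe = math.log(margin) + math.log(turnover) + math.log(leverage)

print(f"ROE = {roe:.4f}")
```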