A Multi-Method Validation Framework for Large-Scale Multilingual Text Analytics

Jan 15, 2026

Stefano Blando

Domenica Fioredistella Iezzi

Abstract

To distinguish genuine findings from methodological artifacts, this paper proposes a validation framework based on method-invariant patterns. Analyzing 999,152 multilingual reviews across 18 independent techniques (from classical clustering to Transformers), we demonstrate that substantive content accounts for 95.4% of variance, while methodological choice explains less than 3%. The study confirms that robust patterns transcend specific algorithms and implementations. Furthermore, while BERT achieves peak accuracy (91.3%), classical approaches like SVM offer comparable performance (89.1%) with a 29-fold reduction in computational cost.
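The variance attribution described in the abstract can be illustrated with a two-way sum-of-squares decomposition. The sketch below is not the authors' procedure, which the abstract does not specify. It assumes a hypothetical table `scores[i][j]` holding one numeric result per content unit `i` and method `j`, and it uses synthetic toy data, so the shares it produces say nothing about the paper's figures.

```python
import random

def variance_shares(scores):
    """Split total variance of scores[i][j] into content, method, and residual shares.

    Rows are content units (hypothetical, e.g. review categories);
    columns are analysis methods. One observation per cell is assumed.
    """
    n_items, n_methods = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_items * n_methods)
    row_means = [sum(row) / n_methods for row in scores]
    col_means = [sum(row[j] for row in scores) / n_items for j in range(n_methods)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    if ss_total == 0:
        raise ValueError("scores are constant; variance shares are undefined")

    # Between-row variation is attributed to content, between-column to method.
    ss_content = n_methods * sum((m - grand) ** 2 for m in row_means)
    ss_method = n_items * sum((m - grand) ** 2 for m in col_means)
    ss_residual = ss_total - ss_content - ss_method

    return {
        "content": ss_content / ss_total,
        "method": ss_method / ss_total,
        "residual": ss_residual / ss_total,
    }

# Synthetic toy data (assumption): strong content effect, weak method effect.
random.seed(0)
content_effect = [random.gauss(0, 1.0) for _ in range(50)]
method_effect = [random.gauss(0, 0.1) for _ in range(18)]
scores = [[c + m + random.gauss(0, 0.1) for m in method_effect] for c in content_effect]

shares = variance_shares(scores)
for source, share in shares.items():
    print(f"{source}: {share:.1%}")
```

A method-invariant pattern, in this reading, is one where the content share dominates and the method share stays small regardless of which techniques fill the columns.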

Type

Preprint / Working Paper

Publication

JADT 2026, Palermo, Italy (in review)

Last updated on Jan 15, 2026

NLP, Methodology, BERT, Graph Neural Networks, Deep Learning, Validation Framework

Authors

Stefano Blando (he/him)

PhD Student in Artificial Intelligence

Stefano Blando is a PhD student in the National PhD Program in Artificial Intelligence at Scuola Superiore Sant’Anna and the University of Pisa. His research lies at the intersection of AI, agent-based modeling, and economics. He studies adaptive multi-agent systems, statistical verification of economic simulations, and robust quantitative methods for financial and socio-economic data.

Domenica Fioredistella Iezzi
