HYPE-MAD: high-yield Python environment for molecular automation and data generation
Peivaste I., Makradi A., Mercuri F., Belouettar S.
Acta Mechanica, art. no. 150901, 2025
The exploration and discovery of new materials, particularly organic molecules and polymers, require efficient methods to navigate vast chemical spaces. Traditional experimental approaches, based on trial and error, are inadequate for this task due to their high resource demands. Computational methods, such as those based on quantum mechanical simulations, offer a practical alternative but are often hindered by the extensive manual effort and computational resources required when applied to large datasets. To address these challenges, we present a Python-based framework that automates the generation of atomic-scale models from molecular representations, optimizes their geometries, performs quantum mechanical calculations using a tight-binding approximation, and systematically organizes the computed properties for further analysis. By integrating widely used tools for molecular modeling and optimization, the framework enables high-throughput data generation, facilitating the rapid screening of chemical spaces. This capability is particularly important for renewable energy applications, such as hydrogen production through water splitting, where identifying efficient materials is critical. Furthermore, the framework supports the development of data-driven methods, providing large, high-quality datasets for machine learning models, and thus accelerates the materials discovery process.
doi:10.1007/s00707-025-04317-6
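
The four steps named in the abstract (model generation, geometry optimization, tight-binding calculation, organization of results) could be chained roughly as in the sketch below. This is not the authors' code: the names `Record`, `run_pipeline`, `to_csv`, and the three stage functions are hypothetical, the use of SMILES strings as the input representation is an assumption, and the stages are placeholders that only mark where a molecular modeling toolkit and a tight-binding code would be called. The sketch uses the Python standard library only and shows one possible way to keep a failed molecule from stopping a high-throughput run.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Record:
    """One molecule moving through the pipeline (hypothetical container)."""
    smiles: str                      # assumed input representation
    geometry: object = None          # would hold atomic coordinates
    properties: Dict[str, float] = field(default_factory=dict)
    error: str = ""                  # non-empty if a stage failed


Stage = Callable[[Record], Record]


def run_pipeline(smiles_list: Iterable[str], stages: List[Stage]) -> List[Record]:
    """Apply each stage in order; record the failure and move on if one raises."""
    records = []
    for smi in smiles_list:
        rec = Record(smiles=smi)
        for stage in stages:
            try:
                rec = stage(rec)
            except Exception as exc:
                rec.error = f"{stage.__name__}: {exc}"
                break
        records.append(rec)
    return records


def to_csv(records: List[Record]) -> str:
    """Flatten the records into a CSV table, one row per molecule."""
    keys = sorted({k for r in records for k in r.properties})
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["smiles", *keys, "error"])
    for r in records:
        writer.writerow([r.smiles, *(r.properties.get(k, "") for k in keys), r.error])
    return buf.getvalue()


# Placeholder stages: each marks where an external tool would be invoked.
def build_3d(rec: Record) -> Record:
    if not rec.smiles:
        raise ValueError("empty SMILES")
    rec.geometry = f"<3D model of {rec.smiles}>"  # stand-in, not real coordinates
    return rec


def optimize_geometry(rec: Record) -> Record:
    return rec  # a force-field or tight-binding relaxation would go here


def tight_binding(rec: Record) -> Record:
    # Dummy value so the table has a column; not a physical property.
    rec.properties["dummy_descriptor"] = float(len(rec.smiles))
    return rec


if __name__ == "__main__":
    results = run_pipeline(["CCO", "c1ccccc1", ""],
                           [build_3d, optimize_geometry, tight_binding])
    print(to_csv(results))
```

Keeping each stage as a plain function on a shared record makes it straightforward to swap a placeholder for a real toolkit call, and the `error` column preserves which molecules failed and at which step.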