AlphaFold for Materials: MPA Achieves SOTA on 40 Industrial Tasks, a Major Breakthrough for AI4S

Deep Principle's new materials foundation model MPA (Materials Property Axiom) leverages LLM-inspired three-stage training to achieve state-of-the-art results on 40 real-world industrial datasets. By incorporating physics-guided alignment during mid-training and a hybrid readout head, MPA excels at predicting properties of unseen structures, marking a significant advance in AI for science.

Source: 量子位 | Author: 思邈

A new materials foundation model called MPA (Materials Property Axiom), developed by Deep Principle, is drawing attention in the AI for Science community. Inspired by the training methodology of large language models, MPA reports state-of-the-art results on 40 real-world industrial property prediction tasks, with the aim of narrowing the gap between theoretical computation and experimental measurement.

Traditional materials AI models often struggle when moving from simulated data to real-world experiments. The main issue lies in the training approach: most models are pre-trained on clean, idealized computed data and then fine-tuned on specific tasks, so they lack the "physical intuition" needed to generalize to noisy, scarce experimental data. MPA addresses this with a three-stage training pipeline borrowed from LLMs: pre-training, mid-training, and fine-tuning.

The key innovation is the mid-training stage, which focuses on physics-guided alignment. Where conventional pipelines go straight from pre-training to fine-tuning, MPA uses large volumes of first-principles calculation data to align the model with fundamental physical properties such as formation enthalpy and dipole moment. This helps the model learn transferable concepts, such as "molecules with OH groups tend to have larger dipole moments," rather than just memorizing atomic arrangements.

Another novel component is the Hybrid Readout head, designed to handle two distinct types of molecular properties: those that are size-independent (e.g., boiling point, bioactivity) and those that are size-dependent (e.g., formation enthalpy, heat capacity). The hybrid head combines an attention-based pooling path (for global properties) with an atom-wise summation path (for additive properties), controlled by a trainable parameter α. This allows the model to automatically decide which approach to favor for each property.
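To make the two readout paths concrete, here is a minimal Python sketch. It is not Deep Principle's implementation: the function names, the sigmoid that keeps α between 0 and 1, and the convex blend of the two paths are all assumptions for illustration, since the article only says the paths are "controlled by a trainable parameter α". In a real model the per-atom values and attention scores would come from the network; here they are plain lists.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over per-atom attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Stable logistic function, used here to squash alpha into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hybrid_readout(
    atom_values: Sequence[float],
    atom_scores: Sequence[float],
    alpha_logit: float,
) -> float:
    """Blend an attention-pooled estimate with an atom-wise sum.

    Assumption: alpha = sigmoid(alpha_logit) weights the pooled path and
    (1 - alpha) weights the summed path. The source does not specify this.
    """
    weights = softmax(atom_scores)
    # Attention pooling: a weighted average, so it does not grow with atom count.
    pooled = sum(w * v for w, v in zip(weights, atom_values))
    # Atom-wise summation: grows with atom count, as additive properties do.
    summed = sum(atom_values)
    alpha = sigmoid(alpha_logit)
    return alpha * pooled + (1.0 - alpha) * summed
```

With equal attention scores the pooled path reduces to a plain mean, so repeating the same atoms twice leaves it unchanged while the summed path doubles. That is the size-independent versus size-dependent distinction described above, and training `alpha_logit` per property would let each task lean toward whichever path fits it.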

Reported experiments favor MPA. Against a baseline without mid-training or hybrid readout, MPA reduces average error by 14.0% on random splits and 14.6% on scaffold splits. Scaffold splits, which hold out molecules whose core scaffolds do not appear in the training set, are the more challenging and realistic of the two. Compared with five other leading models (ChemBERTa, ChemProp, Chemeleon, Uni-Mol2, Suiren), MPA achieves the best overall performance, winning 35 out of 40 tasks on scaffold splits.
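For readers unfamiliar with scaffold splits, the sketch below shows the general idea in plain Python. It is an illustration of a common convention, not the protocol used in the MPA evaluation, which the article does not describe. The function name `scaffold_split` and the `test_fraction` parameter are hypothetical, and the scaffold key for each molecule is assumed to be precomputed (for example a Bemis-Murcko scaffold string from a cheminformatics toolkit).

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def scaffold_split(
    scaffolds: Sequence[Hashable],
    test_fraction: float = 0.2,
) -> Tuple[List[int], List[int]]:
    """Split sample indices so that no scaffold appears in both sets.

    Assumption: scaffolds[i] is a precomputed scaffold key for sample i.
    Larger scaffold groups are placed in the training set first, so the
    test set tends to contain rarer scaffolds.
    """
    groups: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)

    # Largest groups first; ties broken by first occurrence for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cap = n - round(test_fraction * n)

    train: List[int] = []
    test: List[int] = []
    for group in ordered:
        # A whole group goes to one side, never split across both.
        if len(train) + len(group) <= train_cap:
            train.extend(group)
        else:
            test.extend(group)
    return train, test
```

Because whole scaffold groups are assigned to one side, every test molecule has a core structure the model never saw during training. A random split gives no such guarantee, which is why scaffold-split scores are usually read as the better indicator of performance on novel structures.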

These results suggest that MPA's strength lies in predicting properties of novel structures, which is exactly the scenario scientists face when designing new materials. By reframing the problem from "task adaptation" to "physics alignment," MPA provides a scalable framework that integrates first-principles data, experimental data, and task-specific fine-tuning. As computational and experimental data continue to grow, MPA offers a way to turn these resources into reusable predictive capabilities, moving beyond isolated single-purpose models toward a more general materials foundation model.

MPA has been integrated into Deep Principle's Agent product as a skill. Interested users can try it at sciclaw.cn (invitation code: CN-SUL0WEAB). For more details, see the MPA blog and technical report.