Interpretable physics-informed machine learning approaches to accelerate electrocatalyst development
Abstract
Identifying exceptional electrocatalysts from the vast materials space remains a formidable challenge. Machine learning (ML) has emerged as a powerful tool to address this challenge, offering high efficiency while maintaining good predictive accuracy. In this perspective, we provide a brief overview of recent advancements in ML for electrocatalyst discovery. We emphasize the applications of physics-informed ML (PIML) models and explainable artificial intelligence (XAI) to electrocatalyst development, through which valuable physical and chemical insights can be distilled. Additionally, we delve into the challenges faced by PIML approaches, explore future directions, and discuss potential breakthroughs that could revolutionize the field of electrocatalyst development.
INTRODUCTION
Excessive fossil fuel combustion since the Industrial Revolution has led to severe environmental problems such as global warming, extreme weather, and environmental pollution. In response, there is a concerted effort to develop green energy solutions to reduce our dependence on fossil fuels. Electrochemical reactions, including but not limited to hydrogen evolution reaction (HER), oxygen evolution reaction (OER), oxygen reduction reaction (ORR), carbon dioxide reduction reaction (CO2RR), and nitrogen reduction reaction (NRR), hold significant promise for storing and utilizing essential intermittent renewable energies in the near future[1,2]. Electrocatalysts are essential in increasing the reaction rate of the electrochemical reactions, thereby improving the conversion and utilization efficiency of renewable energies. However, the commonly used electrocatalysts are usually based on precious metals such as platinum, which are both expensive and scarce. In addition, the existing electrocatalysts also suffer from issues such as limited efficiency, durability, and scalability[3]. Therefore, discovering electrocatalysts that can solve the above issues is fundamental to the development of high-performance electrochemical energy conversion/storage devices, including fuel cells, batteries, and supercapacitors[4].
Over the past few decades, electrocatalyst discovery has been greatly accelerated by the rapid advancement of theoretical approaches and experimental capabilities[5]. For instance, density-functional theory (DFT) calculations have been widely used to predict various critical properties of electrocatalysts, such as formation energies, adsorption energies and d-band centers[6]. Based on thermodynamic and kinetic calculations, free energy diagrams and micro-kinetic models can be constructed, which are essential for identifying catalysts with improved activity and selectivity[7,8]. Furthermore, high-throughput calculations have been applied to navigate vast materials and configuration spaces[9-12]. Semi-automated experiments have also been developed to accelerate the development of electrocatalysts. Although high-throughput computational and experimental methods can significantly reduce development time and cost compared with the traditional trial-and-error approach, the vastness of the search space remains a grand challenge for these techniques to explore efficiently. In this regard, machine learning (ML) has emerged as an indispensable tool for the efficient discovery of electrocatalysts[13-16]. By leveraging large computational or experimental datasets amassed from high-throughput methods, ML has been used to predict the key properties of electrocatalysts[17-22] [Figure 1]. Significant progress has been made in this area, with ML models achieving remarkable results. However, most of these models require large and complex neural networks, which come with high computational costs and a lack of physical and chemical interpretability. As a result, substantial efforts have been devoted to incorporating physical or chemical insights into ML models, leading to physics-informed ML (PIML) models and explainable artificial intelligence (XAI) methods [Figure 1].
These approaches not only aim to improve the accuracy and efficiency of predictions, but also provide valuable interpretability, making it easier to understand the results generated by ML models and eventually leading to effective and efficient design and development of functional materials. This area has seen significant advancements in recent years, highlighting the need for a comprehensive review to summarize the state-of-the-art findings.
Figure 1. Schematic diagram of ML-driven closed-loop catalyst discovery consisting of collecting data, building datasets, training ML models, and predicting materials’ properties to accelerate materials optimization. XAI and PIML approaches enable the interpretation of physical and chemical insights from the “black box” ML models. Reproduced with permission from refs[74,79,81,93]. Copyright 2021 Springer Nature, licensed under Creative Commons CC BY; copyright 2021 by the author(s), licensed under Creative Commons CC BY. ML: Machine learning; XAI: explainable artificial intelligence; PIML: physics-informed machine learning.
In this review, we summarize the recent advancement in ML for electrocatalysts, with a particular focus on the PIML models. We begin by discussing the progress made in the application of ML to electrocatalysts, providing an overview of fundamental concepts in this field. Next, we emphasize the necessity and advantages of integrating PIML models, highlighting their recent applications and significant contributions to the development of electrocatalysts. Finally, we conclude with a brief discussion of the challenges in the field, along with perspectives on future directions and potential breakthroughs in PIML for electrocatalysts.
PROGRESS OF ML FOR ELECTROCATALYSTS
Over the past decade, high-throughput first-principles calculations have been intensively deployed, and many large materials datasets have consequently been built, including Open Catalyst 2020 (OC20)[23], Open Catalyst 2022 (OC22)[24], the Materials Project (MP)[25], the Open Quantum Materials Database (OQMD)[26], and the Two-Dimensional Materials Encyclopedia (2DMatPedia)[27], to name a few. These high-quality datasets form the basis for training high-fidelity ML models. Furthermore, geometric representations such as the Coulomb matrix[28,29] and the Smooth Overlap of Atomic Positions (SOAP)[30-33] play an important role in capturing the global or local geometry of most material systems, from which descriptors such as electronegativity, d-band features, covalent radius and fingerprints[34] can be identified to evaluate electrochemical reactions. Thus, these geometric representations and descriptors are also useful for constructing ML models.
This section will briefly introduce the application of three classes of ML techniques to electrocatalyst development: conventional ML, deep learning (DL), and natural language processing. For each technique, we will start with the foundational theory of the models and then explore their recent applications in electrocatalysis.
ML
Linear regression (LR) is the simplest ML model. It assumes a linear relationship between the output (y) and the input descriptors (X1, X2 … Xn), written as:

y = w0 + w1X1 + w2X2 + … + wnXn + ϵ

where y is the predicted value of the model, w0 is the bias, wn is the regression coefficient of the n-th independent variable [n ∈ (1, 2, …, n)], and ϵ is the error of the model.
The Brønsted−Evans−Polanyi (BEP) principle and scaling relationships are two well-known empirical linear rules in catalysis. The BEP relation states that the activation barrier of a reaction (Ea) linearly scales with the corresponding reaction energy (ΔE), i.e., Ea = αΔE + β. Scaling relationships refer to the widely observed linearity in the binding energy between different adsorbates which are similarly bound to the catalysts, that is, ΔE1= mΔE2 + b. These linear relationships have been intensively applied to simplify and accelerate the catalyst design. However, they also reveal inherent limitations in performance optimization[35-37].
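As a concrete illustration, a BEP-type relation Ea = αΔE + β can be fitted by ordinary least squares. The (ΔE, Ea) pairs below are hypothetical stand-ins for DFT-computed values, chosen only to show the mechanics of the fit:

```python
import numpy as np

# Hypothetical (DeltaE, Ea) pairs in eV, mimicking a BEP-type trend;
# real values would come from DFT transition-state calculations.
delta_e = np.array([-1.2, -0.8, -0.3, 0.1, 0.5, 0.9])
ea = np.array([0.35, 0.52, 0.71, 0.88, 1.05, 1.21])

# Fit Ea = alpha * DeltaE + beta by ordinary least squares.
A = np.vstack([delta_e, np.ones_like(delta_e)]).T
(alpha, beta), *_ = np.linalg.lstsq(A, ea, rcond=None)

print(f"alpha = {alpha:.3f}, beta = {beta:.3f} eV")
```

The fitted slope α quantifies how strongly the barrier tracks the reaction energy; the same two-line fit applies to adsorbate scaling relations ΔE1 = mΔE2 + b.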
LR can be used to construct BEP relations and explore scaling relationships. For example, for high-entropy alloy (HEA) based electrocatalysts, LR has been used to predict *OH and *O adsorption energies on the HEA IrPdPtRhRu, suggesting a new HEA composition with better performance than pure Pt(111)[38]. In addition, LR has been deployed to discover the active sites responsible for ORR in nonplatinum porphyrin-based electrocatalysts, in which two types of active sites were identified: Co sites associated with the pyropolymer and Co particles covered by oxide layers[39]. Despite its simple formulation, LR's outstanding generalization and interpretability make it appealing and widely deployed in catalyst design.
Support vector machine (SVM) is another widely used ML model[40]. It was initially designed to find a hyperplane to separate samples from different groups, which is a useful solution to a classification task. By employing kernel functions, SVM is able to handle nonlinear relationships by mapping samples from a lower dimension to a higher dimension. Moreover, support vector regression (SVR) is capable of handling regression tasks by introducing tolerance margin ϵ into the SVM. Both SVM and SVR have wide applications in the electrochemical domain[41-45]. For example, Tamtaji et al. used SVR to predict the Gibbs free energies of various reaction intermediates on single-atom catalysts (SACs) supported by graphene and porphyrin[44]. Based on the trained SVR model, they reported that the most crucial factors in this system are the number of pyridinic nitrogen atoms, the number of d electrons, and the number of valence electrons of the reaction intermediate.
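The SVR workflow described above can be sketched with scikit-learn. The three descriptors and the target below are synthetic stand-ins for the DFT-derived quantities used in such studies (number of pyridinic N atoms, d-electron count, valence-electron count):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Hypothetical descriptors for single-atom sites:
# pyridinic-N count, d-electron count, valence electrons of the intermediate.
X = rng.uniform([0, 1, 1], [4, 10, 12], size=(60, 3))
# Synthetic target mimicking a nonlinear Gibbs free energy trend (eV).
y = 0.3 * X[:, 0] - 0.05 * X[:, 1] ** 1.5 + 0.02 * X[:, 2] + rng.normal(0, 0.02, 60)

# RBF kernel maps the descriptors to a higher dimension; epsilon sets the
# tolerance margin within which errors are ignored.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X, y)
print("train R^2:", model.score(X, y))
```

Feature scaling matters here: the RBF kernel is distance-based, so descriptors with large numeric ranges would otherwise dominate the kernel.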
Random forest (RF), regarded as one of the most successful ensemble ML models, employs the bagging algorithm. It uses decision trees as its base learners while introducing random feature selection in the training process. RF is also useful for predicting the properties of electrocatalysts[46-53]. In a recent investigation of double-atom catalysts, SVR, RF, extreme gradient boosting regression (XGBR), and an artificial neural network (ANN) were utilized to predict the Gibbs free energy change of hydrogen adsorption[49]. Among them, RF exhibited the best prediction performance owing to its ensemble nature and is considered an effective model because it allows automatic feature selection during training. Another well-known ensemble model is gradient boosting regression (GBR), with which improved performance has been reported in various cases, such as predicting the free energy of the N2 electroreduction reaction and selecting MXenes as HER catalysts[47,48,52,54,55]. While these ML models have shown promising performance for the development of electrocatalysts, they often suffer from issues such as strong dependence on the chosen features, limited generalizability, and difficulty extending to more complex prediction tasks.
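A minimal RF sketch, using synthetic data in place of catalyst descriptors, shows how the automatic feature ranking mentioned above falls out of training:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
# Synthetic dataset: feature 0 dominates the target, feature 2 is pure noise.
X = rng.uniform(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.05, 200)

# Bagging + random feature selection happen inside the ensemble of trees.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print("impurity-based importances:", rf.feature_importances_.round(3))
```

The importances recover the construction of the target (feature 0 > feature 1 > feature 2), which is the same mechanism that lets RF highlight relevant catalyst descriptors without manual selection.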
DL
Apart from the aforementioned ML models, DL has attracted increasing interest for catalyst development due to its capability to handle more complicated material systems and properties. Widely used DL architectures include convolutional neural networks (CNNs) and graph neural networks (GNNs). CNNs deal with data in Euclidean space, such as images and videos, and have shown promising performance in the exploration of electrocatalyst materials. For example, Yang et al. employed a CNN model to predict the adsorption energies of various adsorbates on 2D SACs based on their electronic density of states (DOS). They achieved a low mean absolute error (MAE) of 0.06 eV across various adsorbates such as CO2, COOH, CO, and CHO. Combining the CNN model with the volcano plot in the analysis of catalytic performance, the framework is useful for designing SACs as electrocatalysts for CO2RR[56]. On the other hand, GNNs are designed to handle non-Euclidean data, including social networks, knowledge graphs, and molecules/materials. Given the node, edge and global attributes, the information is transformed using “message passing” algorithms, which can be written as:
m_v^{t+1} = Σ_{w∈N(v)} M_t(h_v^t, h_w^t, e_{vw}),   h_v^{t+1} = U_t(h_v^t, m_v^{t+1})

where m is the message, h and e are the embeddings of nodes and edges, respectively, Mt denotes the message function, and Ut the vertex update function. Afterward, the readout layer aggregates the node, edge and global embeddings. Finally, the results can be predicted by adding fully connected layers to the embeddings corresponding to the task[57]. GNNs usually perform better than conventional ML models without the requirement of dedicated feature design. Specifically, simple and accessible features such as electronegativity, covalent radius, and group number[58] can be directly used as the input of GNNs. Recent applications in ML interatomic potentials (MLIPs) have illustrated that GNNs have better generalization and higher accuracy than other models[59-63]. Batatia et al. proposed an equivariant message passing GNN (MPNN) model called MACE[64,65], one of the most accurate MLIPs, as evidenced by its benchmark F1 score of 0.669 and coefficient of determination (R2) of 0.697, along with a low MAE.
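A bare-bones NumPy sketch of one message-passing step illustrates the update rule above; the linear message map and tanh update are hypothetical stand-ins for the learned functions M_t and U_t:

```python
import numpy as np

def message_passing_step(h, edges, W_msg, W_upd):
    """One MPNN step: sum messages from neighbors, then update each node.

    h:     (n_nodes, d) node embeddings
    edges: list of directed (src, dst) pairs
    """
    m = np.zeros_like(h)
    for src, dst in edges:          # message function M_t: linear map of the source embedding
        m[dst] += h[src] @ W_msg
    return np.tanh(h @ W_upd + m)   # update function U_t: combine old state and message

rng = np.random.default_rng(0)
d = 4
h = rng.normal(size=(3, d))         # toy 3-atom graph (e.g., a triatomic adsorbate)
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
W_msg, W_upd = rng.normal(size=(d, d)), rng.normal(size=(d, d))

h_new = message_passing_step(h, edges, W_msg, W_upd)
print(h_new.shape)
```

Stacking several such steps lets information propagate beyond nearest neighbors, after which a readout (e.g., a sum over nodes) feeds the fully connected prediction head.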
Natural language processing
The advent of ChatGPT, built on large language models (LLMs), represents the most significant progress in natural language processing and has drawn great attention from the scientific community[68]. LLMs are often built on the transformer architecture, which is designed to effectively process sequential data such as text[69,70]. The critical component of the transformer architecture is the attention mechanism, which focuses on the relevant parts of the input to enhance performance. However, LLMs can sometimes produce misleading results in professional contexts due to the immense diversity of their large training datasets[71]. Fortunately, transfer learning algorithms can be utilized to adapt universal LLMs to specific tasks[72]. Although the application of LLMs in electrocatalysis is still in the early stages, various attempts have been made. For example, Beltagy et al. developed the pretrained BERT model, SciBERT, to automatically extract scientific knowledge from existing papers[73]. Other efforts include CataLM[74] and InCrEDible-MaT-GO[75]. However, applying LLMs to functional materials such as electrocatalysts may require very large and complex models, which often makes their predictions difficult to understand. Another issue is that language-like representations of material structures often lack inherent physical or chemical information, which might require more complicated models or training processes.
PIML FOR ELECTROCATALYSTS
ML has achieved great success in catalysis research. However, the predictions made by ML models are based on previous experiences rather than on understanding the underlying mechanism. While those ML models demonstrate high accuracy in interpolation, they fall short in inspiring chemists to creatively discover unseen electrocatalysts because they are not effective for extrapolation. One solution to this limitation is to develop PIML for electrocatalyst design with better generalization and interpretability, in which the models will naturally give physical or chemical explanations for their predictions. The other solution is the XAI approach. Although both PIML and XAI models aim to enhance model interpretability, they differ fundamentally in their approaches and implementation. PIML models incorporate physical and chemical laws directly into the model architecture and training process, ensuring that predictions are consistent with established scientific principles. In contrast, XAI methods mainly focus on post-training analysis and interpretation of model behavior. Each approach has its own limitations: PIML models are restricted to systems with well-defined physical laws and might struggle with complex or unknown physical systems, while XAI methods may produce inconsistent or contradictory interpretations due to their high sensitivity to hyperparameters. Importantly, PIML and XAI are complementary: PIML can leverage XAI techniques to better illustrate how physical/chemical principles affect model outputs; XAI can increase its credibility with the help of PIML[76]. This section will briefly introduce these methods and their applications in the electrochemical realm.
PIML models
PIML models are often implemented by adding physical constraints. This can be achieved by modifying loss functions, incorporating physics- or chemistry-based features, or adapting the model architecture. Based on their architecture, these models can be grouped as GNN-based, kernel-based, or equivariant GNN-based models.
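As a toy illustration of the loss-modification route, the sketch below fits a linear model by gradient descent while adding a penalty that pulls the slope toward a hypothetical prior value (e.g., one suggested by a scaling relation); the data and prior are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 0.9 * x + 0.4 + rng.normal(0, 0.05, size=50)  # synthetic "adsorption energy" data

# Physics-informed constraint: suppose prior knowledge suggests the slope should be ~1.
slope_prior, lam = 1.0, 5.0
w, b, lr = 0.0, 0.0, 0.05

for _ in range(1000):
    pred = w * x + b
    # Composite loss: data MSE + lam * (w - slope_prior)^2 physics penalty
    grad_w = 2 * np.mean((pred - y) * x) + 2 * lam * (w - slope_prior)
    grad_b = 2 * np.mean(pred - y)
    w, b = w - lr * grad_w, b - lr * grad_b

print(f"slope w = {w:.3f} (data alone would give ~0.9), intercept b = {b:.3f}")
```

The penalty weight lam sets the interpretability/accuracy trade-off discussed later: a large lam enforces the physical prior tightly, while lam → 0 recovers a purely data-driven fit.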
GNN-based model
TinNet is one of the most well-known GNN-based models; it integrates the d-band theory into the model to predict the adsorption energies of adsorbates on transition-metal surfaces[77]. The TinNet model comprises two sequential parts, i.e., a regression module and a theory module. The regression module is similar to the approach used in the crystal graph convolutional neural network (CGCNN): the flattened readout features of the adsorbate-substrate system are integrated with the d-band theory to make the final prediction of adsorption energies, based on minimizing the loss function between the predicted properties and physical features in the output layer[58]. By considering the metal sp-states, d-states (including Pauli repulsion and orbital hybridization), and the projected DOS onto the adsorbate orbitals, the d-band theory for chemical bonding at the metal surface can be established [Figure 2A]. Compared to models based on a fully connected neural network (FCNN) or CGCNN, TinNet shows comparable prediction performance but offers physical insights, highlighting the importance of frontier molecular orbital theory and electronic structure methods. Moreover, TinNet demonstrates improved generalization capability because it is applicable to varied adsorbates and facets, as shown in Figure 2B-D.
Figure 2. (A) The model architecture of TinNet. The information transits from the graph representation of an adsorption system to the theory module to calculate the adsorption energy ΔE, where ρa1…ρai indicates the projected DOS onto the adsorbate frontier orbital(s) and μ1…μj indicates d-band moments; (B-D) DFT-calculated vs. TinNet-predicted adsorption energies: (B) *OH on {111}-terminated TM surfaces, (C) *O at the atop site of {111}-terminated alloy surfaces, (D) *N at the hollow site of (100)-terminated alloy surfaces; (E) The model architecture of ACE-GCN; (F and G) Predictions of the conformational stability of unrelaxed (F) NO* and (G) OH* structures using the ACE-GCN model with graphs generated by SurfGraph. Reproduced with permission from refs[74,75], licensed under Creative Commons CC BY. DOS: Density of states; DFT: density-functional theory; TM: transition metal; ACE-GCN: adsorbate chemical environment-based graph convolution neural network.
Another attempt is to integrate local bonding information into the ML framework design. Ghanekar et al. proposed the adsorbate chemical environment-based graph convolution neural network (ACE-GCN)[78], a screening workflow that considers diverse atomistic configurations [Figure 2E]. The performance of ACE-GCN was successfully verified in the cases of NO* adsorbed on a Pt3Sn (111) alloy surface [Figure 2F] and OH* adsorbed on a stepped Pt (221) facet [Figure 2G]. Both cases are very complicated in electrocatalysis: one involves strong binding of adsorbates on low-symmetry alloyed surfaces, while the other pertains to directionally dependent adsorption on defective surface structures. The chemical insight underlying ACE-GCN is to replace the full graph with subgraphs that represent the local structural and chemical environment. These subgraphs lead not only to accurate predictions of adsorption energy but also to a faster training process, as fewer atoms are considered [Figure 2E]. It is also noted that the performance of ACE-GCN might be further improved if short-range interactions were included more comprehensively in the subgraph, instead of only the first-nearest-neighbor interactions in the current model.
CGCNN-HD is a revised version of CGCNN that uses the hyperbolic tangent activation function and the dropout algorithm to add uncertainty quantification to each prediction[79]. Recently, Abed et al. utilized CGCNN-HD to predict Pourbaix electrochemical stability. They also used the metal−oxygen covalency calculated by DFT as an additional screening descriptor to further predict the bulk stability of the oxide. The combination of ML and calculated descriptors ultimately screened out the candidate Ru0.6Cr0.2Ti0.2O2 with high durability and a slow degradation rate[80]. Properties of GNNs such as model transparency can be further improved by adding attention mechanisms to the GNN models. For example, Zhang et al. proposed the atomic graph attention (AGAT) network to discover high-entropy electrocatalysts (HEECs), whose enormous space of local environments and phases is challenging to explore using experiments and DFT calculations[81]. Figure 3A shows the model architecture of AGAT. The attention score is calculated on the edges of the graph to decide how much information can be passed from the source atoms, which naturally represents the importance of the nodes. The model was trained and tested for ORR on the RuRhPdIrPt and NiCoFePdPt surfaces, and two candidates were recommended (Ni0.13Co0.13Fe0.13Pd0.10Pt0.50 and Ni0.10Co0.10Fe0.10Pd0.30Pt0.40). AGAT further revealed that the distance between adsorbates and the HEA surface is a key factor for the attention score, highlighting the significance of the local environment [Figure 3B]. The AGAT model also suggested that HEECs can circumvent scaling relations due to their diverse local environments.
Figure 3. (A) The model architecture of AGAT. The top panel denotes the AGAT layer, and the bottom panel denotes the AGAT model; (B) The interpretability of the AGAT model. The attention scores of the energy and forces models compared with the energy and forces variations; (C) ML-predicted vs. DFT-calculated adsorption enthalpies in 5-fold cross-validation using RBF-GPR, WWL-GPR, and XGBoost for the simple adsorbates database, respectively; (D) The model architecture of WWL-GPR. The adsorption enthalpy for the relaxed structure is predicted by representing the initial structure as a graph. Node attributes are calculated based on the gas-phase molecule and the pristine surface. The similarity between graphs is assessed using the WWL graph kernel, and this information is then used in a GPR model. Reproduced with permission from refs[78,79]. Copyright 2023 Elsevier and Copyright 2022 Springer Nature, respectively. AGAT: Atomic graph attention; ML: machine learning; DFT: density-functional theory; RBF-GPR: radial basis function and Gaussian process regression; WWL-GPR: Wasserstein Weisfeiler-Lehman graph kernel and GPR.
Kernel-based models
Gaussian process regression (GPR) is a kernel-based ML technique that models complex relationships between variables by treating the regression function as a random process governed by prior probability distributions. As a probabilistic method, GPR not only outputs the predicted result but also naturally quantifies the uncertainty of its predictions by computing both mean estimates and confidence intervals[82]. As a result, GPR has been widely used to predict surface phase diagrams, model MLIPs, etc.[83,84].
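A minimal GPR sketch with scikit-learn, using a synthetic one-dimensional target in place of real adsorption data, shows how both the mean prediction and its uncertainty are obtained:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(25, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.05, 25)  # synthetic noisy target

# RBF captures smooth trends; WhiteKernel absorbs observation noise.
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), random_state=0).fit(X, y)

X_test = np.linspace(0, 5, 5).reshape(-1, 1)
mean, std = gpr.predict(X_test, return_std=True)  # mean estimate + uncertainty
print(mean.round(2), std.round(3))
```

The per-point std is what enables active learning loops: points with the largest predictive uncertainty are the most informative candidates for the next DFT calculation or experiment.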
Xu et al. proposed a data-efficient model based on the Wasserstein Weisfeiler-Lehman graph kernel and GPR (WWL-GPR) to predict the binding motifs and adsorption enthalpies of various adsorbates on transition metals (TMs) and their alloys[85,86]. It was trained on a TM dataset consisting of Cu, Rh, Pd, and Co and achieved a test root mean square error (RMSE) of 0.2 eV. Figure 3C shows that WWL-GPR outperforms RBF-GPR and XGBoost for 41 complex adsorbates in various adsorption motifs on surfaces of Cu, Co, Pd and Rh. Its accuracy is comparable to that of DFT calculations, owing to node attributes that combine geometrical information from the graph representation with easily accessible physics-informed attributes [e.g., d-band moments and highest occupied molecular orbital/lowest unoccupied molecular orbital (HOMO/LUMO) energy levels]. Node embeddings are generated to calculate the Wasserstein distance between their distributions, and based on the resulting graph similarities, the GPR makes predictions of adsorption enthalpies [Figure 3D]. Understanding electron behavior in metal electrodes is also crucial for energy storage and conversion devices (batteries, capacitors, and electrocatalysts). Grisafi et al. introduced an equivariant kernel-based method that incorporates long-range interactions to accurately predict electron density responses in metal electrodes under various electric field conditions, achieving quantum-level accuracy at a fraction of the computational cost of traditional methods[87].
Equivariant GNN-based models
GNNs are usually E(3)-invariant models, meaning their outputs remain unchanged under the Euclidean group of transformations, which includes translations, rotations, and reflections in three-dimensional space. However, the interactions between molecules and materials go far beyond invariant properties. Unlike invariance, equivariance means that the output transforms in the same way as the input, which can be written as:
ϕ(Tg(x)) = Sg(ϕ(x))

where ϕ(·) is the nonlinear function, x denotes the input vector, Tg is a transformation of the input vector, and Sg is the equivalent transformation on the output set[88]. The formula reduces to the definition of invariance when Sg = I, indicating that invariance is just a special case of equivariance. This also means that SE(3)-equivariant GNNs satisfy more complex geometric constraints than invariant models[89]. Consequently, equivariant GNNs consistently outperform invariant GNNs in predicting forces, although the difference in energy prediction is insignificant. OC20 and OC22 are two well-known open-source electrochemical reaction databases[23,24]. Three primary tasks were proposed for them, namely structure to energy and forces (S2EF), initial structure to relaxed energy (IS2RE), and initial structure to relaxed structure (IS2RS). Many equivariant models, including the spherical channel network (SCN)[90], equivariant SCN (eSCN)[91], and EquiformerV2[92], have been tested on these tasks, with EquiformerV2 being state-of-the-art in most cases.
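The distinction can be checked numerically with two toy maps: a simple scaling map is rotation-equivariant (here Sg equals Tg), while the vector norm is invariant (Sg = I). This mirrors why forces (vectors) need equivariant outputs while energies (scalars) only need invariant ones:

```python
import numpy as np

def rot(theta):
    """2D rotation matrix, a simple element of the rotation group."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

x = np.array([1.0, 2.0])
R = rot(0.7)

phi = lambda v: 2.0 * v            # equivariant map: phi(R x) == R phi(x)
inv = lambda v: np.linalg.norm(v)  # invariant map: output unchanged (S_g = I)

print(np.allclose(phi(R @ x), R @ phi(x)))  # equivariance holds
print(np.isclose(inv(R @ x), inv(x)))       # invariance holds
```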
XAI approaches
Generalized additive models (GAMs)[93] and the sure independence screening and sparsifying operator (SISSO)[94] are considered interpretable ML models, often referred to as “glass box” models due to their inherent simplicity and transparency. A GAM is built by constructing additive nonlinear functions of each feature, while SISSO is built on combinations of given features and mathematical operators (e.g., +, -, ×, ÷, log, exp). Both models can give physical or chemical insights into electrochemical reactions, including finding structure descriptors and identifying feature importance[95-100]. For example, although widely used, the empirical BEP relationship does not explicitly consider the geometric and compositional properties of catalysts, and thus has limited applicability to structure-sensitivity exploration and the rational design of efficient catalysts. To address this issue, Shu et al. applied SISSO with a multitask learning strategy to discover a two-dimensional descriptor called the topologically under-coordinated number, which can accurately describe structure sensitivity[97].
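A toy SISSO-style search, using synthetic features and a target constructed for illustration, conveys the two-step idea of building a candidate feature space from operators and then screening it (real SISSO iterates this construction and uses sparsifying regression rather than simple correlation):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
# Hypothetical primary features: electronegativity, covalent radius, d-electron count.
feats = {"chi": rng.uniform(1, 3, 50), "r": rng.uniform(0.5, 2, 50), "nd": rng.uniform(1, 10, 50)}
y = feats["chi"] / feats["r"] + rng.normal(0, 0.02, 50)  # synthetic target

# Step 1: build a candidate feature space from simple binary operators.
ops = {"+": np.add, "-": np.subtract, "*": np.multiply, "/": np.divide}
candidates = {}
for (na, a), (nb, nb_vals) in combinations(feats.items(), 2):
    for sym, op in ops.items():
        candidates[f"{na}{sym}{nb}"] = op(a, nb_vals)

# Step 2 (screening): rank candidates by absolute correlation with the target.
best = max(candidates, key=lambda k: abs(np.corrcoef(candidates[k], y)[0, 1]))
print("best descriptor:", best)
```

The search recovers the generating descriptor chi/r, illustrating how operator-built features can expose an interpretable, physically meaningful formula rather than a black-box fit.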
The other type of XAI is based on post-hoc explanation methods, which are usually model-agnostic and can extract physical or chemical insights after training. Typically, there are two approaches to realize post-hoc explanation. One is visualization, such as local interpretable model-agnostic explanations (LIME)[101] and t-distributed stochastic neighbor embedding (t-SNE)[102]; the other is to calculate feature importance, e.g., SHapley Additive exPlanations (SHAP)[103]. Because the post-hoc interpretation approach is applicable to all models and can be used for both local and global interpretation, it has become one of the top choices for XAI[46,54,75,104-108]. For example, Roy et al. utilized LIME, permutation feature importance (PFI), and accumulated local effects (ALE) for local/global interpretation of a black-box model and successfully established scaling relationships between CO2RR intermediates on HEA surfaces[53]. Moreover, Zhang et al. discovered that SACs in pyrrole-type coordination could exhibit superior catalytic activity toward NRR when the d-orbitals are exactly half occupied and the difference in covalent radius is approximately 140 pm. Their model was based on GBR and employed SHAP methods to interpret the results[48]. Recently, Zhong et al. used active ML to accelerate DFT screening of CO2 reduction electrocatalysts for ethylene, where t-SNE visualization revealed that Cu-Al alloys have the highest density of adsorption sites with optimal CO binding energies. This finding successfully guided experiments to achieve a record-high Faradaic efficiency of over 80% at 400 mA/cm2[109].
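The PFI approach mentioned above can be sketched with scikit-learn on synthetic data: shuffling an informative feature degrades the model's score, and the size of the degradation is its importance:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
# Synthetic descriptors: feature 0 is most informative, feature 2 is noise.
X = rng.uniform(size=(150, 3))
y = 3.0 * X[:, 0] + X[:, 1] + rng.normal(0, 0.05, 150)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Post-hoc, model-agnostic: permute each column and measure the score drop.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("PFI means:", result.importances_mean.round(3))
```

Because PFI only needs predict-and-score access, the same call works unchanged for an SVR, an RF, or a neural network, which is exactly the model-agnostic property that makes post-hoc XAI broadly applicable.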
DISCUSSION AND PERSPECTIVES
PIML has demonstrated great potential in developing high-performance electrocatalysts at reduced cost and time. However, there are still some limitations. First, the integration of physical or chemical insights into PIML models often adds complexity to the model training process, especially when it comes to large datasets. In essence, more complex physical or chemical constraints can result in higher interpretability but lower efficiency, and vice versa[110,111]. As a result, the trade-off between interpretability and efficiency needs to be navigated. Second, the use of prior knowledge from experts, such as physics- or chemistry-based descriptors, remains essential for PIML models. While some PIML models, such as equivariant GNNs, are highly integrated with physical constraints, their complexity can make it difficult to interpret the underlying physics or chemistry. The combination of XAI with these models may offer a solution to improve their interpretability. Last but not least, a lack of benchmarks to evaluate the interpretability performance of PIML models remains a challenge.
Despite these limitations, PIML is poised to become an indispensable tool in materials science. Notable progress has been made, such as the work by Szymanski et al., who established an automated laboratory (A-lab) and identified 41 novel materials from 58 candidates in just 17 days[112]. This achievement encourages the further development of more effective A-labs, in which PIML will play a pivotal role. In the future, PIML-based generative models may directly generate physically and chemically meaningful candidates, providing experts with valuable physical/chemical insights and reducing observation error. Moreover, integrating LLMs with PIML could enhance the interactivity of models with humans, enabling more efficient communication and decision-making. In addition, integrating robotics with these models will accelerate the synthesis and characterization of new materials with minimal human intervention. Figure 4 shows the potential role of PIML in an A-lab. In this process, researchers propose their requirements to LLMs, which generate model parameters for PIML to make experimental predictions. These predictions guide robots to synthesize new materials, whose structures are analyzed and returned to researchers by the PIML-based generative models. Data from these new materials are then added to the database to train the next generation of PIML models, creating an active learning loop. This strategy holds the promise of transforming the materials science field, with PIML playing an increasingly critical role in driving scientific discovery.
Figure 4. Schematic diagram of PIML applications in an A-lab environment. The process begins with researchers posing their requirements through LLMs, which generate physical parameters for PIML-based experimental predictions. These predictions guide robotic synthesis while PIML generative models simultaneously predict candidate structures. The resulting materials and structural data are validated and screened into the database, enabling active learning to continuously improve the PIML model. PIML: Physics-informed machine learning; A-lab: automated laboratory; LLMs: large language models.
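The active learning loop described above can be sketched in a few lines of code. The toy example below is a minimal illustration, not the A-lab implementation: a Gaussian-process surrogate stands in for the PIML model, a synthetic `oracle` function stands in for robotic synthesis and characterization, and uncertainty sampling serves as the acquisition strategy. All names and the one-dimensional descriptor space are hypothetical.

```python
# Minimal active-learning sketch (hypothetical toy setup): a Gaussian-process
# surrogate plays the role of the PIML model, and oracle() plays the role of
# the experiment (robotic synthesis + characterization).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def oracle(x):
    # Placeholder for the experiment/DFT step: a toy "adsorption energy".
    return np.sin(3 * x) + 0.1 * x

# Candidate pool, e.g. descriptor values of unexplored materials.
pool = np.linspace(0.0, 3.0, 200).reshape(-1, 1)

# Small seed dataset to start the loop.
X = rng.uniform(0.0, 3.0, size=(5, 1))
y = oracle(X).ravel()

for _ in range(10):
    # Retrain the surrogate on all data gathered so far.
    gp = GaussianProcessRegressor().fit(X, y)
    mean, std = gp.predict(pool, return_std=True)
    # Uncertainty sampling: query the candidate the model is least sure about.
    i = int(np.argmax(std))
    x_new = pool[i : i + 1]
    y_new = oracle(x_new).ravel()
    # "Synthesize" it and add the result to the database.
    X = np.vstack([X, x_new])
    y = np.concatenate([y, y_new])

print(len(y))  # 15 data points after 10 query rounds
```

In a real A-lab pipeline the acquisition function would also weigh predicted performance and synthesizability, and the retraining step would update a far more expressive PIML model; the closed query-synthesize-retrain cycle, however, has the same shape as this sketch.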
CONCLUSION
PIML has emerged as a transformative approach in electrocatalyst development, combining predictive power with a deeper understanding of the underlying mechanisms. Despite significant progress, several challenges remain in data quality, model interpretability, and computational efficiency. As these challenges are addressed, PIML approaches are poised to play an increasingly important role in the development of efficient, cost-effective electrocatalysts for clean energy applications. By leveraging the synergy between ML and physical sciences, PIML has the potential to accelerate the discovery of novel materials and improve the performance of existing electrocatalysts. This progress may significantly contribute to the global transition to sustainable energy systems, helping address pressing environmental concerns and enabling a cleaner, more efficient energy future.
DECLARATIONS
Authors’ contributions
Original draft: Wu, H.
Visualization, writing editing: Wu, H.
Review and editing: Chen, M., Cheng, H., Yang, T., Zeng, M., Yang, M.
Funding acquisition: Yang, M.
Availability of data and materials
All detailed materials that support the findings are available from the corresponding author upon reasonable request.
Financial support and sponsorship
This work was supported by the Hong Kong Polytechnic University (project numbers: P0042711 and P0048122) and Guangdong Natural Science Foundation (project number: 2024A1515010031).
Conflicts of interest
All authors declared that there are no conflicts of interest.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Copyright
© The Author(s) 2025.
REFERENCES
1. Seh, Z. W.; Kibsgaard, J.; Dickens, C. F.; Chorkendorff, I.; Nørskov, J. K.; Jaramillo, T. F. Combining theory and experiment in electrocatalysis: insights into materials design. Science 2017, 355, eaad4998.
2. Ding, K.; Yang, T.; Leung, M. T.; et al. Recent advances in the data-driven development of emerging electrocatalysts. Curr. Opin. Electrochem. 2023, 42, 101404.
3. Banoth, P.; Kandula, C.; Kollu, P. Introduction to electrocatalysts. In Noble metal-free electrocatalysts: new trends in electrocatalysts for energy applications. Vol. 2; Washington: American Chemical Society, 2022; pp. 1-37.
4. Santos, D. M. F.; Šljukić, B. Advanced materials for electrochemical energy conversion and storage devices. Materials 2021, 14, 7711.
5. Chen, L.; Zhang, X.; Chen, A.; Yao, S.; Hu, X.; Zhou, Z. Targeted design of advanced electrocatalysts by machine learning. Chin. J. Catal. 2022, 43, 11-32.
6. Liao, X.; Lu, R.; Xia, L.; et al. Density functional theory for electrocatalysis. Energy. Environ. Mater. 2022, 5, 157-85.
7. Chen, Z. W.; Li, J.; Ou, P.; et al. Unusual Sabatier principle on high entropy alloy catalysts for hydrogen evolution reactions. Nat. Commun. 2024, 15, 359.
8. Peng, J.; Schwalbe-Koda, D.; Akkiraju, K.; et al. Human- and machine-centred designs of molecules and materials for sustainability and decarbonization. Nat. Rev. Mater. 2022, 7, 991-1009.
9. Yang, T.; Song, T. T.; Zhou, J.; et al. High-throughput screening of transition metal single atom catalysts anchored on molybdenum disulfide for nitrogen fixation. Nano. Energy. 2020, 68, 104304.
10. Yang, T.; Zhou, J.; Song, T. T.; Shen, L.; Feng, Y. P.; Yang, M. High-throughput identification of exfoliable two-dimensional materials with active basal planes for hydrogen evolution. ACS. Energy. Lett. 2020, 5, 2313-21.
11. Shen, L.; Zhou, J.; Yang, T.; Yang, M.; Feng, Y. P. High-throughput computational discovery and intelligent design of two-dimensional functional materials for various applications. Acc. Mater. Res. 2022, 3, 572-83.
12. Zhou, J.; Shen, L.; Yang, M.; Cheng, H.; Kong, W.; Feng, Y. P. Discovery of hidden classes of layered electrides by extensive high-throughput material screening. Chem. Mater. 2019, 31, 1860-8.
13. Steinmann, S. N.; Wang, Q.; Seh, Z. W. How machine learning can accelerate electrocatalysis discovery and optimization. Mater. Horiz. 2023, 10, 393-406.
14. Liu, C.; Senftle, T. P. Finding physical insights in catalysis with machine learning. Curr. Opin. Chem. Eng. 2022, 37, 100832.
15. Xin, H.; Mou, T.; Pillai, H. S.; Wang, S.; Huang, Y. Interpretable machine learning for catalytic materials design toward sustainability. Acc. Mater. Res. 2024, 5, 22-34.
16. Kayode, G. O.; Montemore, M. M. Latent variable machine learning framework for catalysis: general models, transfer learning, and interpretability. JACS. Au. 2024, 4, 80-91.
17. Rangarajan, S. Chapter 6 - Artificial intelligence in catalysis. In Artificial intelligence in manufacturing, Elsevier, 2024; pp. 167-204.
18. Zhang, Y.; Peck, T. C.; Reddy, G. K.; et al. Descriptor-free design of multicomponent catalysts. ACS. Catal. 2022, 12, 10562-71.
19. Jäger, M. O. J.; Ranawat, Y. S.; Canova, F. F.; Morooka, E. V.; Foster, A. S. Efficient machine-learning-aided screening of hydrogen adsorption on bimetallic nanoclusters. ACS. Comb. Sci. 2020, 22, 768-81.
20. Fan, X.; Chen, L.; Huang, D.; et al. From single metals to high-entropy alloys: how machine learning accelerates the development of metal electrocatalysts. Adv. Funct. Mater. 2024, 34, 2401887.
21. Park, Y.; Hwang, C.; Bang, K.; et al. Machine learning filters out efficient electrocatalysts in the massive ternary alloy space for fuel cells. Appl. Catal. B. Environ. 2023, 339, 123128.
22. Esterhuizen, J. A.; Goldsmith, B. R.; Linic, S. Interpretable machine learning for knowledge generation in heterogeneous catalysis. Nat. Catal. 2022, 5, 175-84.
23. Chanussot, L.; Das, A.; Goyal, S.; et al. Open Catalyst 2020 (OC20) Dataset and community challenges. ACS. Catal. 2021, 11, 6059-72.
24. Tran, R.; Lan, J.; Shuaibi, M.; et al. The Open Catalyst 2022 (OC22) Dataset and challenges for oxide electrocatalysts. ACS. Catal. 2023, 13, 3066-84.
25. Jain, A.; Ong, S. P.; Hautier, G.; et al. Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL. Mater. 2013, 1, 011002.
26. Saal, J. E.; Kirklin, S.; Aykol, M.; Meredig, B.; Wolverton, C. Materials design and discovery with high-throughput density functional theory: the Open Quantum Materials Database (OQMD). JOM. 2013, 65, 1501-9.
27. Zhou, J.; Shen, L.; Costa, M. D.; et al. 2DMatPedia, an open computational database of two-dimensional materials from top-down and bottom-up approaches. Sci. Data. 2019, 6, 86.
28. Montavon, G.; Hansen, K.; Fazli, S.; et al. Learning invariant representations of molecules for atomization energy prediction. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Curran Associates Inc.: Red Hook, USA, 2012; Vol 1, pp 440-8.
29. Rupp, M.; Tkatchenko, A.; Müller, K. R.; von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 2012, 108, 058301.
30. Bartók, A. P.; Kondor, R.; Csányi, G. On representing chemical environments. Phys. Rev. B. 2013, 87, 184115.
31. Willatt, M. J.; Musil, F.; Ceriotti, M. Feature optimization for atomistic machine learning yields a data-driven construction of the periodic table of the elements. Phys. Chem. Chem. Phys. 2018, 20, 29661-8.
32. Jäger, M. O. J.; Morooka, E. V.; Federici, C. F.; Himanen, L.; Foster, A. S. Machine learning hydrogen adsorption on nanoclusters through structural descriptors. npj. Comput. Mater. 2018, 4, 96.
33. De, S.; Bartók, A. P.; Csányi, G.; Ceriotti, M. Comparing molecules and solids across structural and alchemical space. Phys. Chem. Chem. Phys. 2016, 18, 13754-69.
34. Mai, H.; Le, T. C.; Chen, D.; Winkler, D. A.; Caruso, R. A. Machine learning for electrocatalyst and photocatalyst design and discovery. Chem. Rev. 2022, 122, 13478-515.
35. Nørskov, J. K.; Rossmeisl, J.; Logadottir, A.; et al. Origin of the overpotential for oxygen reduction at a fuel-cell cathode. J. Phys. Chem. B. 2004, 108, 17886-92.
36. Motagamwala, A. H.; Ball, M. R.; Dumesic, J. A. Microkinetic analysis and scaling relations for catalyst design. Annu. Rev. Chem. Biomol. Eng. 2018, 9, 413-50.
37. Pérez-Ramírez, J.; López, N. Strategies to break linear scaling relationships. Nat. Catal. 2019, 2, 971-6.
38. Batchelor, T. A.; Pedersen, J. K.; Winther, S. H.; Castelli, I. E.; Jacobsen, K. W.; Rossmeisl, J. High-entropy alloys as a discovery platform for electrocatalysis. Joule 2019, 3, 834-45.
39. Artyushkova, K.; Pylypenko, S.; Olson, T. S.; Fulghum, J. E.; Atanassov, P. Predictive modeling of electrocatalyst structure based on structure-to-property correlations of x-ray photoelectron spectroscopic and electrochemical measurements. Langmuir 2008, 24, 9082-8.
40. Hearst, M.; Dumais, S.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE. Intell. Syst. Their. Appl. 1998, 13, 18-28.
41. Sun, H.; Li, Y.; Gao, L.; et al. High throughput screening of single atomic catalysts with optimized local structures for the electrochemical oxygen reduction by machine learning. J. Energy. Chem. 2023, 81, 349-57.
42. Arjmandi, M.; Fattahi, M.; Motevassel, M.; Rezaveisi, H. Evaluating algorithms of decision tree, support vector machine and regression for anode side catalyst data in proton exchange membrane water electrolysis. Sci. Rep. 2023, 13, 20309.
43. Hossain, S. S.; Ali, S. S.; Rushd, S.; Ayodele, B. V.; Cheng, C. K. Interaction effect of process parameters and Pd-electrocatalyst in formic acid electro-oxidation for fuel cell applications: implementing supervised machine learning algorithms. Int. J. Energy. Res. 2022, 46, 21583-97.
44. Tamtaji, M.; Chen, S.; Hu, Z.; Goddard, I. W. A.; Chen, G. A surrogate machine learning model for the design of single-atom catalyst on carbon and porphyrin supports towards electrochemistry. J. Phys. Chem. C. 2023, 127, 9992-10000.
45. Anbari, E.; Adib, H.; Iranshahi, D. Experimental investigation and development of a SVM model for hydrogenation reaction of carbon monoxide in presence of Co–Mo/Al2O3 catalyst. Chem. Eng. J. 2015, 276, 213-21.
46. Sun, J.; Chen, A.; Guan, J.; et al. Interpretable machine learning-assisted high-throughput screening for understanding NRR electrocatalyst performance modulation between active center and C-N coordination. Energy. Environ. Mater. 2024, 7, e12693.
47. Tan, S.; Wang, R.; Song, G.; et al. Machine learning and Shapley Additive Explanation-based interpretable prediction of the electrocatalytic performance of N-doped carbon materials. Fuel 2024, 355, 129469.
48. Zhang, Y.; Wang, Y.; Ma, N.; Fan, J. Directly predicting N2 electroreduction reaction free energy using interpretable machine learning with non-DFT calculated features. J. Energy. Chem. 2024, 97, 139-48.
49. Wei, C.; Shi, D.; Yang, Z.; et al. Data-driven design of double-atom catalysts with high H2 evolution activity/CO2 reduction selectivity based on simple features. J. Mater. Chem. A. 2023, 11, 18168-78.
50. Ying, Y.; Fan, K.; Luo, X.; Qiao, J.; Huang, H. Unravelling the origin of bifunctional OER/ORR activity for single-atom catalysts supported on C2N by DFT and machine learning. J. Mater. Chem. A. 2021, 9, 16860-7.
51. Lin, S.; Xu, H.; Wang, Y.; Zeng, X. C.; Chen, Z. Directly predicting limiting potentials from easily obtainable physical properties of graphene-supported single-atom electrocatalysts by machine learning. J. Mater. Chem. A. 2020, 8, 5663-70.
52. Lu, S.; Song, P.; Jia, Z.; et al. Symbolic transform optimized convolutional neural network model for high-performance prediction and analysis of MXenes hydrogen evolution reaction catalysts. Int. J. Hydrogen. Energy. 2024, 85, 200-9.
53. Roy, D.; Charan, M. S.; Das, A.; Pathak, B. Unravelling CO2 reduction reaction intermediates on high entropy alloy catalysts: an interpretable machine learning approach to establish scaling relations. Chemistry 2024, 30, e202302679.
54. Wang, Y.; Zhang, Y.; Ma, N.; et al. Machine learning accelerated catalysts design for CO reduction: an interpretability and transferability analysis. J. Mater. Sci. Technol. 2025, 213, 14-23.
55. Jia, X.; Li, H. Machine learning enabled exploration of multicomponent metal oxides for catalyzing oxygen reduction in alkaline media. J. Mater. Chem. A. 2024, 12, 12487-500.
56. Yang, H.; Zhao, J.; Wang, Q.; et al. Convolutional neural networks and volcano plots: screening and prediction of two-dimensional single-atom catalysts. arXiv 2024, arXiv:2402.03876. Available online: https://doi.org/10.48550/arXiv.2402.03876 (accessed 15 Jan 2025)
57. Gilmer, J.; Schoenholz, S. S.; Riley, P. F.; Vinyals, O.; Dahl, G. E. Neural message passing for quantum chemistry. arXiv 2017, arXiv:1704.01212. Available online: https://doi.org/10.48550/arXiv.1704.01212 (accessed 15 Jan 2025)
58. Xie, T.; Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 2018, 120, 145301.
59. Park, Y.; Kim, J.; Hwang, S.; Han, S. Scalable parallel algorithm for graph neural network interatomic potentials in molecular dynamics simulations. J. Chem. Theory. Comput. 2024, 20, 4857-68.
60. Merchant, A.; Batzner, S.; Schoenholz, S. S.; Aykol, M.; Cheon, G.; Cubuk, E. D. Scaling deep learning for materials discovery. Nature 2023, 624, 80-5.
61. Deng, B.; Zhong, P.; Jun, K.; et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 2023, 5, 1031-41.
62. Chen, C.; Ong, S. P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2022, 2, 718-28.
63. Tang, D.; Ketkaew, R.; Luber, S. Machine learning interatomic potentials for heterogeneous catalysis. Chemistry 2024, 30, e202401148.
64. Batatia, I.; Kovacs, D. P.; Simm, G. N. C.; Ortner, C.; Csányi, G. MACE: higher order equivariant message passing neural networks for fast and accurate force fields. arXiv 2022, arXiv:2206.07697. Available online: https://doi.org/10.48550/arXiv.2206.07697 (accessed 15 Jan 2025)
65. Batatia, I.; Batzner, S.; Kovács, D. P.; et al. The design space of E(3)-equivariant atom-centered interatomic potentials. arXiv 2022, arXiv:2205.06643. Available online: https://doi.org/10.48550/arXiv.2205.06643 (accessed 15 Jan 2025)
66. Riebesell, J.; Goodall, R. E. A.; Benner, P.; et al. Matbench Discovery - an evaluation framework for machine learning crystal stability prediction. arXiv 2023, arXiv:2308.14920. Available online: https://doi.org/10.48550/arXiv.2308.14920 (accessed 15 Jan 2025)
67. Batatia, I.; Benner, P.; Chiang, Y.; et al. A foundation model for atomistic materials chemistry. arXiv 2023, arXiv:2401.00096. Available online: https://doi.org/10.48550/arXiv.2401.00096 (accessed 15 Jan 2025)
68. OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; et al. GPT-4 technical report. arXiv 2023, arXiv:2303.08774. Available online: https://doi.org/10.48550/arXiv.2303.08774 (accessed 15 Jan 2025)
69. Yao, Y.; Duan, J.; Xu, K.; Cai, Y.; Sun, Z.; Zhang, Y. A survey on large language model (LLM) security and privacy: the good, the bad, and the ugly. High. Confid. Comput. 2024, 4, 100211.
70. Chang, Y.; Wang, X.; Wang, J.; et al. A survey on evaluation of large language models. ACM. Trans. Intell. Syst. Technol. 2024, 15, 1-45.
71. Augenstein, I.; Baldwin, T.; Cha, M.; et al. Factuality challenges in the era of large language models and opportunities for fact-checking. Nat. Mach. Intell. 2024, 6, 852-63.
72. Patil, R.; Gudivada, V. A review of current trends, techniques, and challenges in large language models (LLMs). Appl. Sci. 2024, 14, 2074.
73. Beltagy, I.; Lo, K.; Cohan, A. SciBERT: a pretrained language model for scientific text. arXiv 2019, arXiv:1903.10676. Available online: https://doi.org/10.48550/arXiv.1903.10676 (accessed 15 Jan 2025)
74. Wang, L.; Chen, X.; Du, Y.; Zhou, Y.; Gao, Y.; Cui, W. CataLM: empowering catalyst design through large language models. arXiv 2024, arXiv:2405.17440. Available online: https://doi.org/10.48550/arXiv.2405.17440 (accessed 15 Jan 2025)
75. Ding, R.; Wang, X.; Tan, A.; Li, J.; Liu, J. Unlocking new insights for electrocatalyst design: a unique data science workflow leveraging internet-sourced big data. ACS. Catal. 2023, 13, 13267-81.
76. Minh, D.; Wang, H. X.; Li, Y. F.; Nguyen, T. N. Explainable artificial intelligence: a comprehensive review. Artif. Intell. Rev. 2022, 55, 3503-68.
77. Wang, S. H.; Pillai, H. S.; Wang, S.; Achenie, L. E. K.; Xin, H. Infusing theory into deep learning for interpretable reactivity prediction. Nat. Commun. 2021, 12, 5288.
78. Ghanekar, P. G.; Deshpande, S.; Greeley, J. Adsorbate chemical environment-based machine learning framework for heterogeneous catalysis. Nat. Commun. 2022, 13, 5788.
79. Noh, J.; Gu, G. H.; Kim, S.; Jung, Y. Uncertainty-quantified hybrid machine learning/density functional theory high throughput screening method for crystals. J. Chem. Inf. Model. 2020, 60, 1996-2003.
80. Abed, J.; Heras-Domingo, J.; Sanspeur, R. Y.; et al. Pourbaix machine learning framework identifies acidic water oxidation catalysts exhibiting suppressed ruthenium dissolution. J. Am. Chem. Soc. 2024, 146, 15740-50.
81. Zhang, J.; Wang, C.; Huang, S.; et al. Design high-entropy electrocatalyst via interpretable deep graph attention learning. Joule 2023, 7, 1832-51.
82. Deringer, V. L.; Bartók, A. P.; Bernstein, N.; Wilkins, D. M.; Ceriotti, M.; Csányi, G. Gaussian process regression for materials and molecules. Chem. Rev. 2021, 121, 10073-141.
83. Ulissi, Z. W.; Singh, A. R.; Tsai, C.; Nørskov, J. K. Automated discovery and construction of surface phase diagrams using machine learning. J. Phys. Chem. Lett. 2016, 7, 3931-5.
84. Christensen, A. S.; Bratholm, L. A.; Faber, F. A.; von Lilienfeld, O. A. FCHL revisited: faster and more accurate quantum machine learning. J. Chem. Phys. 2020, 152, 044107.
85. Xu, W.; Reuter, K.; Andersen, M. Predicting binding motifs of complex adsorbates using machine learning with a physics-inspired graph representation. Nat. Comput. Sci. 2022, 2, 443-50.
86. Togninalli, M.; Ghisu, E.; Llinares-López, F.; Rieck, B.; Borgwardt, K. Wasserstein Weisfeiler-Lehman graph kernels. arXiv 2019, arXiv:1906.01277. Available online: https://doi.org/10.48550/arXiv.1906.01277 (accessed 15 Jan 2025)
87. Grisafi, A.; Bussy, A.; Salanne, M.; Vuilleumier, R. Predicting the charge density response in metal electrodes. Phys. Rev. Mater. 2023, 7, 125403.
88. Satorras, V. G.; Hoogeboom, E.; Welling, M. E(n) equivariant graph neural networks. arXiv 2021, arXiv:2102.09844. Available online: https://doi.org/10.48550/arXiv.2102.09844 (accessed 15 Jan 2025)
89. Zhang, X.; Wang, L.; Helwig, J.; et al. Artificial intelligence for science in quantum, atomistic, and continuum systems. arXiv 2023, arXiv:2307.08423. Available online: https://doi.org/10.48550/arXiv.2307.08423 (accessed 15 Jan 2025)
90. Zitnick, C. L.; Das, A.; Kolluru, A.; et al. Spherical channels for modeling atomic interactions. arXiv 2022, arXiv:2206.14331. Available online: https://doi.org/10.48550/arXiv.2206.14331 (accessed 15 Jan 2025)
91. Passaro, S.; Zitnick, C. L. Reducing SO(3) convolutions to SO(2) for efficient equivariant GNNs. arXiv 2023, arXiv:2302.03655. Available online: https://doi.org/10.48550/arXiv.2302.03655 (accessed 15 Jan 2025)
92. Liao, Y. L.; Wood, B.; Das, A.; Smidt, T. EquiformerV2: improved equivariant transformer for scaling to higher-degree representations. arXiv 2023, arXiv:2306.12059. Available online: https://doi.org/10.48550/arXiv.2306.12059 (accessed 15 Jan 2025)
93. Hastie, T. J. Generalized additive models. In Statistical models in S, 1st ed.; Routledge, 2017; pp 249-307.
94. Ouyang, R.; Curtarolo, S.; Ahmetcik, E.; Scheffler, M.; Ghiringhelli, L. M. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2018, 2, 083802.
95. Lin, X.; Wang, Y.; Chang, X.; Zhen, S.; Zhao, Z.; Gong, J. High-throughput screening of electrocatalysts for nitrogen reduction reactions accelerated by interpretable intrinsic descriptor. Angew. Chem. Int. Ed. 2023, 135, e202300122.
96. Ding, Z.; Pang, Y.; Ma, A.; et al. Single-atom catalysts based on two-dimensional metalloporphyrin monolayers for electrochemical nitrate reduction to ammonia by first-principles calculations and interpretable machine learning. Int. J. Hydrogen. Energy. 2024, 80, 586-98.
97. Shu, W.; Li, J.; Liu, J. X.; et al. Structure sensitivity of metal catalysts revealed by interpretable machine learning and first-principles calculations. J. Am. Chem. Soc. 2024, 146, 8737-45.
98. Su, Y.; Wang, X.; Ye, Y.; et al. Automation and machine learning augmented by large language models in a catalysis study. Chem. Sci. 2024, 15, 12200-33.
99. Liu, X.; Peng, H. Toward next-generation heterogeneous catalysts: empowering surface reactivity prediction with machine learning. Engineering 2024, 39, 25-44.
100. Yang, Z.; Gao, W. Applications of machine learning in alloy catalysts: rational selection and future development of descriptors. Adv. Sci. 2022, 9, e2106043.
101. Ribeiro, M. T.; Singh, S.; Guestrin, C. “Why should I trust you?”: Explaining the predictions of any classifier. arXiv 2016, arXiv:1602.04938. Available online: https://doi.org/10.48550/arXiv.1602.04938 (accessed 15 Jan 2025)
102. van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579-605. Available online: https://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf (accessed 15 Jan 2025)
103. Lundberg, S.; Lee, S. I. A unified approach to interpreting model predictions. arXiv 2017, arXiv:1705.07874. Available online: https://doi.org/10.48550/arXiv.1705.07874 (accessed 15 Jan 2025)
104. Omidvar, N.; Wang, S.; Huang, Y.; et al. Explainable AI for optimizing oxygen reduction on Pt monolayer core–shell catalysts. Electrochem. Sci. Adv. 2024, 4, e202300028.
105. Li, Y.; Zhang, X.; Li, T.; Chen, Y.; Liu, Y.; Feng, L. Accelerating materials discovery for electrocatalytic water oxidation via center-environment deep learning in spinel oxides. J. Mater. Chem. A. 2024, 12, 19362-77.
106. Roh, J.; Park, H.; Kwon, H.; et al. Interpretable machine learning framework for catalyst performance prediction and validation with dry reforming of methane. Appl. Catal. B. Environ. 2024, 343, 123454.
107. Ding, R.; Chen, Y.; Chen, P.; et al. Machine learning-guided discovery of underlying decisive factors and new mechanisms for the design of nonprecious metal electrocatalysts. ACS. Catal. 2021, 11, 9798-808.
108. Pillai, H. S.; Li, Y.; Wang, S. H.; et al. Interpretable design of Ir-free trimetallic electrocatalysts for ammonia oxidation with graph neural networks. Nat. Commun. 2023, 14, 792.
109. Zhong, M.; Tran, K.; Min, Y.; et al. Accelerated discovery of CO2 electrocatalysts using active machine learning. Nature 2020, 581, 178-83.
110. Pablo-García, S.; Morandi, S.; Vargas-Hernández, R. A.; et al. Fast evaluation of the adsorption energy of organic molecules on metals via graph neural networks. Nat. Comput. Sci. 2023, 3, 433-42.
111. Schütt, K. T.; Unke, O. T.; Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. arXiv 2021, arXiv:2102.03150. Available online: https://doi.org/10.48550/arXiv.2102.03150 (accessed 15 Jan 2025)
Cite This Article
Wu, H.; Chen, M.; Cheng, H.; Yang, T.; Zeng, M.; Yang, M. Interpretable physics-informed machine learning approaches to accelerate electrocatalyst development. J. Mater. Inf. 2025, 5, 15. http://dx.doi.org/10.20517/jmi.2024.67