FT2DP: large atomic model fine-tuned machine learning potential for accelerating atomistic simulation of iron-based Fischer-Tropsch synthesis

Zhao-Qing Liu; Zhe Deng; Huabo Zhao; Han Wang; Mohan Chen; Hong Jiang

doi:10.20517/jmi.2024.105


Research Article | Open Access | 25 Mar 2025



J. Mater. Inf. 2025, 5, 27.

10.20517/jmi.2024.105 | © The Author(s) 2025.


Abstract

Density-functional theory (DFT)-based atomistic simulation methods have been essential in studying the structure-property relationships in heterogeneous catalysis. However, for complex catalytic processes, such as iron-based Fischer-Tropsch synthesis (FTS), the temporal or spatial scales involved are generally too large for DFT calculations. Recently, machine learning potentials (MLPs) have demonstrated the capability for atomistic simulation over large length scales and long durations, and large atomic models (LAMs) are gaining much attention for their unified descriptors, which incorporate a wide range of chemical knowledge, and for the fine-tuning methodology that efficiently deploys the model to downstream tasks. In this work, we construct an MLP named the fine-tuned Fischer-Tropsch deep potential (FT$$ ^2 $$DP) model, which is fine-tuned from the upstream DPA-2 LAM on a downstream dataset focused on the iron-based FTS process. We further applied this model, combined with the double-to-single transition state optimization method and the local genetic algorithm, to investigate both surface reactions and reconstructions of edge sites in iron-based FTS. Our work demonstrates the capability and efficiency of our model for iron-based FTS simulations, while revealing the reaction mechanism on common active sites containing [Fe$$ _4 $$C] squares, and the abundant formation of [Fe$$ _4 $$C] squares on several reconstructed surfaces. These insights highlight the potential of utilizing LAMs for atomistic simulation of iron-based FTS processes and other complex catalytic reactions.


Keywords

Large atomic model, machine learning potentials, fine-tuning, Fischer-Tropsch synthesis, transition state optimization, surface reconstruction


INTRODUCTION

As a key aspect of the chemical industry, heterogeneous catalysis has played a crucial role in the large-scale production of commodity chemicals such as ammonia, alcohol, and synthetic fuels. Nowadays, computational simulations based on ab initio density-functional theory (DFT) calculations have offered unprecedented opportunities for the rational design of novel solid catalysts by providing a deep atomistic analysis of the structures and reaction properties combined with theories in heterogeneous catalysis ^[1]. However, accurate and efficient computational simulations of complex heterogeneous catalytic processes remain highly challenging because of the demanding computational cost. One of the typical examples of complex heterogeneous catalytic systems is the Fischer-Tropsch synthesis (FTS), which converts syngas (a mixture of CO and H₂) into fuels and other valuable chemicals, holding a special position in the energy industry ^{[2, 3]}. Among many possible candidates, the iron-based catalyst has gained much attention due to its low cost, high sulfur tolerance, and low methane selectivity ^{[4, 5]}. The active phases of iron-based FTS are believed to be in situ formed iron carbides such as $$ \chi $$-Fe$$ _5 $$C$$ _2 $$, $$ \eta $$-Fe$$ _2 $$C, $$ \epsilon $$-Fe$$ _2 $$C/$$ \epsilon $$-Fe$$ _{2.2} $$C, and $$ \theta $$-Fe$$ _3 $$C, which can be verified with spectroscopic and electron-microscopy experiments ^[6-10]. However, iron-based FTS is recognized as a structure-sensitive reaction ^[11], and many factors of Fe-catalysts such as particle size, chemical composition, and promoters are found to have a great impact on their catalytic activity ^[11-13], which poses great challenges for both theoretical and experimental research.

To achieve the goal of rational catalyst design, extensive research has been conducted to elucidate the relationship between the surface structures of iron carbides and their reactivities in the FTS process ^[14-18]. For instance, Chen et al. found that the CO dissociation barrier on different $$ \chi $$-Fe$$ _5 $$C$$ _2 $$ surfaces can be effectively predicted using local descriptors such as atomic charges^[16]. They also showed that their derived Brønsted-Evans-Polanyi (BEP) relationship remains applicable in cases including non-stoichiometric terminations, carbon vacancies, and potassium promotion. Li et al. studied the CO activation processes on $$ \theta $$-Fe$$ _3 $$C(010) surfaces, and revealed that on the Fe/C-terminated surface, direct CO dissociation is not favored due to the high concentration of surface carbon atoms, and the participation of hydrogen is essential for CO dissociation, while on the Fe-terminated $$ \theta $$-Fe$$ _3 $$C(010) surface the direct CO dissociation is preferred^[17]. Yin et al. investigated the CH$$ _4 $$ formation and C-C coupling reactions on multiple $$ \chi $$-Fe$$ _5 $$C$$ _2 $$ surfaces based on a Wulff structure^[18]. They demonstrated that certain "active facets" only account for a small fraction of the total exposed surface area, but could play a significant role in the overall FTS activity. Although these studies have provided valuable insights into the FTS-related properties of iron carbides under various chemical environments, there is still a lack of comprehensive understanding of the pattern of catalytic sites and reaction mechanisms under realistic FTS conditions. This is primarily due to the high computational cost of DFT calculations, particularly when applied to systems with large temporal or spatial scales, which are common in iron-based FTS reactions and other complex heterogeneous catalytic processes.

The past several years have witnessed enormous advances in artificial intelligence (AI) methods, fueling a paradigm shift in discoveries in the natural sciences and giving rise to a new area of research known as AI for science ^[19]. In particular, with the rapid development of machine learning potentials (MLPs) that combine the accuracy of DFT with the efficiency of classical force fields, detailed atomistic simulation of complex heterogeneous catalysis systems on large temporal and spatial scales has become feasible^[20]. As a result, a large number of so-called AI-driven atomistic simulation platforms incorporating MLPs have emerged. They share the same basic architecture: structural descriptors represent the atomic structures, and a fitting model (usually an artificial neural network) decodes the descriptor representation into the potential energy. One notable example is the LASP software ^[21], with its neural network potential constructed from power-type structural descriptors represented by atom-centered symmetry functions ^[22], developed by Huang et al., which has been widely utilized in computational simulation for various heterogeneous catalysis topics. For instance, in the ethene epoxidation reaction on silver^[23], they identified the O$$ _5 $$ phase as highly active. Moreover, the calculated selectivity and ethene conversion are consistent with experimental results. In the case of the Fe-FTS system, Liu et al. showed that surface reconstructions of iron carbides under a typical FTS gas atmosphere are significant and usually involve the relocation of surface carbon atoms^[24]. They demonstrated that the A-P5 site, a pentagon configuration consisting of five iron atoms and a carbon atom bonded to four of the iron atoms, is abundant on several active surfaces of iron carbides as an active site for CO dissociation and C-C coupling^[25]. Their obtained product yield also shows good agreement with experiments^[26].
Another well-known example is the DeePMD-kit software ^[27-30] with its deep potential (DP) method using an end-to-end symmetry-preserving structural descriptor consisting of smooth internal coordinates and embedding network transformation for encoding local atomic environment^{[27, 31]}, widely used for atomistic simulation including the dynamics of growing carbon nanotube interfaces^[32], the sintering of Au nano-particles on supports having different metal affinities^[33], the size effects of supported Au catalysts for CO oxidation activity ^[34], CO adsorption induced surface reconstruction dynamics on the Cu surfaces ^[35], and structural/compositional evolution of Pd(111) surfaces for CO oxidation under varying adsorption coverages^[36].

Despite numerous successes of MLPs in atomistic simulations, their applications often face economic and scalability limitations. The most obvious one is that all of the data used for training and validating MLPs are generated from scratch, namely, by ab initio molecular dynamics (AIMD) simulation or other simulation methods with ab initio calculation (typically DFT). Efficient data generation platforms based on an active learning procedure, such as DP GENerator (DP-GEN)^[37] or Generating DP with Python (GDPy)^[38], can significantly facilitate this process. However, a substantial amount of effort is still needed to construct DFT-labeled datasets, especially for the heterogeneous catalysis domain, whose complexity arises from the simultaneous presence of bulks, surfaces, and adsorbates. The publicly available large datasets, such as OC20^[39], OC22^[40], and MPtrj^[41], cover extensive physical and chemical knowledge and are very useful for overcoming the challenges of data generation. However, traditional MLPs usually struggle to generalize to applications not covered by the training data, especially when additional elements and structural configurations are included in the simulation tasks. They are also poor at combining and utilizing the knowledge from these multiple large datasets together, because different datasets employ different ab initio calculation methods and the compositional space spanned by all these datasets is vast.

Recently, the rise of "universal" or "fundamental" MLPs, often referred to as large atomic models (LAMs), offers opportunities for addressing the issues above and greatly extending the application scope of MLPs. One remarkable advancement in LAMs is the second version of the deep potential with attention pre-trained model (DPA-2) developed by Zhang et al. from the DeepModeling open-source community ^{[42, 43]}. This model utilizes unified descriptors constructed with a deep neural network architecture, and uses a multi-task pre-training strategy to jointly pre-train a multi-head model on multiple datasets that encompass multidisciplinary knowledge from a broad range of application domains. By fine-tuning for specific downstream tasks, the pre-trained DPA-2 model can be efficiently deployed to evaluate the potential energy surface (PES) of a specific research domain with precision and generalization.

In this work, we have developed an MLP by fine-tuning the pre-trained DPA-2 model, named fine-tuned Fischer-Tropsch deep potential (FT$$ ^2 $$DP), which aims to describe the global chemical space consisting of Fe-C-H-O compositions in an accurate and extendable way, enabling us to perform efficient computational simulation to investigate the characteristics of Fe-FTS systems. This paper is organized as follows. In the next section, we present the theoretical schemes and computational settings used in this work. In the section "Results and Discussion", we illustrate the construction of FT$$ ^2 $$DP and its efficient application to Fe-FTS at the DFT precision level, including exploring the reaction mechanism of Fe-FTS reactions and revealing the morphology of reconstructed FeC$$ _x $$ surfaces with steps. The last section summarizes the main findings of this work and remarks on possible further developments and applications of the FT$$ ^2 $$DP model.

MATERIALS AND METHODS

DFT calculations

All spin-polarized DFT calculations were carried out using the Atomic-orbital Based Ab-initio Computation at UStc (ABACUS) package ^{[44, 45]}. The SG15-optimized Norm-Conserving Vanderbilt (SG15-ONCV) multi-projector pseudopotentials ^{[46, 47]} were employed and the valence configurations were [H]1s$$ ^1 $$, [C]2s$$ ^2 $$2p$$ ^2 $$, [O]2s$$ ^2 $$2p$$ ^4 $$ and [Fe]3d$$ ^6 $$4s$$ ^2 $$. The generalized gradient approximation (GGA) in the Perdew-Burke-Ernzerhof (PBE) variant ^[48] was adopted for the exchange-correlation functional. The second generation of numerical atomic orbitals (NAOs) in the double-$$ \zeta $$ plus polarization function (DZP) form ^[49] was used as the basis set. The periodic boundary condition (PBC) and the $$ \Gamma $$-centered Monkhorst–Pack scheme^[50] for sampling the Brillouin zone were adopted in the DFT calculations, with an automated mesh determined by k-spacing = 0.14 Bohr$$ ^{-1} $$ along directions without a vacuum layer and only one k-point along the direction with a vacuum layer. The dipole correction perpendicular to the surface was applied for all DFT calculations of surfaces. The electron density criterion for electron self-consistency convergence was set at $$ 1 \times 10^{-7} $$, and the first-order Methfessel-Paxton (MP) smearing ^[51] was used for the occupation of orbitals. In geometry and transition state (TS) optimizations, the convergence criterion for the largest force among all atoms was set to 0.05 eV/Å.
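As a rough illustration of how a k-spacing value translates into a k-point mesh, the Python sketch below derives the number of k-points per direction from a cell. It assumes the rounding rule n_i = max(1, int(|b_i| / kspacing) + 1), with reciprocal lattice vectors b_i that include the factor 2π and lengths expressed in Bohr; the exact rule applied by ABACUS should be checked against its documentation. The function name `kmesh_from_spacing` and the `vacuum_axes` argument are hypothetical and are not part of ABACUS.

```python
import math

BOHR_PER_ANGSTROM = 1.0 / 0.529177210903


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def kmesh_from_spacing(cell_angstrom, kspacing=0.14, vacuum_axes=()):
    """Return (n1, n2, n3) for a cell given as three lattice vectors in Angstrom.

    kspacing is in 1/Bohr. Axes listed in vacuum_axes get a single k-point.
    The rounding rule is an assumption, not a statement about ABACUS internals.
    """
    a1, a2, a3 = [tuple(x * BOHR_PER_ANGSTROM for x in v) for v in cell_angstrom]
    volume = abs(dot(a1, cross(a2, a3)))
    # Unnormalized reciprocal vectors; multiply by 2*pi/volume to get b_i.
    recip = [cross(a2, a3), cross(a3, a1), cross(a1, a2)]
    mesh = []
    for axis, b in enumerate(recip):
        if axis in vacuum_axes:
            mesh.append(1)
            continue
        length = 2.0 * math.pi * math.sqrt(dot(b, b)) / volume
        mesh.append(max(1, int(length / kspacing) + 1))
    return tuple(mesh)
```

For a slab, the axis normal to the surface would be passed in `vacuum_axes` so that only the in-plane directions are sampled with the automated mesh.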

DPA-2 and fine-tuning methodology

DPA-2 is a multi-task pre-trained LAM originating from the DP architecture and evolved from the DPA-1 model ^[52]. The DPA-1 descriptor has introduced an element-type embedding for encoding the elemental information covering the whole periodic table, and a gated self-attention mechanism ^[53] excelling in modeling the importance of neighboring atoms and re-weighting the interaction among them, which also makes the model generalizable and pre-trainable. Inheriting the DPA-1 backbone, the DPA-2 descriptor further enhances its resolution and generalizability of atomic representation through stacking multiple transformer ^[53] layers called representation transformer, incorporating operators such as convolution, symmetrization, localized self-attention, and gated self-attention, which can be interpreted as an E(3) equivariant graph neural network (GNN) and offers greater capacity compared to conventional GNNs ^[42], ensuring the robust capability of DPA-2 for serving as a LAM assembling comprehensive knowledge from massive pre-training data.

Besides having a more sophisticated model architecture, DPA-2 employs a multi-task training strategy for pre-training in multiple datasets labeled with different DFT settings to extract multidisciplinary knowledge. In particular, the multi-task DPA-2 model has multiple heads, and each head is an identical fitting network used to fit the DFT labels of each pre-training dataset from different downstream domains. During the pre-training process, the parameters within the DPA-2 descriptor are concurrently optimized through back-propagation using all pre-training datasets, while the parameters of the fitting network are updated exclusively with the specific pre-training dataset to which they are associated ^[42].

The pre-trained descriptor and fitting networks can be fine-tuned on specific downstream tasks, and the multidisciplinary knowledge learned from the multiple upstream datasets can help to reduce the consumption in model training and the amount of training data. In the fine-tuning process, the descriptor of the downstream model will be initialized with the pre-trained parameters, and the fitting network could also be initialized by choosing a fitting network in the pre-trained model. The energy bias in the fitting network will be aligned to the labels of the downstream dataset subsequently, and then the typical model training process will proceed with the initialized parameters incorporating upstream knowledge. Our FT$$ ^2 $$DP is constructed by this pre-training-fine-tuning methodology based on the publicly available pre-trained upstream DPA-2 model. All fine-tuning and model validation in this work were performed using DeePMD-kit software in version 3.0.0 beta 3 ^[30].
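The bias-alignment step can be pictured as a small least-squares problem. The sketch below is one plausible reading of it, not necessarily how DeePMD-kit implements the step: per-element energy shifts are fitted so that the pre-trained predictions plus the shifts best match the downstream DFT labels. The function names `solve_linear` and `fit_energy_bias` and the input layout are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_energy_bias(counts, e_label, e_model):
    """Least-squares per-element shifts b with e_model + counts @ b ~ e_label.

    counts[k][i] is the number of atoms of element i in frame k (assumed layout).
    """
    n_elem = len(counts[0])
    residual = [label - pred for label, pred in zip(e_label, e_model)]
    # Normal equations: (C^T C) b = C^T r
    ctc = [[sum(row[i] * row[j] for row in counts) for j in range(n_elem)]
           for i in range(n_elem)]
    ctr = [sum(row[i] * r for row, r in zip(counts, residual))
           for i in range(n_elem)]
    return solve_linear(ctc, ctr)
```

After such a shift, the remaining mismatch between model and labels is what the subsequent fine-tuning steps have to learn.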

Double-to-single workflow for TS optimization

To investigate the catalytic reaction mechanism from a theoretical point of view, a crucial task is the optimization of TS structures of each elementary reaction for acquiring the activation energy. The climbing-image nudged elastic band (CI-NEB) ^{[54, 55]} and the dimer method ^{[56, 57]} are the two most popular TS optimization methods in heterogeneous catalysis simulation, representing two types of TS optimization methods respectively: double-ended methods that start from combining optimized initial state (IS) and final state (FS), and single-ended methods that are based on only one state, usually a guessed TS. It is well-known that the efficiency of single-ended methods highly relies on the quality of the input structure, but the optimization will converge quickly when the optimization reaches the quadratic region around TS. On the contrary, double-ended methods have no reliance on any guessed TS structure, but they tend to have convergence problems. In our work, we combine these two types of methods together as a workflow, named double-to-single (D2S), implemented in the atomic simulation environment (ASE) package ^[58] (as illustrated in Figure 1A) to make good use of them for accelerating the TS optimization process. The D2S workflow uses CI-NEB first to generate a rough reaction pathway with a relatively loose convergence criterion (usually 1.0 eV/Å for maximum of atomic forces), and then a single-ended method, such as the dimer, or a better choice, Sella algorithm based on iterative Hessian diagonalization and partitioned rational function optimization (P-RFO)^{[59, 60]}, is utilized by starting from the maximum point of the NEB pathway for strict TS optimization with target convergence criterion (usually 0.05 eV/Å for maximum of atomic forces). 
The free energy corrections, including zero-point energy (ZPE) and thermal corrections (translational, rotational and vibrational) for gas-phase molecules, as well as ZPE and vibrational contribution for adsorbates, are also computed using ASE. All these codes are open-sourced in the ATST-Tools suite ^[61], which supports using ABACUS and our FT$$ ^2 $$DP model as a property calculator. All structures in this part were visualized using ASE.
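The two-stage idea behind D2S can be illustrated on a toy problem. The sketch below is not the ATST-Tools implementation and does not use ASE, CI-NEB, or Sella: a plain linear interpolation between two minima stands in for the loosely converged double-ended stage, and a Newton-Raphson refinement with an analytic Hessian stands in for the single-ended stage. The two-dimensional model potential and all function names are invented for illustration.

```python
def energy(p):
    """Toy potential with minima at (-1, 0), (1, 0) and a saddle at (0, 0.4)."""
    x, y = p
    u = y - 0.4 + 0.4 * x * x
    return (x * x - 1.0) ** 2 + 2.0 * u * u


def gradient(p):
    x, y = p
    u = y - 0.4 + 0.4 * x * x
    return (4.0 * x * (x * x - 1.0) + 3.2 * x * u, 4.0 * u)


def hessian(p):
    x, y = p
    u = y - 0.4 + 0.4 * x * x
    hxx = 12.0 * x * x - 4.0 + 3.2 * u + 2.56 * x * x
    return ((hxx, 3.2 * x), (3.2 * x, 4.0))


def rough_path_maximum(start, end, n_images=6):
    """Stage 1 (double-ended stand-in): highest image on a straight-line path."""
    images = []
    for i in range(n_images):
        t = i / (n_images - 1)
        images.append((start[0] + t * (end[0] - start[0]),
                       start[1] + t * (end[1] - start[1])))
    return max(images, key=energy)


def refine_saddle(guess, fmax=0.05, max_steps=50):
    """Stage 2 (single-ended stand-in): Newton steps until max |gradient| < fmax."""
    x, y = guess
    for _step in range(max_steps):
        gx, gy = gradient((x, y))
        if max(abs(gx), abs(gy)) < fmax:
            break
        (hxx, hxy), (_hyx, hyy) = hessian((x, y))
        det = hxx * hyy - hxy * hxy
        x += (-gx * hyy + gy * hxy) / det
        y += (gx * hxy - gy * hxx) / det
    return (x, y)


def is_first_order_saddle(p):
    """In two dimensions, a negative Hessian determinant means one negative mode."""
    (hxx, hxy), (_hyx, hyy) = hessian(p)
    return hxx * hyy - hxy * hxy < 0.0
```

The rough guess from the first stage does not sit on the saddle, but it lies close enough that the second stage converges in a few steps, which mirrors the motivation given above for combining the two classes of methods.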


Figure 1. Scheme of model applications. (A) TS optimization, starting from a map of probable reaction network providing the reaction patterns of elementary reactions, and each reaction (taking CO dissociation as an example) is calculated by optimizing its IS, FS, and TS, where the TS is optimized by the D2S workflow; (B) Conventional and local schemes in optimizing the edge sites of the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) slab using a GA. Orange: Fe atoms; Grey: C atoms; Red: O atoms. TS: Transition state; IS: initial state; FS: final state; D2S: double-to-single; GA: genetic algorithm.

Genetic algorithm for global optimization

We employed the genetic algorithm (GA) implemented in ASE ^[62], using FT$$ ^2 $$DP as the energy-evaluation calculator. In our global optimization process, the initial and subsequent populations each consist of 80 members, with each population exploring 80 new mutated candidates. If convergence is found difficult with this default setting, the population size and number of mutated candidates will be increased to 100. The mutation of candidates involves three operators implemented in the ASE-GA module: mirror (mirroring half atoms in a randomly oriented cutting plane), rattle (perturbing a part of atoms with a small random displacement), and permutation (switching the positions of a subset of atoms randomly), each with the same probability of 1/3. The stopping criterion in GA is regarded as met when the best candidate does not change for ten generations and the current generation number is at least 20. Following this, at least ten lowest-energy candidates are refined with PBE-DFT single-point calculations to identify the most stable structure. When using GA to obtain low-energy surface reconstructions, conventional implementation typically operates on an entire surface with constraints only on the z-axis^{[63, 64]}. This approach is unnecessary for predicting surface reconstructions that only involve local environments such as edge structures. In this work, we adopted a local setting implemented in ASE, specifying the involved atoms and where they are generated [Figure 1B]. All structures in this part were visualized using VESTA^[65].
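The stopping rule and the equal-probability choice among the three mutation operators described above can be written compactly. The sketch below is independent of the ASE-GA module; the function names and the convention of storing one best energy per generation are assumptions made for illustration.

```python
import random

MUTATIONS = ("mirror", "rattle", "permutation")


def pick_mutation(rng):
    """Choose one of the three operators with equal probability (1/3 each)."""
    return rng.choice(MUTATIONS)


def ga_converged(best_energy_history, patience=10, min_generations=20, tol=1e-8):
    """Return True when the stopping criterion described in the text is met.

    best_energy_history holds the best energy found after each generation.
    The run stops once at least min_generations have been completed and the
    best energy has not changed over the last `patience` generations.
    """
    n_generations = len(best_energy_history)
    if n_generations < min_generations or n_generations <= patience:
        return False
    recent = best_energy_history[-(patience + 1):]
    return max(recent) - min(recent) <= tol
```

Comparing energies within a tolerance is a simplification; comparing the structures of the best candidates would be a stricter test of whether the best candidate has changed.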

Ab initio atomistic thermodynamics

To evaluate the stability of reconstructed surfaces with different compositions, the ab initio atomistic thermodynamics theory developed by Reuter and Scheffler^{[66, 67]} was used in this work. In this way, the surface energy ($$ \gamma $$) of a symmetric surface can be calculated from

(1)

$$ \begin{equation} \gamma=\frac{1}{2 A}\left[G_{\mathrm{slab}}\left(N_{\mathrm{Fe}}, N_{\mathrm{C}}\right)-N_{\mathrm{Fe}} \mu_{\mathrm{Fe}}-N_{\mathrm{C}} \mu_{\mathrm{C}}\right], \end{equation} $$

where $$ G_\text{slab} $$ is the Gibbs free energy of a slab with two equivalent surfaces, $$ \mu_\text{Fe} $$ and $$ \mu_\text{C} $$ are the chemical potentials of Fe and C atoms, $$ N_\text{Fe} $$ and $$ N_\text{C} $$ are the numbers of Fe and C atoms, and $$ A $$ is the surface area. The Gibbs free energies were calculated by FT$$ ^2 $$DP under $$ T $$ = 523 K, which is a typical iron-based FTS temperature.

For a surface that is in equilibrium with a bulk with a fixed composition (e.g., FeC$$ _x $$), the chemical potentials of the contained elements are not all independent. In this work, $$ \mu_{\mathrm{Fe}} $$ and $$ \mu_{\mathrm{C}} $$ are related to the Gibbs free energy per formula unit of the FeC$$ _x $$ bulk ($$ \mu_{\text{FeC}_x} $$) as:

(2)

$$ \begin{equation} \mu_{\mathrm{Fe}}+\mathrm{x} \mu_{\mathrm{C}}=\mu_{\mathrm{FeC}_x}. \end{equation} $$

In our discussion, since not all structures contain the same number of atoms, we defined a relative surface energy ($$ \Delta \gamma $$) as:

(3)

$$ \begin{equation} \Delta \gamma=\frac{1}{A}\left[G_{\mathrm{rec}}\left(N_{\mathrm{Fe}}-\Delta N_{\mathrm{Fe}}, N_{\mathrm{C}}-\Delta N_{\mathrm{C}}\right)-G_{\text{slab }}\left(N_{\mathrm{Fe}}, N_{\mathrm{C}}\right)+\Delta N_{\mathrm{Fe}} \mu_{\mathrm{FeC_x}}+\left(\Delta N_{\mathrm{C}}-\mathrm{x} \Delta N_{\mathrm{Fe}}\right) \mu_{\mathrm{C}}\right]. \end{equation} $$

Here, the reference ($$ G_{\text{slab }} $$) is the clean slab without edge sites. The reason why the denominator "$$ 2A $$" in Equation (1) is replaced by "$$ A $$" in Equation (3) is that edge sites are only built on the top layer in this work, while the bottom layers remain fixed in structural relaxations. Additionally, $$ \Delta N_{\mathrm{Fe}} $$ or $$ \Delta N_{\mathrm{C}} $$ represents the differences in the number of Fe or C atoms between the reconstructed structure and the clean surface, respectively.

For convenience, we used the electronic energy of an isolated carbon atom ($$ E_\text{C} $$) as the reference for the carbon chemical potential, that is, $$ \Delta \mu_\text{C} = \mu_\text{C} - E_\text{C} $$. Since the free energies and chemical potentials depend on temperature, pressure, and gas atmosphere, we used the results from Liu et al. to simulate a realistic iron-based FTS condition ($$ T $$ = 523 K, -7.45 eV $$ \leq $$ $$ \Delta \mu _\text{C} $$ $$ \leq $$ -6.60 eV)^[24].
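Equation (3) is straightforward to evaluate once the free energies and chemical potentials are known. The sketch below is a direct transcription under the sign convention of Equation (3), where $$ \Delta N_{\mathrm{Fe}} $$ and $$ \Delta N_{\mathrm{C}} $$ count the atoms missing from the reconstructed structure relative to the clean slab, and $$ \mu_\text{C} = E_\text{C} + \Delta \mu_\text{C} $$. The function name, argument names, and any numbers used with it are illustrative only and are not values from this work.

```python
def relative_surface_energy(g_rec, g_slab, d_n_fe, d_n_c,
                            mu_bulk, x, e_c, d_mu_c, area):
    """Evaluate Equation (3).

    g_rec, g_slab : Gibbs free energies of the reconstructed and clean slabs (eV)
    d_n_fe, d_n_c : Fe and C atoms missing in the reconstructed slab
    mu_bulk       : Gibbs free energy per FeC_x formula unit (eV)
    x             : carbon content of the bulk carbide FeC_x
    e_c, d_mu_c   : isolated-atom carbon energy and relative chemical potential (eV)
    area          : surface area (Angstrom^2)
    """
    mu_c = e_c + d_mu_c
    bracket = (g_rec - g_slab
               + d_n_fe * mu_bulk
               + (d_n_c - x * d_n_fe) * mu_c)
    return bracket / area


def mu_fe_from_bulk(mu_bulk, x, mu_c):
    """Equation (2) rearranged: mu_Fe = mu_FeCx - x * mu_C."""
    return mu_bulk - x * mu_c
```

Scanning `d_mu_c` over the interval quoted above gives the relative stability of each reconstruction as a function of the carbon chemical potential.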

RESULTS AND DISCUSSION

FT²DP construction and validation

Our model, FT$$ ^2 $$DP, is constructed by fine-tuning the upstream DPA-2.2.0 model ^[68], a pre-trained open LAM (OpenLAM) from the AIS Square website ^[69], on our downstream dataset. Thanks to the multi-task training protocol, this LAM was trained on more than 20 different datasets containing various physical and chemical systems, including organic molecules, clusters, alloys, semiconductors, surfaces, and adsorbates, gathering multidisciplinary knowledge in one unified DPA-2 descriptor. Apart from the descriptor, the fitting model is a neural network containing three hidden layers with (240, 240, 240) neurons for all heads in the upstream DPA-2.2.0 model and in our fine-tuned model.

There are 30,656 frames in our FT$$ ^2 $$DP downstream dataset, covering Fe-C-H-O element combinations and various types of structures, derived from the previous work by Liu et al. ^[25]. Approximately 8,000 structures were removed by the following data cleaning procedures, which target outliers and redundancies: (1) Removal of structures with identical DFT-calculated energy labels to eliminate redundant conformations; (2) Removal of structures with fewer than 12 atoms per cell, which often represent isolated molecules or radicals in a cell. Such configurations are prone to DFT inaccuracies under PBCs or poor MLP generalizability; (3) Exclusion of structures for which the absolute difference between model prediction and DFT result exceeds 80.0 meV/atom (energy) or 1.00 eV/Å (maximum atomic force), where the model referenced here was fine-tuned from the upstream DPA-2 model on the original dataset following the same fine-tuning protocol detailed below. All DFT energies and forces were calculated by ABACUS following the computational settings mentioned above. A brief overview of this dataset is given in Table 1, showing the number of structures (with or without Fe) of different types, including three-dimensional (3D) bulks, two-dimensional (2D) surfaces, one-dimensional (1D) strings, and zero-dimensional (0D) clusters. In addition, a sketch-map visualization is shown in Figure 2 to illustrate the wide configuration distribution of our FT$$ ^2 $$DP dataset.
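The three cleaning rules can be expressed as a single filtering pass. The sketch below is an illustration rather than the script used in this work: the dictionary keys are hypothetical, "identical energy labels" is interpreted as exact equality with the first occurrence kept, and the thresholds are the ones quoted above.

```python
def clean_dataset(frames, min_atoms=12, max_energy_err=0.080, max_force_err=1.00):
    """Apply the three filters in order and return the surviving frames.

    Each frame is a dict with hypothetical keys:
      'e_dft'   DFT total energy (eV)
      'e_pred'  energy predicted by the reference fine-tuned model (eV)
      'n_atoms' number of atoms in the cell
      'df_max'  largest absolute force error on any atom (eV/Angstrom)
    """
    seen_energies = set()
    kept = []
    for frame in frames:
        # (1) Drop frames whose DFT energy label repeats an earlier one.
        if frame["e_dft"] in seen_energies:
            continue
        seen_energies.add(frame["e_dft"])
        # (2) Drop cells with fewer than min_atoms atoms.
        if frame["n_atoms"] < min_atoms:
            continue
        # (3) Drop outliers in energy per atom (eV/atom) or maximum force error.
        err_per_atom = abs(frame["e_pred"] - frame["e_dft"]) / frame["n_atoms"]
        if err_per_atom > max_energy_err or frame["df_max"] > max_force_err:
            continue
        kept.append(frame)
    return kept
```

Because rule (3) depends on a model trained on the uncleaned data, the filter is applied once after that preliminary fine-tuning run, not iteratively.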

Table 1

A brief overview of the FT$$ ^2 $$DP downstream dataset used in training the model

Type of structures	3D bulks	2D surfaces	1D strings	0D clusters	All
Number of structures with Fe	6,917	13,829	117	37	20,900
Number of structures without Fe	7,341	1,114	330	971	9,756
Total number	14,258	14,943	447	1,008	30,656
FT$$ ^2 $$DP: Fine-tuned Fischer-Tropsch deep potential; 3D: three-dimensional; 2D: two-dimensional; 1D: one-dimensional; 0D: zero-dimensional.

Figure 2. Sketch-map visualization of the FT$$ ^2 $$DP downstream dataset. Each point of this map represents an individual atomic configuration. The position of each point is determined by t-SNE of the learned descriptor of the FT$$ ^2 $$DP model, and its color indicates the corresponding formation energy of the structure. FT$$ ^2 $$DP: Fine-tuned Fischer-Tropsch deep potential; t-SNE: t-distributed stochastic neighbor embedding.

Our fine-tuning protocol was initialized with the parameters of the global descriptor in the pre-trained DPA-2.2.0 model and the fitting network from the Domains_OC2M branch. The fine-tuning process on our dataset follows the default training process of the DPA-2 model with some modified settings. In the default pre-training process of the DPA-2.2.0 LAM, the learning rate starts from $$ 2 \times 10^{-4} $$ and gradually decreases to $$ 3.51 \times 10^{-8} $$ by an exponential decay scheme, with one decrease applied after every 1/200 of the total training steps. This setting is usually effective for training from scratch, but the initial learning rate may be too large for our fine-tuning tasks, because it risks over-fitting to the downstream dataset, which would erase the global knowledge stored in the model descriptor, and may cause gradient explosion during training. As a result, a relatively low initial learning rate of $$ 2 \times 10^{-5} $$ was utilized together with one million training batches in our fine-tuning protocol.
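A staircase exponential schedule of this kind is fully determined by its start value, stop value, and number of decay events. The sketch below reproduces the schedule as described in the text and is not taken from the DeePMD-kit source; the function name is hypothetical, and reusing the stop value $$ 3.51 \times 10^{-8} $$ for the fine-tuning run is an assumption, since only the initial rate is stated above.

```python
def exp_decay_lr(step, start_lr, stop_lr, total_steps, n_decays=200):
    """Staircase exponential decay from start_lr to stop_lr.

    The rate is multiplied by a constant factor every total_steps / n_decays
    steps, chosen so that the rate equals stop_lr at step == total_steps.
    """
    decay_steps = total_steps // n_decays
    decay_rate = (stop_lr / start_lr) ** (1.0 / n_decays)
    return start_lr * decay_rate ** (step // decay_steps)


# Illustrative comparison at the midpoint of a one-million-step run.
pretrain_mid = exp_decay_lr(500_000, 2e-4, 3.51e-8, 1_000_000)
finetune_mid = exp_decay_lr(500_000, 2e-5, 3.51e-8, 1_000_000)  # stop value assumed
```

With the lower starting value, the per-step decay factor is closer to one, so the fine-tuning schedule is both smaller in magnitude and flatter than the pre-training one.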

To evaluate the generalizability of the FT$$ ^2 $$DP model, we performed a validation test in which a model, named FT$$ ^2 $$DP-80p, was trained on a randomly selected 80% subset and tested on the remaining 20%. The validation results [Supplementary Figure 1, Table 2] comprise parity plots of formation energies and atomic forces, together with R$$ ^2 $$ values, the mean absolute error (MAE), and the root mean square error (RMSE) comparing model predictions to DFT-calculated energies and forces. The FT$$ ^2 $$DP-80p model achieved comparable performance on the training and validation subsets, confirming its robust generalizability. Furthermore, the accuracy of the final FT$$ ^2 $$DP model fine-tuned on the entire dataset is demonstrated in Figure 3A and B and Table 2 (final column), showing strong agreement between FT$$ ^2 $$DP predictions and DFT calculations for energies and forces. Moreover, the violin plots in Figure 3C and D illustrate the distribution of the energy and atomic force differences between FT$$ ^2 $$DP predictions and DFT results, showing that although there are some outliers spread over a relatively wide range, our FT$$ ^2 $$DP model still achieves good accuracy, with an energy deviation of less than 3 meV/atom and a force deviation of less than 0.09 eV/Å for 75% of the FT$$ ^2 $$DP dataset. All these results indicate the model's promising reliability for direct usage in high-efficiency structural optimization and TS searching tasks. The accuracy of the final FT$$ ^2 $$DP model will be further demonstrated below by comparing FT$$ ^2 $$DP and DFT results for the many structures that emerged from atomistic modeling practices.
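For reference, the three metrics reported in Table 2 follow the standard definitions. The sketch below implements them in plain Python with the DFT values as the reference; the function names are hypothetical. Per-atom energy errors are obtained by dividing each energy by the number of atoms in its structure before calling these functions.

```python
import math


def mae(reference, predicted):
    """Mean absolute error."""
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


def rmse(reference, predicted):
    """Root mean square error."""
    squared = sum((p - r) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot
```

Because RMSE weights large deviations more heavily than MAE, the gap between the two in Table 2 is a rough indicator of how much the outliers contribute to the overall error.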

Figure 3. Validation results of the FT$$ ^2 $$DP model after fine-tuning on the entire dataset. Figures on the left are the parity plots comparing formation energies (A) and forces (B) from FT$$ ^2 $$DP against those from DFT on the dataset, with R$$ ^2 $$ equal to 1.000 and 0.9570, respectively. Figures on the right illustrate the violin plots of (C) the distribution of DFT formation energies and atomic forces of the FT$$ ^2 $$DP dataset and (D) the distribution of differences in energies and atomic forces between FT$$ ^2 $$DP predictions and DFT results on this dataset, where a box plot without outliers is used to show the detailed distribution inside the peak region. FT$$ ^2 $$DP: Fine-tuned Fischer-Tropsch deep potential; DFT: density-functional theory.

Additionally, it is valuable to identify outlier structures with high prediction errors to investigate their characteristics and similarities. Structures in the entire FT$$ ^2 $$DP dataset with the top ten highest absolute prediction errors in energies ($$ \vert \Delta E \vert $$) and maximum absolute prediction errors in atomic forces ($$ \vert \Delta F \vert_{max} $$) are presented in Supplementary Tables 1 and 2 and Supplementary Figures 2 and 3, respectively. We observe that most of these structures exhibit disordered configurations or unphysical features. For example, 60% of these structures correspond to the same chemical formula C$$ _{6} $$H$$ _{12} $$O$$ _{6} $$, displaying disordered molecular configurations in a relatively small cubic cell with a lattice vector length of 10 Å, rendering their physical state (gas/liquid/solid) indeterminate. Moreover, the third- and sixth-ranked structures in $$ \vert \Delta E \vert $$ contain unphysical H$$ _6 $$ clusters located in the vacuum layer of Fe$$ _{16} $$H$$ _{7} $$ surfaces. While these outliers reflect the equilibrium diversity of the dataset, which could improve model stability in the non-equilibrium region^[70], they also highlight the presence of unphysical configurations, which must be removed by data cleaning to prevent training instability and degraded model performance in practical atomistic modeling applications.

Table 2

Validation results of the FT$$ ^2 $$DP models

Test items	FT$$ ^2 $$DP-80p on the training set	FT$$ ^2 $$DP-80p on the validation set	Final FT$$ ^2 $$DP on the entire dataset
Number of structures	24,525	6,131	30,656
Energy parity plot R$$ ^2 $$	1.000	1.000	1.000
Energy MAE (eV)	0.1837	0.1890	0.1829
Energy RMSE (eV)	0.3141	0.3233	0.3112
Energy MAE (meV/atom)	5.497	5.780	5.500
Energy RMSE (meV/atom)	10.02	10.85	9.974
Force parity plot R$$ ^2 $$	0.9605	0.9480	0.9570
Force MAE (eV/Å)	0.0764	0.0792	0.0764
Force RMSE (eV/Å)	0.1167	0.1271	0.1167
FT$$ ^2 $$DP: Fine-tuned Fischer-Tropsch deep potential; MAE: mean absolute error; RMSE: root mean square error.
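For readers who wish to reproduce metrics of the kind listed in Table 2, the following sketch shows how MAE, RMSE, and R$$ ^2 $$ can be computed for total and per-atom energies. It is an illustration under stated assumptions: R$$ ^2 $$ is taken as the coefficient of determination (the definition used for the parity plots is not restated in this section), the function names are hypothetical, and the numbers are toy values rather than entries of the FT$$ ^2 $$DP dataset.

```python
import math

def mae(pred, ref):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def rmse(pred, ref):
    """Root mean square error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def r2(pred, ref):
    """Coefficient of determination (assumed definition of the parity-plot R^2)."""
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((p - r) ** 2 for p, r in zip(pred, ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - ss_res / ss_tot

# Toy data (hypothetical): total energies in eV and atom counts per structure.
e_dft = [-10.0, -20.0, -30.0]
e_pred = [-10.1, -19.9, -30.2]
n_atoms = [10, 20, 40]

# Per-atom energies in meV/atom, so that large cells do not dominate the error.
e_dft_pa = [1000.0 * e / n for e, n in zip(e_dft, n_atoms)]
e_pred_pa = [1000.0 * e / n for e, n in zip(e_pred, n_atoms)]

mae_total = mae(e_pred, e_dft)            # eV
rmse_total = rmse(e_pred, e_dft)          # eV
mae_per_atom = mae(e_pred_pa, e_dft_pa)   # meV/atom
r2_total = r2(e_pred, e_dft)
print(mae_total, rmse_total, mae_per_atom, r2_total)
```

Reporting both total and per-atom errors, as in Table 2, separates the effect of system size from the intrinsic accuracy of the model.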

Reaction pathways

Many previous studies have demonstrated that certain surfaces exhibit much higher FTS activity, identifying them as the active surfaces^{[6, 71]}. A well-known example is the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface, which has been shown to present relatively low CO dissociation and C-C coupling barriers^{[16, 24, 25]}. In this section, the FT$$ ^2 $$DP model and our D2S TS optimization workflow are used to investigate the key elementary reactions of the FTS reaction pathways on the A-P5 site of the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface. Previous studies^{[6, 24, 25]} have identified this site as an important active site for the FTS process because its lattice carbon and carbon vacancy participate in the surface reaction through a Mars-van Krevelen (MvK) mechanism. In particular, we consider the dissociative adsorption of H$$ _2 $$ by the Langmuir-Hinshelwood (L-H) mechanism, the dissociative adsorption of CO by the MvK mechanism, the competition between chain growth (C-C coupling) and carbon hydrogenation (C-H coupling) for carbon adsorbates, and the desorption of hydrocarbon compounds such as CH$$ _4 $$. Our investigation also reveals an MvK reaction mechanism similar to that reported in previous works, especially for the CO dissociation and chain growth processes, illustrating the importance of the A-P5 site on iron carbide for FTS.

To further validate the accuracy of the FT$$ ^2 $$DP model in TS optimization, we compared the results from three types of calculations in Figure 4: (1) purely FT$$ ^2 $$DP-based; (2) DFT single-point calculation after TS optimization by FT$$ ^2 $$DP, denoted as DFT@FT$$ ^2 $$DP; and (3) purely DFT-based. The comparison shows that most reaction pathways calculated directly by FT$$ ^2 $$DP agree well with the DFT benchmarks. However, a subset of cases exhibits notable energy discrepancies between FT$$ ^2 $$DP and DFT, particularly for TSs. For example, the TS5 and TS6 states in Figure 4B show energy differences of up to 0.30 eV. Such errors could propagate into significant uncertainties in identifying the rate-determining step (RDS) and in the microkinetic modeling of reaction networks. This highlights the inherent challenges in training MLPs to capture reactive dynamics and the limitations of relying solely on MLPs for reaction simulation, especially for out-of-distribution configurations. Further improvement can be attained with the DFT@FT$$ ^2 $$DP scheme. For instance, the energy differences of the TS5 and TS6 states in Figure 4B are reduced to 0.01 and 0.03 eV, respectively, when comparing the DFT@FT$$ ^2 $$DP results with the pure DFT results. This hybrid approach aligns with established practices in MLP-based studies^[24], following the spirit of using low-fidelity (or rough) computational methods for geometry evaluation while applying high-fidelity (or precise) computational methods to refine the energy and electronic structure of crucial states^{[72, 73]}. While FT$$ ^2 $$DP can qualitatively map reaction networks without requiring posterior DFT energy corrections, quantitative accuracy cannot be fully guaranteed without further model refinement and dataset expansion to span the target reactive chemical space.
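To give a sense of how such barrier errors affect kinetics, the sketch below converts an error in the activation energy into a multiplicative error in a rate constant. It assumes transition-state-theory (Arrhenius-like) kinetics with an unchanged prefactor, and the temperature of 523 K is an illustrative assumption for FTS conditions rather than a value from this work. The function name `rate_ratio` is hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def rate_ratio(barrier_error_ev, temperature_k):
    """Factor by which a rate constant changes when the barrier is off by
    barrier_error_ev, assuming k ~ exp(-Ea / (kB * T)) with a fixed prefactor."""
    return math.exp(barrier_error_ev / (K_B_EV * temperature_k))

T_ASSUMED = 523.0  # K, illustrative assumption

# 0.30 eV: largest FT2DP-vs-DFT deviation quoted for TS5/TS6;
# 0.03 and 0.01 eV: residual DFT@FT2DP-vs-DFT deviations for the same states.
ratios = {err: rate_ratio(err, T_ASSUMED) for err in (0.30, 0.03, 0.01)}
for err, ratio in ratios.items():
    print(f"barrier error {err:.2f} eV -> rate constant off by a factor of {ratio:.3g}")
```

Under these assumptions, a 0.30 eV error changes the rate constant by a factor of several hundred, whereas the residual 0.01 to 0.03 eV errors of the DFT@FT$$ ^2 $$DP scheme stay within about a factor of two, which is why the single-point correction matters for identifying the RDS.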

Figure 4. FTS reaction pathways on $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface, including (A) dissociation of H$$ _2 $$ and CO on A-P5 site; (B) *CH-*C coupling and following steps towards long-chain (C$$ _{2+} $$) hydrocarbon products; (C) *CH-*H coupling steps towards CH$$ _4 $$ product; and (D) *CH-*CH coupling steps towards CH$$ _2 $$CH$$ _2 $$ product. In these plots, red line denotes that the PES calculations are done merely by FT$$ ^2 $$DP, blue line denotes that the calculated data are acquired by single-point DFT calculation after FT$$ ^2 $$DP optimization, purple line denotes that the PES calculations are purely performed by DFT calculation, and C$$ _v $$ stands for carbon vacancy on A-P5 site. All the top views are for DFT-optimized structures. Orange: Fe atoms; Grey: C atoms; White: H atoms; Red: O atoms. FTS: Fischer-Tropsch synthesis; PES: potential energy surface; FT$$ ^2 $$DP: fine-tuned Fischer-Tropsch deep potential; DFT: density-functional theory.

Global optimization of reconstruction of edge sites

In addition to the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface, some other surfaces have also been proposed as catalytically active in the Fe-FTS process, including $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021), $$ \chi $$-Fe$$ _5 $$C$$ _2 $$($$ \overline{4} $$11), $$ \eta $$-Fe$$ _2 $$C(111), $$ \theta $$-Fe$$ _3 $$C(010), and $$ \theta $$-Fe$$ _3 $$C(031)^{[18, 25]}. In recent years, some studies have suggested that the FTS activity of low-coordinated atoms (typically at edges and corners) may differ significantly from that of regular surface atoms^{[12, 74]}, highlighting the importance of these sites in mechanistic research. In this section, four potential FeC$$ _x $$ active surfaces in iron-based FTS, i.e., $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510), $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021), $$ \eta $$-Fe$$ _2 $$C(111), and $$ \theta $$-Fe$$ _3 $$C(010) [Supplementary Figure 4], were considered to investigate the surface morphology and stability of edge sites. Each surface was modeled using a slab with a thickness of more than 10 Å, with the bottom two or three layers of FeC$$ _x $$ fixed during structural relaxations. A vacuum layer of 20 Å was introduced to separate two neighboring slabs for all surfaces. To model the edge sites, a 'half-surface' scheme was employed: half of the atoms in the top layer were removed, which maximizes the distance and thus minimizes the interaction between two adjacent edge sites. This approach is particularly useful for GA calculations, as it allows for two independent local searches on the two edges to identify the most stable reconstructed structure.
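The 'half-surface' construction can be illustrated with a purely geometric sketch. This is not the procedure used to build the models in this work, where specific rows of Fe atoms and the corresponding C atoms were removed; it only assumes that the top layer can be identified by a height tolerance and that the half of it lying in the first half of the cell along x is deleted. The function `make_half_surface`, its parameters, and the toy slab are hypothetical.

```python
def make_half_surface(atoms, cell_x, layer_tol=0.5):
    """Remove the top-layer atoms located in the first half of the cell along x.

    atoms     : list of (symbol, x, y, z) tuples, coordinates in Angstrom
    cell_x    : cell length along x in Angstrom (periodic direction)
    layer_tol : atoms within this distance below the highest atom count as top layer
    """
    z_top = max(z for _, _, _, z in atoms)
    kept = []
    for sym, x, y, z in atoms:
        in_top_layer = z > z_top - layer_tol
        frac_x = (x % cell_x) / cell_x  # fractional coordinate in [0, 1)
        if in_top_layer and frac_x < 0.5:
            continue  # this atom belongs to the removed half of the top layer
        kept.append((sym, x, y, z))
    return kept

# Toy two-layer slab (hypothetical geometry): four Fe atoms per layer along x.
cell_x = 10.0
slab = [("Fe", x, 0.0, z) for z in (0.0, 2.0) for x in (0.0, 2.5, 5.0, 7.5)]

half = make_half_surface(slab, cell_x)
top_x = sorted(x for _, x, _, z in half if z == 2.0)
print(len(half), top_x)
```

Removing exactly half of the top layer leaves a terrace and a trench of equal width, so the two step edges in the periodic cell are as far apart as the cell allows.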

Before performing GA calculations, we first checked the accuracy of FT$$ ^2 $$DP on large clean surfaces and unreconstructed edge structures. We used the FT$$ ^2 $$DP model to optimize the structures and then calculated their single-point energies at the PBE level. In Supplementary Tables 3 and 4, we report the differences between the PBE single-point energies of the FT$$ ^2 $$DP-optimized structures and those of the PBE-optimized structures. The energy differences for all tested structures are less than 1.2 meV/atom, indicating that FT$$ ^2 $$DP is sufficiently accurate for investigating surface reconstructions of iron carbides.

There are two possible terminations of $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510)^[75], namely the Fe- and C-terminated surfaces, and their relative stability depends on the chemical potential of carbon ($$ \Delta \mu _\text{C} $$). In modeling the edge sites, we chose to remove four consecutive rows of Fe atoms (24 Fe atoms in total), along with the corresponding C atoms. These unreconstructed structures with edge sites may contain many under-coordinated Fe and C atoms, making global optimization essential to obtain low-energy reconstructed structures. A representative low-energy reconstructed structure of the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface with edge sites is shown in Figure 5A. In this configuration, only the relocation of carbon atoms is observed, while the positions of iron atoms remain largely unchanged, which is consistent with previous findings^[24]. In contrast, reconstructions involving the migration of both iron and carbon atoms were identified on Fe-terminated surfaces, together with some newly formed [Fe$$ _4 $$C] squares [Supplementary Figure 5]. Since the number of atoms differs among structures, we used Equation (3) to characterize their stabilities and plotted the relative surface energies ($$ \Delta \gamma $$) against $$ \Delta \mu _\text{C} $$ in Figure 5B. Notably, reconstructed C-terminated surfaces, such as C-4-7 in Figure 5B, exhibit higher stability than other clean surfaces at the upper limit of $$ \Delta \mu _\text{C} $$, including the reconstructed $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface built according to the previous work of Liu et al.^[24]. Previous studies^[25] have demonstrated the important role of [Fe$$ _4 $$C] squares (or A-P5 sites) in facilitating both CO dissociation and C-C coupling reactions. We obtained a reconstructed structure containing a row of inclined [Fe$$ _4 $$C] squares, as shown in Figure 5A, which implies that the reconstructed edge sites on $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surfaces could serve as active sites for iron-based Fischer-Tropsch reactions.

Figure 5. Structure of the lowest-energy reconstruction of each surface and relative surface energies of low-energy reconstructions with respect to $$ \Delta \mu _\text{C} $$. (A) and (B) $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface, (C) and (D) $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021) surface, (E) and (F) $$ \theta $$-Fe$$ _3 $$C(010) surface. When referring to $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surfaces, "Fe-" or "C-" denotes the Fe-terminated or C-terminated surface, respectively. The numbers in the names of surfaces [e.g., 6-15 in $$ \theta $$-Fe$$ _3 $$C(010) surface] represent the range of column numbers (counted from left to right) from which iron atoms have been removed. The cyan curve in (B) marked by "clean-rec" corresponds to the reconstructed $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(510) surface reported by the previous work of Liu et al.^[24].

$$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021) and $$ \theta $$-Fe$$ _3 $$C(010) were also considered as potential active surfaces in iron-based FTS^{[17, 25]}. Among them, the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021) surface exhibits greater structural complexity than the other three surfaces, consisting of both [Fe$$ _4 $$C] squares and [Fe$$ _5 $$C] pentagons [Supplementary Figure 4]. Despite this complexity, our workflow successfully performed global optimizations and identified the reconstructed surfaces with edge sites. The most stable reconstructed structure is illustrated in Figure 5C, where newly formed [Fe$$ _4 $$C] squares are also observed (see Supplementary Figure 6 for additional structures). The relative surface energies ($$ \Delta \gamma $$) against $$ \Delta \mu _\text{C} $$ are plotted in Figure 5D. Although all of the reconstructed surfaces are less stable than the clean $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021) surface, the energy differences are relatively small, so they may still play certain roles in FTS at high temperatures.

Figure 6. Results of variable-composition calculations on the $$ \eta $$-Fe$$ _2 $$C(111) surface. (A) A schematic representation of the variable-composition search, where "-3C" indicates three fewer carbon atoms compared to the unreconstructed structure; (B) Side and top views of unreconstructed and reconstructed structures of the most stable configuration at the lower $$ \Delta \mu _\text{C} $$ limit; (C) Plot of the relative surface energies of the most stable reconstructions with different total carbon numbers against $$ \Delta \mu _\text{C} $$.

The clean $$ \theta $$-Fe$$ _3 $$C(010) surface exhibits a highly ordered structure. The most stable reconstructed structure of the $$ \theta $$-Fe$$ _3 $$C(010) surface with edge sites is presented in Figure 5E. This behavior resembles the phenomenon of "island decay" in surface science^[76], where a rough surface evolves toward a smoother surface to achieve higher stability (see Supplementary Figure 7 for two additional reconstructions). Similar to the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021) surface, newly formed [Fe$$ _4 $$C] squares are observed in these reconstructed $$ \theta $$-Fe$$ _3 $$C(010) surfaces, while the original surface morphology is still preserved. The relative surface energies ($$ \Delta \gamma $$) as a function of $$ \Delta \mu _\text{C} $$ are plotted in Figure 5F. Notably, none of the reconstructed $$ \theta $$-Fe$$ _3 $$C(010) surfaces is found to be more stable than the clean surface. Moreover, the energy differences are significantly larger than those on the $$ \chi $$-Fe$$ _5 $$C$$ _2 $$(021) surfaces, indicating that the formation of edge sites on $$ \theta $$-Fe$$ _3 $$C(010) is much less favorable under typical Fischer-Tropsch reaction conditions.

Compared to other surfaces, the $$ \eta $$-Fe$$ _2 $$C(111) surface exhibits a relatively simple structure, and all surface Fe-4fold square sites are fully occupied by carbon atoms [Supplementary Figure 4]. Here, we demonstrate variable-composition global optimization calculations involving different numbers of carbon atoms. In fixed-composition calculations, the two edges of each surface were optimized separately using GA. In variable-composition calculations, the number of atoms at each edge can change, and the searches are repeated several times with different total numbers of atoms [Figure 6A]. One example of a reconstructed surface is shown in Figure 6B, where carbon atoms migrate from one edge to the other, forming C-C dimers to fully bond with neighboring atoms. The relative surface energies ($$ \Delta \gamma $$) of the reconstructed surfaces against $$ \Delta \mu _\text{C} $$ are plotted in Figure 6C. Structures with varying carbon numbers are detailed in Supplementary Figure 8. Although all of the reconstructed surfaces are less stable than the clean $$ \eta $$-Fe$$ _2 $$C(111) surface, the energy differences between them are relatively small, especially for carbon-rich structures under a high $$ \Delta \mu _\text{C} $$. For example, at the lower limit of $$ \Delta \mu _\text{C} $$ = -7.45 eV, the most stable reconstructed surface is identified as "-3C", which means that the surface energy is minimized when three fewer carbon atoms are present than in the unreconstructed structure. Conversely, at the upper limit of $$ \Delta \mu_\text{C} $$ = -6.60 eV, the relative surface energy decreases monotonically with increasing carbon number, and the stability of the surface with the most carbon atoms ("+C") is very close to that of the clean surface. This suggests that carbon deposition on the iron carbide surface may be thermodynamically favorable under a high $$ \Delta \mu _\text{C} $$, typically near the reactant-catalyst equilibrium in a carbon-rich gas environment.
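The dependence of the preferred composition on $$ \Delta \mu _\text{C} $$ can be illustrated with a short sketch. It does not reproduce Equation (3); it only assumes the generic linear form $$ \Delta \gamma_i = (a_i - \Delta N_{\text{C},i} \, \Delta \mu_\text{C})/A $$, where $$ a_i $$ collects every term independent of $$ \Delta \mu_\text{C} $$, $$ \Delta N_{\text{C},i} $$ is the change in carbon number relative to the unreconstructed structure, and $$ A $$ is the surface area. All numerical values below are invented for illustration and are not taken from Figure 6C.

```python
AREA = 100.0  # surface area in A^2 (hypothetical)

# Hypothetical candidates: carbon-count change -> mu-independent term a_i in eV.
CANDIDATES = {-3: 22.55, -1: 7.85, +1: -6.50}

def relative_surface_energy(dn_c, a_i, dmu_c, area=AREA):
    """Assumed linear form: (a_i - dn_c * dmu_c) / area, in eV/A^2."""
    return (a_i - dn_c * dmu_c) / area

def most_stable(dmu_c, candidates=CANDIDATES):
    """Return the carbon-count change with the lowest relative surface energy."""
    return min(
        candidates,
        key=lambda dn_c: relative_surface_energy(dn_c, candidates[dn_c], dmu_c),
    )

DMU_LOW, DMU_HIGH = -7.45, -6.60  # limits of the carbon chemical potential (eV)

stable_low = most_stable(DMU_LOW)
stable_high = most_stable(DMU_HIGH)

# Relative surface energies at the upper limit, ordered by increasing carbon number.
gammas_high = [
    relative_surface_energy(dn_c, CANDIDATES[dn_c], DMU_HIGH)
    for dn_c in sorted(CANDIDATES)
]
print(stable_low, stable_high, gammas_high)
```

Because each candidate is a straight line in $$ \Delta \mu_\text{C} $$ with slope $$ -\Delta N_{\text{C},i}/A $$, carbon-poor structures are favored at low $$ \Delta \mu_\text{C} $$ and carbon-rich structures at high $$ \Delta \mu_\text{C} $$, which is the qualitative trend discussed above.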

CONCLUSIONS

To summarize, we have constructed the FT$$ ^2 $$DP model by fine-tuning the DPA-2 LAM on a dataset focused on the iron-based FTS process and demonstrated its performance in the investigation of key reaction pathways in the FTS process and the global optimization of several iron carbide surfaces with edge sites. The model validation and the atomistic simulation tasks, which serve as a comparative analysis between the results obtained through MLP and DFT calculations, demonstrate that our FT$$ ^2 $$DP fine-tuning protocol exhibits significant promise for constructing universal MLPs with notable performance in studying reaction pathways, surface reconstruction and other atomistic processes of complex heterogeneous catalytic systems such as iron-based FTS. They also provide a valuable practice for the application of LAMs in the theoretical simulation of heterogeneous catalysis. It should be emphasized that all our work, including DFT calculations, MLP training, and the construction of atomistic simulation workflows, has been conducted on open-source platforms, both to draw on the collective intelligence of developers from varying domains and to make our own contributions in return.

We close the paper with a few general remarks. While the FT$$ ^2 $$DP model constructed via our fine-tuning protocol has achieved promising results, there remains significant room for improvement. Most importantly, since the model is pre-trained and fine-tuned mainly on DFT data at the GGA level, it is expected to be inadequate for systems with strong electronic correlation, such as the surfaces of Fe oxides that also play important roles in the FTS process. To extend the FT$$ ^2 $$DP model to more diverse chemical scenarios that are relevant to FTS, it is necessary to include more accurate DFT training data that cover structures falling outside the current training set. First-principles calculation of strongly correlated materials with sufficient accuracy and efficiency is challenging in itself, and building unified MLP models that can describe weakly and strongly correlated systems with comparable accuracy by using mixed training data obtained from different theoretical methods is under active exploration. Considering the highly demanding computational cost of generating new training data, especially when using advanced electronic structure methods beyond GGA that are necessary for strongly correlated systems, it is crucial to leverage active learning strategies^[70] to identify, iteratively and automatically, the unlabeled structures that can improve the current model most efficiently. Platforms such as DP-GEN^[37] are available to facilitate this process; however, further optimizing the active learning protocol for fine-tuned LAMs remains an open challenge. For example, conventional data selection criteria in active learning often rely on ensemble-based (also known as query-by-committee^[70]) uncertainty quantification, which may under-perform for LAMs owing to the additional computational cost of training several large models and, more importantly, can suffer from overconfidence problems^[77]. Furthermore, as the upstream DPA-2 LAM and its associated DeePMD-kit platform are in continuous development, our fine-tuning protocol should keep evolving to harness emerging capabilities. Finally, applying the FT$$ ^2 $$DP model to more simulation tasks would broaden its utility and deepen insights into the challenges in the domain of iron-based FTS. In the future, we expect to further improve the FT$$ ^2 $$DP model in two ways: by using multi-task fine-tuning tactics to resist knowledge loss in the unified descriptors, and by refining the dataset through active-learning strategies with suitable uncertainty quantification and unique data selection. The latter would enrich the dataset with structures from open-source datasets and specific catalytic simulation tasks, so that it effectively covers a larger configuration space related to the FTS domain, while concurrently dropping redundant and outlying data. These improvements should make the model better suited to more advanced challenges in iron-based FTS, such as the automatic exploration of surface reaction pathways, the morphology of FeC$$ _\text{x} $$ nanoclusters under different chemical environments, or the detailed effects of alkali promoters in the FTS process. Our work may provide a blueprint for utilizing pre-trained LAMs through fine-tuning methodology in atomistic simulation, not only for iron-based FTS but also for other complex heterogeneous catalytic systems.

DECLARATIONS

Acknowledgments

The authors acknowledge Dr. Duo Zhang from AI for Science Institute for helping with model training and dataset visualization and Dr. Yike Huang from AI for Science Institute for sharing template code on utilizing ASE-GA in surface systems. The authors also appreciate the support of the High-performance Computing Platform of Peking University and National Supercomputer Center in Tianjin for the computational resources.

Authors' contributions

Contributed equally to this work: Liu, Z. Q.; Deng, Z.

Dataset preparation, model training and evaluation, data analysis, writing: Liu, Z. Q.; Deng, Z.

Discussion of results, revision: Zhao, H.; Wang, H.; Chen, M.

Project conceptualization, methodology, supervision, revision, funding acquisition: Jiang, H.

Availability of data and materials

The dataset and model used in this study are both available on AIS Square (https://www.aissquare.com/). In detail, the dataset with the structural information, DFT label information, and DFT input setting files can be accessed at https://www.aissquare.com/datasets/detail?pageType=datasets&name=FT2DP-dataset-FeCHO-v1&id=306, and the model with its input scripts is available at https://www.aissquare.com/models/detail?pageType=models&id=307. Additionally, the codes for TS optimization are all available in the ATST-Tools repository (https://github.com/QuantumMisaka/ATST-Tools). All other data and codes supporting the findings presented in this work are available from the corresponding author upon reasonable request.

Financial support and sponsorship

This work is financially supported by the National Key Research and Development Program of China (Project no. 2022YFB4101401) and National Natural Science Foundation of China (Project no. 22273002).

Conflicts of interest

All authors declared that there are no conflicts of interest.

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Copyright

© The Author(s) 2025.

Supplementary Materials

REFERENCES

1. Chen, B. W. J.; Xu, L.; Mavrikakis, M. Computational methods in heterogeneous catalysis. Chem. Rev. 2021, 121, 1007-48.

2. Mahmoudi, H.; Mahmoudi, M.; Doustdar, O.; et al. A review of Fischer Tropsch synthesis process, mechanism, surface chemistry and catalyst formulation. Biofuels. Eng. 2017, 2, 11-31.

3. Ail, S. S.; Dasappa, S. Biomass to liquid transportation fuel via Fischer Tropsch synthesis – technology review and current scenario. Renew. Sust. Energ. Rev. 2016, 58, 267-86.

4. Schulz, H. Comparing Fischer-Tropsch synthesis on iron- and cobalt catalysts: the dynamics of structure and function. Stud. Surf. Sci. Catal. 2007, 163, 177-99.

5. Ma, W.; Jacobs, G.; Sparks, D. E.; Todic, B.; Bukur, D. B.; Davis, B. H. Quantitative comparison of iron and cobalt based catalysts for the Fischer-Tropsch synthesis under clean and poisoning conditions. Catal. Today. 2020, 343, 125-36.

6. Liu, Q. Y.; Shang, C.; Liu, Z. P. In situ active site for Fe-catalyzed Fischer–Tropsch synthesis: recent progress and future challenges. J. Phys. Chem. Lett. 2022, 13, 3342-52.

7. de Smit, E.; Cinquini, F.; Beale, A. M.; et al. Stability and reactivity of $$\epsilon$$-$$\chi$$-$$\theta$$ iron carbide catalyst phases in Fischer-Tropsch synthesis: controlling μ_C. J. Am. Chem. Soc. 2010, 132, 14928-41.

8. Zhang, J.; Abbas, M.; Zhao, W.; Chen, J. Enhanced stability of a fused iron catalyst under realistic Fischer–Tropsch synthesis conditions: insights into the role of iron phases ($$\chi$$-Fe$$_5$$C$$_2$$, $$\theta$$-Fe$$_3$$C and $$\alpha$$-Fe). Catal. Sci. Technol. 2022, 12, 4217-27.

9. Yang, C.; Zhao, H.; Hou, Y.; Ma, D. Fe$$_5$$C$$_2$$ nanoparticles: a facile bromide-induced synthesis and as an active phase for Fischer–Tropsch synthesis. J. Am. Chem. Soc. 2012, 134, 15814-21.

10. Zhao, H.; Liu, J. X.; Yang, C.; et al. Synthesis of iron-carbide nanoparticles: identification of the active phase and mechanism of Fe-based Fischer–Tropsch synthesis. CCS. Chem. 2021, 3, 2712-24.

11. Cano, L. A.; Cagnoli, M. V.; Fellenz, N. A.; et al. Influence of the crystal size of iron active species on the activity and selectivity. Appl. Catal. A. Gen. 2010, 379, 105-10.

12. Torres Galvis, H. M.; Bitter, J. H.; Davidian, T.; Ruitenbeek, M.; Dugulan, A. I.; de Jong, K. P. Iron particle size effects for direct production of lower olefins from synthesis gas. J. Am. Chem. Soc. 2012, 134, 16207-15.

13. Park, J. C.; Yeo, S. C.; Chun, D. H.; et al. Highly activated K-doped iron carbide nanocatalysts designed by computational simulation for Fischer–Tropsch synthesis. J. Mater. Chem. A. 2014, 2, 14371-9.

14. Pham, T. H.; Qi, Y.; Yang, J.; et al. Insights into Hägg iron-carbide-catalyzed Fischer–Tropsch synthesis: suppression of CH$$_4$$ formation and enhancement of C–C coupling on $$\chi$$-Fe$$_5$$C$$_2$$. ACS. Catal. 2015, 5, 2203-8.

15. Song, N.; Cao, J.; Chen, B.; Qian, G.; Duan, X.; Zhou, X. CO adsorption and activation of $$\eta$$-Fe$$_2$$C Fischer–Tropsch catalyst. Ind. Eng. Chem. Res. 2019, 58, 21296-303.

16. Chen, B.; Wang, D.; Duan, X.; et al. Charge-tuned CO activation over a $$\chi$$-Fe$$_5$$C$$_2$$ Fischer–Tropsch catalyst. ACS. Catal. 2018, 8, 2709-14.

17. Li, T.; Wen, X.; Yang, Y.; Li, Y. W.; Jiao, H. Mechanistic aspects of CO activation and C–C bond formation on the Fe/C- and Fe-terminated Fe$$_3$$C(010) surfaces. ACS. Catal. 2020, 10, 877-90.

18. Yin, J.; Liu, X.; Liu, X. W.; et al. Theoretical exploration of intrinsic facet-dependent CH$$_4$$ and C$$_2$$ formation on Fe$$_5$$C$$_2$$ particle. Appl. Catal. B. Environ. 2020, 278, 119308.

19. Zhang, X.; Wang, L.; Helwig, J.; et al. Artificial intelligence for science in quantum, atomistic, and continuum systems. arXiv 2023, arXiv:2307.08423. Available online: https://arxiv.org/abs/2307.08423. (accessed on 17 Mar 2025).

20. Ma, S.; Liu, Z. P. Machine learning for atomic simulation and activity prediction in heterogeneous catalysis: current status and future. ACS. Catal. 2020, 10, 13213-26.

21. Huang, S. D.; Shang, C.; Kang, P. L.; Zhang, X. J.; Liu, Z. P. LASP: fast global potential energy surface exploration. WIREs. Comput. Mol. Sci. 2019, 9, e1415.

22. Huang, S. D.; Shang, C.; Kang, P. L.; Liu, Z. P. Atomic structure of boron resolved using machine learning and global sampling. Chem. Sci. 2018, 9, 8644-55.

23. Chen, D.; Chen, L.; Zhao, Q. C.; Yang, Z. X.; Shang, C.; Liu, Z. P. Square-pyramidal subsurface oxygen [Ag$$_4$$OAg] drives selective ethene epoxidation on silver. Nat. Catal. 2024, 7, 536-45.

24. Liu, Q. Y.; Shang, C.; Liu, Z. P. In situ active site for CO activation in Fe-catalyzed Fischer-Tropsch synthesis from machine learning. J. Am. Chem. Soc. 2021, 143, 11109-20.

25. Liu, Q. Y.; Chen, D.; Shang, C.; Liu, Z. P. An optimal Fe–C coordination ensemble for hydrocarbon chain growth: a full Fischer–Tropsch synthesis mechanism from machine learning. Chem. Sci. 2023, 14, 9461-75.

26. van Steen, E.; Schulz, H. Polymerisation kinetics of the Fischer-Tropsch CO hydrogenation using iron and cobalt based catalysts. Appl. Catal. A. Gen. 1999, 186, 309-20.

27. Han, J.; Zhang, L.; Car, R.; Weinan, E. Deep potential: a general representation of a many-body potential energy surface. Commun. Comput. Phys. 2018, 23, 629-39.

28. Wang, H.; Zhang, L.; Han, J.; Weinan, E. DeePMD-kit: a deep learning package for many-body potential energy representation and molecular dynamics. Comput. Phys. Commun. 2018, 228, 178-84.

29. Zeng, J.; Zhang, D.; Lu, D.; et al. DeePMD-kit v2: a software package for deep potential models. J. Chem. Phys. 2023, 159, 054801.

30. DeePMD-kit. https://github.com/deepmodeling/deepmd-kit. (accessed on 2025-03-17).

31. Zhang, L.; Han, J.; Wang, H.; Saidi, W. A.; Car, R.; Weinan, E. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. arXiv 2018, arXiv:1805.09003. Available online: https://arxiv.org/abs/1805.09003. (accessed on 17 Mar 2025).

32. Hedman, D.; McLean, B.; Bichara, C.; Maruyama, S.; Larsson, J. A.; Ding, F. Dynamics of growing carbon nanotube interfaces probed by machine learning-enabled molecular simulations. Nat. Commun. 2024, 15, 4076.

33. Liu, J. C.; Luo, L.; Xiao, H.; Zhu, J.; He, Y.; Li, J. Metal affinity of support dictates sintering of gold catalysts. J. Am. Chem. Soc. 2022, 144, 20601-9.

34. Zhou, L.; Fu, X. P.; Wang, R.; et al. Dynamic phase transitions dictate the size effect and activity of supported gold catalysts. Sci. Adv. 2024, 10, eadr4145.

35. Hou, P.; Yu, Q.; Luo, F.; Liu, J. C. Reactant-induced dynamic active sites on Cu catalysts during the water–gas shift reaction. ACS. Catal. 2025, 15, 352-60.

36. Wu, J.; Chen, D.; Chen, J.; Wang, H. Structural and composition evolution of palladium catalyst for CO oxidation under steady-state reaction conditions. J. Phys. Chem. C. 2023, 127, 6262-70.

37. Zhang, Y.; Wang, H.; Chen, W.; et al. DP-GEN: a concurrent learning platform for the generation of reliable deep learning based potential energy models. Comput. Phys. Commun. 2020, 253, 107206.

38. Cheng, X.; Wu, C.; Xu, J.; Han, Y.; Xie, W.; Hu, P. Leveraging machine learning potentials for in-situ searching of active sites in heterogeneous catalysis. Precis. Chem. 2024, 2, 570-86.

39. Chanussot, L.; Das, A.; Goyal, S.; et al. Open Catalyst 2020 (OC20) dataset and community challenges. ACS. Catal. 2021, 11, 6059-72.

40. Tran, R.; Lan, J.; Shuaibi, M.; et al. The Open Catalyst 2022 (OC22) dataset and challenges for oxide electrocatalysts. ACS. Catal. 2023, 13, 3066-84.

41. Deng, B.; Zhong, P.; Jun, K.; et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 2023, 5, 1031-41.

42. Zhang, D.; Liu, X.; Zhang, X.; et al. DPA-2: a large atomic model as a multi-task learner. npj. Comput. Mater. 2024, 10, 293.

43. DeepModeling Community. https://deepmodeling.com. (accessed on 2025-03-17).

44. Chen, M.; Guo, G. C.; He, L. Systematically improvable optimized atomic basis sets for ab initio calculations. J. Phys. Condens. Mat. 2010, 22, 445501.

45. Li, P.; Liu, X.; Chen, M.; et al. Large-scale ab initio simulations based on systematically improvable atomic basis. Comput. Mater. Sci. 2016, 112, 503-17.

46. Schlipf, M.; Gygi, F. Optimization algorithm for the generation of ONCV pseudopotentials. Comput. Phys. Commun. 2015, 196, 36-44.

47. SG15 ONCV Potentials. http://quantum-simulation.org/potentials/sg15_oncv. (accessed on 2025-03-17).

48. Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 1996, 77, 3865.

49. Lin, P.; Ren, X.; He, L. Strategy for constructing compact numerical atomic orbital basis sets by incorporating the gradients of reference wavefunctions. Phys. Rev. B. 2021, 103, 235131.

50. Monkhorst, H. J.; Pack, J. D. Special points for Brillouin-zone integrations. Phys. Rev. B. 1976, 13, 5188.

51. Methfessel, M.; Paxton, A. T. High-precision sampling for Brillouin-zone integration in metals. Phys. Rev. B. 1989, 40, 3616.

52. Zhang, D.; Bi, H.; Dai, F. Z.; et al. Pretraining of attention-based deep learning potential model for molecular simulation. npj. Comput. Mater. 2024, 10, 94.

53. Vaswani, A.; Shazeer, N.; Parmar, N.; et al. Attention is all you need. arXiv 2017, arXiv:1706.03762. Available online: https://arxiv.org/abs/1706.03762. (accessed on 17 Mar 2025).

54. Henkelman, G.; Uberuaga, B. P.; Jónsson, H. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J. Chem. Phys. 2000, 113, 9901-4.

55. Henkelman, G.; Jónsson, H. Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points. J. Chem. Phys. 2000, 113, 9978-85.

56. Henkelman, G.; Jónsson, H. A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives. J. Chem. Phys. 1999, 111, 7010-22.

57. Kästner, J.; Sherwood, P. Superlinearly converging dimer method for transition state search. J. Chem. Phys. 2008, 128, 014106.

58. Larsen, A. H.; Mortensen, J. J.; Blomqvist, J.; et al. The atomic simulation environment - a Python library for working with atoms. J. Phys. Condens. Mat. 2017, 29, 273002.

59. Hermes, E. D.; Sargsyan, K.; Najm, H. N.; Zádor, J. Accelerated saddle point refinement through full exploitation of partial hessian diagonalization. J. Chem. Theory. Comput. 2019, 15, 6536-49.

60. Hermes, E. D.; Sargsyan, K.; Najm, H. N.; Zádor, J. Sella, an open-source automation-friendly molecular saddle point optimizer. J. Chem. Theory. Comput. 2022, 18, 6974-88.

61. ATST-Tools. Available online: https://github.com/QuantumMisaka/ATST-Tools. (accessed on 17 Mar 2025).

62. Vilhelmsen, L. B.; Hammer, B. A genetic algorithm for first principles global structure optimization of supported nano structures. J. Chem. Phys. 2014, 141, 044711.

63. Zakaryan, H. A.; Kvashnin, A. G.; Oganov, A. R. Stable reconstruction of the (110) surface and its role in pseudocapacitance of rutile-like RuO$$_2$$. Sci. Rep. 2017, 7, 10357.

64. Hinuma, Y.; Kamachi, T.; Hamamoto, N.; Takao, M.; Toyao, T.; Shimizu, K. Surface oxygen vacancy formation energy calculations in 34 orientations of $$\beta$$-Ga$$_2$$O$$_3$$ and $$\theta$$-Al$$_2$$O$$_3$$. J. Phys. Chem. C. 2020, 124, 10509-22.

65. Momma, K.; Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Cryst. 2011, 44, 1272-6.

66. Reuter, K.; Scheffler, M. Composition, structure, and stability of RuO$$_2$$(110) as a function of oxygen pressure. Phys. Rev. B. 2001, 65, 035406.

67. Reuter, K.; Scheffler, M. Composition and structure of the RuO$$_2$$(110) surface in an O$$_2$$ and CO environment: implications for the catalytic formation of CO$$_2$$. Phys. Rev. B. 2003, 68, 045407.

68. AIS-Square. DPA-2.2.0 pre-trained model. Available online: https://www.aissquare.com/models/detail?pageType=models&name=DPA-2.2.0-v3.0.0b3&id=272. (accessed on 17 Mar 2025).

69. AIS-Square. OpenLAM Project. Available online: https://www.aissquare.com/openlam. (accessed on 17 Mar 2025).

70. Kulichenko, M.; Nebgen, B.; Lubbers, N.; et al. Data generation for machine learning interatomic potentials and beyond. Chem. Rev. 2024, 124, 13681-714.

71. Broos, R. J. P.; Zijlstra, B.; Filot, I. A. W.; Hensen, E. J. M. Quantum-chemical DFT study of direct and H- and C-assisted CO dissociation on the $$\chi$$-Fe$$_5$$C$$_2$$ Hägg carbide. J. Phys. Chem. C. 2018, 122, 9929-38.

72. Zhao, Z. J.; Li, Z.; Cui, Y.; et al. Importance of metal-oxide interfaces in heterogeneous catalysis: a combined DFT, microkinetic, and experimental study of water-gas shift on Au/MgO. J. Catal. 2017, 345, 157-69.

73. Chen, C.; Zuo, Y.; Ye, W.; Li, X.; Ong, S. P. Learning properties of ordered and disordered materials from multi-fidelity data. Nat. Comp. Sci. 2021, 1, 46-53.

74. Xie, J.; Yang, J.; Dugulan, A. I.; et al. Size and promoter effects in supported iron Fischer-Tropsch catalysts: insights from experiment and theory. ACS. Catal. 2016, 6, 3147-57.

75. Zhao, S.; Liu, X. W.; Huo, C. F.; Li, Y. W.; Wang, J.; Jiao, H. Surface morphology of Hägg iron carbide ($$\chi$$-Fe$$_5$$C$$_2$$) from ab initio atomistic thermodynamics. J. Catal. 2012, 294, 47-53.

76. Morgenstern, K.; Rosenfeld, G.; Comsa, G. Decay of two-dimensional Ag islands on Ag(111). Phys. Rev. Lett. 1996, 76, 2113.

77. Kahle, L.; Zipoli, F. Quality of uncertainty estimates from neural network potential ensembles. Phys. Rev. E. 2022, 105, 015311.

How to Cite

Liu, Z. Q.; Deng, Z.; Zhao, H.; Wang, H.; Chen, M.; Jiang, H. FT²DP: large atomic model fine-tuned machine learning potential for accelerating atomistic simulation of iron-based Fischer-Tropsch synthesis. J. Mater. Inf. 2025, 5, 27. http://dx.doi.org/10.20517/jmi.2024.105

About This Article

Special Issue

This article belongs to the Special Issue “Unlocking the AI Future of Materials Science”: Selected Papers from the International Workshop on Data-driven Computational and Theoretical Materials Design (DCTMD)

Copyright

© The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
