Code Components
===============

DoMD-ChemFAST is a modular platform for constructing molecular dynamics simulations. This manual describes its code architecture, the function of each core module, and how its data files are managed.

Directory Structure
-------------------

The directory structure of DoMD follows the principle of **"Logic-Data-Interface"** separation. The core file tree of the project is shown below:

.. parsed-literal::

    DoMD/
    ├── domd_cgbuilder/
    │   ├── HSP_predictor/
    │   │   ├── models/
    │   │   │   ├── __init__.py
    │   │   │   ├── dDPredictor.pt
    │   │   │   ├── dHPredictor.pt
    │   │   │   └── dPPredictor.pt
    │   │   ├── __init__.py
    │   │   └── hsp_models.py
    │   ├── __init__.py
    │   ├── _conf_gen.py
    │   ├── cg_ff.py
    │   ├── cg_ff_old.py
    │   ├── cg_mol.py
    ├── domd_database/
    │   ├── forcefield/
    │   │   ├── gaff/
    │   │   │   ├── data/
    │   │   │   │   ├── amber_10.pkl
    │   │   │   │   └── gaff_10.db
    │   │   │   └── add_data_to_db.py
    │   │   └── oplsaa/
    │   │       ├── data/
    │   │       │   ├── boss_bonded.sb
    │   │       │   ├── ffbonded.itp
    │   │       │   ├── ffnonbonded.itp
    │   │       │   └── STaGE_opls_tomoltemplate_opls.txt
    │   │       ├── add_data_to_db.py
    │   │       └── make_db.py
    │   └── __init__.py
    ├── domd_forcefield/
    │   ├── gaff/
    │   │   ├── resources/
    │   │   │   └── gaff_10.db
    │   │   ├── __init__.py
    │   │   ├── database.py
    │   │   ├── gaff.py
    │   │   ├── gaff_db.py
    │   │   ├── gaff_types.py
    │   │   └── ml.py
    │   ├── oplsaa/
    │   │   ├── ml_functions/
    │   │   │   ├── resources/
    │   │   │   │   ├── angle_idx.pkl
    │   │   │   │   ├── bond_idx.pkl
    │   │   │   │   ├── di_idx.pkl
    │   │   │   │   ├── idx_angle.pkl
    │   │   │   │   ├── idx_bond.pkl
    │   │   │   │   ├── idx_di.pkl
    │   │   │   │   ├── idx_imps.pkl
    │   │   │   │   ├── idx_nonbond.pkl
    │   │   │   │   ├── imps_idx.pkl
    │   │   │   │   ├── minAngle.pt
    │   │   │   │   ├── minBond.pt
    │   │   │   │   ├── minCharge.pt
    │   │   │   │   ├── minDi.pt
    │   │   │   │   ├── minDi_add.pt
    │   │   │   │   ├── minImp.pt
    │   │   │   │   ├── minNonbond.pt
    │   │   │   │   └── nbtype_an_hash.pkl
    │   │   │   ├── __init__.py
    │   │   │   └── models.py
    │   │   ├── resources/
    │   │   │   ├── opls.db  # Not included in the repository due to its size; see the Data Management section for setup instructions.
    │   │   │   └── readme.md
    │   │   ├── __init__.py
    │   │   ├── database.py
    │   │   ├── ml.py
    │   │   ├── opls.py
    │   │   ├── opls_db.py
    │   │   └── opls_types.py
    │   ├── __init__.py
    │   ├── charge_model.py
    │   ├── forcefield.py
    │   └── functions.py
    ├── domd_tools/
    │   ├── __init__.py
    │   ├── aa_builder.py
    │   ├── coarse_grain.py
    │   ├── force_field.py
    │   ├── gmx_output.py
    │   └── manage_db.py
    ├── domd_topology/
    │   ├── __init__.py
    │   ├── _mapping.py
    │   ├── functions.py
    │   ├── reactor.py
    │   └── reactor_old.py
    ├── domd_xyz/
    │   ├── embed/
    │   │   ├── __init__.py
    │   │   ├── embed_with_cg_xyz.py
    │   │   └── optimize_orientation.py
    │   ├── __init__.py
    │   └── embed_molecule.py
    ├── misc/
    │   ├── io/
    │   │   ├── __init__.py
    │   │   ├── assemble.py
    │   │   ├── gmx_reader.py
    │   │   ├── gmx_writer.py
    │   │   ├── xml_reader.py
    │   │   └── xml_writer.py
    │   ├── __init__.py
    │   ├── aa_molecule.py
    │   ├── cg_system.py
    │   ├── draw.py
    │   └── logger.py
    ├── polyimides_dataset/
    │   ├── dbapp.zip
    │   └── readme.md
    ├── .gitignore
    ├── .readthedocs.yaml
    ├── ChemFAST-logo.png
    ├── DoMD-logo.png
    ├── DoMDlogo-square.png
    ├── environment.yml
    ├── environment_gpu.yml
    ├── LICENSE
    ├── MAINFEST.in
    ├── README.md
    └── setup.py

Core Modules Explained
----------------------

DoMD consists of several core sub-packages that work together to turn SMILES strings into GROMACS input files.

1. domd_topology (Topology Engine)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the logical core of DoMD, responsible for handling chemical connectivity.

* **Function**: Parses SMILES/SMARTS strings and executes the **S-CGFG** (Stochastic Coarse-Grained Fine-Graining) algorithm.
* **Key Class**: ``Reactor``. It reads reaction templates and connects separate monomers or coarse-grained beads into a complete all-atom (AA) topology graph (a NetworkX graph).

2. domd_forcefield (Parameterization)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the physical core of DoMD, responsible for assigning physical parameters to the topology graph.


* **Function**: Assigns atomic charges, Lennard-Jones parameters, and bonded parameters (bond/angle/dihedral).
* **Strategy**: Uses a **hybrid strategy**. It first queries verified experimental parameters (BOSS/LigParGen) from ``domd_database``. For fragments not found there, it falls back to the built-in **GAT (Graph Attention Network)** models to predict the parameters.

3. domd_xyz (Geometry Embedding)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Responsible for generating 3D coordinates from the topology graph.

* **Algorithm**: Uses **fragment embedding**. It generates local coordinates for rigid fragments, maps them back to the positions of the coarse-grained (CG) beads, and optimizes their rotation to eliminate steric clashes.
* **Feature**: Supports the ``large=N`` parameter, which uses spatial grid decomposition to accelerate the construction of macromolecules.

4. domd_cgbuilder (Coarse-Graining)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Responsible for bottom-up coarse-grained modeling.

* **Function**: Predicts Hansen Solubility Parameters (HSP) from the chemical structure to derive the interaction potentials (:math:`\epsilon`) between CG beads. It also handles the definition of rigid bodies.

5. domd_tools (User Interface)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This layer is intended for direct user interaction. When writing scripts, import functions from this module rather than calling the low-level modules.

* ``create_cg_system``: Generates coarse-grained simulation files in a single call.
* ``build_aa_topology``: Performs backmapping.
* ``assign_ff_parameters``: Assigns force field parameters automatically.
* ``get_gmx``: Exports the final simulation files.

Data Management
---------------

DoMD relies on large force field database files, which are typically not included in the Git repository due to their size.

**OPLS Database Setup:**

Place the ``opls.db`` file at the path below, which matches the location shown in the directory tree. Without it, the force field engine will not function:

.. code-block:: bash

    # Expected location of the OPLS database
    domd_forcefield/oplsaa/resources/opls.db

This database contains pre-calculated parameters for millions of molecules and serves as the foundation of the hybrid strategy.

Workflow Integration
--------------------

The modular design of DoMD allows for flexible pipeline integration. A typical data flow is as follows:

1. **Input**: SMILES + reaction template
2. **CGBuilder**: ``domd_cgbuilder`` :math:`\rightarrow` ``.xml`` (for HOOMD/GALAMOST)
3. **Simulation**: (external MD engine) :math:`\rightarrow` relaxed configuration
4. **Backmapping**: ``domd_topology`` + ``domd_xyz`` :math:`\rightarrow` AA graph + coordinates
5. **Typing**: ``domd_forcefield`` :math:`\rightarrow` parameterized system
6. **Output**: ``misc.io`` :math:`\rightarrow` ``.gro`` / ``.top``
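
Because the typing step depends on ``opls.db`` being present, it can be useful to check for the file before starting a pipeline run. The following is a minimal sketch and not part of DoMD: ``missing_resources`` is a hypothetical helper, and the default relative path is an assumption taken from the directory tree in this manual. Adjust it to match your installation.

.. code-block:: python

    from pathlib import Path

    # Assumption: this relative location follows the file tree shown in this
    # manual. It is not read from DoMD itself.
    REQUIRED_RESOURCES = ("domd_forcefield/oplsaa/resources/opls.db",)


    def missing_resources(package_root, required=REQUIRED_RESOURCES):
        """Return the required files that are absent or empty under package_root."""
        root = Path(package_root)
        missing = []
        for rel in required:
            path = root / rel
            # Treat a zero-byte file as missing: an interrupted download
            # can leave an empty placeholder behind.
            if not path.is_file() or path.stat().st_size == 0:
                missing.append(rel)
        return missing

Calling ``missing_resources("/path/to/DoMD")`` returns an empty list when every listed file is in place, so a script can stop early with a clear message instead of failing inside the force field engine.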