Input Configuration and File Format
===================================

ChemFAST does not use static configuration files (such as ``.mdp`` or ``.in``). Instead, the system is defined with **Python data structures** (dictionaries and lists), so users can apply ordinary Python logic (loops, conditionals) to construct complex topologies programmatically.

This section details the format of the two core configuration blocks: **Chemical Definitions** and **CG System Configuration**.

1. Chemical Language: SMILES and SMARTS
---------------------------------------

ChemFAST functions as a chemical compiler. It relies on standard chemical line notations to define identity and logic.

SMILES (Simplified Molecular Input Line Entry System)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Role**: Defines the **Identity** (atomic graph) of a single coarse-grained bead.
* **Format**: Standard OpenSMILES strings.
* **Usage**:

  * ``CCO`` (Ethanol): A bead representing a solvent molecule.
  * ``C=Cc1ccccc1`` (Styrene): A bead representing a monomer.
  * *Note*: The complexity of the SMILES determines the resolution. ``C`` is a united-atom methane; ``CC(C(F)...)`` is a complex functional group.

SMARTS (SMILES Arbitrary Target Specification)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Role**: Defines the **Logic** (reaction/connectivity) between beads.
* **Format**: Reaction SMARTS with **atom map numbers** (e.g., ``:1``, ``:2``).
* **Crucial Rule**: **Regiospecificity**.

  * The mapping indices determine which atoms the new bond joins.
  * ``[C:1]=[C:2].[C:3]=[C:4]>>[C:1][C:2][C:3]=[C:4]`` implies a specific connection between ``[C:2]`` and ``[C:3]``.
  * Always ensure the atom map indices on the reactant side match the intended connection points on the product side.

2. Resolution-Agnostic Backmapping
----------------------------------

ChemFAST is **Resolution-Agnostic**: the software does not enforce a specific grain size. A "CG Bead" is simply a container.
Its contents are defined by the SMILES string linked to it in the ``mols`` dictionary.

* **Flexibility**:

  * *Fine Resolution*: 1 Bead = ``C`` (United-atom polyethylene).
  * *Standard Resolution*: 1 Bead = ``c1ccccc1`` (Benzene ring).
  * *Coarse Resolution*: 1 Bead = ``NCCCCC(N)C(=O)O`` (Entire lysine residue).

* **Backmapping**: The backmapping engine "unpacks" the SMILES into a 3D structure, places it at the bead's center of mass, and rotates it to match the connectivity defined by the SMARTS.

3. Chemical Definition Format
-----------------------------

The chemical rules are defined in two dictionaries: ``mols`` (Ingredients) and ``reaction_template`` (Recipes).

The ``mols`` Dictionary
~~~~~~~~~~~~~~~~~~~~~~~

Defines the mapping between a CG bead name (key) and its chemical properties (value). Here, we take a single-ion polymer electrolyte system as an example.

.. code-block:: python

   mols = {
       'F': {
           'smiles': 'CC(C(F)(F)(F))C(=O)[N-]S(=O)(=O)C(F)(F)(F)',  # Atomic structure
           'file': None,       # Path to PDB file (if is_rigid=True)
           'is_rigid': False   # Treat as rigid body?
       },
       'P': {
           'smiles': 'CCO',
           'file': None,
           'is_rigid': False
       },
       'L': {
           'smiles': '[Li+]',
           'file': None
       }
   }

* **Key** (str): The arbitrary name of the bead (e.g., 'F', 'P', 'L').
* **smiles** (str): The atomic structure.
* **file** (str/None): Used for Bio-Hybrid systems. If provided, ChemFAST extracts coordinates from this PDB file instead of generating them from the SMILES.
* **is_rigid** (bool): If ``True``, the bead behaves as a rigid body during pre-equilibration, and backmapping restores the PDB structure via rotation matrices.

The ``reaction_template`` Dictionary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Defines how beads connect. This is critical for topology reconstruction.

.. code-block:: python

   reaction_template = {
       'P-F': {
           'cg_reactant_list': [('P', 'F')],
           'smarts': '[CH3,CH2:1].[CH2,CH3:2]>>[C:1][C:2]',
           'prod_idx': [0]
       },
       'O-O': {
           'cg_reactant_list': [('O', 'O')],
           'smarts': '[CH3:1].[O:2]>>[C:1][O:2]',
           'prod_idx': [0]
       },
       'S': {  # Example of a self-mapping identity reaction
           'cg_reactant_list': [('L',)],
           'smarts': '*>>*',
           'prod_idx': [0]
       }
   }

* **Key** (str): Unique name for the reaction type (e.g., 'P-F').
* **cg_reactant_list** (list of tuples): The bead types that undergo this reaction. For example, ``[('P', 'F')]`` means bead 'P' reacts with bead 'F'.
* **smarts** (str): The reaction string defining atomic connectivity.
* **prod_idx** (list of int): **Product Selection**.

  * In simulation, we often model the *result* of a reaction.
  * A SMARTS reaction might generate byproducts (e.g., water).
  * ``[0]`` tells ChemFAST: "The first product in the SMARTS string is the actual connected structure we want to keep in the topology. Discard any others."

4. CG System Configuration Format
---------------------------------

The initial state of the simulation box is defined by a list of metadata dictionaries (``metas``). Each dictionary represents a distinct molecule type (polymer chain, ion, solvent) and its abundance.

.. code-block:: python

   # Example 1: A Polymer Chain (Sequence P-O-O...F)
   # Topology: Linear connections defined in 'bond' string
   meta1 = {
       # Bead sequence of one chain (the listed unit repeated 20 times)
       'type': ['P','O','O','O','O','O','O','O','O','O','O','O','O','O','F'] * 20,
       'bond': '0-1,1-2,2-3,3-4,4-5...',  # Explicit connectivity indices
       'N': 500,               # Number of chains
       'is_rigid': False,      # Chain is flexible
       'is_angle': True,       # Generate angle potentials
       'is_dihedral': False    # Do not generate dihedral potentials
   }

   # Example 2: Free Ions (Li+)
   # Topology: No bonds
   meta2 = {
       'type': ['L'],
       'N': 10000,  # Number of ions
   }

   # Combine into the system list
   metas = [meta1, meta2]

Key Definitions
~~~~~~~~~~~~~~~

* **type** (list of str): The sequence of beads in **one single molecule**.

  * *Example*: ``['P', 'O', 'F']`` defines a trimer.
  * *Reference*: Every string must match a key in ``mols``; the ``'O'`` bead used in the example above therefore also needs a ``mols`` entry.

* **bond** (str): A comma-separated string defining the connectivity **within the chain**.

  * **Format**: ``"id1-id2,id3-id4,..."``
  * **Indexing**: 0-based relative to the ``type`` list.
  * *Example*: For ``['A', 'B', 'C']`` connected linearly, ``bond`` = ``"0-1,1-2"``.
  * *Complex Topologies*: For branched polymers or networks, you can define arbitrary connections (e.g., ``"0-1,0-5"`` creates a branch point at bead 0).

* **N** (int): The number of copies of this molecule to place in the simulation box.
* **is_rigid** (bool): **Chain-level Rigidity**.

  * If ``True``, the entire molecule defined in ``type`` is treated as a single rigid body in the CG simulation (e.g., a rigid nanoparticle).
  * *Contrast*: ``is_rigid`` in a ``mols`` entry makes a *single bead* rigid, whereas ``is_rigid`` in a ``metas`` entry makes the *whole chain* rigid.

* **is_angle** (bool): If ``True``, ChemFAST automatically generates angle terms for this chain in the output XML file, based on the connectivity.
* **is_dihedral** (bool): If ``True``, ChemFAST automatically generates dihedral terms for this chain in the output XML file, based on the connectivity.
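
Because the configuration is plain Python, the ``bond`` string of a long linear chain can be generated rather than typed by hand, and a ``metas`` entry can be checked against ``mols`` before it is used. The following is a minimal sketch of that idea. The helpers ``linear_bonds`` and ``check_meta`` are hypothetical names introduced for illustration and are not part of the ChemFAST API; the bead names and SMILES strings are placeholders.

```python
# Hypothetical helpers for illustration only; they are NOT part of ChemFAST.

def linear_bonds(n_beads):
    """Return the 'bond' string of a linear chain, e.g. '0-1,1-2' for 3 beads."""
    return ",".join(f"{i}-{i + 1}" for i in range(n_beads - 1))


def check_meta(meta, mols):
    """Return a list of problems found in one meta dictionary (empty if none)."""
    problems = []
    n_beads = len(meta["type"])
    # Every bead name must be a key of the mols dictionary.
    for name in sorted(set(meta["type"]) - set(mols)):
        problems.append(f"bead type {name!r} is not defined in mols")
    # Every bond index must be a valid 0-based position in the 'type' list.
    bond = meta.get("bond", "")
    for pair in filter(None, bond.split(",")):
        i, j = (int(x) for x in pair.split("-"))
        if not (0 <= i < n_beads and 0 <= j < n_beads):
            problems.append(f"bond {pair!r} is out of range for {n_beads} beads")
    return problems


# Placeholder bead definitions, chosen only to make the sketch runnable.
mols = {
    "A": {"smiles": "CC", "file": None, "is_rigid": False},
    "B": {"smiles": "CCO", "file": None, "is_rigid": False},
}

beads = ["A"] + ["B"] * 3 + ["A"]
meta = {
    "type": beads,
    "bond": linear_bonds(len(beads)),
    "N": 10,
    "is_rigid": False,
    "is_angle": True,
    "is_dihedral": False,
}

print(meta["bond"])            # 0-1,1-2,2-3,3-4
print(check_meta(meta, mols))  # []
```

Running a check of this kind before passing ``metas`` on catches the two mistakes that are easiest to make when editing by hand: a bead name missing from ``mols`` and a bond index that falls outside the ``type`` list.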