Input Configuration and File Format

ChemFAST does not read static configuration files (such as .mdp or .in). Instead, the system is defined with Python data structures (dictionaries and lists), so users can apply Python logic (loops, conditionals) to construct complex topologies programmatically.

This section details the format of the two core configuration blocks: Chemical Definitions and CG System Configuration.

1. Chemical Language: SMILES and SMARTS

ChemFAST functions as a chemical compiler. It relies on standard chemical line notations to define identity and logic.

SMILES (Simplified Molecular Input Line Entry System)

  • Role: Defines the Identity (Atomic Graph) of a single Coarse-Grained bead.

  • Format: Standard OpenSMILES strings.

  • Usage:
    • CCO (Ethanol): A bead representing a solvent molecule.

    • C=Cc1ccccc1 (Styrene): A bead representing a monomer.

    • Note: The complexity of the SMILES determines the resolution. C is a united-atom methane; CC(C(F)...) is a complex functional group.

SMARTS (SMILES Arbitrary Target Specification)

  • Role: Defines the Logic (Reaction/Connectivity) between beads.

  • Format: Reaction SMARTS with Atom Mapping Numbers (e.g., :1, :2).

  • Crucial Rule: Regiospecificity.
    • The mapping indices dictate which atoms form the new bond, and hence its direction.

    • [C:1]=[C:2].[C:3]=[C:4]>>[C:1][C:2][C:3]=[C:4] specifies that the new bond forms between [C:2] and [C:3].

    • Always ensure that the atom map indices on the reactant side match the intended connection points on the product side.
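
The atom-map rule above can be checked mechanically before a template is used. The following is a minimal sketch using only the Python standard library; the helper names (atom_maps, check_atom_maps) are hypothetical and not part of ChemFAST, and the regular expression only recognises simple mapped atoms such as [C:1] or [CH3,CH2:2]. A cheminformatics toolkit would give a more thorough validation.

```python
import re

# Matches the map number of a mapped atom such as [C:1] or [CH3,CH2:12].
_MAP_RE = re.compile(r":(\d+)\]")


def atom_maps(side):
    """Return the atom map numbers found on one side of a reaction SMARTS."""
    return [int(n) for n in _MAP_RE.findall(side)]


def _duplicates(numbers):
    """Return the sorted map numbers that occur more than once."""
    return sorted({n for n in numbers if numbers.count(n) > 1})


def check_atom_maps(reaction_smarts):
    """Compare map numbers on the reactant and product sides (hypothetical helper)."""
    reactants, sep, products = reaction_smarts.partition(">>")
    if not sep:
        raise ValueError("reaction SMARTS must contain '>>'")
    r, p = atom_maps(reactants), atom_maps(products)
    return {
        "duplicated_in_reactants": _duplicates(r),
        "duplicated_in_products": _duplicates(p),
        "missing_in_products": sorted(set(r) - set(p)),
        "unknown_in_products": sorted(set(p) - set(r)),
    }


report = check_atom_maps("[C:1]=[C:2].[C:3]=[C:4]>>[C:1][C:2][C:3]=[C:4]")
print(report)  # every list is empty: each mapped atom appears once on each side
```

An empty report only shows that the indices are consistent; whether the bond forms between the intended atoms still has to be read from the product side.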

2. Resolution-Agnostic Backmapping

ChemFAST is Resolution-Agnostic. This means the software does not enforce a specific grain size.

  • Logic: A “CG Bead” is simply a container. Its contents are defined by the SMILES string linked to it in the mols dictionary.

  • Flexibility:
    • Fine Resolution: 1 Bead = C (United Atom Polyethylene).

    • Standard Resolution: 1 Bead = c1ccccc1 (Benzene ring).

    • Coarse Resolution: 1 Bead = NCCCCC(N)C(=O)O (Entire Lysine residue).

  • Backmapping: The backmapping engine “unpacks” the SMILES into a 3D atomic structure, places it at the bead’s center of mass, and rotates it to match the connectivity defined by the SMARTS.
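
To make the placement step concrete, the sketch below shifts an atomistic fragment so that its center of mass coincides with a bead position. It covers only the translation; the rotation that aligns the fragment with its bonded neighbours is omitted. The function names, coordinates, and bead position are illustrative assumptions, not ChemFAST internals.

```python
def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * atom[k] for m, atom in zip(masses, coords)) / total
        for k in range(3)
    )


def place_fragment(coords, masses, bead_position):
    """Translate the fragment so its center of mass sits on the bead."""
    com = center_of_mass(coords, masses)
    shift = tuple(b - c for b, c in zip(bead_position, com))
    return [tuple(x + s for x, s in zip(atom, shift)) for atom in coords]


# Toy three-atom fragment; the coordinates are arbitrary values for illustration.
fragment = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.2, 0.0)]
masses = [12.011, 12.011, 15.999]  # C, C, O

placed = place_fragment(fragment, masses, bead_position=(10.0, 5.0, -3.0))
print(center_of_mass(placed, masses))  # approximately (10.0, 5.0, -3.0)
```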

3. Chemical Definition Format

The chemical rules are defined in two dictionaries: mols (Ingredients) and reaction_template (Recipes).

The mols Dictionary

Defines the mapping between a CG Bead Name (Key) and its chemical properties (Value). Here, we take a Single-Ion Polymer Electrolyte system as an example.

mols = {
    'F': {
        'smiles': 'CC(C(F)(F)(F))C(=O)[N-]S(=O)(=O)C(F)(F)(F)', # Atomic structure
        'file': None,         # Path to PDB file (if is_rigid=True)
        'is_rigid': False     # Treat as rigid body?
    },
    'P': {
        'smiles': 'CCO',
        'file': None,
        'is_rigid': False
    },
    'L': {
        'smiles': '[Li+]',
        'file': None
    }
}

  • Key (str): The arbitrary name of the bead (e.g., 'F', 'P', 'L').

  • smiles (str): The SMILES string giving the bead's atomic structure.

  • file (str/None): Used for Bio-Hybrid systems. If provided, ChemFAST extracts coordinates from this PDB file instead of generating them from the SMILES.

  • is_rigid (bool): If True, the bead behaves as a rigid body during pre-equilibration, and backmapping restores the PDB structure via rotation matrices.
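
Because mols is an ordinary dictionary, entries can be checked before they are handed to ChemFAST. The sketch below is one possible pre-flight check, not ChemFAST behaviour: normalize_mols is a hypothetical helper, and both the default of False for a missing is_rigid and the requirement that a rigid bead has a file are assumptions drawn from the field descriptions above.

```python
def normalize_mols(mols):
    """Return a copy of mols with optional fields filled in and basic checks applied."""
    normalized = {}
    for name, entry in mols.items():
        full = {
            "smiles": entry.get("smiles"),
            "file": entry.get("file"),
            # Assumption: a missing 'is_rigid' is treated as False.
            "is_rigid": entry.get("is_rigid", False),
        }
        if not full["smiles"] and not full["file"]:
            raise ValueError(f"bead {name!r} needs a 'smiles' string or a 'file'")
        # Assumption: rigid beads take their coordinates from a PDB file.
        if full["is_rigid"] and full["file"] is None:
            raise ValueError(f"bead {name!r} is rigid but has no 'file'")
        normalized[name] = full
    return normalized


mols = {
    "P": {"smiles": "CCO", "file": None, "is_rigid": False},
    "L": {"smiles": "[Li+]", "file": None},  # 'is_rigid' omitted
}
checked = normalize_mols(mols)
print(checked["L"]["is_rigid"])  # False
```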

The reaction_template Dictionary

Defines how beads connect. This is critical for topology reconstruction.

reaction_template = {
    'P-F': {
        'cg_reactant_list': [('P', 'F')],
        'smarts': '[CH3,CH2:1].[CH2,CH3:2]>>[C:1][C:2]',
        'prod_idx': [0]
    },
    'O-O': {
        'cg_reactant_list': [('O', 'O')],
        'smarts': '[CH3:1].[O:2]>>[C:1][O:2]',
        'prod_idx': [0]
    },
    'S': { # Example of a self-mapping identity reaction
        'cg_reactant_list': [('L',)],
        'smarts': '*>>*',
        'prod_idx': [0]
    }
}

  • Key (str): A unique name for the reaction type (e.g., 'P-F').

  • cg_reactant_list (list of tuples): The bead types that undergo this reaction. For example, [('P', 'F')] means bead 'P' reacts with bead 'F'.

  • smarts (str): The reaction SMARTS string defining atomic connectivity.

  • prod_idx (list of int): Product Selection.
    • In simulation, we usually model only the result of a reaction, not the reaction itself.

    • A SMARTS reaction might generate byproducts (e.g., water).

    • [0] tells ChemFAST: “The first product in the SMARTS string is the actual connected structure we want to keep in the topology. Discard any others.”
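
The effect of prod_idx can be illustrated on the SMARTS string itself. The sketch below splits the product side on "." and keeps the indexed entries; select_products is a hypothetical helper, the condensation SMARTS is an illustrative example that is not taken from ChemFAST, and the naive split assumes no component-level grouping in the product side.

```python
def select_products(reaction_smarts, prod_idx):
    """Return the product templates selected by prod_idx (0-based)."""
    _, sep, product_side = reaction_smarts.partition(">>")
    if not sep:
        raise ValueError("reaction SMARTS must contain '>>'")
    products = product_side.strip().split(".")
    return [products[i] for i in prod_idx]


# Hypothetical condensation: an ester is formed and water is released.
smarts = "[C:1](=O)[OH].[OH][C:2]>>[C:1](=O)O[C:2].O"

kept = select_products(smarts, [0])
print(kept)  # ['[C:1](=O)O[C:2]']
```

Here the second product, O (the water byproduct), is dropped, so only the connected ester fragment would enter the topology.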

4. CG System Configuration Format

The initial state of the simulation box is defined by a list of metadata dictionaries (metas). Each dictionary represents a distinct molecule type (polymer chain, ion, solvent) and its abundance.

# Example 1: A Polymer Chain (Sequence P-O-O...F)
# Topology: Linear connections defined in 'bond' string
meta1 = {
    'type': ['P','O','O','O','O','O','O','O','O','O','O','O','O','O','F'] * 20,
    'bond': '0-1,1-2,2-3,3-4,4-5...',  # Explicit connectivity indices
    'N': 500,                          # Number of chains
    'is_rigid': False,                 # Chain is flexible
    'is_angle': True,                  # Generate angle potentials
    'is_dihedral': False               # Do not generate dihedral potentials
}

# Example 2: Free Ions (Li+)
# Topology: No bonds
meta2 = {
    'type': ['L'],
    'N': 10000,       # Number of ions
}

# Combine into the system list
metas = [meta1, meta2]
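
Since the configuration is plain Python, the bond string for a linear chain does not have to be typed by hand. The sketch below generates it from the length of the type list; linear_bonds is a hypothetical helper, and the five-bead sequence is a shortened illustration rather than the chain from Example 1.

```python
def linear_bonds(n_beads):
    """Return the 'bond' string for n_beads connected in a straight line (0-based)."""
    if n_beads < 1:
        raise ValueError("n_beads must be at least 1")
    return ",".join(f"{i}-{i + 1}" for i in range(n_beads - 1))


# Shortened illustrative sequence: one 'P' head, three 'O' beads, one 'F' tail.
beads = ['P'] + ['O'] * 3 + ['F']

meta = {
    'type': beads,
    'bond': linear_bonds(len(beads)),
    'N': 500,
    'is_rigid': False,
    'is_angle': True,
    'is_dihedral': False,
}
print(meta['bond'])  # 0-1,1-2,2-3,3-4
```

For a single bead the helper returns an empty string, which corresponds to the bond-free case of Example 2.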

Key Definitions

  • type (list of str):

    The sequence of beads in a single molecule.

    • Example: ['P', 'O', 'F'] defines a trimer.

    • Reference: Every string must match a key in mols.

  • bond (str):

    A comma-separated string defining the connectivity within the chain.

    • Format: "id1-id2,id3-id4,..."

    • Indexing: 0-based, relative to the type list.

    • Example: For ['A', 'B', 'C'] connected linearly, bond = "0-1,1-2".

    • Complex Topologies: For branched polymers or networks, you can define arbitrary connections (e.g., "0-1,0-5" creates a branch point at bead 0).

  • N (int):

    The number of copies of this molecule to place in the simulation box.

  • is_rigid (bool):

    Chain-level Rigidity.

    • If True, the entire molecule defined in type is treated as a single rigid body in the CG simulation (e.g., a rigid nanoparticle).

    • Contrast: is_rigid inside a mols entry makes a single bead rigid; is_rigid inside a meta dictionary makes the whole chain rigid.

  • is_angle (bool):

    If True, ChemFAST will automatically generate angles for this chain in the output XML file based on the connectivity.

  • is_dihedral (bool):

    If True, ChemFAST will automatically generate dihedrals for this chain in the output XML file based on the connectivity.
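
As an illustration of what deriving angles and dihedrals from connectivity involves, the sketch below enumerates them from a bond string. It is a generic graph traversal under stated assumptions, not ChemFAST's implementation: the function names are hypothetical, and the ordering and de-duplication rules ChemFAST applies in its XML output are not documented here.

```python
from collections import defaultdict
from itertools import combinations


def parse_bonds(bond):
    """Turn "0-1,1-2" into [(0, 1), (1, 2)]."""
    pairs = []
    for token in bond.split(","):
        i, j = token.split("-")
        pairs.append((int(i), int(j)))
    return pairs


def build_neighbors(bond):
    """Map each bead index to the set of beads bonded to it."""
    neighbors = defaultdict(set)
    for i, j in parse_bonds(bond):
        neighbors[i].add(j)
        neighbors[j].add(i)
    return neighbors


def enumerate_angles(bond):
    """Every (a, center, b) triple in which a and b are both bonded to center."""
    neighbors = build_neighbors(bond)
    angles = []
    for center in sorted(neighbors):
        for a, b in combinations(sorted(neighbors[center]), 2):
            angles.append((a, center, b))
    return angles


def enumerate_dihedrals(bond):
    """Every (i, j, k, l) chain built around a central bond j-k."""
    neighbors = build_neighbors(bond)
    dihedrals = []
    for j, k in parse_bonds(bond):
        for i in sorted(neighbors[j] - {k}):
            for l in sorted(neighbors[k] - {j}):
                if i != l:  # skip three-membered rings
                    dihedrals.append((i, j, k, l))
    return dihedrals


print(enumerate_angles("0-1,1-2,2-3"))     # [(0, 1, 2), (1, 2, 3)]
print(enumerate_dihedrals("0-1,1-2,2-3"))  # [(0, 1, 2, 3)]
print(enumerate_angles("0-1,0-5"))         # [(1, 0, 5)]
```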