pySBML¶

pySBML is a library to parse SBML models into native, type-annotated Python types and transform ODE models into a simpler representation.

In [1]:

from pathlib import Path

import pysbml

Main routine¶

The main feature of pySBML is to read SBML models and transform them into a simpler representation that can be interpreted directly as a system of ordinary differential equations.

For a one-line solution, you can use the load_and_transform_model function.

The function accepts both pathlib.Path and str arguments. We recommend pathlib.Path, because it keeps scripts portable across platforms.

Note that models define a _repr_markdown_ method, so they render as Markdown when displayed in Jupyter notebooks.

In [2]:

model = pysbml.load_and_transform_model(Path("assets") / "00462.xml")
model

Out[2]:

case00462¶

Parameters¶

name	value	unit
k1	1.00000000000000	None
C	1.00000000000000	None

Variables¶

name	value	unit
S1	$0.000150000000000000$	None
S2	$0.0$	None

Reactions¶

name	fn	stoichiometry
reaction1	$S1*k1$	{'S1': -1.00000000000000, 'S2': 1.00000000000000}

We also supply a codegen function to directly transform your model into a Python module that you can execute.

In [3]:

from pysbml.codegen import codegen

print(codegen(model))

import math
import scipy.special
import pandas as pd

time: float = 0.0
k1: float = 1.00000000000000
C: float = 1.00000000000000
S1: float = 0.000150000000000000
S2: float = 0.0

# Initial assignments
reaction1 = S1*k1
y0 = [S1, S2]
variable_names = ['S1', 'S2']

def model(time: float, variables: tuple[float, ...]) -> tuple[float, ...]:
    S1, S2 = variables
    reaction1: float = S1*k1
    dS1dt: float = -reaction1
    dS2dt: float = reaction1
    return dS1dt, dS2dt


def derived(time: float, variables: tuple[float, ...]) -> dict[str, float]:
    S1, S2 = variables
    reaction1: float = S1*k1
    return {
        'k1': k1,
        'C': C,
        'reaction1': reaction1,
    }
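The generated module exposes y0 and a model(time, variables) function, which is the signature most ODE solvers expect. The following is a minimal sketch of integrating it, not part of pySBML: the constants and the model function are copied by hand from the output above, and rk4_step and integrate are hypothetical helpers written for this example. In practice you would more likely hand model and y0 to an existing solver.

```python
import math
from collections.abc import Callable

# Values copied by hand from the generated module above.
k1: float = 1.0
y0 = [0.00015, 0.0]


def model(time: float, variables: tuple[float, ...]) -> tuple[float, ...]:
    S1, S2 = variables
    reaction1: float = S1 * k1
    return -reaction1, reaction1


Rhs = Callable[[float, tuple[float, ...]], tuple[float, ...]]


def rk4_step(rhs: Rhs, t: float, y: tuple[float, ...], dt: float) -> tuple[float, ...]:
    """Advance y by one classical fourth-order Runge-Kutta step (hypothetical helper)."""

    def shifted(slope: tuple[float, ...], scale: float) -> tuple[float, ...]:
        return tuple(yi + scale * si for yi, si in zip(y, slope))

    a = rhs(t, y)
    b = rhs(t + dt / 2, shifted(a, dt / 2))
    c = rhs(t + dt / 2, shifted(b, dt / 2))
    d = rhs(t + dt, shifted(c, dt))
    return tuple(
        yi + dt / 6 * (ai + 2 * bi + 2 * ci + di)
        for yi, ai, bi, ci, di in zip(y, a, b, c, d)
    )


def integrate(rhs: Rhs, y_start: list[float], t_end: float, n_steps: int) -> tuple[float, ...]:
    """Integrate from t=0 to t_end with a fixed step size."""
    dt = t_end / n_steps
    y = tuple(y_start)
    for i in range(n_steps):
        y = rk4_step(rhs, i * dt, y, dt)
    return y


S1_end, S2_end = integrate(model, y0, t_end=1.0, n_steps=100)

# For this first-order decay the exact solution is S1(t) = S1(0) * exp(-k1 * t).
S1_exact = y0[0] * math.exp(-k1 * 1.0)
```

Because the single reaction only converts S1 into S2, the sum S1_end + S2_end stays at the initial total, which is a cheap sanity check for any solver you plug in.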

Step by step¶

If you want to inspect every step of the process, you can.
In this case, we start by loading the entire SBML document, which contains plugin information and the actual model.

Step 1: loading the model¶

Using the load_document function, we parse the model into native Python types without further modifications.

All SBML constructs, as well as the MathML data, are represented as type-annotated dataclasses.
You can find these in pysbml.parse.data and pysbml.parse.mathml, respectively.

This representation will make it a lot easier to keep all variants in mind.

For example, the Reaction class can contain locally defined parameters as well as stoichiometries, which map a variable either directly to a factor or to a list of (factor, species reference) tuples. This is encoded as follows:

@dataclass(kw_only=True, slots=True)
class Reaction:
    body: Base
    stoichiometry: Mapping[str, float | list[tuple[float, str]]]
    args: list[Symbol]
    local_pars: dict[str, Parameter] = field(default_factory=dict)

No untyped model.getListOfReactions() methods, just data. Simple and efficient.

In [4]:

from pysbml import load_document

doc = load_document(Path("assets") / "00462.xml")
doc.model

Out[4]:

case00462¶

Compartment¶

name	size	is_constant
C	1.0	True

Variables¶

name	amount	conc	constant	substance_units	compartment	only_substance_units	boundary_condition
S1	None	0.00015	False	substance	C	False	False
S2	None	0.0	False	substance	C	False	False

Parameters¶

name	value	is_constant	unit
k1	1.0	True

Reactions¶

name	body	args	stoichiometry	local pars
reaction1	C * k1 * S1	[C, k1, S1]	{'S1': -1.0, 'S2': 1.0}	{}

Step 2: transforming the model¶

As you can see above, the SBML standard contains a lot of different flags and options for what e.g. a Variable is supposed to mean.

This includes whether the variable is an amount or a concentration, is constant, is to be interpreted as an amount (only_substance_units), has a boundary condition, lives in a constant or dynamic compartment, and so on.

To us that representation is too complex.
We want something simpler.
Using the transform function, we can represent the model using just the data below.

type Expr = sympy.Symbol | sympy.Float | sympy.Expr
type Stoichiometry = dict[str, Expr]

class Parameter:
    value: sympy.Float
    unit: Quantity | None

class Variable:
    value: sympy.Float
    unit: Quantity | None

class Reaction:
    expr: sympy.Expr
    stoichiometry: Stoichiometry

class Model:
    name: str
    units: dict[str, Quantity] = field(default_factory=dict)
    functions: dict[str, Expr] = field(default_factory=dict)
    parameters: dict[str, Parameter] = field(default_factory=dict)
    variables: dict[str, Variable] = field(default_factory=dict)
    derived: dict[str, Expr] = field(default_factory=dict)
    reactions: dict[str, Reaction] = field(default_factory=dict)
    initial_assignments: dict[str, Expr] = field(default_factory=dict)

Parameters are always constant, variables always change.
No special handling of compartments, no locally defined parameters.

Note that we also transformed the MathML classes into sympy expressions for easier manipulation.
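With this reduced representation, the right-hand side of the ODE system follows mechanically: the derivative of each variable is the sum, over all reactions, of its stoichiometric factor times the reaction rate. The sketch below illustrates that rule with plain Python callables standing in for the sympy expressions. The dictionaries and the right_hand_side function are hypothetical and written only for this example; they are not the pySBML classes.

```python
from collections.abc import Callable

Values = dict[str, float]
RateFn = Callable[[Values], float]

# Hypothetical stand-in for a transformed model:
# each reaction is a rate function plus a stoichiometry mapping.
reactions: dict[str, tuple[RateFn, dict[str, float]]] = {
    "reaction1": (lambda v: v["S1"] * v["k1"], {"S1": -1.0, "S2": 1.0}),
}
parameters: Values = {"k1": 1.0, "C": 1.0}
variables: Values = {"S1": 0.00015, "S2": 0.0}


def right_hand_side(
    variables: Values,
    parameters: Values,
    reactions: dict[str, tuple[RateFn, dict[str, float]]],
) -> Values:
    """Return d(variable)/dt as the stoichiometry-weighted sum of reaction rates."""
    values = parameters | variables
    derivatives = {name: 0.0 for name in variables}
    for rate_fn, stoichiometry in reactions.values():
        rate = rate_fn(values)
        for species, factor in stoichiometry.items():
            derivatives[species] += factor * rate
    return derivatives


dydt = right_hand_side(variables, parameters, reactions)
```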

In [5]:

from pysbml.transform import transform

model = transform(doc)
model

Out[5]:

case00462¶

Parameters¶

name	value	unit
k1	1.00000000000000	None
C	1.00000000000000	None

Variables¶

name	value	unit
S1	$0.000150000000000000$	None
S2	$0.0$	None

Reactions¶

name	fn	stoichiometry
reaction1	$S1*k1$	{'S1': -1.00000000000000, 'S2': 1.00000000000000}

In [6]:

print(model._repr_markdown_())

# case00462
# Parameters
| name | value | unit | 
| --- | --- | --- | 
| k1 | 1.00000000000000 | None | 
| C | 1.00000000000000 | None | 
# Variables
| name | value | unit | 
| --- | --- | --- | 
| S1 | $0.000150000000000000$ | None | 
| S2 | $0.0$ | None | 
# Reactions
| name | fn | stoichiometry | 
| --- | --- | --- | 
| reaction1 | $S1*k1$ | {'S1': -1.00000000000000, 'S2': 1.00000000000000} |
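Jupyter looks for a _repr_markdown_ method on the last expression of a cell and renders the string it returns, which is why the model displays as tables. If you want the same behaviour for your own types, a minimal sketch could look like this. TinyModel is a hypothetical stand-in written for this example, not the pySBML Model class, and its table layout only loosely mirrors the output above.

```python
from dataclasses import dataclass, field


@dataclass
class TinyModel:
    """Hypothetical stand-in, not the pySBML Model class."""

    name: str
    parameters: dict[str, float] = field(default_factory=dict)

    def _repr_markdown_(self) -> str:
        # Build a heading followed by a pipe table, one row per parameter.
        lines = [
            f"# {self.name}",
            "# Parameters",
            "| name | value |",
            "| --- | --- |",
        ]
        lines.extend(f"| {name} | {value} |" for name, value in self.parameters.items())
        return "\n".join(lines)


markdown = TinyModel("case00462", {"k1": 1.0, "C": 1.0})._repr_markdown_()
```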

Step 3: codegen¶

As above, you can use our codegen function to directly generate a model.

In [7]:

print(codegen(model))

import math
import scipy.special
import pandas as pd

time: float = 0.0
k1: float = 1.00000000000000
C: float = 1.00000000000000
S1: float = 0.000150000000000000
S2: float = 0.0

# Initial assignments
reaction1 = S1*k1
y0 = [S1, S2]
variable_names = ['S1', 'S2']

def model(time: float, variables: tuple[float, ...]) -> tuple[float, ...]:
    S1, S2 = variables
    reaction1: float = S1*k1
    dS1dt: float = -reaction1
    dS2dt: float = reaction1
    return dS1dt, dS2dt


def derived(time: float, variables: tuple[float, ...]) -> dict[str, float]:
    S1, S2 = variables
    reaction1: float = S1*k1
    return {
        'k1': k1,
        'C': C,
        'reaction1': reaction1,
    }
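To execute the generated code rather than just print it, one option is to write it to a file and import that file as a module. The sketch below assumes that codegen returns the source as a str, which the print call above suggests but which you should confirm. To stay self-contained it uses a shortened, hand-copied stand-in for the generated source (without the scipy and pandas imports), and the module name case00462 is chosen only for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path

# Shortened stand-in for the string printed above (assumption: codegen returns str).
generated_source = '''
k1: float = 1.0
S1: float = 0.00015
S2: float = 0.0
y0 = [S1, S2]
variable_names = ['S1', 'S2']

def model(time: float, variables: tuple[float, ...]) -> tuple[float, ...]:
    S1, S2 = variables
    reaction1: float = S1*k1
    dS1dt: float = -reaction1
    dS2dt: float = reaction1
    return dS1dt, dS2dt
'''

with tempfile.TemporaryDirectory() as tmp:
    # Write the source to disk, then load it like any other Python module.
    module_path = Path(tmp) / "case00462.py"
    module_path.write_text(generated_source)
    spec = importlib.util.spec_from_file_location("case00462", module_path)
    assert spec is not None and spec.loader is not None
    generated = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(generated)

# The loaded module stays usable after the temporary directory is gone.
rates = generated.model(0.0, tuple(generated.y0))
```

Writing to a permanent location instead of a temporary directory gives you a module you can version, inspect, and import from other scripts.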

If you have a library yourself and want to just use our transformed model to create your own code, great!
We do the same in MxlPy.

A few pointers for that to work seamlessly:

Derived values are stored as dictionaries internally. Depending on how you set up your models, you will need to sort them so that they are evaluated in the right order, as they might depend on each other. Since this is essentially a dependency resolution problem, we implemented a topological sort for it. Take a look at pysbml.codegen._sort_dependencies for inspiration on how to do this.
Initial assignments have the same issue. Since they can depend on derived values, we recommend sorting twice: once with the initial assignments and once without.
It is legal SBML to have an ODE model without variables or ODEs. Be aware that your inputs and outputs might be empty.
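The ordering problem from the first pointer can be sketched with the standard library's graphlib module. This is an illustration of the idea, not the implementation in pysbml.codegen._sort_dependencies: the sort_derived function and the derived names total, fraction and scaled are hypothetical, and the example assumes you have already extracted, for each derived value, the set of names its expression uses.

```python
from graphlib import TopologicalSorter

# Hypothetical derived values: name -> names used in its expression.
dependencies: dict[str, set[str]] = {
    "scaled": {"fraction", "k1"},
    "fraction": {"S1", "total"},
    "total": {"S1", "S2"},
}


def sort_derived(dependencies: dict[str, set[str]]) -> list[str]:
    """Order derived values so that each one comes after the ones it uses."""
    derived_names = set(dependencies)
    # Parameters and variables are already known, so only edges between
    # derived values matter for the ordering.
    graph = {name: deps & derived_names for name, deps in dependencies.items()}
    # static_order raises graphlib.CycleError if the values depend on each other cyclically.
    return list(TopologicalSorter(graph).static_order())


order = sort_derived(dependencies)
```

For the second pointer, the same function could be called once on the union of derived values and initial assignments and once on the derived values alone.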