Internal DSL

An internal domain-specific language for probability expressions.

Expression	Description
\(P(A)\)	The probability of A occurring
\(P(A^*)\)	The probability of A not occurring
\(P(A, B)\)	The joint probability of A and B occurring
\(P(A \mid B)\)	The conditional probability of A given B occurring
\(P(A \mid B^*)\)	The conditional probability of A occurring given B not occurring
\(P(A^* \mid B)\)	The conditional probability of A not occurring given B occurring
\(P(A^* \mid B^*)\)	The conditional probability of A not occurring given B not occurring
\(\sum_A P(A, B)\)	The marginal probability of B
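
As a small numeric illustration of the table above, the following sketch evaluates each kind of expression on a made-up joint distribution over two binary events. The numbers and the names (`joint`, `p_a`, and so on) are invented for illustration and are not part of y0.

```python
from fractions import Fraction

# Hypothetical joint distribution P(A, B) over two binary events.
# Keys are (a, b), where True means the event occurred.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(2, 10),
    (False, True): Fraction(1, 10),
    (False, False): Fraction(4, 10),
}

# Marginal probability of B: sum over A of P(A, B)
p_b = sum(p for (a, b), p in joint.items() if b)

# P(A) and P(A^*)
p_a = sum(p for (a, b), p in joint.items() if a)
p_not_a = 1 - p_a

# P(A | B) = P(A, B) / P(B)
p_a_given_b = joint[True, True] / p_b

# P(A^* | B^*) = P(A^*, B^*) / P(B^*)
p_not_a_given_not_b = joint[False, False] / (1 - p_b)
```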

Counterfactual expressions from level 3 of Pearl’s Causal Hierarchy:

Expression	Description
\(P(Y_X \mid X^*, Y^*)\)	Probability of sufficient causation
\(P(Y^*_{X^*} \mid X, Y)\)	Probability of necessary causation
\(P(Y_X, Y^*_{X^*})\)	Probability of necessary and sufficient causation
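
To make the three quantities in the table above concrete, the sketch below computes them by brute-force enumeration in a tiny structural causal model. The model (X := U_X, Y := X or U_Y), its probabilities, and the helper `prob` are assumptions chosen for illustration; nothing here uses y0 itself.

```python
from fractions import Fraction
from itertools import product

# Hypothetical structural causal model (an assumption for illustration):
#   X := U_X,  Y := X or U_Y,  with independent exogenous U_X and U_Y.
P_UX = {0: Fraction(1, 2), 1: Fraction(1, 2)}
P_UY = {0: Fraction(4, 5), 1: Fraction(1, 5)}


def f_y(x, u_y):
    """Structural equation for Y."""
    return int(x or u_y)


def prob(event, given=lambda world: True):
    """Return P(event | given) by enumerating all exogenous settings."""
    num = den = Fraction(0)
    for u_x, u_y in product(P_UX, P_UY):
        weight = P_UX[u_x] * P_UY[u_y]
        world = {
            "X": u_x,
            "Y": f_y(u_x, u_y),
            "Y_x1": f_y(1, u_y),  # Y under the intervention X = 1
            "Y_x0": f_y(0, u_y),  # Y under the intervention X = 0
        }
        if given(world):
            den += weight
            if event(world):
                num += weight
    return num / den


# Sufficient causation: Y would occur under X, given neither occurred.
ps = prob(lambda w: w["Y_x1"] == 1, lambda w: w["X"] == 0 and w["Y"] == 0)
# Necessary causation: Y would not occur without X, given both occurred.
pn = prob(lambda w: w["Y_x0"] == 0, lambda w: w["X"] == 1 and w["Y"] == 1)
# Necessary and sufficient causation: a joint counterfactual event.
pns = prob(lambda w: w["Y_x1"] == 1 and w["Y_x0"] == 0)
```

In this toy model X always suffices for Y, so `ps` is 1, while `pn` and `pns` both equal the probability that the background cause U_Y is absent.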

PP: alias of PopulationProbabilityBuilderType

class CounterfactualVariable(name: str, star: bool | None = None, interventions: frozenset[Intervention] = <factory>)[source]

A counterfactual variable.

Counterfactual variables are like normal variables, but can have a list of interventions. Each intervention is either the same as what was observed (no star) or different from what was observed (star).

interventions: frozenset[Intervention]: The interventions on the variable. Should be non-empty.

to_text() → str[source]: Output this counterfactual variable in the internal string format.

to_latex() → str[source]

Output this counterfactual variable in the LaTeX string format.

Returns:: A LaTeX representation of this counterfactual variable

to_y0() → str[source]: Output this counterfactual variable instance as y0 internal DSL code.

is_event() → bool[source]: Return if the counterfactual variable has a value.

has_tautology() → bool[source]

Return if the counterfactual variable contains its own value in the subscript.

Returns:: True if we force a variable X to have a value x and the resulting value of X is x.
Raises:: ValueError – if the counterfactual variable doesn’t have a value assigned

is_inconsistent() → bool[source]

Return if the counterfactual variable violates the Axiom of Effectiveness.

Returns:: True if we force a variable X to have a value x and the resulting value of X is not x
Raises:: ValueError – if the counterfactual variable doesn’t have a value assigned
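
The two checks above can be sketched on a stripped-down data model. The classes `MiniIntervention` and `MiniCounterfactual` are hypothetical stand-ins written for this example, not y0's classes, and the logic is only one plausible reading of the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MiniIntervention:
    """Hypothetical stand-in for an intervention."""

    name: str
    star: bool = False


@dataclass(frozen=True)
class MiniCounterfactual:
    """Hypothetical stand-in for a counterfactual variable."""

    name: str
    star: Optional[bool] = None  # None means no value is assigned
    interventions: frozenset = frozenset()

    def _own_stars(self):
        # Collect the values forced on this very variable, if any.
        if self.star is None:
            raise ValueError("the variable has no value assigned")
        return {i.star for i in self.interventions if i.name == self.name}

    def has_tautology(self):
        # X is forced to x and the resulting value of X is x.
        return self.star in self._own_stars()

    def is_inconsistent(self):
        # X is forced to x but the resulting value of X is not x.
        return (not self.star) in self._own_stars()


forced = frozenset({MiniIntervention("X", star=False)})
same = MiniCounterfactual("X", star=False, interventions=forced)
different = MiniCounterfactual("X", star=True, interventions=forced)
```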

intervene(variables: str | Variable | Iterable[str | Variable]) → CounterfactualVariable[source]

Intervene on this counterfactual variable with the given variable(s).

Parameters:: variables – The variable(s) used to extend this counterfactual variable’s current interventions. Automatically converts variables to interventions.
Returns:: A new counterfactual variable with both this counterfactual variable’s interventions and the given intervention(s).

Warning

Raises a ValueError if the value of a new intervention conflicts with the value of an intervention already listed in this counterfactual variable.

Note

This function can be accessed with the matmul @ operator.
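
For readers unfamiliar with operator overloading, here is a minimal sketch of how an `@` operator can accumulate interventions and reject conflicting ones. The classes `Iv` and `Cf` are hypothetical and greatly simplified; this is not how y0 implements `intervene`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iv:
    """Hypothetical stand-in for an intervention."""

    name: str
    star: bool = False


@dataclass(frozen=True)
class Cf:
    """Hypothetical stand-in for a counterfactual variable."""

    name: str
    interventions: frozenset = frozenset()

    def intervene(self, *new):
        current = {i.name: i.star for i in self.interventions}
        for iv in new:
            # A new intervention may repeat an old one, but not contradict it.
            if current.get(iv.name, iv.star) != iv.star:
                raise ValueError(f"conflicting values for {iv.name}")
            current[iv.name] = iv.star
        merged = frozenset(Iv(n, s) for n, s in current.items())
        return Cf(self.name, merged)

    def __matmul__(self, other):
        # Python calls this for `self @ other`.
        return self.intervene(other)


y_xz = Cf("Y") @ Iv("X") @ Iv("Z")
```

Because the interventions are kept in a frozenset, the order in which they are applied does not affect equality in this sketch.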

invert() → CounterfactualVariable[source]: Invert the value of the counterfactual variable.

class Distribution(children: frozenset[~y0.dsl.Variable], parents: frozenset[~y0.dsl.Variable] = <factory>)[source]

A general distribution over several child variables, conditioned by several parents.

P(X | Y) means that X is a child and Y is a parent.

Create a distribution over the given variable(s) or distribution.

Parameters:

distribution – If given a Distribution, creates a probability expression directly over the distribution. If given variable or list of variables, conveniently creates a Distribution with the variable(s) as children.
args – If the first argument (distribution) was given as a single variable, the args variadic argument can be used to specify a list of additional variables.

Returns:

A Distribution object

Raises:

ValueError – If an invalid combination of arguments is given.

to_text() → str[source]: Output this distribution in the internal string format.

to_y0() → str[source]: Output this distribution instance as y0 internal DSL code.

to_latex() → str[source]: Output this distribution in the LaTeX string format.

is_conditioned() → bool[source]: Return if this distribution is conditioned.

is_markov_kernel() → bool[source]: Return if this distribution is a Markov kernel, meaning it has one child variable and one or more conditionals.

intervene(variables: str | Variable | Iterable[str | Variable]) → Distribution[source]: Return a new distribution that has the given intervention(s) on all variables.

uncondition() → Distribution[source]: Return a new distribution that is not conditioned on the parents.

joint(children: str | Variable | Iterable[str | Variable]) → Distribution[source]

Create a new distribution including the given child variables.

Parameters:: children – The variable(s) with which this distribution’s children are extended
Returns:: A new distribution.

Note

This function can be accessed with the and & operator.

given(parents: str | Variable | Iterable[str | Variable] | Distribution) → Distribution[source]

Create a new mixed distribution additionally conditioned on the given parent variables.

Parameters:: parents – The variable(s) with which this distribution’s parents are extended
Returns:: A new distribution
Raises:: TypeError – If a distribution is given as the parents that contains conditionals

Note

This function can be accessed with the or | operator.
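
The `&` and `|` shortcuts rely on Python's operator precedence, in which `&` binds more tightly than `|`, so `A & B | C` groups as `(A & B) | C`. The sketch below shows the idea with a hypothetical `Dist` class over plain string names; it is an illustration, not y0's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dist:
    """Hypothetical stand-in for a distribution over named variables."""

    children: frozenset
    parents: frozenset = frozenset()

    def joint(self, other):
        # Extend the children; the parents stay as they are.
        return Dist(self.children | other.children, self.parents)

    def given(self, other):
        if other.parents:
            raise TypeError("the given distribution must not be conditioned")
        # Extend the parents with the other distribution's children.
        return Dist(self.children, self.parents | other.children)

    __and__ = joint  # enables `a & b`
    __or__ = given  # enables `a | b`


def var(name):
    """Wrap a single name as a distribution with one child."""
    return Dist(frozenset({name}))


a, b, c = var("A"), var("B"), var("C")
mixed = a & b | c  # groups as (a & b) | c
```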

class Element[source]

An element in the y0 internal domain-specific language that can be converted to text, LaTeX, and code.

abstractmethod to_text() → str[source]: Output this DSL object in the internal string format.

abstractmethod to_latex() → str[source]: Output this DSL object in the LaTeX string format.

abstractmethod to_y0() → str[source]: Output this DSL object as y0 python code.

get_variables() → set[Variable][source]: Get the set of variables used in this expression.

Event

A conjunction of factual and counterfactual events

alias of dict[Variable, Intervention]

class Expression[source]

The abstract class representing all expressions.

conditional(ranges: str | Variable | Iterable[str | Variable]) → Expression[source]

Return this expression, conditioned by the given variables.

Parameters:: ranges – A variable or list of variables on which to condition this expression
Returns:: A fraction whose denominator is this expression summed over the variables that are not in the given ranges, as in the examples below

>>> from y0.dsl import P, Sum, A, B, C
>>> assert P(A, B).conditional(A) == P(A, B) / Sum[B](P(A, B))
>>> assert P(A, B, C).conditional([A, B]) == P(A, B, C) / Sum[C](P(A, B, C))
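
Read as ordinary probability, and assuming Sum[B] denotes summation over all values of B, the first example above is the usual definition of conditioning on A:

```latex
\frac{P(A, B)}{\sum_{B} P(A, B)} = \frac{P(A, B)}{P(A)} = P(B \mid A)
```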

normalize_marginalize(ranges: str | Variable | Iterable[str | Variable]) → Expression[source]: Return this expression, normalized by this expression marginalized by the given variables.

marginalize(ranges: str | Variable | Iterable[str | Variable]) → Expression[source]

Return this expression, marginalizing out the given variables.

Parameters:: ranges – A variable or list of variables over which to marginalize this expression
Returns:: The expression but summed over the given variables

>>> from y0.dsl import P, Sum, A, B, C
>>> assert P(A, B).marginalize(A) == Sum[A](P(A, B))
>>> assert P(A, B, C).marginalize([A, B]) == Sum[A, B](P(A, B, C))
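
The symbolic sums above have a direct numeric counterpart. The sketch below marginalizes a joint probability table stored as a dictionary; the function `marginalize_table` and the table layout are assumptions made for this example and are unrelated to y0's internals.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def marginalize_table(table, names, over):
    """Sum a joint table over the variables listed in `over`.

    `table` maps value tuples (ordered like `names`) to probabilities.
    Returns the remaining names and the marginal table.
    """
    drop = set(over)
    keep = [i for i, name in enumerate(names) if name not in drop]
    out = defaultdict(Fraction)  # missing keys start at 0
    for values, p in table.items():
        out[tuple(values[i] for i in keep)] += p
    return [names[i] for i in keep], dict(out)


# A uniform joint distribution over three binary variables.
names = ["A", "B", "C"]
uniform = {values: Fraction(1, 8) for values in product((0, 1), repeat=3)}

# Numeric analogue of Sum[A, B](P(A, B, C))
rest, p_c = marginalize_table(uniform, names, over=["A", "B"])
```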

simplify() → Expression[source]: Simplify this expression.

class Fraction(numerator: Expression, denominator: Expression)[source]

Represents a fraction of two expressions.

numerator: Expression: The expression in the numerator of the fraction

denominator: Expression: The expression in the denominator of the fraction

to_text() → str[source]: Output this fraction in the internal string format.

to_latex() → str[source]: Output this fraction in the LaTeX string format.

to_y0(parens: bool = True) → str[source]: Output this fraction as y0 internal DSL code.

flip() → Fraction[source]: Exchange the numerator and denominator.

simplify() → Expression[source]: Simplify this fraction.

class Intervention(name: str, star: bool | None = None)[source]

An intervention variable.

An intervention variable is usually used as a subscript in a CounterfactualVariable.

class One[source]

The multiplicative identity (1).

to_text() → str[source]: Output this identity variable in the internal string format.

to_latex() → str[source]: Output this identity instance in the LaTeX string format.

to_y0() → str[source]: Output this identity instance as y0 internal DSL code.

P = <y0.dsl.ProbabilityBuilderType object>

P is a magical object of mystery and wonder that can be used to create Probability instances.

It is a singleton instance of ProbabilityBuilderType and can be used in two ways: either via ProbabilityBuilderType.__call__(), as if it were a function, like P(Y), or via ProbabilityBuilderType.__getitem__() followed by a call, like P[X](Y), to denote interventions using the do-calculus \(L_2\) notation. Here are some examples:

A univariate distribution can be created either with a string or a Variable:

>>> from y0.dsl import P, A
>>> P('A') == P(A)

Multivariate Distributions

A joint distribution can be created with several strings or Variable instances with variadic arguments:

>>> from y0.dsl import P, A, B
>>> P(A, B) == P('A', 'B')

A joint distribution can also be created with a single argument that is an iterable of either strings or Variable instances:

>>> from y0.dsl import P, A, B
>>> P((A, B)) == P([A, B]) == P(('A', 'B')) == P(['A', 'B'])

This even extends to generator expressions, for which you can omit the extra parentheses:

>>> from y0.dsl import P, A, B, Variable
>>> P(Variable(name) for name in 'AB') == P(name for name in 'AB') == P(A, B)
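
Accepting variadic arguments, a single iterable, or a bare generator expression comes down to normalizing the arguments up front. Here is a minimal sketch with a hypothetical helper `normalize_names` that works on plain strings; y0's actual argument handling may differ.

```python
def normalize_names(first, *rest):
    """Return a tuple of names from variadic names or from one iterable."""
    if isinstance(first, str):
        # Variadic form: normalize_names("A", "B")
        return (first, *rest)
    if rest:
        raise ValueError("pass either several names or one iterable, not both")
    # Iterable form: a list, a tuple, or a generator expression
    return tuple(first)


from_varargs = normalize_names("A", "B")
from_list = normalize_names(["A", "B"])
# A generator expression that is the sole argument needs no extra parentheses.
from_generator = normalize_names(name for name in "AB")
```

Note that a string is treated as one name rather than as an iterable of characters, which is why the check for `str` comes first.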

Conditional Distributions

Creation with a conditional distribution:

>>> from y0.dsl import P, A, B
>>> P(A | B)

Creation with a mixed joint/conditional distribution:

>>> from y0.dsl import P, A, B, C
>>> P(A & B | C)

Specifying an Intervention with L2 do-Calculus Notation

Intervene on a single variable:

>>> from y0.dsl import P, X, Y
>>> P[X](Y) == P(Y @ X)

Intervene on multiple children:

>>> from y0.dsl import P, X, Y, Z
>>> P[X](Y, Z) == P(Y @ X & Z @ X)

Intervene on multiple parents:

>>> from y0.dsl import P, W, X, Y, Z
>>> P[X](Y | (W, Z)) == P(Y @ X | (W @ X, Z @ X))

Intervene on both children and parents:

>>> from y0.dsl import P, X, Y, Z
>>> P[X](Y | Z) == P(Y @ X | Z @ X)

Intervene on X on top of previous interventions:

>>> from y0.dsl import P, X, Y, Z
>>> P[X](Y @ Z) == P(Y @ X @ Z)

Mix with L3 notation, in which each variable can carry its own interventions:

>>> from y0.dsl import P, W, X, Y, Z
>>> P[X](Y @ Z | W) == P(Y @ X @ Z | W @ X)

Specifying Multiple Interventions with L2 do-Calculus Notation

Multiple interventions on a single variable:

>>> from y0.dsl import P, X1, X2, Y
>>> P[X1, X2](Y) == P(Y @ X1 @ X2)

Multiple interventions on multiple children:

>>> from y0.dsl import P, X1, X2, Y, Z
>>> P[X1, X2](Y, Z) == P(Y @ X1 @ X2 & Z @ X1 @ X2)

… and so on
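
The two calling styles shown above, P(Y) and P[X](Y), can be supported by a single object that defines both `__call__` and `__getitem__`. The sketch below uses a hypothetical `MiniBuilder` that returns plain tuples instead of probability objects; it only illustrates the mechanism and is not y0's ProbabilityBuilderType.

```python
class MiniBuilder:
    """Hypothetical builder that can be called directly or indexed first."""

    def __call__(self, *children):
        # MiniP("Y"): no interventions
        return ("P", (), children)

    def __getitem__(self, interventions):
        # MiniP["X"] passes one object; MiniP["X1", "X2"] passes a tuple.
        if not isinstance(interventions, tuple):
            interventions = (interventions,)

        def build(*children):
            return ("P", interventions, children)

        return build


MiniP = MiniBuilder()  # a single shared instance, used like a function

plain = MiniP("Y")
single = MiniP["X"]("Y")
multiple = MiniP["X1", "X2"]("Y", "Z")
```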

Population: alias of Variable

class PopulationProbability(distribution: Distribution, population: Variable)[source]

A probability that is annotated with a population.

>>> from y0.dsl import PP, Pi1, Y, X
>>> # Make a population-annotated probability of Y
>>> PP[Pi1](Y)
>>> # Make a population-annotated probability of Y @ X (Y under an intervention on X)
>>> PP[Pi1][X](Y)

Related publications:

Surrogate Outcomes and Transportability (Tikka and Karvanen, 2018)

to_y0() → str[source]: Output this probability instance as y0 internal DSL code.

to_text() → str[source]: Output this probability in the internal string format.

to_latex() → str[source]: Output this probability in the LaTeX string format.

class Probability(distribution: Distribution)[source]

The probability over a distribution.

distribution: Distribution: The distribution over which the probability is expressed

Create a probability expression over the given variable(s) or distribution.

Parameters:

distribution – If given a Distribution, creates a probability expression directly over the distribution. If given variable or list of variables, conveniently creates a Distribution with the variable(s) as children.
args – If the first argument (distribution) was given as a single variable, the args variadic argument can be used to specify a list of additional variables.
interventions – An optional variable or variables to use as interventions.

Returns:

A probability object

to_text() → str[source]: Output this probability in the internal string format.

to_y0() → str[source]: Output this probability instance as y0 internal DSL code.

to_latex() → str[source]: Output this probability in the LaTeX string format.

property parents: frozenset[Variable]: Get the distribution’s parents.

property children: frozenset[Variable]: Get the distribution’s children.

is_conditioned() → bool[source]: Return if this distribution is conditioned.

is_markov_kernel() → bool[source]: Return if this distribution is a Markov kernel, meaning it has one child variable and one or more conditionals.

intervene(variables: str | Variable | Iterable[str | Variable]) → Probability[source]: Return a new probability where the underlying distribution has been intervened by the given variables.

uncondition() → Probability[source]

Return a new probability where the underlying distribution is no longer conditioned by the parents.

Returns:: A new probability over a distribution over the children and parents of the previous distribution

>>> from y0.dsl import P, A, B
>>> P(A | B).uncondition() == P(A, B)

conditional(ranges: str | Variable | Iterable[str | Variable]) → Expression[source]

Return this expression, conditioned by the given variables.

Parameters:: ranges – A variable or list of variables on which to condition this expression
Returns:: A fraction whose denominator is this expression summed over the variables that are not in the given ranges, as in the examples below

>>> from y0.dsl import P, Sum, A, B, C
>>> assert P(A, B).conditional(A) == P(A, B) / Sum[B](P(A, B))
>>> assert P(A, B, C).conditional([A, B]) == P(A, B, C) / Sum[C](P(A, B, C))

class Product(expressions: tuple[Expression, ...])[source]

Represent the product of several probability expressions.

classmethod safe(expressions: Expression | Iterable[Expression]) → Expression[source]

Construct a product from any iterable of expressions.

Parameters:: expressions – An expression or iterable of expressions which should be multiplied
Returns:: A Product object

Standard usage, same as the normal __init__:

>>> from y0.dsl import Product, X, Y, A, P
>>> Product.safe((P(X, Y),))

Use a list or other iterable:

>>> Product.safe([P(X), P(Y | X)])

Use an inline generator:

>>> Product.safe(P(v) for v in [X, Y])

Use a single expression:

>>> Product.safe(P(X, Y))

simplify() → Expression[source]

Simplify this product.

If there are products inside, recursively simplify them.
If there are fractions inside, combine them into a single fraction.
If there is a zero inside, return zero.
Discard ones.
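
The simplification rules listed above can be sketched on a toy expression model. The classes `Sym`, `Frac`, and `Prod` and the function `simplify_product` are hypothetical and written only for this example; the sketch does not cancel common factors and should not be taken as y0's algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str


@dataclass(frozen=True)
class Frac:
    num: tuple  # factors in the numerator
    den: tuple  # factors in the denominator


@dataclass(frozen=True)
class Prod:
    factors: tuple


ONE, ZERO = Sym("1"), Sym("0")


def simplify_product(prod):
    """Flatten nested products, merge fractions, and handle 0 and 1."""
    nums, dens = [], []
    stack = list(reversed(prod.factors))
    while stack:
        factor = stack.pop()
        if isinstance(factor, Prod):
            # Nested product: process its factors in order.
            stack.extend(reversed(factor.factors))
        elif isinstance(factor, Frac):
            # Fraction: numerators join the product, denominators collect.
            stack.extend(reversed(factor.num))
            dens.extend(d for d in factor.den if d != ONE)
        elif factor == ZERO:
            return ZERO  # anything times zero is zero
        elif factor != ONE:
            nums.append(factor)  # ones are dropped
    if dens:
        return Frac(tuple(nums) or (ONE,), tuple(dens))
    if not nums:
        return ONE
    return nums[0] if len(nums) == 1 else Prod(tuple(nums))


a, b, c, d = Sym("a"), Sym("b"), Sym("c"), Sym("d")
flattened = simplify_product(Prod((a, ONE, Prod((b, ONE)))))
merged = simplify_product(Prod((Frac((a,), (b,)), Frac((c,), (d,)))))
```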

to_text() → str[source]: Output this product in the internal string format.

to_y0() → str[source]: Output this product instance as y0 internal DSL code.

to_latex() → str[source]: Output this product in the LaTeX string format.

Q: alias of QFactor

class QFactor(domain: frozenset[Variable], codomain: frozenset[Variable])[source]

A function from the variables in the domain to a probability function over variables in the codomain.

classmethod safe(domain: str | Variable | Iterable[str | Variable], *args: str | Variable, codomain: str | Variable | Iterable[str | Variable]) → QFactor[source]: Create a Q factor with various input types.

to_text() → str[source]: Output this Q factor in the internal string format.

to_latex() → str[source]: Output this Q factor in the LaTeX string format.

to_y0() → str[source]: Output this Q factor instance as y0 internal DSL code.

class Sum(expression: Expression, ranges: frozenset[Variable])[source]

Represent the sum over an expression over an optional set of variables.

expression: Expression: The expression over which the sum is done

ranges: frozenset[Variable]: The variables over which the sum is done. Defaults to an empty list, meaning no variables.

classmethod safe(expression: Expression, ranges: str | Variable | Iterable[str | Variable], *, simplify: bool = False) → Expression[source]

Construct a sum from an expression and a permissive set of things in the ranges.

Parameters:

expression – The expression over which the sum is done
ranges – The variable or list of variables over which the sum is done
simplify – Should the sum be simplified using Sum.simplify()?

Returns:

A Sum object

Standard usage, same as the normal __init__:

>>> from y0.dsl import Sum, X, Y, A, P
>>> Sum.safe(P(X, Y), (X,))

Use a list or other iterable:

>>> Sum.safe(P(X, Y), [X])

Use a single variable:

>>> Sum.safe(P(X, Y), X)

simplify() → Expression[source]: Simplify this sum.

to_text() → str[source]: Output this sum in the internal string format.

to_latex() → str[source]: Output this sum in the LaTeX string format.

to_y0() → str[source]: Output this sum instance as y0 internal DSL code.

class Variable(name: str, star: bool | None = None)[source]

A variable, typically with a single letter.

name: str: The name of the variable

star: bool | None = None: The star status of the variable. None means it’s a variable, False means it’s the same as the value for the variable, and True means it’s a different value from the variable.

classmethod norm(name: str | Variable) → Variable[source]: Automatically upgrade a string to a variable.

get_base() → Variable[source]: Return the base variable, with no other nonsense.

to_text() → str[source]: Output this variable in the internal string format.

to_sympy() → sympy.Symbol[source]: Get the object for sympy.

to_latex() → str[source]

Output this variable in the LaTeX string format.

Returns:: The LaTeX representation of this variable.

>>> from y0.dsl import Variable
>>> Variable("X").to_latex()
'X'
>>> Variable("X", star=True).to_latex()
'X^{+}'
>>> Variable("X", star=False).to_latex()
'X^{-}'
>>> Variable("X1").to_latex()
'X_1'
>>> Variable("X1", star=True).to_latex()
'{X_1}^{+}'
>>> Variable("X12").to_latex()
'X_{12}'
>>> Variable("X12", star=True).to_latex()
'{X_{12}}^{+}'
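
The outputs listed above follow a simple pattern: trailing digits become a subscript, and a star status becomes a `+` or `-` superscript. The sketch below reproduces exactly these listed outputs with a hypothetical function `name_to_latex`; it is inferred from the examples and may not match y0's implementation for other inputs.

```python
import re
from typing import Optional

# Letters followed by digits, such as "X1" or "X12".
_NAME = re.compile(r"^([A-Za-z]+)(\d+)$")


def name_to_latex(name: str, star: Optional[bool] = None) -> str:
    """Render a variable name, with optional star status, as LaTeX."""
    match = _NAME.match(name)
    if match is None:
        base = name
    else:
        letters, digits = match.groups()
        # Multi-digit subscripts need braces to stay together.
        sub = digits if len(digits) == 1 else "{" + digits + "}"
        base = f"{letters}_{sub}"
    if star is None:
        return base
    mark = "+" if star else "-"
    # Brace a subscripted base so the superscript applies to the whole symbol.
    body = "{" + base + "}" if match else base
    return f"{body}^{{{mark}}}"
```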

to_y0() → str[source]: Output this variable instance as y0 internal DSL code.

intervene(variables: str | Variable | Iterable[str | Variable]) → CounterfactualVariable[source]

Intervene on this variable with the given variable(s).

Parameters:: variables – The variable(s) used to extend this variable as it is changed to a counterfactual variable
Returns:: A new counterfactual variable over this variable with the given intervention(s).

Note

This function can be accessed with the matmul @ operator.

given(parents: str | Variable | Iterable[str | Variable] | Distribution) → Distribution[source]

Create a distribution in which this variable is conditioned on the given variable(s).

The new distribution is a Markov Kernel.

Parameters:: parents – A variable or list of variables to include as conditions in the new conditional distribution
Returns:: A new conditional probability distribution
Raises:: TypeError – If a distribution is given as the parents that contains conditionals

Note

This function can be accessed with the or | operator.

joint(children: str | Variable | Iterable[str | Variable]) → Distribution[source]

Create a joint distribution between this variable and the given variable(s).

Parameters:: children – The variable(s) for use with this variable in a joint distribution
Returns:: A new joint distribution over this variable and the given variables.

Note

This function can be accessed with the and & operator.

invert() → Variable[source]: Create an Intervention variable that is different from what was observed (with a star).

class Zero[source]

The additive identity (0).

to_text() → str[source]: Output this identity variable in the internal string format.

to_latex() → str[source]: Output this identity instance in the LaTeX string format.

to_y0() → str[source]: Output this identity instance as y0 internal DSL code.

ensure_ordering(expression: Expression, *, ordering: None | Iterable[str | Variable] = None) → Sequence[Variable][source]

Get a canonical ordering of the variables in the expression, or pass one through.

The canonical ordering of the variables in a given expression is based on the alphabetical sort order of the variables based on their names.

Parameters:

expression – The expression to get a canonical ordering from.
ordering – A given ordering to pass through if not none, otherwise calculate it.

Returns:

The ordering
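
The pass-through-or-compute behavior described above is a common pattern. A minimal sketch over plain string names follows; the function `ordering_or_canonical` is hypothetical, and y0's version works on expressions and Variable objects instead.

```python
from typing import Iterable, List, Optional


def ordering_or_canonical(
    names: Iterable[str],
    ordering: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the given ordering, or else the names sorted alphabetically."""
    if ordering is not None:
        # Pass a caller-supplied ordering through unchanged.
        return list(ordering)
    # Otherwise, deduplicate and sort by name.
    return sorted(set(names))


canonical = ordering_or_canonical(["Z", "A", "M", "A"])
passed_through = ordering_or_canonical(["Z", "A", "M"], ordering=["M", "Z", "A"])
```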

vmap_adj(adjacency_dict: Mapping[str | Variable, Iterable[str | Variable]]) → dict[Variable, list[Variable]][source]: Map an adjacency dictionary of strings to variables.

vmap_pairs(edges: Iterable[tuple[str | Variable, str | Variable]]) → list[tuple[Variable, Variable]][source]: Map pairs of strings to pairs of variables.
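
Both mapping helpers amount to upgrading every string to a variable while leaving existing variables alone. The sketch below shows that idea with a hypothetical `Var` class and helpers `norm`, `map_adj`, and `map_pairs`; these are illustrative stand-ins, not the y0 functions themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in for a variable."""

    name: str


def norm(x):
    """Upgrade a string to a Var; leave a Var unchanged."""
    return x if isinstance(x, Var) else Var(x)


def map_adj(adjacency):
    """Upgrade the keys and values of an adjacency dictionary."""
    return {norm(k): [norm(v) for v in vs] for k, vs in adjacency.items()}


def map_pairs(edges):
    """Upgrade both ends of every edge."""
    return [(norm(u), norm(v)) for u, v in edges]


adjacency = map_adj({"X": ["Y", Var("Z")], "Y": []})
edges = map_pairs([("X", "Y"), (Var("Y"), "Z")])
```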