Bases: ComputationGraphPhenotype

ScorePhenotype is a CompositePhenotype that performs arithmetic operations on the boolean columns of its component phenotypes and populates the value column. It should be used for calculating medical scores such as CHADSVASC, HASBLED, etc.
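
To make the arithmetic concrete, here is a minimal plain-Python sketch of what such a score expression computes per patient. It is not PhenEx code: the patient rows, the flag names, and the `example_score` helper are hypothetical, and PhenEx evaluates the equivalent expression on Ibis tables rather than on dictionaries.

```python
# Hypothetical per-patient boolean flags, standing in for the boolean
# columns of three component phenotypes.
patients = [
    {"person_id": 1, "age_gt_45": True, "hypertension": True, "chf": False},
    {"person_id": 2, "age_gt_45": False, "hypertension": True, "chf": True},
    {"person_id": 3, "age_gt_45": True, "hypertension": True, "chf": True},
]


def example_score(row):
    """Weighted sum of boolean flags: 2 points for age, 1 each for the others."""
    # Booleans are cast to float before the arithmetic, so the result is a
    # numeric score rather than a boolean.
    return (
        2 * float(row["age_gt_45"])
        + float(row["hypertension"])
        + float(row["chf"])
    )


scores = {row["person_id"]: example_score(row) for row in patients}
print(scores)  # {1: 3.0, 2: 2.0, 3: 4.0}
```

Each component contributes its weight only when its boolean flag is true, which is the behaviour a score such as CHADSVASC relies on.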

--> See the comparison table of CompositePhenotype classes


| Name | Type | Description | Default |
| --- | --- | --- | --- |
| expression | ComputationGraph | The arithmetic expression to be evaluated, composed of phenotypes combined by Python arithmetic operations. | required |
| return_date | Union[str, Phenotype] | The date to be returned for the phenotype. Can be "first", "last", or a Phenotype object. | "first" |
| name | str | The name of the phenotype. | None |



| Name | Type | Description |
| --- | --- | --- |
| table | PhenotypeTable | The resulting phenotype table after filtering (None until execute is called). |


# Create component phenotypes individually
hypertension = Phenotype(Codelist('hypertension'))
chf = Phenotype(Codelist('chf'))
age_gt_45 = AgePhenotype(min_age=GreaterThan(45))

# Create the ScorePhenotype. The score adds 2 points if age is greater than 45
# and 1 point each if hypertension or chf is present. Notice that the boolean
# columns of the component phenotypes are used for the calculation and that
# the value column of the ScorePhenotype table is populated.
pt = ScorePhenotype(
    expression = 2 * age_gt_45 + hypertension + chf,
)

Source code in phenex/phenotypes/
    def __init__(
        self,
        expression: ComputationGraph,
        return_date: Union[str, Phenotype] = "first",
        name: str = None,
    ):
        # The arguments of this call are truncated in the source listing.
        super(ScorePhenotype, self).__init__(...)

namespaced_table property

A PhenotypeTable has generic column names 'person_id', 'boolean', 'event_date', and 'value'. The namespaced_table appends the phenotype name to all of these columns. This is useful when joining multiple phenotype tables together.


| Name | Type | Description |
| --- | --- | --- |
| table | Table | The namespaced table for the current phenotype. |


_coalesce_all_date_columns(table)

ComputationGraphPhenotypes have multiple possible date columns. To work with these date columns, which may be null, we perform a coalesce operation for each date column, which allows operations such as 'least' and 'greatest' to work correctly.
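
The rotation idea can be sketched in plain Python. This is an illustration under assumptions, not the PhenEx implementation: `coalesce`, `strict_least`, and `rotated_coalesce` are hypothetical helpers, and `strict_least` emulates a backend whose 'least' returns null as soon as any argument is null (some SQL engines behave this way, others ignore nulls).

```python
from datetime import date


def coalesce(*values):
    """Return the first non-None value, or None if all are None."""
    return next((v for v in values if v is not None), None)


def strict_least(*values):
    """Emulate a null-propagating 'least': any None makes the result None."""
    return None if any(v is None for v in values) else min(values)


def rotated_coalesce(values):
    """For each position i, coalesce the values starting at i and wrapping around."""
    return [coalesce(*(values[i:] + values[:i])) for i in range(len(values))]


# One patient's event dates from three component phenotypes; the second is missing.
event_dates = [date(2020, 5, 1), None, date(2019, 3, 15)]

filled = rotated_coalesce(event_dates)
print(strict_least(*event_dates))  # None
print(strict_least(*filled))       # 2019-03-15
```

Because every rotated coalesce falls back to the other columns, a filled column is null only when all date columns are null, so the minimum or maximum is taken over the dates that actually exist.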


| Name | Type | Description | Default |
| --- | --- | --- | --- |
| table | | The Ibis table object (e.g., joined_table). | required |



| Type | Description |
| --- | --- |
| | A list of Ibis expressions, one per date column, each representing the COALESCE of the date columns starting from that column. |

Source code in phenex/phenotypes/
def _coalesce_all_date_columns(self, table):
    """
    ComputationGraphPhenotypes have multiple possible date columns. To work with these date columns, which may be null, we perform a coalesce operation for each date column, which allows operations such as 'least' and 'greatest' to work correctly.

    Parameters:
        table: The Ibis table object (e.g., joined_table).

    Returns:
        Ibis expression representing the COALESCE of the columns.
    """
    coalesce_expressions = []

    names = [col for col in table.columns if "EVENT_DATE" in col]

    for i in range(len(names)):
        rotated_names = names[i:] + names[:i]
        coalesce_expr = ibis.coalesce(
            *(getattr(table, col) for col in rotated_names)
        )
        coalesce_expressions.append(coalesce_expr)

    return coalesce_expressions


_execute(tables)

Executes the score phenotype processing logic.


| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tables | Dict[str, Table] | A dictionary where the keys are table names and the values are Table objects. | required |



| Name | Type | Description |
| --- | --- | --- |
| PhenotypeTable | PhenotypeTable | The resulting phenotype table containing the required columns. |

Source code in phenex/phenotypes/
def _execute(self, tables: Dict[str, Table]) -> PhenotypeTable:
    """
    Executes the score phenotype processing logic.

    Parameters:
        tables (Dict[str, Table]): A dictionary where the keys are table names and the values are Table objects.

    Returns:
        PhenotypeTable: The resulting phenotype table containing the required columns.
    """
    joined_table = hstack(self.children, tables["PERSON"].select("PERSON_ID"))

    if self.populate == "value" and self.operate_on == "boolean":
        for child in self.children:
            # Assumes each child's boolean column is namespaced as <child name>_BOOLEAN.
            column_name = f"{child.name}_BOOLEAN"
            joined_table = joined_table.mutate(
                **{column_name: joined_table[column_name].cast(float)}
            )

    if self.populate == "value":
        _expression = self.expression.get_value_expression(
            joined_table, operate_on=self.operate_on
        )
        joined_table = joined_table.mutate(VALUE=_expression)
        # Arithmetic operations imply a boolean 'and' of the children, i.e.
        # child1 + child2 implies child1 and child2. A null value can only
        # arise when one of the children is null, so those rows are filtered
        # out because the implied boolean condition is not met.
        joined_table = joined_table.filter(joined_table["VALUE"].notnull())

    elif self.populate == "boolean":
        _expression = self.expression.get_boolean_expression(
            joined_table, operate_on=self.operate_on
        )
        joined_table = joined_table.mutate(BOOLEAN=_expression)

    # Return the first or last event date
    date_columns = self._coalesce_all_date_columns(joined_table)
    if self.return_date == "first":
        joined_table = joined_table.mutate(EVENT_DATE=ibis.least(*date_columns))
    elif self.return_date == "last":
        joined_table = joined_table.mutate(EVENT_DATE=ibis.greatest(*date_columns))
    elif self.return_date == "all":
        joined_table = self._return_all_dates(joined_table, date_columns)
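    elif isinstance(self.return_date, Phenotype):
        # Assumes the date column of the return_date phenotype is namespaced as <name>_EVENT_DATE.
        joined_table = joined_table.mutate(
            EVENT_DATE=getattr(joined_table, f"{self.return_date.name}_EVENT_DATE")
        )
    else:
        joined_table = joined_table.mutate(EVENT_DATE=ibis.null(date))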

    # Reduce the table to only include rows where the boolean column is True
    if self.reduce:
        joined_table = joined_table.filter(joined_table.BOOLEAN == True)

    # Add a null value column if it doesn't exist, for example in the case of a LogicPhenotype
    schema = joined_table.schema()
    if "VALUE" not in schema.names:
        joined_table = joined_table.mutate(VALUE=ibis.null())

    return joined_table
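
The null-filtering step above can be illustrated with a small plain-Python sketch. The rows, the column names `child1` and `child2`, and the `add_nullable` helper are hypothetical; the sketch only assumes SQL-style null propagation, where an arithmetic result is null if any operand is null.

```python
def add_nullable(a, b):
    """Addition with SQL-style null propagation: None in, None out."""
    return None if a is None or b is None else a + b


# Hypothetical joined rows: per-person values of two children, None where a
# child has no row for that person.
joined = [
    {"person_id": 1, "child1": 1.0, "child2": 1.0},
    {"person_id": 2, "child1": 1.0, "child2": None},
    {"person_id": 3, "child1": None, "child2": None},
]

with_value = [
    {**row, "value": add_nullable(row["child1"], row["child2"])} for row in joined
]
# Keep only rows whose value is not null, i.e. rows where every child is present.
kept = [row for row in with_value if row["value"] is not None]
print([(row["person_id"], row["value"]) for row in kept])  # [(1, 2.0)]
```

Only person 1 survives the filter, because that is the only row in which both children contribute a non-null operand.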

_return_all_dates(table, date_columns)

If return_date is "all", we want to return all the dates on which the phenotype criteria are fulfilled; this is the union of all non-null dates in any of the leaf phenotype date columns.


| Name | Type | Description | Default |
| --- | --- | --- | --- |
| table | | The Ibis table object (e.g., joined_table) that contains all leaf phenotypes stacked horizontally. | required |
| date_columns | | List of base columns as ibis objects. | required |



| Type | Description |
| --- | --- |
| | Ibis expression representing the UNION of all non-null dates. |

Source code in phenex/phenotypes/
def _return_all_dates(self, table, date_columns):
    """
    If return_date is "all", we want to return all the dates on which the phenotype criteria are fulfilled; this is the union of all non-null dates in any of the leaf phenotype date columns.

    Parameters:
        table: The Ibis table object (e.g., joined_table) that contains all leaf phenotypes stacked horizontally
        date_columns: List of base columns as ibis objects

    Returns:
        Ibis expression representing the UNION of all non-null dates.
    """
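    # get all the non-null dates for each date column
    non_null_dates_by_date_col = []
    for date_col in date_columns:
        # Assumes the selected date is exposed as EVENT_DATE; the arguments
        # of this mutate call are truncated in the source listing.
        non_null_dates = table.filter(date_col.notnull()).mutate(
            EVENT_DATE=date_col
        )
        non_null_dates_by_date_col.append(non_null_dates)
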
    # do the union of all the non-null dates
    all_dates = non_null_dates_by_date_col[0]
    for non_null_dates in non_null_dates_by_date_col[1:]:
        all_dates = all_dates.union(non_null_dates)
    return all_dates
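
The filter-then-union pattern can be sketched in plain Python as follows. The rows and the column names `a_date` and `b_date` are hypothetical, and list concatenation stands in for the table union, so repeated rows would be kept rather than removed.

```python
from datetime import date

# Hypothetical horizontally stacked rows: one date column per leaf phenotype.
rows = [
    {"person_id": 1, "a_date": date(2020, 1, 1), "b_date": date(2021, 6, 1)},
    {"person_id": 2, "a_date": None, "b_date": date(2019, 2, 2)},
]
date_columns = ["a_date", "b_date"]

# For each date column, keep the rows where it is non-null and expose that
# date as event_date.
per_column = [
    [
        {"person_id": r["person_id"], "event_date": r[col]}
        for r in rows
        if r[col] is not None
    ]
    for col in date_columns
]

# Union the per-column results, starting from the first and adding the rest.
all_dates = per_column[0]
for part in per_column[1:]:
    all_dates = all_dates + part

for r in all_dates:
    print(r["person_id"], r["event_date"])
```

Person 1 appears twice in the result, once per non-null date, while person 2 appears once; this is what returning "all" dates means for a patient with several qualifying events.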


execute(tables)

Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.


| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tables | Dict[str, PhenexTable] | A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables(). | required |



| Name | Type | Description |
| --- | --- | --- |
| table | PhenotypeTable | The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more. |

Source code in phenex/phenotypes/
def execute(self, tables: Dict[str, Table]) -> PhenotypeTable:
    """
    Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.

    Parameters:
        tables (Dict[str, PhenexTable]): A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables().

    Returns:
        table (PhenotypeTable): The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more.
    """
    # Only the message strings survive in the source listing; the logger
    # name, the log levels, and the placeholder contents are assumptions.
    logger.info(f"Phenotype '{self.name}': executing...")
    for child in self.children:
        if child.table is None:
            logger.debug(
                f"Phenotype {self.name}: executing child phenotype '{child.name}'..."
            )
            child.execute(tables)
        else:
            logger.debug(
                f"Phenotype {self.name}: skipping already computed child phenotype '{child.name}'."
            )

    table = self._execute(tables).mutate(BOOLEAN=True)

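    if not set(PHENOTYPE_TABLE_COLUMNS) <= set(table.columns):
        raise ValueError(
            f"Phenotype {self.name} must return columns {PHENOTYPE_TABLE_COLUMNS}. Found {table.columns}."
        )

    # Assumes the validated table is reduced to the standard columns; the
    # right-hand side of this assignment is truncated in the source listing.
    self.table = table.select(PHENOTYPE_TABLE_COLUMNS)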
    # for some reason, having NULL datatype screws up writing the table to disk; here we make explicit cast
    if type(self.table.schema()["VALUE"]) == ibis.expr.datatypes.core.Null:
        self.table = self.table.cast({"VALUE": "float64"})

    assert is_phenex_phenotype_table(self.table)
    # Logger call reconstructed from the surviving message string.
    logger.info(f"Phenotype '{self.name}': execution completed.")
    return self.table
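
The recursive, cached execution described above can be sketched with a hypothetical stand-in class. `Node` is not a PhenEx class; it only mimics the `children` and `table` attributes to show that a shared child is computed once, however many parents reference it.

```python
class Node:
    """Hypothetical stand-in for a phenotype with children and a cached table."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.table = None  # None until execute() has run
        self.run_count = 0

    def execute(self):
        for child in self.children:
            if child.table is None:  # skip children that are already computed
                child.execute()
        self.run_count += 1
        self.table = f"table({self.name})"
        return self.table


shared = Node("hypertension")
score_a = Node("score_a", [shared])
score_b = Node("score_b", [shared])

score_a.execute()
score_b.execute()
print(shared.run_count)  # 1
```

The second parent finds the child's table already populated and skips it, which is why reusing one component phenotype across several scores does not repeat its computation.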