LogicPhenotype

Bases: ComputationGraphPhenotype

LogicPhenotype is a composite phenotype that performs boolean operations on the boolean column of its component phenotypes and populates the boolean column of the resulting phenotype table. Use it whenever multiple phenotypes are logically combined, for example to ask whether a patient has diabetes AND hypertension.

--> See the comparison table of CompositePhenotype classes

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| expression | ComputationGraph | The logical expression to be evaluated, composed of phenotypes combined with Python operators. | required |
| return_date | Union[str, Phenotype] | The date to be returned for the phenotype. Can be "first", "last", "all", or a Phenotype object. | 'first' |
| name | str | The name of the phenotype. | None |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| table | PhenotypeTable | The resulting phenotype table after filtering (None until execute is called). |

Source code in phenex/phenotypes/computation_graph_phenotypes.py
class LogicPhenotype(ComputationGraphPhenotype):
    """
    LogicPhenotype is a composite phenotype that performs boolean operations using the **boolean** column of its component phenotypes and populates the **boolean** column of the resulting phenotype table. It should be used in any instance where multiple phenotypes are logically combined, for example, does a patient have diabetes AND hypertension, etc.

    --> See the comparison table of CompositePhenotype classes

    Parameters:
        expression: The logical expression to be evaluated composed of phenotypes combined by python arithmetic operations.
        return_date: The date to be returned for the phenotype. Can be "first", "last", or a Phenotype object.
        name: The name of the phenotype.

    Attributes:
        table (PhenotypeTable): The resulting phenotype table after filtering (None until execute is called)
    """

    def __init__(
        self,
        expression: ComputationGraph,
        return_date: Union[str, Phenotype] = "first",
        name: str = None,
        **kwargs,
    ):
        super(LogicPhenotype, self).__init__(
            name=name,
            expression=expression,
            return_date=return_date,
            operate_on="boolean",
            populate="boolean",
            reduce=True,
        )
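
The idea of building a logical expression from phenotypes with Python operators can be illustrated without PhenEx. This is a minimal sketch only: Expr, Leaf, Op, and logic_phenotype are hypothetical stand-ins, not the library's ComputationGraph API, and rows are plain dictionaries rather than Ibis tables. It mirrors operate_on="boolean", populate="boolean", and reduce=True by evaluating the expression on each patient's boolean flags and keeping only the patients for whom it is True.

```python
from typing import Dict, List, Optional


class Expr:
    """Hypothetical expression node; the operators build a small tree."""

    def __and__(self, other: "Expr") -> "Expr":
        return Op("and", self, other)

    def __or__(self, other: "Expr") -> "Expr":
        return Op("or", self, other)

    def __invert__(self) -> "Expr":
        return Op("not", self)

    def evaluate(self, row: Dict[str, bool]) -> bool:
        raise NotImplementedError


class Leaf(Expr):
    """Stands in for the boolean column of one component phenotype."""

    def __init__(self, name: str):
        self.name = name

    def evaluate(self, row: Dict[str, bool]) -> bool:
        # Assumption: a patient absent from a component counts as False.
        return bool(row.get(self.name, False))


class Op(Expr):
    def __init__(self, kind: str, left: Expr, right: Optional[Expr] = None):
        self.kind, self.left, self.right = kind, left, right

    def evaluate(self, row: Dict[str, bool]) -> bool:
        if self.kind == "not":
            return not self.left.evaluate(row)
        left, right = self.left.evaluate(row), self.right.evaluate(row)
        return (left and right) if self.kind == "and" else (left or right)


def logic_phenotype(expression: Expr, rows: Dict[int, Dict[str, bool]]) -> List[int]:
    """Return the person ids for which the expression is True (the reduce step)."""
    return [pid for pid, row in rows.items() if expression.evaluate(row)]


rows = {
    1: {"diabetes": True, "hypertension": True},
    2: {"diabetes": True, "hypertension": False},
    3: {},
}
diabetes, hypertension = Leaf("diabetes"), Leaf("hypertension")
both = logic_phenotype(diabetes & hypertension, rows)
either = logic_phenotype(diabetes | hypertension, rows)
```

In this sketch, both contains only patient 1 and either contains patients 1 and 2; patient 3 has no component flags and is dropped in both cases.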

namespaced_table property

A PhenotypeTable has generic column names 'person_id', 'boolean', 'event_date', and 'value'. The namespaced_table prefixes these column names with the phenotype name (for example, the {name}_BOOLEAN and {name}_EVENT_DATE columns read in _execute). This is useful when joining multiple phenotype tables together.

Returns:

| Name | Type | Description |
| --- | --- | --- |
| table | Table | The namespaced table for the current phenotype. |

_coalesce_all_date_columns(table)

ComputationGraphPhenotypes have multiple possible date columns. To work with these date columns, which may be null, we perform a coalesce operation for each date column, which allows operations such as 'least' and 'greatest' to work correctly.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| table |  | The Ibis table object (e.g., joined_table). | required |

Returns:

| Type | Description |
| --- | --- |
|  | List of Ibis expressions, one per date column, each the COALESCE of all date columns rotated to start at that column. |

Source code in phenex/phenotypes/computation_graph_phenotypes.py
def _coalesce_all_date_columns(self, table):
    """
    ComputationGraphPhenotypes have multiple possible date columns. To work with these date columns, which may be null, we perform a coalesce operation for each date column, which allows operations such as 'least' and 'greatest' to work correctly.

    Args:
        table: The Ibis table object (e.g., joined_table).

    Returns:
        Ibis expression representing the COALESCE of the columns.
    """
    coalesce_expressions = []

    names = [col for col in table.columns if "EVENT_DATE" in col]

    for i in range(len(names)):
        rotated_names = names[i:] + names[:i]
        coalesce_expr = ibis.coalesce(
            *(getattr(table, col) for col in rotated_names)
        )
        coalesce_expressions.append(coalesce_expr)

    return coalesce_expressions
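
The rotation in the loop above can be hard to picture, so here is a minimal pure-Python sketch of the same idea for a single row. The helper names coalesce and coalesce_all_date_columns are illustrative only, and Python lists stand in for Ibis columns. Each date column gets its own coalesce that starts the search at that column, so every non-null date appears at least once in the result and no entry is null as long as the row has any date. That is what makes a later 'least' or 'greatest' safe on backends where those functions return null when any argument is null.

```python
from datetime import date
from typing import List, Optional


def coalesce(values: List[Optional[date]]) -> Optional[date]:
    """Return the first non-null value, like SQL COALESCE."""
    return next((v for v in values if v is not None), None)


def coalesce_all_date_columns(dates: List[Optional[date]]) -> List[Optional[date]]:
    """One coalesced value per column, each starting the search at that column."""
    return [coalesce(dates[i:] + dates[:i]) for i in range(len(dates))]


# One row with four child date columns, two of them null.
row = [None, date(2020, 1, 5), None, date(2019, 3, 1)]
coalesced = coalesce_all_date_columns(row)

# No nulls remain, so min and max behave like 'least' and 'greatest'.
first, last = min(coalesced), max(coalesced)
```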

_execute(tables)

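Executes the computation graph phenotype processing logic: joins the child phenotype tables, evaluates the expression, and sets the event date according to return_date.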

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tables | Dict[str, Table] | A dictionary where the keys are table names and the values are Table objects. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| PhenotypeTable | PhenotypeTable | The resulting phenotype table containing the required columns. |

Source code in phenex/phenotypes/computation_graph_phenotypes.py
def _execute(self, tables: Dict[str, Table]) -> PhenotypeTable:
    """
    Executes the computation graph phenotype processing logic.

    Args:
        tables (Dict[str, Table]): A dictionary where the keys are table names and the values are Table objects.

    Returns:
        PhenotypeTable: The resulting phenotype table containing the required columns.
    """
    joined_table = hstack(self.children, tables["PERSON"].select("PERSON_ID"))

    if self.populate == "value" and self.operate_on == "boolean":
        for child in self.children:
            column_name = f"{child.name}_BOOLEAN"
            joined_table = joined_table.mutate(
                **{column_name: joined_table[column_name].cast(float)}
            )

    if self.populate == "value":
        _expression = self.expression.get_value_expression(
            joined_table, operate_on=self.operate_on
        )
        joined_table = joined_table.mutate(VALUE=_expression)
        # Arithmetic operations imply a boolean 'and' of the children, i.e.
        # child1 + child2 implies child1 and child2. A null VALUE means one of
        # the children is null, so the implied boolean condition is not met
        # and the row is filtered out.
        joined_table = joined_table.filter(joined_table["VALUE"].notnull())

    elif self.populate == "boolean":
        _expression = self.expression.get_boolean_expression(
            joined_table, operate_on=self.operate_on
        )
        joined_table = joined_table.mutate(BOOLEAN=_expression)

    # Return the first or last event date
    date_columns = self._coalesce_all_date_columns(joined_table)
    if self.return_date == "first":
        joined_table = joined_table.mutate(EVENT_DATE=ibis.least(*date_columns))
    elif self.return_date == "last":
        joined_table = joined_table.mutate(EVENT_DATE=ibis.greatest(*date_columns))
    elif self.return_date == "all":
        joined_table = self._return_all_dates(joined_table, date_columns)
    elif isinstance(self.return_date, Phenotype):
        joined_table = joined_table.mutate(
            EVENT_DATE=getattr(joined_table, f"{self.return_date.name}_EVENT_DATE")
        )
    else:
        joined_table = joined_table.mutate(EVENT_DATE=ibis.null(date))

    # Reduce the table to only include rows where the boolean column is True
    if self.reduce:
        joined_table = joined_table.filter(joined_table.BOOLEAN == True)

    # Add a null value column if it doesn't exist, for example in the case of a LogicPhenotype
    schema = joined_table.schema()
    if "VALUE" not in schema.names:
        joined_table = joined_table.mutate(VALUE=ibis.null())

    return joined_table
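
The return_date branching in _execute can be sketched for a single row in plain Python. This is an illustration under simplifying assumptions, not the library's code: choose_event_date is a hypothetical helper, a dictionary of child name to date stands in for the {name}_EVENT_DATE columns, a child is selected by its name rather than by passing a Phenotype object, and the "all" case is left out because it produces several rows per patient.

```python
from datetime import date
from typing import Dict, Optional


def choose_event_date(dates: Dict[str, Optional[date]], return_date: str) -> Optional[date]:
    """Pick one event date per row from the children's event dates."""
    non_null = [d for d in dates.values() if d is not None]
    if return_date == "first":
        return min(non_null, default=None)  # earliest date, like 'least'
    if return_date == "last":
        return max(non_null, default=None)  # latest date, like 'greatest'
    # Otherwise treat return_date as a child name; unknown names give None,
    # which plays the role of the null EVENT_DATE fallback.
    return dates.get(return_date)


dates = {
    "diabetes": date(2019, 3, 1),
    "hypertension": None,
    "ckd": date(2020, 1, 5),
}
first = choose_event_date(dates, "first")
last = choose_event_date(dates, "last")
from_child = choose_event_date(dates, "ckd")
```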

_return_all_dates(table, date_columns)

If return_date is "all", we want to return all the dates on which phenotype criteria are fulfilled; this is a union of all the non-null dates in any leaf phenotype date columns.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| table |  | The Ibis table object (e.g., joined_table) that contains all leaf phenotypes stacked horizontally. | required |
| date_columns |  | List of base columns as Ibis objects. | required |

Returns:

| Type | Description |
| --- | --- |
|  | Ibis expression representing the UNION of all non-null dates. |

Source code in phenex/phenotypes/computation_graph_phenotypes.py
def _return_all_dates(self, table, date_columns):
    """
    If return date = all, we want to return all the dates on which phenotype criteria are fulfilled; this is a union of all the non-null dates in any leaf phenotype date columns.

    Args:
        table: The Ibis table object (e.g., joined_table) that contains all leaf phenotypes stacked horizontally
        date_columns: List of base columns as ibis objects

    Returns:
        Ibis expression representing the UNION of all non null dates.
    """
    # get all the non-null dates for each date column
    non_null_dates_by_date_col = []
    for date_col in date_columns:
        non_null_dates = table.filter(date_col.notnull()).mutate(
            EVENT_DATE=date_col
        )
        non_null_dates_by_date_col.append(non_null_dates)

    # do the union of all the non-null dates
    all_dates = non_null_dates_by_date_col[0]
    for non_null_dates in non_null_dates_by_date_col[1:]:
        all_dates = all_dates.union(non_null_dates)
    return all_dates
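
The filter-then-union pattern above can be sketched in plain Python. The names Row and return_all_dates below are illustrative, and tuples stand in for Ibis tables. The sketch walks the date columns one at a time, keeps the rows where that column is non-null, and concatenates the results. Duplicates are kept here; whether the real union removes them depends on the distinct setting of the Ibis union call, which this sketch does not model.

```python
from datetime import date
from typing import List, Optional, Tuple

Row = Tuple[int, List[Optional[date]]]  # (person_id, one value per date column)


def return_all_dates(rows: List[Row]) -> List[Tuple[int, date]]:
    """Emit one (person_id, event_date) pair per non-null date value."""
    n_columns = len(rows[0][1]) if rows else 0
    all_dates: List[Tuple[int, date]] = []
    # Column by column, like filtering the table once per date column
    # and then concatenating the filtered tables.
    for col in range(n_columns):
        all_dates.extend(
            (pid, dates[col]) for pid, dates in rows if dates[col] is not None
        )
    return all_dates


rows = [
    (1, [date(2019, 3, 1), date(2020, 1, 5)]),
    (2, [None, date(2021, 6, 1)]),
    (3, [None, None]),
]
result = return_all_dates(rows)
```

Patient 1 contributes two rows, patient 2 one row, and patient 3 none.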

execute(tables)

Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tables | Dict[str, PhenexTable] | A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables(). | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| table | PhenotypeTable | The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more. |

Source code in phenex/phenotypes/phenotype.py
def execute(self, tables: Dict[str, Table]) -> PhenotypeTable:
    """
    Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.

    Args:
        tables (Dict[str, PhenexTable]): A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables().

    Returns:
        table (PhenotypeTable): The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more.
    """
    logger.info(f"Phenotype '{self.name}': executing...")
    for child in self.children:
        if child.table is None:
            logger.debug(
                f"Phenotype {self.name}: executing child phenotype '{child.name}'..."
            )
            child.execute(tables)
        else:
            logger.debug(
                f"Phenotype {self.name}: skipping already computed child phenotype '{child.name}'."
            )

    table = self._execute(tables).mutate(BOOLEAN=True)

    if not set(PHENOTYPE_TABLE_COLUMNS) <= set(table.columns):
        raise ValueError(
            f"Phenotype {self.name} must return columns {PHENOTYPE_TABLE_COLUMNS}. Found {table.columns}."
        )

    self.table = table.select(PHENOTYPE_TABLE_COLUMNS)
    # for some reason, having NULL datatype screws up writing the table to disk; here we make explicit cast
    if type(self.table.schema()["VALUE"]) == ibis.expr.datatypes.core.Null:
        self.table = self.table.cast({"VALUE": "float64"})

    assert is_phenex_phenotype_table(self.table)
    logger.info(f"Phenotype '{self.name}': execution completed.")
    return self.table
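
The recursive, skip-if-already-computed behaviour of execute can be shown with a small stand-alone sketch. Node is a hypothetical stand-in for a phenotype, a string stands in for the phenotype table, and the runs counter exists only to make the caching visible. A child shared by two parents is computed once, because the second parent finds its table already set.

```python
from typing import List, Optional


class Node:
    """Hypothetical stand-in for a phenotype with children and a cached table."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None):
        self.name = name
        self.children = children or []
        self.table: Optional[str] = None  # None until execute is called
        self.runs = 0  # how many times this node was computed

    def execute(self) -> str:
        for child in self.children:
            if child.table is None:  # only compute children without a result
                child.execute()
        self.runs += 1
        self.table = f"table({self.name})"
        return self.table


shared = Node("diabetes")
left = Node("diabetes_and_htn", [shared, Node("hypertension")])
right = Node("diabetes_or_ckd", [shared, Node("ckd")])
root = Node("root", [left, right])
root.execute()
```

As in the listing above, the check applies only to children: calling execute again on root recomputes root itself but none of its already computed descendants.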