CategoricalPhenotype

`CategoricalPhenotype`

Bases: Phenotype

CategoricalPhenotype calculates a phenotype whose VALUE is discrete, such as sex, race, or ethnicity.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name of the phenotype.	`None`
`domain`	`str`	Domain of the phenotype.	`None`
`allowed_values`	`List`	List of allowed values for the categorical variable. If not passed, all values are returned.	`None`
`column_name`	`str`	Name of the column containing the required categorical variable.	`None`

Source code in phenex/phenotypes/categorical_phenotype.py

class CategoricalPhenotype(Phenotype):
    """
    CategoricalPhenotype calculates a phenotype whose VALUE is discrete, such as sex, race, or ethnicity.

    Parameters:
        name: Name of the phenotype.
        domain: Domain of the phenotype.
        allowed_values: List of allowed values for the categorical variable. If not passed, all values are returned.
        column_name: Name of the column containing the required categorical variable.
    """

    def __init__(
        self,
        name: str = None,
        domain: str = None,
        allowed_values: List = None,
        column_name: str = None,
        **kwargs,
    ):
        self.name = name
        self.categorical_filter = CategoricalFilter(
            allowed_values=allowed_values, domain=domain, column_name=column_name
        )
        super(CategoricalPhenotype, self).__init__(**kwargs)

    def _execute(self, tables: Dict[str, "PhenexTable"]) -> PhenotypeTable:
        table = tables[self.categorical_filter.domain]
        table = self.categorical_filter._filter(table)
        return table.mutate(
            VALUE=table[self.categorical_filter.column_name], EVENT_DATE=ibis.null(date)
        )
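
The filter-then-mutate behaviour of `_execute` can be illustrated without PhenEx or ibis. The following is a minimal sketch in plain Python, assuming rows are simple dictionaries; the function `categorical_phenotype_rows` and the sample `SEX` column are hypothetical stand-ins, not part of the library.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def categorical_phenotype_rows(
    rows: List[Row],
    column_name: str,
    allowed_values: Optional[List[Any]] = None,
) -> List[Row]:
    """Hypothetical plain-Python stand-in for the steps in _execute."""
    result = []
    for row in rows:
        # With no allowed_values, every row is kept ("all values are returned").
        if allowed_values is not None and row[column_name] not in allowed_values:
            continue
        # VALUE copies the categorical column; EVENT_DATE is always null.
        result.append({**row, "VALUE": row[column_name], "EVENT_DATE": None})
    return result


# Hypothetical sample data: one row per person.
person_rows = [
    {"PERSON_ID": 1, "SEX": "F"},
    {"PERSON_ID": 2, "SEX": "M"},
    {"PERSON_ID": 3, "SEX": "U"},
]
filtered = categorical_phenotype_rows(person_rows, "SEX", allowed_values=["F", "M"])
unfiltered = categorical_phenotype_rows(person_rows, "SEX")
```

Here `filtered` keeps only persons 1 and 2, while `unfiltered` keeps all three rows because no `allowed_values` were passed.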

`namespaced_table` `property`

A PhenotypeTable has generic column names 'person_id', 'boolean', 'event_date', and 'value'. The namespaced_table appends the phenotype name to all of these columns. This is useful when joining multiple phenotype tables together.

Returns:

Name	Type	Description
`table`	`Table`	The namespaced table for the current phenotype.

`execute(tables)`

Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.

Parameters:

Name	Type	Description	Default
`tables`	`Dict[str, PhenexTable]`	A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables().	required

Returns:

Name	Type	Description
`table`	`PhenotypeTable`	The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more.

Source code in phenex/phenotypes/phenotype.py

def execute(self, tables: Dict[str, Table]) -> PhenotypeTable:
    """
    Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.

    Args:
        tables (Dict[str, PhenexTable]): A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables().

    Returns:
        table (PhenotypeTable): The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more.
    """
    logger.info(f"Phenotype '{self.name}': executing...")
    for child in self.children:
        if child.table is None:
            logger.debug(
                f"Phenotype {self.name}: executing child phenotype '{child.name}'..."
            )
            child.execute(tables)
        else:
            logger.debug(
                f"Phenotype {self.name}: skipping already computed child phenotype '{child.name}'."
            )

    table = self._execute(tables).mutate(BOOLEAN=True)

    if not set(PHENOTYPE_TABLE_COLUMNS) <= set(table.columns):
        raise ValueError(
            f"Phenotype {self.name} must return columns {PHENOTYPE_TABLE_COLUMNS}. Found {table.columns}."
        )

    self.table = table.select(PHENOTYPE_TABLE_COLUMNS)
    # for some reason, having NULL datatype screws up writing the table to disk; here we make explicit cast
    if type(self.table.schema()["VALUE"]) == ibis.expr.datatypes.core.Null:
        self.table = self.table.cast({"VALUE": "float64"})

    assert is_phenex_phenotype_table(self.table)
    logger.info(f"Phenotype '{self.name}': execution completed.")
    return self.table
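
The control flow of `execute` (run children that have no table yet, add BOOLEAN, validate and select the required columns) can be sketched in plain Python. Everything below is a hypothetical stand-in: `SketchPhenotype` is not the library class, rows are dictionaries rather than ibis tables, and the contents of `REQUIRED_COLUMNS` are an assumption, since `PHENOTYPE_TABLE_COLUMNS` is not shown in this listing.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]

# Assumed stand-in for PHENOTYPE_TABLE_COLUMNS (its real contents are not shown above).
REQUIRED_COLUMNS = ["PERSON_ID", "BOOLEAN", "EVENT_DATE", "VALUE"]


class SketchPhenotype:
    """Hypothetical miniature of Phenotype; tables are lists of dict rows."""

    def __init__(self, name: str, children: Optional[List["SketchPhenotype"]] = None):
        self.name = name
        self.children = children or []
        self.table: Optional[List[Row]] = None
        self.run_count = 0  # counts how often _execute actually ran

    def _execute(self, tables: Dict[str, List[Row]]) -> List[Row]:
        # A real subclass would compute something; this one passes persons through.
        self.run_count += 1
        return [
            {"PERSON_ID": r["PERSON_ID"], "EVENT_DATE": None, "VALUE": None}
            for r in tables["PERSON"]
        ]

    def execute(self, tables: Dict[str, List[Row]]) -> List[Row]:
        # Children are executed only if they have not been computed yet.
        for child in self.children:
            if child.table is None:
                child.execute(tables)
        rows = [{**row, "BOOLEAN": True} for row in self._execute(tables)]
        for row in rows:
            if not set(REQUIRED_COLUMNS) <= set(row):
                raise ValueError(f"Phenotype {self.name} must return {REQUIRED_COLUMNS}.")
        # Keep only the required columns, in a fixed order.
        self.table = [{c: row[c] for c in REQUIRED_COLUMNS} for row in rows]
        return self.table


tables = {"PERSON": [{"PERSON_ID": 1}, {"PERSON_ID": 2}]}
shared = SketchPhenotype("shared")
parent_a = SketchPhenotype("a", children=[shared])
parent_b = SketchPhenotype("b", children=[shared])
parent_a.execute(tables)
parent_b.execute(tables)
```

Because `shared.table` is already populated after `parent_a.execute(tables)`, the second parent skips it, so a child shared by several phenotypes is computed only once.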

`HospitalizationPhenotype`

Bases: Phenotype

HospitalizationPhenotype filters an EncounterTable to identify inpatient events based on the encounter_type column. It uses a CategoricalFilter to filter for inpatient events and can apply additional date and time range filters.

Attributes:

Name	Type	Description
`name`		The name of the phenotype.
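`domain`		The domain of the phenotype, typically 'ENCOUNTER'. Required; the constructor declares no default.
`column_name`		The name of the column to filter on, typically 'ENCOUNTER_TYPE'. Required; the constructor declares no default.
`allowed_values`		List of allowed values for the encounter_type column, for example ['inpatient']. Required; the constructor declares no default.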
`date_range`		A date range filter to apply.
`relative_time_range`		A relative time range filter or a list of filters to apply.
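`return_date`		Specifies whether to return the 'first', 'last', 'nearest', or 'all' event dates. Default is 'first'. The constructor accepts 'nearest', but `_perform_date_selection` as listed below handles only 'first', 'last', and 'all' (or None) and raises a ValueError for 'nearest'.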
`table`		The resulting phenotype table after filtering.
`children`		List of child phenotypes.

Methods:

Name	Description
`_execute(tables: Dict[str, Table]) -> PhenotypeTable`	Executes the filtering process on the provided tables and returns the filtered phenotype table.

Source code in phenex/phenotypes/categorical_phenotype.py

class HospitalizationPhenotype(Phenotype):
    """
    HospitalizationPhenotype filters an EncounterTable to identify inpatient events based on the encounter_type column.
    It uses a CategoricalFilter to filter for inpatient events and can apply additional date and time range filters.

    Attributes:
        name: The name of the phenotype.
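        domain: The domain of the phenotype, typically 'ENCOUNTER'. Required; the constructor declares no default.
        column_name: The name of the column to filter on, typically 'ENCOUNTER_TYPE'. Required; the constructor declares no default.
        allowed_values: List of allowed values for the encounter_type column, for example ['inpatient']. Required; the constructor declares no default.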
        date_range: A date range filter to apply.
        relative_time_range: A relative time range filter or a list of filters to apply.
        return_date: Specifies whether to return the 'first', 'last', 'nearest', or 'all' event dates. Default is 'first'.
        table: The resulting phenotype table after filtering.
        children: List of child phenotypes.

    Methods:
        _execute(tables: Dict[str, Table]) -> PhenotypeTable:
            Executes the filtering process on the provided tables and returns the filtered phenotype table.
    """

    def __init__(
        self,
        domain,
        column_name: str,
        allowed_values: List[str],
        name=None,
        date_range: DateFilter = None,
        relative_time_range: Union[
            RelativeTimeRangeFilter, List[RelativeTimeRangeFilter]
        ] = None,
        return_date="first",
    ):
        super(HospitalizationPhenotype, self).__init__()

        self.categorical_filter = CategoricalFilter(
            column_name=column_name, allowed_values=allowed_values
        )
        self.name = name
        self.date_range = date_range
        self.return_date = return_date
        assert self.return_date in [
            "first",
            "last",
            "nearest",
            "all",
        ], f"Unknown return_date: {return_date}"
        self.table = None
        self.domain = domain
        if isinstance(relative_time_range, RelativeTimeRangeFilter):
            relative_time_range = [relative_time_range]

        self.relative_time_range = relative_time_range
        if self.relative_time_range is not None:
            for rtr in self.relative_time_range:
                if rtr.anchor_phenotype is not None:
                    self.children.append(rtr.anchor_phenotype)

    def _execute(self, tables) -> PhenotypeTable:
        code_table = tables[self.domain]
        code_table = self._perform_categorical_filtering(code_table)
        code_table = self._perform_time_filtering(code_table)
        code_table = self._perform_date_selection(code_table)
        return select_phenotype_columns(code_table)

    def _perform_categorical_filtering(self, code_table):
        assert is_phenex_code_table(code_table)
        code_table = self.categorical_filter.filter(code_table)
        return code_table

    def _perform_time_filtering(self, code_table):
        if self.date_range is not None:
            code_table = self.date_range.filter(code_table)
        if self.relative_time_range is not None:
            for rtr in self.relative_time_range:
                code_table = rtr.filter(code_table)
        return code_table

    def _perform_date_selection(self, code_table):
        if self.return_date is None or self.return_date == "all":
            return code_table
        if self.return_date == "first":
            aggregator = First()
        elif self.return_date == "last":
            aggregator = Last()
        else:
            raise ValueError(f"Unknown return_date: {self.return_date}")
        return aggregator.aggregate(code_table)
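
The three stages of `_execute` (categorical filtering, time filtering, date selection) can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `select_inpatient_events` is a hypothetical function, the date bounds are assumed to be inclusive, relative time ranges are omitted, and `First()` / `Last()` are assumed to keep each person's earliest or latest EVENT_DATE, since the aggregators' internals are not shown here.

```python
from datetime import date
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def select_inpatient_events(
    rows: List[Row],
    column_name: str,
    allowed_values: List[str],
    start: Optional[date] = None,
    end: Optional[date] = None,
    return_date: Optional[str] = "first",
) -> List[Row]:
    # Step 1: categorical filtering on the encounter-type column.
    kept = [r for r in rows if r[column_name] in allowed_values]
    # Step 2: absolute date filtering (inclusive bounds are an assumption).
    if start is not None:
        kept = [r for r in kept if r["EVENT_DATE"] >= start]
    if end is not None:
        kept = [r for r in kept if r["EVENT_DATE"] <= end]
    # Step 3: date selection, mirroring the branches of _perform_date_selection.
    if return_date is None or return_date == "all":
        return kept
    if return_date == "first":
        pick = min
    elif return_date == "last":
        pick = max
    else:
        raise ValueError(f"Unknown return_date: {return_date}")
    # Assumed aggregator behaviour: one row per person, earliest or latest date.
    by_person: Dict[Any, List[Row]] = {}
    for r in kept:
        by_person.setdefault(r["PERSON_ID"], []).append(r)
    return [pick(group, key=lambda r: r["EVENT_DATE"]) for group in by_person.values()]


# Hypothetical encounter rows.
encounters = [
    {"PERSON_ID": 1, "ENCOUNTER_TYPE": "inpatient", "EVENT_DATE": date(2020, 1, 5)},
    {"PERSON_ID": 1, "ENCOUNTER_TYPE": "outpatient", "EVENT_DATE": date(2019, 12, 1)},
    {"PERSON_ID": 1, "ENCOUNTER_TYPE": "inpatient", "EVENT_DATE": date(2021, 3, 9)},
    {"PERSON_ID": 2, "ENCOUNTER_TYPE": "inpatient", "EVENT_DATE": date(2020, 6, 1)},
]
first_events = select_inpatient_events(encounters, "ENCOUNTER_TYPE", ["inpatient"])
last_events = select_inpatient_events(
    encounters, "ENCOUNTER_TYPE", ["inpatient"], return_date="last"
)
all_events = select_inpatient_events(
    encounters, "ENCOUNTER_TYPE", ["inpatient"], return_date="all"
)
```

As in the listing above, any `return_date` other than 'first', 'last', 'all', or None reaches the final branch and raises a ValueError.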

`namespaced_table` `property`

A PhenotypeTable has generic column names 'person_id', 'boolean', 'event_date', and 'value'. The namespaced_table appends the phenotype name to all of these columns. This is useful when joining multiple phenotype tables together.

Returns:

Name	Type	Description
`table`	`Table`	The namespaced table for the current phenotype.

`execute(tables)`

Executes the phenotype computation for the current object and its children. This method recursively iterates over the children of the current object and calls their execute method if their table attribute is None.

Parameters:

Name	Type	Description	Default
`tables`	`Dict[str, PhenexTable]`	A dictionary mapping table names to PhenexTable objects. See phenex.mappers.DomainsDictionary.get_mapped_tables().	required

Returns:

Name	Type	Description
`table`	`PhenotypeTable`	The resulting phenotype table containing the required columns. The PhenotypeTable will contain the columns: PERSON_ID, EVENT_DATE, VALUE. EVENT_DATE is determined by the return_date parameter. VALUE is different for each phenotype. For example, AgePhenotype will return the age in the VALUE column. A MeasurementPhenotype will return the observed value for the measurement. See the specific phenotype of interest to understand more.

