Mappers

Mappers define how source database columns are mapped to PhenEx's internal representation. Each mapper class corresponds to a specific source table and declares which source columns map to PhenEx's standard fields (PERSON_ID, EVENT_DATE, CODE, VALUE, etc.).
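
As a rough illustration of what a column mapping does, the sketch below applies a mapping of standard field names to source column names onto a single row held as a plain dictionary. The `apply_mapping` helper and the sample row are hypothetical and not part of PhenEx; the mapping values mirror the condition occurrence mapper documented further down this page.

```python
# Standard PhenEx field -> source column (values taken from the
# condition occurrence mapper on this page).
DEFAULT_MAPPING = {
    "PERSON_ID": "PERSON_ID",
    "EVENT_DATE": "CONDITION_START_DATE",
    "CODE": "CONDITION_CONCEPT_ID",
}


def apply_mapping(row: dict, mapping: dict) -> dict:
    """Hypothetical helper: copy `row` and add each standard field from its source column."""
    mapped = dict(row)
    for standard_field, source_column in mapping.items():
        mapped[standard_field] = row[source_column]
    return mapped


# Illustrative source row; the values are made up.
source_row = {
    "PERSON_ID": 1,
    "CONDITION_START_DATE": "2020-01-15",
    "CONDITION_CONCEPT_ID": 201826,
}
mapped_row = apply_mapping(source_row, DEFAULT_MAPPING)
print(mapped_row["EVENT_DATE"], mapped_row["CODE"])
```

The original source columns are kept alongside the standard fields, which is similar in spirit to adding columns with a mutate call rather than renaming in place.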

PhenEx ships with a complete set of mappers for the OMOP CDM. These are bundled into the OMOPDomains dictionary, which is the main entry point for working with OMOP data.

Using mappers

In most cases you interact with mappers through a DomainsDictionary rather than instantiating them directly.

With a Snowflake database

from phenex.mappers import OMOPDomains
from phenex.ibis_connect import SnowflakeConnector

con = SnowflakeConnector()  # requires configuration
mapped_tables = OMOPDomains.get_mapped_tables(con)

With mock data for local testing

from phenex.mappers import OMOPDomains
from phenex.sim import DomainsMocker

mocker = DomainsMocker(domains_dict=OMOPDomains, n_patients=1000)
mapped_tables = mocker.get_mapped_tables()

Concept ID vs Source Value mappers

OMOP tables store codes in two ways:

  • Concept ID columns (e.g. CONDITION_CONCEPT_ID) contain OMOP standard concept IDs.
  • Source value columns (e.g. CONDITION_SOURCE_VALUE) contain the original vocabulary codes (ICD-10, CPT, NDC, etc.).

PhenEx provides a mapper for each. Use the concept ID mapper when your codelist contains OMOP concept IDs, and the source value mapper when your codelist contains native vocabulary codes.

| Domain | Concept ID mapper | Source value mapper |
| --- | --- | --- |
| Conditions | OMOPConditionOccurenceTable | OMOPConditionOccurrenceSourceTable |
| Procedures | OMOPProcedureOccurrenceTable | OMOPProcedureOccurrenceSourceTable |
| Drugs | OMOPDrugExposureTable | OMOPDrugExposureSourceTable |
| Person | OMOPPersonTable | OMOPPersonTableSource |
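
The choice between the two mapper families can be written down as a small lookup. This is only a sketch: `MAPPER_NAMES` and `mapper_name` are hypothetical helpers, and the class names are copied as strings from the table above rather than imported from PhenEx.

```python
# Domain -> (concept ID mapper, source value mapper), copied from the table above.
MAPPER_NAMES = {
    "Conditions": ("OMOPConditionOccurenceTable", "OMOPConditionOccurrenceSourceTable"),
    "Procedures": ("OMOPProcedureOccurrenceTable", "OMOPProcedureOccurrenceSourceTable"),
    "Drugs": ("OMOPDrugExposureTable", "OMOPDrugExposureSourceTable"),
    "Person": ("OMOPPersonTable", "OMOPPersonTableSource"),
}


def mapper_name(domain: str, uses_concept_ids: bool) -> str:
    """Hypothetical helper: pick the mapper class name that matches the codelist type."""
    concept_id_mapper, source_value_mapper = MAPPER_NAMES[domain]
    return concept_id_mapper if uses_concept_ids else source_value_mapper


# A codelist of ICD-10 codes is a source value codelist.
print(mapper_name("Conditions", uses_concept_ids=False))
# A codelist of OMOP standard concept IDs uses the concept ID mapper.
print(mapper_name("Drugs", uses_concept_ids=True))
```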

OMOP mapper classes

OMOPPersonTable

Bases: PhenexPersonTable

Source code in phenex/mappers.py
class OMOPPersonTable(PhenexPersonTable):
    NAME_TABLE = "PERSON"
    DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID", "DATE_OF_BIRTH": "BIRTH_DATETIME"}
    JOIN_KEYS = {
        "OMOPConditionOccurenceTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID"],
    }

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | dict | Serialized class configuration | required |

Returns:

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

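Join this table to another PhenexTable or ibis Table. If positional join arguments are given, the join is performed as specified. Otherwise PhenEx autojoins: it finds a path from this table to the target through the table types specified in PATHS and joins along it, looking up each intermediate table in domains, so domains must be supplied for an autojoin.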

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table
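
The join key formats handled in the loop above (symmetric strings, asymmetric pairs, or a mix) can be illustrated without ibis. In this sketch, `normalize_join_keys` and `rows_match` are hypothetical helpers that operate on plain dictionaries; they follow the same branching as the listing but are not part of PhenEx, and the `VISIT_ID` column is made up for the example.

```python
def normalize_join_keys(join_keys):
    """Turn each join key into an explicit (left_column, right_column) pair."""
    pairs = []
    for join_key in join_keys:
        if isinstance(join_key, str):
            # Symmetric: same column name on both sides
            pairs.append((join_key, join_key))
        elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
            # Asymmetric: (left_col, right_col)
            pairs.append((join_key[0], join_key[1]))
        else:
            raise ValueError(
                f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
            )
    return pairs


def rows_match(left_row: dict, right_row: dict, join_keys) -> bool:
    """Combine all key comparisons with AND, as the join predicate does."""
    return all(
        left_row[left_col] == right_row[right_col]
        for left_col, right_col in normalize_join_keys(join_keys)
    )


mixed_keys = ["PERSON_ID", ("VISIT_ID", "VISIT_OCCURRENCE_ID")]
left = {"PERSON_ID": 1, "VISIT_ID": 10}
right = {"PERSON_ID": 1, "VISIT_OCCURRENCE_ID": 10}
print(normalize_join_keys(mixed_keys))
print(rows_match(left, right, mixed_keys))
```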

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

| Name | Type | Description |
| --- | --- | --- |
| dict | dict | Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc. |

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }
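
Because `to_dict` returns only class-level configuration, the result can be passed to a serializer such as `json`. The sketch below uses a hypothetical `StandInTable` class in place of a real PhenexTable subclass; its `NAME_TABLE`, `JOIN_KEYS`, `KNOWN_FIELDS`, and `DEFAULT_MAPPING` values are copied from the death table mapper on this page, while the `PATHS` and `REQUIRED_FIELDS` values are assumptions made for the example.

```python
import json


class StandInTable:
    """Hypothetical stand-in that carries the same class attributes as a mapper."""

    NAME_TABLE = "DEATH"
    JOIN_KEYS = {"OMOPPersonTable": ["PERSON_ID"]}
    KNOWN_FIELDS = ["PERSON_ID", "DATE_OF_DEATH"]
    DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID", "DATE_OF_DEATH": "DEATH_DATE"}
    PATHS = {}  # assumption
    REQUIRED_FIELDS = ["PERSON_ID"]  # assumption

    @classmethod
    def to_dict(cls) -> dict:
        return {
            "__table_class__": cls.__name__,
            "__module__": cls.__module__,
            "NAME_TABLE": cls.NAME_TABLE,
            "JOIN_KEYS": cls.JOIN_KEYS,
            "KNOWN_FIELDS": cls.KNOWN_FIELDS,
            "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
            "PATHS": cls.PATHS,
            "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
        }


payload = json.dumps(StandInTable.to_dict())
restored = json.loads(payload)
print(restored["__table_class__"], restored["NAME_TABLE"])
```

One thing to keep in mind with JSON specifically: tuples become lists after a round trip, so an asymmetric join key written as `("LEFT_COL", "RIGHT_COL")` comes back as `["LEFT_COL", "RIGHT_COL"]`. The join logic shown above accepts both forms.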

OMOPConditionOccurenceTable

Bases: CodeTable

Source code in phenex/mappers.py
class OMOPConditionOccurenceTable(CodeTable):
    NAME_TABLE = "CONDITION_OCCURRENCE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": [
            "PERSON_ID",
            "VISIT_OCCURRENCE_ID",
        ],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "CONDITION_START_DATE",
        "CODE": "CONDITION_CONCEPT_ID",
    }
    PATHS = {
        "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
    }
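
JOIN_KEYS lists the tables this mapper can join to directly, while PATHS names the intermediate tables needed to reach a table it cannot join to directly. The sketch below shows one plausible way such a lookup could work. It is an assumption: `find_path` is a hypothetical stand-in, the real `_find_path` is not shown on this page, and its behavior may differ.

```python
# Copied from the class definition above.
JOIN_KEYS = {
    "OMOPPersonTable": ["PERSON_ID"],
    "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
}
PATHS = {
    "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
}


def find_path(target: str, join_keys: dict, paths: dict) -> list:
    """Hypothetical lookup: return the ordered table class names to join through."""
    if target in join_keys:
        # Directly joinable: the path is just the target itself
        return [target]
    if target in paths:
        # Indirect: pass through the listed intermediates, then join the target
        return list(paths[target]) + [target]
    raise ValueError(f"No known join path to {target}")


print(find_path("OMOPPersonTable", JOIN_KEYS, PATHS))
print(find_path("OMOPVisitDetailTable", JOIN_KEYS, PATHS))
```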

OMOPProcedureOccurrenceTable

Bases: CodeTable

Source code in phenex/mappers.py
class OMOPProcedureOccurrenceTable(CodeTable):
    NAME_TABLE = "PROCEDURE_OCCURRENCE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "PROCEDURE_DATE",
        "CODE": "PROCEDURE_CONCEPT_ID",
    }
    PATHS = {
        "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
    }

OMOPDrugExposureTable

Bases: CodeTable

Source code in phenex/mappers.py
class OMOPDrugExposureTable(CodeTable):
    NAME_TABLE = "DRUG_EXPOSURE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "DRUG_EXPOSURE_START_DATE",
        "CODE": "DRUG_CONCEPT_ID",
    }

    PATHS = {
        "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
    }

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiatiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict

Serialized class configuration

required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unqiue {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }
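
As a rough illustration of what this class-level serialization yields, the sketch below uses a stand-in class (`DemoTable`, hypothetical and not part of PhenEx) carrying the same attribute names, and round-trips the result through JSON. The attribute values are invented for the example.

```python
import json


class DemoTable:
    """Hypothetical stand-in mirroring the class attributes that to_dict reads."""

    NAME_TABLE = "DEMO"
    JOIN_KEYS = {"OtherTable": ["PERSON_ID", ("VISIT_ID", "VISIT_OCCURRENCE_ID")]}
    KNOWN_FIELDS = ["PERSON_ID", "EVENT_DATE"]
    DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID", "EVENT_DATE": "DEMO_DATE"}
    PATHS = {}
    REQUIRED_FIELDS = ["PERSON_ID"]

    @classmethod
    def to_dict(cls) -> dict:
        # Class configuration only; no table data is involved.
        return {
            "__table_class__": cls.__name__,
            "__module__": cls.__module__,
            "NAME_TABLE": cls.NAME_TABLE,
            "JOIN_KEYS": cls.JOIN_KEYS,
            "KNOWN_FIELDS": cls.KNOWN_FIELDS,
            "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
            "PATHS": cls.PATHS,
            "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
        }


config = DemoTable.to_dict()
# Round-trip through JSON, as one might when saving a configuration to disk.
restored = json.loads(json.dumps(config))
```

One detail the round trip makes visible: an asymmetric join key written as a tuple comes back from JSON as a list. The join code accepts both, since it checks `isinstance(join_key, (tuple, list))`.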

Bases: PhenexTable

Source code in phenex/mappers.py
class OMOPDeathTable(PhenexTable):
    NAME_TABLE = "DEATH"
    JOIN_KEYS = {"OMOPPersonTable": ["PERSON_ID"]}
    KNOWN_FIELDS = ["PERSON_ID", "DATE_OF_DEATH"]
    DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID", "DATE_OF_DEATH": "DEATH_DATE"}

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()
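
The constructor relies on `_get_column_mapping` and `_resolve_column_mapping`, which are not shown on this page. The sketch below is therefore only an approximation of the idea, assuming that user overrides are layered on top of `DEFAULT_MAPPING` and that each entry maps a PhenEx field to a source column. `apply_mapping` and the `REQUIRED_FIELDS` value are hypothetical.

```python
# Mapping copied from OMOPDeathTable; REQUIRED_FIELDS is assumed for illustration.
DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID", "DATE_OF_DEATH": "DEATH_DATE"}
REQUIRED_FIELDS = ["PERSON_ID"]


def apply_mapping(source_columns, overrides=None):
    """Return {phenex_field: source_column} after layering overrides on the defaults.

    Raises ValueError when a required field cannot be resolved, loosely
    mirroring the REQUIRED_FIELDS check in __init__.
    """
    mapping = {**DEFAULT_MAPPING, **(overrides or {})}
    resolved = {
        field: column for field, column in mapping.items() if column in source_columns
    }
    for field in REQUIRED_FIELDS:
        if field not in resolved:
            raise ValueError(f"Required field {field} not defined in COLUMN_MAPPING.")
    return resolved


# Standard OMOP DEATH layout: the defaults apply unchanged.
standard = apply_mapping(["PERSON_ID", "DEATH_DATE"])

# A source that names its columns differently supplies overrides.
custom = apply_mapping(
    ["PATIENT_KEY", "DOD"],
    overrides={"PERSON_ID": "PATIENT_KEY", "DATE_OF_DEATH": "DOD"},
)
```

The sketch takes `overrides=None` rather than a `{}` default. That is a common Python idiom for avoiding a shared mutable default and is a choice made here, not a description of the PhenEx signature.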

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )
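
Two properties of `filter` are easy to miss: the result is wrapped in the caller's own class via `type(self)`, and the original column set is re-selected after filtering. The sketch below imitates that pattern with a hypothetical in-memory class (`RowTable`, rows as plain dicts) so that it runs without ibis or PhenEx. It is not how PhenEx stores data.

```python
class RowTable:
    """Hypothetical in-memory stand-in for a PhenexTable (rows are dicts)."""

    def __init__(self, rows, name="ROWS"):
        self.rows = list(rows)
        self.name = name

    @property
    def columns(self):
        return list(self.rows[0].keys()) if self.rows else []

    def filter(self, predicate):
        input_columns = self.columns
        kept = [row for row in self.rows if predicate(row)]
        # Re-select the original columns, then rewrap in the caller's own class.
        trimmed = [{col: row[col] for col in input_columns} for row in kept]
        return type(self)(trimmed, name=self.name)


class DeathRows(RowTable):
    pass


deaths = DeathRows(
    [
        {"PERSON_ID": 1, "DATE_OF_DEATH": "2020-03-01"},
        {"PERSON_ID": 2, "DATE_OF_DEATH": "2021-07-15"},
    ],
    name="DEATH",
)
# ISO-formatted date strings compare correctly as plain strings.
recent = deaths.filter(lambda row: row["DATE_OF_DEATH"] >= "2021-01-01")
```

Because the subclass and name are carried through, a filtered table can be used wherever the unfiltered one was expected.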

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description
(unannotated) The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls
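
Since `from_dict` simply returns `cls`, the caller has to decide which class to call it on. One plausible way to do that, sketched below, is to look the class up from the `__module__` and `__table_class__` entries that `to_dict` writes. `resolve_table_class` is a hypothetical helper, not a PhenEx function, and the payload points at a standard-library class only so that the example runs without PhenEx installed.

```python
import importlib


def resolve_table_class(data: dict) -> type:
    """Hypothetical helper: look up the class named in a to_dict() payload."""
    module = importlib.import_module(data["__module__"])
    return getattr(module, data["__table_class__"])


# Stand-in payload; a real one would name a module such as phenex.mappers.
payload = {"__module__": "collections", "__table_class__": "OrderedDict"}
table_class = resolve_table_class(payload)
```

With a real payload, the resolved class could then be instantiated against a live ibis table, which is the step that needs a database connection.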

join(other, *args, domains=None, **kwargs)

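Join this table to another PhenexTable. If positional join arguments are supplied, they are passed straight through to the underlying ibis join. Otherwise PhenEx performs an autojoin: it follows the table types specified in PATHS, looks each one up in domains, and joins them in sequence.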

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table
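
The predicate-building loop is the part of `join` most likely to matter when writing custom `JOIN_KEYS`. The sketch below reproduces its logic on plain dict rows instead of ibis expressions: each key is normalized to a `(left, right)` pair and the equality tests are combined with AND. `normalize_join_keys` and `rows_match` are hypothetical names used only here.

```python
from functools import reduce
from operator import and_


def normalize_join_keys(join_keys):
    """Turn symmetric ("COL") and asymmetric (("L", "R")) keys into (left, right) pairs."""
    pairs = []
    for key in join_keys:
        if isinstance(key, str):
            # Symmetric: same column name on both sides.
            pairs.append((key, key))
        elif isinstance(key, (tuple, list)) and len(key) == 2:
            # Asymmetric: different column names on each side.
            pairs.append(tuple(key))
        else:
            raise ValueError(f"Invalid join key format: {key}")
    return pairs


def rows_match(left_row, right_row, pairs):
    # AND together one equality test per key pair, as the predicate loop does.
    return reduce(
        and_,
        (left_row[left_col] == right_row[right_col] for left_col, right_col in pairs),
    )


pairs = normalize_join_keys(["PERSON_ID", ("VISIT_ID", "VISIT_OCCURRENCE_ID")])

left = {"PERSON_ID": 1, "VISIT_ID": 10}
same_visit = {"PERSON_ID": 1, "VISIT_OCCURRENCE_ID": 10}
other_visit = {"PERSON_ID": 1, "VISIT_OCCURRENCE_ID": 11}
```

Note that the autojoin branch of `join` iterates over `domains.items()`, so callers who rely on autojoin presumably need to pass `domains` explicitly. As written, leaving it at its default of `None` would fail at that point.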

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }

Bases: MeasurementTable

Source code in phenex/mappers.py
class OMOPObservationTable(MeasurementTable):
    NAME_TABLE = "OBSERVATION"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": [
            "PERSON_ID",
            "VISIT_OCCURRENCE_ID",
        ],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "OBSERVATION_DATE",
        "CODE": "OBSERVATION_CONCEPT_ID",
        "VALUE": "VALUE_AS_NUMBER",
    }

Bases: MeasurementTable

Source code in phenex/mappers.py
class OMOPMeasurementTable(MeasurementTable):
    NAME_TABLE = "MEASUREMENT"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": [
            "PERSON_ID",
            "VISIT_OCCURRENCE_ID",
        ],
        "OMOPVisitDetailTable": [
            "PERSON_ID",
            "VISIT_DETAIL_ID",
        ],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "MEASUREMENT_DATE",
        "CODE": "MEASUREMENT_CONCEPT_ID",
        "VALUE": "VALUE_AS_NUMBER",
    }

Bases: PhenexObservationPeriodTable

Source code in phenex/mappers.py
class OMOPVisitOccurrenceTable(PhenexObservationPeriodTable):
    NAME_TABLE = "VISIT_OCCURRENCE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPConditionOccurenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
        "OMOPDrugExposureTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
        "OMOPVisitDetailTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "START_DATE": "VISIT_START_DATE",
        "END_DATE": "VISIT_END_DATE",
    }
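
Taken together, the `JOIN_KEYS` dictionaries describe a graph of tables that can be joined directly. PhenEx resolves multi-hop routes through `PATHS` and `_find_path`, neither of which is shown on this page. The sketch below is only a hypothetical breadth-first search over the `JOIN_KEYS` of `OMOPMeasurementTable` and `OMOPVisitOccurrenceTable`, meant to illustrate why a measurement can reach a drug exposure through a visit.

```python
from collections import deque

# Neighbours copied from the JOIN_KEYS of two mapper classes (key names only).
JOIN_GRAPH = {
    "OMOPMeasurementTable": [
        "OMOPPersonTable",
        "OMOPVisitOccurrenceTable",
        "OMOPVisitDetailTable",
    ],
    "OMOPVisitOccurrenceTable": [
        "OMOPPersonTable",
        "OMOPConditionOccurenceTable",
        "OMOPDrugExposureTable",
        "OMOPVisitDetailTable",
    ],
}


def find_join_path(start, target):
    """Hypothetical BFS: return the table names to join after `start`, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == target:
            return path
        for neighbour in JOIN_GRAPH.get(table, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [neighbour]))
    return None


path = find_join_path("OMOPMeasurementTable", "OMOPDrugExposureTable")
```

Each name in the returned path corresponds to one iteration of the autojoin loop, where the current left table's `JOIN_KEYS` entry for that name supplies the join columns.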

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiatiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict

Serialized class configuration

required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unqiue {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }
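
Because `to_dict()` returns only class-level configuration, the result can be written out with the standard `json` module. The sketch below uses `ExampleTable`, a hypothetical stand-in with the same attributes (it is not a PhenEx class), to show one detail worth knowing: a tuple join key comes back from JSON as a list, which the autojoin code still accepts because it checks for `(tuple, list)`.

```python
import json


class ExampleTable:
    """Hypothetical stand-in exposing the same class-level attributes."""

    NAME_TABLE = "EXAMPLE"
    JOIN_KEYS = {"OtherTable": ["PERSON_ID", ("LEFT_ID", "RIGHT_ID")]}
    KNOWN_FIELDS = ["PERSON_ID"]
    DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID"}
    PATHS = {}
    REQUIRED_FIELDS = ["PERSON_ID"]

    @classmethod
    def to_dict(cls) -> dict:
        # Same keys as the to_dict() listing above.
        return {
            "__table_class__": cls.__name__,
            "__module__": cls.__module__,
            "NAME_TABLE": cls.NAME_TABLE,
            "JOIN_KEYS": cls.JOIN_KEYS,
            "KNOWN_FIELDS": cls.KNOWN_FIELDS,
            "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
            "PATHS": cls.PATHS,
            "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
        }


# Round trip through JSON text; the tuple join key is restored as a list.
restored = json.loads(json.dumps(ExampleTable.to_dict()))
print(restored["JOIN_KEYS"])
```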

Bases: PhenexObservationPeriodTable

Source code in phenex/mappers.py
class OMOPVisitDetailTable(PhenexObservationPeriodTable):
    NAME_TABLE = "VISIT_DETAIL"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "START_DATE": "VISIT_DETAIL_START_DATE",
        "END_DATE": "VISIT_DETAIL_END_DATE",
    }

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()
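
The constructor combines a user-supplied `column_mapping` with the class defaults and then checks the required fields. `_get_column_mapping` is not shown on this page, so the sketch below rests on an assumption: overrides take precedence over `DEFAULT_MAPPING`. `resolve_mapping` and the `REQUIRED_FIELDS` value are hypothetical, and the check is simplified to a lookup in a list of source column names. The sketch uses `None` rather than `{}` as the default argument, the usual Python idiom for avoiding a mutable default shared between calls.

```python
DEFAULT_MAPPING = {
    "PERSON_ID": "PERSON_ID",
    "START_DATE": "VISIT_DETAIL_START_DATE",
    "END_DATE": "VISIT_DETAIL_END_DATE",
}
# Assumed for illustration; the real value is defined on the base class.
REQUIRED_FIELDS = ["PERSON_ID", "START_DATE", "END_DATE"]


def resolve_mapping(source_columns, column_mapping=None):
    """Merge overrides into the defaults and verify every required field resolves."""
    mapping = {**DEFAULT_MAPPING, **(column_mapping or {})}
    available = set(source_columns)
    for field in REQUIRED_FIELDS:
        if mapping.get(field) not in available:
            raise ValueError(f"Required field {field} not defined in COLUMN_MAPPING.")
    return mapping


# A source table whose start-date column has a non-standard name.
columns = ["PERSON_ID", "VISIT_START", "VISIT_DETAIL_END_DATE"]
mapping = resolve_mapping(columns, {"START_DATE": "VISIT_START"})
print(mapping)
```

Without the override, the same call raises `ValueError` because `VISIT_DETAIL_START_DATE` is not among the source columns.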

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )
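
`filter` therefore has two entry points, and in both cases the result is re-selected to the columns the table started with and wrapped in the same table class. The sketch below mimics that dispatch with plain Python objects. `MiniTable` and `MinValueFilter` are hypothetical stand-ins, rows are dictionaries, and a callable plays the role of an Ibis expression.

```python
class MiniTable:
    """Hypothetical in-memory stand-in for a PhenexTable (rows are dicts)."""

    def __init__(self, rows, columns):
        self.rows = rows
        self.columns = columns

    def filter(self, expr):
        input_columns = self.columns
        if callable(expr):
            # Stand-in for an Ibis expression: a row predicate.
            kept = [row for row in self.rows if expr(row)]
        else:
            # Stand-in for a PhenEx filter object: it filters the table itself.
            kept = expr.filter(self)
        # Keep only the columns the table started with.
        kept = [{c: row[c] for c in input_columns} for row in kept]
        return type(self)(kept, input_columns)


class MinValueFilter:
    """Hypothetical filter object; adds a helper key that must not leak out."""

    def __init__(self, minimum):
        self.minimum = minimum

    def filter(self, table):
        return [
            {**row, "_PASSED": True}
            for row in table.rows
            if row["VALUE"] >= self.minimum
        ]


table = MiniTable(
    [{"PERSON_ID": 1, "VALUE": 5}, {"PERSON_ID": 2, "VALUE": 12}],
    ["PERSON_ID", "VALUE"],
)
by_predicate = table.filter(lambda row: row["PERSON_ID"] == 1)
by_object = table.filter(MinValueFilter(10))
print(by_predicate.rows, by_object.rows)
```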

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls
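
Since `from_dict` returns `cls` unchanged, the caller has to invoke it on the right subclass. When only the serialized dictionary is available, the `__module__` and `__table_class__` entries written by `to_dict()` are enough to look the class up. The helper below is a hypothetical sketch, not a PhenEx function. It is demonstrated with a standard-library class so that it runs without PhenEx installed.

```python
import importlib


def resolve_table_class(data: dict):
    """Hypothetical helper: import the module and fetch the class named in `data`."""
    module = importlib.import_module(data["__module__"])
    return getattr(module, data["__table_class__"])


# Stand-in payload; a real one would name a module and class from PhenEx.
payload = {"__module__": "collections", "__table_class__": "OrderedDict"}
resolved = resolve_table_class(payload)
print(resolved.__name__)
```

Only use a lookup like this on trusted input, because importing a module executes its top-level code.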

join(other, *args, domains=None, **kwargs)

The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table
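
Two details of the autojoin branch are easy to miss. It calls `domains.items()`, so `domains` must be passed even though the parameter defaults to `None`. It also looks tables up by class name and requires exactly one match. The sketch below isolates that lookup. `find_unique_by_class_name`, the empty stand-in classes and the dictionary keys are all hypothetical.

```python
def find_unique_by_class_name(domains: dict, class_name: str):
    """Return the single mapped table whose class is named `class_name`."""
    matches = [v for v in domains.values() if v.__class__.__name__ == class_name]
    if len(matches) != 1:
        raise ValueError(
            f"Unable to find unique {class_name}; found {len(matches)} matches"
        )
    return matches[0]


class OMOPPersonTable:
    """Empty stand-in; only the class name matters for the lookup."""


class OMOPVisitOccurrenceTable:
    """Empty stand-in; only the class name matters for the lookup."""


domains = {
    "PERSON": OMOPPersonTable(),
    "VISIT_OCCURRENCE": OMOPVisitOccurrenceTable(),
}
person = find_unique_by_class_name(domains, "OMOPPersonTable")
print(type(person).__name__)
```

A dictionary holding two instances of the same mapper class fails this lookup, as does one that is missing an intermediate table on the join path.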

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }

Bases: PhenexObservationPeriodTable

Source code in phenex/mappers.py
class OMOPObservationPeriodTable(PhenexObservationPeriodTable):
    NAME_TABLE = "OBSERVATION_PERIOD"
    JOIN_KEYS = {"OMOPPersonTable": ["PERSON_ID"]}
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "START_DATE": "OBSERVATION_PERIOD_START_DATE",
        "END_DATE": "OBSERVATION_PERIOD_END_DATE",
    }
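
`DEFAULT_MAPPING` reads as standard field name to source column name. The constructor applies it with Ibis `mutate`, which adds columns, so the standard fields appear alongside the original source columns. The sketch below shows the same idea on a single row held in a dictionary. `add_standard_fields` is a hypothetical helper and the row values are invented for illustration.

```python
from datetime import date

DEFAULT_MAPPING = {
    "PERSON_ID": "PERSON_ID",
    "START_DATE": "OBSERVATION_PERIOD_START_DATE",
    "END_DATE": "OBSERVATION_PERIOD_END_DATE",
}


def add_standard_fields(row: dict, mapping: dict) -> dict:
    """Return a copy of `row` with each standard field added next to the source columns."""
    mapped = dict(row)
    for standard_name, source_name in mapping.items():
        mapped[standard_name] = row[source_name]
    return mapped


# Invented example row using OMOP column names.
source_row = {
    "PERSON_ID": 42,
    "OBSERVATION_PERIOD_START_DATE": date(2020, 1, 1),
    "OBSERVATION_PERIOD_END_DATE": date(2021, 6, 30),
}
mapped_row = add_standard_fields(source_row, DEFAULT_MAPPING)
print(sorted(mapped_row))
```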

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }

Bases: CodeTable

Source code in phenex/mappers.py
class OMOPConditionOccurrenceSourceTable(CodeTable):
    NAME_TABLE = "CONDITION_OCCURRENCE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "CONDITION_START_DATE",
        "CODE": "CONDITION_SOURCE_VALUE",
    }
    PATHS = {
        "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
    }
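
`JOIN_KEYS` lists the tables this mapper can join to directly, while `PATHS` names the intermediate tables needed to reach one it cannot. `_find_path` is not shown on this page, so the sketch below is an assumption about how such a lookup could work: a direct neighbor is a one-step path, otherwise the intermediates from `PATHS` are followed by the target. `find_path` and `hops` are hypothetical helpers. The left side of each hop follows the `current_left_table = right_table` update in `join`.

```python
def find_path(join_keys: dict, paths: dict, target: str) -> list:
    """Assumed behavior: direct neighbor first, otherwise intermediates plus target."""
    if target in join_keys:
        return [target]
    if target in paths:
        return list(paths[target]) + [target]
    raise ValueError(f"No join path to {target}")


def hops(start: str, path: list) -> list:
    """Pair each right table with the table that acts as its left side."""
    lefts = [start] + path[:-1]
    return list(zip(lefts, path))


JOIN_KEYS = {
    "OMOPPersonTable": ["PERSON_ID"],
    "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
}
PATHS = {"OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"]}

direct = find_path(JOIN_KEYS, PATHS, "OMOPPersonTable")
indirect = find_path(JOIN_KEYS, PATHS, "OMOPVisitDetailTable")
steps = hops("OMOPConditionOccurrenceSourceTable", indirect)
print(direct, indirect, steps)
```

Under this reading, the second hop takes its join keys from the visit occurrence mapper, so that class must define an entry for `OMOPVisitDetailTable` in its own `JOIN_KEYS`.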

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }

Bases: CodeTable

Source code in phenex/mappers.py
class OMOPProcedureOccurrenceSourceTable(CodeTable):
    NAME_TABLE = "PROCEDURE_OCCURRENCE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "PROCEDURE_DATE",
        "CODE": "PROCEDURE_SOURCE_VALUE",
    }
    PATHS = {
        "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
    }

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, ibis.expr.types.Expr) or isinstance(expr, list):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
    The join method performs a join of PhenexTables, using autojoin functionality if Phenex is able to find the table types specified in PATHS.
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        print(
            f"\tJoining : {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        if len(predicates) == 1:
            join_predicate = predicates[0]
        else:
            join_predicate = predicates[0]
            for pred in predicates[1:]:
                join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }
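
As a rough illustration of what to_dict() produces, the sketch below uses a stand-in class (DemoTable, which is not part of PhenEx) carrying the same class-level attributes; all attribute values are illustrative assumptions. Because the result contains only strings, lists and dicts, it can be passed to json.dumps.

```python
import json


class DemoTable:
    """Stand-in for a PhenexTable subclass (hypothetical; no ibis table involved)."""

    NAME_TABLE = "DRUG_EXPOSURE"
    JOIN_KEYS = {"DemoPersonTable": ["PERSON_ID"]}
    KNOWN_FIELDS = ["PERSON_ID", "EVENT_DATE", "CODE"]
    DEFAULT_MAPPING = {"PERSON_ID": "PERSON_ID", "CODE": "DRUG_SOURCE_VALUE"}
    PATHS = {}
    REQUIRED_FIELDS = ["PERSON_ID"]

    @classmethod
    def to_dict(cls) -> dict:
        # Same keys as the method shown above: class attributes only, no data.
        return {
            "__table_class__": cls.__name__,
            "__module__": cls.__module__,
            "NAME_TABLE": cls.NAME_TABLE,
            "JOIN_KEYS": cls.JOIN_KEYS,
            "KNOWN_FIELDS": cls.KNOWN_FIELDS,
            "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
            "PATHS": cls.PATHS,
            "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
        }


config = DemoTable.to_dict()
serialized = json.dumps(config, sort_keys=True)  # plain JSON text
restored = json.loads(serialized)                # back to a dict
```

Note that to_dict() is a classmethod, so it is called on the class and no table instance or database connection is needed.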

Bases: CodeTable

Source code in phenex/mappers.py
class OMOPDrugExposureSourceTable(CodeTable):
    NAME_TABLE = "DRUG_EXPOSURE"
    JOIN_KEYS = {
        "OMOPPersonTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "EVENT_DATE": "DRUG_EXPOSURE_START_DATE",
        "CODE": "DRUG_SOURCE_VALUE",
    }
    PATHS = {
        "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
    }
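
The JOIN_KEYS and PATHS attributes above can be read as a small routing table: JOIN_KEYS lists the tables this mapper joins to directly, and PATHS lists the intermediate tables needed to reach a table that is not directly joinable. The sketch below shows one plausible way such a lookup could work. The find_path function is hypothetical and written only for illustration; the real _find_path method is not shown on this page and may behave differently.

```python
JOIN_KEYS = {
    "OMOPPersonTable": ["PERSON_ID"],
    "OMOPVisitOccurrenceTable": ["PERSON_ID", "VISIT_OCCURRENCE_ID"],
}
PATHS = {
    "OMOPVisitDetailTable": ["OMOPVisitOccurrenceTable"],
}


def find_path(target: str, join_keys: dict, paths: dict) -> list:
    """Hypothetical path lookup; PhenEx's own _find_path may differ."""
    if target in join_keys:
        # Directly joinable: the path is just the target itself.
        return [target]
    if target in paths:
        # Indirect: pass through the listed intermediates, then the target.
        return [*paths[target], target]
    raise ValueError(f"No join path to {target}")


direct = find_path("OMOPPersonTable", JOIN_KEYS, PATHS)
indirect = find_path("OMOPVisitDetailTable", JOIN_KEYS, PATHS)
```

Under this reading, a join to OMOPVisitDetailTable would first join to OMOPVisitOccurrenceTable (on PERSON_ID and VISIT_OCCURRENCE_ID) and then continue from there.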

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be an ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()
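
To make the role of column_mapping more concrete, here is a minimal pure-Python sketch of the idea, using a single row represented as a dict instead of an ibis table. It assumes that user-supplied entries take precedence over DEFAULT_MAPPING and that each mapping entry adds a standard-named column alongside the source columns. The helpers _get_column_mapping and _resolve_column_mapping are not shown on this page, so merge_mapping and apply_mapping are hypothetical stand-ins, and the row values are made up.

```python
DEFAULT_MAPPING = {
    "PERSON_ID": "PERSON_ID",
    "EVENT_DATE": "DRUG_EXPOSURE_START_DATE",
    "CODE": "DRUG_SOURCE_VALUE",
}


def merge_mapping(defaults: dict, overrides: dict) -> dict:
    """Assumed behaviour: user-supplied entries take precedence over defaults."""
    return {**defaults, **overrides}


def apply_mapping(row: dict, mapping: dict) -> dict:
    """Add one standard-named field per mapping entry, keeping source columns."""
    out = dict(row)
    for standard_name, source_column in mapping.items():
        out[standard_name] = row[source_column]
    return out


# A made-up source row with OMOP-style column names.
source_row = {
    "PERSON_ID": 1,
    "DRUG_EXPOSURE_START_DATE": "2020-01-01",
    "DRUG_EXPOSURE_END_DATE": "2020-01-31",
    "DRUG_SOURCE_VALUE": "00002-3227-30",
}

# Override only EVENT_DATE; the other defaults are kept.
mapping = merge_mapping(DEFAULT_MAPPING, {"EVENT_DATE": "DRUG_EXPOSURE_END_DATE"})
mapped_row = apply_mapping(source_row, mapping)
```

In this sketch, mapped_row gains EVENT_DATE and CODE fields while the original source columns remain available.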

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, (ibis.expr.types.Expr, list)):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls
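
Because from_dict() simply returns the class it is called on, the caller must already know which class to use. One way to bridge that gap is to look the class up by the "__table_class__" entry written by to_dict(). The sketch below assumes a hand-maintained registry; TABLE_REGISTRY, resolve_table_class and DemoTable are hypothetical names for illustration, and PhenEx's own deserialization logic is not shown on this page.

```python
class DemoTable:
    """Stand-in for a PhenexTable subclass (hypothetical)."""

    NAME_TABLE = "PERSON"

    @classmethod
    def to_dict(cls) -> dict:
        return {"__table_class__": cls.__name__, "NAME_TABLE": cls.NAME_TABLE}

    @classmethod
    def from_dict(cls, data: dict):
        # Mirrors the method above: the class itself is returned, not an instance.
        return cls


# Hypothetical registry mapping serialized class names to classes.
TABLE_REGISTRY = {"DemoTable": DemoTable}


def resolve_table_class(data: dict):
    """Look up the class named in the serialized config and delegate to from_dict."""
    name = data["__table_class__"]
    try:
        table_cls = TABLE_REGISTRY[name]
    except KeyError:
        raise KeyError(f"Unknown table class: {name}") from None
    return table_cls.from_dict(data)


resolved = resolve_table_class(DemoTable.to_dict())
```

The resolved value is a class; an instance still has to be created from an ibis table obtained through a database connection.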

join(other, *args, domains=None, **kwargs)

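Join this table to another table. If other is a plain ibis Table, or positional join arguments are given, the join is delegated directly to ibis. Otherwise PhenEx performs an autojoin: it finds a path to other (using PATHS), looks up each intermediate table in domains, and joins them in sequence using the JOIN_KEYS of the left-hand table at each step. An autojoin therefore requires domains to be supplied.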

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
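    Join this table to another table.

    If `other` is a plain ibis Table, or positional join arguments are given,
    the join is delegated directly to ibis. Otherwise an autojoin is performed:
    PhenEx finds a path to `other` (using PATHS), looks up each intermediate
    table in `domains`, and joins them in sequence using the JOIN_KEYS of the
    left-hand table at each step. An autojoin requires `domains` to be supplied.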
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        logger.debug(
            f"Joining: {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        join_predicate = predicates[0]
        for pred in predicates[1:]:
            join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table
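
The join key formats accepted by the autojoin (symmetric, asymmetric and mixed) can be illustrated without ibis. The sketch below normalizes a mixed list into (left_col, right_col) pairs and checks two rows against all pairs, which corresponds to combining the predicates with AND. The helper names and the VISIT_ID column are hypothetical and used only for illustration.

```python
def normalize_join_keys(join_keys):
    """Turn a mixed list of join keys into (left_col, right_col) pairs."""
    pairs = []
    for key in join_keys:
        if isinstance(key, str):
            # Symmetric: same column name on both sides.
            pairs.append((key, key))
        elif isinstance(key, (tuple, list)) and len(key) == 2:
            # Asymmetric: (left_col, right_col).
            pairs.append((key[0], key[1]))
        else:
            raise ValueError(f"Invalid join key format: {key!r}")
    return pairs


def rows_match(left_row, right_row, pairs):
    """All pairs must be equal, i.e. the predicates are combined with AND."""
    return all(left_row[left] == right_row[right] for left, right in pairs)


pairs = normalize_join_keys(["PERSON_ID", ("VISIT_ID", "VISIT_OCCURRENCE_ID")])

left = {"PERSON_ID": 1, "VISIT_ID": 10}
right_same = {"PERSON_ID": 1, "VISIT_OCCURRENCE_ID": 10}
right_other = {"PERSON_ID": 1, "VISIT_OCCURRENCE_ID": 11}

matches_same = rows_match(left, right_same, pairs)
matches_other = rows_match(left, right_other, pairs)
```

Here matches_same is True because both key pairs agree, while matches_other is False because the visit identifiers differ even though PERSON_ID matches.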

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }

Bases: PhenexPersonTable

Source code in phenex/mappers.py
class OMOPPersonTableSource(PhenexPersonTable):
    NAME_TABLE = "PERSON"
    JOIN_KEYS = {
        "OMOPConditionOccurenceTable": ["PERSON_ID"],
        "OMOPVisitOccurrenceTable": ["PERSON_ID"],
    }
    DEFAULT_MAPPING = {
        "PERSON_ID": "PERSON_ID",
        "DATE_OF_BIRTH": "BIRTH_DATETIME",
        "YEAR_OF_BIRTH": "YEAR_OF_BIRTH",
        "SEX": "GENDER_SOURCE_VALUE",
        "ETHNICITY": "ETHNICITY_SOURCE_VALUE",
    }

__init__(table, name=None, column_mapping={})

Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.

Source code in phenex/tables.py
def __init__(self, table, name=None, column_mapping={}):
    """
    Instantiate a PhenexTable, possibly overriding NAME_TABLE and COLUMN_MAPPING.
    """

    if not isinstance(table, Table):
        raise TypeError(
            f"Cannot instantiate {self.__class__.__name__} from {type(table)}. Must be an ibis Table."
        )

    self.NAME_TABLE = name or self.NAME_TABLE

    self.column_mapping = self._get_column_mapping(column_mapping)
    self._table = table.mutate(
        **self._resolve_column_mapping(table, self.column_mapping)
    )

    for key in self.REQUIRED_FIELDS:
        try:
            getattr(self._table, key)
        except AttributeError:
            raise ValueError(f"Required field {key} not defined in COLUMN_MAPPING.")

    self._add_phenotype_table_relationship()

filter(expr)

Filter the table by an Ibis Expression or using a PhenExFilter.

Source code in phenex/tables.py
def filter(self, expr):
    """
    Filter the table by an Ibis Expression or using a PhenExFilter.
    """
    input_columns = self.columns
    if isinstance(expr, (ibis.expr.types.Expr, list)):
        filtered_table = self.table.filter(expr)
    else:
        filtered_table = expr.filter(self)

    return type(self)(
        filtered_table.select(input_columns),
        name=self.NAME_TABLE,
        column_mapping=self.column_mapping,
    )

from_dict(data) classmethod

Reconstruct a PhenexTable class reference from serialized data.

Note: This returns the class itself, not an instance, since we cannot reconstruct the actual table data without a database connection.

Parameters:

Name Type Description Default
data dict Serialized class configuration required

Returns:

Type Description

The PhenexTable subclass

Source code in phenex/tables.py
@classmethod
def from_dict(cls, data: dict):
    """
    Reconstruct a PhenexTable class reference from serialized data.

    Note: This returns the class itself, not an instance, since we cannot
    reconstruct the actual table data without a database connection.

    Args:
        data: Serialized class configuration

    Returns:
        The PhenexTable subclass
    """
    # The class should already exist in the module, just return it
    return cls

join(other, *args, domains=None, **kwargs)

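Join this table to another table. If other is a plain ibis Table, or positional join arguments are given, the join is delegated directly to ibis. Otherwise PhenEx performs an autojoin: it finds a path to other (using PATHS), looks up each intermediate table in domains, and joins them in sequence using the JOIN_KEYS of the left-hand table at each step. An autojoin therefore requires domains to be supplied.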

Source code in phenex/tables.py
def join(self, other: "PhenexTable", *args, domains=None, **kwargs):
    """
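    Join this table to another table.

    If `other` is a plain ibis Table, or positional join arguments are given,
    the join is delegated directly to ibis. Otherwise an autojoin is performed:
    PhenEx finds a path to `other` (using PATHS), looks up each intermediate
    table in `domains`, and joins them in sequence using the JOIN_KEYS of the
    left-hand table at each step. An autojoin requires `domains` to be supplied.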
    """
    if isinstance(other, Table):
        return type(self)(self.table.join(other, *args, **kwargs))

    if not isinstance(other, PhenexTable):
        raise TypeError(f"Expected a PhenexTable instance, got {type(other)}")
    if len(args):
        # if user specifies join keys and join type, simply perform join as specified
        return type(self)(self.table.join(other.table, *args, **kwargs))

    # Do an autojoin by finding a path from the left to the right table and sequentially joining as necessary
    # joined table is the sequentially joined table
    # current table is the table for the left join in the current iteration
    joined_table = current_left_table = self
    logger.debug(
        f"Starting autojoin from {self.__class__.__name__} to {other.__class__.__name__}"
    )

    for right_table_class_name in self._find_path(other):
        # get the next right table
        right_table_search_results = [
            v
            for k, v in domains.items()
            if v.__class__.__name__ == right_table_class_name
        ]
        logger.debug(
            f"Searching for {right_table_class_name} in domains: {list(domains.keys())}"
        )
        logger.debug(
            f"Found {len(right_table_search_results)} matches for {right_table_class_name}"
        )

        if len(right_table_search_results) != 1:
            raise ValueError(
                f"Unable to find unique {right_table_class_name} required to join {other.__class__.__name__}"
            )
        right_table = right_table_search_results[0]
        logger.debug(
            f"Joining: {current_left_table.__class__.__name__} to {right_table.__class__.__name__}"
        )

        # join keys are defined by the left table; in theory should enforce symmetry
        join_keys = current_left_table.JOIN_KEYS[right_table_class_name]

        # Build join predicate(s) - supports symmetric and asymmetric joins
        # Symmetric: ["COLUMN"] or ["COL1", "COL2"] - same column names in both tables
        # Asymmetric: [("LEFT_COL", "RIGHT_COL")] - different column names
        # Mixed: ["COL1", ("LEFT_COL", "RIGHT_COL")]
        predicates = []
        for join_key in join_keys:
            if isinstance(join_key, str):
                # Symmetric: column exists in both tables with same name
                predicates.append(joined_table[join_key] == right_table[join_key])
            elif isinstance(join_key, (tuple, list)) and len(join_key) == 2:
                # Asymmetric: (left_col, right_col) - different column names
                left_col, right_col = join_key
                predicates.append(joined_table[left_col] == right_table[right_col])
            else:
                raise ValueError(
                    f"Invalid join key format: {join_key}. Must be either a string or a 2-element tuple/list."
                )

        # Combine all predicates with AND
        join_predicate = predicates[0]
        for pred in predicates[1:]:
            join_predicate = join_predicate & pred

        columns = list(set(joined_table.columns + right_table.columns))
        # subset columns, making sure to set type of table to the very left table (self)
        joined_table = type(self)(
            joined_table.join(right_table, join_predicate, **kwargs).select(columns)
        )
        current_left_table = right_table
    return joined_table

to_dict() classmethod

Serialize the PhenexTable class configuration (not the data).

This serializes the class-level attributes that define the table mapping, but not the actual ibis table data which cannot be serialized.

Returns:

Name Type Description
dict dict Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.

Source code in phenex/tables.py
@classmethod
def to_dict(cls) -> dict:
    """
    Serialize the PhenexTable class configuration (not the data).

    This serializes the class-level attributes that define the table mapping,
    but not the actual ibis table data which cannot be serialized.

    Returns:
        dict: Class configuration including NAME_TABLE, JOIN_KEYS, DEFAULT_MAPPING, etc.
    """
    return {
        "__table_class__": cls.__name__,
        "__module__": cls.__module__,
        "NAME_TABLE": cls.NAME_TABLE,
        "JOIN_KEYS": cls.JOIN_KEYS,
        "KNOWN_FIELDS": cls.KNOWN_FIELDS,
        "DEFAULT_MAPPING": cls.DEFAULT_MAPPING,
        "PATHS": cls.PATHS,
        "REQUIRED_FIELDS": cls.REQUIRED_FIELDS,
    }