ArithmeticPhenotype Tutorial¶
The ArithmeticPhenotype allows us to perform simple mathematical operations (addition, subtraction, multiplication and division) on the output of other phenotypes.
There are two obvious use cases for this in real-world data (RWD):
- calculating medical scores, such as the CHADSVASC score or the Charlson Comorbidity Index
- calculating a derived measurement value, such as Body Mass Index, which is calculated using height and weight.
In this tutorial, we will see how to calculate the CHADSVASC score and how to calculate BMI.
Calculating scores¶
Like the LogicPhenotype, the ArithmeticPhenotype operates on other phenotypes; we refer to the phenotypes that an ArithmeticPhenotype operates on as its 'component phenotypes'.
In order to perform arithmetic, we need to associate a value to patients fulfilling criteria of a component phenotype. By default, this is done by associating the value of '1' with all patients that fulfill the criteria of a component phenotype. Patients that do not fulfill the component phenotype criteria are associated with a value of '0'.
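To make this default concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the patient IDs, the two sets and the `indicator` helper are all hypothetical and exist only to show how a 0/1 value per patient makes addition meaningful.

```python
# Hypothetical illustration only: plain Python, not the library's internals.
all_patients = ["p1", "p2", "p3"]

# Made-up sets of patients who fulfill each component phenotype
heart_failure = {"p1", "p3"}
hypertension = {"p3"}


def indicator(patient_id, fulfilling):
    """Return 1 if the patient fulfills the component criteria, else 0."""
    return 1 if patient_id in fulfilling else 0


# Adding two components sums their 0/1 indicators for each patient
totals = {
    pid: indicator(pid, heart_failure) + indicator(pid, hypertension)
    for pid in all_patients
}
print(totals)  # {'p1': 1, 'p2': 0, 'p3': 2}
```

A patient who fulfills neither component ends up with 0, so every patient receives a value rather than being dropped.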
Let's see how this works on a simple example of the CHADSVASC score. We will assume that Codelists already exist for each component phenotype.
Step 1: Create all component phenotypes¶
# Step 1: First create all component phenotypes
c = CodelistPhenotype(
    codelist=Codelist("heart_failure"),
    domain="condition_occurrence",
    relative_time_range=ONEYEAR_PREINDEX,
)
h = CodelistPhenotype(
    codelist=Codelist("hypertension"),
    domain="condition_occurrence",
    relative_time_range=ONEYEAR_PREINDEX,
)
a75 = AgePhenotype(
    min_age=GreaterThanOrEqualTo(75),
    relative_time_range=ONEYEAR_PREINDEX,
)
d = CodelistPhenotype(
    codelist=Codelist("diabetes_and_impaired_glucose_tolerance"),
    domain="condition_occurrence",
    relative_time_range=ONEYEAR_PREINDEX,
)
s = CodelistPhenotype(
    codelist=Codelist("stroke"),
    domain="condition_occurrence",
    relative_time_range=ONEYEAR_PREINDEX,
)
v = CodelistPhenotype(
    codelist=Codelist("peripheral_artery_disease"),
    domain="condition_occurrence",
    relative_time_range=ONEYEAR_PREINDEX,
)
a65to74 = AgePhenotype(
    min_age=GreaterThanOrEqualTo(65),
    max_age=LessThanOrEqualTo(74),
    relative_time_range=ONEYEAR_PREINDEX,
)
sc = SexPhenotype(allowed_values=[2])  # female is coded as 2 in our Optum database
Step 2: Create ArithmeticPhenotype¶
We can then create our arithmetic phenotype by combining our component phenotypes with mathematical operations. For CHADSVASC, we add up all the component phenotype values. Recall that the default value for a component phenotype is 1; if we want a different value associated with a component phenotype, we multiply it by that value. Here, age of 75 or older (a75) and stroke (s) are each multiplied by 2, so they contribute 2 points each.
chadsvasc = ScorePhenotype(
    name="chadsvasc",
    expression=c + h + a75 * 2 + d + s * 2 + v + a65to74 + sc,
)
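As a worked check of the arithmetic, the sketch below recomputes the score for one made-up patient in plain Python. It does not call the library; the `WEIGHTS` table and the `chadsvasc_score` helper are hypothetical and simply mirror the expression above, with unmultiplied components keeping the default weight of 1.

```python
# Hypothetical illustration only: plain Python, not the library's internals.
# Weights mirror the expression above; a component that is not
# multiplied by anything keeps the default weight of 1.
WEIGHTS = {
    "c": 1, "h": 1, "a75": 2, "d": 1,
    "s": 2, "v": 1, "a65to74": 1, "sc": 1,
}


def chadsvasc_score(fulfilled):
    """Sum the weights of the components a patient fulfills."""
    return sum(weight for name, weight in WEIGHTS.items() if name in fulfilled)


# Made-up patient: hypertension, prior stroke, aged 65 to 74, female
example = {"h", "s", "a65to74", "sc"}
print(chadsvasc_score(example))  # 5
```

Because the two age components are mutually exclusive, no patient can fulfill both `a75` and `a65to74`, so the highest reachable total with these weights is 9.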
Calculating derived measurement values¶
MeasurementPhenotypes are unique in that, if the return_value keyword argument is set, they are associated with a value. ArithmeticPhenotype will operate on the returned value of MeasurementPhenotypes, allowing us to calculate derived values from measurement values.
This is useful for calculating body mass index (BMI), which is defined as weight in kilograms divided by the square of height in meters.
As in the example above, the steps are to (1) define our component phenotypes and (2) create the arithmetic phenotype that combines them with mathematical operations.
Step 1: Create all component phenotypes¶
# define our component phenotypes
h = MeasurementPhenotype(
    name="height",
    codelist=Codelist("HEIGHT"),
    domain="measurement",
    relative_time_range=ONEYEAR_PREINDEX,
    value_aggregation="mean",
    return_value="all",
)
w = MeasurementPhenotype(
    name="weight",
    codelist=Codelist("WEIGHT"),
    domain="measurement",
    relative_time_range=ONEYEAR_PREINDEX,
    value_aggregation="mean",
    return_value="all",
)
Step 2: Create ArithmeticPhenotype¶
# calculate the bmi
bmi = ArithmeticPhenotype(
    name="bmi",
    expression=w / (h / 100) ** 2,
)
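The same formula can be checked by hand with ordinary numbers. The sketch below is plain Python with made-up values and a hypothetical helper, `bmi_from_cm`. The division by 100 in the expression suggests that height is recorded in centimeters and weight in kilograms; that is an assumption about the data, not something the library enforces.

```python
# Hypothetical illustration only: plain Python with made-up numbers.
def bmi_from_cm(weight_kg, height_cm):
    """BMI = weight (kg) / height (m) squared; height is passed in cm."""
    # ** binds tighter than /, so this is weight / ((height / 100) ** 2)
    return weight_kg / (height_cm / 100) ** 2


value = bmi_from_cm(weight_kg=81.0, height_cm=180.0)
print(round(value, 1))  # 25.0
```

If heights in a given data source are already in meters, the division by 100 should be dropped.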
Setting value filters¶
With ArithmeticPhenotype, as with MeasurementPhenotype, we can define a value_filter that keeps only the patients whose calculated value fulfills some filtering criteria.
For example, we may be interested only in patients with a BMI greater than or equal to 30.
# calculate the bmi and keep only patients with a BMI of 30 or more
bmi = ArithmeticPhenotype(
    name="bmi",
    expression=w / (h / 100) ** 2,
    value_filter=ValueFilter(">=", 30, "value"),
)
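Conceptually, the filter is applied to the calculated values, and patients below the threshold are excluded. The following plain-Python sketch illustrates that behavior with made-up BMI values; the `keep_at_least` helper is hypothetical and is not how the library implements filtering.

```python
# Hypothetical illustration only: plain Python, not the library's internals.
bmi_by_patient = {"p1": 24.9, "p2": 30.0, "p3": 35.2}  # made-up values


def keep_at_least(values, threshold):
    """Keep only entries whose value is >= threshold (inclusive)."""
    return {pid: v for pid, v in values.items() if v >= threshold}


obese = keep_at_least(bmi_by_patient, 30)
print(obese)  # {'p2': 30.0, 'p3': 35.2}
```

Note that the comparison is inclusive, so a patient with a BMI of exactly 30 is kept.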