Binomial Extra Variance and Trend Test • drcHelper

library(drcHelper)
library(tidyverse)
library(conflicted)
conflict_prefer("filter","dplyr")
conflict_prefer("group_by","dplyr")
conflict_prefer("select","dplyr")
source("../knitr-setup.R")

Background

One very standard procedure for dealing with binomial or quantal data in Ecotox area is to test first if extra-binomial variation exists when there are replications, if no, continue with Cochran-Armitage trend test to derive NOEC; if yes, then use the Cochran-Armitage test with Rao-Scott correction. The question is, are we going to use replicate data or are we going to use pooled data in the following Cochran-Armitage test. If the latter, testing within-group overdispersion is necessary for pooling the data together; if the former, the overdispersion should be designed to test for extra-binomial variation (overdispersion) in the context of dose-response data, specifically:

Assumes a trend exists (linear dose-response relationship)
Tests if the observed variation around this expected trend is greater than what binomial distribution would predict.

It is used to validate the Cochran-Armitage trend test assumptions. This actually comes together with the issue of scoring in the Cochran-Armitage test.

Scoring Method	When to Use	Advantages	Disadvantages
Actual doses	Known dose-response relationship	Biologically meaningful	Sensitive to dose spacing
Ranks (`seq_along`)	Unknown relationship	Robust to outlying doses	May miss non-linear patterns
Log-transformed doses	Exponential relationships	Handles wide dose ranges	Requires exponential relationships
Equally spaced	Exploratory analysis	Simple interpretation	May not reflect true spacing

Test Statistic Calculation

Going back to standard Cochran-Armitage test. The core calculation follows the standard Cochran-Armitage formula:

\[ Z = -\sqrt{\frac{N}{R(N-R)}} \cdot \frac{\sum_{i=1}^k (r_i - \frac{n_i R}{N}) d_i}{\sqrt{\sum_{i=1}^k \frac{n_i d_i^2}{N} - \left(\sum_{i=1}^k \frac{n_i d_i}{N}\right)^2}} \]

Where: - $ N = n_i $ (total sample size) - $ R = r_i $ (total cases) - $ n_i $ = totals in group i - $ r_i $ = cases in group i
- $ d_i $ = dose rating for group i

or considering a two-way contingency table with R rows and C columns, we can write the test statistics in a different way.

The null hypothesis is the binomial proportion in each group ($p_i=n_i/n_{total}$) is the same across all dose levels.

On a separate note, the default implementation of Cochran-Armitage test usually uses the normal distribution as the test statistic distribution. It is possible to calculate exact p-values by specifying “exact” options. However, it requires enumerating all possible configurations of cases across groups while maintaining the marginal totals, which becomes computationally prohibitive for large samples. Most practical implementations use the asymptotic approximation or Monte Carlo methods for the exact test.

The Ben O’Neill’s version of Tarone’s Test

The implementation of Ben O’Neill’s Tarone’s test takes .. 1. It pools all the data and calculates an overall proportion: estimate <- sum(M)/sum(N) 2. It calculates a Pearson chi-square-like statistic S based on deviations from the overall proportion: S <- ifelse(estimate == 1, sum(N), sum((M - N * estimate)^2/(estimate * (1 - estimate)))). Note that the denominator is the variance under binomial assumption 3. Then it calculates the statistic by standardization, converting it to a z-statistic : statistic <- (S - sum(N))/sqrt(2 * sum(N * (N - 1))) 4. Uses a normal approximation for p-value: p.value <- 2 * pnorm(-abs(statistic), 0, 1)

This tests: $H_0$: All groups have the same underlying proportion (pooled estimate) $H_a$: Extra-binomial variation exists (overdispersion relative to pooled proportion)

It doesn’t assume any particular dose-response relationship. It’s more like testing for heterogeneity between groups.

An example is given below:

 #Generate example data
N <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39)
M <- c( 9, 10, 22, 15,  8, 19, 16, 19, 15, 10)
Tarone.test(N, M)
#> 
#>  Tarone's Z test
#> 
#> data:  M successes from N trials
#> z = 2.5988, p-value = 0.009355
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>            0.4359756

Comparison with Trend-Based Tarone’s Test:

Trend-based Tarone’s test can be implemented by excluding a trend first followed by the z-test or chi-square test. Details please see the example below.

Which Should You Use?

The O’Neill’s version is simpler and tests for basic heterogeneity between groups. However, things become more complicated when there are replicates, how do we derive a $p_i$ for $i_{th}$ dose level? Are there within group overdispersion already? Do we use incidence occurred for each replicate, or use an average of incidence rates from each replicate, or do we pool all incidences from all replicates to calculate a pooled average?

If we have one “average” proportion for each dose group, then what is the total should be used? Because in analogous to ANOVA, in the case we used average of a replicate, then the sample size from that dose group is becoming the number of replicates instead of number of all individuals. When each replicate have the same amount of individuals, then there is no difference between using average of average or pooled incidence rates. but the sample size in each dose group is still a concern.

Trend-based version is more appropriate when you expect a dose-response relationship and want to test if there’s extra variation beyond the trend using the replicate data.

For NOEC determination with Cochran-Armitage, the trend-based version is more theoretically appropriate since CA test assumes a dose-response trend.

Example Data

We do a comparison of various approaches using an example dataset.

Note that the arcsine transformation is used to “stretch out” data points that range between 0 and 1. This transformation is typically used when dealing with proportions and percentages. This has been the lab routine procedure for analyzing Daphnia acute studies. However, in generic OECD statistical guidances, the recommended approach is the step down Cochran-Armitage test, in combine nation with Tarone test as a pre-test for overdispersion.

Ideally, inn the normal statistical procedure outside the regulatory statistics, we would conduct GLM binomial regression first to see if there is a trend by treating concentration levels as continuous predictor or to calculate an NOEC by treating concentration levels as categorical predictor. However, often in this type of study data, we have the quasi-complete separation where there are perfect zeros in the lower doses and some responses the few high dose levels, creating a “cliff” in the data. The GLM algorithm will struggle to estimate parameters and produce very large standard error estimations which will result insignificance even if there are high mortality or incidence rate in high dose groups. There are several approaches to deal with this separation issue. For example, Firth’s method adds a penalty term that stabilizes estimation, giving much more reasonable standard errors and interpretable results. We can also pool low-dose levels to do the regression. For small samples with separation, exact logistic regression (elrm) can be conducted but it is computationally intensive.

In this example data, we encounter this separation issue, which prevents us from using the GLM approach as a reliable anchor point for meaningful NOEC comparisons without conducting a more robust analysis. Instead, we directly compare NOEC calculation with the two approaches:

arcsine sqrt transformation and then test normality and variance homogeneity, monotonicity test, followed by either Williams’ or Dunnett’s or non-parametric alternatives.
Tarone’s test for overdispersion in binomial data, followed by step-down Cochran-Armitage test.

However, in option 2, in terms testing overdispersion, just as in ANOVA, normality should be tested on residuals, overdispersion should be tested on residuals removing the trend. In terms of the Cochran-Armitage test, the scores can be directly the dose, or using the rank, for example, default coding would be just seq_along(c(0.10,30,50,80)). This creates the question how to define the trend and how to remove the trend. To make it consistent, we make sure that

The overdispersion phi parameter should be calculated from the trend-adjusted residuals, not the raw residuals. $\phi = \frac{S_{adj}}{df_{adj}}$ where $S_{adj}$ is the sum of squared standardized residuals after trend removal.
The linear trend is properly estimated and removed before calculating overdispersion.
The linear trend is calculated using the same scoring as that will be used in the Cochran-Armitage test.

With different scoring approaches, we can obtain very different results. In this example data, the dramatic difference ($\phi$ = 1.97 for doses vs $\phi$ = 6.01 for ranks) suggests that the choice of scoring method is crucial, and the rank-based scoring is detecting much stronger overdispersion after trend removal.

##proj_dir <- here::here()
##quantal_dat_nested <- readRDS(paste0(proj_dir,"/data-raw/quantal_dat_nested.rds"))
data("quantal_dat_nested")
second_arcsin_data <- quantal_dat_nested$data[[2]] %>% mutate(arcsin_prop = asin(sqrt(immob)),Treatment = factor(Treatment, levels = c("Control", sort(as.numeric(unique(Treatment[Treatment != "Control"])))))
)

# Full comparison
comparison_results <- drcHelper:::compare_tarone_scoring(
    successes = second_arcsin_data$Immobile,
    totals = second_arcsin_data$Total,
    doses = second_arcsin_data$Dose
)
#> 
#> === Running analysis with doses scoring ===
#> 
#> === Running analysis with ranks scoring ===
#> 
#> === Running analysis with log_doses scoring ===
#> 
#> === Running analysis with equal_spaced scoring ===
#> 
#> ================================================================================
#> COMPARISON ACROSS SCORING METHODS
#> ================================================================================
#>  Scoring_Method Phi_Estimate Trend_Z_Stat Trend_P_Value Overdispersion
#>           doses        1.966        1.366        0.1719             No
#>           ranks        6.006        7.080        0.0000          Yes *
#>       log_doses        7.639        9.388        0.0000          Yes *
#>    equal_spaced        6.006        7.080        0.0000          Yes *
#>                      Recommendation
#>  Strong Rao-Scott correction needed
#>  Strong Rao-Scott correction needed
#>  Strong Rao-Scott correction needed
#>  Strong Rao-Scott correction needed
#> 
#> Groupwise Overdispersion Summary:
#>  Dose N_Replicates Proportion Tarone_P_Value Overdispersed
#>     0            6      0.000             NA   Cannot test
#>     5            6      0.000             NA   Cannot test
#>    10            6      0.000             NA   Cannot test
#>    20            6      0.000             NA   Cannot test
#>    40            6      0.067         0.5801            No
#>    80            6      0.633         0.9704            No

# Create publication-ready summary
summary_table <- drcHelper:::create_summary_table(comparison_results)
#> Summary Table for Publication:
#> ==============================
#>  Scoring Method Phi (Overdispersion) Trend Test Z P-value Significant?
#>           doses                1.966        1.366  0.1719           No
#>           ranks                6.006        7.080  0.0000        Yes *
#>       log_doses                7.639        9.388  0.0000        Yes *
#>    equal_spaced                6.006        7.080  0.0000        Yes *
#>             Analysis Recommendation
#>  Strong Rao-Scott correction needed
#>  Strong Rao-Scott correction needed
#>  Strong Rao-Scott correction needed
#>  Strong Rao-Scott correction needed
#> 
#> Interpretation Guide:
#> - φ > 1.2: Consider overdispersion correction
#> - φ > 1.5: Strong evidence of overdispersion
#> - * indicates statistical significance (p < 0.05)
#> - Groupwise tests examine within-dose variation
#> - Trend-adjusted tests examine overdispersion after removing dose-response trend

Perform Tarone’s test followed by stepdown CA test and compare the results

# Test the fixed function
phi_comparison_ranks <- drcHelper:::compare_phi_methods(
  successes = second_arcsin_data$Immobile,
  totals = second_arcsin_data$Total,
  doses = second_arcsin_data$Dose,
  scoring = "ranks"
)
#> Calculating phi estimates...
#> Simple method phi: 18.85714 
#> Trend-adjusted method phi: 5.992918 
#> 
#> Phi Estimation Method Comparison (ranks scoring):
#> ============================================================ 
#>                        Method Phi_Estimate
#>  Simple (no trend adjustment)       18.857
#>                Trend-adjusted        5.993
#>                                     Description
#>            Based on residuals from overall mean
#>  Based on residuals after removing linear trend
#>                               Use_When
#>  No clear dose-response trend expected
#>      Clear dose-response trend present
#> 
#> Recommendation:
#> Simple method shows higher overdispersion.
#> May indicate overdispersion not related to dose-response trend.

# Comprehensive comparison
comprehensive_results <- drcHelper:::comprehensive_phi_comparison(
  successes = second_arcsin_data$Immobile,
  totals = second_arcsin_data$Total,
  doses = second_arcsin_data$Dose
)
#> Comprehensive Phi Comparison Across Methods
#> ====================================================================== 
#>  Scoring_Method simple trend_adjusted Difference
#>           doses   18.9           1.95      -16.9
#>           ranks   18.9           5.99      -12.9
#>       log_doses   18.9           7.63      -11.2
#>    equal_spaced   18.9           5.99      -12.9
#>                     Interpretation
#>  Strong overdispersion (trend-adj)
#>  Strong overdispersion (trend-adj)
#>  Strong overdispersion (trend-adj)
#>  Strong overdispersion (trend-adj)
#> 
#> Key Findings:
#> - Simple method assumes no dose-response trend
#> - Trend-adjusted method removes linear dose-response before testing overdispersion
#> - Large differences suggest trend-related vs. random overdispersion

# Step-down test with simple phi method
stepdown_simple <- stepDownTrendTestBinom(
  successes = second_arcsin_data$Immobile,
  totals = second_arcsin_data$Total,
  doses = second_arcsin_data$Dose,
  scoring = "rank",
  alternative = "greater",
  rao_scott = TRUE,
  phi_method = "simple"
)
#> Estimated phi = 18.857 using simple method with ranks scoring

# Step-down test with trend-adjusted phi method
stepdown_trend <- stepDownTrendTestBinom(
  successes = second_arcsin_data$Immobile,
  totals = second_arcsin_data$Total,
  doses = second_arcsin_data$Dose,
  scoring = "rank",
  alternative = "greater",
  rao_scott = TRUE,
  phi_method = "trend_adjusted"
)
#> Estimated phi = 5.993 using trend_adjusted method with ranks scoring

# Test with ranks scoring and trend-adjusted phi
analysis_ranks_trend <- drcHelper:::complete_trend_analysis(
  successes = second_arcsin_data$Immobile,
  totals = second_arcsin_data$Total,
  doses = second_arcsin_data$Dose,
  scoring = "ranks",
  alternative = "greater",
  phi_method = "trend_adjusted"
)
#> Performing complete trend analysis
#> Scoring method: ranks 
#> Phi estimation method: trend_adjusted 
#> ============================================================ 
#> 
#> Step 1: Testing for overdispersion...
#> Tarone test phi estimate: 6.006 
#> Simple phi estimate: 18.857 
#> Trend-adjusted phi estimate: 5.993 
#> Using phi = 5.993 for Rao-Scott correction
#> 
#> Step 2: Step-down Cochran-Armitage trend test...
#> Overdispersion detected (phi = 5.993 ), applying Rao-Scott correction...

print(analysis_ranks_trend)
#> $scoring_method
#> [1] "ranks"
#> 
#> $phi_method
#> [1] "trend_adjusted"
#> 
#> $phi_estimates
#> $phi_estimates$simple
#> [1] 18.85714
#> 
#> $phi_estimates$trend_adjusted
#> [1] 5.992918
#> 
#> $phi_estimates$tarone
#> [1] 6.006433
#> 
#> $phi_estimates$used
#> [1] 5.992918
#> 
#> 
#> $tarone_analysis
#> $scoring_method
#> [1] "ranks"
#> 
#> $doses
#> [1]  0  5 10 20 40 80
#> 
#> $scores
#> [1] 1 2 3 4 5 6
#> 
#> $dose_summary
#> # A tibble: 6 × 4
#>   doses successes totals n_replicates
#>   <dbl>     <dbl>  <dbl>        <int>
#> 1     0         0     30            6
#> 2     5         0     30            6
#> 3    10         0     30            6
#> 4    20         0     30            6
#> 5    40         2     30            6
#> 6    80        19     30            6
#> 
#> $groupwise_tarone
#> $groupwise_tarone$summary
#>   dose n_replicates z_statistic   p_value significant note
#> 1    0            6         NaN       NaN          NA     
#> 2    5            6         NaN       NaN          NA     
#> 3   10            6         NaN       NaN          NA     
#> 4   20            6         NaN       NaN          NA     
#> 5   40            6 -0.55328334 0.5800694       FALSE     
#> 6   80            6 -0.03706204 0.9704355       FALSE     
#> 
#> $groupwise_tarone$detailed_results
#> $groupwise_tarone$detailed_results[[1]]
#> 
#>  Tarone's Z test
#> 
#> data:  successes successes from totals trials
#> z = NaN, p-value = NA
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>                    0 
#> 
#> 
#> $groupwise_tarone$detailed_results[[2]]
#> 
#>  Tarone's Z test
#> 
#> data:  successes successes from totals trials
#> z = NaN, p-value = NA
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>                    0 
#> 
#> 
#> $groupwise_tarone$detailed_results[[3]]
#> 
#>  Tarone's Z test
#> 
#> data:  successes successes from totals trials
#> z = NaN, p-value = NA
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>                    0 
#> 
#> 
#> $groupwise_tarone$detailed_results[[4]]
#> 
#>  Tarone's Z test
#> 
#> data:  successes successes from totals trials
#> z = NaN, p-value = NA
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>                    0 
#> 
#> 
#> $groupwise_tarone$detailed_results[[5]]
#> 
#>  Tarone's Z test
#> 
#> data:  successes successes from totals trials
#> z = -0.55328, p-value = 0.5801
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>           0.06666667 
#> 
#> 
#> $groupwise_tarone$detailed_results[[6]]
#> 
#>  Tarone's Z test
#> 
#> data:  successes successes from totals trials
#> z = -0.037062, p-value = 0.9704
#> alternative hypothesis: true dispersion parameter is greater than 0
#> sample estimates:
#> proportion parameter 
#>            0.6333333 
#> 
#> 
#> 
#> 
#> $trend_adjusted_tarone
#> 
#>  Tarone's test for overdispersion (trend-adjusted)
#> 
#> data:  successes from totals with trend removed
#> = 7.0802, p-value = 1.44e-12
#> alternative hypothesis: two.sided
#> 
#> 
#> $phi_estimate
#> [1] 6.006433
#> 
#> $trend_fit
#> $trend_fit$beta
#> [1] 0.09619048
#> 
#> $trend_fit$trend_proportions
#> [1] 0.00100000 0.00100000 0.06857143 0.16476190 0.26095238 0.35714286
#> 
#> $trend_fit$expected_under_trend
#> [1]  0.030000  0.030000  2.057143  4.942857  7.828571 10.714286
#> 
#> $trend_fit$residuals_after_trend
#> [1] -0.030000 -0.030000 -2.057143 -4.942857 -5.828571  8.285714
#> 
#> $trend_fit$variance_components
#> [1] 0.029970 0.029970 1.916082 4.128463 5.785687 6.887755
#> 
#> 
#> attr(,"class")
#> [1] "TaroneTrendTest"
#> 
#> $stepdown_standard
#> 
#> Step-down Cochran-Armitage test for trend (ranks scoring) (ranks scoring) 
#> 
#> data: successes out of totals at doses 0, 5, 10, 20, 40, 80 with ranks scoring 
#> 
#> Step-down results:
#>              Doses_Included Statistic      P_Value
#> Step 1 0, 5, 10, 20, 40, 80  6.865561 3.311526e-12
#> Step 2     0, 5, 10, 20, 40  2.013468 2.203270e-02
#> Step 3         0, 5, 10, 20  0.000000 5.000000e-01
#> Step 4             0, 5, 10  0.000000 5.000000e-01
#> Step 5                 0, 5  0.000000 5.000000e-01
#> 
#> Alternative hypothesis: greater 
#> NOEC: 20 
#> LOEC: 40 
#> 
#> $stepdown_corrected
#> 
#> Step-down Rao-Scott corrected Cochran-Armitage test for trend (ranks scoring) (ranks scoring) 
#> 
#> data: successes out of totals at doses 0, 5, 10, 20, 40, 80 with ranks scoring 
#> 
#> Step-down results:
#>              Doses_Included Statistic     P_Value
#> Step 1 0, 5, 10, 20, 40, 80 2.8045091 0.002519663
#> Step 2     0, 5, 10, 20, 40 0.8224805 0.205401748
#> Step 3         0, 5, 10, 20 0.0000000 0.500000000
#> Step 4             0, 5, 10 0.0000000 0.500000000
#> Step 5                 0, 5 0.0000000 0.500000000
#> 
#> Alternative hypothesis: greater 
#> NOEC: 40 
#> LOEC: 80 
#> 
#> attr(,"class")
#> [1] "completeTrendAnalysis"

The False Significance Problem

The traditional CA trend test may not be the best tool for some dose-response patterns. In particular, CA test ignores within-group dispersion - it’s purely about group means. Even if over-dispersion is corrected within each group, it is not adjusted by the corresponding trend that is assumed in the testing procedure. It could happen that there is no over-dispersion within each treatment group, but the residuals after the trend removal are actually overdispersed, which is exactly what happened in the specific data example. Secondly, Scoring systems can mask biological reality - equal weight to unequal biological changes. Thirdly, linear trend assumption fails with cliff data - the 40 mg a.i./L “significance” can be an artifact of the testing assumption and procedure.

The data pattern: 0%, 0%, 0%, 0%, 6.7%, 63.3%

What happens in step-down testing: Step 1: Test all groups (0,5,10,20,40,80) - SIGNIFICANT (driven by 80 mg a.i./L) Step 2: Test (0,5,10,20,40) - May be SIGNIFICANT (driven by 40 mg a.i./L vs others) Step 3: Test (0,5,10,20) - NOT significant

Result: NOEC = 20mg, LOEC = 40 mg a.i./L But biologically: 40mg shows minimal effect (6.7%) The real biological effect starts at 80 mg a.i./L (63.3%)

This happens because: 1. CA test uses LINEAR trend assumption 2. The huge 80 mg a.i./L effect ‘pulls’ the trend line 3. 40 mg a.i./La ppears significant relative to this trend 4. Actual biological threshold (cliff) is ignored

Given these limitations, there could be more suitable approaches.

Pairwise Comparisons with Control
Change-Point Detection
BMD
Other custom threshold tests

Pairwise Comparison with Control

second_arcsin_data <- second_arcsin_data %>% mutate(Active = Total-Immobile)
res1<- compare_to_control_fisher(second_arcsin_data,factor_col = "Treatment",success_col = "Immobile",failure_col = "Active",
                          p.adjust.method = "none")
res1
#>   Treatment p_value odds_ratio ci_lower ci_upper p_adjusted
#> 1         5  1.0000          0        0      Inf     1.0000
#> 2        10  1.0000          0        0      Inf     1.0000
#> 3        20  1.0000          0        0      Inf     1.0000
#> 4        40  0.4915          0        0   5.2956     0.4915
#> 5        80  0.0000          0        0   0.1005     0.0000
res2 <- compare_to_control_fisher(second_arcsin_data,factor_col = "Treatment",success_col = "Immobile",failure_col = "Active", p.adjust.method = "holm")
res2
#>   Treatment p_value odds_ratio ci_lower ci_upper p_adjusted
#> 1         5  1.0000          0        0      Inf          1
#> 2        10  1.0000          0        0      Inf          1
#> 3        20  1.0000          0        0      Inf          1
#> 4        40  0.4915          0        0   5.2956          1
#> 5        80  0.0000          0        0   0.1005          0

Exact Cochran-Armitage on replicate averages

Another important aspect is the difference between the exact test p-value and the asymptotic p-value, apart from the sample size being used.

Dose	test.statistic	exact.pvalue	asymptotic.pvalue	sampe.size
40	-1.798	0.2	0.03609	6
40	-2.517	0.03893	0.005921	30
80	-4.053	0.0002546	2.531e-05	6
80	-8.945	2.12e-17	1.862e-19	30

ca_dose_40 <- second_arcsin_data %>% dplyr::group_by(Dose) %>% summarise(nrep=n(),p=mean(immob),ntot=sum(Total),p2= sum(Immobile)/sum(Total)) %>% filter(Dose!= 80)
a1 <- CATTexact::catt_exact(ca_dose_40$Dose,ca_dose_40$nrep,ceiling(ca_dose_40$p*ca_dose_40$nrep))
a2 <- CATTexact::catt_exact(ca_dose_40$Dose,ca_dose_40$ntot,ceiling(ca_dose_40$p*ca_dose_40$ntot))


ca_dose_80 <- second_arcsin_data %>% dplyr::group_by(Dose) %>% summarise(nrep=n(),p=mean(immob),ntot=sum(Total),p2= sum(Immobile)/sum(Total)) 
b1 <- CATTexact::catt_exact(ca_dose_80$Dose,ca_dose_80$nrep,ceiling(ca_dose_80$p*ca_dose_80$nrep))
b2 <- CATTexact::catt_exact(ca_dose_80$Dose,ca_dose_80$ntot,ceiling(ca_dose_80$p*ca_dose_80$ntot))
rbind(data.frame(Dose=40,a1,sampe.size=6),
      data.frame(Dose=40,a2,sampe.size=30),
      data.frame(Dose=80,b1,sampe.size=6),
      data.frame(Dose=80,b2,sampe.size=30))

Change Point Analysis

#' Detect change points in dose-response
detect_change_point <- function(successes, totals, doses) {
  
  group_data <- data.frame(successes, totals, doses) %>%
    dplyr::group_by(doses) %>%
    dplyr::summarise(
      successes = sum(successes),
      totals = sum(totals),
      proportion = sum(successes) / sum(totals),
      .groups = 'drop'
    ) %>%
    dplyr::arrange(doses)
  
  cat("Change Point Analysis:\n")
  cat("=====================\n")
  
  # Look for largest jump in proportions
  prop_diffs <- diff(group_data$proportion)
  max_jump_idx <- which.max(prop_diffs)
  
  cat("Proportion differences between consecutive doses:\n")
  diff_df <- data.frame(
    From_Dose = group_data$doses[-nrow(group_data)],
    To_Dose = group_data$doses[-1],
    Proportion_Jump = round(prop_diffs, 3)
  )
  print(diff_df, row.names = FALSE)
  
  cat("\nLargest jump: from", group_data$doses[max_jump_idx], 
      "to", group_data$doses[max_jump_idx + 1], "\n")
  cat("Jump size:", round(prop_diffs[max_jump_idx], 3), "\n")
  
  cat("\nSuggested biological threshold:", group_data$doses[max_jump_idx + 1], "\n")
  
  return(list(
    change_point = group_data$doses[max_jump_idx + 1],
    jump_size = prop_diffs[max_jump_idx]
  ))
}

change_point <- detect_change_point(
  second_arcsin_data$Immobile,
  second_arcsin_data$Total,
  second_arcsin_data$Dose
)
#> Change Point Analysis:
#> =====================
#> Proportion differences between consecutive doses:
#>  From_Dose To_Dose Proportion_Jump
#>          0       5           0.000
#>          5      10           0.000
#>         10      20           0.000
#>         20      40           0.067
#>         40      80           0.567
#> 
#> Largest jump: from 40 to 80 
#> Jump size: 0.567 
#> 
#> Suggested biological threshold: 80