CERNO test

Field: Bioinformatics, statistics, data science

Publication of the Method

The CERNO statistic was published in a 2008 study on interferon-beta-regulated gene expression in relapsing–remitting multiple sclerosis.^[1] It was subsequently used in transcriptomic and proteomic studies.^[2] The test was further described in the supplementary materials of a 2013 pharmacogenomics study.^[3]

An independent, comprehensive evaluation of the algorithm was published by Zyla et al. in 2019.^[4]

Methodology

The CERNO test evaluates whether the ranks of a set of genes or features within a genome-wide ranking (from most to least significant by any metric) are collectively more extreme than would be expected by chance. This makes it sensitive to sets with even a few strongly ranked members, rather than requiring uniform or over-a-threshold significance of all genes in the set.
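As an illustration of the input the test works on, the following minimal sketch turns per-gene scores into the ranks of a gene set. The function name `ranks_of_set`, the choice of p-values as the ranking metric, and the tie handling (ties keep their sort order) are assumptions made for this example, not part of any published implementation.

```python
def ranks_of_set(scores, gene_set):
    """Return the sorted 1-based ranks of the gene set members.

    scores:   dict mapping gene name -> score, where a smaller score means
              more significant (e.g. a p-value). This is an assumption;
              any ranking metric could be used instead.
    gene_set: iterable of gene names; genes absent from `scores` are skipped.
    """
    # Order all genes from most to least significant.
    ordered = sorted(scores, key=scores.get)
    # Rank 1 is the most significant gene in the whole list.
    rank = {gene: i for i, gene in enumerate(ordered, start=1)}
    return sorted(rank[g] for g in gene_set if g in rank)
```

Only the ranks are passed on to the test, so the scale of the underlying metric does not matter.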

The test statistic for a gene set of size k in a ranked list of N genes is:^[4]

S = -2 \sum_{i=1}^{k} \ln\left(\frac{r_i}{N}\right)

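where r_i is the rank of the i-th gene of the set within the full ranked list, with rank 1 being the most significant. Under the null hypothesis that the ranks of the set are randomly distributed, S approximately follows a chi-square distribution with 2k degrees of freedom.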
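The statistic and its p-value can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, ranks are taken to be 1-based, and the p-value uses the closed-form upper tail of a chi-square distribution with an even number of degrees of freedom, P(X > s) = e^{-s/2} \sum_{j=0}^{k-1} (s/2)^j / j!.

```python
import math


def cerno_statistic(ranks, n_genes):
    """S = -2 * sum(ln(r_i / N)) over the 1-based ranks of the set members."""
    if not ranks:
        raise ValueError("gene set must contain at least one ranked gene")
    if any(r < 1 or r > n_genes for r in ranks):
        raise ValueError("ranks must lie in 1..N")
    return -2.0 * sum(math.log(r / n_genes) for r in ranks)


def chi2_sf_even_df(s, k):
    """Upper tail P(X > s) of a chi-square distribution with 2k degrees of freedom.

    Uses the closed form that is valid for even degrees of freedom.
    Not guarded against overflow or underflow for very large s.
    """
    half = s / 2.0
    term = 1.0   # (s/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


def cerno_test(ranks, n_genes):
    """Return (S, p-value) for a gene set given its ranks in a list of N genes."""
    s = cerno_statistic(ranks, n_genes)
    return s, chi2_sf_even_df(s, len(ranks))
```

For a set containing a single gene (k = 1), the p-value reduces to r/N, which gives a quick sanity check of the formula.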

Comparison with Other Methods

Zyla et al. reported that, among the methods they compared, CERNO showed the highest reproducibility, together with good sensitivity, good prioritization, and low computational time. They also noted that this non-parametric method is robust to the choice of ranking metric, to sample size, and to gene set size.^[4]

CERNO is Related to Fisher's Method of Combining Tests

The CERNO test is mathematically related to Fisher's method of combining p-values from independent statistical tests. Fisher's method is known for its favorable asymptotic properties, especially as measured by Bahadur efficiency,^[5] which describes the rate at which the observed significance level of a test statistic converges to zero as the sample size increases. Tests with higher Bahadur efficiency converge faster.
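The relation can be written out explicitly. The block below is a sketch that treats each scaled rank r_i/N as if it were a p-value; the symbols X^2, p_i, and U are introduced here for illustration only.

```latex
% Fisher's method for k independent p-values p_1, ..., p_k:
X^2 = -2 \sum_{i=1}^{k} \ln p_i \;\sim\; \chi^2_{2k}

% Substituting p_i = r_i / N gives the CERNO statistic:
S = -2 \sum_{i=1}^{k} \ln\left(\frac{r_i}{N}\right)

% Single-term check: if U is uniform on (0, 1), then for x >= 0
P(-2 \ln U > x) = P\left(U < e^{-x/2}\right) = e^{-x/2}
% which is the upper tail of a chi-square distribution with 2 degrees of freedom.
% A sum of k independent such terms therefore has 2k degrees of freedom.
```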

Littell and Folks (1971) demonstrated the asymptotic optimality of Fisher's method of combining tests, showing that for independent tests the quantity −2 log(significance level) diverges to infinity at the fastest possible rate among combination tests.^[6]

In contrast, Hwang (1982) showed that the Kolmogorov–Smirnov test, which is the basis for several gene set analysis methods, has much lower Bahadur efficiency than the chi-square test;^[7] by this measure it is "always well worse" than the chi-square test. This is relevant because the CERNO statistic S follows a chi-square distribution with 2k degrees of freedom.

Because many commonly used gene set enrichment analysis methods are based on the Kolmogorov–Smirnov test, CERNO, which inherits the properties of Fisher's combined test, may offer advantages in statistical power or efficiency in this context.^[citation needed]
