CERNO test

Statistical test used in gene set analysis

From Wikipedia, the free encyclopedia

CERNO (Coincident Extreme Ranks in Numerical Observations) is a non-parametric, rank-based statistical test that evaluates how the ranks of a labeled subset of observations are distributed within the ranking of all observations. The method has been used in gene set and pathway analysis, where it assesses whether a predefined set of genes, proteins, or other features shows coincident enrichment for high or low ranks within a globally ranked list.

Publication of the Method

The CERNO statistic was published in a 2008 study on interferon-beta-regulated gene expression in relapsing–remitting multiple sclerosis.[1] It was subsequently used in transcriptomic and proteomic studies.[2] The test was further described in the supplementary materials of a 2013 pharmacogenomics study.[3]

An independent, comprehensive evaluation of the algorithm was published by Zyla et al. in 2019.[4]

Methodology

The CERNO test evaluates whether the ranks of a set of genes or features within a genome-wide ranking (ordered from most to least significant by any chosen metric) are collectively more extreme than would be expected by chance. This makes it sensitive to sets in which only a few members are strongly ranked; it does not require every gene in the set to be significant or to pass a threshold.
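As an illustration of the ranking step, the sketch below orders features from most to least significant and looks up the ranks of a set's members. It assumes the ranking metric is a p-value (smaller means more significant); the function name `ranks_in_ordered_list`, the gene identifiers, and the scores are hypothetical and not taken from any published implementation.

```python
def ranks_in_ordered_list(scores, gene_set):
    """Return the 1-based ranks of gene_set members after sorting all genes
    from most to least significant (smallest score first)."""
    # Sort gene names by ascending score; ties are broken by name so the
    # ordering is deterministic (an arbitrary choice made for this sketch).
    ordered = sorted(scores, key=lambda g: (scores[g], g))
    rank_of = {gene: i + 1 for i, gene in enumerate(ordered)}
    # Set members that are absent from the ranked list are skipped.
    return sorted(rank_of[g] for g in gene_set if g in rank_of)


# Hypothetical p-values for five genes (illustrative values only).
scores = {"G1": 0.001, "G2": 0.50, "G3": 0.02, "G4": 0.90, "G5": 0.20}
ranks = ranks_in_ordered_list(scores, {"G1", "G3"})
print(ranks, len(scores))
```

Any other metric could be substituted by changing the sort key, as long as the most significant feature ends up at rank 1.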

The test statistic for a gene set of size k in a ranked list of N genes is:[4]

$$S = -2 \sum_{i=1}^{k} \ln\left(\frac{r_i}{N}\right)$$

where r_i is the rank of the i-th gene in the set. Under the null hypothesis of random rank distribution, S follows a chi-square distribution with 2k degrees of freedom.
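A minimal sketch of the computation follows, assuming the statistic takes the Fisher-type form S = -2 Σ ln(r_i / N) and that its p-value is the upper tail of a chi-square distribution with 2k degrees of freedom. The function names `cerno` and `chi2_sf_even_df` are hypothetical. To stay dependency-free, the tail probability uses the closed form available for even degrees of freedom; this is not how any particular package computes it.

```python
import math


def chi2_sf_even_df(x, k):
    """Upper-tail probability of a chi-square variable with 2k degrees of
    freedom, using the closed form exp(-x/2) * sum_{j<k} (x/2)^j / j!."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


def cerno(ranks, n_total):
    """Return (S, p_value) for a set whose members have the given 1-based
    ranks in a list of n_total features."""
    k = len(ranks)
    if k == 0:
        raise ValueError("the set must contain at least one ranked feature")
    if any(r < 1 or r > n_total for r in ranks):
        raise ValueError("ranks must lie between 1 and n_total")
    s = -2.0 * sum(math.log(r / n_total) for r in ranks)
    return s, chi2_sf_even_df(s, k)


# Hypothetical example: three set members ranked 1, 2 and 3 out of 100.
s, p = cerno([1, 2, 3], 100)
print(f"S = {s:.3f}, p = {p:.2e}")
```

Because each member contributes -2 ln(r_i / N), a member near the top of the list adds a large term while a member near the bottom adds almost nothing, which is why a few strongly ranked members can dominate S. For very large sets or extreme values of S, a library routine for the chi-square tail would be numerically safer than this closed form.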

Comparison with Other Methods

Zyla et al. reported several advantages of CERNO: among the methods they investigated, it showed the highest reproducibility, along with good sensitivity, good prioritization, and low computational time. The study also notes that this non-parametric method is robust to the choice of ranking metric, to sample size, and to gene set size.[4]

The CERNO test is mathematically related to Fisher's method of combining p-values for independent statistical tests. Fisher's method is known for its favorable asymptotic properties, especially as measured by Bahadur efficiency,[5] which describes the rate at which the observed significance level of a test converges to zero as the sample size increases. A test with higher Bahadur efficiency converges faster.
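The connection to Fisher's method can be sketched as a short derivation. It assumes the statistic has the form S = -2 Σ ln(r_i / N), writes u_i = r_i / N, and treats the u_i as independent and approximately uniform on (0, 1) under the null hypothesis. Because the ranks of a set are drawn without replacement, independence holds only approximately, most closely when k is small relative to N.

```latex
% Assumption: under the null, u_i = r_i / N is approximately Uniform(0, 1)
% and the u_i are treated as independent.
\begin{align*}
\Pr\!\left(-2\ln u_i > x\right) &= \Pr\!\left(u_i < e^{-x/2}\right) = e^{-x/2}, \quad x \ge 0
  && \text{(survival function of } \chi^2_2\text{)} \\
S = -2\sum_{i=1}^{k} \ln u_i &= \sum_{i=1}^{k} \left(-2\ln u_i\right) \sim \chi^2_{2k}
  && \text{(sum of } k \text{ independent } \chi^2_2 \text{ variables)}
\end{align*}
```

This is the same form as Fisher's combined statistic, -2 Σ ln p_i, with the scaled rank u_i playing the role of the p-value p_i.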

Littell and Folks (1971) demonstrated the asymptotic optimality of Fisher's method of combining tests, showing that for independent tests, the negative logarithm of the significance level (−2log(significance)) diverges to infinity at the fastest possible rate among combination tests.[6]

In contrast, the Kolmogorov–Smirnov test, which is the basis for several gene set analysis methods, was shown by Hwang (1982) to have much lower Bahadur efficiency than the chi-squared test.[7] By this measure, the Kolmogorov–Smirnov test is "always well worse" than the chi-squared test. This is relevant because the CERNO statistic S follows a chi-square distribution with 2k degrees of freedom.

Because the Kolmogorov–Smirnov test is the basis of many commonly used gene set enrichment analysis methods, CERNO, which shares the properties of Fisher's combined test, may offer advantages in statistical power or efficiency in this context.[citation needed]

Software

The CERNO method is easily implemented due to its simple mathematical form. CERNO has been implemented in the tmod R package.[8][9][10]
