The INI has a new website!

This is a legacy webpage. Please visit the new site to ensure you are seeing up to date information.

Skip to content



Large-scale multiple testing: finding needles in a haystack

Cai, T (Pennsylvania )
Monday 23 June 2008, 15:30-16:30

Seminar Room 1, Newton Institute


Due to advances in technology, it has become increasingly common in scientific investigations to collect vast amount of data with complex structures. Examples include microarray studies, fMRI analysis, and astronomical surveys. The analysis of these data sets poses many statistical challenges not present in smaller scale studies. In these studies, it is often required to test thousands and even millions of hypotheses simultaneously. Conventional multiple testing procedures are based on thresholding the ordered p-values. In this talk, we consider large-scale multiple testing from a compound decision theoretical point of view by treating it as a constrained optimization problem. The solution to this optimization problem yields an oracle procedure. A data-driven procedure is then constructed to mimic the performance of the oracle and is shown to be asymptotically optimal. In particular, the results show that, although p-value is appropriate for testing a single hypothesis, it fails to serve as the fundamental building block in large-scale multiple testing. Time permitting, I will also discuss simultaneous testing of grouped hypotheses.

This is joint work with Wenguang Sun (University of Pennsylvania).

Related Links


[pdf ]




The video for this talk should appear here if JavaScript is enabled.
If it doesn't, something may have gone wrong with our embedded player.
We'll get it fixed as soon as possible.

Back to top ∧