Affymetrix exon arrays were designed as a tool for monitoring the relative expression levels of hundreds of thousands of known and predicted exons with a view to detecting alternative splicing events. In the article listed below, we have characterized a systematic bias of the exon array platform that leads to an overestimation of alternative splicing events in genes that are differentially expressed.
COSIE is an R function (www.r-project.org) that, for a given set of exon arrays, corrects for the observed bias and improves the detection of alternative splicing. It adjusts the splicing indices of exons, especially those belonging to differentially expressed genes. For this adjustment, COSIE uses probeset-specific parameters (see download section below) that were trained on a large number of published exon arrays. The downside of this approach is that such parameters cannot be estimated for every probeset on the microarray: based on our training set, COSIE corrects 95.1% of the probesets. Separate parameter files are provided for the full and core sets, including all probesets that are linked to transcripts. We recommend the core set, which was also used in the study cited below; the full set is not as well characterized.
SI_*, as returned by COSIE, contains the net probeset expression values after factoring out gene expression and the exon array bias (pre-splicing indices). To obtain the differential probeset inclusion rates (final splicing indices), simply subtract one column of SI_* from another. tclevel_* contains the transcript levels used internally by COSIE; they are provided for reference but are not needed to detect alternatively included probesets.
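The column subtraction described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the matrix si is a hypothetical stand-in for the SI_* output (rows as probesets, columns as arrays), and the probeset names, column names and values are invented rather than taken from COSIE.

```r
# Hypothetical stand-in for the SI_* matrix returned by COSIE.
# Assumption: rows are probesets, columns are arrays; all values are made up.
si <- matrix(
  c( 0.10,  0.40,
    -0.20, -0.25,
     0.05,  1.05),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("ps1", "ps2", "ps3"), c("control", "treated"))
)

# Final splicing index: subtract one column from the other, per probeset.
delta_si <- si[, "treated"] - si[, "control"]

# Order probesets by the absolute size of the difference, largest first.
ranked <- delta_si[order(abs(delta_si), decreasing = TRUE)]
print(ranked)
```

With more than two arrays, the same subtraction would be applied to whichever pair of columns is being compared; how replicate arrays should be combined is not covered by this sketch.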
A typical exon array data analysis workflow may look as follows: