Per-chip Normalizations

You will usually want to perform a per-chip normalization, which controls for chip-wide variations in intensity. This variation could be due to inconsistent washing, inconsistent sample preparation, or other microarray production or microfluidics imperfections. GeneSpring will not allow you to perform more than one per-chip normalization, as they all address the same issue.

If you have flags assigned to your data, select which data you would like used in your per-chip normalization from the Use genes marked pull-down menu.

Use Positive Control Genes

Some chips come with positive controls (mRNA from another genome, or housekeeping genes), which are used to control for differences in the amount of exposure between samples. The formula for this normalization is:

(signal strength of gene A in sample X) / (median signal of the positive controls in sample X)

To use Positive Control Genes

  1. Create a separate positive control file by listing the names of your positive controls in the first column of a spreadsheet and saving it in tab-delimited text format.
  2. Under Per chip normalizations, click Use positive control genes.
  3. Browse to find your positive control file.
  4. Enter a cutoff in the Use Values Over box. GeneSpring will not do the normalization if the median of your chip is below this cutoff.
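
The positive control formula and cutoff can be sketched in a few lines of Python. This is only an illustration, not GeneSpring's own code: the function name, the `use_values_over` parameter, and the choice to compare the cutoff against the median of the positive controls are all assumptions made for the example.

```python
from statistics import median


def normalize_to_positive_controls(signals, control_names, use_values_over):
    """Divide each gene's signal by the median positive control signal.

    signals: dict mapping gene name -> raw signal for one sample.
    control_names: gene names taken from the positive control file.
    use_values_over: hypothetical stand-in for the Use Values Over cutoff.
    Returns None when the normalization is skipped.
    """
    controls = [signals[name] for name in control_names if name in signals]
    if not controls:
        raise ValueError("no positive controls found in this sample")
    control_median = median(controls)
    # Skip when the median is below the cutoff. Non-positive medians are
    # also skipped to avoid dividing by zero (an assumption of this sketch).
    if control_median < use_values_over or control_median <= 0:
        return None
    return {gene: value / control_median for gene, value in signals.items()}


sample_x = {"geneA": 200.0, "ctrl1": 90.0, "ctrl2": 100.0, "ctrl3": 110.0}
normalized = normalize_to_positive_controls(
    sample_x, ["ctrl1", "ctrl2", "ctrl3"], use_values_over=10.0
)
```

Here the median of the three controls is 100, so gene A, with a raw signal of 200, is normalized to 2.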

One caveat regarding normalizing to positive controls

This normalization will not control for variations in the total harvest of mRNA across samples. If you are concerned about this variation, you may want to instead normalize to the distribution of all genes.

Normalizing to the Distribution of All Genes

The most common way to control for systematic variation is by normalizing to the distribution of all genes. The formula for this is:

(signal strength of gene A in sample X) / (specified percentile of all of the measurements taken in sample X)

To use Distribution of All Genes

  1. Under Per chip normalizations in the Experiment Normalizations window, click Use distribution of all genes.
  2. Choose a percentile. Typically you will use the default (50th).
  3. Enter a cutoff in the Use Values Over box. GeneSpring will not do the normalization if the median of your chip is below this cutoff.
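
As a rough illustration of the formula above, the following Python sketch divides every measurement in a sample by a chosen percentile of that sample. The `percentile` helper uses linear interpolation, which may differ from the method GeneSpring uses internally; the function names and the `use_values_over` parameter are hypothetical.

```python
def percentile(values, pct):
    """Linear-interpolation percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def normalize_to_distribution(signals, pct=50.0, use_values_over=0.0):
    """Divide each signal by the given percentile of all signals in the sample.

    signals: dict mapping gene name -> raw signal for one sample.
    Returns None when the chip median is below the cutoff, or when the
    chosen percentile is not positive (an assumption of this sketch).
    """
    values = list(signals.values())
    chip_median = percentile(values, 50.0)
    divisor = percentile(values, pct)
    if chip_median < use_values_over or divisor <= 0:
        return None
    return {gene: value / divisor for gene, value in signals.items()}


sample_x = {"g1": 50.0, "g2": 100.0, "g3": 200.0, "g4": 400.0, "g5": 800.0}
normalized = normalize_to_distribution(sample_x, pct=50.0, use_values_over=10.0)
```

With the default 50th percentile, the median gene (`g3`) ends up at 1 and every other gene is expressed relative to it.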

One caveat

This sort of normalization assumes that the median signal of the genes on the chip stays relatively constant throughout the experiment. If the total number of expressed genes in the experiment changes dramatically due to true biological activity (causing the median of one chip to be much higher than another), then you have masked your true expression values by normalizing to the median of each chip. For such an experiment, you may want to consider normalizing to something other than the median or you may want to instead normalize to positive controls.
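
A small Python example makes the masking effect concrete. It uses a deliberately extreme, made-up case in which every gene on the second chip is truly expressed at twice the level of the first chip; the numbers and the function name are hypothetical.

```python
from statistics import median


def normalize_to_median(signals):
    """Divide every signal on a chip by that chip's median."""
    chip_median = median(signals)
    return [value / chip_median for value in signals]


chip_before = [100.0, 200.0, 400.0]
# Hypothetical true biological change: every gene doubles.
chip_after = [value * 2 for value in chip_before]

normalized_before = normalize_to_median(chip_before)
normalized_after = normalize_to_median(chip_after)
```

Both normalized chips come out as `[0.5, 1.0, 2.0]`, so the real two-fold increase is no longer visible after normalizing each chip to its own median.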

Region Normalization

If you have more than one chip assigned to a sample, and you would like to normalize them separately, you can do a region normalization. You can also do a region normalization if you would like to normalize a region of a particular chip separately from the rest of the chip. To do this, you will need to load your data through the Experiment Wizard (see Region Normalization). If after loading your data you would like to change the way your regions are designated, you can do so in the Experiment Normalizations window under Region Designators.

The Affine Background Correction

If negative values form a large fraction of your data set, GeneSpring may automatically do what is known as the affine background correction. If a large percentage of your data is negative, normalization can be a problem; for instance, the median, which GeneSpring divides your data by in Use Distribution of All Genes, can be very small or even negative.

In such cases, GeneSpring will readjust the background level for your data by adding a constant to all raw control strengths such that the 10th percentile is set equal to 0. The affine background correction is applied only when the 10th percentile is more negative than the median of the data is positive. You will get a warning message when loading your data if the correction is applied. Also, in the Gene Inspector, control strengths adjusted by this correction are flagged with asterisks.
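
The rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not GeneSpring's implementation: the function names are hypothetical, the correction is applied to the raw values of a single chip, and the 10th percentile is computed with the standard library's inclusive (linear-interpolation) method.

```python
from statistics import median, quantiles


def tenth_percentile(values):
    """10th percentile using linear interpolation between data points."""
    return quantiles(values, n=10, method="inclusive")[0]


def needs_affine_correction(values):
    """True when the 10th percentile is more negative than the median is positive."""
    p10 = tenth_percentile(values)
    return p10 < 0 and -p10 > median(values)


def affine_background_correction(values):
    """Add a constant to every value so that the 10th percentile becomes 0."""
    p10 = tenth_percentile(values)
    return [value - p10 for value in values]


raw = [-60, -50, -40, -10, 0, 10, 20, 30, 40, 50, 60]
if needs_affine_correction(raw):
    corrected = affine_background_correction(raw)
else:
    corrected = list(raw)
```

In this made-up data set the 10th percentile is -50 and the median is 10, so the correction applies and every value is shifted up by 50.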

To tell GeneSpring if and when to apply the Affine Background Correction

Use the Options pull-down menu in the Experiment Normalizations window. It offers the following choices.

Use simple ratio

Tells GeneSpring never to use the affine background correction. If the control value is negative, GeneSpring will produce a warning message and will not do the normalization.

Use ratio with background correction

Tells GeneSpring to always use the affine correction. You will only want to select this option if no background subtraction has been performed on your data, as it forces the 10th percentile to be 0 (as if it were considering 10 percent of the data background). As nearly all image analysis software has already done background subtraction, this should be a rarely used option.

Use background correction if needed

Tells GeneSpring to use the affine correction as needed to compensate for negative values.
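
The effect of the three options can be compared with a short Python sketch. It assumes a median normalization of a single chip and uses hypothetical names throughout (`BackgroundOption`, `normalize_chip`); only the option labels come from the menu described above.

```python
import warnings
from enum import Enum
from statistics import median, quantiles


class BackgroundOption(Enum):
    SIMPLE_RATIO = "Use simple ratio"
    ALWAYS_CORRECT = "Use ratio with background correction"
    CORRECT_IF_NEEDED = "Use background correction if needed"


def normalize_chip(values, option):
    """Median-normalize one chip, shifting the background first if requested."""
    p10 = quantiles(values, n=10, method="inclusive")[0]
    if option is BackgroundOption.ALWAYS_CORRECT:
        shift = True
    elif option is BackgroundOption.CORRECT_IF_NEEDED:
        # Shift only when the 10th percentile is more negative
        # than the median is positive.
        shift = p10 < 0 and -p10 > median(values)
    else:
        shift = False
    if shift:
        values = [value - p10 for value in values]
    control = median(values)
    if control <= 0:
        # Zero is treated like a negative control value here (an assumption).
        warnings.warn("control value is not positive; normalization skipped")
        return None
    return [value / control for value in values]


mixed = [-60, -50, -40, -10, 0, 10, 20, 30, 40, 50, 60]
always = normalize_chip(mixed, BackgroundOption.ALWAYS_CORRECT)
if_needed = normalize_chip(mixed, BackgroundOption.CORRECT_IF_NEEDED)
```

For this made-up chip both options shift the data by 50 before dividing by the new median of 60. With Use simple ratio, a chip whose median is negative would be skipped with a warning instead.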

Use Constant Values

If you are using a technology that calculates its own number for normalization, you will want to use constant values. For instance, Affymetrix's Global Scaling™ centers your data around 2500; in this case you would normalize your data to 2500 to center it around 1. The formula for this is:

(signal strength of gene A in sample X) / (hard number in sample X)

To use Constant Values

  1. Under Per chip normalizations, click Use constant values.
  2. Specify the hard number for each of your samples.
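
A minimal Python sketch of the constant value formula follows. The function name and the data layout (one dictionary of signals per sample, one hard number per sample) are assumptions made for the example.

```python
def normalize_to_constants(samples, constants):
    """Divide every signal in each sample by that sample's hard number.

    samples: dict mapping sample name -> dict of gene name -> raw signal.
    constants: dict mapping sample name -> hard number for that sample.
    """
    normalized = {}
    for sample_name, signals in samples.items():
        constant = constants[sample_name]
        if constant <= 0:
            raise ValueError(f"hard number for {sample_name} must be positive")
        normalized[sample_name] = {
            gene: value / constant for gene, value in signals.items()
        }
    return normalized


samples = {"sample X": {"geneA": 2500.0, "geneB": 5000.0}}
constants = {"sample X": 2500.0}  # for example, data scaled to center on 2500
normalized = normalize_to_constants(samples, constants)
```

A gene measured at 2500 is normalized to 1, and a gene measured at 5000 is normalized to 2.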