SJUT Cerebellum mRNA M430 (Mar05) PDNN
Accession number: GN55
Summary:
This March 2005 data freeze provides estimates of mRNA expression in adult cerebellum of 48 lines of mice including 45 BXD recombinant inbred strains, C57BL/6J, DBA/2J, and F1 hybrids. Data were generated by a consortium of investigators at St. Jude Children's Research Hospital (SJ) and the University of Tennessee Health Science Center (UT). Cerebellar samples were hybridized in small pools (n = 3) to Affymetrix M430A and B arrays. Data were processed using the Position-Dependent Nearest Neighbor (PDNN) method developed by Zhang and colleagues (2003). To simplify comparison between transforms, PDNN values of each array were adjusted to an average of 8 units and a standard deviation of 2 units.
About the cases used to generate this set of data:
We have exploited a set of BXD recombinant inbred strains. All BXD lines are derived from crosses between C57BL/6J (B6 or B) and DBA/2J (D2 or D). Both B and D parental strains have been almost fully sequenced (8x coverage for B6 by a public consortium and approximately 1.5x coverage for D by Celera Discovery Systems) and data for 1.75 million B vs D SNPs are incorporated into WebQTL's genetic maps for the BXDs. BXD2 through BXD32 were produced by Benjamin A. Taylor starting in the late 1970s. BXD33 through BXD42 were also produced by Taylor, but they were generated in the 1990s. These strains are all available from The Jackson Laboratory, Bar Harbor, Maine. BXD43 through BXD99 were produced by Lu Lu, Jeremy Peirce, Lee M. Silver, and Robert W. Williams in the late 1990s and early 2000s using advanced intercross progeny (Peirce et al. 2004).
Most BXD animals were generated in-house at the University of Tennessee Health Science Center by Lu Lu and Robert Williams using stock obtained from The Jackson Laboratory between 1999 and 2004. All BXD strains with numbers above 42 are new advanced intercross type BXDs (Peirce et al. 2004) that are currently available from UTHSC. Additional cases were provided by Glenn Rosen, John Mountz, and Hui-Chen Hsu. These cases were bred either at The Jackson Laboratory (GR) or at the University of Alabama (JM and HCH).
About the tissue used to generate this set of data:
The March 2005 data set consists of a total of 102 array pairs (Affymetrix 430A and 430B) from 49 different genotypes. Each sample consists of whole cerebellum taken from three adult animals of the same age and sex. Two sets of technical replicates (BXD14 n = 2; BXD29 n = 3) were combined before generating group means, giving a total of 99 biologically independent data sets. The two reciprocal F1s (D2B6F1 and B6D2F1) were combined to give a single F1 mean estimate of gene expression. The 430A and 430B arrays were processed in three large batches. The first batch (May03 data) consists of 17 samples from 17 strains balanced by sex (8M and 9F). The second batch consists of 29 samples, and includes biological replicates, 2 technical replicates, and data for 10 new strains. The third batch consists of 56 samples, and also includes biological replicates, 2 technical replicates, and data for 15 additional strains.
Replication and Sample Balance: Our goal is to obtain data for independent biological sample pools from both sexes for each strain. Eight of 48 genotypes are still represented by single samples: BXD5, BXD13, BXD20, BXD23, and BXD27 are female-only strains, whereas BXD25, BXD77, and BXD90 are male-only. Ten strains are represented by three independent samples with the following breakdown by sex: C57BL/6J (1F 2M), DBA/2J (2F 1M), B6D2F1 (1F 2M), BXD2 (2F 1M), BXD11 (2F 1M), BXD28 (2F 1M), BXD36 (1F 1M), BXD40 (2F 1M), BXD51 (1F 2M), BXD60 (1F 2M), BXD92 (2F 1M).
The age range of samples is relatively narrow. Only 18 samples were taken from animals older than 99 days and only two samples are older than 7 months of age. BXD11 includes an extra (third) 441-day-old female sample and BXD28 includes an extra 427-day-old sample.
RNA was extracted at UTHSC by Lu Lu, Zhiping Jia, and Hongtao Zhai.
All samples were subsequently processed at the Hartwell Center Affymetrix laboratory at SJCRH by Jay Morris.
The table below summarizes information on strain, sex, age (in days), sample name, and batch number.
Id  Strain  Sex  Age (days)  SampleName  BatchID  Source
1  C57BL/6J  F  116  R0773C  2  UAB 
2  C57BL/6J  M  109  R0054C  1  JAX 
3  C57BL/6J  M  71  R1450C  3  UTM DG 
4  DBA/2J  F  71  R0175C  1  UAB 
5  DBA/2J  F  91  R0782C  2  UAB 
6  DBA/2J  M  62  R1121C  3  UTM RW 
7  B6D2F1  F  60  R1115C  3  UTM RW 
8  B6D2F1  M  94  R0347C  1  JAX 
9  B6D2F1  M  127  R0766C  2  UTM JB 
10  D2B6F1  F  57  R1067C  3  UTM RW 
11  D2B6F1  M  60  R1387C  3  UTM RW 
12  BXD1  F  57  R0813C  2  UAB 
13  BXD1  M  181  R1151C  3  UTM JB 
14  BXD2  F  142  R0751C  1  UAB 
15  BXD2  F  78  R0774C  2  UAB 
16  BXD2  M  61  R1503C  3  HarvardU GR 
17  BXD5  F  56  R0802C  2  UMemphis 
18  BXD6  F  92  R0719C  1  UMemphis 
19  BXD6  M  92  R0720C  3  UMemphis 
20  BXD8  F  72  R0173C  1  UAB 
21  BXD8  M  59  R1484C  3  HarvardU GR 
22  BXD9  F  86  R0736C  3  UMemphis 
23  BXD9  M  86  R0737C  1  UMemphis 
24  BXD11  F  441  R0200C  1  UAB 
25  BXD11  F  97  R0791C  3  UAB 
26  BXD11  M  92  R0790C  2  UMemphis 
27  BXD12  F  130  R0776C  2  UAB 
28  BXD12  M  64  R0756C  2  UMemphis 
29  BXD13  F  86  R1144C  3  UMemphis 
30  BXD14  F  190  R0794C  2  UAB 
31  BXD14  F  190  R0794C  3  UAB 
32  BXD14  M  91  R0758C  2  UMemphis 
33  BXD14  M  65  R1130C  3  UTM RW 
34  BXD15  F  60  R1491C  3  HarvardU GR 
35  BXD15  M  61  R1499C  3  HarvardU GR 
36  BXD16  F  163  R0750C  1  UAB 
37  BXD16  M  61  R1572C  3  HarvardU GR 
38  BXD19  F  61  R0772C  2  UAB 
39  BXD19  M  157  R1230C  3  UTM JB 
40  BXD20  F  59  R1488C  3  HarvardU GR 
41  BXD21  F  116  R0711C  1  UAB 
42  BXD21  M  64  R0803C  2  UMemphis 
43  BXD22  F  65  R0174C  1  UAB 
44  BXD22  M  59  R1489C  3  HarvardU GR 
45  BXD23  F  88  R0814C  2  UAB 
46  BXD24  F  71  R0805C  2  UMemphis 
47  BXD24  M  71  R0759C  2  UMemphis 
48  BXD25  M  90  R0429C  1  UTM RW 
49  BXD27  F  60  R1496C  3  HarvardU GR 
50  BXD28  F  113  R0785C  2  UTM RW 
51  BXD28  M  79  R0739C  3  UMemphis 
52  BXD29  F  82  R0777C  2  UAB 
53  BXD29  M  76  R0714C  1  UMemphis 
54  BXD29  M  76  R0714C  2  UMemphis 
55  BXD29  M  76  R0714C  3  UMemphis 
56  BXD31  F  142  R0816C  2  UAB 
57  BXD31  M  61  R1142C  3  UTM RW 
58  BXD32  F  62  R0778C  2  UAB 
59  BXD32  M  218  R0786C  2  UAB 
60  BXD33  F  184  R0793C  2  UAB 
61  BXD33  M  124  R0715C  1  UAB 
62  BXD34  F  56  R0725C  1  UMemphis 
63  BXD34  M  91  R0789C  2  UMemphis 
64  BXD36  F  64  R1667C  3  UTM RW 
65  BXD36  M  61  R1212C  3  UMemphis 
66  BXD38  F  55  R0781C  2  UAB 
67  BXD38  M  65  R0761C  2  UMemphis 
68  BXD39  F  59  R1490C  3  HarvardU GR 
69  BXD39  M  165  R0723C  1  UAB 
70  BXD40  F  56  R0718C  2  UMemphis 
71  BXD40  M  73  R0812C  2  UMemphis 
72  BXD42  F  100  R0799C  2  UAB 
73  BXD42  M  97  R0709C  1  UMemphis 
74  BXD43  F  61  R1200C  3  UTM RW 
75  BXD43  M  63  R1182C  3  UTM RW 
76  BXD44  F  61  R1188C  3  UTM RW 
77  BXD44  M  58  R1073C  3  UTM RW 
78  BXD45  F  63  R1404C  3  UTM RW 
79  BXD45  M  93  R1506C  3  UTM RW 
80  BXD48  F  64  R1158C  3  UTM RW 
81  BXD48  M  65  R1165C  3  UTM RW 
82  BXD51  F  66  R1666C  3  UTM RW 
83  BXD51  M  62  R1180C  3  UTM RW 
84  BXD51  M  79  R1671C  3  UTM RW 
85  BXD60  F  64  R1160C  3  UTM RW 
86  BXD60  M  61  R1103C  3  UTM RW 
87  BXD60  M  99  R1669C  3  UTM RW 
88  BXD62  M  61  R1149C  3  UTM RW 
89  BXD62  M  60  R1668C  3  UTM RW 
90  BXD69  F  60  R1440C  3  UTM RW 
91  BXD69  M  64  R1197C  3  UTM RW 
92  BXD73  F  60  R1276C  3  UTM RW 
93  BXD73  M  77  R1665C  3  UTM RW 
94  BXD77  M  62  R1424C  3  UTM RW 
95  BXD85  F  79  R1486C  3  UTM RW 
96  BXD85  M  79  R1487C  3  UTM RW 
97  BXD86  F  58  R1408C  3  UTM RW 
98  BXD86  M  58  R1412C  3  UTM RW 
99  BXD90  M  74  R1664C  3  UTM RW 
100  BXD92  F  62  R1391C  3  UTM RW 
101  BXD92  F  63  R1670C  3  UTM RW 
102  BXD92  M  59  R1308C  3  UTM RW 
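
The table above can be loaded with a few lines of Python. The sketch below is illustrative only: it assumes columns are separated by two or more spaces (as in this file), embeds just a handful of rows as a sample, and treats rows that share a strain and SampleName as technical replicates of one biological pool. That last rule is an assumption based on the BXD14 and BXD29 entries; the field names are hypothetical.

```python
import re
from collections import defaultdict

# A few rows copied from the table above; a real script would read all 102.
ROWS = """\
1  C57BL/6J  F  116  R0773C  2  UAB
2  C57BL/6J  M  109  R0054C  1  JAX
30  BXD14  F  190  R0794C  2  UAB
31  BXD14  F  190  R0794C  3  UAB
32  BXD14  M  91  R0758C  2  UMemphis
33  BXD14  M  65  R1130C  3  UTM RW
"""

# Hypothetical field names for the seven table columns.
FIELDS = ("id", "strain", "sex", "age_days", "sample", "batch", "source")


def parse_rows(text):
    """Split each line on runs of two or more spaces and build one dict per array."""
    records = []
    for line in text.strip().splitlines():
        parts = re.split(r"\s{2,}", line.strip())
        rec = dict(zip(FIELDS, parts))
        for key in ("id", "age_days", "batch"):
            rec[key] = int(rec[key])
        records.append(rec)
    return records


def biological_pools(records):
    """Group arrays by (strain, sample name); assumed to be one biological pool each."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["strain"], rec["sample"])].append(rec)
    return groups


records = parse_rows(ROWS)
pools = biological_pools(records)
print(len(records), "arrays,", len(pools), "biological pools")
```

Splitting on two or more spaces keeps source labels such as "UTM RW" intact as a single field.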

About data processing:
Probe (cell) level data from the CEL file: These CEL values produced by GCOS are 75% quantiles from a set of 91 pixel values per cell.
 Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values. We then computed the log base 2 of each cell.
 Step 2: We performed a quantile normalization for the log base 2 values for the total set of 104 arrays (all three batches) using the same initial steps used by the RMA transform.
 Step 3: We computed the Z scores for each cell value.
 Step 4: We multiplied all Z scores by 2.
 Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a twofold difference in expression level corresponds approximately to a 1 unit difference.
 Step 6: We corrected for technical variance introduced by three large batches at the probe level. To do this we determined the ratio of the batch mean to the mean of all three batches and used this as a single multiplicative probe-specific batch correction factor. The consequence of this simple correction is that the mean probe signal value for each of the three batches is the same.
 Step 7a: The 430A and 430B arrays include a set of 100 shared probe sets (a total of 2200 probes) that have identical sequences. These probes and probe sets provide a way to calibrate expression of the 430A and 430B arrays to a common scale. To bring the two arrays into alignment, we regressed Z scores of the common set of probes to obtain a linear regression correction to rescale the 430B arrays to the 430A array. In our case this involved multiplying all 430B Z scores by the slope of the regression and adding or subtracting a small offset. The result of this step is that the mean of the 430A expression is fixed at a value of 8, whereas that of the 430B chip is typically reduced to 7. The average of the merged 430A and 430B array data set is approximately 7.5.
 Step 7b: We recentered the merged 430A and 430B data sets to a mean of 8 and a standard deviation of 2. This involved reapplying Steps 3 through 5.
 Step 8: Finally, we computed the arithmetic mean of the values for the set of microarrays for each strain. Technical replicates were averaged before computing the mean for independent biological samples. Note that we have not (yet) corrected for variance introduced by differences in sex, age, source of animals, or any interaction terms. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the CEL file. We eventually hope to add statistical controls and adjustments for these variables.
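
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the code used to build this data set. The function names, the array layout (probes in rows, arrays in columns), the toy random data, and the hypothetical strain and pool labels are all assumptions. The quantile normalization is the standard rank-based form with ties broken by sort order. In Step 6 the "mean of all three batches" is taken as the mean of the batch means, and in Step 7a the 430A values are regressed on the 430B values; both readings are assumptions. Step 7b simply reapplies `rescale_z`.

```python
import numpy as np


def log2_with_offset(cells, offset=1.0):
    """Step 1: add an offset, then take log base 2."""
    return np.log2(cells + offset)


def quantile_normalize(x):
    """Step 2: replace each value by the across-array mean of the values at its rank."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    mean_by_rank = np.sort(x, axis=0).mean(axis=1)
    return mean_by_rank[ranks]


def rescale_z(x, mean=8.0, sd=2.0):
    """Steps 3-5: per-array Z scores, multiplied by 2, plus 8."""
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    return z * sd + mean


def batch_correct(x, batch):
    """Step 6: probe-specific multiplicative correction so batch means agree."""
    batch = np.asarray(batch)
    labels = np.unique(batch)
    batch_means = np.stack([x[:, batch == b].mean(axis=1) for b in labels])
    grand_mean = batch_means.mean(axis=0)  # assumption: mean of the batch means
    out = x.copy()
    for b, m in zip(labels, batch_means):
        out[:, batch == b] *= (grand_mean / m)[:, None]
    return out


def align_b_to_a(shared_a, shared_b, all_b):
    """Step 7a: fit shared-probe A values on B values, then rescale all B values."""
    slope, intercept = np.polyfit(shared_b, shared_a, 1)
    return slope * all_b + intercept


def strain_means(x, strain, pool):
    """Step 8: average technical replicates per pool, then pools per strain."""
    result = {}
    for s in sorted(set(strain)):
        cols_by_pool = {}
        for j, (st, p) in enumerate(zip(strain, pool)):
            if st == s:
                cols_by_pool.setdefault(p, []).append(j)
        pool_means = [x[:, cols].mean(axis=1) for cols in cols_by_pool.values()]
        result[s] = np.mean(pool_means, axis=0)
    return result


# Toy data: 200 probes on 6 arrays (not real CEL values).
rng = np.random.default_rng(0)
cells = rng.lognormal(mean=6.0, sigma=1.0, size=(200, 6))
batch = [1, 1, 2, 2, 3, 3]
strain = ["strainA", "strainA", "strainA", "strainB", "strainB", "strainB"]
pool = ["a1", "a1", "a2", "b1", "b2", "b2"]  # repeated label = technical replicate

logged = log2_with_offset(cells)
normed = quantile_normalize(logged)
scaled = rescale_z(normed)
corrected = batch_correct(scaled, batch)
means = strain_means(corrected, strain, pool)

# Toy Step 7a: pretend the first 20 probes are shared between the two arrays.
shared_b = scaled[:20, 0]
shared_a = 1.1 * shared_b + 0.5
aligned = align_b_to_a(shared_a, shared_b, scaled[:, 0])
```

Averaging within each pool before averaging across pools keeps a pool with several technical replicates from outweighing a pool that was hybridized once.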
Probe set data: The expression data were processed by Yanhua Qu (UTHSC) using the Position-Dependent Nearest Neighbor (PDNN) method developed by Zhang and colleagues (2003). The normalized CEL files were read into the PerfectMatch program. The same simple steps described above were also applied to the initial PDNN probe set expression estimates. A 1-unit difference represents roughly a twofold difference in expression level. Expression levels below 5 are usually close to background noise levels.
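
As a rough guide to reading these values, a difference between two expression estimates can be converted to an approximate fold change by raising 2 to that difference. The helper below is a hypothetical illustration of this rule of thumb. The function names and example values are assumptions, and the conversion is only approximate because the rescaling to a mean of 8 and a standard deviation of 2 does not preserve exact log2 units.

```python
NOISE_FLOOR = 5.0  # values below this are usually close to background noise


def approx_fold_change(value_a, value_b):
    """Approximate fold change of A over B, treating 1 unit as a twofold difference."""
    return 2.0 ** (value_a - value_b)


def near_background(value, floor=NOISE_FLOOR):
    """Return True when an expression value falls below the noise floor."""
    return value < floor


print(approx_fold_change(10.0, 8.0))  # two units apart, about fourfold
print(near_background(4.6))
```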
About the chromosome and megabase position values:
The chromosomal locations of probe sets and gene markers on the 430A and 430B microarrays were determined by BLAT analysis using the Mouse Genome Sequencing Consortium May 2004 Assembly (see http://genome.ucsc.edu/cgi-bin/hgBlat?command=start&org=mouse). We thank Dr. Yan Cui (UTHSC) for allowing us to use his Linux cluster to perform this analysis.
Data source acknowledgment:
Data were generated with funds contributed equally by The UTHSC-SJCRH Cerebellum Transcriptome Profiling Consortium. Our members include:
 Tom Curran
 Dan Goldowitz
 Kristin Hamre
 Lu Lu
 Peter McKinnon
 Jim Morgan
 Clayton Naeve
 Richard Smeyne
 Robert Williams
 The Center of Genomics and Bioinformatics at UTHSC
 The Hartwell Center at SJCRH
Information about this text file:
This text file was originally generated by RWW and YHQ, March 21, 2005. Updated by RWW, March 23, 2005.
