HBP/Rosen Striatum M430v2 (April05) PDNN Clean 
Accession number: GN68
Summary:
PREFERRED DATA SET. This April 2005 data freeze provides estimates of mRNA expression in the striatum (caudate nucleus of the forebrain) of 31 lines of mice including C57BL/6J, DBA/2J, and 29 BXD recombinant inbred strains. This data set excludes eleven arrays associated with high numbers of outliers (clean). Data were generated using Affymetrix Mouse Genome 430 2.0 short oligomer microarrays at Beth Israel Deaconess Medical Center (BIDMC, Boston MA) by Glenn D. Rosen with the support of a Human Brain Project (HBP) grant. Approximately 250 brain samples (males and females) from 31 strains were used in this experiment. This data set includes 48 arrays that passed very stringent quality control procedures. Data were processed using the PDNN method of Zhang. To simplify comparison among transforms, PDNN values of each array have been adjusted to an average expression of 8 units and a standard deviation of 2 units.
About the cases used to generate this set of data:
We have used a set of BXD recombinant inbred strains generated by crossing C57BL/6J (B6 or B) with DBA/2J (D2 or D). The BXDs are particularly useful for systems genetics because both parental strains have been sequenced (8x coverage of B6 and 1.5x coverage of D2). Physical maps in WebQTL incorporate approximately 1.75 million B vs D SNPs from Celera. BXD2 through BXD32 were bred by Benjamin A. Taylor starting in the late 1970s. BXD33 through BXD42 were bred by Taylor in the 1990s. These strains are available from The Jackson Laboratory.
About the tissue used to generate this set of data:
Animals were obtained from The Jackson Laboratory and housed for several weeks at BIDMC until they reached ~2 months of age (range from 55 to 62 days). Mice were killed by cervical dislocation and brains were removed and placed in RNAlater for 20 to 25 minutes prior to dissection. Cerebella and olfactory bulbs were removed; brains were hemisected, and both striata were dissected using a medial approach by GD Rosen that typically yields 5 to 7 mg of tissue per side.
All striatal dissections were performed by one person (GD Rosen) using a midsagittal approach that minimizes the likelihood of contamination across tissues. This dissection recovers most, but not all, of the neostriatum. We have histologically examined dissected tissue and have found no evidence of inclusion of cortical or thalamic tissue at the margins. We have further confirmed the dissections by comparative assays for acetylcholinesterase (AChE) protein levels using Western blots. The concentration of AChE in the striatum is far higher than that in cortex or cerebellum. A pool of dissected tissue from 3 or 4 adults (approximately 25 to 30 mg of tissue) of the same strain, sex, and age was collected in one session and used to generate cRNA samples.
RNA was extracted by Rosen and colleagues and was then processed by the BIDMC Genomics Core. Labeled cRNA was generated using the Amersham Biosciences cRNA synthesis kit protocol.
Replication and Sample Balance: Our goal is to obtain data for at least one independent biological sample pool from each sex for all BXD strains. We have not yet achieved this goal. Fifteen of 31 strains are represented by male and female samples. The remaining 16 strains are still represented by single sex samples: BXD6 (F), BXD9 (F), BXD11 (F), BXD12 (F), BXD13 (F), BXD14 (M), BXD19 (F), BXD20 (F), BXD22 (M), BXD24 (M), BXD27 (F), BXD28 (F), BXD32 (M), BXD39 (M), C57BL/6J (M), and DBA/2J (M).
Batch Structure: This data set consists of arrays processed in three batches with several "reruns" for the first batch. All arrays were run using a single protocol. All data have been corrected for batch effects as described below.
The table below lists the arrays by strain, sex, sample name, and batch ID. Each array was hybridized to a pool of mRNA from 3 to 4 mice. All mice were between 55 and 62 days.
Id | Strain | Sex | Sample_name | BatchId |
1 | C57BL/6J | M | Chip41_Batch02_B6_M_Str | Batch02 |
2 | BXD1 | F | Chip03_Batch03_BXD1_F_Str | Batch03 |
3 | BXD1 | M | Chip04_Batch03_BXD1_M_Str | Batch03 |
4 | BXD2 | F | Chip20_Rerun01_BXD2_F_Str | Rerun01 |
5 | BXD2 | M | Chip05_Batch01_BXD2_M_Str | Batch01 |
6 | BXD5 | F | Chip10_Batch03_BXD5_F_Str | Batch03 |
7 | BXD5 | M | Chip12_Batch03_BXD5_M_Str | Batch03 |
8 | BXD6 | F | Chip38_Batch02_BXD6_F_Str | Batch02 |
9 | BXD8 | F | Chip07_Batch03_BXD8_F_Str | Batch03 |
10 | BXD8 | M | Chip02_Batch03_BXD8_M_Str | Batch03 |
11 | BXD9 | F | Chip16_Batch01_BXD9_F_Str | Batch01 |
12 | BXD11 | F | Chip31_Batch02_BXD11_F_Str | Batch02 |
13 | BXD12 | F | Chip11_Batch01_BXD12_F_Str | Batch01 |
14 | BXD13 | F | Chip33_Batch02_BXD13_F_Str | Batch02 |
15 | BXD14 | M | Chip47_Rerun01_BXD14_M_Str | Rerun01 |
16 | BXD15 | F | Chip21_Batch01_BXD15_F_Str | Batch01 |
17 | BXD15 | M | Chip13_Batch01_BXD15_M_Str | Batch01 |
18 | BXD16 | F | Chip36_Batch02_BXD16_F_Str | Batch02 |
19 | BXD16 | M | Chip44_Rerun01_BXD16_M_Str | Rerun01 |
20 | BXD18 | F | Chip15_Batch03_BXD18_F_Str | Batch03 |
21 | BXD18 | M | Chip19_Batch03_BXD18_M_Str | Batch03 |
22 | BXD19 | F | Chip19_Batch01_BXD19_F_Str | Batch01 |
23 | BXD20 | F | Chip14_Batch03_BXD20_F_Str | Batch03 |
24 | BXD21 | F | Chip18_Batch01_BXD21_F_Str | Batch01 |
25 | BXD21 | M | Chip09_Batch01_BXD21_M_Str | Batch01 |
26 | BXD22 | M | Chip13_Batch03_BXD22_M_Str | Batch03 |
27 | BXD24 | M | Chip17_Batch03_BXD24_M_Str | Batch03 |
28 | BXD27 | F | Chip29_Batch02_BXD27_F_Str | Batch02 |
29 | BXD28 | F | Chip06_Batch01_BXD28_F_Str | Batch01 |
30 | BXD29 | F | Chip45_Batch02_BXD29_F_Str | Batch02 |
31 | BXD29 | M | Chip42_Batch02_BXD29_M_Str | Batch02 |
32 | BXD31 | F | Chip14_Batch01_BXD31_F_Str | Batch01 |
33 | BXD31 | M | Chip09_Batch03_BXD31_M_Str | Batch03 |
34 | BXD32 | M | Chip30_Batch02_BXD32_M_Str | Batch02 |
35 | BXD33 | F | Chip27_Rerun01_BXD33_F_Str | Rerun01 |
36 | BXD33 | M | Chip34_Batch02_BXD33_M_Str | Batch02 |
37 | BXD34 | F | Chip03_Batch01_BXD34_F_Str | Batch01 |
38 | BXD34 | M | Chip07_Batch01_BXD34_M_Str | Batch01 |
39 | BXD38 | F | Chip17_Batch01_BXD38_F_Str | Batch01 |
40 | BXD38 | M | Chip24_Batch01_BXD38_M_Str | Batch01 |
41 | BXD39 | M | Chip20_Batch03_BXD39_M_Str | Batch03 |
42 | BXD39 | F | Chip23_Batch03_BXD39_F_Str | Batch03 |
43 | BXD39 | M | Chip43_Rerun01_BXD39_M_Str | Rerun01 |
44 | BXD40 | F | Chip08_Rerun01_BXD40_F_Str | Rerun01 |
45 | BXD40 | M | Chip22_Batch01_BXD40_M_Str | Batch01 |
46 | BXD42 | F | Chip35_Batch02_BXD42_F_Str | Batch02 |
47 | BXD42 | M | Chip32_Batch02_BXD42_M_Str | Batch02 |
48 | DBA/2J | M | Chip05_Batch03_D2_M_Str | Batch03 |
About the array platform:
Affymetrix Mouse Genome 430 2.0 array: The 430v2 array consists of 992,936 useful 25-nucleotide probes that estimate the expression of approximately 39,000 transcripts (many are near duplicates). The array sequences were selected late in 2002 using Unigene Build 107. The array nominally contains the same probe sequences as the 430A and B series. However, we have found that roughly 75,000 probes differ from those on A and B arrays.
About data processing:
Probe (cell) level data from the CEL file: These CEL values produced by GCOS are 75% quantiles from a set of 91 pixel values per cell.
- Step 1: We added an offset of 1.0 unit to each cell signal to ensure that all values could be logged without generating negative values. We then computed the log base 2 of each cell.
- Step 2: We performed a quantile normalization of the log base 2 values for the total set of arrays (processed as two batches) using the same initial steps used by the RMA transform.
- Step 3: We computed the Z scores for each cell value.
- Step 4: We multiplied all Z scores by 2.
- Step 5: We added 8 to the value of all Z scores. The consequence of this simple set of transformations is to produce a set of Z scores that have a mean of 8, a variance of 4, and a standard deviation of 2. The advantage of this modified Z score is that a two-fold difference in expression level corresponds approximately to a 1 unit difference.
- Step 6: We eliminated much of the systematic technical variance introduced by the batches at the probe level. To do this we calculated the ratio of each batch mean to the mean of all batches and used this as a single multiplicative probe-specific batch correction factor. The consequence of this simple correction is that the mean probe signal value for each batch is the same.
- Step 7: Finally, we computed the arithmetic mean of the values for the set of microarrays for each strain. Technical replicates were averaged before computing the mean for independent biological samples. Note that we have not (yet) corrected for variance introduced by differences in sex or any interaction terms. We have not corrected for background beyond the background correction implemented by Affymetrix in generating the CEL file. We eventually hope to add statistical controls and adjustments for some of these variables.
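The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used to build this data set: all function names are hypothetical, the Z score uses the population standard deviation, ties in the quantile normalization are not averaged, and "the mean of all batches" in Step 6 is taken to be the mean of the batch means. Averaging of technical replicates before Step 7 is omitted.

```python
import math
from statistics import mean, pstdev

def log2_offset(cells, offset=1.0):
    """Step 1: add an offset of 1.0, then take log base 2 of each cell value."""
    return [math.log2(v + offset) for v in cells]

def quantile_normalize(arrays):
    """Step 2: replace each value by the mean, across arrays, of the values
    that share its rank. Ties are broken by position (a simplification)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [mean([s[i] for s in sorted_arrays]) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

def modified_z(values, target_mean=8.0, target_sd=2.0):
    """Steps 3 to 5: Z score each value, multiply by 2, then add 8.
    Assumes the values are not all identical (nonzero SD)."""
    m, s = mean(values), pstdev(values)
    return [target_mean + target_sd * (v - m) / s for v in values]

def batch_correct(probe_values, batch_ids):
    """Step 6, for one probe: divide each value by the ratio of its batch
    mean to the mean of the batch means (assumed reading of the text)."""
    batch_mean = {}
    for b in set(batch_ids):
        batch_mean[b] = mean([v for v, bid in zip(probe_values, batch_ids) if bid == b])
    overall = mean(list(batch_mean.values()))
    return [v * overall / batch_mean[bid] for v, bid in zip(probe_values, batch_ids)]

def strain_means(values, strains):
    """Step 7: arithmetic mean of the arrays that belong to each strain."""
    grouped = {}
    for v, s in zip(values, strains):
        grouped.setdefault(s, []).append(v)
    return {s: mean(vs) for s, vs in grouped.items()}
```

After `batch_correct`, every batch has the same mean signal for that probe, which is the property Step 6 describes.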
Probe set data from the CHP file: The expression values were
generated using PDNN. The same simple steps described above
were also applied to these values. Every microarray data set
therefore has a mean expression of 8 with a standard deviation of 2.
A 1 unit difference represents roughly a two-fold difference
in expression level. Expression levels below 5 are usually close to
background noise levels.
Data quality control: A total of 62 samples passed RNA quality control.
Probe level QC: Log2 probe data of all arrays were inspected in DataDesk before and after quantile normalization. Inspection involved examining scatterplots of pairs of arrays for signal homogeneity (i.e., high correlation and linearity of the bivariate plots) and looking at all pairs of correlation coefficients (62x61/2). Arrays with probe data that were not homogeneous when compared to any other arrays were flagged. If the correlation at the probe level was less than approximately 0.92 we deleted that array data set. Three arrays were lost during this process (BXD19_M_Str_Batch03, BXD23_F_Str_Batch03, and BXD24_F_Str_Batch03).
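The pairwise comparison described above (62x61/2 = 1891 pairs) can be sketched as follows. The original inspection was done interactively in DataDesk, so this is only a hypothetical approximation: the function names are invented for illustration, and the flagging rule (an array is flagged when its highest correlation with any other array stays below the threshold) is an assumed reading of the text, which does not state the rule exactly.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length lists of log2 probe values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_correlations(arrays):
    """arrays maps an array name to its probe values.
    Returns one correlation per unordered pair, n*(n-1)/2 in total."""
    return {
        (a, b): pearson(arrays[a], arrays[b])
        for a, b in combinations(sorted(arrays), 2)
    }

def flag_arrays(arrays, threshold=0.92):
    """Flag arrays whose best correlation with any other array is below
    the threshold (assumed rule, see the note above)."""
    best = {name: -1.0 for name in arrays}
    for (a, b), r in pairwise_correlations(arrays).items():
        best[a] = max(best[a], r)
        best[b] = max(best[b], r)
    return sorted(name for name, r in best.items() if r < threshold)
```

A stricter variant would flag an array as soon as any single pairwise correlation falls below the threshold; which of the two matches the original procedure is not stated.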
Probe set level QC: The final normalized array data were evaluated for outliers. This involved counting the number of times that the probe set value for a particular array was beyond two standard deviations of the mean. This outlier analysis was carried out using the PDNN, RMA and MAS5 transforms and across different levels of expression. Eleven arrays were associated with an average of more than 8% outlier probe sets across all transforms and at all expression levels. In contrast, most other arrays generated fewer than 5% outliers. These eleven suspect arrays were eliminated from this "clean" data set. The following arrays were eliminated: B6_M_Str_Batch03, BXD6_M_Str_Batch02, BXD9_M_Str_Batch01, BXD12_M_Str_Batch03, BXD14_F_Str_Batch02, BXD23_M_Str_Batch03, BXD27_M_Str_Batch02, BXD28_M_Str_Batch01, BXD36_F_Str_Batch03, BXD36_M_Str_Batch03, and D2_M_Str_Batch01.
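The outlier count can be illustrated with a minimal sketch. It assumes a single transform and takes the mean and standard deviation of each probe set across arrays using the sample SD; the original analysis averaged the counts over the PDNN, RMA and MAS5 transforms and over expression levels, which is not reproduced here. The function names and the matrix layout are hypothetical.

```python
from statistics import mean, stdev

def outlier_fraction(matrix):
    """matrix[p][a] is the value of probe set p on array a.
    Returns, for each array, the fraction of probe sets on which that
    array lies more than two standard deviations from the probe set mean."""
    n_arrays = len(matrix[0])
    counts = [0] * n_arrays
    for row in matrix:
        m, s = mean(row), stdev(row)
        for a, v in enumerate(row):
            if abs(v - m) > 2 * s:
                counts[a] += 1
    return [c / len(matrix) for c in counts]

def suspect_arrays(matrix, max_fraction=0.08):
    """Indices of arrays with more than 8% outlier probe sets."""
    return [a for a, f in enumerate(outlier_fraction(matrix)) if f > max_fraction]
```

With real data the matrix would hold one row per probe set and one column per array, and the resulting fractions would be averaged over transforms before applying the 8% cutoff.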
Data source acknowledgment:
Data were generated with funds to Glenn Rosen from P20
MH62009 (see below for specifics). Samples and arrays were processed by the
Genomics Core at Beth Israel Deaconess Medical Center by Towia Libermann and colleagues.
About this text file:
This text file was originally generated by GDR, RWW, and YHQ in Nov 2004. Updated by RWW Nov 17, 2004; GDR and RWW, Dec 23, 2004; RWW and GDR April 8, 2005.
Web services initiated January, 1994 as Portable Dictionary of the Mouse Genome; June 15, 2001 as WebQTL; and Jan 5, 2005 as GeneNetwork. This site is currently operated by Rob Williams, Pjotr Prins, Zachary Sloan, Arthur Centeno. Design and code by Pjotr Prins, Zach Sloan, Arthur Centeno, Danny Arends, Christian Fischer, Sam Ockman, Lei Yan, Xiaodong Zhou, Christian Fernandez, Ning Liu, Rudi Alberts, Elissa Chesler, Sujoy Roy, Evan G. Williams, Alexander G. Williams, Kenneth Manly, Jintao Wang, and Robert W. Williams, colleagues.
GeneNetwork support from:
- The UT Center for Integrative and Translational Genomics
- NIGMS Systems Genetics and Precision Medicine project (R01 GM123489, 2017-2021)
- NIDA NIDA Core Center of Excellence in Transcriptomics, Systems Genetics, and the Addictome (P30 DA044223, 2017-2022)
- NIA Translational Systems Genetics of Mitochondria, Metabolism, and Aging (R01AG043930, 2013-2018)
- NIAAA Integrative Neuroscience Initiative on Alcoholism (U01 AA016662, U01 AA013499, U24 AA013513, U01 AA014425, 2006-2017)
- NIDA, NIMH, and NIAAA (P20-DA 21131, 2001-2012)
- NCI MMHCC (U01CA105417), NCRR, BIRN, (U24 RR021760)