Sun-Chong Wang, Art Petronis, "DNA Methylation Microarrays: Experimental Design and Statistical Analysis"
Chapman & Hall/CRC | 2008 | ISBN: 1420067273 | 256 pages | PDF | 16.9 MB

Providing an interface between dry-bench bioinformaticians and wet-lab biologists, DNA Methylation Microarrays: Experimental Design and Statistical Analysis presents the statistical methods and tools to analyze high-throughput epigenomic data, in particular, DNA methylation microarray data. Since these microarrays share the same underlying principles as gene expression microarrays, many of the analyses in the text also apply to microarray-based gene expression and histone modification (ChIP-on-chip) studies.

After introducing basic statistics, the book describes wet-bench technologies that produce the data for analysis and explains how to preprocess the data to remove systematic artifacts resulting from measurement imperfections. It then explores differential methylation and genomic tiling arrays. Focusing on exploratory data analysis, the next several chapters show how cluster and network analyses can link the functions and roles of unannotated DNA elements with known ones. The book concludes by surveying the open source software (R and Bioconductor), public databases, and other online resources available for microarray research.

Requiring only limited knowledge of statistics and programming, this book helps readers gain a solid understanding of the methodological foundations of DNA microarray analysis.

1 Applied Statistics
1.1 Descriptive statistics
1.1.1 Frequency distribution
1.1.2 Central tendency and variability
1.1.3 Correlation
1.2 Inferential statistics
1.2.1 Probability distribution
1.2.2 Central limit theorem and normal distribution
1.2.3 Statistical hypothesis testing
1.2.4 Two-sample t-test
1.2.5 Nonparametric test
1.2.6 One-factor ANOVA and F-test
1.2.7 Simple linear regression
1.2.8 Chi-square test of contingency
1.2.9 Statistical power analysis

2 DNA Methylation Microarrays and Quality Control
2.1 DNA methylation microarrays
2.2 Workflow of methylome experiment
2.2.1 Restriction enzyme-based enrichment
2.2.2 Immunoprecipitation-based enrichment
2.3 Image analysis
2.4 Visualization of raw data
2.5 Reproducibility
2.5.1 Positive and negative controls by exogenous sequences
2.5.2 Intensity fold-change and p-value
2.5.3 DNA unmethylation profiling
2.5.4 Correlation of intensities between tiling arrays

3 Experimental Design
3.1 Goals of experiment
3.1.1 Class comparison and class prediction
3.1.2 Class discovery
3.2 Reference design
3.2.1 Dye swaps
3.3 Balanced block design
3.4 Loop design
3.5 Factorial design
3.6 Time course experimental design
3.7 How many samples/arrays are needed?
3.7.1 Biological versus technical replicates
3.7.2 Statistical power analysis
3.7.3 Pooling biological samples
3.8 Appendix

4 Data Normalization
4.1 Measure of methylation
4.2 The need for normalization
4.3 Strategy for normalization
4.4 Two-color CpG island microarray normalization
4.4.1 Global dependence of log methylation ratios
4.4.2 Dependence of log ratios on intensity
4.4.3 Dependence of log ratios on print-tips
4.4.4 Normalized Cy3- and Cy5-intensities
4.4.5 Between-array normalization
4.5 Oligonucleotide arrays normalization
4.5.1 Background correction: PM – MM?
4.5.2 Quantile normalization
4.5.3 Probeset summarization
4.6 Normalization using control sequences
4.7 Appendix

5 Significant Differential Methylation
5.1 Fold change
5.2 Linear model for log-ratios or log-intensities
5.2.1 Microarrays reference design or oligonucleotide chips
5.2.2 Sequence-specific dye effect in two-color microarrays
5.3 t-test for contrasts
5.4 F-test for joint contrasts
5.5 P-value adjustment for multiple testing
5.5.1 Bonferroni correction
5.5.2 False discovery rate
5.6 Modified t- and F-test
5.7 Significant variation within and between groups
5.7.1 Within-group variation
5.7.2 Between-group variation
5.8 Significant correlation with a co-variate
5.9 Permutation test for bisulfite sequence data
5.9.1 Euclidean distance
5.9.2 Entropy
5.10 Missing data values
5.11 Appendix
5.11.1 Factorial design
5.11.2 Time-course experiments
5.11.3 Balanced block design
5.11.4 Loop design

6 High-Density Genomic Tiling Arrays
6.1 Normalization
6.1.1 Intra- and interarray normalization
6.1.2 Sequence-based probe effects
6.2 Wilcoxon test in a sliding window
6.2.1 Probe score or scan statistic
6.2.2 False positive rate
6.3 Boundaries of methylation regions
6.4 Multiscale analysis by wavelets
6.5 Unsupervised segmentation by hidden Markov model
6.6 Principal component analysis and biplot

7 Cluster Analysis
7.1 Measure of dissimilarity
7.2 Dimensionality reduction
7.3 Hierarchical clustering
7.3.1 Bottom-up approach
7.3.2 Top-down approach
7.4 K-means clustering
7.5 Model-based clustering
7.6 Quality of clustering
7.7 Statistically significance of clusters
7.8 Reproducibility of clusters
7.9 Repeated measurements

8 Statistical Classification
8.1 Feature selection
8.2 Discriminant function
8.2.1 Linear discriminant analysis
8.2.2 Diagonal linear discriminant analysis
8.3 K-nearest neighbor
8.4 Performance assessment
8.4.1 Leave-one-out cross validation
8.4.2 Receiver operating characteristic analysis

9 Interdependency Network of DNA Methylation
9.1 Graphs and networks
9.2 Partial correlation
9.3 Dependence networks from DNA methylation microarrays
9.4 Network analysis
9.4.1 Distribution of connectivities
9.4.2 Active epigenetically regulated loci
9.4.3 Correlation of connectivities
9.4.4 Modularity

10 Time Series Experiment
10.1 Regulatory networks from microarray data
10.2 Dynamic model of regulation
10.3 A penalized likelihood score for parsimonious model
10.4 Optimization by genetic algorithms

11 Online Annotations
11.1 Gene centric resources
11.1.1 GenBank: A nucleotide sequence database
11.1.2 UniGene: An organized view of transcriptomes
11.1.3 RefSeq: Reviews of sequences and annotations
11.1.4 PubMed: A bibliographic database of biomedical journals
11.1.5 dbSNP: Database for nucleotide sequence variation
11.1.6 OMIM: A directory of human genes and genetic disorders
11.1.7 Entrez Gene: A Web portal of genes
11.2 PubMeth: A cancer methylation database
11.3 Gene Ontology
11.4 Kyoto Encyclopedia of Genes and Genomes
11.5 UniProt/Swiss-Prot protein knowledgebase
11.6 The International HapMap Project
11.7 UCSC human genome browser

12 Public Microarray Data Repositories
12.1 Epigenetics Society
12.2 Microarray Gene Expression Data Society
12.3 Minimum Information about a Microarray Experiment
12.4 Public repositories for high-throughput arrays
12.4.1 Gene Expression Omnibus at NCBI
12.4.2 ArrayExpress at EBI
12.4.3 Center for Information Biology Gene Expression database at DDBJ

13 Open Source Software for Microarray Data Analysis
13.1 R: A language and environment for statistical computing and graphics
13.2 Bioconductor
13.2.1 marray package
13.2.2 affy package
13.2.3 limma package
13.2.4 stats package
13.2.5 tilingArray package
13.2.6 Ringo package
13.2.7 cluster package
13.2.8 class package
13.2.9 GeneNet package
13.2.10 inetwork package
13.2.11 GOstats package
13.2.12 annotate package

References
Index

42 replies to this thread. Last reply on 2017-4-6 14:30

潇湘靓猪 (生物秀才) posted on 2010-7-1 10:34:50

Thanks! I'll download it and take a look!
hzcdj (生物秀才) posted on 2010-7-1 10:46:48

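Thanks!
hzcdj (生物秀才) posted on 2010-7-1 10:53:57
Every part is the same. Why post duplicates?
Methylation (生物秀才) posted on 2010-7-1 11:06:34
Reply to #4 hzcdj

Oh, they aren't the same. The file is fairly large, so it was split and compressed into smaller parts before uploading (part1-part5). You need to download all of the parts before you can extract them into one complete file.
liuhaiyun354 (生物秀才) posted on 2010-7-1 11:23:27
Taking a look.
wzhjlau2009 (生物秀才) posted on 2010-7-4 14:45:31
Reply to #1 Methylation

Many thanks to the original poster for sharing such useful study material with us!
fanfuzi09 (生物秀才) posted on 2010-7-11 13:02:06
Thanks for sharing.
duchunpingzs (生物秀才) posted on 2010-7-11 14:10:37
Great stuff! Thanks to the original poster for sharing.
rainykekezhu (生物秀才) posted on 2010-7-20 19:57:50
This really is great stuff!!!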