草人 최광민: Notes on LIMMA


Scientist. Husband. Daddy. --- TOLLE. LEGE

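Quotations from external sources follow the principle of fair use of copyrighted works as defined in the Copyright Act of the Republic of Korea (Article 28) and the U.S. Copyright Act (17 U.S.C. §107), i.e. the U.S. fair use doctrine. For all writings and translations marked with a copyright notice (© 최광민), (1) reproduction and distribution, (2) arbitrary modification or selective excerpting of the text, and (3) screen capture for unauthorized distribution are prohibited, and (4) only the URL may be used when citing.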

Notes on LIMMA

Labels: Informatics

[1]

LIMMA is a package for the analysis of gene expression microarray data, especially the use of linear models for analyzing designed experiments. LIMMA can analyze comparisons between many RNA targets simultaneously. It has features that keep the analyses stable even for experiments with a small number of arrays; this is achieved by borrowing information across genes. Briefly, rather than estimating the within-group variability (the denominator of the t-test) separately for each gene, it pools the information from many similar genes. The moderated t-test eliminates accidentally large t-statistics caused by accidentally small within-group variances, and effectively introduces a 'fold-change' criterion.
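For reference, the moderated t-statistic described above can be written out as follows. The symbols are not from the original note; they follow the usual presentation of limma's empirical Bayes model, so treat this as a summary sketch rather than a full derivation.

```latex
\tilde{s}_g^{2} = \frac{d_0\, s_0^{2} + d_g\, s_g^{2}}{d_0 + d_g},
\qquad
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}}
```

Here $s_g^2$ is the ordinary residual variance of gene $g$ with $d_g$ degrees of freedom, $s_0^2$ and $d_0$ are the prior variance and prior degrees of freedom estimated from all genes, $\hat{\beta}_g$ is the estimated log fold change, and $v_g$ is the unscaled variance factor from the design. A gene with an accidentally tiny $s_g^2$ is pulled toward $s_0^2$, so it can no longer produce a huge statistic from a small fold change alone.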

Limma can actually analyze data without replicates (if you use the eBayes test), as long as the fitted model still has residual degrees of freedom for estimating variances. But a test without replicates is statistically similar to using the fold change alone. So if you have to do a "dirty" analysis like this (we have all been there at one time or another :-) and some people think that replication is a waste of money), I would suggest simply sorting the values by fold change and taking, say, the top 50-100 genes that meet a fold-change threshold such as 1.5-2x (roughly ±0.58 to ±1 on the log2 scale). If you want to derive a p-value (a highly suspect proposition given that you have no replicates...), you could calculate an array-wide standard deviation of the fold-change values. This is actually similar to what limma's eBayes does: it performs variance shrinkage by calculating the variance as a blend of the array-wide variance and the gene-level variance (estimated from replicates).
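The fold-change ranking described above can be sketched in a few lines of base R. Everything here is illustrative: the data are simulated, the variable names (`control`, `treated`, `lfc`, `hits`, `p.naive`) are made up for this example, and the normal approximation used for the p-value is an assumption, not something limma does.

```r
# Simulated log2 expression for ONE control and ONE treated array (no replicates)
set.seed(1)
n.genes = 1000
control = rnorm( n.genes, mean=8, sd=2 )
treated = control + rnorm( n.genes, mean=0, sd=0.3 )
treated[1:10] = treated[1:10] + 3            # spike in 10 strongly "changed" genes
names( control ) = paste0( "gene", 1:n.genes )
names( treated ) = paste0( "gene", 1:n.genes )

# log2 fold change; without replicates this is all the evidence there is
lfc = treated - control

# "dirty" selection: keep |log2 FC| >= 1 (i.e. 2x), sort by size, take at most 50
hits = lfc[ abs(lfc) >= 1 ]
hits = head( hits[ order(abs(hits), decreasing=TRUE) ], 50 )

# highly suspect p-value: use the array-wide SD of the fold changes as the
# error SD for every gene, with a normal approximation (an assumption)
sd.array = sd( lfc )
p.naive = 2 * pnorm( -abs(lfc / sd.array) )
```

Because every gene shares the same `sd.array`, ranking by `p.naive` gives exactly the same order as ranking by absolute fold change, which is the point made in the text.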

[2]

The approach requires one or two matrices to be specified. The first is the `design matrix` which indicates in effect which RNA samples have been applied to each array. The linear model is specified by the design matrix. The second is the `contrast matrix` which specifies which comparisons you would like to make between the RNA samples. A straightforward strategy is to extract the contrast of interest from the fit based on the design matrix.
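As a minimal sketch of the two matrices, assume a hypothetical layout of three control arrays followed by three treated arrays. Only base R is used here; the contrast is written by hand, and limma's `makeContrasts( treated - control, levels=design )` should build an equivalent matrix. The coefficient values for the toy gene are invented for illustration.

```r
# Hypothetical layout: 3 control arrays followed by 3 treated arrays
groups = c( "control", "treated" )
f.type = factor( rep(groups, times=c(3, 3)), levels=groups )

# design matrix: one row per array, one column per RNA source
design = model.matrix( ~ 0 + f.type )
colnames( design ) = groups

# contrast matrix for the comparison "treated - control", written by hand
contrast = matrix( c(-1, 1), ncol=1,
                   dimnames=list(Levels=groups, Contrasts="treated - control") )

# toy fitted coefficients for one gene (group means on the log2 scale)
coef.gene = c( control=7.0, treated=8.5 )

# applying the contrast to the coefficients gives the log2 fold change
lfc = drop( coef.gene %*% contrast )
```

The design matrix answers "which RNA is on which array", while the contrast answers "which difference between the fitted group means do I care about".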

[3]

The array weighting technique is generally useful when array quality is in doubt. In particular, RNA samples from human clinical patients are typically variable in quality, so array weights are recommended for this type of data, and we decided to use the array weighting technique in the LIMMA package. Given an appropriate design matrix, the relative reliability of each array in an experiment can be estimated by measuring how well the expression values for that array follow the linear model. This method offers a graduated approach to quality assessment by allowing poorer quality arrays, which would otherwise be discarded, to be included in an analysis but down-weighted.
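A minimal sketch of how array weights could be plugged into the fit, assuming the Bioconductor package limma is installed. The expression matrix is simulated, with one array made deliberately noisy, and all object names are hypothetical.

```r
# Sketch only: runs when limma is available; data are simulated
if ( requireNamespace("limma", quietly=TRUE) ) {
    set.seed(1)
    expr = matrix( rnorm(500 * 6, mean=8, sd=0.3), nrow=500,
                   dimnames=list(paste0("gene", 1:500), paste0("array", 1:6)) )
    expr[, 6] = expr[, 6] + rnorm( 500, sd=1.5 )   # make array 6 deliberately noisy

    groups = c( "control", "treated" )
    f.type = factor( rep(groups, times=c(3, 3)), levels=groups )
    design = model.matrix( ~ 0 + f.type )
    colnames( design ) = groups

    # one relative quality weight per array, estimated from the model residuals
    array.weight = limma::arrayWeights( expr, design )
    print( round(array.weight, 2) )    # the noisy array is expected to score low

    # pass the weights to the linear model fit so poor arrays count for less
    fit.weight = limma::lmFit( expr, design, weights=array.weight )
}
```

The weighted fit can then go through `contrasts.fit` and `eBayes` exactly as an unweighted one would.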

###-----------------------------------------------------------------------------
### FUNC: run.limma(rma=rmaData, groups=c("control", "treated"), sampleNum=c(3,3), fc=log2(3), pCutoff=0.05, adj="BH", cmp="treated - control")
###-----------------------------------------------------------------------------
library( limma )
run.limma = function( rma=rmaData, groups=c("control", "treated"), sampleNum=c(3,3), fc=log2(3), pCutoff=0.05, adj="BH", cmp="treated - control" ) {
    # rma:     expression data (e.g. RMA output); columns ordered group by group, as in 'groups'
    # fc:      minimum absolute log2 fold change
    # pCutoff: adjusted p-value cutoff (the 0.05 default is an assumption; the original left it undefined)
    myRMA = rma
    uniqTypes = groups # c( "untreated", "shRUNX1")
    f.type = factor( rep(uniqTypes, times=sampleNum), levels=uniqTypes )
    print( f.type )

    # design matrix with one column per group (no intercept)
    design = model.matrix( ~ 0+f.type )
    colnames( design ) = uniqTypes

    #contrast = makeContrasts( shRUNX1-untreated, levels=design )
    contrast = makeContrasts( contrasts=cmp, levels=design )

    fit.weight = lmFit( myRMA, design ) #, weights=array.weight )
    fit.weight.contrast = contrasts.fit(fit.weight, contrast)
    fit.weight.contrast = eBayes( fit.weight.contrast )

    # keep every gene that passes both the adjusted p-value and the log2 fold-change cutoff
    tmp = topTable(fit.weight.contrast, coef=cmp, adjust.method=adj, p.value=pCutoff, lfc=fc, number=Inf )

    return( tmp )
}
