2011.10.04. TÁMOP-4.1.2-08/2/A/KMR-2009-0006

Development of Complex Curricula for Molecular Bionics and Infobionics Programs within a consortial framework

Consortium leader: PETER PAZMANY CATHOLIC UNIVERSITY
Consortium members: SEMMELWEIS UNIVERSITY, DIALOG CAMPUS PUBLISHER

The Project has been realised with the support of the European Union and has been co-financed by the European Social Fund.

Peter Pazmany Catholic University, Faculty of Information Technology
www.itk.ppke.hu

BIOMEDICAL IMAGING
fMRI – Advanced Statistical Analysis
VIKTOR GÁL, ÉVA BANKÓ

The Multiple Comparison Problem
• running a separate t-test for every voxel (~100,000) hugely inflates the error rate (i.e. the number of false positives)
• if α = 0.05, about 5,000 voxels are expected to be false positives even when there is no true effect
• therefore one needs to correct for this multiple comparison problem:
  – Bonferroni correction
  – False Discovery Rate (FDR)
  – Familywise Error Rate (FWE)
[Figure: statistical map] Where is the significance threshold?

Bonferroni correction
• if all voxels were independent of each other, then simply: $p_{\mathrm{Bonf}} = p_{\mathrm{uncorr}} / N$, where $N$ is the number of voxels
• however, voxels are not independent (e.g. neighboring voxels show similar patterns, drift affects all of them equally)
• thus, a very conservative correction
• we need to account for the dependency structure between the test statistics

Familywise Error Rate (FWE)
• controls the probability of making even one error (or more)

False Discovery Rate (FDR)
• FDR is the proportion of false discoveries among the discoveries (rejected hypotheses)
• to calculate: order the p-values $p_{(1)} \le p_{(2)} \le \dots \le p_{(n)}$
• for a desired FDR level $q$: let $k = \max\{\, i : p_{(i)} \le \tfrac{i}{n}\, q \,\}$ and reject the hypotheses belonging to $p_{(1)}, \dots, p_{(k)}$
• if no such $k$ exists, reject none (i.e. nothing is significant)
[Figure: ordered p-values $p_{(i)}$ plotted against $i/n$ (both axes from 0 to 1) together with the line $\tfrac{i}{n} \times q$; $p_{(k)}$ is marked]

Region-of-Interest (ROI) Analysis
… another way out without statistical tweaks
• limit the analysis to a set of voxels comprising an area (i.e. region of interest) and then average across them to get a parameter estimate
• dimension reduction: the number of predefined ROIs is usually <10
• voxels need to be selected individually, based on an independent contrast (e.g. localizer), to ensure that the selection is not biased toward voxels showing the desired effect
• desirable if the location of the ROI has high individual variance
• how to select voxels (for more details see Tracey et al., 2008, NeuroImage):
  – select all active voxels in a given independent contrast individually (what is active? approximately $p_{\mathrm{uncorrected}} < 10^{-4}$)
  – select the peak activity (i.e. most active voxel) in the cluster and include all voxels in a volume (sphere, cube) around it

Caveats of classical parametric statistics in fMRI
• fMRI voxels ~ dense 3D matrix of low quality EEG electrodes
• Distribution of error, parameters?
• Time and spatial interdependence -> degrees of freedom (DOF)?
• Correction for multiple univariate stats
Solution:
• Nonparametric (resampling, bootstrap) methods
• MVPA approach; MVPA & nonparametric analysis
Validation?

Statistical assumptions (fixed-effect analysis):
Acquired data points are independent in time
[Figure: Stimulation]

What is our degree of freedom?
• Theoretically: ~ number of data points minus number of predictors
• Can be adjusted by analyzing/modelling nonsphericity:
  – autocorrelation structure
  – AR(1), ARMA(1,1): AR + white noise
  – drift correction, high-pass filtering
  – limited validity
Still, it is a question:
  – whether an experiment consisting of 1 trial (stimulus) and 1000 data points (very long baseline) is equivalent to an experiment consisting of 500 trials with 2 data points each?
Acquired images of a response to a stimulus are not independent!

Nonparametric methods: sampling statistics
• Generation of surrogate data
  – Surrogates are to be "similar" to the original in every relevant aspect
  – Surrogate stats can be computed via
    • Experiments without stimulation
    • Reshuffling (or decomposing and reshuffling) data points
    • Random predictor time-courses in the design matrix
• Sampling statistics
  – Statistical characterization of the original data and the surrogates
• Decision making
  – Based on rank order of the original
Examples: randomization test, bootstrapping
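
The surrogate idea above (random predictor time-courses, decision based on the rank order of the original) can be illustrated for a single voxel. The following is only a minimal sketch using the Python standard library: the toy boxcar predictor, the noise level, and the helper names `slope` and `randomization_p` are assumptions made for illustration, not part of the course material. Plain shuffling also treats the time points as exchangeable, which ignores the autocorrelation discussed earlier, so real surrogates would need to preserve that structure.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x (a one-predictor GLM estimate)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def randomization_p(predictor, signal, n_surrogates=1000, seed=0):
    """One-sided p-value from the rank of the original estimate among surrogates."""
    rng = random.Random(seed)
    observed = slope(predictor, signal)
    shuffled = list(predictor)
    exceed = 0
    for _ in range(n_surrogates):
        rng.shuffle(shuffled)  # surrogate ("false") predictor time-course
        if slope(shuffled, signal) >= observed:
            exceed += 1
    # +1 counts the original itself, so the p-value is never exactly 0
    return observed, (exceed + 1) / (n_surrogates + 1)


# Hypothetical toy voxel: boxcar predictor (10 rest, 10 task, repeated 4 times)
# with a true effect of 2.0 plus unit-variance Gaussian noise.
noise = random.Random(1)
predictor = ([0.0] * 10 + [1.0] * 10) * 4
signal = [2.0 * p + noise.gauss(0.0, 1.0) for p in predictor]

beta, p_value = randomization_p(predictor, signal)
print("estimate:", beta, "randomization p-value:", p_value)
```

The p-value is the fraction of surrogate estimates that reach the original one, so its resolution is limited by the number of surrogates (here at best 1/1001).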
Recipe
• pseudo-randomize the design matrix (DM)
• estimate parameters from the false DM
• repeating these steps yields a parameter distribution centered around 0, which reflects random effects
• compare the parameter estimated from the actual DM to this distribution
• a similar procedure can be used to statistically evaluate the difference between the parameter estimates of two conditions
• the same distributions enable an effective correction for multiple comparisons:
  – count the average number of voxels above different thresholds with the false DM and compare it to the values based on the original DM

p_voxel   N active voxels (original DM)   Average N active voxels (random DM)   FDR (random/original ratio)
0.0005    241                             1.01                                  0.004190871
0.001     341                             2.55                                  0.007478006
0.0015    408                             4.17                                  0.010220588
0.002     470                             6.3                                   0.013404255
0.0025    527                             8.31                                  0.015768501
0.003     569                             10.32                                 0.018137083
0.0035    610                             12.23                                 0.02004918
0.004     642                             14.19                                 0.022102804
0.0045    660                             15.74                                 0.023848485
0.005     680                             17.27                                 0.025397059
0.0055    712                             18.97                                 0.026643258

"Bootstrap" FDR

Validation example: activation of the fusiform area (event-related design)

Validation example
[Figure panels: Standard parametric map; Nonparametric map]
False positive activation signal in the left ventricle

Univariate and multivariate analysis in fMRI
Goal
• Is there any effect? Hypothesis testing
• What kind of effect?
• Localization of effect
Complexity of the multi-dimensional signal processing:
  – Separately, one dimension at a time:
    • Traditional: voxelwise, independent
    • Selecting areas, groups of voxels (ROI: POI, VOI) and averaging
      – S/N may increase
      – correction for multiple univariate comparisons is less important
  – Parallel multidimensional:
    • Spatial or spatio-temporal patterns:
      – Multi-voxel pattern analysis (MVPA)
      – Multivariate decomposition: ICA, PICA etc.

Multi-voxel Pattern Analysis (MVPA)
… potentials and requirements
General purpose:
  – ROI based analysis: hypothesis testing
  – Search-light: localization
Block design, sparse event-related design
  – Training & test based classifiers
    • single event based prediction
Fast event-related (& block + sparse ER) design
  – Parametric or non-parametric significance estimation of multi-dimensional distance (based on standard GLM results)

MVPA details
• Multivariate analysis: decoding ("mind reading")
• Classification of activity patterns:
  – Feature selection
  – Normalization
  – Choosing classification algorithm
  – Optimization, training
  – Test, performance estimation
• Validation of efficiency
  – Parametric model
  – Bootstrap, resampling
• Interpretation of results

[Figure labels: trials, voxels, classification algorithm]
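
The steps listed under "MVPA details" (feature selection, normalization, training, test) can be strung together in a few lines. What follows is a hedged sketch on synthetic data, not the pipeline used in the course: the nearest-class-mean classifier, the t-like voxel score, the synthetic "trials x voxels" generator, and all function names are assumptions chosen to keep the example self-contained. Note that feature selection and normalization are fitted on the training trials only, so the test trials stay independent.

```python
import random
from statistics import mean, stdev


def make_trials(n_per_class, n_voxels, informative, rng):
    """Synthetic trials x voxels patterns; only `informative` voxels differ between classes."""
    data, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_voxels)]
            if label == 1:
                for v in informative:
                    row[v] += 2.0
            data.append(row)
            labels.append(label)
    return data, labels


def t_like_score(a, b):
    """Absolute two-sample t-like score (Welch form) for one voxel."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se


def fit(train_x, train_y, n_keep):
    """Feature selection, normalization and nearest-class-mean training."""
    n_voxels = len(train_x[0])
    cols = [[row[v] for row in train_x] for v in range(n_voxels)]
    scores = []
    for v in range(n_voxels):
        a = [x for x, y in zip(cols[v], train_y) if y == 0]
        b = [x for x, y in zip(cols[v], train_y) if y == 1]
        scores.append(t_like_score(a, b))
    # Feature selection: keep the voxels with the highest univariate score
    keep = sorted(range(n_voxels), key=lambda v: scores[v], reverse=True)[:n_keep]
    # Normalization: z-score each kept voxel with training statistics
    mu = {v: mean(cols[v]) for v in keep}
    sd = {v: stdev(cols[v]) for v in keep}

    def transform(row):
        return [(row[v] - mu[v]) / sd[v] for v in keep]

    # Training: one mean pattern per class
    means = {}
    for label in (0, 1):
        rows = [transform(r) for r, y in zip(train_x, train_y) if y == label]
        means[label] = [mean(col) for col in zip(*rows)]
    return keep, transform, means


def predict(row, transform, means):
    """Assign the trial to the class with the closest mean pattern."""
    z = transform(row)
    dist = {c: sum((a - b) ** 2 for a, b in zip(z, m)) for c, m in means.items()}
    return min(dist, key=dist.get)


rng = random.Random(0)
informative = [3, 7, 11]  # hypothetical informative voxels
train_x, train_y = make_trials(20, 50, informative, rng)
test_x, test_y = make_trials(20, 50, informative, rng)

keep, transform, means = fit(train_x, train_y, n_keep=5)
hits = sum(predict(r, transform, means) == y for r, y in zip(test_x, test_y))
accuracy = hits / len(test_y)
print("selected voxels:", sorted(keep), "test accuracy:", accuracy)
```

Selecting voxels on the same trials that are later used for testing would inflate the accuracy, which is the classification counterpart of the independent-contrast requirement in ROI analysis.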
Feature selection
• Dimension (number of voxels) should be reduced
  – To exclude irrelevant and noisy voxels
• High dimension and small sample size undermine the classification algorithm's
  – Performance
  – Generalization capacity
• Methods:
  – VOI
  – Exclusion of noisy voxels (e.g. based on variance)
  – Voxelwise univariate statistics (ANOVA, t-test): ordering voxels
  – Combinatorial test of MVPA on groups of voxels
    • Full combinatorial, genetic algorithm etc.

Classifiers (supervised learning)
• Linear
  – Generative models (modeling conditional density functions): fast, non-iterative algorithms
    • Naive Bayes
    • Linear discriminant
    • Mahalanobis distance
  – Discriminative models (slow, iterative optimization)
    • Logistic regression
    • Linear SVM
• Non-linear (interpretation difficulties)
  – SVM
  – Multi-layer neural networks

Separability of the activity vectors
[Figure panels: Univariate separable; Linearly separable; Not linearly separable]

Fisher linear discriminant analysis
Maximize over $w$:
$$J_{\mathrm{Fisher}}(w) = \frac{w^{T} S_{B}\, w}{w^{T} S_{W}\, w}$$
The numerator is the between-class variance and the denominator is the within-class variance of the data projected onto $w$ ($S_{B}$ and $S_{W}$ denote the between-class and within-class scatter matrices).

Mahalanobis distance
• Classify according to distance from class mean
• Takes non-sphericity into account
[Figure: Class A, Class B and a test vector] The test vector belongs to
• Class A according to Euclidean distance
• Class B according to Mahalanobis distance
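
The Class A / Class B situation described for the Mahalanobis distance can be reproduced with two-dimensional toy numbers. This is a sketch under stated assumptions: the class means, the per-class covariance matrices and the test vector below are invented for illustration, and the covariances are taken as already estimated rather than computed from data.

```python
import math


def euclidean(x, mean):
    """Ordinary Euclidean distance between a vector and a class mean."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mean)))


def mahalanobis_2d(x, mean, cov):
    """sqrt((x - m)^T C^{-1} (x - m)) for a 2x2 covariance matrix C."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)


# Hypothetical classes: A is tight and spherical, B is elongated along the first axis.
classes = {
    "A": {"mean": (0.0, 0.0), "cov": ((0.25, 0.0), (0.0, 0.25))},
    "B": {"mean": (5.0, 0.0), "cov": ((9.0, 0.0), (0.0, 1.0))},
}
test_vector = (2.0, 0.0)

by_euclid = min(classes, key=lambda k: euclidean(test_vector, classes[k]["mean"]))
by_mahal = min(
    classes,
    key=lambda k: mahalanobis_2d(test_vector, classes[k]["mean"], classes[k]["cov"]),
)
print(by_euclid, by_mahal)  # prints: A B
```

The test vector is 2 units from the mean of A and 3 units from the mean of B, but in units of each class's own spread it is 4 standard deviations from A and only 1 from B, which is why the two distance measures disagree.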
Interpretation of the results
• Linear
  – In the scale-invariant case, the weights of the discriminator can inform about the importance of the individual voxels
  – Patterns can be interpreted and visualized
• Non-linear
  – Difficulties with decoding
  – Different combinations of dimensions (voxel subgroups) can be evaluated
• Interpretation of performance
  – Leave-one-out
  – Leave-some-out: training-test set
    • Average, variance
  – ROC curve
  – Resampling statistics

Leave-one-out
[Figure: data split into Training and Test sets]

ROC curve
• Hyperplane w is defined
• Move the threshold (bias) from min to max
[Figure: true positive rate plotted against false positive rate; curves labelled Good, Excellent and Chance level]

Validation: resampling
• Shuffling labels on the training set
• Measuring performance
• Repetition (~1000 times): bootstrap

Search-light classification, linear discriminant analysis
• At each voxel: 3×3 neighbourhood
• Leave-some-trials-out, 10×
• Average performance: 90% at maxima
[Figure: classific1]

ROI based SVM: parameter optimization
[Figure panels: roi_RSTSanterior_10, roi_fusiL_10, roi_LSTS_10, roi_RSTSposterior_10, tmap_facevsobj]
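
The leave-one-out scheme and the label-shuffling validation described above can be combined into one small example. This is a minimal sketch on synthetic data, with several assumptions: a nearest-class-mean classifier stands in for whichever classifier is actually used, the whole label vector is shuffled before each cross-validation run, only 300 repetitions are used instead of the ~1000 suggested above to keep the run short, and the function names are hypothetical.

```python
import random


def nearest_mean_predict(train_x, train_y, x):
    """Assign x to the class whose training mean is closest (squared Euclidean)."""
    best, best_d = None, float("inf")
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centre = [sum(col) / len(rows) for col in zip(*rows)]
        d = sum((a - b) ** 2 for a, b in zip(x, centre))
        if d < best_d:
            best, best_d = label, d
    return best


def loo_accuracy(data, labels):
    """Leave-one-out: train on all trials but one, test on the left-out trial."""
    hits = 0
    for i in range(len(data)):
        train_x = data[:i] + data[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += nearest_mean_predict(train_x, train_y, data[i]) == labels[i]
    return hits / len(data)


def permutation_p(data, labels, n_perm=300, seed=0):
    """Compare the observed accuracy with accuracies obtained on shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(data, labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys any true pattern-label relation
        if loo_accuracy(data, shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical data: 12 trials per class, 4 voxels, class 1 shifted by 1.5.
rng = random.Random(1)
data = [[rng.gauss(1.5 * label, 1.0) for _ in range(4)]
        for label in (0, 1) for _ in range(12)]
labels = [label for label in (0, 1) for _ in range(12)]

accuracy, p_value = permutation_p(data, labels)
print("leave-one-out accuracy:", accuracy, "permutation p-value:", p_value)
```

The shuffled runs give the distribution of accuracies expected by chance for this particular data set and classifier, so the observed accuracy is judged against that distribution rather than against a nominal 50%.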