Multivariate Analysis 34
To present various aspects of multivariate analysis, covering data exploration, modelling and inference.
Multivariate analysis is the branch of statistics concerned with objects on each of which the values of several variables are observed. A wide range of methods is used for the analysis of multivariate data, both unstructured and structured, and this course will give a view of the variety of methods available, as well as going into some of them in detail.
Interpretation of results will be emphasized as well as the underlying theory.
Multivariate techniques are used across the whole range of fields of statistical application: in medicine, physical and biological sciences, economics and social science, and of course in many industrial and commercial applications.
Relation to other units
As with units MATH30013 (Linear and Generalised Linear Models) and MATH33800 (Time Series Analysis), this course is concerned with developing statistical methodology for a particular class of problems.
Applications will be implemented and presented using the statistical computing environment R (used in Probability 1 and Statistics 1).
To gain an understanding of:
- Scope of multivariate analysis;
- Multivariate normal distribution and Wishart distribution;
- Statistical inference for multivariate normal data;
- Principal components analysis;
- Scaling, classification and clustering.
And further to
- gain experience of how the various methods are applied, and results interpreted, in practice;
- develop the ability to implement methods computationally (with R or suitable software);
- develop the ability to evaluate the suitability of, and compare, different methods in practice.
Self-assessment by working through examples sheets and using the solutions provided.
- General introduction to multivariate data; revision of relevant matrix and linear algebra; linear transformations.
- Properties and decompositions of non-negative definite symmetric matrices.
- Principal components analysis; derivation of principal components as eigenvectors of covariance matrix; selection of a good low-dimensional representation; interpreting principal components; scaling problems.
- The multivariate Normal distribution: definition and properties. The standardised multivariate Normal distribution. Statistical inference for the mean of a multivariate Normal distribution when the variance matrix is known, and when it is estimated. Hotelling's T-squared statistic.
- Linear discriminant analysis; maximum likelihood and Bayesian allocation rules; probability of misclassification and its relation to Mahalanobis distance.
- Classification using cluster analysis; similarity, dissimilarity and distance measures; agglomerative algorithms for clustering; single linkage and the minimum spanning tree; complete linkage; dendrograms.
- Multidimensional scaling. Classical scaling; recovering a data matrix from Euclidean distances. Relationship with principal components. Ordinal scaling; defining and minimising the stress function. Least squares monotone regression and the pool-adjacent-violators algorithm.
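As an illustration of the last syllabus item, the following is a minimal Python sketch of least squares monotone regression by the pool-adjacent-violators algorithm. It is not taken from the course materials: the function name `pava`, the optional weights argument and the example values are assumptions made for illustration. In ordinal scaling, the input would be the configuration distances listed in order of the observed dissimilarities, and the output the fitted values that enter the stress function.

```python
def pava(y, w=None):
    """Least squares non-decreasing fit to y (hypothetical helper, optional weights w)."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while their means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand each block mean back to one fitted value per original point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# The violating pair (3, 2) is pooled to its mean 2.5.
print(pava([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

Pooling adjacent violators and replacing them by their weighted mean is what makes the result the least squares fit among all non-decreasing sequences; each point is merged at most once, so the pass is linear in the number of points.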
Reading and References
There is no one set text. Any one of the following will be useful, particularly the first one (from which the notation for the course is taken):
- K V Mardia, J T Kent and J Bibby, Multivariate Analysis, Academic Press, 1979.
- W J Krzanowski, Principles of Multivariate Analysis: A User's Perspective. Clarendon Press, 1988.
- C Chatfield and A J Collins, Introduction to Multivariate Analysis. Chapman and Hall, 1986.
- W J Krzanowski and F H C Marriott, Multivariate Analysis, Parts I and II. Edward Arnold, 1994.
MATH11300 Probability 1, MATH11400 Statistics 1 and MATH11005 Linear Algebra & Geometry. See also Assessment Methods, below.
Methods of teaching
Lectures (including both theory and illustrative applications), exercises to be done by students.
The pass mark for this unit is 50.
The final mark is calculated as follows:
- 100% Exam
NOTE: Calculators of an approved type (non-programmable, no text facility) are allowed.
For information on resit arrangements, please see the resit page on the intranet.
Further exam information can be found on the Maths Intranet.