Inference at Scale (STAT0043)

Key information

Faculty: Faculty of Mathematical and Physical Sciences
Teaching department: Statistical Science
Credit value: 15

Restrictions: Subject to the availability of places, this module is also offered as an elective to students specialising in other fields. Information on the academic prerequisites and registration procedure is available at: /statistics/current-students/modules-statistical-science-students-other-departments.

Alternative credit options

There are no alternative credit options available for this module.

Description

This module introduces several fundamental ways in which scalability plays a role in statistical data science: large data (in both the number of observations and the number of covariates) and large models (with inferential, engineering and computational implications). It is primarily intended for third and fourth year undergraduates and taught postgraduates registered on the degree programmes offered by the Department of Statistical Science (including the CSML programme). The academic prerequisites for these students (in addition to their compulsory modules) are STAT0041 and STAT0042 (UG), or one of COMP0078 / COMP0088 (PGT).

Intended Learning Outcomes

understand the statistical assumptions, pitfalls and possibilities in the analysis of high-dimensional data;
be able to scale up statistical inference and machine learning for large datasets;
be able to efficiently deploy methods for fitting and comparing complex models;
be able to lead and coordinate projects for heterogeneous and structured data;
have a deeper understanding of trade-offs between modelling flexibility and computational costs (Level 7 only).

Applications - Technological advances have brought new ways of generating data, as well as allowing for more complex models to be developed using improved computational resources. Students seeking to reach the forefront of data science must understand how such massive datasets and models can be manipulated effectively using advanced statistical methodology and large-scale algorithms. This module will allow students with previous exposure to statistical inference and machine learning to acquire further skills to handle data and models too complex to be approached by standard methods.

Indicative Content - Formalising the challenges of inference at scale through empirical risk minimisation, theoretical convergence rates and the computational complexity of algorithms. Dealing with large-scale samples (scale 1): gradient-based optimisation, stochastic optimisation and parallelism. High-dimensional statistics (scale 2): the curse of dimensionality, dealing with a large or infinite number of variables for problems of regression, and dimensionality reduction. Dealing with expensive problems (scale 3): Gaussian process regression, active learning and Bayesian optimisation.

Key Texts - Available from .

Module deliveries for 2024/25 academic year

Intended teaching term: Term 2 - Undergraduate (FHEQ Level 7)

Teaching and assessment

Mode of study: In person
Methods of assessment: 20% In-class activity, 80% Exam
Mark scheme: Numeric Marks

Other information

Number of students on module in previous year: 0
Module leader: Dr Francois-Xavier Briol
Who to contact for more information: stats.ugt@ucl.ac.uk

Intended teaching term: Term 2 - Postgraduate (FHEQ Level 7)

Teaching and assessment

Mode of study: In person
Methods of assessment: 20% In-class activity, 80% Exam
Mark scheme: Numeric Marks

Other information

Number of students on module in previous year: 14
Module leader: Dr Francois-Xavier Briol
Who to contact for more information: stats.ugt@ucl.ac.uk

Intended teaching term: Term 2 - Undergraduate (FHEQ Level 6)

Teaching and assessment

Mode of study: In person
Methods of assessment: 20% In-class activity, 80% Exam
Mark scheme: Numeric Marks

Other information

Number of students on module in previous year: 10
Module leader: Dr Francois-Xavier Briol
Who to contact for more information: stats.ugt@ucl.ac.uk

Last updated

This module description was last updated on 19th August 2024.
