|
Mathematics for Data Scientists
|
- Distribution of random variables, conditional probability and independence, distributions of functions of random variables, limiting distributions.
- Differentiation and integration of functions; basic matrix operations; linearization; linear and nonlinear optimization techniques; clustering and similarity measures; basic computational algorithms. Concepts are frequently illustrated using mathematical computation tools.
| |
|
Data Science Fundamentals |
- This module teaches students how data science is performed in academia and industry (via invited talks), covers research methods and how different research strategies are applied across disciplines, and introduces data science techniques for processing and analysing data.
|
|
Introduction to Data Mining and Analytics |
- Overview of the field of data mining and analytics; large-scale file systems and MapReduce, measures of similarity, link analysis, frequent item sets, clustering, e-advertising as an application, and recommendation systems.
- The course has a special focus on statistical model building for data mining in marketing, sales and finance.
| |
|
|
- This module will provide students with up-to-date information on current applications of data in both industry and research. The module will build on Fundamentals of Data by explaining how data is processed and applied at large scale across a variety of areas.
| |
|
Large-Scale Data Storage Systems |
- The design and operation of large-scale, cloud-based systems for storing data. Topics include operating system virtualization, distributed network storage, distributed computing, cloud models (IaaS, PaaS, and SaaS), and techniques for securing cloud and virtual systems.
|
|
Programming for Data Science |
- This module aims to provide students with the programming skills needed to statistically process and explore disparate datasets using R, and to become confident in using this language to create and analyse variables in order to discover patterns and relationships through visualisation, testing and modelling.
- It also aims to give students experience in using object-oriented programming concepts and principles to read in data from both local files and databases, merge it using record-reconciliation techniques, and output the result into a single file for processing; this will be taught using the object-oriented programming language Java.
- Teaching both Java and R is essential here, as the former is well suited to handling data via the creation of bespoke data objects, while the latter is well suited to statistically assessing data.
| |
|
|
- The module introduces time series and causal forecasting methods so that students who pass will be able to prepare methodologically sound, clear and concise reports for clients.
- By the end of the course, students should be able to build causal and time series models, assess their accuracy and robustness, and apply them in a real-world problem domain.
| |
|
|
- The theory and practice of visualizing large, complicated data sets to clarify areas of emphasis. Human factors best practices will be presented.
- Programming with advanced visualization frameworks and practices will be demonstrated and used in group programming projects.
| |
|
|
- To complete the MS degree, each student must undertake a project worth 60 credits. This is a project chosen by you to investigate a challenging but constrained Data Science problem.
- The project will integrate the subject knowledge and generic skills that you will acquire during your Master's. We offer a wide range of projects, and each student is normally allocated a different project.
- We take student preferences and capabilities into account when we allocate the projects.
- Students will also have the opportunity to propose their own project, subject to academic approval.
|
|
| |