Unit information: Introduction to Data Analytics in 2021/22

Unit name Introduction to Data Analytics
Unit code COMSM0089
Credit points 10
Level of study M/7
Teaching block(s) Teaching Block 2 (weeks 13 - 24)
Unit director Dr. Simpson
Open unit status Not open

EMATM0048 (SDPA) or EMATM0061 (SCEM)



School/department Department of Computer Science
Faculty Faculty of Engineering

Description including Unit Aims

The sheer volume and complexity of available digital data mean that traditional manual techniques and stand-alone applications for data analysis are very often no longer sufficient to process and analyse such data and provide useful information. The vast volumes of digital data in the form of human-readable natural-language text (e.g. in English, Spanish or Chinese) enable large-scale language analysis techniques that are heavily rooted in statistics and machine learning. For example, the availability of large-scale sources of text data, such as those found on social media websites, opens up new opportunities for estimating the sentiment or opinions of large groups of people. At the same time, making sense of digital data is often possible only when it is distilled and displayed via an appropriate visualisation technique, and contemporary visualisation techniques also often rely on machine learning and statistical methods. This unit gives students a grounding in the fundamentals of both visual analytics and text analytics: the science of information visualisation (primarily concerned with the way that data is represented visually), and the science of extracting useful information from bodies of natural-language text.

Information visualisation topics covered by this unit include: data types and their representations, non-vectorial data, human requirements for visual analytics, scientific visualisation, visualisation quality metrics, Shneiderman’s mantra (overview first, zoom and filter, details on demand), and practical visualisation tools.

Text analytics topics covered by this unit include methods for unsupervised and supervised text mining, including text pre-processing, structured data extraction, clustering of documents, classification of documents, and sentiment analysis using different techniques.
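
As a rough illustration of what text pre-processing and a very simple form of sentiment analysis can look like, the following is a minimal Python sketch using only the standard library. It is not taken from the unit's teaching materials: the function names, the stop-word list and the positive/negative word lists are all hypothetical and deliberately tiny, and a lexicon count is only one of many possible sentiment techniques.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "but", "it", "this"}
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def preprocess(text):
    """Lower-case the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(tokens):
    """Count how often each token occurs (a bag-of-words representation)."""
    return Counter(tokens)


def sentiment_score(tokens):
    """Positive-word count minus negative-word count for one document."""
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos - neg


if __name__ == "__main__":
    docs = [
        "The lectures were great and the labs were excellent.",
        "This was a terrible, terrible experience.",
    ]
    for doc in docs:
        tokens = preprocess(doc)
        print(tokens, term_frequencies(tokens).most_common(2), sentiment_score(tokens))
```

A score above zero suggests a positive document and a score below zero a negative one; supervised classifiers trained on labelled examples would normally replace the fixed word lists.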

Intended Learning Outcomes

Students will be able to

  1. Select and employ appropriate techniques for structured data extraction and text pre-processing.
  2. Write programs and deploy library code for various techniques for statistical text analysis.
  3. Define and apply the principles of information visualisation.
  4. Analyse the design of visual representations of data in terms of human perception and cognition.

Teaching Information

Problem-based learning combining lecture elements with practical individual work.

Assessment Information

90% Coursework: students will develop a system for automated gathering and analysis of a substantial text corpus and write a report on their findings (ILOs 1, 2). The report should include appropriate use of visualisations, with an accompanying rationale/commentary on why the chosen methods were selected.

10% In-class tests.


If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. COMSM0089).

How much time the unit requires
Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take you 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the Faculty workload statement relating to this unit for more information.

The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. If you have self-certificated your absence from an assessment, you will normally be required to complete it the next time it runs (this is usually in the next assessment period).
The Board of Examiners will take into account any extenuating circumstances and operates within the Regulations and Code of Practice for Taught Programmes.