Unit name | Fault Tolerant Computing and VLSI Testing |
---|---|
Unit code | COMSM0125 |
Credit points | 10 |
Level of study | M/7 |
Teaching block(s) | Teaching Block 1 (weeks 1 - 12) |
Unit director | Professor Pradhan |
Open unit status | Not open |
Pre-requisites | COMS11300 and COMSM1201 |
Co-requisites | None |
School/department | Department of Computer Science |
Faculty | Faculty of Engineering |
This course is broadly divided into two parts. Part one discusses the factors that cause system failures, such as hardware defects, faults, noise, design errors and software bugs. It then presents a wide range of techniques for discovering defects, design errors and faults, together with design methods that enhance reliability, availability and serviceability in microchips, computer systems and networks. This part also covers models for evaluating the effectiveness of design techniques, weighing improvements in reliability and availability against costs in chip area, system complexity and power dissipation. Part two introduces the concepts of error-correcting codes in memory and communication. Microchip test techniques, including on-line testing and built-in self-test, are also reviewed.
Aims:
This unit seeks to acquaint you with various aspects of designing reliable and testable computer systems. Topics covered span issues at the microchip level as well as at the board and system level.
Successful completion of this unit will enable you to: understand why microchips fail; apply test algorithms for a wide range of faults, including delay faults; understand reliability models of microchips; design for testability; and apply fault-tolerant computing techniques.
Lectures (20). A further 80 hours are set aside for coursework and private study.
Coursework will consist of two parts: a set of take-home assignments worth 70% and two lab assignments worth 30%.