Backing up a UNIX system

What is a backup?

A backup is a copy of data stored on a computer's disk. Backup copies are usually kept on magnetic tape.

Why it's good to back up disks

Making regular backups of the disks attached to a system is essential to safeguard the integrity of the service that the system provides.

If the information stored on the disks is lost and is irretrievable, the service is compromised.

Often the information stored on disks can be regenerated, but this is not always possible, so it cannot be relied upon to protect the service.

What should be backed up?

The information stored on disks falls loosely into two categories:

  • Data files (user data and user-compiled programs)
  • System files (system software, application software, system data)

The "Data files" on a system should be backed up to protect each user's work. If part of the "Data files" filestore is lost (for example, through disk failure), the users who had data in this area will be affected, but it should be possible to provide a service to all other users.

The "System files" should be backed up to protect the system as a whole. If all or part of the "System files" filestore is lost, then no service can be provided until these files are restored.

The frequency of backups depends upon:

  • how often files change
  • value of data
  • nature of the service provided
  • operational resources (for example, staff, money)

How to devise a suitable 'backup strategy' is outlined later in this document.

Backup media

Before a backup strategy can be devised, a survey should be carried out to see what backup devices are available to the system, and if there are none suitable, what devices could be acquired.

The most common backup media/devices are:

  • 8mm cartridge
    Often called Exabyte (after the Exabyte Corporation). Drives are either 2Gb or 5Gb. The 5Gb drives can read 2Gb tapes.
  • 1/4" cartridge
    Older tapes of this type (QIC 24, DC 600A) can store up to 60Mb. Newer ones (QIC 150, DC 6150) hold up to 150Mb. Higher capacity drives are available but less common.
  • DAT
    The most common DAT drive/tape type is 2Gb (90m). Higher capacity drives/tapes are available.

Normal filesystem backups require a high capacity medium. For this purpose, either an 8mm drive or a DAT drive will be suitable.

The capacity of 1/4" cartridge drives is too low for them to be considered for backing up filesystems. They are normally used for software distribution, file transfer, and backing up files for individual users.

Often the system to be backed up will have a tape device connected to it. If not, there will usually be a suitable drive available on the local network (preferably on the same sub-net).

If there is no suitable drive available, then one should be purchased.

Information Services can advise on locating or purchasing a drive.

Backup commands

There are a number of backup commands available on Unix systems. Some are standard (that is, available on all systems) and some are specific to a particular manufacturer. Backup commands can be divided into two loose categories:

  • those suited to backing up individual files, or groups of files;
  • those more suitable for backing up whole filesystems (partitions).

The following table lists the most common backup/recovery utilities:

Command name(s)       Small groups of files     Filesystems               Availability
tar                   Yes                       Possible, but not ideal   All systems
cpio                  Yes                       Possible, but not ideal   All systems
dump/restore          Possible, but not ideal   Yes                       All systems
bru                   Possible, but not ideal   Yes                       IRIX
ufsdump/ufsrestore    Possible, but not ideal   Yes                       Solaris 2

Some systems provide graphical backup/recovery utilities. These may be more convenient to use for backing up small amounts of data, or personal workstations.

It's beyond the scope of this document to describe each of these commands. However, please see the sample backup scripts for examples of how to use dump.

Backup strategy

Once a suitable tape drive has been made available, an appropriate backup strategy can be devised. This strategy will depend upon local requirements and resources and so some thought should be given to the following:

  • How often do users' files change?
  • How critical is the integrity of the data?
  • What human and material resources are available to maintain a backup strategy?

It would be impossible to cover all possible strategies here, so a number of 'typical' strategies are presented.

Case One

Scenario

A single workstation ("wallace") with no backup device directly attached. There is a 2Gb 8mm tape drive attached to a local system ("gromit").

Wallace has one 2Gb disk drive which contains all of the system and user filestore.

The machine is used by a group of three users. These users have been consulted regarding their file usage.

Diagram of network connections and network resource for scenario 1

Possible backup strategy

Since the total amount of filestore that could possibly be backed up will not exceed 2Gb, it's feasible to do only full backups onto each backup tape. The users are happy with weekly backups and say that Fridays would be the most convenient day for this.

Data that is more than one month old is of no use on Wallace, so there is no point keeping tapes older than this. These tapes can therefore be recycled.

So, to implement this strategy, four 8mm tapes are needed. These should be labelled:
Wallace: Week A
Wallace: Week B
Wallace: Week C
Wallace: Week D

Recycling tapes means that in the week following Week D, the Week A tape is reused, and so on.
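The rotation can be sketched in a few lines of Python. This is only an illustration of the four-tape cycle described above; the function name and the zero-based week counter are assumptions made for the example and are not part of the backup.ex1 script.

```python
# Tape labels for the four-week rotation described above.
LABELS = ["Week A", "Week B", "Week C", "Week D"]


def tape_for_week(week_index, host="Wallace"):
    """Return the tape label for a zero-based week number.

    Week 0 is the first Friday of the rotation. The labels repeat
    every four weeks, so week 4 reuses the Week A tape.
    """
    return f"{host}: {LABELS[week_index % len(LABELS)]}"


# Show the first six Fridays; the fifth and sixth reuse tapes A and B.
for week in range(6):
    print(week, tape_for_week(week))
```

Counting weeks from a fixed start keeps the choice of tape independent of who performs the backup on a given Friday.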

Because the backup drive is on another system (Gromit), it would be best not to back up Wallace during the day. A good time to start the backup would be last thing in the evening. This will avoid loading the local network during working hours.

It is the responsibility of one person to do the backups, but they can delegate it to any of the three users of the system.

It is advisable to replace the tapes at least every 12 months (earlier, if they show signs of wear).

An example Unix Shell Script, backup.ex1 (ex1, 1.4 KB), is available to make this backup easier.

Case Two

Scenario

A server, named Preston, serves user filestore and application software to three satellite systems: Sean, Wendolene and Feathers. Preston has two 4Gb user partitions, /users1 and /users2, as well as a 4Gb application software partition (/app). It also has a 100Mb root partition and an 800Mb /usr partition.

Preston has a 2Gb DAT drive and an external QIC 150 cartridge drive.

The satellite systems each have their own system disk, which contains just one partition, the root. Each is 1Gb in size. Each mounts (using NFS) the /users1, /users2 and /app partitions from Preston.

The user base is some 40 users, and file activity can be regarded as high.

The system is looked after by a member of staff who is responsible for maintaining a backup strategy.

In terms of file update activity, a list of partitions starting with the busiest would be:
preston:/users1
preston:/users2
preston:/usr
preston:/
sean:/
wendolene:/
feathers:/
preston:/app

Clearly, the users' data changes most often so it's vital that this is backed up as often as possible.

Also, Preston acts as NIS server for the three clients, serving, amongst other things, /etc/passwd files for each of them.

Diagram of network connections and network resource for scenario 2

Possible backup strategy

Because of the large amount of data that needs to be backed up and the limited capacity of the backup media, full backups are really only feasible monthly (or every 4 weeks). Furthermore, it's probably only worth doing monthly full backups of:
preston:/users1
preston:/users2
preston:/usr
preston:/

The maximum total amount of data that could be in these partitions is 4+4+0.1+0.8=8.9Gb. So, potentially this could use up to five 2Gb tapes.
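The arithmetic above can be checked with a short Python sketch. It assumes the worst case in which every partition is completely full, that a backup can continue from one tape onto the next, and that no compression is used; the function name is made up for the example.

```python
import math


def tapes_needed(partition_sizes_gb, tape_capacity_gb):
    """Worst-case tape count if every partition is completely full."""
    total = sum(partition_sizes_gb)
    return math.ceil(total / tape_capacity_gb)


# /users1, /users2, / and /usr on Preston, sizes in Gb.
preston_full = [4, 4, 0.1, 0.8]

# 8.9Gb of data on 2Gb DAT tapes rounds up to five tapes.
print(tapes_needed(preston_full, 2))
```

In practice the partitions are rarely full, so the real number of tapes used is often lower than this upper bound.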

Once a week, in the intervening weeks, an 'incremental' backup should be done for each of the above partitions. Incremental backups only save those files that have changed since the previous full backup.
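With dump, the difference between a full and an incremental backup is expressed as a dump level. The Python sketch below only builds the command line and does not run it. It assumes the traditional key-letter form of dump (level digit, then u and f); the tape device name is a hypothetical placeholder, and the exact syntax should be checked against the local manual page.

```python
def dump_command(level, tape_device, filesystem):
    """Build an argument list for dump using key-letter syntax.

    Level 0 requests a full dump. A higher level saves only files changed
    since the most recent dump at a lower level. The 'u' key records the
    dump in /etc/dumpdates and 'f' names the output device.
    """
    if not 0 <= level <= 9:
        raise ValueError("dump levels run from 0 to 9")
    return ["dump", f"{level}uf", tape_device, filesystem]


# Hypothetical no-rewind tape device name; substitute the local one.
TAPE = "/dev/nrst0"

full = dump_command(0, TAPE, "/users1")         # monthly full backup
incremental = dump_command(1, TAPE, "/users1")  # weekly incremental

print(" ".join(full))
print(" ".join(incremental))
```

Using level 1 for every weekly incremental means each one is taken relative to the last level 0 dump, which matches the definition of an incremental backup given above.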

The other regions:
sean:/
wendolene:/
feathers:/
preston:/app

need only be backed up when they change (i.e. when software is added or modified on /app, and/or when files are changed on the root partitions of the satellite systems). However, make sure there are at least two backups of these regions available at any one time.

This strategy would require at least twenty eight 2Gb DAT tapes in total. These would be labelled:
Preston: Full Month A (tapes 1, 2, 3, 4 and 5)
Preston: Full Month B (tapes 1, 2, 3, 4 and 5)
Preston: Full Month C (tapes 1, 2, 3, 4 and 5)
Preston: Incr 1
Preston: Incr 2
Preston: Incr 3
Sean: Full 1
Sean: Full 2
Wendolene: Full 1
Wendolene: Full 2
Feathers: Full 1
Feathers: Full 2
Preston: App Full 1 (tapes 1 and 2)
Preston: App Full 2 (tapes 1 and 2)
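The total of twenty-eight tapes can be confirmed by adding up the tape sets in the list above. The following Python sketch is just that tally; the variable names are chosen for the example.

```python
# Tape sets from the label list above: (label, tapes in the set).
TAPE_SETS = [
    ("Preston: Full Month A", 5),
    ("Preston: Full Month B", 5),
    ("Preston: Full Month C", 5),
    ("Preston: Incr 1", 1),
    ("Preston: Incr 2", 1),
    ("Preston: Incr 3", 1),
    ("Sean: Full 1", 1),
    ("Sean: Full 2", 1),
    ("Wendolene: Full 1", 1),
    ("Wendolene: Full 2", 1),
    ("Feathers: Full 1", 1),
    ("Feathers: Full 2", 1),
    ("Preston: App Full 1", 2),
    ("Preston: App Full 2", 2),
]

total_tapes = sum(count for _, count in TAPE_SETS)
print(total_tapes)  # 28
```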

A typical 12-week cycle (with backups on a Thursday) would be:
1st Thursday: "Preston: Full Month A" backup
2nd Thursday: "Preston: Incr 1" backup
3rd Thursday: "Preston: Incr 2" backup
4th Thursday: "Preston: Incr 3" backup
5th Thursday: "Preston: Full Month B" backup
6th Thursday: "Preston: Incr 1" backup
7th Thursday: "Preston: Incr 2" backup
8th Thursday: "Preston: Incr 3" backup
9th Thursday: "Preston: Full Month C" backup
10th Thursday: "Preston: Incr 1" backup
11th Thursday: "Preston: Incr 2" backup
12th Thursday: "Preston: Incr 3" backup

13th Thursday: "Preston: Full Month A" backup
etc...
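The cycle above follows a simple pattern: every fourth Thursday is a full backup, and the full backup tape sets rotate through A, B and C. A Python sketch of that pattern is shown below; the function name is an assumption made for the example, and this is not the backup.ex2 script.

```python
MONTHS = ["A", "B", "C"]


def preston_tape(thursday):
    """Return the tape set for the nth Thursday (counting from 1)."""
    if thursday < 1:
        raise ValueError("Thursdays are counted from 1")
    week = (thursday - 1) % 12       # position within the 12-week cycle
    month, offset = divmod(week, 4)  # four-week block and week within it
    if offset == 0:
        return f"Preston: Full Month {MONTHS[month]}"
    return f"Preston: Incr {offset}"


# Print the schedule for one complete cycle plus the first repeat.
for n in range(1, 14):
    print(n, preston_tape(n))
```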

The tapes for each of the other systems and the application software partition would be used alternately as updates are made.

An example Unix Shell Script (for the main 12-week cycle), backup.ex2 (ex2, 1.8 KB), is available to make this backup easier.

Case Three

Scenario

A large server, named Morph, with about 200 users registered on it. It has 4 user partitions:
/u1 2Gb
/u2 2Gb
/u3 2Gb
/u4 2Gb

In addition to this there are a number of system/application software partitions:
/ 200Mb
/usr 1.6Gb
/free 2Gb
/soft 4Gb

The /soft region contains application software which is added to and modified on a regular basis. The /free region has no disk quotas and is used for temporary filestore.

The machine has an average of 30 users logged in at any one time and user file activity is regarded as high.

The machine has a 5Gb 8mm drive connected. The system is looked after by a member of staff who is responsible for maintaining a backup strategy. This person has a colleague who can attend to backups if required.

Because of the nature of the service provided by this system, it is essential that no user should expect to lose more than a day's work if a disk fails.

Diagram of network connections and network resource for scenario 3

Possible backup strategy

The total capacity of all the user partitions on this system is quite large, and it's possible that 2 full tapes may be needed to do a full backup of these partitions. So it seems that a suitable backup strategy would involve daily incremental backups, and weekly full backups, of user data. Also, since / and /usr change on a daily basis (/usr contains daily accounting records on this system) it would be useful to do the same for these areas.

However, the /soft area is updated frequently, and should be backed up at least every week. If this partition is to be backed up in full along with the full backups of the user data, then around 3 tapes would need to be lined up for each weekly full backup. This would not be practical; a full backup could take a whole day, with regular "human operator" intervention.

It would be better if a weekly full backup of the user data, / and /usr could be accompanied by an incremental backup of /soft. A full backup of /soft could be made once every four weeks.

In order that files deleted more than 4 weeks ago are still available on backup, the last full backup of the cycle (i.e. the one containing a full backup of /soft) should be taken out of the cycle and kept indefinitely. This tape should be labelled Archive No. n where n is a sequence number starting from 1.

The following table outlines the strategy:

Backup                /u1   /u2   /u3   /u4   /     /usr  /free  /soft  Max. tapes
Size (Gb)             2     2     2     2     0.2   1.6   2      4
Full (Week A)         F     F     F     F     F     F     I      I      3
Full (Week B)         F     F     F     F     F     F     I      I      3
Full (Week C)         F     F     F     F     F     F     I      I      3
Full (Archive)        F     F     F     F     F     F     F      F      4
Incremental (Mon)     I     I     I     I     I     I     -      -      2
Incremental (Tue)     I     I     I     I     I     I     -      -      2
Incremental (Thu)     I     I     I     I     I     I     -      -      2
Incremental (Fri)     I     I     I     I     I     I     -      -      2

Key:
F = Full backup
I = Incremental backup
- = Not backed up

The whole strategy will require up to 21 5Gb tapes, with 4 new tapes needed every month to replace the Archive backup tapes. These tapes would be labelled as follows:
Morph: Full (Week A) (tapes 1, 2 and 3)
Morph: Full (Week B) (tapes 1, 2 and 3)
Morph: Full (Week C) (tapes 1, 2 and 3)
Morph: Archive # (Week D) (tapes 1, 2, 3 and 4)
Morph: Incr (Monday) (tapes 1 and 2)
Morph: Incr (Tuesday) (tapes 1 and 2)
Morph: Incr (Thursday) (tapes 1 and 2)
Morph: Incr (Friday) (tapes 1 and 2)

The above assumes that the full backups are done on a Wednesday.
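The figure of 21 tapes comes from adding up the "Max. tapes" column of the strategy table. The Python sketch below repeats that sum and also estimates how many tapes will have been bought after a number of completed four-week cycles, given that each filed Archive set must be replaced. The names used are assumptions made for the example.

```python
# "Max. tapes" for each backup in the strategy table.
MAX_TAPES = {
    "Full (Week A)": 3,
    "Full (Week B)": 3,
    "Full (Week C)": 3,
    "Archive (Week D)": 4,
    "Incr (Monday)": 2,
    "Incr (Tuesday)": 2,
    "Incr (Thursday)": 2,
    "Incr (Friday)": 2,
}

initial_stock = sum(MAX_TAPES.values())
archive_tapes_per_cycle = MAX_TAPES["Archive (Week D)"]


def tapes_bought(completed_cycles):
    """Upper bound on tapes bought after this many Archive sets are filed."""
    return initial_stock + archive_tapes_per_cycle * completed_cycles


print(initial_stock)     # 21
print(tapes_bought(12))  # roughly one year of four-week cycles
```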

The full backups are likely to take a long while, so it would be best to start them in the morning. The incremental backups may only require one tape, so they could be started at the end of the day.

The first 4 weeks of the backup cycle would be:
Wed: "Morph: Full (Week A)" backup
Thu: "Morph: Incr (Thursday)" backup
Fri: "Morph: Incr (Friday)" backup
Mon: "Morph: Incr (Monday)" backup
Tue: "Morph: Incr (Tuesday)" backup
Wed: "Morph: Full (Week B)" backup
Thu: "Morph: Incr (Thursday)" backup
..etc..
Wed: "Morph: Full (Week C)" backup
..etc..
Wed: "Morph: Archive 1 (Week D)" backup
..etc..
Wed: "Morph: Full (Week A)" backup
..and so on..

When the Morph: Archive 1 (Week D) backup is finished, the tapes should be filed away, and new tapes should be labelled Morph: Archive 2 (Week D) ready for the next Week D backup.
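The whole cycle, including the increasing Archive number, can be derived from the calendar date. The Python sketch below is one way of doing so and is not the backup.ex3 script. It assumes that the cycle starts on the Wednesday of the first Week A backup and that no backups run at weekends, which is inferred from the strategy table; the function name and the example dates are made up.

```python
from datetime import date

WEEK_SETS = ["Full (Week A)", "Full (Week B)", "Full (Week C)"]
# date.weekday() numbers Monday as 0; Wednesday (2) is the full backup day.
INCR_DAYS = {0: "Monday", 1: "Tuesday", 3: "Thursday", 4: "Friday"}


def morph_tape(day, cycle_start):
    """Return the tape label for `day`, or None at weekends.

    cycle_start must be the Wednesday of the first Week A full backup.
    """
    if cycle_start.weekday() != 2:
        raise ValueError("cycle_start must be a Wednesday")
    days = (day - cycle_start).days
    if days < 0:
        raise ValueError("day is before the start of the cycle")
    weekday = day.weekday()
    if weekday in INCR_DAYS:
        return f"Morph: Incr ({INCR_DAYS[weekday]})"
    if weekday != 2:
        return None  # Saturday or Sunday: no backup scheduled
    cycle, position = divmod(days // 7, 4)
    if position == 3:
        # Fourth Wednesday: archive set, numbered from 1 and never reused.
        return f"Morph: Archive {cycle + 1} (Week D)"
    return f"Morph: {WEEK_SETS[position]}"


start = date(2024, 1, 3)  # an arbitrary Wednesday used for illustration
print(morph_tape(date(2024, 1, 24), start))
```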

An example Unix Shell Script, backup.ex3 (ex3, 4.7 KB), is available to make this backup easier.

Who does the backup?

Often this is the most crucial issue in establishing and maintaining an appropriate backup strategy. If no-one takes responsibility for doing the backups, then the best of strategies will soon break down.

For any strategy to work, at least one person must take charge. However, it is useful to have at least one other person who can step in to do the backups if the primary person is away.

In cases where a number of people are willing and able to do backups, it would be useful to keep a 'backups diary' or 'backups log' recording when each backup was done and who did it. However, it's still important that one person takes overall responsibility.

Tape storage/access

There are a number of issues that have to be addressed when deciding where to store tapes:

  • Convenience
  • Security
  • Safety

Convenience

The tapes have to be easy to get at (for authorised personnel). If the tapes are locked away in someone's desk, then others may have difficulty getting access to them to do backups and/or retrieve files.

Security

The tapes should be stored in a place that only authorised personnel have access to. If tapes are stolen or damaged (especially if this is not immediately obvious) then the service provided by the computer system could be at risk.

Safety

As well as protecting the tapes from theft or unauthorised access, they should be protected from:

  • Fire
  • Extreme temperature (inc. direct sunlight)
  • Smoke
  • Extreme Humidity
  • Dust
  • Liquids
  • Breakage

Humidity and extreme temperature are two hazards that, if they are to be avoided, require an air-conditioned environment. This is rarely available to the 'small-scale' user.

Fire and smoke are also difficult to guard against: fire-proof safes are expensive, and so are not a feasible option for most users.

Other hazards are easier to guard against. The use of a suitable tape rack or box usually suffices.

Tape storage

Having considered all of these issues, suitable storage should be provided for the tapes. Some examples are:

  • In a fire safe in the computer room
  • On a rack in the computer room near the computer
  • On a rack in the System Administrator's office
  • On a shelf in the office with the computer

Obviously, some of these are more desirable solutions than others. In all cases, however, it's best to keep one set of 'full' backup tapes elsewhere, preferably in another building. This should provide a failsafe backup system, even if a fire safe, or other secure storage, is not available. Which tapes are kept 'off-site' depends on the backup strategy employed, but it's wise to keep them fairly up to date.

Possible off-site locations might be:

  • The computer room in another department
  • A colleague's office (in another building)
  • At home

In all cases, the location must be secure and accessible.

File recovery

In general, tape backups are made for two reasons:

  1. To guard against data loss due to disk failure (or other system failure), and/or
  2. To be able to retrieve files that were accidentally deleted or become corrupted.

The ability to rebuild files after a system failure is essential, so the priority for the System Administrator is to 'guard against data loss'. The second item, 'to be able to retrieve files that were accidentally deleted or become corrupted', can be regarded as a service to users, and so those who run the system may decide that there are not sufficient resources (that is, people-time) to provide such a service.

The actual procedure for retrieving files from tapes depends upon the command that was used to put them onto the tape. Consult either the online manuals or the reference guides for a description of each command. Alternatively, seek advice (see below).

What Information Services can offer in the way of help/consultancy

Developing a backup strategy can be a daunting task, especially for those unfamiliar with Unix systems management. Here's what Information Services can offer in the way of help and consultancy:

  • help with deciding what to back up, and when
  • help preparing the necessary software tools
  • advice on backup media/drives
  • advice on tape storage

To arrange for someone to call and discuss these issues please contact the Help Desk.

Further information

Documents available on this subject:

  • System and Network Administration Guide (Solaris 1)
  • Solaris System Administration Guide Volume 1 (Solaris 2)
  • IRIX Advanced Site and Server Administration Guide
  • tar
  • cpio
  • dump
  • restore
  • ufsdump
  • ufsrestore

All of these documents are available online.
