Update: Storage Area Network (SAN) issues affecting services


[31 January 2014]

Update, 31 January: Following the configuration changes outlined below, we are confident that Blackboard, SITS, PIMS, RXWorks and Questionmark Perception are now performing normally and that all services are in a stable position.

However, additional work is still needed on some services to improve their configuration and enhance performance.

Some services on the old storage service are not working as well as we would like, and we will also be working to improve those. These include the Datahub and the SFX! eJournal search.

The migration of content from the old server storage service to the new one is currently on hold as we implement the new configuration changes and improve the performance of the old server storage. We will continue with the migration and communicate those plans in the near future.

IT Services would like to apologise for the inconvenience these problems have caused.


Update 11:10, 31 January: The work done this morning affecting PIMS and other services has been completed successfully.


Update 09:00, 30 January: The work affecting SITS and other services has been completed and they should be available soon.  We will be monitoring performance closely to see if the changes have had a positive impact.


Update 14:30, 28 January: Following the work this morning to improve SITS performance, it is proposed that the additional work to move the database files will take place on Thursday 30 January between 06:00 and 09:00. During this time a number of services will be unavailable. Please see the news item for details.

The work already done to improve Blackboard and SITS also needs to be applied to other services on the new SAN, including PIMS. This work is planned for Friday 31 January between 08:00 and 08:30. During that time a number of services will be unavailable. Please see the news item for details.


Update 08:50, 28 January: Work was completed successfully this morning to improve the performance of SITS. Changes were made to the management of the archive database files. We will be closely monitoring the performance of the service. Additional work to move all database files was not possible because of the time the copy was taking, and we may look to complete this work in the near future.


Update 08:50, 27 January: The work on Blackboard was completed successfully and we have been monitoring the service, which is performing well. We will continue to monitor the service closely as we begin teaching activity.


Update 16:20, 24 January: The performance issue with the Storage Area Network (SAN) continues to impact a number of services including SITS and Blackboard.

IT Services is continuing to work with its supplier to fully resolve the problem.

The next steps include:

  • Short-term changes to Blackboard to improve its performance in readiness for the start of teaching block 2 on Monday. This will require a 20-minute Blackboard downtime, which is scheduled for 5:10pm today.
  • More significant changes to SITS to resolve its current performance issue. This will require up to 3 hours of downtime, which is scheduled to start at 6:00am on Tuesday 28 January.
  • Implementation of additional SAN hardware to further improve performance. This will not require any downtime. The part is being couriered from the U.S. and is expected to be delivered on Thursday 30 January.

Apologies for the inconvenience this is causing.


Update 10:15, 24 January: We are continuing to see performance problems with both the old and new server storage (SAN). SITS / eVision is particularly affected, which may cause slow responses when accessing Exam Timetables, StudentInfo, MyStudents and Fees and Funding. We are continuing to try different configurations to improve the situation.

Update 17:20, 22 January: Performance problems still exist on the new storage. We are working closely with the supplier to run diagnostics and to test out some further changes. These tests will be carried out overnight and through tomorrow. Again we apologise for the inconvenience this is causing.


Update 12:50, 22 January: Further performance problems have been reported on the new storage, particularly affecting SITS, but also Blackboard, Aleph and the Student Accounting Module to varying degrees. We made a change this morning which has improved the situation, but further work is needed. We are working closely with the supplier to run diagnostics and to test out some further changes. It is likely that some further downtime will be required. Again we apologise for the inconvenience this is causing.


Update 09:00, 20 January: Work on moving SITS / eVision to the new server storage has been completed and the services should be available from 9am as scheduled.

In addition, over the weekend RXWorks, CareersHub & Raisers Edge database stores have also been migrated to the new SAN.


Update 11:45, 17 January: Work on Questionmark Perception went ahead successfully this morning. The SITS / eVision work is planned for Monday morning between 6am and 9am.


Update 14:20, 16 January: Blackboard seems to be running well. We will be moving Questionmark Perception to the new server storage tomorrow, Friday 17 January, between 6am and 9am. On Monday 20 January we will move SITS / eVision (including Student Info and Exam timetables), again between 6am and 9am.

We have an existing plan to move all affected services to the new server storage over the next few weeks and we are reviewing that to prioritise those moves in light of the current problems.


Update 10:30, 16 January: Blackboard is now available. We are monitoring the situation closely. The service was briefly back at 10:10am but needed to be taken down as database backups were causing problems. Those backups were halted and the service is now available. Apologies to those who were using the service when it first came back.


Update 10:10, 16 January: We are seeing some issues with database backups having an impact on Blackboard. We are waiting for the backups to stop before making the service live.


Update 07:00, 16 January: Blackboard will be unavailable between 06:00-10:00 on Thursday 16 January as it is moved to the new server storage system (SAN).


Update 10:30, 15 January: We are continuing to see poor performance on the SAN. We are implementing some changes to try to minimise the impact. However, it has been decided that we will bring forward the move of Blackboard and Questionmark Perception to the new SAN (see news item). SITS was already scheduled to move next week.


Update 19:50, 14 January: Apart from the drop in performance seen this morning, which affected Blackboard, we have not seen any dramatic fall in performance of the SAN. However, some services, Questionmark Perception and Filestore in particular, were either unavailable or slow throughout the day. At around 6pm we rebooted the SAN service in an attempt to fix the problem on the advice of the suppliers. We shall be monitoring performance closely and continuing to investigate the problem. Again we apologise for the inconvenience this is causing.


Update 10:50, 14 January: At around 10am we saw a repeat of the slow performance of Blackboard and other services. Unfortunately, the fix implemented yesterday afternoon did not solve the problem.  We are continuing to investigate and apologise for the inconvenience this is causing.

Update 18:00, 13 January: We believe we have identified the cause of the problems affecting the SAN and various services. We have implemented a fix and will be monitoring the situation closely to confirm if this has resolved the problem.


We are seeing performance problems with the Storage Area Network (SAN). This is affecting a number of services and databases leading to intermittent access and slow response times.

We have been unable to identify the cause and have contacted the supplier to assist with resolving the issue.

Affected services include:

  • Blackboard
  • Questionmark Perception
  • SITS (including eVision)
  • CODA
  • Library Account (Aleph)
  • Metalib (Library)
  • Site Manager (New CMS)
  • Datahub (affecting services such as RXWorks)

Other services at risk include:

  • Proactis
  • Syllabus Plus
  • CODA
  • Business Objects Infoview
  • MyBristol
  • Wiki service
  • Data Haven
  • SAFS (EFIM - Marks & Absence Tool)

IT Services apologises for the inconvenience this is causing.