Datahub FAQs

Hand holding motherboard

What is the DataHub?

The DataHub is an Oracle database which is populated from the university's administrative databases. It can be used to extract student and personnel data to a local source (for example, a report, list, spreadsheet or locally held database).

Back to top

What is the difference between the DataHub and the Portal?

The Portal is a content free web application, which serves as a gateway or interface to other web based applications. A Portal facilitates access to diverse information from a single point of entry - it aggregates information and services from elsewhere.

Because the Portal is 'content free' (in other words, its like an empty box that you can then put lots of applications into) it needs to be supplied with content. This content will be structured (sourced mainly from a database), unstructured (eg - Word documents) or application based (eg - PersonInfo).

As a database, the DataHub is a source of structured data for the Portal. This data is used in a variety of ways depending on the context.

Use of the data and developing Portal applications using the data is outside the scope of the DataHub Project. Similarly, provision of data from central or local databases by the DataHub is outside the scope of the Portal Project. The two projects are therefore closely linked. Data requirements for the Portal Project will have an impact on the DataHub Project. However, these requirements have to be weighed against the needs of other stakeholders.

For more information on the Portal Project, see the Portal Project.

Back to top

Why would I want to access the DataHub outside the University Portal?

You might want to access the DataHub outside the context of the University Portal in order to to:

  • Manipulate data and create bespoke reports
  • Join the central data held in the DataHub with local data (eg - data held in a departmental MS ACCESS database
  • Have access to central data in a simplified form
  • Have access to a unified view of central data
  • Develop applications outside the scope of the Portal Project

Back to top

Why should I use the DataHub?

Reasons for using the DataHub include:

  1. Central data - Common to everybody
    • The DataHub provides a common data source for applications including central services and departmental systems.
    • The format of the data has been standardised on input, regardless of the format in the source systems.
  2. Easily accessible
    • All staff members can request an account which can be accessed from all PC's on the universities network.
  3. Direct read only access
    • Since all access is read only, there is no need to worry about making accidental changes.
    • All updatable information must be held in your own applications.
    • Direct access to the DataHub database, removes the need to load data into individual systems using file downloads (e.g. CSV files).
    • Often direct database access is not permitted on other source systems, other than through front end screens.
  4. Masks source system changes
    • Any changes to any of the underlying systems will be hidden.
    • It removes the need to re-write applications using the DataHub.
  5. Less complex model than source systems
    • The DataHub's data model is far less complex than those of the underlying source systems, making the extracting of data much easier.
    • Source data has been combined into one model, removing the need to access multiple systems concurrently.
    • Fully documented schema model and table and column descriptions.
    • The simpler model often improves the performance when querying data.
  6. Data availability guaranteed
    • Safety mechanisms have been established to ensure that when the daily refresh of the DataHub is taking place, if a problem occurs in one of the underlying systems whereby data isn't available, the problem won't be passed onto the DataHub.
  7. Security model included
    • Individual and departmental users data access is controlled, to remove the chances of inadvertently publishing sensitive data.
    • Further details are discussed later in the section 'DataHub Security'.

Back to top

When shouldn't I use the DataHub?

Reasons for NOT using the DataHub include:

  1. Not dynamic (data refreshed nightly)
    • As the data only refreshes once every day, it is necessary to have to wait for any changes to the source systems to become available.
    • The daily refresh can be extremely useful for reconciling data.
  2. Only limited historic data
    • DataHub only displays details of current members of the University.
  3. Volatile
    • The records are completely re-loaded each day.
  4. Limited summary data
    • Data is mainly only stored in a detailed format
    • Very few summary tables (More planned for future use)
  5. Not as detailed as source system data
    • Only commonly required data is stored within the DataHub.
    • Detailed information, will still come from the relevant source system.
  6. Changes may take a long time
    • Before any major changes are made they must be reviewed by the project board/DataHub users forum.
    • Only one developer who carries out all of the changes.
    • The reason for being so rigid in applying changes to the DataHub is to ensure they collectively benefit all users.

Back to top

Where is the data held?

The data is held in a central Oracle database maintained by Information Systems.

Back to top

How does the data get there?

The DataHub takes its data from the corporate databases: SITS, PIMS, etc. The data is loaded nightly.

Back to top

How do I access to the DataHub?

  • Access is generally via ODBC from the DataHub to a local application (e.g. - Microsoft ACCESS database).
  • If you have an application which already uses the DataHub as a DataSource (e.g. - a website which takes its information from the DataHub), you already have access to the DataHub via your user account. However, further set-up might be required if your access requirements have changed.

Back to top

Can I access the DataHub outside the university?

You can connect to the DataHub from outside the university via a secure server or using the UoB-VPN service. All data sent over the network is encrypted.

Back to top

Do I need to be shown how to use the DataHub?

Currently no formal training is offered in using the DataHub. However:

  • The DataHub is a database. Using it requires some understanding of database tools (ODBC, a RDBMS package - e.g. Microsoft Access, designing and building queries (using SQL or Access query designer or wizards).
  • Further set-up of your local database might be required in order to use the DataHub. This can be done by Information Services, via the person responsible for supporting your local database.

Back to top

What is the DataHaven?

The DataHaven provides a centrally-supported Oracle database service for staff to create their own databases for research or administrative purposes.  If desired, data can be linked to the DataHub or to local, departmental, data stores.  One of the key advantages of storing data in the DataHaven rather than local data stores is that the database is centrally administered and backed up.

Back to top

Is it possible to add new data to the DataHub?

It is possible to have new data added to the DataHub. At present, you'll need to enter a change requests with your requirements.

In future you will be directed to a web form to submit this request online.

Back to top

Why do multiple versions of each table exist?

As a result of the schema name change from 'BRS' (in version 1) to 'DATAHUB' (in version 2), multiple versions of each table will be displayed when selecting tables in an object browser.

  • The version with the 'DATAHUB' prefix (e.g. DATAHUB.FACULTY) is the new version 2 table, which should be chosen for any new tables.
  • The version with the 'BRS' prefix (e.g. BRS.FACULTY) is a synonym pointing to the new version of the table in the 'DATAHUB' schema, and is available for backwards compatibility purposes.
  • Another version with no prefix or a 'PUBLIC' prefix (e.g. PUBLIC.FACULTY) may also be displayed and is used for applications which haven't specified a schema owner in their code.

Tables in the 'DATAHUB' schema should be used as the default choice.

Back to top

Can I change my account password?

Yes, you will be able to change your password if you have access to Oracle's SQL*Plus, by login in with your account and typing:

ALTER USER <account_name> IDENTIFIED BY <new_password>;

You will only be permitted to change your own password.

Back to top

How should I tune my SQL queries?

Because of the complex nature of tuning SQL queries to run more efficiently, this isn't covered in much detail on this web site, since it may be specific to the method you use for accessing the database. Large amounts of information on this subject, is however available on the Intranet. Useful search topics include, cost and rule based optimisation, using explain plan and SQL tuning.

Hints when writing DataHub queries:

  • We recommend using cost based optimisation on the DataHub
  • If you store your own tables in an oracle database, it is recommended to analyze the schema periodically. This provides information about the tables, which the databases optimiser uses to decide how to run the queries most efficiently, using the cost based approach.
  • For large queries learn how to use and interpret Oracles explain plan functionality.
  • When joining tables in your own database with multiple DataHub tables or DataHub tables which can be filtered, optimise your queries using Collocated Inline Views. Since in this instance the DataHub is a remote database, collocated inline views prevent multiple remote accesses and enable remote filters, both of which can speed up queries enormously. For instance:
    Change the query from.....

    SELECT loc.username,
    per.initials||' '||p.surname sname,
    loc.column1,
    sr.route_code,
    sr.curriculum_year
    FROM person_open@datahub per,
    current_stu_reg@datahubsr,
    local_table loc
    WHERE rem.common_username=loc.username
    AND per.person_id=sr.person_id
    AND sr.organisation_code='&dept'
    AND sr.attendance_code='FT'
    AND loc.column_1='X';


    to.....

    SELECT loc.username,
    rem.sname,
    loc.column1,
    rem.route_code,
    rem.curriculum_year
    FROM (SELECT per.common_username,
    per.initials||' '||p.surname name,
    sr.route_code,
    sr.curriculum_year
    FROM person_open@datahubper,
    current_stu_reg@datahub sr
    WHERE per.person_id = sr.person_id
    AND sr.organisation_code = '&dept'
    AND sr.attendance_code = 'FT') rem,
    local_table loc
    WHERE rem.common_username=loc.username
    AND loc.column_1='X';

Back to top