DDE-Module 7 - Cataloguing
CATALOGUING
Version 3.2 (November 2020)
© 2020, Dubai Data Establishment. All Rights Reserved. This document forms part of the
Dubai Data Manual, and is freely available for reuse under the terms of a
Creative Commons Attribution 4.0 International License
Lead responsibility for implementing this standard in each Government Entity:
The Data Administrator, overseeing and helping Data Stewards / Data Specialists.
Data Leader for final sign off.
When to use this module of the Dubai Data Manual:
After Data Inventory and Prioritisation has identified the current batch of priority
datasets, these should be Catalogued following this standard, ahead of Ingestion into
Dubai Pulse.
Overview
This document helps Entities ensure their data is correctly classified as required by
the Dubai Data Law, with additional cataloguing to facilitate the discovery and re-use
of the data. It is an overview standard that introduces and contextualizes four
more detailed standards that should be used during the cataloguing process:
Classification, Data Formats, Metadata and Data Quality. It sets out:
• A summary of the requirements for data cataloguing
• An overview of the process within which these four data standards should be
deployed
• How your Entity can demonstrate conformance with this module of the Dubai Data
Manual.
Requirements
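It is required that all datasets due to be ingested and published in Dubai Pulse are fully
catalogued. That is, they must be classified, available as a sample in an appropriate
format, described with all Core Metadata, and assessed for data quality, as set out
under Conformance below.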
It is recommended that departments also take steps to meet more of the Data Quality
recommendations for their most valuable data and include Optional Metadata as
recommended in the Metadata module.
It is required that the Data Leader for a Government Entity should sign a Dubai Data
Compliance Statement, confirming that all requirements of these standards have been
met.
Process
Once the prioritised inventory has been completed and approved by the Dubai Data
Establishment, the preparation for ingestion and publication of datasets should proceed in
batches - starting from the top of the approved priority order.
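As an illustration only, working through an approved priority order in batches can be sketched as below. The function name, the dataset names and the batch size are hypothetical; this standard does not prescribe a batch size.

```python
def batches_in_priority_order(prioritised, batch_size):
    """Yield consecutive batches, starting from the top of the priority order.

    `prioritised` is assumed to be a list already sorted by approved priority.
    `batch_size` is an assumption; the standard does not set one.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prioritised), batch_size):
        yield prioritised[start:start + batch_size]


# Hypothetical prioritised inventory, highest priority first.
inventory = ["dataset_a", "dataset_b", "dataset_c", "dataset_d", "dataset_e"]
batches = list(batches_in_priority_order(inventory, batch_size=2))
```

Each batch would then be catalogued in full before the next one is started.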
Each batch should be catalogued following the process described in this document, and
summarized in the diagram below.
Step 1
The Data Administrator is responsible for overseeing and helping the Entity’s Data
Stewards and Data Specialists complete Steps 2 to 6 below. The Data Administrator should:
• Establish a clear internal timetable for this cataloguing process, aligned with
programme milestones for data ingestion into Dubai Pulse
• Ensure that Data Stewards and Data Specialists are fully briefed on their roles and
on the requirements of this and related modules of the Dubai Data Manual
• Facilitate opportunities for Data Stewards and Data Specialists to come together
and exchange experience and lessons learned through the process.
Step 2: Classify
Classify the dataset as either Open Data or Shared Data, with Shared Data further
classified as Sensitive, Confidential or Secret data. Detailed guidance on the process to
follow is given in the module of the Dubai Data Manual on Classification.
Once this is done, you might be left with the original dataset and one or more derived (or
‘child’) datasets which have been modified to allow an Open classification. The
original and each derived dataset should be catalogued separately in the following steps.
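The classification levels named in this step, and the link between an original dataset and its derived datasets, could be modelled roughly as follows. This is a minimal sketch: the class names, field names and example dataset names are hypothetical, and the Classification module remains the authority on how a level is chosen.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Classification(Enum):
    """The levels named in Step 2: Open, or one of three kinds of Shared Data."""
    OPEN = "Open"
    SENSITIVE = "Shared: Sensitive"
    CONFIDENTIAL = "Shared: Confidential"
    SECRET = "Shared: Secret"

    @property
    def is_shared(self) -> bool:
        # Everything that is not Open is a form of Shared Data.
        return self is not Classification.OPEN


@dataclass
class Dataset:
    name: str
    classification: Classification
    derived_from: Optional[str] = None  # name of the original dataset, if any


def datasets_to_catalogue(original, derived):
    """Return the original plus every derived dataset; each is catalogued separately."""
    return [original, *derived]


# Hypothetical example: a Confidential original with one Open derived dataset.
original = Dataset("permits_full", Classification.CONFIDENTIAL)
child = Dataset("permits_open", Classification.OPEN, derived_from="permits_full")
queue = datasets_to_catalogue(original, [child])
```

Keeping `derived_from` on the derived dataset makes it possible to trace an Open extract back to its source during later review.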
Step 3: Format
Decide on an appropriate format in which to make the data available, and produce a
sample dataset in the format. Detailed guidance on the process to follow is given in the
module of the Dubai Data Manual on Data Formats.
Step 4: Metadata
Describe each dataset with metadata, ensuring that all Core Metadata fields in the
Metadata module are complete, along with as many Optional Metadata fields as can be
easily filled in.
Step 5: Quality
Assess the data against the quality dimensions described in the Data Quality module.
Easy-to-meet recommendations should be met and the Data Steward should produce a
short report detailing the current data quality of the dataset and any known issues and
limitations.
It is recommended that, for valuable data, the Data Steward and department should take
steps to implement the best-practice guidelines recommended in the Data Quality
module.
Step 6: Combine
All of this information (on classification, format, metadata and quality) should be
associated together in a single ‘dataset’ with the conformant data sample file, the
metadata, and the details of the business processes needed to support quality publication
(what needs doing, who is responsible, timelines).
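One possible shape for the combined record described in this step is sketched below. All class names, field names and example values (including the file name, the format and the metadata key) are hypothetical placeholders; the Metadata and Data Formats modules define the real fields and permitted formats.

```python
from dataclasses import dataclass, field


@dataclass
class SupportTask:
    """One business-process item: what needs doing, who is responsible, timeline."""
    what: str
    who: str
    timeline: str


@dataclass
class CatalogueRecord:
    """Brings classification, format, sample, metadata and quality together."""
    name: str
    classification: str = ""
    data_format: str = ""
    sample_file: str = ""
    metadata: dict = field(default_factory=dict)
    quality_report: str = ""
    support_tasks: list = field(default_factory=list)

    def missing_parts(self):
        """List the parts that are still empty."""
        parts = {
            "classification": self.classification,
            "format": self.data_format,
            "sample file": self.sample_file,
            "metadata": self.metadata,
            "quality report": self.quality_report,
            "support tasks": self.support_tasks,
        }
        return [label for label, value in parts.items() if not value]


# Hypothetical, partly completed record.
record = CatalogueRecord(
    name="permits_open",
    classification="Open",
    data_format="CSV",
    sample_file="permits_open_sample.csv",
    metadata={"title": "Permits (open extract)"},
)
remaining = record.missing_parts()
```

A Data Steward could use a check like `missing_parts()` to see what is still outstanding before passing the record on.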
Once this is complete for each dataset, the Data Steward (who may have been working
with a Data Specialist and others within the Entity to complete the process) should send
everything to the Data Administrator.
The Data Administrator should review each submission and ensure every dataset in the
current batch has been correctly catalogued using the conformance criteria for each
module (Classification, Quality, Formats, Metadata).
All cataloguing information for the current batch of datasets should be brought together
either in the Data Inventory or a data management portal for further conformance checks
and easy ingestion into the platform.
The Data Leader must satisfy themselves that each dataset has been correctly catalogued,
using the conformance criteria described below.
The Data Leader should add any questions, comments, suggestions or corrections and
send them back to the Data Administrator for resolution.
Once the Data Leader is satisfied, the full batch of catalogued datasets should be passed
to the Director General for review, and to receive approval that the Data Leader may sign
a formal Dubai Data Compliance Statement on behalf of the Entity.
Once the Director General and Data Leader are both fully satisfied that the data being
prepared for publication as Open or Shared Data is fully compliant with the Dubai Data
Law and that the relevant standards in the Dubai Data Manual have been correctly
followed by appropriately-trained staff, the Data Leader should sign a Dubai Data
Compliance Statement.
Once everything is resolved and approved, the Data Administrator should send the
Compliance Statement and the complete batch of datasets to the Dubai Data
Establishment for the Data Publishing Acceptance process - a final check, prior to
ingestion into Dubai Pulse.
Conformance
The Data Leader must be satisfied that each dataset in the batch has been catalogued
correctly, by reviewing the batch and then signing, on behalf of the Entity, a Dubai Data
Compliance Statement confirming that:
• Each dataset:
- Has a classification. In cases where the classification is not Open, then a clear
rationale for this must be documented and at least one Open derivative dataset
must be provided (or an explanation for why this is not possible).
- Has a data quality assessment report
- Has a sample dataset in the appropriate format
- Has all Core Metadata as defined in the Metadata module
- Has easy-to-add and appropriate Optional Metadata.
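The criteria above lend themselves to a simple pre-check before sign-off. The sketch below is illustrative only: the field names and the wording of the issue messages are hypothetical, and the Optional Metadata criterion is left to human judgement because "easy-to-add and appropriate" cannot be tested mechanically.

```python
from dataclasses import dataclass, field


@dataclass
class CataloguedDataset:
    name: str
    classification: str = ""
    non_open_rationale: str = ""
    open_derivatives: list = field(default_factory=list)
    no_derivative_explanation: str = ""
    has_quality_report: bool = False
    has_sample_in_format: bool = False
    missing_core_metadata: list = field(default_factory=list)


def conformance_issues(ds):
    """Return the conformance criteria that the dataset does not yet meet."""
    issues = []
    if not ds.classification:
        issues.append("no classification")
    elif ds.classification != "Open":
        # Non-Open data needs a documented rationale, plus an Open
        # derivative or an explanation of why none is possible.
        if not ds.non_open_rationale:
            issues.append("no rationale for non-Open classification")
        if not ds.open_derivatives and not ds.no_derivative_explanation:
            issues.append("no Open derivative or explanation")
    if not ds.has_quality_report:
        issues.append("no data quality assessment report")
    if not ds.has_sample_in_format:
        issues.append("no sample dataset in the appropriate format")
    if ds.missing_core_metadata:
        issues.append("missing Core Metadata: " + ", ".join(ds.missing_core_metadata))
    return issues


def batch_ready_for_sign_off(batch):
    """True only when every dataset in the batch has no outstanding issues."""
    return all(not conformance_issues(ds) for ds in batch)


# Hypothetical batch of two datasets.
open_ds = CataloguedDataset("permits_open", "Open",
                            has_quality_report=True, has_sample_in_format=True)
restricted = CataloguedDataset("permits_full", "Confidential",
                               non_open_rationale="Contains personal data",
                               has_quality_report=True, has_sample_in_format=True)
```

Here `conformance_issues(restricted)` reports the missing Open derivative, so the batch would not yet be ready for the Data Leader to sign.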