Rapidly growing geodata catalogues require tools that help users choose among products. Populating the quality fields in metadata allows users to rank products and then select the one that best fits their purpose. To this end, QualityML is a dictionary of hierarchically structured concepts that precisely define and relate quality levels, from quality classes to quality measurements. These levels are used to encode quality semantics for geospatial data by mapping them to the corresponding metadata schemas. For data producers, encoded quality semantics improve product discovery and communicate product characteristics more effectively. For data users, they make it easier to compare quality and uncertainty measures, to select the most suitable data, and to perform dataset intercomparison. They also allow other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable.
On the one hand, QualityML is a profile of the ISO geospatial metadata standards (e.g. ISO 19157) that provides a set of rules, structured in 5 levels, for precisely documenting quality measure parameters. On the other hand, QualityML includes semantics and vocabularies for the quality concepts. Whenever possible, it uses statistical expressions from the UncertML dictionary (http://www.uncertml.org) encoding. However, it also extends UncertML with a list of alternative metrics that are commonly used to quantify quality beyond the uncertainty concept. Unfortunately, the UncertML website was shut down in 2016, although it can still be queried through the Web Archive project. Given this situation, in 2018 QualityML duplicated all the UncertML records that were in use at that time.
Finally, keep in mind that QualityML is suitable not only for encoding dataset-level quality but also for pixel-level and object-level uncertainties. This is done by linking the metadata quality descriptions with layers that represent not just the data but also the uncertainty values associated with each geospatial element.
All relative URIs in this page refer to http://www.qualityml.org/1.0/
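As a small illustration of the rule above, relative URIs such as `metrics/items` can be expanded against the stated base with the Python standard library. This is only a sketch: the helper name `resolve` and the constant `QUALITYML_BASE` are illustrative and not part of QualityML.

```python
from urllib.parse import urljoin

# Base stated on this page; every relative URI is resolved against it.
QUALITYML_BASE = "http://www.qualityml.org/1.0/"


def resolve(relative_uri, base=QUALITYML_BASE):
    """Expand a relative QualityML URI into an absolute one (illustrative helper)."""
    return urljoin(base, relative_uri)


print(resolve("metrics/items"))
print(resolve("domain/DifferentialErrors1D"))
```

Because the base ends with a slash, `urljoin` appends the relative path instead of replacing the last path segment.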
A quality element is a combination of a quality class, a quality indicator, a quality domain, and a quality metrics (which includes a metrics name, a metrics description, metrics parameters, their values, and units of measure). The combination of a quality domain and a quality metrics is commonly known as a quality measure. We suggest 2 ways to map the QualityML concepts to ISO 19139 and ISO 19115-3. The first makes simple use of the ISO values, keeping the structure of Record as simple as possible. The second extends Record to allow a more structured encoding:
Concept | ISO 19139 mapping | ISO 19115-3 mapping | Example |
---|---|---|---|
Quality class and quality indicator | Name of the DQ_Element | Name of the DQ_Element | ISO 19139: gmd:DQ_CompletenessCommission; ISO 19115-3: mdq:DQ_CompletenessOmission |
Quality measure name, Quality domain name | gmd:nameOfMeasure | mdq:nameOfMeasure | Excess, non conformance |
Quality measure identification, domain identification | gmd:measureIdentification/gmd:MD_Identifier/gmd:code | mdq:measureIdentification/mcc:MD_Identifier/mcc:code; mdq:measureIdentification/mcc:MD_Identifier/mcc:version | http://qualityml.geoviqua.org/1.0/measure/Excess, http://www.qualityml.org/1.0/domain/NonConformance/ |
Quality measure and domain description (domain value list if necessary) | gmd:measureDescription | mdq:measureDescription | Indication of elements within the dataset or sample that should not have been present. The conformance or non-conformance can be expressed as a boolean, count or rate. Non conformance field of measurement |
Metrics identifier, Metrics parameters | gmd:result/gmd:DQ_QuantitativeResult/gmd:errorStatistic | mdq:result/mdq:DQ_QuantitativeResult/mdq:errorStatistic | http://www.qualityml.org/metrics/items, qml:rate=100 (other options qml:rate, qml:max, qml:indicator, qml:count, ...) |
Metrics values | gmd:value/gco:Record | mdq:value/gco:Record | 66 |
Units of measure | gmd:valueUnit@xlink:href | mdq:valueUnit@xlink:href | urn:ogc:def:uom:OGC:1.0:percent |
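The simple mapping in the table above can be sketched with Python's `xml.etree.ElementTree`. This is a minimal, non-authoritative sketch: it fills the ISO 19139 paths from the table with the table's example values, wraps text in `gco:CharacterString` as ISO 19139 usually does, and has not been validated against the ISO schemas. The helper names `q`, `text_child` and `simple_excess_element` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 and XLink namespace URIs.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "xlink": "http://www.w3.org/1999/xlink",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(name):
    """Turn 'prefix:local' into the '{uri}local' form ElementTree expects."""
    prefix, local = name.split(":")
    return "{%s}%s" % (NS[prefix], local)


def text_child(parent, name, text):
    """Append <name><gco:CharacterString>text</gco:CharacterString></name>."""
    wrapper = ET.SubElement(parent, q(name))
    ET.SubElement(wrapper, q("gco:CharacterString")).text = text
    return wrapper


def simple_excess_element(value):
    """Build the 'Excess' example of the simple mapping (unvalidated sketch)."""
    element = ET.Element(q("gmd:DQ_CompletenessCommission"))
    text_child(element, "gmd:nameOfMeasure", "Excess, non conformance")
    ident = ET.SubElement(element, q("gmd:measureIdentification"))
    md_id = ET.SubElement(ident, q("gmd:MD_Identifier"))
    text_child(md_id, "gmd:code",
               "http://qualityml.geoviqua.org/1.0/measure/Excess, "
               "http://www.qualityml.org/1.0/domain/NonConformance/")
    result = ET.SubElement(element, q("gmd:result"))
    quant = ET.SubElement(result, q("gmd:DQ_QuantitativeResult"))
    ET.SubElement(quant, q("gmd:valueUnit"),
                  {q("xlink:href"): "urn:ogc:def:uom:OGC:1.0:percent"})
    text_child(quant, "gmd:errorStatistic",
               "http://www.qualityml.org/metrics/items, qml:rate=100")
    value_el = ET.SubElement(quant, q("gmd:value"))
    ET.SubElement(value_el, q("gco:Record")).text = str(value)
    return element


print(ET.tostring(simple_excess_element(66), encoding="unicode"))
```

In this encoding the metrics identifier and its parameters travel together as plain text in `gmd:errorStatistic`, so a consumer has to parse that string to recover them.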
Concept | ISO 19139 mapping | ISO 19115-3 mapping | Example |
---|---|---|---|
Quality class and quality indicator | Name of the DQ_Element | Name of the DQ_Element | ISO 19139: gmd:DQ_CompletenessCommission; ISO 19115-3: mdq:DQ_CompletenessOmission |
Quality measure name | gmd:nameOfMeasure | mdq:nameOfMeasure | Excess |
Quality measure identification | gmd:measureIdentification/gmd:MD_Identifier/gmd:code/gmx:Anchor@xlink:href | mdq:measureIdentification/mcc:MD_Identifier/mcc:code/gcx:Anchor@xlink:href; mdq:measureIdentification/mcc:MD_Identifier/mcc:codeSpace/gcx:Anchor@xlink:href; mdq:measureIdentification/mcc:MD_Identifier/mcc:version | http://qualityml.geoviqua.org/1.0/measure/Excess; http://www.qualityml.org (Only in ISO 19115-3); 1.0 (Only in ISO 19115-3) |
Quality measure description | gmd:measureDescription | mdq:measureDescription | Indication of elements within the dataset or sample that should not have been present. The conformance or non-conformance can be expressed as a boolean, count or rate. |
Quality domain | gmd:value/gco:Record/* | mdq:value/gco:Record/* | qmld:NonConformance |
Quality domain parameters | gmd:value/gco:Record/qmld:NonConformance/*: qmld:range/qmld:min and/or qmld:range/qmld:max and/or qmld:rule | mdq:value/gco:Record/qmld:NonConformance/*: qmld:range/qmld:min and/or qmld:range/qmld:max and/or qmld:rule | qmld:rule: Indication of excess items. Usually parameters for the domain are not needed |
Metrics description | gmd:valueType/gco:RecordType | mdq:valueRecordType/gco:RecordType | Excess items |
Metrics identifier | gmd:valueType/gco:RecordType@xlink:href | mdq:valueRecordType/gco:RecordType@xlink:href | http://www.qualityml.org/metrics/items |
Metrics parameters | gmd:value/gco:Record/qml:Items/* | mdq:value/gco:Record/qml:Items/* | qml:rate and qml:max ("qml:indicator" or "qml:count" are also options) |
Metrics values | gmd:value/gco:Record/qml:Items/qml:rate; gmd:value/gco:Record/qml:Items/qml:rate@max | mdq:value/gco:Record/qml:Items/qml:rate; mdq:value/gco:Record/qml:Items/qml:rate@max | 66; 100 |
Units of measure | gmd:valueUnit@xlink:href | mdq:valueUnit@xlink:href | urn:ogc:def:uom:OGC:1.0:percent |
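The structured mapping places the quality domain and the metrics values inside `gco:Record`. The sketch below builds such a Record with Python's `xml.etree.ElementTree`, following the paths listed in the table. The namespace URIs bound to `qml` and `qmld` are placeholders, since this page does not state them, and the function name `structured_record` is illustrative.

```python
import xml.etree.ElementTree as ET

GCO = "http://www.isotc211.org/2005/gco"
# Placeholder namespace URIs (assumption): this page does not give the real ones.
QML = "http://example.org/qml"
QMLD = "http://example.org/qmld"

ET.register_namespace("gco", GCO)
ET.register_namespace("qml", QML)
ET.register_namespace("qmld", QMLD)


def structured_record(rate, rate_max, rule=None):
    """Build gco:Record holding qmld:NonConformance and qml:Items/qml:rate."""
    record = ET.Element("{%s}Record" % GCO)
    # Quality domain, with the optional qmld:rule parameter.
    domain = ET.SubElement(record, "{%s}NonConformance" % QMLD)
    if rule is not None:
        ET.SubElement(domain, "{%s}rule" % QMLD).text = rule
    # Metrics values: qml:rate carries the value, its max attribute the upper bound.
    items = ET.SubElement(record, "{%s}Items" % QML)
    rate_el = ET.SubElement(items, "{%s}rate" % QML, {"max": str(rate_max)})
    rate_el.text = str(rate)
    return record


record = structured_record(66, 100, rule="Indication of excess items")
print(ET.tostring(record, encoding="unicode"))
```

Compared with the simple mapping, a consumer can read the rate and its maximum directly from elements and attributes instead of parsing free text.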
ISO 19157:2013, Geographic information - Data quality, defines 7 data quality elements (or classes). Each describes a certain aspect of the quality of geographic data, and quality elements can be classified into them.
URI | Name | Parameters | Origin |
---|---|---|---|
| | Predicted or observed values | threshold; level (select of: "above", "below"). Formula in D.28. Copy to the record sheet | ISO 19157 |
| | Actual Values in the ground truth | threshold; level (select of: "above", "below") | ISO 19157 |
domain/DifferentialErrors1D | 1D Differential error | | ISO 19157 |
domain/DifferentialErrorsX | 1D Differential error, X | | ISO 19157 |
domain/DifferentialErrorsY | 1D Differential error, Y | | ISO 19157 |
domain/DifferentialErrorsXAboveThreshold | 1D Differential Error Measure above a threshold, X | level (select of: "above", "below"); threshold | ISO 19157 |
domain/DifferentialErrorsYAboveThreshold | 1D Differential Error Measure above a threshold, Y | level (select of: "above", "below"); threshold | ISO 19157 |
domain/DifferentialErrors2D | 2D Differential error | | ISO 19157 |
domain/DifferentialErrors2DAboveThreshold | 2D Differential Error Measure above a threshold | level (select of: "above", "below"); threshold | ISO 19157 |
domain/DifferentialErrors3D | 3D Differential error | | ISO 19157 |
domain/DifferentialErrors3DAboveThreshold | 3D Differential Error Measure above a threshold | threshold; level (select of: "above", "below") | ISO 19157 |
domain/DiagonalDifferencialError | Diagonal Differential Error | | |
| | Domain correctness | range | ISO 19157 |
| | Domain error | range | ISO 19157 |
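To make the `level` and `threshold` parameters in the table above more concrete, the following sketch shows one possible reading of a "1D differential error above a threshold" domain combined with an items-style summary (indicator, count, rate). It is an interpretation, not the QualityML definition: comparing the absolute error with the threshold, and all function names, are assumptions made for illustration.

```python
def differential_errors_1d(measured, reference):
    """Signed 1D differential errors: measured value minus reference value."""
    if len(measured) != len(reference):
        raise ValueError("measured and reference must have the same length")
    return [m - r for m, r in zip(measured, reference)]


def select_by_threshold(errors, threshold, level="above"):
    """Keep errors whose magnitude is above or below the threshold.

    Assumption: the comparison uses the absolute error.
    """
    if level == "above":
        return [e for e in errors if abs(e) > threshold]
    if level == "below":
        return [e for e in errors if abs(e) < threshold]
    raise ValueError('level must be "above" or "below"')


def items_summary(selected, total, rate_max=100.0):
    """Summarize the selection as indicator, count and rate (rate scaled to rate_max)."""
    count = len(selected)
    return {
        "indicator": count > 0,
        "count": count,
        "rate": rate_max * count / total,
    }


errors = differential_errors_1d([10.2, 9.7, 10.0, 11.5], [10.0, 10.0, 10.0, 10.0])
selected = select_by_threshold(errors, threshold=0.25, level="above")
print(items_summary(selected, total=len(errors)))
```

Here the domain decides which values are considered, and the metric decides how they are summarized, which mirrors the split between quality domain and quality metrics described earlier on this page.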
XML schemas and examples are available here (for convenience, an exact copy of UncertML is included).
URI | Metric | Parameters | Origin |
---|---|---|---|
metrics/items | Items | value (choice of: indicator (boolean), count (int), rate (real, attribute: max (real))) | ISO 19157 |
metrics/Half-lengthConfidenceInterval | Half-length of the confidence interval | Extends http://www.uncertml.org/statistics/confidence-interval (upper and lower values) adding a half-length value | ISO 19157 |
metrics/MeanAbsolute | Mean Absolute Error (MAE) | | ISO 19157 |
metrics/MeanAbsolute2D | Mean Absolute Error (MAE) | | ISO 19157 |
metrics/MeanAbsolute3D | Mean Absolute Error (MAE) | | ISO 19157 |
metrics/MeanBias | Mean Bias | | ISO 19157 |
metrics/NormalizedMeanBias | Mean Bias Error (MBE) | | ISO 19157 |
metrics/RootMeanSquareError | Root mean square error | | ISO 19157 |
metrics/NormalizedRootMeanSquareError | Normalized root mean square error | | ISO 19157 |
metrics/CoefficientOfVariationRootMeanSquareError | Coefficient of variation of the root mean square error | | ISO 19157 |
metrics/LMAS | Absolute linear error at 90% significance level. Alternative 1 | | ISO 19157 |
metrics/ALE | Absolute linear error at 90% significance level. Alternative 2 | | ISO 19157 |
metrics/ACE | Absolute circular error at 90% significance level. Alternative 1 | | ISO 19157 |
metrics/CMAS | Absolute circular error at 90% significance level. Alternative 2 | | ISO 19157 |
metrics/ConfidenceEllipse | Confidence ellipse | a, b, angle in Multiple values, confidence | ISO 19157 |
metrics/ConfusionMatrix | Confusion matrix | | ISO 19157 |
http://www.uncertml.org/statistics/covariance-matrix | Covariance Matrix | | ISO 19157 |
metrics/CorrespondenceMatrix | Correspondence Matrix | | GVQ |
metrics/RelativeCorrespondenceMatrix | Relative Correspondence Matrix | | GVQ |
metrics/RelativeError | 1D Relative error | confidence | ISO 19157 |
metrics/RelativeError2D | 2D Relative error | confidence | ISO 19157 |
metrics/NormalizedConfusionMatrix | Normalized confusion matrix | Inspired by http://www.uncertml.org/statistics/confusion-matrix but using double type for count | ISO 19157 |
metrics/KappaCoefficient | Kappa coefficient | | ISO 19157 |
| | Omission Error | actual categories | GVQ |
| | Commission Error | predicted categories | GVQ |
metrics/Promiscuity | Promiscuity | | GVQ |
metrics/MajorityCategories | Majority Categories | | GVQ |
metrics/Purity | Purity | | GVQ |
metrics/CoefficientOfDetermination | Coefficient of determination | | GVQ |
metrics/DiscreteConfusionMatrix | Discrete Confusion Matrix | | GVQ |
metrics/TruePositive | True Positive | | GVQ |
metrics/TrueNegative | True Negative | | GVQ |
metrics/FalsePositive | False Positive | | GVQ |
metrics/FalseNegative | False Negative | | GVQ |
metrics/Sensitivity | Sensitivity | | GVQ |
metrics/Specificity | Specificity | | GVQ |
metrics/OverallAccuracy | Overall Accuracy | | GVQ |
metrics/FalsePositiveRate | False Positive Rate | | GVQ |
metrics/PositivePredictiveValue | Positive Predictive Value | | GVQ |
metrics/FalseDiscoveryRate | False Discovery Rate | | GVQ |
metrics/MatthewsCorrelationCoefficient | Matthews Correlation Coefficient | | GVQ |
metrics/AreaUnderROCCurve | Area Under ROC Curve | | GVQ |
metrics/Reliability | Reliability | | GVQ |
metrics/AverageReliability | Average Reliability | | GVQ |
metrics/Accuracy | Accuracy | | GVQ |
metrics/AverageAccuracy | Average Accuracy | | GVQ |
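Several of the GVQ metrics listed above derive from the four cells of a binary confusion matrix. The sketch below computes some of them using the usual textbook formulas. The table gives only names, so these formulas are an assumption about what each metric means and may differ in detail from the QualityML definitions. The function names are illustrative.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, tn, fp, fn):
    """Textbook binary classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sensitivity": _ratio(tp, tp + fn),
        "Specificity": _ratio(tn, tn + fp),
        "OverallAccuracy": _ratio(tp + tn, total),
        "FalsePositiveRate": _ratio(fp, fp + tn),
        "PositivePredictiveValue": _ratio(tp, tp + fp),
        "FalseDiscoveryRate": _ratio(fp, tp + fp),
        "MatthewsCorrelationCoefficient": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


for name, value in binary_metrics(tp=40, tn=45, fp=5, fn=10).items():
    print(f"{name}: {value:.3f}")
```

The dictionary keys reuse the last segment of the relative URIs in the table, so each result can be paired with its `metrics/...` identifier.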
URI | Name | Meaning |
---|---|---|
| | Values | Parameter or component that most closely represents the actual values measured |
| | Quality collection | Variable that is decomposed into components or parameters |
| | Quality composition | Component that represents a composition of other components for visualization purposes |
The research leading to these results was carried out in the GeoViQua project, which received funding from the European Union Seventh Framework Programme (FP7/2010-2013) under grant agreement no. 265178.