User guide¶
Quickstart¶
Add sample checklist repository¶
After having installed both this plugin and the QGIS Resource Sharing plugin, add a new repo to QGIS Resource Sharing:
- Navigate to Plugins -> Resource Sharing -> Resource Sharing
- On the QGIS Resource Sharing dialog, go to Settings -> Add repository...
- Add the following repository:
  - Name: QGIS Dataset QA Workbench demo
  - URL: https://github.com/kartoza/qgis_dataset_qa_workbench.git
  - Authentication: (Leave it blank)
- The new QGIS Dataset QA Workbench demo repository should now be displayed by the Resource Sharing plugin
- Navigate to the All collections section and look for an entry named QGIS Dataset QA Workbench demo
- Press the Install button. The Resource Sharing plugin downloads and installs some sample checklists to the {qgis-user-profile-dir}/checklists directory.
Choose checklist to perform validation with¶
- Open the QGIS Dataset QA Workbench dock (navigate to Plugins -> Dataset QA Workbench -> Dataset QA Workbench or click the plugin's icon) and navigate to the Choose Checklist tab
- Inside the plugin dock, navigate to Choose Checklist -> Choose... and, in the dialog that opens, select one of the existing checklists. Take into account the dataset type that it is applicable to (document, vector or raster) and the artifact that it applies to (dataset, metadata or style). Click the OK button to close this dialog. The checklist is loaded and is ready to use.
- Depending on the loaded checklist and its dataset and artifact types, select whether you want to:
  - a) Validate one of the currently loaded layers. If so, be sure to select it in the list of layers shown in the plugin dock
  - b) Validate an external file, by indicating its path on the local filesystem
- Upon selecting one of these options, both the Perform Validation and Generate Report tabs become selectable
Perform validation of a resource¶
Move over to the Perform Validation tab where you are presented with a list of checklist steps to be validated. For each of these checks:
- Read the description in order to understand what the current check is about
- A checklist check can be validated in one of two ways:
  - Manually - Follow the instructions provided by the guide section. These should be detailed and practical enough to allow you to properly validate the current checklist check.
  - Automatically - If applicable, the validation may be performed by pressing one of the two buttons present in the automation section:
    - Run - Perform validation using the parameters predefined by the checklist's designer
    - Configure and run... - Configure the check's validation parameters and then run the validation procedure
- After performing validation, you may optionally click the Validation notes section and write down any relevant notes about the process.
Generate validation report¶
After having validated all of the checklist's checks, move over to the Generate Report tab. This tab displays a summary of the validation process, with information related to:
- the dataset being validated
- the overall validation result
- the result of each check
Customize the report's Validated by field
By default, the report uses whatever value is automatically generated by
QGIS in its global user_full_name
variable. If you wish to provide a
different name:
1. Navigate to _Settings -> Options... -> Variables_
2. Define a new variable named `dataset_qa_workbench`
3. Set an appropriate value. The plugin will use it as the author of
validation reports
- If applicable, the checklist may specify a post-validation action. In this case, the Run post validation and Configure and run post validation... buttons will be enabled.
Post validation actions may be used for providing confirmation of the validation procedure to some third-party. Some examples include emailing the validation report to a list of recipients or POSTing the report to some centralized host by using a suitable REST API
- When checklists are designed to automatically share output reports, additional variables must be configured within QGIS so that the reports can be shared, as outlined below:
  - Report poster: Sends the report to a remote host using an HTTP POST
    - dataset_qa_workbench_auth_config_id (optional): the QGIS AuthID, as configured with the QGIS authentication manager and linked to the current user profile, represented as a string value, e.g. 'qauth01', and used to authenticate with the remote host (where required).
    - dataset_qa_workbench_endpoint: the REST endpoint URL, represented as a string value, e.g. 'https://service.example.com/REST'.
  - Report mailer: Sends the report to recipients via email
    - dataset_qa_workbench_sender_address: email address of the sender, used to authenticate with the mail server and given as a string value, e.g. 'noreply@example.com'
    - dataset_qa_workbench_sender_password: password for the sender address, used to authenticate with the mail server and given as a string value, e.g. 'S3cret'
    - dataset_qa_workbench_recipients: list of intended recipients, given as a single comma-separated string, e.g. 'user01@example.com,user02@example.com,user03@example.com'
    - dataset_qa_workbench_smtp_host: SMTP mail server host address, default 'smtp.gmail.com'
    - dataset_qa_workbench_smtp_port: SMTP port number as an integer, default 587
    - dataset_qa_workbench_smtp_secure_connection: a string value which describes the mail server connection security type. Valid values are 'starttls' (default) and 'ssl'. Use a blank string, '', to enforce no security policy (i.e. connect without encryption)
  - Note that all of these variables may be configured globally for the current QGIS user profile under Settings -> Options... -> Variables.
- If you are validating a loaded layer, the Add validation report to layer metadata button will be enabled. In this case you have the option to include the validation report in the layer's metadata. This modifies existing layer metadata in two ways:
  - The full validation report is appended to the end of the metadata's Abstract field. Note that additional presses of the Add validation report to layer metadata button cause the new report to be appended to whatever was already written in the Abstract field (including any previous validation reports that might be there)
  - A new line is also appended to the metadata's History section. This includes the validation report's generation timestamp and the overall validation result
- Finally, the report may also be saved as a PDF file by selecting an appropriate destination path in the Save validation report to text box and then pressing the Save button
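The report mailer variables described above map naturally onto Python's standard `smtplib` module. The following is a minimal sketch of how such variables might be interpreted, not the plugin's actual implementation. The `MailerSettings` class and the `parse_mailer_settings` and `open_connection` helpers are hypothetical names introduced for illustration only.

```python
import smtplib
from dataclasses import dataclass
from typing import List, Mapping

PREFIX = "dataset_qa_workbench_"


@dataclass
class MailerSettings:
    """Hypothetical container for the report mailer variables."""
    sender_address: str
    sender_password: str
    recipients: List[str]
    smtp_host: str = "smtp.gmail.com"
    smtp_port: int = 587
    secure_connection: str = "starttls"


def parse_mailer_settings(variables: Mapping[str, object]) -> MailerSettings:
    """Build settings from a mapping of variable names to values."""
    secure = str(variables.get(PREFIX + "smtp_secure_connection", "starttls"))
    if secure not in ("starttls", "ssl", ""):
        raise ValueError(f"Unsupported connection security: {secure!r}")
    # Recipients arrive as one comma-separated string; split and trim them.
    raw_recipients = str(variables[PREFIX + "recipients"])
    return MailerSettings(
        sender_address=str(variables[PREFIX + "sender_address"]),
        sender_password=str(variables[PREFIX + "sender_password"]),
        recipients=[r.strip() for r in raw_recipients.split(",") if r.strip()],
        smtp_host=str(variables.get(PREFIX + "smtp_host", "smtp.gmail.com")),
        smtp_port=int(variables.get(PREFIX + "smtp_port", 587)),
        secure_connection=secure,
    )


def open_connection(settings: MailerSettings) -> smtplib.SMTP:
    """Open and authenticate an SMTP connection (performs network I/O)."""
    if settings.secure_connection == "ssl":
        server = smtplib.SMTP_SSL(settings.smtp_host, settings.smtp_port)
    else:
        server = smtplib.SMTP(settings.smtp_host, settings.smtp_port)
        if settings.secure_connection == "starttls":
            server.starttls()  # upgrade the plain connection to TLS
    server.login(settings.sender_address, settings.sender_password)
    return server


# Only the mandatory variables are given, so the defaults apply.
example = parse_mailer_settings({
    PREFIX + "sender_address": "noreply@example.com",
    PREFIX + "sender_password": "S3cret",
    PREFIX + "recipients": "user01@example.com, user02@example.com",
})
```

With a blank `smtp_secure_connection` value the sketch connects without encryption, so that setting is best reserved for trusted networks.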
Creating new checklists¶
Checklists are stored locally on the QGIS user profile directory (accessible from QGIS by navigating to Settings -> User Profiles -> Open Active Profile Folder...) under the checklists directory.
Installing a new checklist is simply a matter of placing a suitable file in this directory.
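Since installation is just a file copy, it can be scripted. The sketch below assumes you already know the path of the profile directory. The `install_checklist` helper is a hypothetical name, not part of the plugin. Inside QGIS, `QgsApplication.qgisSettingsDirPath()` should return the active profile folder, but the sketch deliberately avoids importing QGIS so that it runs anywhere.

```python
import shutil
from pathlib import Path


def install_checklist(checklist_file: Path, profile_dir: Path) -> Path:
    """Copy a checklist JSON file into <profile_dir>/checklists.

    Returns the path of the installed copy.
    """
    target_dir = profile_dir / "checklists"
    # The directory may not exist yet on a fresh profile.
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(checklist_file, target_dir))
```

A call such as `install_checklist(Path("my-checklist.json"), Path("/path/to/profile"))` would place the file where the plugin looks for checklists; both paths here are placeholders.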
Checklists are stored in JSON format. Each checklist is defined as a JSON object and must conform to a predefined checklist schema. Example checklist definition:
{
    "name": "Sample checklist with action for emailing validation report",
    "description": "This is just a sample checklist - be sure to delete it\n\nThis also demonstrates sending emails with the report of validation",
    "dataset_type": "vector",
    "validation_artifact_type": "dataset",
    "checks": [
        {
            "name": "geometry is valid",
            "description": "Layer's geometry does not have invalid geometries.",
            "guide": "Navigate to Vector -> Geometry tools -> Check Validity... and run the validity analysis tool. Afterwards check that there are no features on the `invalid output` layer",
            "automation": {
                "algorithm_id": "qgis:checkvalidity",
                "artifact_parameter_name": "INPUT_LAYER",
                "output_name": "INVALID_COUNT",
                "negate_output": true
            }
        },
        {
            "name": "CRS is EPSG:4326",
            "description": "Layer's Coordinate Reference System is lat-lon on WGS84 datum (i.e. EPSG code 4326)",
            "guide": "Open the layer properties dialog, then navigate to the 'information' tab (should be the first one) and in the section called 'Information from provider' check if the 'CRS' field has a value of 'EPSG:4326 - WGS84 - Geographic'",
            "automation": {
                "algorithm_id": "dataset_qa_workbench:crschecker",
                "artifact_parameter_name": "INPUT_LAYER",
                "output_name": "OUTPUT",
                "negate_output": false,
                "extra_parameters": {
                    "INPUT_CRS": "EPSG:4326"
                }
            }
        }
    ],
    "report": {
        "algorithm_id": "dataset_qa_workbench:reportmailer"
    }
}
This plugin's code repository also features a collection of sample checklists that may be studied in order to get a better grasp on how to define new checklists.
Each checklist has the following mandatory properties:
- `name` - Name of the checklist. This is used as the checklist identifier in the QGIS UI, therefore a checklist's name must be unique;
- `description` - A short text explaining what the checklist is about;
- `dataset_type` - The type of dataset that this checklist operates on. It must be one of: `document`; `raster`; `vector`.
- `validation_artifact_type` - The type of artifact that this checklist operates on. It must be one of: `dataset`; `metadata`; `style`.
A checklist may also have the following optional properties:
- `checks` - A list of checks that describe each individual validation step. Each checklist check is defined as a JSON object. It has the following mandatory properties:
  - `name` - Name of the checklist step;
  - `description` - A short description explaining what the check is about;
  - `guide` - Short text specifying how a human operator might validate this check.
  A checklist check may also have the following optional properties:
  - `automation` - A JSON object that contains the configuration for the automated execution of this validation check.
    The `automation` object has the following mandatory properties:
    - `algorithm_id` - Identifier of the QGIS Processing algorithm used to perform automation. It takes the form `provider:algorithm` (e.g. `qgis:checkvalidity`). It can be retrieved from the QGIS Processing toolbox by resting the mouse pointer on top of the desired algorithm
    The `automation` object may have the following optional properties:
    - `artifact_parameter_name` - Name of the algorithm parameter that represents the artifact currently being validated (e.g. `INPUT`, `INPUT_LAYER`). If not specified, it defaults to `INPUT_LAYER`. This value may be retrieved from the Processing algorithm by opening the algorithm's dialog and resting the mouse pointer on the relevant input;
    - `output_name` - Name of the algorithm output that holds the result of the validation. If not specified, it defaults to `OUTPUT`. This value may also be retrieved from the algorithm's dialog, in the same way as the `artifact_parameter_name` property;
    - `negate_output` - Whether to interpret a falsy result coming from the processing algorithm as a sign of success in the validation. This is sometimes desirable. Example: the `qgis:checkvalidity` algorithm will return zero invalid features if a layer does not have any invalid geometries. In this case, the zero must be interpreted as a successful validation;
    - `extra_parameters` - Any additional parameters necessary for running the processing algorithm. These may be used for configuring other aspects of the algorithm, such as the geometry validity method (in the case of the `qgis:checkvalidity` algorithm). They are passed straight to Processing. This property must be a JSON object.
- `report` - A JSON object with the configuration for a post validation action. This action is implemented by means of an additional Processing algorithm, which is fed the generated validation report as an input, as well as any other parameters that may be needed. A `report` has the following mandatory properties:
  - `algorithm_id` - Identifier of the QGIS Processing algorithm (or model) used to perform the post validation. It is specified in the same way as the `algorithm_id` property of a check's `automation` object, described above
  A `report` may also have the following optional properties:
  - `extra_parameters` - Any additional parameters necessary for running the processing algorithm. These may be used for configuring other aspects of the algorithm and follow the same rules as the `extra_parameters` property of a check's `automation` object, described above
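The mandatory and optional properties listed above can be checked with a few lines of plain Python before a checklist is installed. This is only an illustrative sketch based on the property descriptions in this guide; it does not replace validation against the plugin's JSON Schema, and `find_problems` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Allowed values, as listed in this guide.
ALLOWED_VALUES = {
    "dataset_type": ("document", "raster", "vector"),
    "validation_artifact_type": ("dataset", "metadata", "style"),
}
MANDATORY_CHECKLIST_KEYS = (
    "name", "description", "dataset_type", "validation_artifact_type",
)
MANDATORY_CHECK_KEYS = ("name", "description", "guide")


def find_problems(checklist: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    for key in MANDATORY_CHECKLIST_KEYS:
        if key not in checklist:
            problems.append(f"checklist: missing '{key}'")
    for key, allowed in ALLOWED_VALUES.items():
        if key in checklist and checklist[key] not in allowed:
            problems.append(f"checklist: '{key}' must be one of {list(allowed)}")
    for index, check in enumerate(checklist.get("checks", [])):
        for key in MANDATORY_CHECK_KEYS:
            if key not in check:
                problems.append(f"checks[{index}]: missing '{key}'")
        automation = check.get("automation")
        if automation is not None:
            if "algorithm_id" not in automation:
                problems.append(f"checks[{index}].automation: missing 'algorithm_id'")
            if not isinstance(automation.get("extra_parameters", {}), dict):
                problems.append(
                    f"checks[{index}].automation: 'extra_parameters' must be an object"
                )
    report = checklist.get("report")
    if report is not None and "algorithm_id" not in report:
        problems.append("report: missing 'algorithm_id'")
    return problems


sample = {
    "name": "Sample",
    "description": "A sample checklist",
    "dataset_type": "vector",
    "validation_artifact_type": "dataset",
    "checks": [
        {
            "name": "CRS is EPSG:4326",
            "description": "Layer uses EPSG:4326",
            "guide": "Inspect the layer properties",
            "automation": {
                "algorithm_id": "dataset_qa_workbench:crschecker",
                "extra_parameters": {"INPUT_CRS": "EPSG:4326"},
            },
        }
    ],
}
print(find_problems(sample))  # prints []
```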
Processing algorithms suitable for use in checklist steps¶
In order to be suitable for use as an automated validation step, a Processing
algorithm must define some output that can be used for attesting whether
the step succeeds or not. This means that it must be possible to convert the
output to a True/False
value.
Most default QGIS Processing algorithms simply output a map layer with their
results. These are not suitable for use in automated checklist validation
steps as there is no clear way to determine the success condition for the
validation. Other algorithms, like the qgis:checkvalidity
algorithm, output
both map layers and suitable numeric output results. These are suitable for
using for automated validation.
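The conversion of an algorithm output to a True/False value, combined with the `negate_output` property described earlier, can be sketched as follows. This assumes that the conversion follows ordinary Python truthiness and that `negate_output` is off when omitted; both are assumptions, since this guide does not spell them out. `check_passed` is a hypothetical helper name.

```python
def check_passed(output_value: object, negate_output: bool = False) -> bool:
    """Interpret a Processing output as a validation result.

    Assumes plain Python truthiness: 0, '', None and False count as falsy.
    """
    result = bool(output_value)
    # With negate_output, a falsy output (e.g. zero invalid features) is a success.
    return (not result) if negate_output else result


# An INVALID_COUNT of 0 with negate_output enabled means the check succeeds.
print(check_passed(0, negate_output=True))   # prints True
# Three invalid features with negate_output enabled means the check fails.
print(check_passed(3, negate_output=True))   # prints False
# A boolean OUTPUT of True without negation means the check succeeds.
print(check_passed(True))                    # prints True
```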
The QGIS Dataset QA Workbench plugin provides some custom Processing algorithms that are specifically tailored for automated validation. These are also likely to increase in number in the future, as new versions of the plugin are released. The current list of algorithms is:
- `dataset_qa_workbench:crschecker` - Allows checking if a layer's CRS matches an expected value
- `dataset_qa_workbench:xmlchecker` - Allows checking if an XML file has the specified elements/attributes/values
You may also design your own custom Processing algorithms and then distribute them via the QGIS Resource Sharing plugin so that users can use them together with checklists from QGIS Dataset QA Workbench plugin.
Processing algorithms suitable for use in post validation actions¶
In a similar fashion to the algorithms used for automating validation, the Processing algorithms that can be used to execute post validation actions also have specific requirements. It is not possible to reuse the standard QGIS Processing algorithms for this purpose. This plugin also ships with some suitable post validation algorithms, and it is also likely that future releases will expand on the list. Current algorithms suitable for post validation actions:
- `dataset_qa_workbench:reportmailer` - Allows sending a copy of the validation report via email
- `dataset_qa_workbench:reportposter` - Allows posting the validation report to an online server that implements a suitable REST API.
Validating custom checklists¶
The structure of checklists is formally defined in a JSON Schema file that is part of the plugin’s source code. This file can be inspected at:
https://raw.githubusercontent.com/kartoza/qgis_dataset_qa_workbench/master/schemas/checklist-check.json
JSON Schema files can be used to check that a specific JSON object conforms to the schema. As such, it is desirable that every custom checklist be validated against the schema in order to ensure it works with the QGIS Dataset QA Workbench plugin.
This validation may be done by either:
- using the jsonschema Python package. Example:
      # validating checklist with local jsonschema package
      pipenv run jsonschema -i checklist-file.json schemas/checklist-check.json
- using any online JSON schema validator, such as the one available at: https://www.jsonschemavalidator.net/
- using your IDE, which might also support validating JSON files with a JSON schema.
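The same validation can also be done from Python code with the third-party `jsonschema` package (it must be installed separately). The sketch below assumes you have local copies of both the schema and the checklist; `checklist_errors` is a hypothetical helper name. `validator_for` picks a validator class based on the schema's `$schema` declaration and falls back to the package default when there is none.

```python
import json
from pathlib import Path
from typing import List

from jsonschema.validators import validator_for


def checklist_errors(checklist_path: Path, schema_path: Path) -> List[str]:
    """Return one message per schema violation; an empty list means valid."""
    schema = json.loads(schema_path.read_text(encoding="utf-8"))
    checklist = json.loads(checklist_path.read_text(encoding="utf-8"))
    validator_class = validator_for(schema)
    # Fail early if the schema itself is malformed.
    validator_class.check_schema(schema)
    validator = validator_class(schema)
    messages = []
    for error in validator.iter_errors(checklist):
        # error.path holds the location of the offending value in the checklist.
        location = "/".join(str(part) for part in error.path) or "<root>"
        messages.append(f"{location}: {error.message}")
    return messages
```

Collecting every error with `iter_errors`, instead of stopping at the first one, tends to be more convenient when fixing a hand-written checklist.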
Sharing checklists and Processing algorithms with other users¶
This plugin leans on the capabilities provided by the QGIS Resource Sharing plugin and thus users are able to leverage that to share:
- Checklists
- Processing algorithms
- Processing models
We recommend setting up a git repository with the following structure:
my-shareable-qgis-resources/
metadata.ini
collections/
my-shareable-collection/
checklists/
checklist1.json
checklist2.json
processing/
algorithm1.py
algorithm2.py
models/
model1.model3
model2.model3
Then put all checklists, algorithms, etc that are to be shared in there and simply provide the git repo's URL to your users. Consult the documentation of the QGIS Resource Sharing plugin for further information on this.
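The recommended layout can be scaffolded with a short script. This sketch only creates the directories and an empty `metadata.ini` placeholder; the contents of `metadata.ini` must follow the QGIS Resource Sharing documentation and are deliberately not generated here. `scaffold_repository` is a hypothetical helper name.

```python
from pathlib import Path


def scaffold_repository(root: Path, collection_name: str) -> Path:
    """Create the directory layout for a shareable collection.

    Returns the path of the collection directory.
    """
    collection_dir = root / "collections" / collection_name
    for subdir in ("checklists", "processing", "models"):
        (collection_dir / subdir).mkdir(parents=True, exist_ok=True)
    # Empty placeholder; fill it in according to the Resource Sharing docs.
    (root / "metadata.ini").touch(exist_ok=True)
    return collection_dir
```

Calling `scaffold_repository(Path("my-shareable-qgis-resources"), "my-shareable-collection")` would reproduce the tree shown above, ready for checklists, algorithms and models to be added.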