Example of DOV search methods for generic WFS layers
Use cases explained below
Get data in a bounding box
Get data with specific properties
Get data in a bounding box based on specific properties
Select data and return a subset of columns
Using sorting and limiting to find the most recent data
Combining attribute queries to limit your results
[1]:
%matplotlib inline
import inspect, sys
import warnings; warnings.simplefilter('ignore')
[2]:
# check pydov path
import pydov
In addition to the predefined datatypes from pydov, one can also query any WFS layer available in DOV using pydov. This allows the same workflow and search methods to be used for all vector data we publish. To check which layers are available, consult our metadata catalogue.
Get information about the datatype
When creating a WfsSearch instance, one has to provide the workspace-qualified name of the WFS layer to query:
[3]:
from pydov.search.generic import WfsSearch
wfs_search = WfsSearch('pfas:pfas_analyseresultaten')
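A workspace-qualified layer name, such as the one used above, consists of a workspace and a layer name joined by a colon. As a minimal sketch of that structure (the helper `split_layer_name` is hypothetical and not part of pydov), the two parts can be separated as follows:

```python
def split_layer_name(qualified_name):
    """Split 'workspace:layer' into its two parts.

    Hypothetical helper for illustration only; not part of pydov.
    """
    workspace, sep, layer = qualified_name.partition(':')
    if not sep or not workspace or not layer:
        raise ValueError(
            f"expected 'workspace:layer', got {qualified_name!r}")
    return workspace, layer


workspace, layer = split_layer_name('pfas:pfas_analyseresultaten')
print(workspace, layer)
```

Checking the name up front like this can give a clearer error message than a failed request when the workspace prefix is forgotten.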
Once instantiated, one can request a description of the dataset:
[4]:
wfs_search.get_description()
[4]:
'PFAS analyseresultaten aangeleverd door bodemsaneringsdeskundigen en opgenomen in de OVAM bodemdatabank.'
And a list of available fields:
[5]:
fields = wfs_search.get_fields()
# print available fields
for f in fields.values():
    print(f['name'])
id
opdracht
pfasdossiernr
profielnaam
top_in_m
basis_in_m
jaar
datum
parameter
detectieconditie
meetwaarde
meeteenheid
medium
profieltype
plaatsing_profiel
commentaar
x_ml72
y_ml72
geom
You can get more information about a field by requesting it from the fields dictionary:
name: name of the field
definition: definition of this field, if available
cost: for generic WFS searches, this will be 1 in all cases
notnull: whether the field is mandatory or not
type: datatype of the values of this field
[6]:
fields['top_in_m']
[6]:
{'name': 'top_in_m',
'definition': 'De diepte t.o.v het maaiveld (top) van het geanalyseerde staal',
'type': 'float',
'notnull': False,
'query': True,
'cost': 1}
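Because the fields dictionary is a plain Python dictionary, it can be filtered with ordinary dictionary code. The sketch below selects field names by their metadata. The helper `field_names_where` is hypothetical, the `top_in_m` entry is copied from the output above, and the `example_id` entry is made up purely so that the filter has something to exclude:

```python
def field_names_where(fields, **conditions):
    """Return names of fields whose metadata matches all given conditions.

    Hypothetical helper for illustration only; not part of pydov.
    """
    return [
        name for name, meta in fields.items()
        if all(meta.get(key) == value for key, value in conditions.items())
    ]


fields = {
    # Copied from the notebook output above.
    'top_in_m': {
        'name': 'top_in_m',
        'definition': 'De diepte t.o.v het maaiveld (top) van het '
                      'geanalyseerde staal',
        'type': 'float', 'notnull': False, 'query': True, 'cost': 1},
    # Made-up entry, present only to demonstrate the filtering.
    'example_id': {
        'name': 'example_id', 'definition': None,
        'type': 'int', 'notnull': True, 'query': True, 'cost': 1},
}

print(field_names_where(fields, type='float'))
print(field_names_where(fields, notnull=True))
```

With a real `fields` dictionary returned by `get_fields()`, the same helper could be used to list, for example, all numeric or all mandatory fields.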
Example use cases
Get data in a bounding box
Get data for all features that are geographically located within the bounds of the specified box.
The coordinates are in the Belgian Lambert72 (EPSG:31370) coordinate system and are given in the order of lower left x, lower left y, upper right x, upper right y.
[7]:
from pydov.util.location import Within, Box
df = wfs_search.search(location=Within(Box(143400, 217000, 144000, 217200)))
df.head()
[000/001] .
[7]:
| | id | opdracht | pfasdossiernr | profielnaam | top_in_m | basis_in_m | jaar | datum | parameter | detectieconditie | meetwaarde | meeteenheid | medium | profieltype | plaatsing_profiel | commentaar | x_ml72 | y_ml72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32784976 | 13935544 | 61823 | 47 | 0.0 | 0.2 | 2021 | 2021-12-16 | HFPO-DA | < | 1.0 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143874.97 | 217045.51 |
| 1 | 32784977 | 13935544 | 61823 | 47 | 0.0 | 0.2 | 2021 | 2021-12-16 | MePFOSAtotaal | < | 0.5 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143874.97 | 217045.51 |
| 2 | 32784978 | 13935544 | 61823 | 47 | 0.0 | 0.2 | 2021 | 2021-12-16 | PFHpS | < | 0.2 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143874.97 | 217045.51 |
| 3 | 32784979 | 13935544 | 61823 | 47 | 0.0 | 0.2 | 2021 | 2021-12-16 | 4:2 FTS | < | 0.2 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143874.97 | 217045.51 |
| 4 | 32784980 | 13935544 | 61823 | 47 | 0.0 | 0.2 | 2021 | 2021-12-16 | PFDA | < | 0.2 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143874.97 | 217045.51 |
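The order of the four box coordinates is easy to get wrong. The following sketch uses a hypothetical stand-in (`BoundingBox`, `make_box` and `contains` are illustrative names, not pydov's own `Box` and `Within`) to show the expected order and what "within the bounds" means for a point. How the real service treats points exactly on the boundary is not covered here:

```python
from collections import namedtuple

# Hypothetical stand-in for illustration; pydov's Box class is not used here.
BoundingBox = namedtuple('BoundingBox', 'minx miny maxx maxy')


def make_box(minx, miny, maxx, maxy):
    """Create a box from lower left x, y and upper right x, y."""
    if minx > maxx or miny > maxy:
        raise ValueError(
            'expected lower left x, lower left y, upper right x, upper right y')
    return BoundingBox(minx, miny, maxx, maxy)


def contains(box, x, y):
    """Return True if the point (x, y) lies inside the box or on its edge."""
    return box.minx <= x <= box.maxx and box.miny <= y <= box.maxy


# Same Lambert72 coordinates as in the search above.
box = make_box(143400, 217000, 144000, 217200)

# Coordinates of the first result row: inside the box.
print(contains(box, 143874.97, 217045.51))
# A point far to the east: outside the box.
print(contains(box, 237529.0, 204908.0))
```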
Get data with specific properties
Besides querying data by its geographic location within a bounding box, we can also search for data matching a specific set of properties. To do so, we build a query from the available fields and the operators provided by the WFS protocol.
A list of possible operators can be found below. (SortProperty appears in the listing only because its name contains 'Property'; it is used for sorting, not for filtering.)
[8]:
[i for i,j in inspect.getmembers(sys.modules['owslib.fes2'], inspect.isclass) if 'Property' in i]
[8]:
['PropertyIsBetween',
'PropertyIsEqualTo',
'PropertyIsGreaterThan',
'PropertyIsGreaterThanOrEqualTo',
'PropertyIsLessThan',
'PropertyIsLessThanOrEqualTo',
'PropertyIsLike',
'PropertyIsNotEqualTo',
'PropertyIsNull',
'SortProperty']
In this example we build a query using the PropertyIsEqualTo operator to find all data for the parameter ‘PFDA’:
[9]:
from owslib.fes2 import PropertyIsEqualTo
query = PropertyIsEqualTo(propertyname='parameter',
                          literal='PFDA')
df = wfs_search.search(query=query)
df.head()
[000/002] ..
[9]:
| | id | opdracht | pfasdossiernr | profielnaam | top_in_m | basis_in_m | jaar | datum | parameter | detectieconditie | meetwaarde | meeteenheid | medium | profieltype | plaatsing_profiel | commentaar | x_ml72 | y_ml72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 31063085 | 13077062 | 6180 | PB31 | 0.2 | 2.2 | 2021 | 2021-06-16 | PFDA | < | 0.02 | µg/l | Grondwater | Peilbuis | NaN | | 237529.0 | 204908.0 |
| 1 | 31063205 | 13077062 | 6180 | 108 | 0.5 | 0.7 | 2021 | 2021-06-01 | PFDA | < | 0.20 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-05-21 | | 237521.0 | 204927.0 |
| 2 | 31063324 | 13077062 | 6180 | 109 | 0.5 | 0.7 | 2021 | 2021-05-28 | PFDA | < | 0.20 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-05-21 | | 237504.0 | 204955.0 |
| 3 | 31063593 | 13077062 | 6180 | PB32 | 0.5 | 2.5 | 2021 | 2021-06-16 | PFDA | < | 0.02 | µg/l | Grondwater | Peilbuis | NaN | | 237506.0 | 204991.0 |
| 4 | 31151190 | 13123519 | 22248 | P101 | 1.5 | 3.5 | 2021 | 2021-06-30 | PFDA | < | 1.00 | ng/l | Grondwater | Peilbuis | 2021-06-11 | | 97442.0 | 170962.0 |
Get data in a bounding box based on specific properties
We can combine a query on attributes with a query on geographic location to get the data within a bounding box that have specific properties.
The following example requests the data for the parameter PFDA within the given bounding box.
(Note that the datatype of the literal parameter should be a string, regardless of the datatype of this field in the output dataframe.)
[10]:
from pydov.util.location import Within, Box
df = wfs_search.search(
    query=PropertyIsEqualTo(propertyname='parameter', literal='PFDA'),
    location=Within(Box(143400, 217000, 144000, 217200))
)
df.head()
[000/001] .
[10]:
| | id | opdracht | pfasdossiernr | profielnaam | top_in_m | basis_in_m | jaar | datum | parameter | detectieconditie | meetwaarde | meeteenheid | medium | profieltype | plaatsing_profiel | commentaar | x_ml72 | y_ml72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32784980 | 13935544 | 61823 | 47 | 0.0 | 0.2 | 2021 | 2021-12-16 | PFDA | < | 0.2 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143874.97 | 217045.51 |
| 1 | 32785041 | 13935544 | 61823 | 44 | 0.0 | 0.2 | 2021 | 2021-12-16 | PFDA | < | 0.2 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143449.23 | 217048.64 |
| 2 | 32785070 | 13935544 | 61823 | PB7 | 0.0 | 0.0 | 2022 | 2022-01-18 | PFDA | < | 1.0 | ng/l | Grondwater | Peilbuis | 2021-12-10 | | 143622.88 | 217056.85 |
| 3 | 32785099 | 13935544 | 61823 | 46 | 0.0 | 0.2 | 2021 | 2021-12-16 | PFDA | < | 0.2 | µg/kg ds | Vaste deel van de aarde | Boring | 2021-12-10 | | 143747.63 | 217023.63 |
| 4 | 32785131 | 13935544 | 61823 | 40 | 6.1 | 7.1 | 2021 | 2021-12-20 | PFDA | < | 1.0 | ng/l | Grondwater | Peilbuis | 2019-10-10 | | 143654.00 | 217032.00 |
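Since the literal has to be passed as a string, values that are held as numbers or dates in Python need converting first. A minimal sketch of such a conversion follows. The helper `to_literal` is hypothetical, and the assumption that ISO formatted dates are the right text form for date fields is not confirmed by this notebook:

```python
import datetime


def to_literal(value):
    """Convert a Python value to a string usable as a filter literal.

    Hypothetical helper for illustration only; not part of pydov or owslib.
    Dates are written in ISO format (YYYY-MM-DD), which is an assumption.
    """
    if isinstance(value, datetime.date):
        return value.isoformat()
    return str(value)


print(to_literal(2022))
print(to_literal(0.5))
print(to_literal(datetime.date(2021, 12, 16)))
print(to_literal('PFDA'))
```

The resulting strings can then be passed as the `literal` argument, for example `literal=to_literal(2022)` for a query on the `jaar` field.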
Select data and return a subset of columns
We can limit the columns in the output dataframe by specifying the return_fields parameter in our search.
In this example we query all the data in a bounding box, but only return some of the fields:
[11]:
from pydov.util.location import Within, Box
df = wfs_search.search(
    location=Within(Box(143400, 217000, 144000, 217200)),
    return_fields=['datum', 'x_ml72', 'y_ml72', 'parameter',
                   'detectieconditie', 'meetwaarde', 'meeteenheid', 'medium']
)
df.head()
[000/001] .
[11]:
| | datum | x_ml72 | y_ml72 | parameter | detectieconditie | meetwaarde | meeteenheid | medium |
|---|---|---|---|---|---|---|---|---|
| 0 | 2021-12-16 | 143874.97 | 217045.51 | HFPO-DA | < | 1.0 | µg/kg ds | Vaste deel van de aarde |
| 1 | 2021-12-16 | 143874.97 | 217045.51 | MePFOSAtotaal | < | 0.5 | µg/kg ds | Vaste deel van de aarde |
| 2 | 2021-12-16 | 143874.97 | 217045.51 | PFHpS | < | 0.2 | µg/kg ds | Vaste deel van de aarde |
| 3 | 2021-12-16 | 143874.97 | 217045.51 | 4:2 FTS | < | 0.2 | µg/kg ds | Vaste deel van de aarde |
| 4 | 2021-12-16 | 143874.97 | 217045.51 | PFDA | < | 0.2 | µg/kg ds | Vaste deel van de aarde |
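A typo in `return_fields` is easier to spot when the names are compared with the available fields before searching. The sketch below does that with plain lists. The helper `check_return_fields` is hypothetical, and the field names are the ones printed by `get_fields()` earlier in this notebook:

```python
# Field names as listed by get_fields() earlier in this notebook.
AVAILABLE_FIELDS = [
    'id', 'opdracht', 'pfasdossiernr', 'profielnaam', 'top_in_m',
    'basis_in_m', 'jaar', 'datum', 'parameter', 'detectieconditie',
    'meetwaarde', 'meeteenheid', 'medium', 'profieltype',
    'plaatsing_profiel', 'commentaar', 'x_ml72', 'y_ml72', 'geom',
]


def check_return_fields(requested, available=AVAILABLE_FIELDS):
    """Return the requested names, or raise if any of them is unknown.

    Hypothetical helper for illustration only; not part of pydov.
    """
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError('unknown field(s): ' + ', '.join(unknown))
    return list(requested)


return_fields = check_return_fields(
    ['datum', 'x_ml72', 'y_ml72', 'parameter',
     'detectieconditie', 'meetwaarde', 'meeteenheid', 'medium'])
print(return_fields)
```

In practice the list of available names could be taken from `wfs_search.get_fields()` rather than written out by hand.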
Using sorting and limiting to find the most recent data
You can use sorting and limiting to find, for example, the highest, deepest, oldest or newest records, depending on the available fields.
In this example we search for the 100 most recent records:
[12]:
from owslib.fes2 import SortBy, SortProperty
df = wfs_search.search(
    sort_by=SortBy([SortProperty('datum', 'DESC')]),
    max_features=100
)
df.head()
[000/001] .
[12]:
| | id | opdracht | pfasdossiernr | profielnaam | top_in_m | basis_in_m | jaar | datum | parameter | detectieconditie | meetwaarde | meeteenheid | medium | profieltype | plaatsing_profiel | commentaar | x_ml72 | y_ml72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 35394311 | 15168013 | 101686 | 1 | 3.75 | 4.75 | 2023 | 2023-07-28 | PFOA | < | 0.01 | µg/l | Grondwater | Peilbuis | 2022-09-13 | | 106197.00 | 190023.00 |
| 1 | 35518388 | 15221025 | 101854 | PB15 | 7.00 | 8.00 | 2023 | 2023-07-28 | PFOStotal | < | 50.00 | ng/l | Grondwater | Peilbuis | 2023-07-12 | | 174467.47 | 176886.15 |
| 2 | 35394227 | 15168013 | 101686 | 1 | 3.75 | 4.75 | 2023 | 2023-07-28 | PFECHS | < | 0.01 | µg/l | Grondwater | Peilbuis | 2022-09-13 | | 106197.00 | 190023.00 |
| 3 | 35394310 | 15168013 | 101686 | 1 | 3.75 | 4.75 | 2023 | 2023-07-28 | PFHxA | < | 0.01 | µg/l | Grondwater | Peilbuis | 2022-09-13 | | 106197.00 | 190023.00 |
| 4 | 35518378 | 15221025 | 101854 | PB15 | 7.00 | 8.00 | 2023 | 2023-07-28 | EU DWRL-20 | < | 50.00 | ng/l | Grondwater | Peilbuis | 2023-07-12 | | 174467.47 | 176886.15 |
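To make the meaning of "sort descending, then limit" concrete, here is the same idea applied locally to a handful of records. This is only an illustration of the semantics: in the search above the sorting and limiting are requested through `sort_by` and `max_features`, not done in Python. The helper `most_recent` is hypothetical, and the ids and dates are taken from result rows shown in this notebook:

```python
from operator import itemgetter

# A few (id, datum) pairs taken from result rows in this notebook.
records = [
    {'id': 32784976, 'datum': '2021-12-16'},
    {'id': 31063085, 'datum': '2021-06-16'},
    {'id': 32785070, 'datum': '2022-01-18'},
    {'id': 35394311, 'datum': '2023-07-28'},
]


def most_recent(records, n):
    """Return the n records with the latest 'datum', newest first.

    ISO formatted dates (YYYY-MM-DD) sort chronologically as strings.
    """
    return sorted(records, key=itemgetter('datum'), reverse=True)[:n]


for record in most_recent(records, 2):
    print(record['id'], record['datum'])
```

Swapping `reverse=True` for `reverse=False` gives the oldest records instead, which corresponds to sorting in ascending order.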
Combining attribute queries to limit your results
You can combine multiple attribute queries to construct an advanced query to search for exactly what you’re looking for. This is more efficient than requesting more data and filtering it afterwards.
In this example we search for data in a given bounding box, for a given year, parameter and medium, where the measured value exceeds a certain threshold:
[13]:
from pydov.util.location import Within, Box
from owslib.fes2 import PropertyIsEqualTo, PropertyIsGreaterThan, And
df = wfs_search.search(
    query=And([
        PropertyIsEqualTo(propertyname='jaar', literal='2022'),
        PropertyIsEqualTo(propertyname='parameter', literal='PFDA'),
        PropertyIsEqualTo(propertyname='medium', literal='Grondwater'),
        PropertyIsGreaterThan(propertyname='meetwaarde', literal='0.5'),
    ]),
    location=Within(Box(143400, 217000, 144000, 217200))
)
df.head()
[000/001] .
[13]:
| | id | opdracht | pfasdossiernr | profielnaam | top_in_m | basis_in_m | jaar | datum | parameter | detectieconditie | meetwaarde | meeteenheid | medium | profieltype | plaatsing_profiel | commentaar | x_ml72 | y_ml72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32785070 | 13935544 | 61823 | PB7 | 0.0 | 0.0 | 2022 | 2022-01-18 | PFDA | < | 1.0 | ng/l | Grondwater | Peilbuis | 2021-12-10 | | 143622.88 | 217056.85 |
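The logic of an `And` query can be sketched locally with plain predicates: a record is kept only if every condition holds. The helpers `equal_to`, `greater_than` and `all_of` are hypothetical stand-ins and not the owslib classes. The three records are taken from the PFDA results within the bounding box shown earlier:

```python
def equal_to(field, value):
    """Predicate: the record's field equals the value."""
    return lambda record: record[field] == value


def greater_than(field, value):
    """Predicate: the record's field is strictly greater than the value."""
    return lambda record: record[field] > value


def all_of(*predicates):
    """Predicate: every given predicate holds (the logic of And)."""
    return lambda record: all(p(record) for p in predicates)


# Three PFDA records from the bounding box results shown earlier.
records = [
    {'id': 32784980, 'jaar': 2021, 'parameter': 'PFDA',
     'medium': 'Vaste deel van de aarde', 'meetwaarde': 0.2},
    {'id': 32785070, 'jaar': 2022, 'parameter': 'PFDA',
     'medium': 'Grondwater', 'meetwaarde': 1.0},
    {'id': 32785131, 'jaar': 2021, 'parameter': 'PFDA',
     'medium': 'Grondwater', 'meetwaarde': 1.0},
]

matches = all_of(
    equal_to('jaar', 2022),
    equal_to('parameter', 'PFDA'),
    equal_to('medium', 'Grondwater'),
    greater_than('meetwaarde', 0.5),
)

print([r['id'] for r in records if matches(r)])
```

Note that the threshold is compared with the numeric `meetwaarde` only. The results in this notebook use several units (`meeteenheid`), so it may be worth adding a condition on the unit as well when comparing measured values.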