Example of DOV search methods for generic WFS layers

Use cases explained below

  • Get data in a bounding box

  • Get data with specific properties

  • Get data in a bounding box based on specific properties

  • Select data and return a subset of columns

  • Using sorting and limiting to find the most recent data

  • Combining attribute queries to limit your results

[1]:
%matplotlib inline
import inspect, sys
import warnings; warnings.simplefilter('ignore')
[2]:
# check pydov path
import pydov

Next to the predefined datatypes from pydov, one can also query any other WFS layer available in DOV. This allows the same workflow and search methods to be used for all vector data we publish. To check which layers are available, consult our metadata catalogue.

Get information about the datatype

When instantiating a WfsSearch instance, one has to provide the workspace-qualified layer name of the WFS service one would like to query:

[3]:
from pydov.search.generic import WfsSearch
wfs_search = WfsSearch('pfas:pfas_analyseresultaten')
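
Judging from the example above, a workspace-qualified name has the form workspace:layer. As a rough illustration, such a name can be checked before it is passed to WfsSearch. The helper split_layer_name is hypothetical and not part of pydov:

```python
def split_layer_name(qualified_name):
    """Split 'workspace:layer' into its two parts.

    Hypothetical helper for illustration only; pydov does not provide it.
    """
    workspace, sep, layer = qualified_name.partition(':')
    # Reject names without a colon or with an empty part on either side.
    if not sep or not workspace or not layer:
        raise ValueError(
            f"expected 'workspace:layer', got {qualified_name!r}")
    return workspace, layer


workspace, layer = split_layer_name('pfas:pfas_analyseresultaten')
print(workspace, layer)
```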

Once instantiated, one can request a description of the dataset:

[4]:
wfs_search.get_description()
[4]:
'PFAS analyseresultaten aangeleverd door bodemsaneringsdeskundigen en opgenomen in de OVAM bodemdatabank.'

And a list of available fields:

[5]:
fields = wfs_search.get_fields()

# print available fields
for f in fields.values():
    print(f['name'])
id
opdracht
pfasdossiernr
profielnaam
top_in_m
basis_in_m
jaar
datum
parameter
detectieconditie
meetwaarde
meeteenheid
medium
profieltype
plaatsing_profiel
commentaar
x_ml72
y_ml72
geom

Alternatively, you can list all the fields and their details by inspecting the get_fields() output or the search instance itself in a notebook:

[6]:
wfs_search
[6]:
pydov.search.generic.WfsSearch

PFAS analyseresultaten aangeleverd door bodemsaneringsdeskundigen en opgenomen in de OVAM bodemdatabank.

id - id

  • type: integer
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

opdracht - ID van het rapport in de OVAM databank waaruit het analyseresultaat afkomstig is

  • type: integer
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

pfasdossiernr - ID van het dossier in de OVAM databank waarin het rapport (opdracht) is opgenomen

  • type: integer
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

profielnaam - Verwijzing naar het profiel waaruit staal en analyseresultaat afkomstig is

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

top_in_m - De diepte t.o.v het maaiveld (top) van het geanalyseerde staal

  • type: float
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

basis_in_m - De diepte t.o.v het maaiveld (basis) van het geanalyseerde staal

  • type: float
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

jaar - Jaartal analyseresultaat

  • type: integer
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

datum - Datum analyseresultaat

  • type: date
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

parameter - Geanalyseerde parameter

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

detectieconditie - <, >, =

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

meetwaarde - Meetwaarde geanalyseerde parameter

  • type: float
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

meeteenheid - Eenheid meetwaarde geanalyseerde parameter

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

medium - Medium waaruit de staalname afkomstig is

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

profieltype - Soort profiel

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

plaatsing_profiel - Datum plaatsing profiel

  • type: date
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

commentaar - optioneel commentaarveld

  • type: string
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

x_ml72 - X-coördinaat

  • type: float
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

y_ml72 - Y-coördinaat

  • type: float
  • notnull: False
  • query: True
  • cost: 1
  • multivalue: False

geom - None

  • type: geometry
  • notnull: False
  • query: False
  • cost: 1
  • multivalue: False

You can get more information about a field by requesting it from the fields dictionary:

  • name: name of the field

  • definition: definition of this field, if available

  • cost: for generic WFS searches, this will be 1 in all cases

  • notnull: whether the field is mandatory or not

  • type: datatype of the values of this field

  • codelist: optionally, a codelist that describes the possible values of this field
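
As a small sketch of working with this dictionary, the snippet below picks out the fields that can be used in attribute queries. The sample dictionary is a hand-copied subset of the field details listed earlier, and it assumes the dictionary carries the same keys as that listing (including query). In a live session it would come from wfs_search.get_fields() instead. The helper queryable_names is hypothetical.

```python
# Hand-copied subset of the field details shown earlier (assumption:
# get_fields() returns dictionaries with these keys).
sample_fields = {
    'jaar': {'name': 'jaar', 'definition': 'Jaartal analyseresultaat',
             'type': 'integer', 'notnull': False, 'query': True, 'cost': 1},
    'datum': {'name': 'datum', 'definition': 'Datum analyseresultaat',
              'type': 'date', 'notnull': False, 'query': True, 'cost': 1},
    'meetwaarde': {'name': 'meetwaarde',
                   'definition': 'Meetwaarde geanalyseerde parameter',
                   'type': 'float', 'notnull': False, 'query': True,
                   'cost': 1},
    'geom': {'name': 'geom', 'definition': None, 'type': 'geometry',
             'notnull': False, 'query': False, 'cost': 1},
}


def queryable_names(fields, field_type=None):
    """Names of fields usable in attribute queries, optionally of one type."""
    return [f['name'] for f in fields.values()
            if f['query'] and (field_type is None or f['type'] == field_type)]


print(queryable_names(sample_fields))           # ['jaar', 'datum', 'meetwaarde']
print(queryable_names(sample_fields, 'float'))  # ['meetwaarde']
```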

Example use cases

Get data in a bounding box

Get data for all features that are geographically located within the bounds of the specified box.

The coordinates are in the Belgian Lambert72 (EPSG:31370) coordinate system and are given in the order of lower left x, lower left y, upper right x, upper right y.

[7]:
from pydov.util.location import Within, Box

df = wfs_search.search(location=Within(Box(143400, 217000, 144000, 217200, epsg=31370)))
df.head()
[000/001] .
[7]:
id opdracht pfasdossiernr profielnaam top_in_m basis_in_m jaar datum parameter detectieconditie meetwaarde meeteenheid medium profieltype plaatsing_profiel commentaar x_ml72 y_ml72
0 32784976 13935544 61823 47 0.0 0.2 2021 2021-12-16 HFPO-DA < 1.0 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143874.97 217045.51
1 32784977 13935544 61823 47 0.0 0.2 2021 2021-12-16 MePFOSAtotaal < 0.5 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143874.97 217045.51
2 32784978 13935544 61823 47 0.0 0.2 2021 2021-12-16 PFHpS < 0.2 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143874.97 217045.51
3 32784979 13935544 61823 47 0.0 0.2 2021 2021-12-16 4:2 FTS < 0.2 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143874.97 217045.51
4 32784980 13935544 61823 47 0.0 0.2 2021 2021-12-16 PFDA < 0.2 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143874.97 217045.51
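
To make the coordinate order concrete (lower left x, lower left y, upper right x, upper right y), here is a plain-Python sketch that tests whether a point falls inside such a box. It only illustrates the ordering. The actual Within test is evaluated by the WFS service, and the helper point_in_box is hypothetical.

```python
def point_in_box(x, y, box):
    """Return True if (x, y) lies inside box = (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = box
    # Guard against swapped corners, an easy mistake with this ordering.
    if min_x > max_x or min_y > max_y:
        raise ValueError('box must be given as lower left, then upper right')
    return min_x <= x <= max_x and min_y <= y <= max_y


# Same Lambert72 bounds as in the search above.
box = (143400, 217000, 144000, 217200)

print(point_in_box(143874.97, 217045.51, box))  # True: first row of the result
print(point_in_box(237529.0, 204908.0, box))    # False: a location outside the box
```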

Get data with specific properties

Next to querying data based on its geographic location within a bounding box, we can also search for data matching a specific set of properties. For this we can build a query using a combination of the available fields and operators provided by the WFS protocol.

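The available operators come from owslib.fes2. The cell below lists every class in that module whose name contains 'Property'; note that SortProperty, which also shows up, is used for sorting rather than filtering: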

[8]:
[i for i,j in inspect.getmembers(sys.modules['owslib.fes2'], inspect.isclass) if 'Property' in i]
[8]:
['PropertyIsBetween',
 'PropertyIsEqualTo',
 'PropertyIsGreaterThan',
 'PropertyIsGreaterThanOrEqualTo',
 'PropertyIsLessThan',
 'PropertyIsLessThanOrEqualTo',
 'PropertyIsLike',
 'PropertyIsNotEqualTo',
 'PropertyIsNull',
 'SortProperty']

In this example we build a query using the PropertyIsEqualTo operator to find all data for the parameter ‘PFDA’:

[9]:
from owslib.fes2 import PropertyIsEqualTo

query = PropertyIsEqualTo(propertyname='parameter',
                          literal='PFDA')
df = wfs_search.search(query=query)

df.head()
[000/005] .....
[9]:
id opdracht pfasdossiernr profielnaam top_in_m basis_in_m jaar datum parameter detectieconditie meetwaarde meeteenheid medium profieltype plaatsing_profiel commentaar x_ml72 y_ml72
0 31063085 13077062 6180 PB31 0.2 2.2 2021 2021-06-16 PFDA < 0.02 µg/l Grondwater Peilbuis NaN 237529.0 204908.0
1 31063205 13077062 6180 108 0.5 0.7 2021 2021-06-01 PFDA < 0.20 µg/kg ds Vaste deel van de aarde Boring 2021-05-21 237521.0 204927.0
2 31063324 13077062 6180 109 0.5 0.7 2021 2021-05-28 PFDA < 0.20 µg/kg ds Vaste deel van de aarde Boring 2021-05-21 237504.0 204955.0
3 31063593 13077062 6180 PB32 0.5 2.5 2021 2021-06-16 PFDA < 0.02 µg/l Grondwater Peilbuis NaN 237506.0 204991.0
4 31151190 13123519 22248 P101 1.5 3.5 2021 2021-06-30 PFDA < 1.00 ng/l Grondwater Peilbuis 2021-06-11 97442.0 170962.0

Get data in a bounding box based on specific properties

We can combine a query on attributes with a query on geographic location to get the data within a bounding box that have specific properties.

The following example requests the data for the parameter PFDA within the given bounding box.

(Note that the datatype of the literal parameter should be a string, regardless of the datatype of this field in the output dataframe.)

[10]:
from pydov.util.location import Within, Box

df = wfs_search.search(
    query=PropertyIsEqualTo(propertyname='parameter', literal='PFDA'),
    location=Within(Box(143400, 217000, 144000, 217200, epsg=31370))
)
df.head()
[000/001] .
[10]:
id opdracht pfasdossiernr profielnaam top_in_m basis_in_m jaar datum parameter detectieconditie meetwaarde meeteenheid medium profieltype plaatsing_profiel commentaar x_ml72 y_ml72
0 32784980 13935544 61823 47 0.0 0.2 2021 2021-12-16 PFDA < 0.2 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143874.97 217045.51
1 32785041 13935544 61823 44 0.0 0.2 2021 2021-12-16 PFDA < 0.2 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143449.23 217048.64
2 32785070 13935544 61823 PB7 0.0 0.0 2022 2022-01-18 PFDA < 1.0 ng/l Grondwater Peilbuis 2021-12-10 143622.88 217056.85
3 32785099 13935544 61823 46 0.0 0.2 2021 2021-12-16 PFDA < 0.2 µg/kg ds Vaste deel van de aarde Boring 2021-12-10 143747.63 217023.63
4 32785131 13935544 61823 40 6.1 7.1 2021 2021-12-20 PFDA < 1.0 ng/l Grondwater Peilbuis 2019-10-10 143654.00 217032.00
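
The note on string literals matters most for numeric and date fields such as jaar, meetwaarde or datum. A minimal sketch of a conversion helper follows. The name to_literal is hypothetical, and the ISO format for dates is an assumption based on how dates appear in the output.

```python
import datetime


def to_literal(value):
    """Convert a Python value to the string form used for a query literal.

    Hypothetical helper; formatting dates as ISO 8601 is an assumption.
    """
    if isinstance(value, datetime.date):
        return value.isoformat()
    return str(value)


print(to_literal(2022))                         # 2022
print(to_literal(0.5))                          # 0.5
print(to_literal(datetime.date(2021, 12, 16)))  # 2021-12-16
```

The returned strings can then be passed as the literal argument of an operator such as PropertyIsEqualTo.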

Select data and return a subset of columns

We can limit the columns in the output dataframe by specifying the return_fields parameter in our search.

In this example we query all the data in a bounding box, but only return some of the fields:

[11]:
from pydov.util.location import Within, Box

df = wfs_search.search(
    location=Within(Box(143400, 217000, 144000, 217200, epsg=31370)),
    return_fields=['datum', 'x_ml72', 'y_ml72', 'parameter',
                   'detectieconditie', 'meetwaarde', 'meeteenheid', 'medium']
)
df.head()
[000/001] .
[11]:
   datum       x_ml72     y_ml72     parameter      detectieconditie  meetwaarde  meeteenheid  medium
0  2021-12-16  143874.97  217045.51  HFPO-DA        <                 1.0         µg/kg ds     Vaste deel van de aarde
1  2021-12-16  143874.97  217045.51  MePFOSAtotaal  <                 0.5         µg/kg ds     Vaste deel van de aarde
2  2021-12-16  143874.97  217045.51  PFHpS          <                 0.2         µg/kg ds     Vaste deel van de aarde
3  2021-12-16  143874.97  217045.51  4:2 FTS        <                 0.2         µg/kg ds     Vaste deel van de aarde
4  2021-12-16  143874.97  217045.51  PFDA           <                 0.2         µg/kg ds     Vaste deel van de aarde
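
Since return_fields is a plain list of names, a typo is easy to make. The sketch below checks a requested list against the column names of the full output shown earlier before any request is sent. The helper check_return_fields is hypothetical and independent of whatever validation pydov performs itself.

```python
# Column names as they appear in the full search output earlier on.
available = [
    'id', 'opdracht', 'pfasdossiernr', 'profielnaam', 'top_in_m',
    'basis_in_m', 'jaar', 'datum', 'parameter', 'detectieconditie',
    'meetwaarde', 'meeteenheid', 'medium', 'profieltype',
    'plaatsing_profiel', 'commentaar', 'x_ml72', 'y_ml72',
]


def check_return_fields(requested, available):
    """Return the requested names, or raise if any of them is unknown."""
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f'unknown fields: {unknown}')
    return list(requested)


fields = check_return_fields(['datum', 'parameter', 'meetwaarde'], available)
print(fields)
```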

Using sorting and limiting to find the most recent data

You can use sorting and limiting to find the highest, deepest, oldest, newest, … data, depending on the available fields.

In this example we search for the 100 most recent records:

[12]:
from owslib.fes2 import SortBy, SortProperty

df = wfs_search.search(
    sort_by=SortBy([SortProperty('datum', 'DESC')]),
    max_features=100
)
df.head()
[000/001] .
[12]:
id opdracht pfasdossiernr profielnaam top_in_m basis_in_m jaar datum parameter detectieconditie meetwaarde meeteenheid medium profieltype plaatsing_profiel commentaar x_ml72 y_ml72
0 49484617 17945116 108231 PB2 2.08 2.78 2026 2026-01-16 PFDS < 10.0 ng/l Grondwater Peilbuis 2025-10-10 106744.0 193702.0
1 49484636 17945116 108231 PB2 2.08 2.78 2026 2026-01-16 < 50.0 ng/l Grondwater Peilbuis 2025-10-10 106744.0 193702.0
2 49484643 17945116 108231 PB2 2.08 2.78 2026 2026-01-16 PFOStotaal < 10.0 ng/l Grondwater Peilbuis 2025-10-10 106744.0 193702.0
3 49484621 17945116 108231 PB2 2.08 2.78 2026 2026-01-16 PFHxDA < 10.0 ng/l Grondwater Peilbuis 2025-10-10 106744.0 193702.0
4 49484620 17945116 108231 PB2 2.08 2.78 2026 2026-01-16 PFDoDA < 10.0 ng/l Grondwater Peilbuis 2025-10-10 106744.0 193702.0
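
To illustrate what a descending sort combined with a feature limit does, the sketch below applies the same idea locally to a few rows taken from the earlier outputs. This is only a plain-Python analogy: with sort_by and max_features the service does the work, so only the requested number of records is downloaded. The helper most_recent is hypothetical.

```python
from datetime import date

# A few (id, datum) pairs copied from the earlier result tables.
records = [
    {'id': 32784980, 'datum': date(2021, 12, 16)},
    {'id': 32785070, 'datum': date(2022, 1, 18)},
    {'id': 31063085, 'datum': date(2021, 6, 16)},
    {'id': 31063205, 'datum': date(2021, 6, 1)},
]


def most_recent(rows, n):
    """Sort rows by 'datum', newest first, and keep the first n."""
    return sorted(rows, key=lambda r: r['datum'], reverse=True)[:n]


for row in most_recent(records, 2):
    print(row['id'], row['datum'])
# 32785070 2022-01-18
# 32784980 2021-12-16
```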

Combining attribute queries to limit your results

You can combine multiple attribute queries to construct an advanced query that searches for exactly what you’re looking for. This is faster than requesting more data and filtering it afterwards.

In this example we search for data in a given bounding box, for a given year, parameter and medium, where the measured value exceeds a certain threshold:

[13]:
from pydov.util.location import Within, Box
from owslib.fes2 import PropertyIsEqualTo, PropertyIsGreaterThan, And

df = wfs_search.search(
    query=And([
        PropertyIsEqualTo(propertyname='jaar', literal='2022'),
        PropertyIsEqualTo(propertyname='parameter', literal='PFDA'),
        PropertyIsEqualTo(propertyname='medium', literal='Grondwater'),
        PropertyIsGreaterThan(propertyname='meetwaarde', literal='0.5'),
    ]),
    location=Within(Box(143400, 217000, 144000, 217200, epsg=31370))
)
df.head()
[000/001] .
[13]:
id opdracht pfasdossiernr profielnaam top_in_m basis_in_m jaar datum parameter detectieconditie meetwaarde meeteenheid medium profieltype plaatsing_profiel commentaar x_ml72 y_ml72
0 32785070 13935544 61823 PB7 0.0 0.0 2022 2022-01-18 PFDA < 1.0 ng/l Grondwater Peilbuis 2021-12-10 143622.88 217056.85
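
For comparison, the same four conditions can be written as a local filter over rows that were already downloaded. The sketch below uses three rows copied from the earlier PFDA result in this bounding box. It shows the logic of And, not a recommended workflow, since filtering afterwards means fetching more data first.

```python
# Three rows copied from the earlier PFDA result (subset of columns).
rows = [
    {'id': 32784980, 'jaar': 2021, 'parameter': 'PFDA',
     'meetwaarde': 0.2, 'medium': 'Vaste deel van de aarde'},
    {'id': 32785070, 'jaar': 2022, 'parameter': 'PFDA',
     'meetwaarde': 1.0, 'medium': 'Grondwater'},
    {'id': 32785131, 'jaar': 2021, 'parameter': 'PFDA',
     'meetwaarde': 1.0, 'medium': 'Grondwater'},
]

# Local counterparts of the four operators combined with And above.
conditions = [
    lambda r: r['jaar'] == 2022,
    lambda r: r['parameter'] == 'PFDA',
    lambda r: r['medium'] == 'Grondwater',
    lambda r: r['meetwaarde'] > 0.5,
]

# A row matches only if every condition holds, mirroring And([...]).
matches = [r for r in rows if all(cond(r) for cond in conditions)]
print([r['id'] for r in matches])  # [32785070]
```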