The VAT documentation is the first point of contact for VAT-related questions. It first covers the basics of the system in the Background section and then the functionality, with examples, in the User Guide section. It also provides links to further resources for answering your questions.
If you are curious or want to try the VAT yourself, you can follow this link: https://vat.gfbio.org/
The VAT system allows users to visualize geospatial data on a map in their browser and work with it interactively. Data and processing are provided by the Geo Engine backend service running in the Semantic Layer. The VAT system lists all data products available at the Geo Engine backend service as layers, which can be selected for visualization on the map. Layers can be combined and transformed interactively by constructing arbitrarily complex workflows, which themselves are visualized as new layers on the map or can be plotted, e.g., as a bar chart, right next to the map. This facilitates an interactive approach to constructing new data products and analyzing them.
The VAT system is served as a fully containerized application from the de.NBI cloud and is connected to the Geo Engine Backend service running in the Semantic Layer.
To become familiar with the VAT system, take a look at the publicly accessible instance, which has several datasets related to biodiversity research available.
You can run through the following example to get a first impression of what is possible with the VAT system. In the example, you will take occurrence datasets of two distinct elephant species and combine them with a vegetation index dataset to visualize the difference in their habitats.
Go to vat.gfbio.org. Click on Add Data (+) -> Layers -> Elephant example. There you can find three layers: Loxodonta africana, Loxodonta cyclotis and MOD13C2 NDVI. The first two are point datasets of occurrences of two elephant species; MOD13C2 NDVI is a vegetation index raster dataset.
Add all three layers to the map by clicking on them once. (Optional: Remove the Loxodonta cyclotis occurrences outside of Africa by first clicking on Add Data -> Draw Features, setting the type to "Polygon", and drawing a polygon around Africa by clicking on the map. Then, select Operators -> Point In Polygon, select the Loxodonta cyclotis point layer and the drawn polygon, and apply the operator.)
Click on Operators -> Raster Vector Join to configure a raster vector join operator. The raster vector join operator attaches raster values to points.
Select as point input one of the two elephant occurrence datasets and as raster input the NDVI dataset. Give the result a descriptive name like "Loxodonta africana with NDVI". Click on "Create" to add the new layer to the map. Repeat for the second elephant occurrence dataset.
Click on Operators -> Histogram. Set as input one of the two new layers created by the raster vector join. Select the "MOD13C2 NDVI" attribute. Click "Create". Repeat for the other layer.
Now compare the two histograms you created. You should clearly see that the forest elephant occurs more often in more densely vegetated areas than the bush elephant (as expected). You can also move around/zoom in/out on the map to compare the two histograms for different regions.
The VAT system aims to be as intuitive as possible. Wherever a deeper understanding is required, e.g., of the specific settings of operators, links to the documentation are provided where these are explained in depth.
Authmann, C., Beilschmidt, C., Drönner, J., Mattig, M., & Seeger, B. (2015). VAT: a system for visualizing, analyzing and transforming spatial data in science. Datenbank-Spektrum, 15, 175-184.
Beilschmidt, C., Drönner, J., Mattig, M., & Seeger, B. (2023). Geo Engine: Workflow-driven Geospatial Portals for Data Science. Datenbank-Spektrum, 1-9.
Geodata, i.e. data relating to location and time, is omnipresent, and its volume is constantly increasing. Geodata portals play a key role in the dissemination and utilization of geodata. They typically run in the cloud, so users only need a browser to use them. Although portals are sometimes highly specialized, the underlying software shares common requirements: data access, data processing and visualization must always be implemented. The Geo Engine provides all the components required to build geodata portals. It consists of a backend for processing and a frontend with components that can be freely combined in portals.
The Geo Engine is also a geographic information system (GIS) for processing data. Experts can use it to create workflows that generate a result from source data and processing steps. One example is linking animal observations to a temperature layer and filtering by average temperature to find animals that cope well with the cold. Once an interesting workflow has been found, a portal can be created from it that can be used intuitively, without prior knowledge. Geo Engine portals go far beyond static maps: they enable interactive analyses so that the data can be explored freely. Users can also contribute their own data and, after uploading it, merge it with the portal data. For example, a user can upload the GPS positions of a route to the portal and visualize the development of portal data along this route.
The Geo Engine consists of a backend, which usually runs on a server and provides data and functions for various frontends. The two frontends that belong to the Geo Engine are the web UI and the Python library. In addition, external tools can also communicate directly with the backend via standard interfaces.
The web UI enables the Geo Engine to be used in the browser. The elements of the web UI can be combined to create various applications. The Geo Engine GIS offers the greatest flexibility, but its wide range of functions requires some training and specialist knowledge. Dashboards, on the other hand, are aimed at a broader user group: they are specialized portals that focus strongly on one application and are easier to use thanks to predefined analyses. The Geo Engine comes with ready-made dashboards and allows new dashboards to be built from existing components.
The Python library is aimed at users with programming skills who want to process data outside of the Geo Engine, for example to create more complex diagrams or apply machine learning. In addition, the Geo Engine can be administered via Python: an admin token activates further functionality for data and user administration.
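As a minimal sketch, opening an administrative session could look as follows. Whether initialize() accepts a token argument in exactly this form is an assumption; consult the Geo Engine Python Library documentation for the precise signature.

import geoengine as ge

# Open a session with an admin token (sketch; the token keyword argument
# is an assumption, see the library documentation for the exact signature)
ge.initialize("https://vat.gfbio.org/api", token="<admin token>")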
The Geo Engine is based on standard software. The backend uses GDAL, PROJ and Apache Arrow, among others; the frontend is based on Angular and OpenLayers. Docker containers are available for the installation and operation of the Geo Engine, with one container image each for the backend and the frontend. Together with external components such as a PostgreSQL database, these can be bundled in a pod and provided as a separate instance.
The entry barrier for using the Geo Engine is very low, as publicly accessible instances run in the cloud and require no installation. Examples include the GFBio VAT system at https://vat.gfbio.org and the EBV Analyzer at https://portal.geobon.org/map. In addition to these portals, which are based on the Geo Engine and expose different subsets of its functionality, a demo of the Geo Engine GIS will be available in the future at https://www.geoengine.io.
The Geo Engine can also be installed and hosted on your own systems. It is then provided via Docker and requires some IT expertise. Geo Engine GmbH also offers hosting and support on request.
The Geo Engine is made available under an open-core license: the essential functions are open source and free to use, while certain additional functions are paid.
In the world of geodata processing, there is a huge amount of software with very different focuses. MapServer and GeoServer are server software that provide geodata for maps via web services. GeoNode is a data management platform built on GeoServer, among other components; it enables users to create, share and publish interactive maps. The Geo Engine goes far beyond this functionality: its operator toolbox and workflow engine make it possible to create analyses within the platform itself. Based on these workflows, specialized dashboards and portals can then be created that are easy for users to operate.
NFDI4Biodiversity contains a great deal of geodata, i.e. data that has a spatial and temporal reference. One example is the locations of specimens in a herbarium, which can have a time of discovery and GPS coordinates. It is important for the scientific community to be able to find and use this data as easily as possible. The Geo Engine can be seen as a toolbox for creating geo-applications within the framework of NFDI4Biodiversity.
Within NFDI4Biodiversity, there are two points of contact: the GFBio portal and user portals. GFBio is a sub-project that brings together data from German collections and data centers in the field of biodiversity and offers a point of contact for researchers. The Geo Engine can be accessed via the GFBio search, from which selected data can be visualized in a web GIS (Geographic Information System) in the browser. In addition, the Geo Engine can be used to perform GIS operations directly on the data without expert knowledge or installing software. One example is linking environmental data, e.g. temperature models, which the Geo Engine offers in addition to the GFBio data, with plant locations. The otherwise complicated work of linking two time series of different geodata is handled automatically by the Geo Engine. The data can in turn be visualized as maps, tables or plots, or downloaded for further use.
In addition to the GFBio portal, there is a proof of concept in which data portals based on the Geo Engine and selected datasets from NFDI4Biodiversity were created for specific research communities. Here, dashboards were built on top of the Geo Engine that are precisely tailored to the needs of individual user groups, offering selected functions with intuitive, coordinated usability.
The Geo Engine is used in very different scenarios. In the area of data portals, it provides the integration, visualization and analysis of geodata. Specifically, it is the technological basis of the Terranova portal, which is building a digital atlas of Europe. In the GEO BON EBV Data Portal, it enables the exploration of and access to Essential Biodiversity Variables, which provide indicators for the development of global biodiversity.
In research, the Geo Engine is used to connect complex datasets, implement special algorithms and build analysis workflows. It is used in the RESPECT project, which is investigating environmental changes in tropical mountain forests in southern Ecuador. In CropHype, it provides the basis for improving the classification of agricultural fields using new types of satellite data.
One use case from industry is the enrichment of proprietary data with publicly available data that is difficult to obtain and process. A concrete example is the calculation of vegetation indices, a measure of how densely vegetated an area is. Here, the Geo Engine procures the necessary satellite data, calculates the vegetation indices and links them to the company data. The results are made available via standard interfaces so that they can be integrated into company processes.
Here, we answer frequently asked questions about the VAT System.
The VAT System is built on top of the Geo Engine, a powerful geospatial processing engine. It provides a user-friendly interface designed to make it easy for users to access and work with geodata from NFDI4Biodiversity.
VAT is developed by the Database Research Group of the University of Marburg (head: Prof. Bernhard Seeger). The design of VAT was a joint collaboration with the Senckenberg Biodiversity and Climate Research Centre (BiK-F) (head: Prof. Thomas Hickler).
VAT is hosted and operated by GFBio - Gesellschaft für Biologische Daten e.V. (Imprint).
VAT is built upon the Geo Engine, a cloud-ready geo-spatial data processing platform. Learn more about Geo Engine on GitHub or visit the Geo Engine website.
If you have any questions or feedback, please feel free to contact us.
Here, we describe the most important features of the VAT system and the Geo Engine.
VAT utilizes the Geo Engine Operator Toolbox to provide a wide range of geospatial processing capabilities. The toolbox offers operators for processing raster and vector data, such as filtering, combining, and aggregating. It is designed to be user-friendly and intuitive, allowing users to easily create complex processing chains, and it is extensible: users can create custom expressions to meet their specific needs.
The Geo Engine Python Library allows users to interact programmatically with a Geo Engine backend, for instance the one offered in the Semantic Layer. It allows administrators to manage a Geo Engine instance, for example to assign roles. Users can manage their datasets, layers and workflows, and load data products into Python for further processing and analysis tasks. Having Geo Engine data products directly available in Python facilitates their use in external tools users are already working with. For example, the library can be used in Jupyter Notebooks to construct and retrieve data products from a Geo Engine backend, taking advantage of Geo Engine's powerful geospatial processing capabilities; with a data product loaded into Python, any suitable visualization tool can then be used within the notebook. Furthermore, when connected to the same Geo Engine backend, a user can seamlessly switch between the Geo Engine web frontend (VAT) and the Python library, choosing the tool best suited to the task at hand.
To become familiar with the Geo Engine Python Library, take a look at the examples in the GitHub repository. You can connect to the Geo Engine backend running in the Semantic Layer.
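As a minimal sketch, a first connection could look like this (using the public VAT instance URL; a Semantic Layer deployment would use its own API URL). The listing call mirrors the examples later in this documentation:

import geoengine as ge

# Connect to the Geo Engine backend behind the public VAT instance
ge.initialize("https://vat.gfbio.org/api")

# List the top-level layer collections offered by this backend
root_collection = ge.layer_collection()
for item in root_collection.items:
    print(item.name)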
In addition to the examples, which offer a good starting point, documentation of all available functionality is also provided.
The source code of the Geo Engine Python Library is publicly available on GitHub.
The VAT system provides a search integration with the GFBio search that allows users to transfer search results directly to the VAT system.
To search for data, users can enter a search term in the search bar and press the Enter key. This will show the search results. Users can filter the search results by selecting Visualizable in VAT in the menu on the left side. This will show only the datasets that can be visualized in the VAT system.
Users can add search results to the basket by clicking on the Basket button.
Users can transfer the data from the basket to the VAT system by opening the search basket. This will show all datasets that are in the basket. Users can select the datasets they want to transfer to the VAT system and click on the Visualize in VAT button.
This will open a dialog in the VAT system where users can select the layers they want to add to the map and choose whether they should replace the current layers or be added on top of them.
The selected layers will be added to the map in the VAT system. Users can now work with the data as they would with any other data in the VAT system.
GFBio's connected Data Centers provide access to a variety of data archives. The VAT system allows users to access these archives directly, making it easier to access data without users having to download it themselves. In addition, users can map the data together with other data sources in the VAT system.
In the background, the VAT system harvests all ABCD data from the GFBio Search Index every night. Thus, updates to the ABCD data are available in the VAT system the next day.
To find the ABCD data, users can click on the + button in the data menu. This opens a dialog where users can select the GFBio ABCD Datasets menu item. This will show all ABCD datasets that are available in the VAT system.
Users can select the data they are interested in by clicking on the dataset. This will load all occurrences from the selected dataset into the VAT system as a new layer.
The data is displayed on the map as clustered points and can then be used like any other data in the VAT system. Zooming in will dissolve the clusters and show the individual occurrences. Users can also open the data table to see more attributes of the occurrences.
Some ABCD datasets contain links to multimedia items. These can be images, videos, or audio files. Users can click on the multimedia item in the data table to open it in a new dialog. For instance, when the item is an image, it will be displayed directly in the VAT system.
The data table will show at most three links for clustered occurrences. As a user, you can zoom in to see more items.
To cite the data, users can click on the Show Provenance icon in the context menu. This will open a table that shows the citation for the data.
The table has three columns: Citation, License, and URI. The Citation column contains the citation for the data. The License column contains the license under which the data is available. The URI column contains the URI to the license file.
The VAT system provides access to a snapshot of the GBIF occurrence data. This allows users to easily access GBIF data without having to download the data themselves. In addition, they can map the data together with other data sources in the VAT system.
The GBIF snapshot contains all occurrences available at gbif.org at the time of the snapshot. The snapshot date is noted in the GBIF data provider description (see Add Data dialog above). Since data in VAT is spatio-temporal, we filter the occurrences by three conditions:
All occurrences fulfilling these three conditions are imported into a PostgreSQL database and indexed by time and space (using the PostGIS extension), as well as family, genus and species names. To enable browsing along the taxonomic hierarchy, we additionally import GBIF's backbone taxonomy. We also retrieve the citations for all datasets through the registry API endpoint to be able to compile them according to GBIF's citation guidelines for a set of filtered occurrences.
The GBIF data are made available as a data provider, which can be selected in the data menu (+). There, VAT groups the GBIF occurrences by different taxonomic ranks, e.g. family or species. Selecting such data will load all occurrence records from different datasets that fall under this taxonomic rank, e.g. all occurrences of the genus Abedus (water bug).
While users can browse lists of taxonomic ranks, they can also search for specific taxa. This makes it much easier to find the data of interest when specific taxa are known. At the top of the GBIF catalog, users can find the search icon on the right-hand side. Clicking on it brings up a search bar where users can enter the name of the taxon of interest. By typing in a few letters of the taxon name, VAT will suggest possible names that can be selected if they seem appropriate. Clicking the search icon again, or pressing ENTER, will display a list of search results.
Users can also change the default search settings by clicking the options icon next to the search icon. This opens a dialog that allows users to change the default search settings, such as the search type. The Fulltext search matches the term anywhere in the name, while the Prefix search matches only the beginning of the name. In addition, users can filter their results by taxonomic rank, e.g., show only results that are of the rank Species. This can be done by first selecting one of the collections, e.g., Species datasets, and then performing the search.
At all browsing levels, the current filter, if selected, is also respected during a search. This is an upgrade over a previous version of the search, which could only search for family, genus, or species. Now you can, for example, filter for a specific kingdom and order beforehand, combining hierarchical browsing and searching to reduce the number of search results.
This chapter contains video tutorials on how to use the VAT system. These tutorials are designed to help users get started with the VAT system and to demonstrate how to use its features.
++ Currently, the examples are being reworked after the latest update because GBIF behaves differently now. Find out more. ++
Welcome to the Introduction to VAT.
This first tutorial will introduce you to the VAT system, which can be used to easily load, transform and explore spatio-temporal datasets, such as in the context of ecological science. This tutorial will give you a tour, explain each menu and show the functionality in a simple first use case where we spatially join the minimum and maximum temperature with the GBIF occurrence data of Aeshna affinis.
Let the tour begin!
The most prominent area when opening the link https://vat.gfbio.org is the large map. Here you can visualise the spatio-temporal data. The extent of the map can be changed by dragging with the mouse or zooming with the scroll wheel.
Next, in the top left-hand corner, is the layer selection menu, which allows you to view all the layers currently loaded, change the symbology or arrange the layers. You can also view the provenance, data table or download the layer.
Also in the top left-hand corner is the GFBio Portal button, which takes you back to the GFBio Search when you have finished your data exploration. Due to the deep integration between the VAT and the GFBio search, it is possible to load data directly from the search.
Next to the GFBio button is a zoom manipulation menu. In addition to the scroll wheel, the zoom level can be changed using the zoom-in and zoom-out buttons.
In the middle of the top bar is the time step selector. When viewing spatio-temporal data, you may wish to change the time by one time step. This menu can be used to move the current time step or to open the time selector, which we will see in a moment.
On the top bar you will find a series of icons, which we will visit next.
The first icon is the Account menu. Here you can log in with your GFBio account, which allows you to upload files or create, save or export a project. It also shows the session token, which can be used in Python to access your uploaded files.
The next menu is the data selection menu. Here you will find several data catalogues. The Data Catalogue contains datasets hosted by the Geo Engine, such as land use classification, climate information or orographic elevation maps. The Personal Catalogue contains all files and workflows, and the All Datasets Catalogue contains all hosted and uploaded datasets. Below these are the GBIF and GFBio ABCD data catalogues, which contain all datasets derived from the respective data providers. It is also possible to draw features or load a layer by inserting the workflow_id from a Python workflow.
Behind the cogwheel icon is the operator selection menu. Here you will find a range of operators to manipulate, transform, merge or plot vector or raster data.
The plots are then displayed in the Plot Window. Here you can view the plot results and delete plots.
The next menu is the time configuration menu. Here you can filter the spatio-temporal data. It is also possible to change the time step using the time step selector.
If you are logged in, the workspace settings allow you to save and load projects and change the spatial reference of your project.
The last menu is the Help section. Here you will find initial information and links to the Geo Engine documentation, as well as further information about the VAT.
After this brief tour, let us start with an example workflow to demonstrate the capabilities of the VAT.
First we go to the data selection menu and search for Aeshna affinis in the GBIF data catalogue. Clicking on the file loads the layer into the map.
To link the occurrence data with temperature, we search for the Minimum Temperature dataset in the data catalogue.
The Minimum Temperature dataset is a spatio-temporal dataset and therefore has a spatial and temporal extent. This can be found in the metadata of the dataset.
To adjust the time range, change the time in the time configuration menu.
We also load the Maximum Temperature dataset.
As the visual appearance of the temperature datasets is not appealing, we change the symbology of the raster layers.
Clicking on Edit Symbology takes us to the Edit Symbology menu. Here we scroll down, select a different colour map such as VIRIDIS or MAGMA and click on Create colour map. Finally, we confirm the change with the Apply button at the bottom of the menu.
After loading the data, we want to spatially join the occurrence data of Aeshna affinis with the Minimum Temperature and Maximum Temperature datasets using the raster vector join operator. For better readability it is recommended to name the datasets.
The result is that the vector data is spatially linked to the raster data by position. Therefore, new columns are added to the vector data table containing the information.
The Histogram operator can be used to visualise the distribution of occurrence data as a function of temperature.
The graphs then show the distribution of occurrences of Aeshna affinis as a function of the minimum and maximum temperatures on 1 January 1990.
When you are finished manipulating the data, you can download the raster data as a .tif file and the vector data as a .shp file from the layer selection menu.
In the menu it is also possible to display the provenance, which will then appear in the data table area at the bottom of the VAT.
This was the first introductory tour of the VAT system. If you want to learn more, you can do so by watching the videos or exploring the use cases in this documentation.
Warning: The VAT system is designed primarily for data exploration. Changing the extent of the visible map recalculates the workflow and may change the results! This must be taken into account when working scientifically with the VAT system. There is also a new window in the bottom left corner; it must be kept open when working scientifically with the VAT system, as it enables reproducibility!
Tip: The layers have several options. They can be downloaded to work with the data in other systems. Each layer also has a workflow tree, and its workflow_id can be copied to import the workflow directly into Python, as sketched below.
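A minimal sketch of importing a copied workflow_id into Python (the id below is a placeholder, and workflow_by_id is assumed to be the library helper for resolving an existing workflow; see the Python library documentation):

import geoengine as ge

ge.initialize("https://vat.gfbio.org/api")

# Placeholder id; copy the real workflow_id from the VAT layer menu
workflow = ge.workflow_by_id("00000000-0000-0000-0000-000000000000")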
Welcome to the Canis lupus meets Felis silvestris use case.
In this example, the GBIF occurrence data of Canis lupus and Felis silvestris are clipped to the extent of Germany and linked to the land use classification of the Ökosystematlas.
To begin, we select the Data Catalogue in the top right-hand corner. Here we have several data catalogues to choose from.
In our case, we start by searching for the individual species in the GBIF data provider. The search function makes it easy to find the species, so we search for Canis lupus and load the dataset by selecting it.
For the spatial selection we also need the German borders, which we find by searching for Germany in the data catalogue.
In order to link the occurrence data with the land use classification, we also load the Ökosystematlas by searching for it in the personal data catalogue. The personal data catalogue contains all datasets uploaded by the user, as well as a section with all datasets, which also contains datasets that are not listed elsewhere.
The next step takes place in the Operators section, located in the top right-hand corner.
First we use a Point in Polygon Filter to restrict our occurrence data to Germany. For better readability it is recommended to name the datasets.
Next, we join the raster data to the vector data using the Raster Vector Join Operator, which takes the occurrence data as the vector input and the Ökosystematlas as the raster input.
The result is that the vector data is spatially linked to the raster data by position. Therefore, a new column is added to the vector data table containing the information.
To visualise the classified data, it is recommended to use the Class Histogram operator, which translates the Ökosystematlas class numbers into class names using the metadata.
The graph then shows the distribution of occurrences according to class.
Using the same procedure for Felis silvestris, it is possible to compare the occurrence of the two species.
Warning: The VAT system is designed primarily for data exploration. Changing the extent of the visible map recalculates the workflow and could change the results! This must be taken into account when working scientifically with the VAT system. There is also a new window in the bottom left corner; it must be kept open when working scientifically with the VAT system, as it enables reproducibility!
Welcome to the Dry Land Use Case.
In this example, the GBIF occurrence data of Calopteryx splendens are clipped to the extent of Germany and merged with the land use classification from the Ökosystematlas as well as a time series of average temperature provided by the WorldClim dataset.
In our case we start by searching for Calopteryx splendens in the GBIF data provider. The search function makes it easy to find the species, so we can search for Calopteryx splendens and load the dataset by selecting it.
For the spatial selection we also need the German border, which we find by searching for Germany in the data catalogue.
Next, for the link between the occurrence data and the average temperature, we search for the Average Temperature dataset in the data catalogue.
Caution: The Average Temperature is a spatio-temporal dataset. Always check the spatial and temporal extent in the metadata.
The Average Temperature dataset covers the whole Earth and a time range from 1970/01/01 to 2000/12/31. To view it, we need to change the time in the time menu at the top right.
As the dataset does not look very attractive, we will change the colour palette of the raster data. This can be done by right-clicking on the layer and selecting Edit Symbology.
In the symbology menu, scroll down to Create colour table, select a colour map such as VIRIDIS or MAGMA, click the Create colour table button and confirm with the Apply button at the bottom of the symbology menu.
Next, we join the raster data to the vector data using the Raster Vector Join Operator, which takes the occurrence data as the vector input and the Ökosystematlas and Average Temperature as raster inputs.
The Histogram operator can be used to visualise the distribution of occurrence data as a function of average temperature.
The plots then show the distribution of occurrences of Calopteryx splendens as a function of, firstly, the average temperature on 1 January 2000 and, secondly, the land use classification of the Ökosystematlas.
Warning: The VAT system is designed primarily for data exploration. Changing the extent of the visible map recalculates the workflow and could change the results! This must be taken into account when working scientifically with the VAT system. There is also a new window in the bottom left corner; it must be kept open when working scientifically with the VAT system, as it enables reproducibility!
This workflow is a contribution to the NFDI4Earth conference.
The video for this use case is coming soon!
Welcome to the VAT 4 ML Use Case.
In this example we will label training data in VAT for Germany, transfer it to a Jupyter notebook using the unique workflow identifier, download the training data as a geodataframe and finally use a machine learning model to build a species distribution model.
For this use case, we will therefore use the frequency of Arnica montana occurrences from GBIF as the target variable, together with weather data from CHELSA, the land use classification from the Ökosystematlas and topographic information as predictor variables.
To begin, select the Data Catalogue in the top right-hand corner. Here we have several data catalogues to choose from.
In our case, we start by searching for the individual species in the GBIF data provider. The search function makes it easy to find the species, so we search for Arnica montana and load the dataset by selecting it.
For the weather data, we take weather information from CHELSA. Here we choose the Mean daily air temperature, Monthly moisture index and Monthly precipitation amount.
Caution: The weather data is a spatio-temporal dataset. Always check the spatial and temporal extent in the metadata.
The weather datasets cover the whole earth and a time range from 01/01/1981 to 01/01/2011. We need to change the time in the time menu at the top right.
To add topographic information to the predictor variables, we include the SRTM elevation model.
Finally, we add land use classification data, in this case the Ökosystematlas. It can be loaded by searching for it in the personal data catalogue. The personal data catalogue contains all the datasets that the user has uploaded, as well as a section with all datasets, which also contains datasets that are not listed elsewhere.
This gives us all the layers we need to create the training and prediction data.
We start creating the training data and prepare the prediction data by aggregating the spatio-temporal weather data. To do this, we use the Temporal Raster Aggregation operator, which aggregates temporal data over a moving window (e.g., 1 year). We use this operator for all weather data: for the temperature and the moisture index we choose the mean aggregation type, and for the precipitation the sum aggregation type. For better readability it is recommended to name the datasets.
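For readers working on the Python side, a sketch of such an aggregation workflow with the geoengine package is shown below. The parameter layout mirrors the other operator examples in this documentation, but the exact spec and the dataset name used here are assumptions; consult the operator reference.

# Sketch: aggregate a daily temperature raster into yearly means
# ("mean_daily_air_temperature" is a hypothetical dataset name)
workflow_t_mean_yearly = ge.register_workflow({
    "type": "Raster",
    "operator": {
        "type": "TemporalRasterAggregation",
        "params": {
            "aggregation": {"type": "mean", "ignoreNoData": True},
            "window": {"granularity": "years", "step": 1}
        },
        "sources": {
            "raster": {
                "type": "GdalSource",
                "params": {"data": "mean_daily_air_temperature"}
            }
        }
    }
})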
In a second step, we spatially filter the GBIF occurrence data of Arnica montana using the Point in Polygon Filter to restrict our occurrence data to Germany.
Finally, to create the training data, we join the prepared raster data to the vector data using the Raster Vector Join Operator, which takes the occurrence data as the vector input and the prepared raster layers as raster inputs. This allows us to spatially join the occurrences with the values of the underlying raster cells.
To create the prediction data, we then use the Raster Stacker operator to create a multi-layer raster containing all the raster data. This makes it easier to import it into Jupyter Notebook and work with it.
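A corresponding Python sketch of stacking rasters might look as follows (the renameBands parameter and the dataset names are assumptions; see the operator reference):

# Sketch: stack several rasters into one multi-layer raster
# (dataset names are hypothetical placeholders)
workflow_stack = ge.register_workflow({
    "type": "Raster",
    "operator": {
        "type": "RasterStacker",
        "params": {"renameBands": {"type": "default"}},
        "sources": {
            "rasters": [
                {"type": "GdalSource", "params": {"data": "srtm"}},
                {"type": "GdalSource", "params": {"data": "oekosystematlas"}}
            ]
        }
    }
})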
This brings us to the Arnica montana training data and the stacked prediction grid data.
We now copy the Workflow ID for each layer to use in Jupyter Notebook.
In Jupyter Notebook, we use the geoengine package to initialise the VAT API and import the training data workflow. We then round and group the data to obtain a frequency of Arnica montana occurrences for each combination of predictor values. The frequency serves as the target variable and the remaining columns as the predictor variables. After splitting the dataset into training and test data, we train a RandomForestRegressor using a GridSearchCV strategy for better results. The best resulting model has an R² value of 0.07.
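A condensed sketch of this training step is shown below. The column name "frequency" and the hyperparameter grid are assumptions for illustration; the actual notebook derives its columns from the layer names.

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# "df" is the grouped training dataframe described above;
# "frequency" is a hypothetical name for the target column
X = df.drop(columns=["frequency"])
y = df["frequency"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Grid search over a small, illustrative hyperparameter grid
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    cv=5,
)
search.fit(X_train, y_train)

# R² of the best model on the held-out test data
print(search.best_estimator_.score(X_test, y_test))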
After model training, we can import the prediction data workflow. The best RandomForestRegressor model is used for the final prediction.
Finally, the result is plotted using the matplotlib package.
Although the model did not show the best performance, this use case demonstrates how easy it is to create spatio-temporal training data for machine learning applications using the VAT and to export the data directly into Python, where it can be used in typical formats such as a geopandas GeoDataFrame or an xarray DataArray.
This chapter contains examples of how to use the VAT system. The examples are written in Jupyter notebooks and are available in the examples directory. The notebooks are converted to markdown and included in the user documentation.
Welcome to geoengine-python! This notebook is intended to show you around and explain the basics of how geoengine-python and VAT are related.
The purpose of this notebook is to demonstrate the capabilities of Geo Engine. Therefore some useful techniques will be shown:
When building your own nested workflow, it is recommended to build it in several steps as seen in this notebook.
Documentation about the operators and how to use them in Python can be found here: https://docs.geoengine.io/operators/intro.html
The first thing to do is to import the geoengine-python package:
import geoengine as ge
For plotting it is currently also necessary to import Altair:
import altair as alt
# Other imports
from datetime import datetime
import matplotlib.pyplot as plt
To establish a connection with the VAT, ge.initialize can be used together with the API URL:
ge.initialize("https://vat.gfbio.org/api")
In the case of a locally hosted instance, the link would be http://localhost:4200/api.
For more comfortable work with the GBIF DataProvider, its provider id can be looked up by name in the root_collection:
root_collection = ge.layer_collection()
gbif_prov_id = ''
for elem in root_collection.items:
    if elem.name == 'GBIF':
        gbif_prov_id = str(elem.provider_id)
gbif_prov_id
'1c01dbb9-e3ab-f9a2-06f5-228ba4b6bf7a'
To load data, use operators, or plot vector data, workflows need to be created, as shown below for loading the dragonfly species Aeshna affinis.
A workflow needs to be registered in the VAT or Geo Engine instance. For this, ge.register_workflow can be called with the workflow definition in JSON:
workflow_aeshna_affinis = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "OgrSource",
        "params": {
            "data": f"_:{gbif_prov_id}:`species/Aeshna affinis`",
        }
    }
})
workflow_aeshna_affinis
c7b6b25a-714d-58d1-9f53-db7bf4995a5b
Alternatively the workflow_builder can be used as shown here: TODO
The result of each registration is a workflow_id, which can be used directly in VAT to trigger the workflow. To load the vector data from VAT, the .get_dataframe method can be used. It takes as parameters the search extent, a time interval, the spatial resolution and a coordinate reference system.
# Set time
start_time = datetime.strptime(
    '2010-01-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z")
end_time = datetime.strptime(
    '2011-01-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z")

# Request the data from Geo Engine into a geopandas dataframe
data = workflow_aeshna_affinis.get_dataframe(
    ge.QueryRectangle(
        ge.BoundingBox2D(-180, -90, 180, 90),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Plot the data
ax = data.plot(markersize=3)
ax.set_xlim([-180, 180])
ax.set_ylim([-90, 90])
(-90.0, 90.0)
The extent was chosen to make it clear that Aeshna affinis occurs only on the Eurasian continent. Without the x- and y-limits, the plot would look different:
data.plot()
<Axes: >
In addition to vector data, raster data can also be loaded from the VAT.
To load raster data, a workflow must again be registered, but this time the 'GdalSource' is used instead of the 'OgrSource':
workflow_t_min = ge.register_workflow({
    "type": "Raster",
    "operator": {
        "type": "RasterScaling",
        "params": {
            "slope": {"type": "constant", "value": 0.1},
            "offset": {"type": "constant", "value": -273.15},
            "outputMeasurement": {
                "type": "continuous",
                "measurement": "temperature",
                "unit": "K/10"
            },
            "scalingMode": "mulSlopeAddOffset"
        },
        "sources": {
            "raster": {
                "type": "RasterTypeConversion",
                "params": {
                    "outputDataType": "F32"
                },
                "sources": {
                    "raster": {
                        "type": "GdalSource",
                        "params": {
                            "data": "mean_daily_minimum_2m_air_temperature"
                        }
                    }
                }
            }
        }
    }
})
workflow_t_min
a57efb5a-7256-58b9-b9f2-9f22d9724bab
The raster data can then be requested as an xarray.DataArray and plotted that way:
# Request the data from Geo Engine into an xarray DataArray
data = workflow_t_min.get_xarray(
    ge.QueryRectangle(
        ge.BoundingBox2D(-180, -90, 180, 90),
        ge.TimeInterval(start_time, start_time),
        resolution=ge.SpatialResolution(1., 1.),
        srs="EPSG:4326"
    )
)

# Plot the data
data.plot(vmin=-50, vmax=50)
<matplotlib.collections.QuadMesh at 0x7fb654cef9a0>
The same can be done for the maximum temperature:
workflow_t_max = ge.register_workflow({
    "type": "Raster",
    "operator": {
        "type": "RasterScaling",
        "params": {
            "slope": {"type": "constant", "value": 0.1},
            "offset": {"type": "constant", "value": -273.15},
            "outputMeasurement": {
                "type": "continuous",
                "measurement": "temperature",
                "unit": "K/10"
            },
            "scalingMode": "mulSlopeAddOffset"
        },
        "sources": {
            "raster": {
                "type": "RasterTypeConversion",
                "params": {
                    "outputDataType": "F32"
                },
                "sources": {
                    "raster": {
                        "type": "GdalSource",
                        "params": {
                            "data": "mean_daily_maximum_2m_air_temperature"
                        }
                    }
                }
            }
        }
    }
})
workflow_t_max
cdfe579d-b451-5b7e-b98d-bf0570489784
# Request the data from Geo Engine into an xarray DataArray
data = workflow_t_max.get_xarray(
    ge.QueryRectangle(
        ge.BoundingBox2D(-180, -90, 180, 90),
        ge.TimeInterval(start_time, start_time),
        resolution=ge.SpatialResolution(1.0, 1.0),
        srs="EPSG:4326"
    )
)

# Plot the data
data.plot(vmin=-50, vmax=50)
<matplotlib.collections.QuadMesh at 0x7fb652906ad0>
As well as loading data, the VAT has several operators for manipulating or transforming geodata. One example is the raster vector join.
The raster vector join operator joins the vector data to one or more raster layers based on the position of the vector features. As shown in this example, the inputs are essentially the individual workflows seen before:
workflow_aeshna_affinis_join = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "RasterVectorJoin",
        "params": {
            "names": {
                "type": "names",
                "values": ["Min_Temperature", "Max_Temperature"]
            },
            "temporalAggregation": "none",
            "featureAggregation": "mean",
        },
        "sources": {
            # Aeshna affinis
            "vector": {
                "type": "OgrSource",
                "params": {
                    "data": f"_:{gbif_prov_id}:`species/Aeshna affinis`",
                }
            },
            "rasters": [
                # Minimum temperature
                {
                    "type": "RasterScaling",
                    "params": {
                        "slope": {"type": "constant", "value": 0.1},
                        "offset": {"type": "constant", "value": -273.15},
                        "outputMeasurement": {
                            "type": "continuous",
                            "measurement": "temperature",
                            "unit": "K/10"
                        },
                        "scalingMode": "mulSlopeAddOffset"
                    },
                    "sources": {
                        "raster": {
                            "type": "RasterTypeConversion",
                            "params": {"outputDataType": "F32"},
                            "sources": {
                                "raster": {
                                    "type": "GdalSource",
                                    "params": {
                                        "data": "mean_daily_minimum_2m_air_temperature"
                                    }
                                }
                            }
                        }
                    }
                },
                # Maximum temperature
                {
                    "type": "RasterScaling",
                    "params": {
                        "slope": {"type": "constant", "value": 0.1},
                        "offset": {"type": "constant", "value": -273.15},
                        "outputMeasurement": {
                            "type": "continuous",
                            "measurement": "temperature",
                            "unit": "K/10"
                        },
                        "scalingMode": "mulSlopeAddOffset"
                    },
                    "sources": {
                        "raster": {
                            "type": "RasterTypeConversion",
                            "params": {"outputDataType": "F32"},
                            "sources": {
                                "raster": {
                                    "type": "GdalSource",
                                    "params": {
                                        "data": "mean_daily_maximum_2m_air_temperature"
                                    }
                                }
                            }
                        }
                    }
                }
            ]
        }
    }
})
workflow_aeshna_affinis_join
8b26f457-4d52-5f35-b10a-aca7352f47d1
The input parameters required for each operator can be found in the documentation: https://docs.geoengine.io/operators/intro.html. In this example, the RasterVectorJoin operator takes two inputs: vector, the vector layer to use, and rasters, the one or more raster layers to join.
The resulting vector data can again be retrieved by requesting the data as a GeoDataFrame:
# Request the data from Geo Engine into a geopandas dataframe
data_aeshna_affinis = workflow_aeshna_affinis_join.get_dataframe(
    ge.QueryRectangle(
        ge.BoundingBox2D(-180, -90, 180, 90),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Show the geopandas dataframe
data_aeshna_affinis
978 rows × 8 columns
The data can then be plotted directly in Python:
fig, ax = plt.subplots(1, 2, figsize=(20, 10))
data_aeshna_affinis.plot(ax=ax[0], column='Min_Temperature', legend=True,
                         legend_kwds={'label': 'Minimum Temperature'})
data_aeshna_affinis.plot(ax=ax[1], column='Max_Temperature', legend=True,
                         legend_kwds={'label': 'Maximum Temperature'})
plt.show()
The VAT also offers some of its own plot types, such as histograms.
Of course, a workflow must be registered in order to plot the data:
workflow_aeshna_affinis_join_plot_min = ge.register_workflow({
    "type": "Plot",
    "operator": {
        "type": "Histogram",
        "params": {
            "attributeName": "Min_Temperature",
            "bounds": "data",
            "buckets": {
                "type": "number",
                "value": 20
            }
        },
        "sources": {
            # Aeshna affinis join
            "source": {
                "type": "RasterVectorJoin",
                "params": {
                    "names": {
                        "type": "names",
                        "values": ["Min_Temperature", "Max_Temperature"]
                    },
                    "temporalAggregation": "none",
                    "featureAggregation": "mean",
                },
                "sources": {
                    "vector": {
                        "type": "OgrSource",
                        "params": {
                            "data": f"_:{gbif_prov_id}:`species/Aeshna affinis`",
                        }
                    },
                    "rasters": [
                        {
                            "type": "RasterScaling",
                            "params": {
                                "slope": {"type": "constant", "value": 0.1},
                                "offset": {"type": "constant", "value": -273.15},
                                "outputMeasurement": {
                                    "type": "continuous",
                                    "measurement": "temperature",
                                    "unit": "K/10"
                                },
                                "scalingMode": "mulSlopeAddOffset"
                            },
                            "sources": {
                                "raster": {
                                    "type": "RasterTypeConversion",
                                    "params": {"outputDataType": "F32"},
                                    "sources": {
                                        "raster": {
                                            "type": "GdalSource",
                                            "params": {
                                                "data": "mean_daily_minimum_2m_air_temperature"
                                            }
                                        }
                                    }
                                }
                            }
                        },
                        {
                            "type": "RasterScaling",
                            "params": {
                                "slope": {"type": "constant", "value": 0.1},
                                "offset": {"type": "constant", "value": -273.15},
                                "outputMeasurement": {
                                    "type": "continuous",
                                    "measurement": "temperature",
                                    "unit": "K/10"
                                },
                                "scalingMode": "mulSlopeAddOffset"
                            },
                            "sources": {
                                "raster": {
                                    "type": "RasterTypeConversion",
                                    "params": {"outputDataType": "F32"},
                                    "sources": {
                                        "raster": {
                                            "type": "GdalSource",
                                            "params": {
                                                "data": "mean_daily_maximum_2m_air_temperature"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
})
workflow_aeshna_affinis_join_plot_min
8426078a-2940-5a76-8f16-afda4ed45b80
The .plot_chart method can be used to get the plot, which can then be displayed using the altair package:
# Request the plot from Geo Engine
plot_aeshna_affinis_min = workflow_aeshna_affinis_join_plot_min.plot_chart(
    ge.QueryRectangle(
        ge.BoundingBox2D(-180, -90, 180, 90),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Show the plot
alt.Chart.from_dict(plot_aeshna_affinis_min.spec)
workflow_aeshna_affinis_join_plot_max = ge.register_workflow({
    "type": "Plot",
    "operator": {
        "type": "Histogram",
        "params": {
            "attributeName": "Max_Temperature",
            "bounds": "data",
            "buckets": {
                "type": "number",
                "value": 20
            }
        },
        "sources": {
            # Aeshna affinis join
            "source": {
                "type": "RasterVectorJoin",
                "params": {
                    "names": {
                        "type": "names",
                        "values": ["Min_Temperature", "Max_Temperature"]
                    },
                    "temporalAggregation": "none",
                    "featureAggregation": "mean",
                },
                "sources": {
                    "vector": {
                        "type": "OgrSource",
                        "params": {
                            "data": f"_:{gbif_prov_id}:`species/Aeshna affinis`",
                        }
                    },
                    "rasters": [
                        {
                            "type": "RasterScaling",
                            "params": {
                                "slope": {"type": "constant", "value": 0.1},
                                "offset": {"type": "constant", "value": -273.15},
                                "outputMeasurement": {
                                    "type": "continuous",
                                    "measurement": "temperature",
                                    "unit": "K/10"
                                },
                                "scalingMode": "mulSlopeAddOffset"
                            },
                            "sources": {
                                "raster": {
                                    "type": "RasterTypeConversion",
                                    "params": {"outputDataType": "F32"},
                                    "sources": {
                                        "raster": {
                                            "type": "GdalSource",
                                            "params": {
                                                "data": "mean_daily_minimum_2m_air_temperature"
                                            }
                                        }
                                    }
                                }
                            }
                        },
                        {
                            "type": "RasterScaling",
                            "params": {
                                "slope": {"type": "constant", "value": 0.1},
                                "offset": {"type": "constant", "value": -273.15},
                                "outputMeasurement": {
                                    "type": "continuous",
                                    "measurement": "temperature",
                                    "unit": "K/10"
                                },
                                "scalingMode": "mulSlopeAddOffset"
                            },
                            "sources": {
                                "raster": {
                                    "type": "RasterTypeConversion",
                                    "params": {"outputDataType": "F32"},
                                    "sources": {
                                        "raster": {
                                            "type": "GdalSource",
                                            "params": {
                                                "data": "mean_daily_maximum_2m_air_temperature"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
})

# Request the plot from Geo Engine
plot_aeshna_affinis_max = workflow_aeshna_affinis_join_plot_max.plot_chart(
    ge.QueryRectangle(
        ge.BoundingBox2D(-180, -90, 180, 90),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Show the plot
alt.Chart.from_dict(plot_aeshna_affinis_max.spec)
As you can see, VAT offers a lot of functionality, which will be deepened and extended in the following examples.
In this chapter, some other useful ways of combining Geo Engine and Python are shown.
# Overlay plot with context
import geopandas as gpd
import matplotlib.pyplot as plt

# Request the data from Geo Engine into an xarray DataArray
data_min = workflow_t_min.get_xarray(
    ge.QueryRectangle(
        ge.BoundingBox2D(-15.1189, 29.6655, 92.9116, 65.3164),
        ge.TimeInterval(start_time, start_time),
        resolution=ge.SpatialResolution(1.0, 1.0),
        srs="EPSG:4326"
    )
)

# Request the data from Geo Engine into an xarray DataArray
data_max = workflow_t_max.get_xarray(
    ge.QueryRectangle(
        ge.BoundingBox2D(-15.1189, 29.6655, 92.9116, 65.3164),
        ge.TimeInterval(start_time, start_time),
        resolution=ge.SpatialResolution(1.0, 1.0),
        srs="EPSG:4326"
    )
)

# Plot the data
fig, ax = plt.subplots(1, 2, figsize=(20, 10))
data_min.plot(ax=ax[0], vmin=-30, vmax=20)
data_aeshna_affinis.plot(ax=ax[0], color='red', markersize=3)
data_max.plot(ax=ax[1], vmin=-30, vmax=20)
data_aeshna_affinis.plot(ax=ax[1], color='red', markersize=3)
plt.show()
This workflow uses the VAT to compare the occurrences of Canis lupus and Felis silvestris as a function of the land use classification from the Ökosystematlas.
The purpose of this notebook is also to demonstrate the capabilities of Geo Engine. Therefore some useful techniques will be shown:
When building your own nested workflow, it is recommended to build it in several steps as shown in this notebook.
# Import packages
import geoengine as ge
import geoengine_openapi_client
from datetime import datetime
from geoengine.types import RasterBandDescriptor
import altair as alt

alt.renderers.enable('default')
RendererRegistry.enable('default')
# Initialize Geo Engine in VAT
ge.initialize("https://vat.gfbio.org/api")
# Get the GBIF DataProvider id
# (useful for translating the DataProvider name to its id)
root_collection = ge.layer_collection()
gbif_prov_id = ''
for elem in root_collection.items:
    if elem.name == 'GBIF':
        gbif_prov_id = str(elem.provider_id)
gbif_prov_id
This chapter is not required and only shows that country borders are available.
# Create workflow to request the German border
workflow_germany = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "OgrSource",
        "params": {
            "data": "germany",
        }
    }
})
workflow_germany
2429a993-385f-546f-b4f7-97b3ba4a5adb
# Set time
start_time = datetime.strptime(
    '2000-04-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z")
end_time = datetime.strptime(
    '2030-04-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z")

# Request the data from Geo Engine into a geopandas dataframe
data = workflow_germany.get_dataframe(
    ge.QueryRectangle(
        ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Plot the data
data.plot()
This chapter is not needed and only shows that raster data is also available.
# Create a workflow to request the Oekosystematlas raster data
workflow_oekosystematlas = ge.register_workflow({
    "type": "Raster",
    "operator": {
        "type": "GdalSource",
        "params": {
            "data": "oekosystematlas"
        }
    }
})
workflow_oekosystematlas
8a859eeb-0778-5190-a9d1-b1f787e4176d
# Request the data from Geo Engine into an xarray DataArray
data = workflow_oekosystematlas.get_xarray(
    ge.QueryRectangle(
        ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Plot the data
data.plot(vmax=75)
<matplotlib.collections.QuadMesh at 0x7f67d1c4ada0>
None of the following steps are strictly necessary, as the entire workflow is contained in the final nested request. However, the steps are intended to show the capabilities of Geo Engine.
# Create workflow to request Canis lupus occurrences
workflow_canis_lupus = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "OgrSource",
        "params": {
            "data": f"_:{gbif_prov_id}:`species/Canis lupus`",
        }
    }
})
workflow_canis_lupus.get_result_descriptor()
Data type:         MultiPoint
Spatial Reference: EPSG:4326
Columns:
  gbifid:
    Column Type: int
    Measurement: unitless
  scientificname:
    Column Type: text
    Measurement: unitless
  basisofrecord:
    Column Type: text
    Measurement: unitless
# Request the data from Geo Engine into a geopandas dataframe
data = workflow_canis_lupus.get_dataframe(
    ge.QueryRectangle(
        ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Plot the data
data.plot()
# Create workflow to request Canis lupus occurrences filtered by the German border
workflow_canis_lupus_cut = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "PointInPolygonFilter",
        "params": {},
        "sources": {
            # Canis lupus
            "points": {
                "type": "OgrSource",
                "params": {
                    "data": f"_:{gbif_prov_id}:`species/Canis lupus`",
                    "attributeProjection": []
                }
            },
            # Germany
            "polygons": {
                "type": "OgrSource",
                "params": {
                    "data": "germany"
                }
            }
        }
    }
})
workflow_canis_lupus_cut
f30ac841-81b0-5301-bac6-840dd914c1ba
# Request the data from Geo Engine into a geopandas dataframe
data_canis_lupus = workflow_canis_lupus_cut.get_dataframe(
    ge.QueryRectangle(
        ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334),
        ge.TimeInterval(start_time, end_time),
        resolution=ge.SpatialResolution(0.1, 0.1),
        srs="EPSG:4326"
    )
)

# Plot the data
data_canis_lupus.plot()
#Create a workflow to request Canis lupus occurrences filtered by the German border and linked to the Ökosystematlas data. workflow_canis_lupus_cut_join = ge.register_workflow({ "type": "Vector", "operator": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { #Canis lupus cut ###################################### "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Canis lupus`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, ############################################################## "rasters": [{ #Ökosystematlas ################################### "type": "GdalSource", "params": { "data": "oekosystematlas" } }] ############################################################## }, } }) workflow_canis_lupus_cut_join
2c8ebbbc-b848-58e6-8f5c-f51976db3c8f
#Request the data from Geo Engine into a geopandas dataframe data = workflow_canis_lupus_cut_join.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ), resolve_classifications=True ) #Show the geopandas dataframe data
1341 rows × 7 columns
Note that the Ökosystematlas attribute is numerical; the human-readable class names are encoded in the metadata of the files. A class histogram resolves these codes into readable class names.
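If the raw class codes are needed outside of Geo Engine, they can also be translated manually with a pandas mapping. The following is a sketch with made-up codes; the actual code-to-name table has to be taken from the dataset metadata:

#Hypothetical excerpt of the Ökosystematlas legend (codes are illustrative)
class_names = {
    21: "Deciduous forest",
    43: "Grassland",
    75: "Urban area",
}

#Map the numerical codes to readable labels in the dataframe
data["Ökosystematlas_label"] = data["Ökosystematlas"].map(class_names)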
#Create a workflow to plot Canis lupus occurrences filtered by the German border and merged with Ökosystematlas data as a class histogram. workflow_canis_lupus_full = ge.register_workflow({ "type": "Plot", "operator": { "type": "ClassHistogram", "params": { "columnName": "Ökosystematlas" }, "sources": { "source": { #Canis lupus cut join ##################################### "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Canis lupus`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, "rasters": [{ "type": "GdalSource", "params": { "data": "oekosystematlas" } }] } } ###################################################################### } } }) workflow_canis_lupus_full
b182c10b-59ce-5d5b-946f-fccc3ae04c88
#Request the plot from Geo Engine plot_canis_lupus = workflow_canis_lupus_full.plot_chart( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Show the plot alt.Chart.from_dict(plot_canis_lupus.spec)
#Create workflow to request Felis silvestris occurrences workflow_felis_silvestris = ge.register_workflow({ "type": "Vector", "operator": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Felis silvestris`", } } }) workflow_felis_silvestris
f8d5abd5-7d5f-567e-97a2-7830052d6cbf
#Request the data from Geo Engine into a geopandas dataframe data = workflow_felis_silvestris.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data.plot()
#Create workflow to request Felis silvestris occurrences filtered by German border workflow_felis_silvestris_cut = ge.register_workflow({ "type": "Vector", "operator": { "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { #Felis silvestris ################################ "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Felis silvestris`", "attributeProjection": [] } }, ########################################################### "polygons": { #Germany ####################################### "type": "OgrSource", "params": { "data": "germany" } } ############################################################ } } }) workflow_felis_silvestris_cut
518c27b3-0ce7-56ac-b826-5a72be463a73
#Request the data from Geo Engine into a geopandas dataframe data_felis_silvestris = workflow_felis_silvestris_cut.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data_felis_silvestris.plot()
#Create a workflow to request Felis silvestris occurrences filtered by the German border and linked to the Ökosystematlas data. workflow_felis_silvestris_cut_join = ge.register_workflow({ "type": "Vector", "operator": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { #Felis silvestris cut ##################################### "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Felis silvestris`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, ################################################################### "rasters": [{ #Ökosystematlas ######################################## "type": "GdalSource", "params": { "data": "oekosystematlas" } }] ################################################################### }, } }) workflow_felis_silvestris_cut_join
355b4e59-65cc-5cfe-a0b4-636f4d41beab
#Request the data from Geo Engine into a geopandas dataframe data = workflow_felis_silvestris_cut_join.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ), resolve_classifications=True ) #Show the geopandas dataframe data
1121 rows × 7 columns
#Create a workflow to plot Felis silvestris occurrences filtered by the German border and merged with the Ökosystematlas data as a class histogram. workflow_felis_silvestris_full = ge.register_workflow({ "type": "Plot", "operator": { "type": "ClassHistogram", "params": { "columnName": "Ökosystematlas" }, "sources": { "source": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Felis silvestris`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, "rasters": [{ "type": "GdalSource", "params": { "data": "oekosystematlas" } }] } } } } }) workflow_felis_silvestris_full
db03640c-cf0e-5fe0-978c-f45a55eb5da3
#Request the plot from Geo Engine plot_felis_silvestris = workflow_felis_silvestris_full.plot_chart( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Show the plot alt.Chart.from_dict(plot_felis_silvestris.spec)
#Show the plot from Canis lupus alt.Chart.from_dict(plot_canis_lupus.spec)
#Show the plot from Felis silvestris alt.Chart.from_dict(plot_felis_silvestris.spec)
#Comparison plots import pandas as pd # Convert the JSON data to pandas DataFrames df1 = pd.DataFrame(plot_canis_lupus.spec['data']['values']) df2 = pd.DataFrame(plot_felis_silvestris.spec['data']['values']) df1['dataset'] = 'Canis lupus' df2['dataset'] = 'Felis silvestris' combined_df = pd.concat([df1, df2]) chart = alt.Chart(combined_df).mark_bar().encode( x=alt.X('Land Cover:N', title='Land Cover'), y=alt.Y('Frequency:Q', title='Frequency'), color=alt.Color('dataset:N', title='Dataset'), xOffset='dataset:N' ).properties(width=600) # Display the grouped barplot chart
#Plotting of multiple species import geopandas as gpd gdf1 = data_canis_lupus gdf2 = data_felis_silvestris gdf1['dataset'] = 'Canis lupus' gdf2['dataset'] = 'Felis silvestris' combined_gdf = pd.concat([gdf1, gdf2]) combined_gdf.plot(column='dataset', cmap='rainbow', markersize=5, legend=True)
++ Note: The examples below are currently being reworked after the latest update, because the GBIF data provider now behaves differently. ++
This workflow uses the VAT to evaluate the distribution of Calopteryx splendens as a function of the land use classification from the Ökosystematlas and a temporal aggregation of the average air temperature.
#Import packages import geoengine as ge import geoengine_openapi_client from datetime import datetime from geoengine.types import RasterBandDescriptor import altair as alt import asyncio import nest_asyncio alt.renderers.enable('default')
This chapter is optional; it only demonstrates that country boundaries are available.
#Create workflow to request germany boundary workflow_germany = ge.register_workflow({ "type": "Vector", "operator": { "type": "OgrSource", "params": { "data": "germany", } } }) workflow_germany
#Set time start_time = datetime.strptime( '2010-01-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z") end_time = datetime.strptime( '2011-01-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z") #Request the data from Geo Engine into a geopandas dataframe data = workflow_germany.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data.plot()
#Create workflow to request the oekosystematlas raster data workflow_oekosystematlas = ge.register_workflow({ "type": "Raster", "operator": { "type": "GdalSource", "params": { "data": "oekosystematlas_detail" } } }) workflow_oekosystematlas
f447601c-0ba1-57c3-9127-b0622f982231
#Create workflow to request the average temperature raster data workflow_t_avg = ge.register_workflow({ "type": "Raster", "operator": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } } }) workflow_t_avg
6393648d-6545-5435-a49e-015ba9dfa92e
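With scalingMode "mulSlopeAddOffset", the RasterScaling operator computes value * slope + offset for every pixel. As a quick sanity check with a hypothetical raw pixel value:

#Raw values are stored in tenths of Kelvin; scaling converts them to °C
raw_value = 2831                       #hypothetical stored pixel value
slope, offset = 0.1, -273.15
celsius = raw_value * slope + offset   #283.1 - 273.15 = ~9.95 °C
print(celsius)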
#Prepare the query rectangle for the workflow raster stream bbox = ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" )
#Request the data from Geo Engine into a xarray dataarray data = workflow_t_avg.get_xarray( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, start_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data.plot(vmin=-3, vmax=3)
In theory, none of the following steps is necessary, since the entire workflow is ultimately expressed as a single nested request. However, they illustrate the capabilities of Geo Engine and how nested workflows are built up step by step.
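Because workflow definitions are plain Python dictionaries, the nested request can be assembled from reusable building blocks, which keeps the large operator trees readable. A sketch with illustrative variable names:

#Reusable source definitions as plain dicts
calopteryx_source = {
    "type": "OgrSource",
    "params": {"data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", "attributeProjection": []},
}
germany_source = {"type": "OgrSource", "params": {"data": "germany"}}

#Sub-trees compose into larger operators
calopteryx_in_germany = {
    "type": "PointInPolygonFilter",
    "params": {},
    "sources": {"points": calopteryx_source, "polygons": germany_source},
}

#The composed dict can be registered directly or nested further,
#e.g. as the "vector" source of a RasterVectorJoin
workflow_composed = ge.register_workflow({"type": "Vector", "operator": calopteryx_in_germany})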
#Create workflow to request Calopteryx splendens occurrences workflow_calopteryx_splendens = ge.register_workflow({ "type": "Vector", "operator": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", } } }) workflow_calopteryx_splendens.get_result_descriptor()
Data type: MultiPoint
Spatial Reference: EPSG:4326
Columns:
scientificname: Column Type: text Measurement: unitless
basisofrecord: Column Type: text Measurement: unitless
gbifid: Column Type: int Measurement: unitless
#Request the data from Geo Engine into a geopandas dataframe data = workflow_calopteryx_splendens.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data.plot()
#Create workflow to request Calopteryx splendens occurrences filtered by German border workflow_calopteryx_splendens_cut = ge.register_workflow({ "type": "Vector", "operator": { "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { #Calopteryx splendens ############################### "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", "attributeProjection": [] } }, ##################################################### "polygons": { #Germany ################################# "type": "OgrSource", "params": { "data": "germany" } } ###################################################### } } }) workflow_calopteryx_splendens_cut
6cf9ef88-8bd3-5904-bc74-f866165b18c3
#Request the data from Geo Engine into a geopandas dataframe data_calopteryx_splendens = workflow_calopteryx_splendens_cut.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data_calopteryx_splendens.plot()
#Create a workflow to request Calopteryx splendens occurrences filtered by the German border and linked to the Ökosystematlas data. workflow_calopteryx_splendens_cut_join = ge.register_workflow({ "type": "Vector", "operator": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas", "Avg_Temperature"] }, "temporalAggregation": "none", "featureAggregation": "first", }, "sources": { "vector": { #Calopteryx splendens cut ###################################### "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, ############################################################## "rasters": [{ #Ökosystematlas ################################### "type": "GdalSource", "params": { "data": "oekosystematlas" } }, ############################################################## { #Average temperature "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } }] ############################################################## }, } }) workflow_calopteryx_splendens_cut_join
63c46ba9-3efd-5ddd-b446-c36fad6537e8
#Request the data from Geo Engine into a geopandas dataframe data = workflow_calopteryx_splendens_cut_join.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ), resolve_classifications=True ) #Show the geopandas dataframe data
540 rows × 8 columns
The same join is then registered again, this time with featureAggregation set to "mean" instead of "first", so that multiple raster pixels per feature are averaged rather than only the first value being kept.
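The featureAggregation setting only matters when a single feature covers more than one raster pixel, for example a MultiPoint with several coordinates. A toy illustration of the difference in plain Python (not a Geo Engine call), assuming two sampled pixel values for one feature:

#Hypothetical pixel values sampled for one MultiPoint feature
pixel_values = [3.0, 5.0]

first = pixel_values[0]                       #featureAggregation = "first" -> 3.0
mean = sum(pixel_values) / len(pixel_values)  #featureAggregation = "mean"  -> 4.0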
#Create a workflow to request Calopteryx splendens occurrences filtered by the German border and linked to the Ökosystematlas data. workflow_calopteryx_splendens_cut_join = ge.register_workflow({ "type": "Vector", "operator": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas", "Avg_Temperature"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { #Calopteryx splendens cut ###################################### "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, ############################################################## "rasters": [{ #Ökosystematlas ################################### "type": "GdalSource", "params": { "data": "oekosystematlas" } }, ############################################################## { #Average temperature "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } }] ############################################################## }, } }) workflow_calopteryx_splendens_cut_join
4f2e830a-9570-5c8f-b2e1-bc433814df82
#Create a workflow to plot Calopteryx splendens occurrences filtered by the German border and merged with the ecosystematlas data as a class histogram. workflow_calopteryx_splendens_full_öko = ge.register_workflow({ "type": "Plot", "operator": { "type": "ClassHistogram", "params": { "columnName": "Ökosystematlas" }, "sources": { "source": { #Calopteryx splendens cut join ##################################### "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas", "Avg_Temperature"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, "rasters": [{ "type": "GdalSource", "params": { "data": "oekosystematlas" } }, { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } }] } } ###################################################################### } } }) workflow_calopteryx_splendens_full_öko
befec7cb-1b9a-5464-88b0-aa14b6be3077
#Request the plot from Geo Engine plot_calopteryx_splendens = workflow_calopteryx_splendens_full_öko.plot_chart( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Show the plot alt.Chart.from_dict(plot_calopteryx_splendens.spec)
#Create a workflow to request Calopteryx splendens occurrences filtered by the German border and linked to the Ökosystematlas data. workflow_calopteryx_splendens_full_avg_temp = ge.register_workflow({ "type": "Vector", "operator": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas", "Avg_Temperature"] }, "temporalAggregation": "none", "featureAggregation": "mean", }, "sources": { "vector": { "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`species/Calopteryx splendens`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, "rasters": [{ "type": "GdalSource", "params": { "data": "oekosystematlas" } }, { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } }] }, } }) workflow_calopteryx_splendens_full_avg_temp
#Request the data from Geo Engine into a geopandas dataframe data = workflow_calopteryx_splendens_full_avg_temp.get_dataframe( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, end_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Show the geopandas dataframe data.plot(column='Avg_Temperature', legend=True, legend_kwds={'label': 'Average Temperature'})
#Overlay plot with context import geopandas as gpd import matplotlib.pyplot as plt #Request the data from Geo Engine into a xarray dataarray data = workflow_t_avg.get_xarray( ge.QueryRectangle( ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334), ge.TimeInterval(start_time, start_time), resolution=ge.SpatialResolution(0.1, 0.1), srs="EPSG:4326" ) ) #Plot the data data.plot(vmin=-3, vmax=3) data_calopteryx_splendens.plot(ax=plt.gca(), color='red', markersize=3) plt.show()
This workflow is a contribution to the NFDI4Earth conference. It uses the frequency of Arnica montana occurrences from GBIF as the target variable, together with weather data from CHELSA, the land use classification from the Ökosystematlas, and topographic information as predictor variables, to build a species distribution model for Arnica montana across Germany.
#Import Packages import geoengine as ge from datetime import datetime from sklearn.model_selection import train_test_split from sklearn.metrics import r2_score from sklearn.ensemble import RandomForestRegressor from sklearn.model_selection import GridSearchCV import matplotlib.pyplot as plt import xarray as xr import numpy as np import asyncio import nest_asyncio
#Get the GBIF DataProvider id (useful for translating the DataProvider name to its id) root_collection = ge.layer_collection() gbif_prov_id = '' for elem in root_collection.items: if elem.name == 'GBIF': gbif_prov_id = str(elem.provider_id) gbif_prov_id
This chapter shows how to register the workflow that retrieves the occurrence and predictor data, and how this raw data is then aggregated into training data.
#Query parameters for the workflow requests start_time = datetime.strptime('2001-01-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z") end_time = datetime.strptime('2011-01-01T12:00:00.000Z', "%Y-%m-%dT%H:%M:%S.%f%z") resolution = ge.SpatialResolution(0.01, 0.01) extent = ge.BoundingBox2D(5.852490, 47.271121, 15.022059, 55.065334) #Species selection species = "species/Arnica montana" #Arnica
#Create a workflow to retrieve Arnica montana occurrences filtered by the German border and linked to weather, land use and topographic data. workflow = ge.register_workflow({ "type": "Vector", "operator": { "type": "RasterVectorJoin", "params": { "names": { "type": "names", "values": ["Ökosystematlas", "SRTM", "Mean Air Temperature", "Mean Climate Moisture Index", "Precipitation"] }, "temporalAggregation": "none", "featureAggregation": "first", }, "sources": { "vector": { #Arnica montana ######################################### "type": "PointInPolygonFilter", "params": {}, "sources": { "points": { "type": "OgrSource", "params": { "data": f"_:{gbif_prov_id}:`{species}`", "attributeProjection": [] } }, "polygons": { "type": "OgrSource", "params": { "data": "germany" } } } }, "rasters": [{ #Ökosystematlas ######################################## "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "oekosystematlas" }, } } }, { #SRTM ######################################################### "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "srtm" }, } } }, { #Mean Annual Air Temperature ################################## "type": "TemporalRasterAggregation", "params": { "aggregation": { "type": "mean", "ignoreNoData": False }, "window": { "granularity": "years", "step": 1 }, "windowReference": None, "outputType": None, }, "sources": { "raster": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } } } }, { #Mean Annual Climate moisture indices ######################### "type": "TemporalRasterAggregation", "params": { "aggregation": { "type": "mean", "ignoreNoData": False }, "window": { "granularity": "years", "step": 1 }, "windowReference": None, "outputType": None, }, "sources": { "raster": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": 0 }, "outputMeasurement": { "type": "continuous", "measurement": "climate moisture", "unit": "kg m^-2 month^-1" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "monthly_climate_moisture_indicies" } } } } } } } }, { #Sum Annual Precipitation #################################### "type": "TemporalRasterAggregation", "params": { "aggregation": { "type": "sum", "ignoreNoData": False }, "window": { "granularity": "years", "step": 1 }, "windowReference": None, "outputType": None, }, "sources": { "raster": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": 0 }, "outputMeasurement": { "type": "continuous", "measurement": "precipitation", "unit": "kg m-2 month^-1" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "monthly_precipitation_amount" } } } } } } } }] }, } }) workflow
7582cfcb-3d36-5b86-bb72-e81cef584fae
#Request the data from Geo Engine into a geopandas dataframe data = workflow.get_dataframe( ge.QueryRectangle( extent, ge.TimeInterval(start_time, end_time), resolution=resolution, srs="EPSG:4326" ) ) #Plot the data data.plot()
data
1556 rows × 11 columns
#Rounding and grouping of occurrences to create frequency along with predictor variable combination training_data = data.round(3) training_data = training_data.groupby(['Mean Air Temperature', 'Mean Climate Moisture Index', 'Precipitation', 'SRTM', 'Ökosystematlas']).size().reset_index(name='counts') training_data
352 rows × 6 columns
training_data.sort_values('counts', ascending=False)
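As an aside, the rounding-and-grouping step above can be reproduced on a toy dataframe, independent of Geo Engine:

import pandas as pd

#Toy occurrence table with two predictor columns
df = pd.DataFrame({
    "Mean Air Temperature": [9.1234, 9.1231, 12.5, 9.1234],
    "SRTM": [300.0, 300.0, 120.0, 300.0],
})

#Rounding merges near-identical predictor combinations; the group size
#becomes the occurrence frequency used as the target variable
counts = df.round(3).groupby(["Mean Air Temperature", "SRTM"]).size().reset_index(name="counts")
print(counts)  #(9.123, 300.0) occurs 3 times, (12.5, 120.0) once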
This chapter shows how to register the workflow to create prediction data.
#Create a workflow to request weather, land use and topographic data as a raster stack. prediction_workflow = ge.register_workflow({ "type": "Raster", "operator": { "type": "RasterStacker", "params": { "renameBands": { "type": "rename", "values": ["Ökosystematlas", "SRTM", "Mean Air Temperature", "Mean Climate Moisture Index", "Precipitation"] } }, "sources": { "rasters": [{ #Ökosystematlas ######################################## "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "oekosystematlas" }, } } }, { #SRTM ######################################################### "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "srtm" }, } } }, { #Mean Annual Air Temperature ################################## "type": "TemporalRasterAggregation", "params": { "aggregation": { "type": "mean", "ignoreNoData": False }, "window": { "granularity": "years", "step": 1 }, "windowReference": None, "outputType": None, }, "sources": { "raster": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": -273.15 }, "outputMeasurement": { "type": "continuous", "measurement": "temperature", "unit": "K/10" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "mean_daily_air_temperature" } } } } } } } }, { #Mean Annual Climate moisture indices ######################### "type": "TemporalRasterAggregation", "params": { "aggregation": { "type": "mean", "ignoreNoData": False }, "window": { "granularity": "years", "step": 1 }, "windowReference": None, "outputType": None, }, "sources": { "raster": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": 0 }, "outputMeasurement": { "type": "continuous", "measurement": "climate moisture", "unit": "kg m^-2 month^-1" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "monthly_climate_moisture_indicies" } } } } } } } }, { #Sum Annual Precipitation #################################### "type": "TemporalRasterAggregation", "params": { "aggregation": { "type": "sum", "ignoreNoData": False }, "window": { "granularity": "years", "step": 1 }, "windowReference": None, "outputType": None, }, "sources": { "raster": { "type": "RasterScaling", "params": { "slope": { "type": "constant", "value": 0.1 }, "offset": { "type": "constant", "value": 0 }, "outputMeasurement": { "type": "continuous", "measurement": "precipitation", "unit": "kg m-2 month^-1" }, "scalingMode": "mulSlopeAddOffset" }, "sources": { "raster": { "type": "RasterTypeConversion", "params": { "outputDataType": "F32" }, "sources": { "raster": { "type": "GdalSource", "params": { "data": "monthly_precipitation_amount" } } } } } } } }] } } }) prediction_workflow
370296a3-db66-599b-8e55-2a4bf362a09a
#Preparing of the boundaries for the workflow raster stream bbox = ge.QueryRectangle( extent, ge.TimeInterval(start_time, start_time), resolution=resolution, srs="EPSG:4326" )
nest_asyncio.apply() async def get_prediction_data(workflow, bbox, bands=[0, 1, 2, 3, 4], clip=True): return await workflow.raster_stream_into_xarray(bbox, bands=bands, clip_to_query_rectangle=clip) async def main(extent, time, resolution, workflow): bbox = ge.QueryRectangle(extent, ge.TimeInterval(time, time), resolution=resolution, srs="EPSG:4326") return await get_prediction_data(workflow, bbox) try: loop = asyncio.get_event_loop() except RuntimeError: loop = asyncio.new_event_loop() asyncio.set_event_loop(loop) prediction_data = loop.run_until_complete(main(extent, start_time, resolution, prediction_workflow)) prediction_data.to_dataset(name="prediction")
<xarray.Dataset> Size: 14MB Dimensions: (x: 918, y: 780, time: 1, band: 5) Coordinates: x, y, time, band, spatial_ref Data variables: prediction (time, band, y, x) float32
#Plotting the layers of the returned xarray dataarray fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(16, 8)) axes[0, 0].set_ylim(-0.01, 1.01) axes[0, 1].set_ylim(-0.01, 1.01) axes[0, 2].set_ylim(-0.01, 1.01) axes[1, 0].set_ylim(-0.01, 1.01) axes[1, 1].set_ylim(-0.01, 1.01) # Add your plot data and other customizations to each subplot prediction_data.isel(band=0).plot(ax=axes[0, 0], vmin=0, vmax=74) prediction_data.isel(band=1).plot(ax=axes[0, 1], vmin=0, vmax=3000) prediction_data.isel(band=2).plot(ax=axes[0, 2]) prediction_data.isel(band=3).plot(ax=axes[1, 0], vmin=-100, vmax=300) prediction_data.isel(band=4).plot(ax=axes[1, 1]) axes[0, 0].set_title("Ökosystematlas") axes[0, 1].set_title("SRTM") axes[0, 2].set_title("Mean Annual Air Temperature") axes[1, 0].set_title("Mean Annual Climate moisture indices") axes[1, 1].set_title("Sum Annual Precipitation") plt.subplots_adjust(wspace=0.2, hspace=0.4) plt.show()
Machine Learning
In this chapter, the training data is used to train a simple RandomForestRegressor model, whose hyperparameters are tuned via GridSearchCV; the best model is then selected for prediction.
#Create training and test data X = training_data[['Mean Air Temperature', 'Mean Climate Moisture Index', 'Precipitation', 'SRTM', 'Ökosystematlas']] y = training_data['counts'] X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) #Define the hyperparameter grid param_grid = { 'n_estimators': [200, 400, 600, 800, 1000], 'max_depth': [5, 10, 15, 20], 'min_samples_split': [2, 5, 10], 'min_samples_leaf': [1, 2, 4] } #Create the random forest regressor model rf = RandomForestRegressor() #Perform grid search cross-validation grid_search = GridSearchCV(rf, param_grid, cv=5, scoring='neg_mean_squared_error', n_jobs=4, verbose=2) grid_search.fit(X_train, y_train) #Get the best hyperparameters and model best_params = grid_search.best_params_ best_model = grid_search.best_estimator_
Fitting 5 folds for each of 180 candidates, totalling 900 fits
n_estimators=800; total time= 0.6s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, 
n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=10, n_estimators=800; total time= 0.5s [CV] END max_depth=5, min_samples_leaf=4, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=1, 
min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END 
max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, 
n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, 
min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 
0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, 
n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, 
min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total 
time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=2, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=200; total time= 0.1s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=600; total time= 0.4s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=10, min_samples_leaf=4, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=800; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, 
min_samples_split=2, n_estimators=800; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=1, min_samples_split=10, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=200; total time= 0.2s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=2, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=5, n_estimators=1000; total time= 0.8s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=400; total time= 0.3s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=600; total time= 0.5s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=800; total time= 0.6s [CV] END max_depth=15, min_samples_leaf=2, min_samples_split=10, n_estimators=1000; total time= 0.7s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=200; total time= 0.1s [CV] END max_depth=15, min_samples_leaf=4, min_samples_split=2, n_estimators=400; total time= 0.3s [CV] END 
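Besides the best parameters shown below, the fitted GridSearchCV object exposes the complete table of candidate results via its cv_results_ attribute. The following lines are a small optional sketch (not part of the original notebook) for ranking all candidates by their cross-validated score:

import pandas as pd  #if not already imported above

#Optional: inspect the full grid-search results, best candidates first
results = pd.DataFrame(grid_search.cv_results_)
print(results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
      .sort_values('rank_test_score')
      .head())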
best_params
{'max_depth': 5, 'min_samples_leaf': 4, 'min_samples_split': 10, 'n_estimators': 200}

#Simple prediction on the held-out test data
y_pred = best_model.predict(X_test)

#Model performance using R2
r2 = r2_score(y_test, y_pred)
print(f"R2 score: {r2:.2f}")

R2 score: 0.06

Prediction

In this chapter, the best model from the grid search is used to predict on the prediction data covering the whole of Germany. Note that an R2 score of 0.06 means the model explains only a small fraction of the variance in the test counts, so the resulting nationwide map is best read as a coarse trend rather than a precise estimate.

# Flatten the xarray dataset to a 2D array
prediction_df = prediction_data.to_dataset(dim="band").to_dataframe().reset_index()
X_pred = prediction_df.loc[:, [0, 1, 2, 3, 4]]
X_pred.columns = ["Ökosystematlas", "SRTM", "Mean Air Temperature", "Mean Climate Moisture Index", "Precipitation"]
# Reorder the columns to match the feature order used during training
X_pred = X_pred[['Mean Air Temperature', 'Mean Climate Moisture Index', 'Precipitation', 'SRTM', 'Ökosystematlas']]

# Use the trained model to make predictions
y_pred = best_model.predict(X_pred)
y_pred_log = np.log(y_pred)
y_pred

array([9.35699139, 9.35699139, 9.35699139, ..., 6.37438362, 6.25392884,
       6.70247926])

#Extract coordinates for spatial alignment
prediction_ds = prediction_data.to_dataset(name='prediction_data')
x_coords = prediction_ds.coords['x'].values
y_coords = prediction_ds.coords['y'].values

#Reshape the model prediction back onto the raster grid for plotting
y_pred_reshaped = y_pred.reshape(prediction_data.time.size, 1, prediction_data.y.size, prediction_data.x.size)
da = xr.DataArray(y_pred_reshaped, dims=['time', 'band', 'y', 'x'])
da = da.assign_coords(x=x_coords, y=y_coords)
da.rio.write_crs('EPSG:4326', inplace=True)

#Reshape the log-transformed prediction for plotting
y_pred_reshaped_log = y_pred_log.reshape(prediction_data.time.size, 1, prediction_data.y.size, prediction_data.x.size)
da_log = xr.DataArray(y_pred_reshaped_log, dims=['time', 'band', 'y', 'x'])
da_log = da_log.assign_coords(x=x_coords, y=y_coords)
da_log.rio.write_crs('EPSG:4326', inplace=True)
<xarray.DataArray (time: 1, band: 1, y: 780, x: 918)>
(repr of the log-scale prediction truncated: values roughly 0.75-2.24; coordinates x: 5.855 ... 15.025, y: 55.065 ... 47.275, CRS EPSG:4326 / WGS 84)
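If the predicted raster should be kept for later use, rioxarray (already loaded above for the .rio accessor) can write the georeferenced DataArray to a GeoTIFF. A minimal sketch; the output file name is an assumption:

#Optional: persist the prediction as a GeoTIFF (file name is an example)
da.isel(time=0).rio.to_raster("rf_prediction_germany.tif")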
workflow_germany = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "OgrSource",
        "params": {
            "data": "germany",
        }
    }
})
workflow_germany

2429a993-385f-546f-b4f7-97b3ba4a5adb

#Request the data from Geo Engine into a geopandas dataframe
germany = workflow_germany.get_dataframe(
    ge.QueryRectangle(
        extent,
        ge.TimeInterval(start_time, start_time),
        resolution=resolution,
        srs="EPSG:4326"
    )
)

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 8))

# Raster plot of the prediction
da.plot(ax=axes[0], cmap='viridis')
germany.boundary.plot(ax=axes[0], color='orange', linewidth=1)
axes[0].set_title("Prediction")
axes[0].set_xlabel('')
axes[0].set_ylabel('')

# Raster plot of the log-scale prediction
da_log.plot(ax=axes[1], cmap='viridis')
germany.boundary.plot(ax=axes[1], color='orange', linewidth=1)
axes[1].set_title("Prediction (log)")
axes[1].set_xlabel('')
axes[1].set_ylabel('')

# Vector plot of the GBIF occurrence points
data.plot(ax=axes[2], markersize=10, color='teal')
germany.boundary.plot(ax=axes[2], color='orange', linewidth=1)
axes[2].set_title("GBIF")
axes[2].set_xlabel('')
axes[2].set_ylabel('')

plt.show()

Updates & Changes

++ 25.10.2024 ++
(HTML repr of the prediction_data array truncated: dtype float32, dims x: 918, y: 780, band: 0-4, time: 2001-01-01, CRS EPSG:4326 / WGS 84; the five bands are the predictor layers plotted below)
# Plot the layers (bands) of the returned xarray DataArray
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(16, 8))

# Plot each band on its own panel; vmin/vmax set a sensible value range per dataset
prediction_data.isel(band=0).plot(ax=axes[0, 0], vmin=0, vmax=74)
prediction_data.isel(band=1).plot(ax=axes[0, 1], vmin=0, vmax=3000)
prediction_data.isel(band=2).plot(ax=axes[0, 2])
prediction_data.isel(band=3).plot(ax=axes[1, 0], vmin=-100, vmax=300)
prediction_data.isel(band=4).plot(ax=axes[1, 1])

axes[0, 0].set_title("Ökosystematlas")
axes[0, 1].set_title("SRTM")
axes[0, 2].set_title("Mean Annual Air Temperature")
axes[1, 0].set_title("Mean Annual Climate Moisture Index")
axes[1, 1].set_title("Sum Annual Precipitation")

# Hide the unused sixth panel
axes[1, 2].axis("off")

plt.subplots_adjust(wspace=0.2, hspace=0.4)
plt.show()
In this chapter, the training data is used to build a simple RandomForestRegressor model. Its hyperparameters are tuned via grid search with cross-validation (GridSearchCV), and the best-performing model is selected for the prediction step later on.
# Create training and test data
X = training_data[['Mean Air Temperature', 'Mean Climate Moisture Index',
                   'Precipitation', 'SRTM', 'Ökosystematlas']]
y = training_data['counts']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define the hyperparameter grid
param_grid = {
    'n_estimators': [200, 400, 600, 800, 1000],
    'max_depth': [5, 10, 15, 20],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Create the random forest regressor model
rf = RandomForestRegressor()

# Perform grid search cross-validation
grid_search = GridSearchCV(rf, param_grid, cv=5,
                           scoring='neg_mean_squared_error', n_jobs=4, verbose=2)
grid_search.fit(X_train, y_train)

# Get the best hyperparameters and model
best_params = grid_search.best_params_
best_model = grid_search.best_estimator_
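Note that the scoring is set to 'neg_mean_squared_error' because GridSearchCV always maximizes its scoring function, so the mean squared error has to be negated in order to be minimized. n_jobs=4 runs four fits in parallel, and verbose=2 makes scikit-learn print one [CV] line per finished fit, which produces the log excerpt below.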
Fitting 5 folds for each of 180 candidates, totalling 900 fits
[CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time= 0.1s
[CV] END max_depth=5, min_samples_leaf=1, min_samples_split=2, n_estimators=400; total time= 0.3s
... (the remaining per-fit [CV] log lines are omitted here; each of the listed fits finished in well under a second) ...
min_samples_leaf=4, min_samples_split=5, n_estimators=1000; total time= 0.7s
best_params
{'max_depth': 5, 'min_samples_leaf': 4, 'min_samples_split': 10, 'n_estimators': 200}
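For context, the following is a minimal sketch of a grid search consistent with the log and the best parameters above. The estimator (RandomForestRegressor), the fold count, the scoring metric, and the X_train/y_train names are assumptions for illustration; the actual setup cell precedes this excerpt.

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Parameter grid reconstructed from the log and best_params above;
# the regressor, cv=3, and scoring="r2" are assumptions.
param_grid = {
    "max_depth": [5, 10, 15],
    "min_samples_leaf": [1, 2, 4],
    "min_samples_split": [2, 5, 10],
    "n_estimators": [200, 400, 600, 800, 1000],
}
grid_search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="r2",
    verbose=2,
)
grid_search.fit(X_train, y_train)
best_model = grid_search.best_estimator_
best_params = grid_search.best_params_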
# Prediction on the held-out test data
y_pred = best_model.predict(X_test)

# Model performance using the coefficient of determination (R²)
r2 = r2_score(y_test, y_pred)
print(f"R2 score: {r2:.2f}")
R2 score: 0.06
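An R² of 0.06 means the model explains very little of the variance in the test data. To express the error in absolute terms, MAE and RMSE could be reported alongside R²; a minimal sketch (the metric imports are additions for illustration, everything else reuses the variables above):

from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

# Absolute error metrics complement R², especially for skewed targets
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")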
In this chapter, the best model from the grid search is applied to the prediction data, which covers the whole of Germany.
# Flatten the xarray dataset to a 2D array
prediction_df = prediction_data.to_dataset(dim="band").to_dataframe().reset_index()
X_pred = prediction_df.loc[:, [0, 1, 2, 3, 4]]
X_pred.columns = ["Ökosystematlas", "SRTM", "Mean Air Temperature", "Mean Climate Moisture Index", "Precipitation"]

# Reorder the columns to match the order used during training
X_pred = X_pred[["Mean Air Temperature", "Mean Climate Moisture Index", "Precipitation", "SRTM", "Ökosystematlas"]]

# Use the trained model to make predictions
y_pred = best_model.predict(X_pred)
y_pred_log = np.log(y_pred)
y_pred
array([9.35699139, 9.35699139, 9.35699139, ..., 6.37438362, 6.25392884, 6.70247926])
# Extract coordinates for spatial alignment
prediction_ds = prediction_data.to_dataset(name="prediction_data")
x_coords = prediction_ds.coords["x"].values
y_coords = prediction_ds.coords["y"].values
# Reshape the model prediction for plotting
y_pred_reshaped = y_pred.reshape(prediction_data.time.size, 1, prediction_data.y.size, prediction_data.x.size)
da = xr.DataArray(y_pred_reshaped, dims=["time", "band", "y", "x"])
da = da.assign_coords(x=x_coords, y=y_coords)
da.rio.write_crs("EPSG:4326", inplace=True)

# Reshape the log-transformed prediction for plotting
y_pred_reshaped_log = y_pred_log.reshape(prediction_data.time.size, 1, prediction_data.y.size, prediction_data.x.size)
da_log = xr.DataArray(y_pred_reshaped_log, dims=["time", "band", "y", "x"])
da_log = da_log.assign_coords(x=x_coords, y=y_coords)
da_log.rio.write_crs("EPSG:4326", inplace=True)
<xarray.DataArray (time: 1, band: 1, y: 780, x: 918)> Size: 6MB
array([[[[2.23612381, 2.23612381, ..., 0.80446736, 0.80446736],
         ...,
         [0.75689373, 0.75583891, ..., 1.83320988, 1.9024775 ]]]])
Coordinates:
  * x        (x) float64 5.855 5.865 5.875 ... 15.00 15.01 15.02
  * y        (y) float64 55.07 55.06 55.05 ... 47.30 47.29 47.28
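The reshape above assumes that the flattened dataframe rows follow the (time, y, x) order of the original dataset, which is what to_dataframe() yields here. A minimal sanity check (the assertion is illustrative and not part of the original notebook):

# Sanity check (sketch): exactly one prediction per (time, y, x) cell
expected = prediction_data.time.size * prediction_data.y.size * prediction_data.x.size
assert y_pred.size == expected, f"got {y_pred.size}, expected {expected}"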
workflow_germany = ge.register_workflow({
    "type": "Vector",
    "operator": {
        "type": "OgrSource",
        "params": {
            "data": "germany",
        },
    },
})
workflow_germany
# Request the data from Geo Engine into a GeoPandas dataframe
germany = workflow_germany.get_dataframe(
    ge.QueryRectangle(
        extent,
        ge.TimeInterval(start_time, start_time),
        resolution=resolution,
        srs="EPSG:4326",
    )
)
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 8))

# Raster plot: model prediction
da.plot(ax=axes[0], cmap="viridis")
germany.boundary.plot(ax=axes[0], color="orange", linewidth=1)
axes[0].set_title("Prediction")
axes[0].set_xlabel("")
axes[0].set_ylabel("")

# Raster plot: log-transformed prediction
da_log.plot(ax=axes[1], cmap="viridis")
germany.boundary.plot(ax=axes[1], color="orange", linewidth=1)
axes[1].set_title("Prediction (log)")
axes[1].set_xlabel("")
axes[1].set_ylabel("")

# Vector plot: GBIF occurrence points
data.plot(ax=axes[2], markersize=10, color="teal")
germany.boundary.plot(ax=axes[2], color="orange", linewidth=1)
axes[2].set_title("GBIF")
axes[2].set_xlabel("")
axes[2].set_ylabel("")

plt.show()
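To reuse the prediction outside this notebook, the raster could also be written to a GeoTIFF via rioxarray; a minimal sketch (the file name is illustrative, not part of the original workflow):

# Drop the singleton time dimension and write the prediction to disk
da.isel(time=0).rio.to_raster("prediction_germany.tif")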