PackageTutorial.Rmd
In this vignette you will learn how to use the VegX R package to map, integrate and harmonize vegetation data using the Veg-X standard (v. 2.0). The examples use the data sets provided with the package. If you are not familiar with the Veg-X standard, please refer to the vignette "The Veg-X exchange standard". This tutorial refers to elements of the standard throughout; the same vignette defines these elements and their logical relationships.
We envisage two different kinds of users of the VegX package:
The VegX package is currently distributed from GitHub. To install it, you need the devtools package installed; then run the following command: devtools::install_github("iavs-org/VegX", build_vignettes=TRUE). Assuming that the package is already installed, you begin by loading it, which also loads the required package XML:
Users of the VegX package are expected to know how to import their source data into R, either using a database connection or by reading files in diverse formats (e.g. txt, csv, xlsx, …). For the examples of this manual, we will use three data sets that were extracted from the New Zealand National Vegetation Survey (NVS) Databank. These are subsets of the original data sets, prepared for demonstration purposes only.
Each of the three data sets contains different tables, corresponding to plot location, site observations, taxon observations, … For simplicity, we reduced the number of plots in each example data set to five, although some data sets contain subplots. As the three data sets are included with the package, we load the data from the three data sets into the R workspace using:
## [1] "moki_dia" "moki_loc" "moki_lookup"
## [4] "moki_site" "moki_str" "moki_tcv"
## [7] "mtfyffe_counts" "mtfyffe_dia" "mtfyffe_disturbance"
## [10] "mtfyffe_groundcover" "mtfyffe_loc" "mtfyffe_lookup"
## [13] "mtfyffe_site" "taki_disturbance" "taki_freq"
## [16] "taki_groundcover" "taki_loc" "taki_lookup"
## [19] "taki_site"
Before mapping any data to the Veg-X standard, we need to create a new (empty) document for each data set, using newVegX():
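The setup steps described so far (loading the package, loading the example data and creating one empty document per data set) can be sketched as follows. This is a minimal sketch that only runs when VegX is installed. The data set names passed to data() are an assumption and should be checked with data(package = "VegX"); newVegX() and the three object names are those used in this tutorial.

```r
# Names of the three Veg-X documents used throughout the tutorial
doc_names <- c("moki_vegx", "mtfyffe_vegx", "taki_vegx")

# Run the setup only when the VegX package is available
if (requireNamespace("VegX", quietly = TRUE)) {
  library(VegX)
  # ASSUMPTION: data set names; verify with data(package = "VegX")
  data(list = c("mokihinui", "mtfyffe", "takitimu"), package = "VegX")
  # One empty Veg-X document per data set
  moki_vegx    <- newVegX()
  mtfyffe_vegx <- newVegX()
  taki_vegx    <- newVegX()
}
```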
The output of the print() command reveals that a Veg-X document is defined in R using an S4 class whose slots (other than the version string) are lists holding the main elements of the Veg-X document:
print(moki_vegx)
## An object of class "VegX"
## Slot "VegXVersion":
## [1] "2.0.0"
##
## Slot "parties":
## list()
##
## Slot "literatureCitations":
## list()
##
## Slot "methods":
## list()
##
## Slot "attributes":
## list()
##
## Slot "strata":
## list()
##
## Slot "surfaceTypes":
## list()
##
## Slot "organismNames":
## list()
##
## Slot "taxonConcepts":
## list()
##
## Slot "organismIdentities":
## list()
##
## Slot "projects":
## list()
##
## Slot "plots":
## list()
##
## Slot "individualOrganisms":
## list()
##
## Slot "plotObservations":
## list()
##
## Slot "individualObservations":
## list()
##
## Slot "aggregateObservations":
## list()
##
## Slot "stratumObservations":
## list()
##
## Slot "communityObservations":
## list()
##
## Slot "siteObservations":
## list()
##
## Slot "surfaceCoverObservations":
## list()
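To make the slot structure printed above more concrete, here is a small self-contained sketch of an S4 class with list-valued slots. The class ToyVegX is hypothetical and much smaller than the real "VegX" class; it only illustrates how slots are declared, how they default to empty lists, and how they can be inspected.

```r
library(methods)

# Hypothetical toy class: a version string plus a few list-valued slots
setClass("ToyVegX",
         representation(VegXVersion = "character",
                        projects = "list",
                        plots = "list",
                        plotObservations = "list"))

# List slots that are not supplied default to list()
toy <- new("ToyVegX", VegXVersion = "2.0.0")

# Number of elements stored in each list slot
list_slots <- setdiff(slotNames(toy), "VegXVersion")
slot_sizes <- sapply(list_slots, function(s) length(slot(toy, s)))
print(slot_sizes)
```

With a real document, the same idea applies to the slots listed in the print() output, for example length(moki_vegx@plots).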
Printing a Veg-X object will normally result in too much data being shown in the console output. More user-friendly information about the Veg-X object can be obtained using the function summary(), which tells us how many instances we have of each of the main elements:
summary(moki_vegx)
## ================================================================
## Veg-X object (ver 2.0.0)
## ----------------------------------------------------------------
##
## Projects: 0
##
## Plots: 0 [Parent plots: 0 Sub-plots: 0]
##
## Individual organisms: 0
##
## Organism names: 0
##
## Taxon concepts: 0
##
## Organism Identities: 0
##
## Vegetation strata: 0
##
## Surface types: 0
##
## Parties: 0
##
## Literature citations: 0
##
## Methods: 0
##
## Plot observations: 0 [in parent plots: 0 in sub-plots: 0]
##
## Individual organism observations: 0
##
## Aggregated organism observations: 0
##
## Stratum observations: 0
##
## Community observations: 0
##
## Site observations: 0
##
## Surface cover observations: 0
##
## ================================================================
Of course, moki_vegx is still empty (as are the other two VegX objects).
In the following sections we will progressively add content to the VegX objects. When using the VegX package, the order in which we introduce data to VegX documents is not particularly important, as elements are created as needed. Nevertheless, we will introduce the different functions that add data following a logical sequence. Thus, we begin by introducing plot and survey information, followed by observations of individual organisms, taxa, strata, etc. Later sections of the manual deal with functions that facilitate data integration and harmonization.
Function | Description |
---|---|
addPlotObservations() | Adds plot observation records to a VegX object from a data table where rows are plot observations. |
addPlotLocations() | Adds/replaces static plot location information (spatial coordinates, elevation, place names, …) to plot elements of a VegX object. |
addPlotGeometries() | Adds/replaces static plot geometry information (plot shape, dimensions, …) to plot elements of a VegX object. |
addSiteCharacteristics() | Adds/replaces static site characteristics (topography, geology, …) to plot elements of a VegX object. |
fillProjectInformation() | Fills the information for a given research project. |
In this subsection we show how to introduce information about plot names and survey dates. We start with the Mokihinui forest data set, by inspecting the data in the data frame moki_site:
head(moki_site, 3)
## PlotID PlotObsID Plot Subplot Project
## 38 789033 1161630 LGM08r MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 39 789034 1161631 LGM08r 1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 40 789035 1161632 LGM08r 2Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## PlotObsCurrentName PlotLocationDescription PlotObsStartDate PlotObsStopDate
## 38 LGM08r Mokihinui, Lower gorge 2011-02-17 2011-02-17
## 39 1Q 2011-02-17 2011-02-17
## 40 2Q 2011-02-17 2011-02-17
## MeanTopHeight MeanTopHeightUnits PlotPermanence PlotObsCanopyPercentage
## 38 3 m True 20
## 39 NA NA
## 40 NA NA
## PlotArea AreaUnits Shape Altitude AltitudeUnits Drainage
## 38 400 m Square 95 m Good
## 39 100 m Square NA
## 40 100 m Square NA
## DrainageTechniqueName AltitudeDatum PlotSlope SlopeUnits PlotTreatment
## 38 Standard Soil Drainage a.s.l 40 degrees Normal
## 39 NA
## 40 NA
## PlotAspect AspectDirection PlotObsIsRelocated Physiography
## 38 360 NA NA Face
## 39 NA NA NA
## 40 NA NA NA
## PhysiographyTechniqueName ParentMaterial ParentMaterialTechniqueName
## 38 Standard Physiography NA NA
## 39 NA NA
## 40 NA NA
## PlotRadius RadiusUnits PlotRectangleLength01 PlotRectangleLength02
## 38 NA NA 20 20
## 39 NA NA 10 10
## 40 NA NA 10 10
## RectangleUnits Placement ParentPlotID ParentPlotObsID
## 38 m Objective: Stratified random 789033 1161630
## 39 m 789033 1161630
## 40 m 789033 1161630
## ProjectID pH
## 38 2444 7
## 39 2444 7
## 40 2444 7
The data frame has many columns, encompassing plot shape, site characteristics, experimental treatments, etc. The most important columns to parse at the beginning are Plot, Subplot and PlotObsStartDate, because these specify the spatial and temporal context of the vegetation observations. Other columns specify identifiers (IDs), but these are specific to the source database. As Veg-X documents have their own internal IDs, it is not necessary to import the source identifiers.
To import data into Veg-X documents, we almost always need a mapping between the names of elements in the Veg-X standard and the names of columns in the data table used as input. For example, in the following code we define that column "Project" in the source data table contains the information about the projectTitle in Veg-X, column "Plot" contains the information about the plotName element, and so on:
mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
obsStartDate = "PlotObsStartDate", obsEndDate = "PlotObsStopDate")
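Because a mapping is just a named list of column names, a misspelled column is easy to miss. The following base R sketch checks a mapping against a table before importing. The helper missingColumns() is hypothetical (it is not part of VegX), and the data frame is a toy stand-in for moki_site containing only the mapped columns.

```r
# Toy stand-in for moki_site with only the columns used by the mapping
site <- data.frame(
  Project = "MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011",
  Plot = c("LGM08r", "LGM08r", "LGM08r"),
  Subplot = c("", "1Q", "2Q"),
  PlotObsStartDate = as.Date("2011-02-17"),
  PlotObsStopDate = as.Date("2011-02-17"),
  stringsAsFactors = FALSE
)

mapping <- list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
                obsStartDate = "PlotObsStartDate", obsEndDate = "PlotObsStopDate")

# Hypothetical helper: columns named in the mapping that are absent from the table
missingColumns <- function(mapping, x) {
  setdiff(unname(unlist(mapping)), names(x))
}

ok_check  <- missingColumns(mapping, site)                      # no missing columns
bad_check <- missingColumns(list(plotName = "PlotName"), site)  # "PlotName"
print(ok_check)
print(bad_check)
```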
Once the mapping is defined, we can import the data using addPlotObservations():
moki_vegx = addPlotObservations(moki_vegx, moki_site, mapping = mapping)
## 1 project(s) parsed, 1 new project(s) added.
## 25 plot(s) parsed, 25 new plot(s) added.
## 25 plot observation(s) parsed, 25 new plot observation(s) added.
The console output of the add function informs us of the steps that took place and the modifications made to our Veg-X R object (note that we could store the result in a different object instead of replacing moki_vegx). In total, 25 plots were identified, all belonging to the same research project, and one plot observation was read for each plot. If we call the summary() function again we will see a change in the number of data elements:
summary(moki_vegx)
## ================================================================
## Veg-X object (ver 2.0.0)
## ----------------------------------------------------------------
##
## Projects: 1
## 1. MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
##
## Plots: 25 [Parent plots: 5 Sub-plots: 20]
##
## Individual organisms: 0
##
## Organism names: 0
##
## Taxon concepts: 0
##
## Organism Identities: 0
##
## Vegetation strata: 0
##
## Surface types: 0
##
## Parties: 0
##
## Literature citations: 0
##
## Methods: 0
##
## Plot observations: 25 [in parent plots: 5 in sub-plots: 20]
##
## Individual organism observations: 0
##
## Aggregated organism observations: 0
##
## Stratum observations: 0
##
## Community observations: 0
##
## Site observations: 0
##
## Surface cover observations: 0
##
## ================================================================
Note that among the 25 plots there are 20 sub-plots (i.e. 4 quadrants for each parent plot). If we want to inspect, at any time, the content of a Veg-X object in more detail, we can use the function showElementTable(), indicating which of the main Veg-X elements we want to inspect:
head(showElementTable(moki_vegx, "plotObservation"),6)
## plotName obsStartDate obsEndDate projectTitle
## 1 LGM08r 2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 2 LGM08r_1Q 2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 3 LGM08r_2Q 2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 4 LGM08r_3Q 2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 5 LGM08r_4Q 2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 6 LGM16l 2011-02-15 2011-02-15 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
When sub-plots are added to a VegX object, the package automatically names them by concatenating the name of the plot with the name of the subplot, with an underscore ’_’ to separate both strings.
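The naming rule just described can be imitated in base R. This is only an illustration of the rule, not the package's internal code; the helper fullPlotName() is hypothetical, and it assumes that an empty string or NA in the sub-plot column means "no sub-plot".

```r
# Hypothetical helper mimicking the plot/sub-plot naming rule
fullPlotName <- function(plot, subplot) {
  has_sub <- !is.na(subplot) & nzchar(subplot)
  ifelse(has_sub, paste(plot, subplot, sep = "_"), plot)
}

plot_names <- fullPlotName(c("LGM08r", "LGM08r", "LGM08r"), c("", "1Q", NA))
print(plot_names)
# "LGM08r" "LGM08r_1Q" "LGM08r"
```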
Let’s now read plots and plot observations for the Mt Fyffe forest data set. Since it comes from the same vegetation database (NVS), the data tables have similar column names and we will not show them again. In this case, however, there is no information about the sampling end date, only the start date. We modify our mapping accordingly and call addPlotObservations():
mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
obsStartDate = "PlotObsStartDate")
mtfyffe_vegx = addPlotObservations(mtfyffe_vegx, mtfyffe_site, mapping)
## 2 project(s) parsed, 2 new project(s) added.
## 165 plot(s) parsed, 165 new plot(s) added.
## 326 plot observation(s) parsed, 326 new plot observation(s) added.
summary(mtfyffe_vegx)
## ================================================================
## Veg-X object (ver 2.0.0)
## ----------------------------------------------------------------
##
## Projects: 2
## 1. FYFFE, MOUNT FOREST 1980
## 2. FYFFE, MOUNT FOREST 2007-2008
##
## Plots: 165 [Parent plots: 4 Sub-plots: 161]
##
## Individual organisms: 0
##
## Organism names: 0
##
## Taxon concepts: 0
##
## Organism Identities: 0
##
## Vegetation strata: 0
##
## Surface types: 0
##
## Parties: 0
##
## Literature citations: 0
##
## Methods: 0
##
## Plot observations: 326 [in parent plots: 8 in sub-plots: 318]
##
## Individual organism observations: 0
##
## Aggregated organism observations: 0
##
## Stratum observations: 0
##
## Community observations: 0
##
## Site observations: 0
##
## Surface cover observations: 0
##
## ================================================================
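The plot and plot-observation counts in this summary differ because a plot is counted once however many times it is surveyed, whereas each survey of a plot is a separate plot observation. A toy base R illustration follows. The rows are invented for the example (the sub-plot name "A" is made up), and the project title is used as a stand-in for the survey.

```r
# Toy table: two parent plots and one sub-plot, each recorded in two surveys
obs <- data.frame(
  Project = rep(c("FYFFE, MOUNT FOREST 1980", "FYFFE, MOUNT FOREST 2007-2008"), each = 3),
  Plot = rep(c("6 4", "6 4", "12 1"), times = 2),
  Subplot = rep(c("", "A", ""), times = 2),
  stringsAsFactors = FALSE
)

# A plot is identified by its plot name plus sub-plot name (if any)
plot_key <- ifelse(nzchar(obs$Subplot), paste(obs$Plot, obs$Subplot, sep = "_"), obs$Plot)

n_projects <- length(unique(obs$Project))                        # 2
n_plots    <- length(unique(plot_key))                           # 3
n_plot_obs <- nrow(unique(data.frame(plot_key, obs$Project)))    # 6
print(c(projects = n_projects, plots = n_plots, plotObservations = n_plot_obs))
```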
In this source data set there are many more sub-plots for each parent plot, and each parent plot was visited twice (in 1980 and in the austral summer of 2007-2008). Moreover, each survey corresponds to a different project. Veg-X does not require projects to be equated to surveys, but this data set is structured this way. We now turn our attention to the Takitimu grassland data set.
mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
obsStartDate = "PlotObsStartDate")
taki_vegx = addPlotObservations(taki_vegx, taki_site, mapping)
## 1 project(s) parsed, 1 new project(s) added.
## 5 plot(s) parsed, 5 new plot(s) added.
## 5 plot observation(s) parsed, 5 new plot observation(s) added.
summary(taki_vegx)
## ================================================================
## Veg-X object (ver 2.0.0)
## ----------------------------------------------------------------
##
## Projects: 1
## 1. TAKITIMU GRASSLAND 1968-1969
##
## Plots: 5 [Parent plots: 5 Sub-plots: 0]
##
## Individual organisms: 0
##
## Organism names: 0
##
## Taxon concepts: 0
##
## Organism Identities: 0
##
## Vegetation strata: 0
##
## Surface types: 0
##
## Parties: 0
##
## Literature citations: 0
##
## Methods: 0
##
## Plot observations: 5 [in parent plots: 5 in sub-plots: 0]
##
## Individual organism observations: 0
##
## Aggregated organism observations: 0
##
## Stratum observations: 0
##
## Community observations: 0
##
## Site observations: 0
##
## Surface cover observations: 0
##
## ================================================================
According to this summary, this third data set again contains 5 plots, but no sub-plots; even though we specified a mapping for sub-plots, there were no sub-plots in the source data table to populate the VegX object.
In the previous subsection, we specified a mapping for research project titles, and this led to the creation of project elements in Veg-X documents. However, we did not introduce any data describing the project:
showElementTable(moki_vegx, "project")
## title
## 1 MOKIHINUI HYDRO PROPOSAL - ...
The VegX package provides the function fillProjectInformation() to fill project data. It can be used to fill the data for an existing project (identified by its title) or to define a new project. In this case the data are passed directly as text arguments to the function, instead of being read from a data frame. As an example, we provide the information for the project that led to the collection of data in the Mokihinui forest:
moki_vegx = fillProjectInformation(moki_vegx, "MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011",
personnel = c(contributor = "Susan K. Wiser"),
abstract = paste("Characterise the forest and riparian vegetation",
"in the lower Mokihinui gorge,",
"and compare this with the vegetation",
"in (a) North Branch gorge of Mokihinui",
"and (b) Karamea catchment."),
studyAreaDescription = paste("Mokihinui and Karamea catchments.",
"Forest riparian habitat."))
## 1 new party(ies) added to the document as individuals.
showElementTable(moki_vegx, "project")
## title abstract
## 1 MOKIHINUI HYDRO PROPOSAL - ... Characterise the forest and...
## studyAreaDescription
## 1 Mokihinui and Karamea catch...
Note that filling in the information about the project led to the definition of the personnel involved in the project. In the Veg-X standard any individual/organization/position involved in the creation of a data set is stored in a party element. We may fill contact information for party elements using the function fillPartyInformation().
The next piece of information we will introduce is the geographic location of plots (sampling dates were already mapped with addPlotObservations()). We thus take a look at the moki_loc data frame:
head(moki_loc, 3)
## AbsoluteCoordID Plot Subplot Project
## 7 201568 LGM45h NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 11 201531 LGM08r NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 14 201561 LGM38h NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## Type AbsoluteCoordXEastLong AbsoluteCoordYNorthLat Datum
## 7 Grid Coordinate (map) 2429288 5960763 NZGD1949
## 11 Grid Coordinate (map) 2438288 5962438 NZGD1949
## 14 Grid Coordinate (map) 2431238 5962363 NZGD1949
## MapProjection MapSeries MapSheet GPS Method
## 7 NZMG NZMS 260 L28 Garmin GPSMap 60CSX GPS
## 11 NZMG NZMS 260 L28 Garmin GPSMap 60CSX GPS
## 14 NZMG NZMS 260 L28 Garmin GPSMap 60CSX GPS
## Source EastingMG NorthingMG Longitude Latitude PrecisionMetres
## 7 Original Coordinate 2429288 5960763 172.0326 -41.5559 5
## 11 Original Coordinate 2438288 5962438 172.1407 -41.5417 5
## 14 Original Coordinate 2431238 5962363 172.0562 -41.5417 5
## PrecisionShape ParentPlotID ParentPlotObsID PlotCoordinateID PlotObsID
## 7 Circle 789218 1161815 287136 1161815
## 11 Circle 789033 1161630 287099 1161630
## 14 Circle 789183 1161780 287129 1161780
## ProjectID
## 7 2444
## 11 2444
## 14 2444
Locations are expressed using different coordinate systems, but the easiest and most common way of exchanging geographic information is by using latitude and longitude. Hence, we define a new mapping and use the function addPlotLocations():
mapping = list(plotName = "Plot", x = "Longitude", y = "Latitude")
moki_vegx = addPlotLocations(moki_vegx, moki_loc, mapping,
proj4string = "+proj=longlat +datum=WGS84")
## 5 plot(s) parsed, 0 new plot(s) added.
## 5 record(s) parsed.
When defining the mapping, x and y are used to map coordinates. We should also include plotName, because otherwise the function does not know how to match coordinates with the plots already defined in moki_vegx (subPlotName should be included if coordinates are available for subplots). The parameter proj4string is used to supply the spatial reference system of the coordinates. The console output indicates that no new plots have been added (they were previously defined), but new plots would have been created if we had started populating an empty Veg-X object using addPlotLocations(). We can inspect the newly entered data using the following command:
head(showElementTable(moki_vegx, "plot"),3)
## plotName coordX coordY spatialReference relatedPlotName
## 1 LGM08r 172.1407 -41.5417 +proj=longlat +datum=WGS84 <NA>
## 2 LGM08r_1Q NA NA <NA> LGM08r
## 3 LGM08r_2Q NA NA <NA> LGM08r
## plotRelationship
## 1 <NA>
## 2 subplot
## 3 subplot
When calling showElementTable() for plot elements, the plot/sub-plot relationships are also shown. Note that sub-plots have no explicit coordinates associated with them (they are not given in moki_loc); it is up to the user to provide them in the source data. Using the same mapping we can parse the coordinates of the Mt Fyffe forest data set:
mtfyffe_vegx = addPlotLocations(mtfyffe_vegx, mtfyffe_loc, mapping)
## 4 plot(s) parsed, 0 new plot(s) added.
## 8 record(s) parsed.
In this example, 8 records were parsed, but coordinates are available for four plots only. Coordinate records are duplicated in mtfyffe_loc, because they are provided independently for each survey. The function addPlotLocations() keeps only the most recently read location record of each plot. Finally, we parse plot coordinates for the Takitimu grassland data set and find that they are missing for three of the plots.
taki_vegx = addPlotLocations(taki_vegx, taki_loc, mapping)
## 5 plot(s) parsed, 0 new plot(s) added.
## 5 record(s) parsed.
## 3 record(s) with missing value(s) not added.
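The two behaviours described above (records with missing coordinates are skipped, and only the last location record read for each plot is kept) can be imitated in base R to preview what an import would retain. This is a sketch of the described behaviour on a toy table, not the package's implementation; the third plot's missing coordinates are invented for the example.

```r
# Toy location table: two plots recorded in two surveys, one plot without coordinates
loc <- data.frame(
  Plot = c("6 4", "12 1", "6 4", "12 1", "14 2"),
  Longitude = c(173.6370, 173.6479, 173.6370, 173.6479, NA),
  Latitude = c(-42.3276, -42.2728, -42.3276, -42.2728, NA),
  stringsAsFactors = FALSE
)

# Drop records with missing coordinates
complete <- loc[!is.na(loc$Longitude) & !is.na(loc$Latitude), ]
n_skipped <- nrow(loc) - nrow(complete)

# Keep the last record read for each plot
last_per_plot <- complete[!duplicated(complete$Plot, fromLast = TRUE), ]
print(n_skipped)
print(last_per_plot)
```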
The function addPlotLocations() accepts coordinates in any spatial reference system (which is specified using the parameter proj4string). Setting toWGS84 = TRUE tells the function to attempt to translate the input coordinates into longitude and latitude, but this was not required in our examples.
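Before declaring coordinates as longitude/latitude, a quick range check helps catch columns that actually hold projected (grid) coordinates. The helper looksLikeLonLat() below is hypothetical and only a plausibility test; the values are taken from the Longitude/Latitude and EastingMG/NorthingMG columns of moki_loc shown earlier.

```r
# Hypothetical helper: TRUE if all non-missing values fall within lon/lat ranges
looksLikeLonLat <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)
  all(abs(x[ok]) <= 180) && all(abs(y[ok]) <= 90)
}

lonlat_ok <- looksLikeLonLat(c(172.0326, 172.1407), c(-41.5559, -41.5417))  # TRUE
grid_ok   <- looksLikeLonLat(c(2429288, 2438288), c(5960763, 5962438))      # FALSE
print(c(lonlat = lonlat_ok, grid = grid_ok))
```

Columns that fail this check would need their own spatial reference system supplied through proj4string.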
While x and y specify the horizontal plot position, the vertical position of a plot is specified using elevation (normally above sea level). Since plot elevation is a measurement, it is important to specify a measurement method (i.e. instruments) and a measurement scale (i.e. measurement units), because this metadata reduces potential errors when pooling data from different sources. In the Veg-X standard, this information is specified by defining method and attribute elements, whereas the VegX package has an S4 class named VegXMethod that encapsulates both. Users can define their own methods, but the package provides the function predefinedMeasurementMethod() to easily define methods for the most common variables. For example, we can define the measurement method for elevation in meters above sea level using:
elevMethod = predefinedMeasurementMethod("Elevation/m")
Plot elevation is added to Veg-X documents using addPlotLocations() as before. However, in our Mokihinui data set elevation is included in the data frame moki_site (and not moki_loc), so we could not add it in the same call that we used for plot coordinates. Having defined our elevation method, we call the function again:
mapping = list(plotName = "Plot", elevation = "Altitude")
moki_vegx = addPlotLocations(moki_vegx, moki_site, mapping,
methods = list(elevation = elevMethod))
## Measurement method 'Elevation/m' added for 'elevation'.
## 5 plot(s) parsed, 0 new plot(s) added.
## 25 record(s) parsed.
## 20 record(s) with missing value(s) not added.
Only the parent plots have elevation data (i.e., the records of sub-plots are missing). If we inspect again the plot elements of our document we find that elevation data has been added to plot coordinates:
head(showElementTable(moki_vegx, "plot"),3)
## plotName coordX coordY spatialReference elevation_method
## 1 LGM08r 172.1407 -41.5417 +proj=longlat +datum=WGS84 Elevation/m
## 2 LGM08r_1Q NA NA <NA> <NA>
## 3 LGM08r_2Q NA NA <NA> <NA>
## elevation_value relatedPlotName plotRelationship
## 1 95 <NA> <NA>
## 2 NA LGM08r subplot
## 3 NA LGM08r subplot
Analogous calls to addPlotLocations() can be made to fill elevation data for the Mt Fyffe forest and the Takitimu grassland data sets:
mtfyffe_vegx = addPlotLocations(mtfyffe_vegx, mtfyffe_site, mapping,
methods = c(elevation = elevMethod))
## Measurement method 'Elevation/m' added for 'elevation'.
## 4 plot(s) parsed, 0 new plot(s) added.
## 326 record(s) parsed.
## 318 record(s) with missing value(s) not added.
taki_vegx = addPlotLocations(taki_vegx, taki_site, mapping,
methods = list(elevation = "Elevation/m"))
## Measurement method 'Elevation/m' added for 'elevation'.
## 5 plot(s) parsed, 0 new plot(s) added.
## 5 record(s) parsed.
Note that in the last call we specified the method for elevation directly as a string. This avoids having to call the function predefinedMeasurementMethod().
By plot geometry, we refer to plot area, shape and dimensions. Veg-X allows different plot shapes (circle, rectangle, line or polygon), and each plot shape implies different dimensions. Plot geometry is specified using the function addPlotGeometries() and, analogously to addPlotLocations(), the function will replace any previous information regarding geometry. We start by looking at the plot geometry fields in the Mokihinui forest data set table moki_site:
names(moki_site)
## [1] "PlotID" "PlotObsID"
## [3] "Plot" "Subplot"
## [5] "Project" "PlotObsCurrentName"
## [7] "PlotLocationDescription" "PlotObsStartDate"
## [9] "PlotObsStopDate" "MeanTopHeight"
## [11] "MeanTopHeightUnits" "PlotPermanence"
## [13] "PlotObsCanopyPercentage" "PlotArea"
## [15] "AreaUnits" "Shape"
## [17] "Altitude" "AltitudeUnits"
## [19] "Drainage" "DrainageTechniqueName"
## [21] "AltitudeDatum" "PlotSlope"
## [23] "SlopeUnits" "PlotTreatment"
## [25] "PlotAspect" "AspectDirection"
## [27] "PlotObsIsRelocated" "Physiography"
## [29] "PhysiographyTechniqueName" "ParentMaterial"
## [31] "ParentMaterialTechniqueName" "PlotRadius"
## [33] "RadiusUnits" "PlotRectangleLength01"
## [35] "PlotRectangleLength02" "RectangleUnits"
## [37] "Placement" "ParentPlotID"
## [39] "ParentPlotObsID" "ProjectID"
## [41] "pH"
table(moki_site$Shape)
##
## Square
## 25
After realizing that plot/subplot shapes are rectangular and both length and width are available, we define the following mapping for rectangular (or square) plots:
mapping = list(plotName = "Plot", subPlotName = "Subplot",
area = "PlotArea", shape = "Shape",
length = "PlotRectangleLength01", width = "PlotRectangleLength02")
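Since area, length and width are all mapped, it can be worth confirming that they agree before importing. The sketch below does this in base R on a toy table whose values are copied from the first three rows of moki_site shown earlier; it is a pre-import sanity check, not something the package is known to do.

```r
# Toy table with the geometry columns of the first three moki_site rows
geom <- data.frame(
  Plot = "LGM08r",
  Subplot = c("", "1Q", "2Q"),
  PlotArea = c(400, 100, 100),
  PlotRectangleLength01 = c(20, 10, 10),
  PlotRectangleLength02 = c(20, 10, 10),
  stringsAsFactors = FALSE
)

# For rectangular plots, area should equal length times width
expected_area <- geom$PlotRectangleLength01 * geom$PlotRectangleLength02
consistent <- isTRUE(all.equal(geom$PlotArea, expected_area))
print(consistent)
# TRUE
```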
Like elevation, plot area and plot dimensions are measurements, so we need to define methods for them. We are now ready to import plot geometry using addPlotGeometries(), where we specify both the mapping and the list of methods corresponding to the Veg-X element names of the mapping (i.e. area, length and width for rectangular plots):
moki_vegx = addPlotGeometries(moki_vegx, moki_site, mapping,
list(area = "Plot area/m2", width = "Plot dimension/m", length = "Plot dimension/m"))
## Measurement method 'Plot area/m2' added for 'area'.
## Measurement method 'Plot dimension/m' added for 'width'.
## Measurement method 'Plot dimension/m' for 'length' already included.
## 25 plot(s) parsed, 0 new plot(s) added.
## 25 record(s) parsed.
head(showElementTable(moki_vegx, "plot"),3)
## plotName area_method area_value shape length_method length_value
## 1 LGM08r Plot area/m2 400 rectangle Plot dimension/m 20
## 2 LGM08r_1Q Plot area/m2 100 rectangle Plot dimension/m 10
## 3 LGM08r_2Q Plot area/m2 100 rectangle Plot dimension/m 10
## width_method width_value coordX coordY spatialReference
## 1 Plot dimension/m 20 172.1407 -41.5417 +proj=longlat +datum=WGS84
## 2 Plot dimension/m 10 NA NA <NA>
## 3 Plot dimension/m 10 NA NA <NA>
## elevation_method elevation_value relatedPlotName plotRelationship
## 1 Elevation/m 95 <NA> <NA>
## 2 <NA> NA LGM08r subplot
## 3 <NA> NA LGM08r subplot
As before, no new plots were added, as previous function calls had already defined them. The output of showElementTable() now shows the plot geometry alongside the plot location and plot/subplot relationships. Importing plot geometry for the Takitimu grassland data set is analogous:
taki_vegx = addPlotGeometries(taki_vegx, taki_site, mapping,
list(area = "Plot area/m2", width = "Plot dimension/m", length = "Plot dimension/m"))
## Measurement method 'Plot area/m2' added for 'area'.
## Measurement method 'Plot dimension/m' added for 'width'.
## Measurement method 'Plot dimension/m' for 'length' already included.
## 5 plot(s) parsed, 0 new plot(s) added.
## 5 record(s) parsed.
head(showElementTable(taki_vegx, "plot"))
## plotName area_method area_value shape length_method
## 1 WB2 2 (WB2 2_TRA) Plot area/m2 400 rectangle Plot dimension/m
## 2 R1 1 (R1 1_TRA) Plot area/m2 400 rectangle Plot dimension/m
## 3 T2 2 (T2 2_TRA) Plot area/m2 400 rectangle Plot dimension/m
## 4 T1 2 (T1 2_TRA) Plot area/m2 400 rectangle Plot dimension/m
## 5 R3 1 (R3 1_TRA) Plot area/m2 400 rectangle Plot dimension/m
## length_value width_method width_value coordX coordY elevation_method
## 1 20 Plot dimension/m 20 167.9438 -45.6321 Elevation/m
## 2 20 Plot dimension/m 20 NA NA Elevation/m
## 3 20 Plot dimension/m 20 NA NA Elevation/m
## 4 20 Plot dimension/m 20 NA NA Elevation/m
## 5 20 Plot dimension/m 20 167.8307 -45.7138 Elevation/m
## elevation_value
## 1 1463
## 2 1097
## 3 914
## 4 1097
## 5 1067
In the case of the Mt Fyffe forest data set, shape is missing for many of the plots. Other plots are circular, but radius is not defined in the mtfyffe_site data frame.
names(mtfyffe_site)
## [1] "PlotID" "PlotObsID"
## [3] "Plot" "Subplot"
## [5] "Project" "PlotObsCurrentName"
## [7] "PlotLocationDescription" "PlotObsStartDate"
## [9] "PlotObsStopDate" "MeanTopHeight"
## [11] "MeanTopHeightUnits" "PlotPermanence"
## [13] "PlotArea" "AreaUnits"
## [15] "Shape" "Altitude"
## [17] "AltitudeUnits" "Drainage"
## [19] "DrainageTechniqueName" "PlotSlope"
## [21] "SlopeUnits" "PlotTreatment"
## [23] "PlotAspect" "Physiography"
## [25] "PhysiographyTechniqueName" "Placement"
## [27] "ParentPlotID" "ParentPlotObsID"
## [29] "ProjectID"
table(mtfyffe_site$Shape)
##
## Circle
## 135 191
Hence, we define the following mapping, and a call to addPlotGeometries() produces the following result:
mapping = list(plotName = "Plot", subPlotName = "Subplot",
area = "PlotArea", shape = "Shape")
mtfyffe_vegx = addPlotGeometries(mtfyffe_vegx, mtfyffe_site, mapping,
list(area = "Plot area/m2"))
## Measurement method 'Plot area/m2' added for 'area'.
## 165 plot(s) parsed, 0 new plot(s) added.
## 326 record(s) parsed.
## 1 record(s) with missing value(s) not added.
head(showElementTable(mtfyffe_vegx, "plot"), 3)
## plotName area_method area_value coordX coordY elevation_method
## 1 6 4 Plot area/m2 400 173.6370 -42.3276 Elevation/m
## 2 12 1 Plot area/m2 400 173.6479 -42.2728 Elevation/m
## 3 14 2 Plot area/m2 400 173.6532 -42.2563 Elevation/m
## elevation_value relatedPlotName plotRelationship shape
## 1 480 <NA> <NA> <NA>
## 2 575 <NA> <NA> <NA>
## 3 879 <NA> <NA> <NA>
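The unlabelled category in the table(mtfyffe_site$Shape) output above presumably corresponds to empty strings rather than NA values. Under that assumption, the following base R sketch shows how blank entries can be turned into explicit NAs so that they are counted as missing; the vector is a toy stand-in for the Shape column.

```r
# Toy stand-in for a Shape column with blank entries
shape <- c("", "Circle", "Circle", "", "Circle")

# Treat empty or whitespace-only strings as missing
shape_clean <- ifelse(nzchar(trimws(shape)), shape, NA)

n_missing_shape <- sum(is.na(shape_clean))
print(table(shape_clean, useNA = "ifany"))
print(n_missing_shape)
# 2
```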
To finish with static plot information, the next data we should add to our Veg-X documents is plot topography. This can be done using the function addSiteCharacteristics(), which also allows introducing other site attributes that are considered static at the time scales of vegetation dynamics (e.g. geological parent material). Having already inspected the data frame moki_site, we expect that an appropriate mapping is:
sitemapping = list(plotName = "Plot", subPlotName = "Subplot",
slope = "PlotSlope", aspect = "PlotAspect")
Since slope and aspect are again measurements, we also need to provide methods for them. After checking the units in the source data we are ready to import the data:
moki_vegx = addSiteCharacteristics(moki_vegx, moki_site, mapping = sitemapping,
measurementMethods = list(slope = "Slope/degrees", aspect = "Aspect/degrees"))
## Measurement method 'Slope/degrees' added for 'slope'.
## Measurement method 'Aspect/degrees' added for 'aspect'.
## 25 plot(s) parsed, 0 new plot(s) added.
## 25 record(s) parsed.
## 40 measurement(s) with missing value(s) not added.
head(showElementTable(moki_vegx, "plot"), 3)
## plotName area_method area_value shape length_method length_value
## 1 LGM08r Plot area/m2 400 rectangle Plot dimension/m 20
## 2 LGM08r_1Q Plot area/m2 100 rectangle Plot dimension/m 10
## 3 LGM08r_2Q Plot area/m2 100 rectangle Plot dimension/m 10
## width_method width_value coordX coordY spatialReference
## 1 Plot dimension/m 20 172.1407 -41.5417 +proj=longlat +datum=WGS84
## 2 Plot dimension/m 10 NA NA <NA>
## 3 Plot dimension/m 10 NA NA <NA>
## elevation_method elevation_value slope_method slope_value aspect_method
## 1 Elevation/m 95 Slope/degrees 40 Aspect/degrees
## 2 <NA> NA <NA> NA <NA>
## 3 <NA> NA <NA> NA <NA>
## aspect_value relatedPlotName plotRelationship
## 1 360 <NA> <NA>
## 2 NA LGM08r subplot
## 3 NA LGM08r subplot
Again, no new plots are added, and missing values correspond to subplots. When calling showElementTable(), the topography information is shown along with the plot information previously added. Since the site data frames for the Mt Fyffe forest and Takitimu grassland data sets have the same structure as that of Mokihinui, adding topography information for these two data sets is straightforward:
mtfyffe_vegx = addSiteCharacteristics(mtfyffe_vegx, mtfyffe_site, mapping = sitemapping,
measurementMethods = list(slope = "Slope/degrees", aspect = "Aspect/degrees"))
## Measurement method 'Slope/degrees' added for 'slope'.
## Measurement method 'Aspect/degrees' added for 'aspect'.
## 165 plot(s) parsed, 0 new plot(s) added.
## 326 record(s) parsed.
## 636 measurement(s) with missing value(s) not added.
taki_vegx = addSiteCharacteristics(taki_vegx, taki_site, mapping = sitemapping,
measurementMethods = list(slope = "Slope/degrees", aspect = "Aspect/degrees"))
## Measurement method 'Slope/degrees' added for 'slope'.
## Measurement method 'Aspect/degrees' added for 'aspect'.
## 5 plot(s) parsed, 0 new plot(s) added.
## 5 record(s) parsed.
At the beginning of the previous section we specified plot observation dates for the plots of our examples, using function addPlotObservation()
. While this function defines survey events for plots, it does not add any of the observations or measurements made during plot visits. In this section we show how to add such information.
Function | Description |
---|---|
addIndividualOrganismObservations() | Adds individual organism observation records (e.g. tree diameters or heights) to a VegX object. |
addAggregateOrganismObservations() | Adds aggregate organism observation records (e.g. % cover of a particular taxon) to a VegX object. |
addStratumObservations() | Adds stratum observation records (e.g. % cover of plants in the tree layer) to a VegX object. |
addCommunityObservations() | Adds community observation records (e.g. stand age or total basal area) to a VegX object. |
addSiteObservations() | Adds site observation records (e.g. abiotic measurements such as pH) to a VegX object. |
addSurfaceCoverObservations() | Adds surface cover observation records (e.g. percent of ground covered by bare soil or rocks) to a VegX object. |
First we focus on observations made on individual organisms (e.g. diameter values measured on individual trees). Since individual organisms can be labelled and re-measured in different plot surveys, Veg-X uses the element individualOrganism to keep track of the organism itself. Then, different elements individualOrganismObservations can be used to contain measurements made on the individual organism each time there was an observation of the plot (i.e. each time the plot was revisited). The individual organism (e.g. a particular tree) is uniquely identified using the plot name and an organism label (i.e. a tag on the specimen). Thus, the same label can be repeated in different plots without causing data integrity problems. Individual organisms and their observations are added to Veg-X using the function addIndividualOrganismObservations()
. We first show how it works using the data frame moki_dia
, which contains diameter measurements for trees in the Mokihinui forest data set:
head(moki_dia, 3)
## DiameterID Plot Subplot Project
## 912 2965105 LGM38h 1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 920 2965112 LGM38h 1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 921 2965114 LGM38h 1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## EntryNo ItemCurrentIdentifier Identifier AliveState MethodName
## 912 1 NA NA Alive Stem Diameter
## 920 8 NA NA Alive Stem Diameter
## 921 10 NA NA Alive Stem Diameter
## Verbatim.Species TaxonID CurrentTaxonID NVSCode NVSSpeciesName
## 912 WEIRAC 1747 1747 WEIRAC Weinmannia racemosa
## 920 COPGRA 2396 2396 COPGRA Coprosma grandifolia
## 921 CYASMI 1146 1146 CYASMI Cyathea smithii
## Diameter DiameterValueUnits AssociationNo Association HeightOrLength
## 912 10.6 cm NA NA
## 920 4.0 cm NA NA
## 921 11.3 cm NA NA
## LinearDimensionUnits ItemObsNote ItemID ItemObsID ParentPlotID
## 912 NA 2321109 3016862 789183
## 920 NA 2321116 3016869 789183
## 921 NA 2321118 3016871 789183
## ParentPlotObsID PlotMethodID PlotObsID ProjectID SampleMethodID
## 912 1161780 1392927 1161781 2444 13406
## 920 1161780 1392927 1161781 2444 13406
## 921 1161780 1392927 1161781 2444 13406
## PlotObsStartDate
## 912 2011-02-20
## 920 2011-02-20
## 921 2011-02-20
unique(moki_dia$Identifier)
## [1] NA
Note that there is a column called Identifier
but no data in it. Fortunately, the data set includes a single survey, so that there is no need to provide labels for individual organisms. Hence, we can define our mapping as follows:
mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate",
taxonName = "NVSSpeciesName", diameterMeasurement = "Diameter")
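Because a mapping is just a named list whose values are column names of the source table, it can be worth checking it against the data before importing. The following is a minimal sketch in base R; the helper missingMappedColumns() and the small stand-in data frame are hypothetical and not part of the VegX package.

```r
# Stand-in for a source table; only the column names matter here.
src = data.frame(Plot = "LGM38h", Subplot = "1Q",
                 PlotObsStartDate = as.Date("2011-02-20"),
                 NVSSpeciesName = "Weinmannia racemosa",
                 Diameter = 10.6, stringsAsFactors = FALSE)

mapping = list(plotName = "Plot", subPlotName = "Subplot",
               obsStartDate = "PlotObsStartDate",
               taxonName = "NVSSpeciesName", diameterMeasurement = "Diameter")

# Hypothetical helper (not a VegX function): mapped columns absent from the data.
missingMappedColumns = function(mapping, data) {
  setdiff(unlist(mapping, use.names = FALSE), names(data))
}

# All mapped columns exist in the stand-in table.
missing_ok = missingMappedColumns(mapping, src)

# Adding a mapping to a column that does not exist ("Tag" is made up) is reported.
missing_bad = missingMappedColumns(c(mapping, list(individualOrganismLabel = "Tag")), src)
```

An empty result means every mapped column is present; any name returned points to a typo or a missing column in the source table.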
If no mapping is provided for individualOrganismLabel, function addIndividualOrganismObservations()
will assume that each record corresponds to a different organism. To define the identity of organisms we can use mapping for either organismName
or taxonName
. The first option is used to specify names that are not taxa (e.g. “tree #1”, “tree #2”, or morphospecies), while the second option explicitly identifies names as taxa. The call to the function produces the following output:
moki_vegx = addIndividualOrganismObservations(moki_vegx, moki_dia, mapping = mapping,
methods = list(diameterMeasurement = "DBH/cm"))
## Measurement method 'DBH/cm' added for 'diameterMeasurement'.
## 23 plot(s) parsed, 0 new added.
## 18 plot observation(s) parsed, 0 new added.
## 28 organism names(s) parsed, 28 new added.
## 0 taxon concept(s) parsed, 0 new added.
## 28 organism identitie(s) parsed, 28 new added.
## 0 individual organism(s) parsed, 643 new added.
## 643 record(s) parsed, 643 new individual organism observation(s) added.
where we see that the number of individual organisms is equal to the number of observations. We can inspect the added individual organism observations using:
head(showElementTable(moki_vegx, "individualOrganismObservation"), 3)
## plotName obsStartDate individualOrganismLabel organismIdentityName
## 1 LGM38h_1Q 2011-02-20 ind1 Weinmannia racemosa
## 2 LGM38h_1Q 2011-02-20 ind2 Coprosma grandifolia
## 3 LGM38h_1Q 2011-02-20 ind3 Cyathea smithii
## diameter_method diameter_value
## 1 DBH/cm 10.6
## 2 DBH/cm 4.0
## 3 DBH/cm 11.3
Note that the column individualOrganismLabel
contains labels created by the function itself, by numbering all individuals of each plot. The call to function addIndividualOrganismObservations()
also led to the definition of elements organismName (used to store the different organism/taxon names that are used in the Veg-X document) and elements organismIdentity (which define the identity of organisms, with links to organism names and taxon concepts). Let’s inspect the latter:
head(showElementTable(moki_vegx, "organismIdentity"), 3)
## identityName originalOrganismName taxon
## 1 Weinmannia racemosa Weinmannia racemosa TRUE
## 2 Coprosma grandifolia Coprosma grandifolia TRUE
## 3 Cyathea smithii Cyathea smithii TRUE
In this case, the identity is simply the species name coming from the source data, but it could be another name considered nomenclaturally more valid for the same species.
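The automatic labels ind1, ind2, … seen above for unlabelled individuals can be mimicked with a per-plot running index. This is only a sketch of the idea in base R with illustrative plot names; how the package generates its labels internally is an assumption here.

```r
# Plot of each record, in source order (values are illustrative).
plots = c("LGM38h_1Q", "LGM38h_1Q", "LGM38h_1Q", "LGM38h_2Q", "LGM38h_2Q")

# Running index within each plot, then prefixed with "ind".
idx = ave(seq_along(plots), plots, FUN = seq_along)
labels = paste0("ind", idx)

# Plot name and label together identify an individual.
individuals = data.frame(plotName = plots, individualOrganismLabel = labels,
                         stringsAsFactors = FALSE)
```

Since labels are only meaningful within a plot, the same label may appear in several plots without ambiguity.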
The Mt Fyffe forest data set also contains tree diameter measurements, but in this case there have been two surveys, so in order to add individual tree observations we need the mapping individualOrganismLabel
to specify which column identifies each tree in each plot:
head(mtfyffe_dia, 3)
## DiameterID Plot Subplot Project EntryNo
## 17 119220 17 3 L FYFFE, MOUNT FOREST 1980 91
## 20 121700 6 4 I FYFFE, MOUNT FOREST 1980 154
## 29 121685 6 4 D FYFFE, MOUNT FOREST 1980 139
## ItemCurrentIdentifier Identifier AliveState MethodName
## 17 5167 5167 Alive Quadrat tree diameter
## 20 7778 7778 Alive Quadrat tree diameter
## 29 7701 7701 Alive Quadrat tree diameter
## Verbatim.Species TaxonID CurrentTaxonID NVSCode NVSSpeciesName
## 17 COPLUC 2400 2400 COPLUC Coprosma lucida
## 20 RUBCIS 2756 2756 RUBCIS Rubus cissoides
## 29 PSEARB 2568 2568 PSEARB Pseudopanax arboreus
## Diameter DiameterValueUnits AssociationNo Association ItemObsNote ItemID
## 17 7.90 cm NA NA 13431
## 20 2.90 cm NA NA 11093
## 29 4.60 cm NA NA 11049
## ItemObsID ParentPlotID ParentPlotObsID PlotMethodID PlotObsID ProjectID
## 17 902441 229 52221 89460 145010 867
## 20 902572 217 52241 67343 149638 867
## 29 902750 217 52241 24783 149634 867
## SampleMethodID PlotObsStartDate
## 17 4629 1980-02-07
## 20 4629 1980-02-07
## 29 4629 1980-02-07
mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate",
taxonName = "NVSSpeciesName", individualOrganismLabel = "Identifier",
diameterMeasurement = "Diameter")
Since the diameter measurement method is the same as before, we can directly run addIndividualOrganismObservations() and inspect the result:
mtfyffe_vegx = addIndividualOrganismObservations(mtfyffe_vegx, mtfyffe_dia,
mapping = mapping,
methods = list(diameterMeasurement = "DBH/cm"))
## Measurement method 'DBH/cm' added for 'diameterMeasurement'.
## 67 plot(s) parsed, 0 new added.
## 117 plot observation(s) parsed, 0 new added.
## 30 organism names(s) parsed, 30 new added.
## 0 taxon concept(s) parsed, 0 new added.
## 30 organism identitie(s) parsed, 30 new added.
## 725 individual organism(s) parsed, 725 new added.
## 1082 record(s) parsed, 1082 new individual organism observation(s) added.
## 105 individual organism observation(s) with missing diameter value(s) not added.
head(showElementTable(mtfyffe_vegx, "individualOrganismObservation"), 3)
## plotName obsStartDate individualOrganismLabel organismIdentityName
## 1 17 3_L 1980-02-07 5167 Coprosma lucida
## 2 6 4_I 1980-02-07 7778 Rubus cissoides
## 3 6 4_D 1980-02-07 7701 Pseudopanax arboreus
## diameter_method diameter_value
## 1 DBH/cm 7.9
## 2 DBH/cm 2.9
## 3 DBH/cm 4.6
Note that in this case the number of observations of individual trees is higher than the number of trees, because of the repeated measurements. Although we do not show it in any example here, it is possible to associate organism observations with particular heights where organisms are observed or with particular strata, by linking them to stratum observations in the same way as is done for aggregate organism observations (described next).
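The difference between the number of individuals and the number of observations follows from identifying each organism by plot name plus label. A minimal base R sketch, using made-up records (the second survey date is invented for illustration):

```r
# Illustrative records: one tree measured in two surveys, and the label "5167"
# reused in a different plot (a different individual).
obs = data.frame(
  plot  = c("17 3_L", "17 3_L", "6 4_I", "6 4_I", "6 4_D"),
  label = c("5167", "5167", "7778", "5167", "7701"),
  date  = as.Date(c("1980-02-07", "1988-02-10", "1980-02-07",
                    "1980-02-07", "1980-02-07")),
  stringsAsFactors = FALSE
)

# An individual is identified by plot and label together.
key = paste(obs$plot, obs$label, sep = "|")

n_observations = nrow(obs)
n_individuals = length(unique(key))
```

Here five records describe four individuals, because one tree was measured twice.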
Aggregate organism observations include measurements that apply to a set of organisms collectively, normally all organisms of the same species identity. The most common examples are abundance values (e.g. cover) for species. Function addAggregateOrganismObservations()
can be used to import such data into a VegX document. We first inspect the Mokihinui forest data frame moki_tcv
to decide what information should be mapped:
head(moki_tcv,3)
## TaxonCategoryValueID Plot Subplot
## 920 4302767 LGM38h NA
## 921 4302768 LGM38h NA
## 922 4302769 LGM38h NA
## Project EntryNo MethodName
## 920 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011 1 Recce Inventory
## 921 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011 1 Recce Inventory
## 922 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011 1 Recce Inventory
## Verbatim.Species TaxonID CurrentTaxonID NVSCode NVSSpeciesName
## 920 DACCUP 1178 1178 DACCUP Dacrydium cupressinum
## 921 DACCUP 1178 1178 DACCUP Dacrydium cupressinum
## 922 DACCUP 1178 1178 DACCUP Dacrydium cupressinum
## Tier TierDescription TierTechniqueName TierLower TierUpper
## 920 Tier 1 > 25 m Standard Recce 25 50.0
## 921 Tier 2 12 - 25 m Standard Recce 12 25.0
## 922 Tier 6 < 30 cm Standard Recce 0 0.3
## TierHeightUnits Category CategoryDescription CategoryTechniqueName
## 920 m 3 6-25% Standard Recce (Allen)
## 921 m 3 6-25% Standard Recce (Allen)
## 922 m 1 <1% Standard Recce (Allen)
## TaxonObsNote ParentPlotID ParentPlotObsID PlotMethodID PlotObsID ProjectID
## 920 789183 1161780 1392859 1161780 2444
## 921 789183 1161780 1392859 1161780 2444
## 922 789183 1161780 1392859 1161780 2444
## SampleMethodID TaxonObsID TierID PlotObsStartDate
## 920 13405 7448485 2133018 2011-02-20
## 921 13405 7448485 2133019 2011-02-20
## 922 13405 7448485 2133023 2011-02-20
As before, taxon names can be drawn from column NVSSpeciesName
. Column Tier
contains information about the stratum where species were recorded, whereas column Category
contains cover values coded on an ordinal cover scale. First, we define a mapping for these variables as well as for plot and observation start date, which together specify a plotObservation (aggregate organism observations were not made in subplots in this data set).
mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate",
taxonName = "NVSSpeciesName",
stratumName = "Tier", cover = "Category")
In order to parse cover values, we could use a method of percent cover, but in this data set cover is specified using cover classes. Thus, we need to define an ordinal scale that can be used to interpret Category
values; this can be done with function defineOrdinalScaleMethod()
:
coverscale = defineOrdinalScaleMethod(name = "Recce cover scale",
description = "Recce recording method by Hurst/Allen",
subject = "plant cover",
citation = "Hurst, JM and Allen, RB. (2007) The Recce method for describing
New Zealand vegetation – Field protocols. Landcare Research, Lincoln.",
codes = c("P","1","2","3", "4", "5", "6"),
quantifiableCodes = c("1","2","3", "4", "5", "6"),
breaks = c(0, 1, 5, 25, 50, 75, 100),
midPoints = c(0.05, 0.5, 15, 37.5, 62.5, 87.5),
definitions = c("Presence", "<1%", "1-5%","6-25%", "26-50%",
"51-75%", "76-100%"))
As the source data specifies taxon abundances within vegetation strata, we also need to supply information on how the strata are defined. The VegX R package provides three different ways of defining strata: by heights, by categories and using a mixed approach. This last option is used in the following code:
moki_strataDef = defineMixedStrata(name = "Recce strata",
description = "Standard Recce stratum definition",
citation = "Hurst, JM and Allen, RB. (2007) The Recce method for describing
New Zealand vegetation – Field protocols. Landcare Research, Lincoln.",
heightStrataBreaks = c(0, 0.3,2.0,5, 12, 25, 50),
heightStrataNames = paste0("Tier ",1:6),
categoryStrataNames = "Tier 7",
categoryStrataDefinition = "Epiphytes")
Having the mapping, the cover scale and the stratum definition we can proceed to import species cover values by strata using function addAggregateOrganismObservations()
:
moki_vegx = addAggregateOrganismObservations(moki_vegx, moki_tcv, mapping,
methods = list(cover=coverscale),
stratumDefinition = moki_strataDef)
## 1 additional aggregate organism measurements found: Category.
## Measurement method 'Recce cover scale' added for 'cover'.
## Stratum definition method 'Recce strata' added.
## 7 new stratum definitions added.
## 5 plot(s) parsed, 0 new added.
## 5 plot observation(s) parsed, 0 new added.
## 148 organism names(s) parsed, 121 new added.
## 148 organism identitie(s) parsed, 121 new added.
## 33 stratum observation(s) parsed, 33 new added.
## 582 record(s) parsed, 582 new aggregate organism observation(s) added.
Note that both the stratum definition and the cover scale contain methods that are added to the Veg-X document. The strata themselves are also added to the document (i.e. stratum elements). Other elements that are added are organism identities (i.e. taxon names), stratum observations (because species were observed while focusing on particular strata) and, finally, the aggregate organism observations themselves. Fewer organism names and organism identities have been added than were parsed, because the Veg-X document already contained some from the individual organism observations. We can inspect the newly added taxon cover observations using:
head(showElementTable(moki_vegx, "aggregateOrganismObservation"),3)
## plotName obsStartDate organismIdentityName stratumName agg_1_method
## 1 LGM38h 2011-02-20 Dacrydium cupressinum Tier 1 Recce cover scale
## 2 LGM38h 2011-02-20 Dacrydium cupressinum Tier 2 Recce cover scale
## 3 LGM38h 2011-02-20 Dacrydium cupressinum Tier 6 Recce cover scale
## agg_1_value
## 1 3
## 2 3
## 3 1
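The agg_1_value column above holds ordinal codes, not percentages. The midpoints supplied to the scale definition are what allow such codes to be turned into approximate cover values later on. The sketch below shows that lookup in plain base R, with the codes and midpoints copied from the defineOrdinalScaleMethod() call above; it is not the package's own conversion routine.

```r
# Quantifiable codes and their midpoints, as given in the scale definition.
quantifiableCodes = c("1", "2", "3", "4", "5", "6")
midPoints = c(0.05, 0.5, 15, 37.5, 62.5, 87.5)
lookup = setNames(midPoints, quantifiableCodes)

# Codes as they appear in the source data; "P" (presence) is not quantifiable.
codes = c("3", "3", "1", "P")

# Named-vector lookup; codes without a midpoint become NA.
cover_percent = unname(lookup[codes])
```

Codes outside quantifiableCodes, such as "P", have no numeric equivalent and yield NA in this sketch.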
The Mt Fyffe forest data set includes individual counts by species and stratum (i.e. another kind of aggregate organism observations) in a data frame mtfyffe_counts
, which has a similar structure to moki_tcv
, but with counts in column Value
:
head(mtfyffe_counts, 3)
## TaxonSimpleValueID Plot Subplot Project EntryNo
## 2 1496938 12 1 E FYFFE, MOUNT FOREST 1980 11
## 3 1496939 12 1 E FYFFE, MOUNT FOREST 1980 12
## 4 1496951 12 1 M FYFFE, MOUNT FOREST 1980 26
## MethodName Verbatim.Species TaxonID CurrentTaxonID NVSCode
## 2 Quadrat sapling PSECOL 2588 2588 PSECOL
## 3 Quadrat sapling RUBCIS 2756 2756 RUBCIS
## 4 Quadrat sapling PSECRA 2571 2571 PSECRA
## NVSSpeciesName Tier TierDescription TierTechniqueName Value Units
## 2 Pseudowintera colorata NA 1 count
## 3 Rubus cissoides NA 1 count
## 4 Pseudopanax crassifolius NA 1 count
## MeasureName TaxonObsNote ParentPlotID ParentPlotObsID PlotMethodID
## 2 Sapling Count 221 52256 508043
## 3 Sapling Count 221 52256 508043
## 4 Sapling Count 221 52256 420498
## PlotObsID ProjectID SampleMethodID TaxonObsID TierID PlotObsStartDate
## 2 142900 867 3512 2185897 NA 1980-02-07
## 3 142900 867 3512 2185898 NA 1980-02-07
## 4 326011 867 3512 2185912 NA 1980-02-07
mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate",
taxonName = "NVSSpeciesName", stratumName = "Tier", counts = "Value")
Analogously to the previous case, we need to specify a measurement method for counts, and in this case we can use function predefinedMeasurementMethod()
:
countscale = predefinedMeasurementMethod("Individual plant counts")
Then we also need to provide the strata definition, which is different from that of the previous data set. Here all strata are defined by height, so we can use a function called defineHeightStrata()
:
mtfyffe_strataDef = defineHeightStrata(name = "Standard seedling/sapling strata",
description = "Seedling/sapling stratum definition",
heightBreaks = c(0, 15, 45, 75, 105, 135, 200),
strataNames = as.character(1:6),
strataDefinitions = c("0-15 cm", "16-45 cm", "46-75 cm",
"76-105 cm", "106-135 cm", "> 135 cm"))
Now, we are ready to import the data:
mtfyffe_vegx = addAggregateOrganismObservations(mtfyffe_vegx, mtfyffe_counts, mapping,
methods = list(counts=countscale),
stratumDefinition = mtfyffe_strataDef)
## 1 additional aggregate organism measurements found: Value.
## Measurement method 'Individual plant counts' added for 'counts'.
## Stratum definition method 'Standard seedling/sapling strata' added.
## 6 new stratum definitions added.
## 131 plot(s) parsed, 0 new added.
## 194 plot observation(s) parsed, 0 new added.
## 55 organism names(s) parsed, 30 new added.
## 55 organism identitie(s) parsed, 30 new added.
## 170 stratum observation(s) parsed, 170 new added.
## 533 record(s) parsed, 533 new aggregate organism observation(s) added.
head(showElementTable(mtfyffe_vegx, "aggregateOrganismObservation"),3)
## plotName obsStartDate organismIdentityName agg_1_method
## 1 12 1_E 1980-02-07 Pseudowintera colorata Individual plant counts
## 2 12 1_E 1980-02-07 Rubus cissoides Individual plant counts
## 3 12 1_M 1980-02-07 Pseudopanax crassifolius Individual plant counts
## agg_1_value stratumName
## 1 1 <NA>
## 2 1 <NA>
## 3 1 <NA>
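The height-based strata defined for Mt Fyffe amount to a binning of heights by the given breaks. The sketch below reproduces that binning with base R's cut(), using the breaks and names from the defineHeightStrata() call above. Treating the intervals as closed on the right is an assumption, and the example heights are made up.

```r
# Breaks (in cm) and stratum names, as in the stratum definition above.
heightBreaks = c(0, 15, 45, 75, 105, 135, 200)
strataNames = as.character(1:6)

# Illustrative plant heights in cm.
heights_cm = c(10, 50, 150)

# Assign each height to a stratum; intervals are (lower, upper] by default.
stratum = as.character(cut(heights_cm, breaks = heightBreaks, labels = strataNames))
```

Six intervals are defined by seven breaks, which is why heightBreaks has one more element than strataNames.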
Again, this results in elements of several kinds being added to our Veg-X document. The process for the Takitimu grassland data set is similar, but in this case the observations are not organized by strata, and the abundance values are frequencies of occurrence.
head(taki_freq, 3)
## TaxonSimpleValueID Plot Subplot Project
## 3 7514719 T2 2 (T2 2_TRA) NA TAKITIMU GRASSLAND 1968-1969
## 6 7566591 T2 2 (T2 2_TRA) NA TAKITIMU GRASSLAND 1968-1969
## 9 7583612 R3 1 (R3 1_TRA) NA TAKITIMU GRASSLAND 1968-1969
## EntryNo MethodName Verbatim.Species TaxonID CurrentTaxonID NVSCode
## 3 0 Transect frequency MOSS 2184 2184 MOSS
## 6 0 Transect frequency SCHPAU 2843 2843 SCHPAU
## 9 0 Transect frequency HELICH 843 843 HELICH
## NVSSpeciesName TierDescription TierTechniqueName Value Units
## 3 Moss species NA NA 38 %
## 6 Schoenus pauciflorus NA NA 9 %
## 9 Helichrysum species NA NA 2 %
## MeasureName TaxonObsNote ParentPlotID ParentPlotObsID PlotMethodID
## 3 Percentage Frequency NA 781855 1153877 1386555
## 6 Percentage Frequency NA 781855 1153877 1386555
## 9 Percentage Frequency NA 781860 1153882 1385873
## PlotObsID ProjectID SampleMethodID TaxonObsID TierID PlotObsStartDate
## 3 1153877 500 13184 7262934 NA 1968-02-07
## 6 1153877 500 13184 7262217 NA 1968-02-07
## 9 1153882 500 13184 7247662 NA 1968-02-07
mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate",
taxonName = "NVSSpeciesName", freq = "Value")
Hence, we specify the new measurement method and again call addAggregateOrganismObservations()
:
taki_vegx = addAggregateOrganismObservations(taki_vegx, taki_freq, mapping,
methods = list(freq="Plant frequency/%"))
## 1 additional aggregate organism measurements found: Value.
## Measurement method 'Plant frequency/%' added for 'freq'.
## 5 plot(s) parsed, 0 new added.
## 5 plot observation(s) parsed, 0 new added.
## 38 organism names(s) parsed, 38 new added.
## 38 organism identitie(s) parsed, 38 new added.
## 94 record(s) parsed, 94 new aggregate organism observation(s) added.
head(showElementTable(taki_vegx, "aggregateOrganismObservation"), 3)
## plotName obsStartDate organismIdentityName agg_1_method
## 1 T2 2 (T2 2_TRA) 1968-02-07 Moss species Plant frequency/%
## 2 T2 2 (T2 2_TRA) 1968-02-07 Schoenus pauciflorus Plant frequency/%
## 3 R3 1 (R3 1_TRA) 1968-02-07 Helichrysum species Plant frequency/%
## agg_1_value
## 1 38
## 2 9
## 3 2
As expected, neither stratum definitions nor stratum observations are added to the Veg-X document, but we still see the addition of organism names, organism identities and aggregate organism observations.
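The "parsed" versus "new added" counts reported in the import messages can be read as a set difference between the names found in the source table and those already stored in the document. A minimal sketch of that bookkeeping in base R, with a handful of illustrative names (this is not the package's internal code):

```r
# Names already stored in the document (e.g. from an earlier import).
existing = c("Weinmannia racemosa", "Coprosma grandifolia", "Cyathea smithii")

# Names found in the new source table, with repeats.
parsed = c("Dacrydium cupressinum", "Weinmannia racemosa",
           "Dacrydium cupressinum", "Cyathea smithii", "Blechnum discolor")

# Distinct names parsed, and the subset not yet in the document.
parsed_unique = unique(parsed)
new_names = setdiff(parsed_unique, existing)

n_parsed = length(parsed_unique)
n_new = length(new_names)
```

In this toy case four names are parsed and two are new, mirroring messages of the form "N organism names(s) parsed, M new added".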
While aggregate organism observations are often related to strata, it is also possible to indicate that measurements of cover or counts were made at a particular height, by mapping to heightMeasurement instead of using stratumName.
In the previous subsections we stated that both individual and aggregate organism observations can be positioned in a particular vegetation stratum (e.g. the moss layer). However, one could imagine measurements that apply to the stratum itself, like the overall cover or basal area of all organisms in the stratum, regardless of their identity. Other common stratum measurements are those that define its vertical limits (e.g. at which height does the tree layer start?). Veg-X allows storing this information in stratumObservation elements. We showed that elements of this kind were automatically created and added when dealing with aggregate organism observations, but here we show how to add measurements that specifically refer to strata using function addStratumObservations()
.
To illustrate how to add stratum observations to a Veg-X document, we take again the Mokihinui forest data set as data source and inspect the data frame moki_str
, which contains strata cover measurements:
head(moki_str, 3)
## TierID Plot Subplot Project
## 26 2133018 LGM38h NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 27 2133019 LGM38h NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 28 2133020 LGM38h NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## MethodName Tier TierDescription TierTechniqueName TierLower TierUpper
## 26 Recce Inventory Tier 1 > 25 m Standard Recce 25 50
## 27 Recce Inventory Tier 2 12 - 25 m Standard Recce 12 25
## 28 Recce Inventory Tier 3 5 - 12 m Standard Recce 5 12
## TierHeightUnits CoverClass CoverClassDescription ParentPlotID
## 26 m 3 NA 789183
## 27 m 3 NA 789183
## 28 m 4 NA 789183
## ParentPlotObsID PlotObsID ProjectID SampleMethodID PlotMethodID
## 26 1161780 1161780 2444 13405 1392859
## 27 1161780 1161780 2444 13405 1392859
## 28 1161780 1161780 2444 13405 1392859
## PlotObsStartDate
## 26 2011-02-20
## 27 2011-02-20
## 28 2011-02-20
The data table also contains stratum height limits, although our definition of strata to import taxon cover data already contained height limits for most strata. We will assume that the data in moki_str
indeed contains actual measurements and define the mapping accordingly:
mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate", stratumName = "Tier",
lowerLimitMeasurement = "TierLower", upperLimitMeasurement = "TierUpper",
cover = "CoverClass")
Both the cover ordinal scale and the strata definitions have been used before, so we do not need to redefine them. We do need, however, to create a definition of the method applying to height measurements, before calling addStratumObservations()
:
heightMethod = predefinedMeasurementMethod("Stratum height/m")
moki_vegx = addStratumObservations(moki_vegx, moki_str, mapping = mapping,
methods = list(lowerLimitMeasurement = heightMethod,
upperLimitMeasurement = heightMethod,
cover=coverscale),
stratumDefinition = moki_strataDef)
## 1 stratum measurement variables found.
## Measurement method 'Stratum height/m' added for 'lowerLimitMeasurement'.
## Measurement method 'Stratum height/m' for 'upperLimitMeasurement' already included.
## Measurement method 'Recce cover scale' for 'cover' already included.
## Stratum definition 'Recce strata' already included.
## 5 plot(s) parsed, 0 new added.
## 5 plot observation(s) parsed, 0 new added.
## 35 record(s) parsed, 2 new stratum observation(s) added.
## 7 measurement(s) with missing value(s) not added.
Note that no new strata definitions are added, as they were already included when adding aggregate organism observations. We do have some new stratum observations. The status of the stratum observations can be shown using:
head(showElementTable(moki_vegx, "stratumObservation"), 3)
## plotName obsStartDate stratumName lowerLimit_method lowerLimit_value
## 1 LGM38h 2011-02-20 Tier 1 Stratum height/m 25
## 2 LGM38h 2011-02-20 Tier 2 Stratum height/m 12
## 3 LGM38h 2011-02-20 Tier 6 Stratum height/m 0
## upperLimit_method upperLimit_value str_1_method str_1_value
## 1 Stratum height/m 50.0 Recce cover scale 3
## 2 Stratum height/m 25.0 Recce cover scale 3
## 3 Stratum height/m 0.3 Recce cover scale 3
Veg-X stores in communityObservation elements all biotic observations and measurements that are naturally defined at the plant community (or vegetation stand) level, such as basal area, species richness or stand age. Since our example source data sets did not include any such measurements, we start by adding a column BA
with simulated basal area values in the moki_site
data frame using a Normal distribution:
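The code that creates this column is not shown here. A minimal sketch of how such a column could be simulated follows. The mean, standard deviation, seed and the truncation at zero are assumptions chosen for illustration, and a small stand-in data frame is used in place of moki_site.

```r
# Stand-in for the site table (illustrative rows only).
site = data.frame(Plot = c("LGM08r", "LGM08r", "LGM08r"),
                  Subplot = c(NA, "1Q", "2Q"),
                  stringsAsFactors = FALSE)

# Fix the seed so that the simulated values are reproducible.
set.seed(1)

# One simulated basal area value per row; mean and sd are arbitrary choices.
# pmax() guards against negative draws, which would be meaningless for basal area.
site$BA = pmax(0, rnorm(nrow(site), mean = 10, sd = 5))
```

Applied to the real data, the same assignment would target moki_site$BA with nrow(moki_site) draws.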
Adding community observations requires, as usual, a mapping where in addition to mapping plots and surveys we can specify mappings for measurements defined at the community level:
# Define mapping
mapping = list(plotName = "Plot", subPlotName = "Subplot",
obsStartDate = "PlotObsStartDate", basal_area = "BA")
Of course, for each measurement we will need to provide a method that describes the measured subject, units, etc. Function addCommunityObservations()
is used to add community observations to a VegX object:
# Add basal area measurements to the VegX object
moki_vegx = addCommunityObservations(moki_vegx, moki_site, mapping = mapping,
methods = list(basal_area = "Basal area/m2*ha-1"))
## Measurement method 'Basal area/m2*ha-1' added for 'basal_area'.
## 25 plot(s) parsed, 0 new plot(s) added.
## 25 plot observation(s) parsed, 0 new plot observation(s) added.
## 25 record(s) parsed, 25 new community observation(s) added.
# Inspect the result
head(showElementTable(moki_vegx, "communityObservation"),3)
## plotName obsStartDate comm_1_method comm_1_value
## 1 LGM08r 2011-02-17 Basal area/m2*ha-1 0.6565694
## 2 LGM08r_1Q 2011-02-17 Basal area/m2*ha-1 9.5813046
## 3 LGM08r_2Q 2011-02-17 Basal area/m2*ha-1 12.0541588
Veg-X stores in siteObservation elements all observations and measurements that do not refer to vegetation itself, i.e. abiotic measurements, soil type classifications, etc. Since our example source data sets did not include any such measurements, we created a column pH
with constant values in the moki_site
data frame. The function that allows adding site observations is addSiteObservations()
and the following code should be rather self-explanatory by now:
mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate")
moki_vegx = addSiteObservations(moki_vegx, moki_site,
plotObservationMapping = mapping,
soilMeasurementMapping = list(a = "pH"),
soilMeasurementMethods = list(a = "pH/0-14"))
## Measurement method 'pH/0-14' added for 'a'.
## 25 plot(s) parsed, 0 new plot(s) added.
## 25 plot observation(s) parsed, 0 new plot observation(s) added.
## 25 record(s) parsed, 25 new site observation(s) added.
In contrast with other add...
functions, ‘a’ is only used here in the context of the addSiteObservations()
function (i.e., there will be no variable called ‘a’ in the Veg-X document). When displaying site observations, columns soil_1_*
, soil_2_*
only indicate the numbering of soil variables:
head(showElementTable(moki_vegx, "siteObservation"))
## plotName obsStartDate soil_1_method soil_1_value
## 1 LGM08r 2011-02-17 pH/0-14 7
## 2 LGM08r_1Q 2011-02-17 pH/0-14 7
## 3 LGM08r_2Q 2011-02-17 pH/0-14 7
## 4 LGM08r_3Q 2011-02-17 pH/0-14 7
## 5 LGM08r_4Q 2011-02-17 pH/0-14 7
## 6 LGM16l 2011-02-15 pH/0-14 7
It is important to distinguish the subject of a method from the method itself. For example, a subject would be the pH measurement of the upper soil solution, whereas particular methods for this subject would be measurement in water or measurement in 0.01 M CaCl2. In the previous example we added the variable ‘pH’ of the input data to the VegX document and defined the measurement method as pH/0-14
, which simply specifies the measurement of pH (the subject) on a 0-14 scale. Let’s look at its definition:
predefinedMeasurementMethod("pH/0-14")
## An object of class "VegXMethodDefinition"
## Slot "name":
## [1] "pH/0-14"
##
## Slot "description":
## [1] "pH scale from 0 to 14"
##
## Slot "citationString":
## [1] ""
##
## Slot "DOI":
## [1] ""
##
## Slot "subject":
## [1] "pH"
##
## Slot "attributeType":
## [1] "quantitative"
##
## Slot "attributes":
## $`1`
## $`1`$type
## [1] "quantitative"
##
## $`1`$unit
## [1] NA
##
## $`1`$lowerLimit
## [1] 0
##
## $`1`$upperLimit
## [1] 14
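The lowerLimit and upperLimit attributes shown above describe the admissible range of a quantitative method. One possible use is to screen source values before import. The sketch below does this in base R, with a plain list standing in for the method object; it does not reproduce any validation the package may perform.

```r
# Plain-list stand-in for the attributes of the "pH/0-14" method.
method = list(name = "pH/0-14", lowerLimit = 0, upperLimit = 14)

# Illustrative source values, including an out-of-range and a missing one.
values = c(7, 4.5, 15.2, NA)

# TRUE only for non-missing values within the method's limits.
inRange = !is.na(values) &
  values >= method$lowerLimit & values <= method$upperLimit
```

Values flagged FALSE are either missing or fall outside the scale and deserve a second look in the source data.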
Surface cover observations are measurements of the percentage of the plot’s surface that is covered (i.e. when projected onto the ground) by different surface types, such as rocks, bare soil, vegetation, etc. Veg-X allows defining surface types as surfaceType elements, and storing cover values for them in surfaceCoverObservation elements. We use the Mt Fyffe forest data set to illustrate how observations of this kind are added to a Veg-X document. First we inspect table mtfyffe_groundcover
and define a mapping:
head(mtfyffe_groundcover, 3)
## PlotGroundCoverID Plot Subplot Project PlotGroundCover
## 14 6157 6 4 NA FYFFE, MOUNT FOREST 1980 Vegetation
## 17 4176 17 3 NA FYFFE, MOUNT FOREST 1980 Vegetation
## 21 1353 14 2 NA FYFFE, MOUNT FOREST 1980 Vegetation
## TechniqueName MeasureName Value ParentPlotID
## 14 Standard Ground Cover Estimate Ground Cover Category 20 217
## 17 Standard Ground Cover Estimate Ground Cover Category 20 229
## 21 Standard Ground Cover Estimate Ground Cover Category 20 226
## ParentPlotObsID PlotObsID ProjectID PlotObsStartDate
## 14 52241 52241 867 1980-02-07
## 17 52221 52221 867 1980-02-07
## 21 52261 52261 867 1980-02-07
mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate",
surfaceName = "PlotGroundCover", coverMeasurement = "Value")
In this case, cover values are specified as percent cover of ground surface, so we need to define an appropriate method:
coverMethod = predefinedMeasurementMethod("Surface cover/%")
We inspect the surface types used in the data set and call function defineSurfaceTypes()
as we have done previously for strata:
unique(mtfyffe_groundcover$PlotGroundCover)
## [1] "Vegetation" "Moss" "Litter" "Exposed Soil" "Rock"
surfaceTypes = defineSurfaceTypes(name = "Default surface types",
description = "Five surface categories",
surfaceNames = c("Vegetation", "Moss", "Litter", "Exposed Soil",
"Rock"))
We can now import surface cover observations using function addSurfaceCoverObservations()
:
mtfyffe_vegx = addSurfaceCoverObservations(mtfyffe_vegx, mtfyffe_groundcover, mapping,
coverMethod, surfaceTypes)
## Measurement method 'Surface cover/%' added for 'coverMeasurement'.
## Surface type definition method 'Default surface types' added.
## 5 new surface type definitions added.
## 4 plot(s) parsed, 0 new added.
## 8 plot observation(s) parsed, 0 new added.
## 40 record(s) parsed, 40 new surface cover observation(s) added.
head(showElementTable(mtfyffe_vegx, "surfaceCoverObservation", 3))
## plotObservationID plotName obsStartDate surfaceTypeID surfaceName cover_attID
## 1 1 6 4 1980-02-07 1 Vegetation 8
## 2 4 17 3 1980-02-07 1 Vegetation 8
## 3 3 14 2 1980-02-07 1 Vegetation 8
## 4 2 12 1 1980-02-07 1 Vegetation 8
## 5 1 6 4 1980-02-07 2 Moss 8
## 6 3 14 2 1980-02-07 2 Moss 8
## cover_method cover_value
## 1 Surface cover/% 20
## 2 Surface cover/% 20
## 3 Surface cover/% 20
## 4 Surface cover/% 20
## 5 Surface cover/% 20
## 6 Surface cover/% 20
As with strata, the function added the surface type definitions to the Veg-X document in addition to the cover values themselves.
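Before importing cover values it can be useful to check that they are plausible percentages. The following is a minimal base-R sketch and does not use any VegX function; the data frame `groundcover` and its values are hypothetical and only mimic the columns Plot, PlotGroundCover and Value of the source tables.

```r
# Toy stand-in for a ground cover table (hypothetical values, not NVS data).
groundcover <- data.frame(
  Plot = c("A", "A", "A", "B", "B"),
  PlotGroundCover = c("Vegetation", "Litter", "Rock", "Vegetation", "Moss"),
  Value = c(60, 30, 10, 80, 20),
  stringsAsFactors = FALSE
)

# Percent cover values must lie between 0 and 100.
inRange <- groundcover$Value >= 0 & groundcover$Value <= 100

# Total cover per plot. Whether totals may exceed 100 depends on the
# sampling protocol, so this is a diagnostic rather than a strict test.
totalCover <- tapply(groundcover$Value, groundcover$Plot, sum)

print(all(inRange))
print(totalCover)
```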
The Takitimu grassland data set also includes surface cover observations, although the surface types are slightly different:
head(taki_groundcover, 3)
## PlotGroundCoverID Plot Subplot Project
## 26 396242 R3 1 (R3 1_TRA) NA TAKITIMU GRASSLAND 1968-1969
## 27 396243 R3 1 (R3 1_TRA) NA TAKITIMU GRASSLAND 1968-1969
## 28 396244 R3 1 (R3 1_TRA) NA TAKITIMU GRASSLAND 1968-1969
## PlotGroundCover TechniqueName MeasureName Value
## 26 Litter Standard Ground Cover Estimate Ground Cover Category 25
## 27 Rock Standard Ground Cover Estimate Ground Cover Category 25
## 28 Vegetation Standard Ground Cover Estimate Ground Cover Category 25
## ParentPlotID ParentPlotObsID PlotObsID ProjectID PlotObsStartDate
## 26 781860 1153882 1153882 500 1968-02-07
## 27 781860 1153882 1153882 500 1968-02-07
## 28 781860 1153882 1153882 500 1968-02-07
unique(taki_groundcover$PlotGroundCover)
## [1] "Litter" "Rock" "Vegetation" "Erosion Pavement"
## [5] "Soil"
Therefore, we must define a new set of surface types before calling function addSurfaceCoverObservations():
surfaceTypes = defineSurfaceTypes(name = "Default surface types",
description = "Five surface categories",
surfaceNames = c("Vegetation", "Soil", "Erosion Pavement", "Litter",
"Rock"))
taki_vegx = addSurfaceCoverObservations(taki_vegx, taki_groundcover, mapping,
coverMethod, surfaceTypes)
## Measurement method 'Surface cover/%' added for 'coverMeasurement'.
## Surface type definition method 'Default surface types' added.
## 5 new surface type definitions added.
## 5 plot(s) parsed, 0 new added.
## 5 plot observation(s) parsed, 0 new added.
## 17 record(s) parsed, 17 new surface cover observation(s) added.
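The difference between the surface types of the two data sets can be checked directly with base-R set operations. This sketch only uses the category names printed by the two unique() calls above; the variable names are chosen for illustration.

```r
# Surface type names as printed above for each data set.
mtfyffeTypes <- c("Vegetation", "Moss", "Litter", "Exposed Soil", "Rock")
takiTypes <- c("Litter", "Rock", "Vegetation", "Erosion Pavement", "Soil")

# Categories present in both data sets.
shared <- intersect(mtfyffeTypes, takiTypes)

# Categories used in only one of the two data sets.
onlyMtFyffe <- setdiff(mtfyffeTypes, takiTypes)
onlyTaki <- setdiff(takiTypes, mtfyffeTypes)

print(shared)
print(onlyMtFyffe)
print(onlyTaki)
```

Only three of the five categories are shared, which is why each data set needs its own surface type definition.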
One of the purposes of importing data into Veg-X is to make it possible to combine and harmonize documents from different sources. In this section we illustrate how documents should be merged, and present some functions that can be used to harmonize their contents.
When combining VegX objects from different sources it is important to pay attention to plot names, because plots from two different sources may have been given the same name while in fact corresponding to different sampled areas. To combine two vegetation sources without confusing plot identity, one should use unique plot identifiers, i.e. sub-element plotUniqueIdentifier of plot. When populating a Veg-X object from a single source data set, unique identifiers are normally neither available nor needed, and the functions that add observations to the object only look at plotName to identify plots. However, when merging VegX objects unique identifiers should be defined, and two plots should be considered the same only if both their plot names and their unique identifiers match. While less critical than plot unique identifiers, the Veg-X standard also allows unique identifiers for plot observations, via the sub-element plotObservationUniqueIdentifier of plotObservation.
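The risk of relying on plot names alone can be illustrated with a small base-R sketch. The two data frames and their identifier values are hypothetical, and the matching shown here is plain merge() from base R, not the logic implemented inside the VegX package.

```r
# Two hypothetical sources that both happen to contain a plot named "P1".
sourceA <- data.frame(plotName = c("P1", "P2"),
                      plotUniqueIdentifier = c("id-a1", "id-a2"),
                      stringsAsFactors = FALSE)
sourceB <- data.frame(plotName = c("P1", "P3"),
                      plotUniqueIdentifier = c("id-b1", "id-b3"),
                      stringsAsFactors = FALSE)

# Matching on the name alone equates the two unrelated "P1" plots.
sameByName <- merge(sourceA, sourceB, by = "plotName")

# Matching on name and unique identifier keeps them apart.
sameByBoth <- merge(sourceA, sourceB,
                    by = c("plotName", "plotUniqueIdentifier"))

print(nrow(sameByName))
print(nrow(sameByBoth))
```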
The VegX package provides two ways to supply unique identifiers. The function addPlotObservations()
allows specifying mappings for both plotUniqueIdentifier and plotObservationUniqueIdentifier:
mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
obsStartDate = "PlotObsStartDate", obsEndDate = "PlotObsStopDate",
plotUniqueIdentifier = "PlotID", plotObservationUniqueIdentifier = "PlotObsID")
vegx_ids = addPlotObservations(newVegX(), moki_site, mapping = mapping, verbose = FALSE)
head(showElementTable(vegx_ids, "plot"), 3)
## plotName plotUniqueIdentifier relatedPlotName plotRelationship
## 1 LGM08r 789033 <NA> <NA>
## 2 LGM08r_1Q 789034 LGM08r subplot
## 3 LGM08r_2Q 789035 LGM08r subplot
head(showElementTable(vegx_ids, "plotObservation"), 3)
## plotName obsStartDate obsEndDate plotObservationUniqueIdentifier
## 1 LGM08r 2011-02-17 2011-02-17 1161630
## 2 LGM08r_1Q 2011-02-17 2011-02-17 1161631
## 3 LGM08r_2Q 2011-02-17 2011-02-17 1161632
## projectTitle
## 1 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 2 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 3 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
We could use function addPlotObservations() to define unique identifiers because these were available in our source data. Note, however, that IDs coming from NVS are only unique within the context of that data bank. When the source data do not include unique identifiers, or when those available may not be unique in all situations, one can generate universally unique identifiers (or replace the current identifiers) using function fillUniqueIdentifiers():
moki_vegx = fillUniqueIdentifiers(target = moki_vegx, element = "plot")
head(showElementTable(moki_vegx, "plot"),3)
## plotName plotUniqueIdentifier area_method area_value
## 1 LGM08r 165a4d88-2e3e-4606-9234-7965efd90c6e Plot area/m2 400
## 2 LGM08r_1Q 9edd1675-0c30-45ea-af60-e5620c9ee0bc Plot area/m2 100
## 3 LGM08r_2Q 5fda70fd-df56-459d-8845-8787dbf62044 Plot area/m2 100
## shape length_method length_value width_method width_value coordX
## 1 rectangle Plot dimension/m 20 Plot dimension/m 20 172.1407
## 2 rectangle Plot dimension/m 10 Plot dimension/m 10 NA
## 3 rectangle Plot dimension/m 10 Plot dimension/m 10 NA
## coordY spatialReference elevation_method elevation_value
## 1 -41.5417 +proj=longlat +datum=WGS84 Elevation/m 95
## 2 NA <NA> <NA> NA
## 3 NA <NA> <NA> NA
## slope_method slope_value aspect_method aspect_value relatedPlotName
## 1 Slope/degrees 40 Aspect/degrees 360 <NA>
## 2 <NA> NA <NA> NA LGM08r
## 3 <NA> NA <NA> NA LGM08r
## plotRelationship
## 1 <NA>
## 2 subplot
## 3 subplot
A UUID (Universally Unique Identifier) is a 128-bit number used to uniquely identify some object or entity. When generated according to the standard methods, UUIDs are for practical purposes unique, and unlike most other numbering schemes their uniqueness does not depend on a central registration authority or on coordination between the parties generating them. While the probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible. If we are interested in merging different documents it is important to ensure that unique identifiers are defined for plots. Function fillUniqueIdentifiers() generates UUIDs by calling function UUIDgenerate() from the R package uuid.
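As a quick sanity check, one can verify that identifiers have the textual layout of a UUID and that none is duplicated. The sketch below uses only base R and the three identifiers printed in the plot table above; the regular expression assumes version-4 (random) UUIDs written in lower case, which is the layout those three values happen to follow.

```r
# Identifiers copied from the plot table shown above.
ids <- c("165a4d88-2e3e-4606-9234-7965efd90c6e",
         "9edd1675-0c30-45ea-af60-e5620c9ee0bc",
         "5fda70fd-df56-459d-8845-8787dbf62044")

# Canonical 8-4-4-4-12 hexadecimal layout of a version-4 UUID:
# the third group starts with "4", the fourth with 8, 9, a or b.
uuid4Pattern <- paste0("^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-",
                       "[89ab][0-9a-f]{3}-[0-9a-f]{12}$")

isUuid4 <- grepl(uuid4Pattern, ids)
noDuplicates <- anyDuplicated(ids) == 0

# A plain database key such as an NVS plot ID does not match the layout.
nvsIdMatches <- grepl(uuid4Pattern, "789033")

print(isUuid4)
print(noDuplicates)
print(nvsIdMatches)
```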
As we did for the Mokihinui VegX object, we generate universally unique identifiers for the other two VegX objects:
mtfyffe_vegx = fillUniqueIdentifiers(target = mtfyffe_vegx, element = "plot")
taki_vegx = fillUniqueIdentifiers(target = taki_vegx, element = "plot")
head(showElementTable(moki_vegx, "organismIdentity"),10)
## identityName originalOrganismName taxon
## 1 Weinmannia racemosa Weinmannia racemosa TRUE
## 2 Coprosma grandifolia Coprosma grandifolia TRUE
## 3 Cyathea smithii Cyathea smithii TRUE
## 4 Dacrycarpus dacrydioides Dacrycarpus dacrydioides TRUE
## 5 Dicksonia squarrosa Dicksonia squarrosa TRUE
## 6 Pseudowintera axillaris Pseudowintera axillaris TRUE
## 7 Quintinia acutifolia Quintinia acutifolia TRUE
## 8 Coprosma lucida Coprosma lucida TRUE
## 9 Melicytus ramiflorus Melicytus ramiflorus TRUE
## 10 Myrsine salicina Myrsine salicina TRUE
moki_vegx = setPreferredTaxonNomenclature(moki_vegx, moki_lookup,
c(originalOrganismName = "NVSSpeciesName", preferredTaxonName = "PreferredSpeciesName"))
## Preferred taxon name was set on 149 organism identities.
## Preferred taxon name is now different than original organism name on 22 organism identities.
## 21 new organism name(s) added.
a = showElementTable(moki_vegx, "organismIdentity")
a[which(a$identityName!= a$originalOrganismName),]
## identityName originalOrganismName taxon
## 13 Fuscospora fusca Nothofagus fusca TRUE
## 15 Fuscospora truncata Nothofagus truncata TRUE
## 22 Podocarpus laetus Podocarpus hallii TRUE
## 28 Lophozonia menziesii Nothofagus menziesii TRUE
## 36 Carex horizontalis Uncinia rupestris TRUE
## 37 Veronica leiophylla Hebe leiophylla TRUE
## 41 Carex corynoidea Uncinia clavata TRUE
## 45 Carex uncinata Uncinia uncinata TRUE
## 55 Phlegmariurus varius Huperzia varia TRUE
## 61 Dendrobium cunninghamii Winika cunninghamii TRUE
## 66 Carex species Uncinia species TRUE
## 67 Notogrammitis billardierei Grammitis billardierei TRUE
## 68 Hymenophyllum nephrophyllum Trichomanes reniforme TRUE
## 73 Veronica salicifolia Hebe salicifolia TRUE
## 76 Melicytus species Hymenanthera species TRUE
## 87 Erigeron bilbaoanus Conyza bilbaoana TRUE
## 90 Veronica lyallii Parahebe lyallii TRUE
## 95 Leontodon saxatilis Leontodon taraxacoides TRUE
## 97 Lolium arundinaceum Schedonorus arundinaceus TRUE
## 113 Notogrammitis heterophylla Ctenopteris heterophylla TRUE
## 127 Austroderia richardii Cortaderia richardii TRUE
## 136 Coprosma dumosa Coprosma tayloriae TRUE
## preferredTaxonName
## 13 Fuscospora fusca
## 15 Fuscospora truncata
## 22 Podocarpus laetus
## 28 Lophozonia menziesii
## 36 Carex horizontalis
## 37 Veronica leiophylla
## 41 Carex corynoidea
## 45 Carex uncinata
## 55 Phlegmariurus varius
## 61 Dendrobium cunninghamii
## 66 Carex species
## 67 Notogrammitis billardierei
## 68 Hymenophyllum nephrophyllum
## 73 Veronica salicifolia
## 76 Melicytus species
## 87 Erigeron bilbaoanus
## 90 Veronica lyallii
## 95 Leontodon saxatilis
## 97 Lolium arundinaceum
## 113 Notogrammitis heterophylla
## 127 Austroderia richardii
## 136 Coprosma dumosa
mtfyffe_vegx = setPreferredTaxonNomenclature(mtfyffe_vegx, mtfyffe_lookup,
c(originalOrganismName = "NVSSpeciesName", preferredTaxonName = "PreferredSpeciesName"))
## Preferred taxon name was set on 59 organism identities.
## Preferred taxon name is now different than original organism name on 13 organism identities.
## 9 new organism name(s) added.
a = showElementTable(mtfyffe_vegx, "organismIdentity")
a[which(a$identityName!= a$originalOrganismName),]
## identityName originalOrganismName taxon
## 6 Coprosma dumosa Coprosma tayloriae TRUE
## 9 Coprosma grandifolia Coprosma australis TRUE
## 12 Podocarpus laetus Podocarpus hallii TRUE
## 15 Fuscospora fusca Nothofagus fusca TRUE
## 19 Prumnopitys taxifolia Podocarpus spicatus TRUE
## 30 Veronica leiophylla Hebe gracillima TRUE
## 37 Blechnum novae-zelandiae Blechnum capense TRUE
## 41 Carex uncinata Uncinia uncinata TRUE
## 42 Raukaua simplex Pseudopanax simplex TRUE
## 44 Microsorum pustulatum Phymatosorus diversifolius TRUE
## 48 Polystichum neozelandicum Polystichum richardii TRUE
## 49 Carex healyi Uncinia scabra TRUE
## 58 Veronica brachysiphon Hebe brachysiphon TRUE
## preferredTaxonName
## 6 Coprosma dumosa
## 9 Coprosma grandifolia
## 12 Podocarpus laetus
## 15 Fuscospora fusca
## 19 Prumnopitys taxifolia
## 30 Veronica leiophylla
## 37 Blechnum novae-zelandiae
## 41 Carex uncinata
## 42 Raukaua simplex
## 44 Microsorum pustulatum
## 48 Polystichum neozelandicum
## 49 Carex healyi
## 58 Veronica brachysiphon
taki_vegx = setPreferredTaxonNomenclature(taki_vegx, taki_lookup,
c(originalOrganismName = "NVSSpeciesName", preferredTaxonName = "PreferredSpeciesName"))
## Preferred taxon name was set on 38 organism identities.
## Preferred taxon name is now different than original organism name on 5 organism identities.
## 5 new organism name(s) added.
a = showElementTable(taki_vegx, "organismIdentity")
a[which(a$identityName!= a$originalOrganismName),]
## identityName originalOrganismName taxon
## 9 Veronica lycopodioides Hebe lycopodioides TRUE
## 14 Rytidosperma setifolium Notodanthonia setifolia TRUE
## 15 Zotovia colensoi Petriella colensoi TRUE
## 17 Veronica species Hebe species TRUE
## 32 Lobelia angulata Pratia angulata TRUE
## preferredTaxonName
## 9 Veronica lycopodioides
## 14 Rytidosperma setifolium
## 15 Zotovia colensoi
## 17 Veronica species
## 32 Lobelia angulata
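The idea behind the name replacement shown in the calls above can be sketched in base R as a lookup from original to preferred names. This is only an illustration of the mapping, not the implementation of setPreferredTaxonNomenclature(); the toy `lookup` table reuses three name pairs from the output above and the column names of the lookup tables.

```r
# Toy lookup table built from name pairs that appear in the output above.
lookup <- data.frame(
  NVSSpeciesName = c("Nothofagus fusca", "Hebe leiophylla",
                     "Weinmannia racemosa"),
  PreferredSpeciesName = c("Fuscospora fusca", "Veronica leiophylla",
                           "Weinmannia racemosa"),
  stringsAsFactors = FALSE
)

# Original organism names, in the order they might appear in a document.
originalNames <- c("Weinmannia racemosa", "Nothofagus fusca",
                   "Hebe leiophylla")

# Find the preferred name of each original name.
preferred <- lookup$PreferredSpeciesName[
  match(originalNames, lookup$NVSSpeciesName)
]

# Names whose preferred form differs from the original one.
changed <- originalNames[preferred != originalNames]

print(preferred)
print(changed)
```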
Function mergeVegX() is used to merge two Veg-X documents into a single one. This function puts all the input elements into the same containers and merges elements whenever they are considered to be the same. Each kind of element has its own way of determining when two instances refer to the same entity. For example, two plots are considered equal if they have the same plot name and, if defined, the same plotUniqueIdentifier. By default, however, plots and organism identities are not merged. This is a safety measure that prevents plots with the same name but from different sources from being identified as equal. A call to mergeVegX() to merge the moki_vegx and mtfyffe_vegx VegX objects produces the following output:
comb_vegx = mergeVegX(moki_vegx, mtfyffe_vegx)
## Final number of party(ies): 1. Data pooled for 0 party(ies).
## Final number of literature citations: 1. Data pooled for 0 citation(s).
## Final number of methods: 15. Data pooled for 5 method(s).
## Final number of attributes: 20.
## Final number of strata: 13. Data pooled for 0 strata.
## Final number of surface types: 5. Data pooled for 0 surface types.
## Final number of organism names: 209. Data pooled for 30 organism name(s).
## Final number of taxon concepts: 0. Data pooled for 0 organism name(s).
## Final number of organism identities: 209. Data pooled for 0 organism identitie(s).
## Final number of projects: 3. Data pooled for 0 project(s).
## Final number of plots: 190. Data pooled for 0 plot(s).
## Final number of individual organisms: 1368. Data pooled for 0 individual organism(s).
## Final number of plot observations: 351. Data pooled for 0 plot observation(s).
## Final number of stratum observations: 205. Data pooled for 0 stratum observation(s).
## Final number of individual organism observations: 1725. Data pooled for 0 individual organism observation(s).
## Final number of aggregate organism observations: 1115. Data pooled for 0 aggregate organism observation(s).
## Final number of site observations: 25. Data pooled for 0 site observation(s).
## Final number of community observations: 25. Data pooled for 0 community observation(s).
## Final number of surface cover observations: 40. Data pooled for 0 surface cover observation(s).
Plots were all kept separate because allowMergePlots = FALSE by default. In this case, however, all plots had different names and unique identifiers, so they would have been kept separate even if we had set allowMergePlots = TRUE. The objects to be merged shared some methods, so the function identified them as equal and avoided repetitions. The decisions to merge (i.e. pool) information of other elements can be interpreted similarly. A special case concerns organismName vs. organismIdentity. Note that some organism names were merged, but identities were not. While merging equal names is always safe, merging identities should be done with extreme caution, because two data sets may have employed the same taxon name but with different taxon concepts. Therefore, by default mergeVegX() does not merge organism identities. If we want to allow identities to be merged (when considered equal) we can set parameter allowMergeOrganismIdentities = TRUE:
# comb_vegx = mergeVegX(moki_vegx, mtfyffe_vegx, allowMergeOrganismIdentities = TRUE)
In the output of this second call, the number of merges in organismName is the same as in organismIdentity, because equal names have been forced to mean equal identity.
Note that function mergeVegX() can also be used to merge two documents that refer to the same data source, i.e. if one has imported different parts of the same source data into different Veg-X objects. In this case the user should specify allowMergePlots = TRUE.
Users of the VegX package should be aware that organism identities are by default kept separate when merging Veg-X objects. If the user chooses to merge identities, the decision to actually merge two given organism identities is complex, depending on both nomenclature and taxon concepts. Let ‘1’ and ‘2’ be two organism identities being compared. If original taxon concepts (i.e. element originalIdentificationConcept) are missing for both of them, the following table explains the decision according to nomenclature and, in case of merging, the nomenclature of the resulting identity (an asterisk indicates that a warning will be raised by the merging function):
Case | Orig.1 | Pref.1 | Orig.2 | Pref.2 | Merge | Orig.Res. | Pref.Res. |
---|---|---|---|---|---|---|---|
1 | X | - | Y | - | No | - | - |
2 | X | - | X | - | Yes | X | - |
3 | X | Y | X | - | No* | - | - |
4 | X | Y | Y | - | Yes | - | Y |
5 | X | Y | X | Z | No* | - | - |
6 | X | Y | V | Z | No | - | - |
7 | X | Y | Z | Y | Yes | - | Y |
8 | X | Y | X | Y | Yes | X | Y |
9 | - | X | - | Y | No | - | - |
10 | - | X | - | X | Yes | - | X |
11 | X | - | - | X | Yes | - | X |
12 | X | - | - | Y | No | - | - |
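The twelve cases above can be reproduced by a simple rule: represent each identity by its preferred name if it has one and by its original name otherwise, and merge when the two representative names are equal. The sketch below encodes this reading of the table in base R. The function nomenclatureDecision() is hypothetical and is not part of the VegX package; the rule was inferred from the table, it treats the two identities symmetrically (an assumption, since the table lists each combination only once), and it may differ from what mergeVegX() does internally.

```r
# Hypothetical helper, NOT a VegX function. Missing names are NA_character_.
nomenclatureDecision <- function(orig1, pref1, orig2, pref2) {
  # Representative name: the preferred name if set, else the original name.
  eff1 <- if (!is.na(pref1)) pref1 else orig1
  eff2 <- if (!is.na(pref2)) pref2 else orig2
  sameOrig <- !is.na(orig1) && !is.na(orig2) && orig1 == orig2
  doMerge <- !is.na(eff1) && !is.na(eff2) && eff1 == eff2
  list(
    merge = doMerge,
    # Warn when original names agree but the identities are kept apart.
    warning = !doMerge && sameOrig,
    origRes = if (doMerge && sameOrig) orig1 else NA_character_,
    prefRes = if (doMerge && !(is.na(pref1) && is.na(pref2))) eff1
              else NA_character_
  )
}

# The twelve cases of the table, with "-" encoded as NA.
cases <- data.frame(
  orig1 = c("X", "X", "X", "X", "X", "X", "X", "X", NA, NA, "X", "X"),
  pref1 = c(NA, NA, "Y", "Y", "Y", "Y", "Y", "Y", "X", "X", NA, NA),
  orig2 = c("Y", "X", "X", "Y", "X", "V", "Z", "X", NA, NA, NA, NA),
  pref2 = c(NA, NA, NA, NA, "Z", "Z", "Y", "Y", "Y", "X", "X", "Y"),
  stringsAsFactors = FALSE
)

decisions <- lapply(seq_len(nrow(cases)), function(i) {
  nomenclatureDecision(cases$orig1[i], cases$pref1[i],
                       cases$orig2[i], cases$pref2[i])
})
cases$merge <- vapply(decisions, function(d) d$merge, logical(1))
cases$warning <- vapply(decisions, function(d) d$warning, logical(1))
cases$origRes <- vapply(decisions, function(d) d$origRes, character(1))
cases$prefRes <- vapply(decisions, function(d) d$prefRes, character(1))

print(cases)
```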
In the more general case where original taxon concepts may have been specified for identity ‘1’, identity ‘2’ or both, the following table explains the decision to merge or not those organism identities, depending on the value of their taxon concepts and whether merging is possible according to nomenclature (asterisk indicates that a warning will be raised by the merging function):
Case | Nom. Merge? | Tax.Con.1 | Tax.Con.2 | Merge | Tax.Con. Res. |
---|---|---|---|---|---|
1 | No | - | - | No | - |
2 | No | X | - | No | - |
3 | No | X | Y | No | - |
4 | No | X | X | No* | - |
5 | Yes | - | - | Yes | - |
6 | Yes | X | - | Yes | - |
7 | Yes | X | Y | No* | - |
8 | Yes | X | X | Yes | X |
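The taxon concept table can be encoded in the same spirit. In this base-R sketch, conceptDecision() is a hypothetical function (not part of the VegX package) that takes the outcome of the nomenclature comparison and the two taxon concepts, with missing concepts encoded as NA. It assumes that a concept defined only for identity ‘2’ is treated like one defined only for identity ‘1’, which the table does not state explicitly.

```r
# Hypothetical helper, NOT a VegX function.
conceptDecision <- function(nomMerge, concept1, concept2) {
  bothDefined <- !is.na(concept1) && !is.na(concept2)
  sameConcept <- bothDefined && concept1 == concept2
  conflict <- bothDefined && concept1 != concept2
  # Merge only if nomenclature allows it and the concepts do not conflict.
  doMerge <- nomMerge && !conflict
  list(
    merge = doMerge,
    # Warn when nomenclature and taxon concepts point in opposite directions.
    warning = (nomMerge && conflict) || (!nomMerge && sameConcept),
    conceptRes = if (doMerge && sameConcept) concept1 else NA_character_
  )
}

# The eight cases of the table, with "-" encoded as NA.
conceptCases <- data.frame(
  nomMerge = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  concept1 = c(NA, "X", "X", "X", NA, "X", "X", "X"),
  concept2 = c(NA, NA, "Y", "X", NA, NA, "Y", "X"),
  stringsAsFactors = FALSE
)

conceptDecisions <- lapply(seq_len(nrow(conceptCases)), function(i) {
  conceptDecision(conceptCases$nomMerge[i], conceptCases$concept1[i],
                  conceptCases$concept2[i])
})
conceptCases$merge <- vapply(conceptDecisions, function(d) d$merge, logical(1))
conceptCases$warning <- vapply(conceptDecisions, function(d) d$warning,
                               logical(1))
conceptCases$conceptRes <- vapply(conceptDecisions, function(d) d$conceptRes,
                                  character(1))

print(conceptCases)
```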
The general workflow when working with the VegX package is: (1) create a new VegX document; (2) add plot data and plot observations; (3) perform nomenclature corrections; (4) merge documents. However, users may attempt to add observations to a VegX object that already contains identities, possibly with nomenclatural revisions and associated taxon concepts. When adding organism observations to VegX objects, the add function assumes that the name supplied is an originalOrganismName and applies the rules explained above to determine whether it refers to an existing identity; if not, a new organism identity is created. For example, even if the supplied name matches the original organism name of an existing identity, a new identity will be created if an original taxon concept has been asserted for the existing one, because there is no way to check that the two organisms involved indeed have the same identity. If the existing identity does not have an original taxon concept, the decision whether to create a new organism identity follows the rules of the nomenclature table above.
heightMethod2 = predefinedMeasurementMethod("Stratum height/cm")
trans_vegx = transformQuantitativeScale(comb_vegx, "Stratum height/m", heightMethod2,
function(x){return(x*100)}, replaceValues = TRUE)
## Target method: 'Stratum height/m'
## Number of quantitative attributes: 1
## Measurement method 'Stratum height/cm' added.
## 70 transformation(s) were applied on stratum observations.
head(showElementTable(trans_vegx, "stratumObservation"),3)
## plotName obsStartDate stratumName lowerLimit_method lowerLimit_value
## 1 LGM38h 2011-02-20 Tier 1 Stratum height/cm 2500
## 2 LGM38h 2011-02-20 Tier 2 Stratum height/cm 1200
## 3 LGM38h 2011-02-20 Tier 6 Stratum height/cm 0
## upperLimit_method upperLimit_value str_1_method str_1_value
## 1 Stratum height/cm 5000 Recce cover scale 3
## 2 Stratum height/cm 2500 Recce cover scale 3
## 3 Stratum height/cm 30 Recce cover scale 3
percentScale = predefinedMeasurementMethod("Plant cover/%")
trans_vegx = transformOrdinalScale(comb_vegx, "Recce cover scale", percentScale)
## Target method: 'Recce cover scale'
## Number of attributes: 7
## Number of quantifiable attributes with midpoints: 6
## Limits of the new attribute: [0, 100]
## Measurement method 'Plant cover/%' added.
## 531 transformation(s) were applied on aggregate organism observations.
## 28 transformation(s) were applied on stratum observations.
head(showElementTable(trans_vegx, "stratumObservation"),3)
## plotName obsStartDate stratumName lowerLimit_method lowerLimit_value
## 1 LGM38h 2011-02-20 Tier 1 Stratum height/m 25
## 2 LGM38h 2011-02-20 Tier 2 Stratum height/m 12
## 3 LGM38h 2011-02-20 Tier 6 Stratum height/m 0
## upperLimit_method upperLimit_value str_1_method str_1_value
## 1 Stratum height/m 50.0 Recce cover scale 3
## 2 Stratum height/m 25.0 Recce cover scale 3
## 3 Stratum height/m 0.3 Recce cover scale 3
## str_2_method str_2_value
## 1 Plant cover/% 15
## 2 Plant cover/% 15
## 3 Plant cover/% 15
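The output above shows cover class 3 being translated to 15%, which is consistent with replacing each ordinal class by the midpoint of its cover interval. The base-R sketch below illustrates that idea only. The class limits in `coverClasses` are illustrative assumptions and are not read from the package (only the value 15 for class 3 is confirmed by the output above), and ordinalToPercent() is a hypothetical helper, not a VegX function.

```r
# Assumed class limits in percent cover (illustrative, not from the package).
coverClasses <- data.frame(
  code = as.character(1:6),
  lower = c(0, 1, 5, 25, 50, 75),
  upper = c(1, 5, 25, 50, 75, 100),
  stringsAsFactors = FALSE
)

# Midpoint of each class interval.
coverClasses$midpoint <- (coverClasses$lower + coverClasses$upper) / 2

# Hypothetical helper: translate ordinal codes into midpoint percentages.
# Codes without a defined interval are returned as NA.
ordinalToPercent <- function(codes) {
  coverClasses$midpoint[match(codes, coverClasses$code)]
}

percent <- ordinalToPercent(c("3", "1", "6", "P"))
print(percent)
```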
head(showElementTable(, "organismIdentity"))
The Veg-X exchange standard is currently implemented as an XML schema (although other physical implementations of the standard are possible). The VegX package provides functions writeVegX() and readVegX(), which are used to write and read XML files with Veg-X documents, respectively. An advantage of XML is that it is text that can be read and understood by humans; its disadvantage is that files tend to be very large because of the redundancy of text. One way to overcome this is to compress XML files (into zip or tar.gz files) for more efficient storage. However, compression does not avoid the problem that writing and reading XML files can be slow for large data sets.
An alternative to XML is to save and read Veg-X documents directly as R objects, using functions saveRDS() and readRDS(). This option is fast and produces much smaller files. The only drawback of saving R objects arises if the S4 definition of Veg-X documents changes in future versions of the package. We tried to avoid this potential problem by defining S4 Veg-X objects as lists of the main elements, without defining the internal structure of each main element. If the version of the standard changes, functions to convert R objects from old to new versions of the standard should be made available to preserve backwards compatibility, in the same way that function readVegX() should be modified to allow reading XML documents that follow old versions of the standard.
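A round trip through saveRDS() and readRDS() can be sketched as follows. The object `doc` is a plain list used as a hypothetical stand-in for a VegX object, so that the example runs without the package; the same two calls apply to any R object.

```r
# Hypothetical stand-in for a Veg-X document (a real one would be created
# with the VegX package).
doc <- list(
  plots = list(list(plotName = "P1"), list(plotName = "P2")),
  methods = list(list(name = "Surface cover/%"))
)

# Write the object to a temporary file and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(doc, path)
restored <- readRDS(path)

# The restored object is identical to the original one.
roundTripOk <- identical(doc, restored)
print(roundTripOk)

# Remove the temporary file.
unlink(path)
```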