About this tutorial

In this vignette you will learn how to use the VegX R package to map, integrate and harmonize vegetation data using the Veg-X standard (v. 2.0). For the examples, we use the data sets provided in the package. If you do not know what the Veg-X standard is, please refer to the vignette The Veg-X exchange standard. Throughout this tutorial we refer to elements of the Veg-X standard; that vignette explains the definition of these elements and their logical relationships.

Package users and their main interests

We envisage two different kinds of users of the VegX package:

  • Vegetation data providers: These are people who are in possession of plot vegetation data (i.e. plot database managers or people having their own spreadsheets) and get requests to share their data with third parties. Data providers benefit from the VegX package in that they map their source data into Veg-X only once, and can then send the Veg-X file to any party requesting access to the data. The functions most interesting to data providers are:
    • Create new VegX documents.
    • Add information about plot location, plot shape and site characteristics.
    • Add vegetation observations of different kinds: individual plants, aggregated cover values, strata observations…
    • Add site observations (e.g. abiotic measurements).
    • Provide information about the taxon concepts related to organism identifications.
    • Provide information about the research project and contact information.
    • Write Veg-X XML files or other supported export formats.
  • Vegetation data integrators: These are people interested in gathering data from different sources, for the sake of compiling a new vegetation plot database or conducting analyses based on integrated data sets. The functions most interesting to data integrators are:
    • Load Veg-X XML files.
    • Merge several Veg-X objects from different sources.
    • Assess the degree of compatibility of the different source data.
    • Harmonize measurement units and taxonomic nomenclature across data sets.
    • Generate new observations at a given level of vegetation resolution by aggregating data at a lower level (e.g. from individual tree diameters to basal area estimates).
    • Export data into formats suitable for analysis.

Installing the package and loading source data

The VegX package is currently distributed from GitHub. To install it, you should have package devtools installed and use the following command: devtools::install_github("iavs-org/VegX", build_vignettes=TRUE). Assuming that the package is already installed you begin by loading it, which results in the required package XML also being loaded:

library(VegX)

Users of the VegX package are expected to know how to import their source data into R, either using a database connection or by reading files in diverse formats (e.g. txt, csv, xlsx, …). For the examples of this manual, we will use three data sets that were extracted from the New Zealand National Vegetation Survey (NVS) Databank. These are subsets of the original datasets prepared for demonstration purposes only.

  • Mokihinui forest: Forest and riparian data from 5 plots from the west coast of South Island (New Zealand). The data comprises i) site information from each plot; ii) cover scores of taxa within height strata; and iii) dbh measurement of individual trees.
  • Mt Fyffe forest: Forest data from five permanent plots from the east coast of South Island (New Zealand). The data comprises i) site information from each plot; ii) cover scores of taxa within height strata; iii) dbh measurement of individual trees within subplots that fully partition the permanent plot; iv) counts of saplings within subplots that fully partition the permanent plot; and v) counts of seedlings within height strata within subplots that sample the permanent plot. Two full measurements of the permanent plots are provided.
  • Takitimu grassland: Alpine grassland data from 5 plots from the southern South Island (New Zealand). The data comprises i) site information from each plot; ii) frequency values for plant taxa based on observations within subplots along transects; iii) groundcover; iv) disturbance.

Each of the three data sets contains different tables, corresponding to plot location, site observations, taxon observations, … For simplicity, we reduced the number of plots in each example data set to five, although some data sets contain subplots. As the three data sets are included with the package, we load the data from the three data sets into the R workspace using:

data(mokihinui)
data(mtfyffe)
data(takitimu)
ls()
##  [1] "moki_dia"            "moki_loc"            "moki_lookup"        
##  [4] "moki_site"           "moki_str"            "moki_tcv"           
##  [7] "mtfyffe_counts"      "mtfyffe_dia"         "mtfyffe_disturbance"
## [10] "mtfyffe_groundcover" "mtfyffe_loc"         "mtfyffe_lookup"     
## [13] "mtfyffe_site"        "taki_disturbance"    "taki_freq"          
## [16] "taki_groundcover"    "taki_loc"            "taki_lookup"        
## [19] "taki_site"
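
For readers whose source data live in files rather than in the package, the import step mentioned earlier in this section might look like the following minimal base-R sketch. The file, its column names and its values are made up for illustration, and the choice to read empty fields as NA is an assumption of this example rather than a requirement stated by the package.

```r
# Write a tiny, made-up table to a temporary CSV file so that the
# example is self-contained (in practice this would be your own file)
tmp <- tempfile(fileext = ".csv")
writeLines(c("Plot,Subplot,PlotObsStartDate",
             "P1,,2011-02-17",
             "P1,1Q,2011-02-17"), tmp)

# Read it back, keeping strings as character and treating empty fields as NA
site <- read.csv(tmp, stringsAsFactors = FALSE, na.strings = "")

# Convert the date column from character to Date
site$PlotObsStartDate <- as.Date(site$PlotObsStartDate)

str(site)
unlink(tmp)
```

The resulting data frame has one row per plot observation, which is the layout of the example tables loaded above.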

Creating a new Veg-X document

Before mapping any data to the Veg-X standard, we need to create a new (empty) document for each data set, using newVegX():

moki_vegx = newVegX()
mtfyffe_vegx = newVegX()
taki_vegx = newVegX()

The output of the print() command reveals that a Veg-X document is defined in R using an S4 class, each slot being a list that holds the instances of one of the main elements of the Veg-X document:

print(moki_vegx)
## An object of class "VegX"
## Slot "VegXVersion":
## [1] "2.0.0"
## 
## Slot "parties":
## list()
## 
## Slot "literatureCitations":
## list()
## 
## Slot "methods":
## list()
## 
## Slot "attributes":
## list()
## 
## Slot "strata":
## list()
## 
## Slot "surfaceTypes":
## list()
## 
## Slot "organismNames":
## list()
## 
## Slot "taxonConcepts":
## list()
## 
## Slot "organismIdentities":
## list()
## 
## Slot "projects":
## list()
## 
## Slot "plots":
## list()
## 
## Slot "individualOrganisms":
## list()
## 
## Slot "plotObservations":
## list()
## 
## Slot "individualObservations":
## list()
## 
## Slot "aggregateObservations":
## list()
## 
## Slot "stratumObservations":
## list()
## 
## Slot "communityObservations":
## list()
## 
## Slot "siteObservations":
## list()
## 
## Slot "surfaceCoverObservations":
## list()
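
For readers unfamiliar with S4 classes, the following sketch shows how an object made of list slots behaves in R. ToyDocument and its slots are hypothetical names invented for this illustration; it is not the VegX class, only a small stand-in with a similar layout.

```r
library(methods)

# Hypothetical toy class: a version string plus two list slots,
# loosely imitating the layout printed above (not the VegX class itself)
setClass("ToyDocument",
         representation(version = "character",
                        plots = "list",
                        plotObservations = "list"))

doc <- new("ToyDocument", version = "2.0.0",
           plots = list(), plotObservations = list())

# Slots are read and modified with the @ operator
doc@plots[["1"]] <- list(plotName = "LGM08r")

slotNames(doc)       # names of all slots of the object
length(doc@plots)    # number of elements currently stored in one slot
```

In everyday use there is rarely a need to touch slots directly; the functions introduced in the following sections fill them for us.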

Printing a Veg-X object will normally result in too much data being shown in the console output. More user-friendly information about the Veg-X object can be obtained using the function summary(), which tells us how many instances we have of each of the main elements:

summary(moki_vegx)
## ================================================================
##                     Veg-X object (ver 2.0.0)                   
## ----------------------------------------------------------------
## 
##    Projects: 0
## 
##    Plots: 0  [Parent plots: 0 Sub-plots: 0]
## 
##    Individual organisms: 0
## 
##    Organism names: 0
## 
##    Taxon concepts: 0
## 
##    Organism Identities: 0
## 
##    Vegetation strata: 0
## 
##    Surface types: 0
## 
##    Parties: 0
## 
##    Literature citations: 0
## 
##    Methods: 0
## 
##    Plot observations: 0  [in parent plots: 0 in sub-plots: 0]
## 
##    Individual organism observations: 0
## 
##    Aggregated organism observations: 0
## 
##    Stratum observations: 0
## 
##    Community observations: 0
## 
##    Site observations: 0
## 
##    Surface cover observations: 0
## 
## ================================================================

Of course, moki_vegx is now empty (as are the other two VegX objects).

In the following sections we will progressively add content to the VegX objects. When using the VegX package, the order in which we introduce data to VegX documents is not particularly important, as elements are created as needed. Nevertheless, we will introduce the different functions that add data following a logical sequence. Thus, we begin by introducing plot and survey information, followed by observations of individual organisms, taxa, strata, etc. Later sections of the manual deal with functions that facilitate data integration and harmonization.

Adding plot and survey data to Veg-X documents

Function                    Description
addPlotObservations()       Adds plot observation records to a VegX object from a data table where rows are plot observations.
addPlotLocations()          Adds/replaces static plot location information (spatial coordinates, elevation, place names, …) to plot elements of a VegX object.
addPlotGeometries()         Adds/replaces static plot geometry information (plot shape, dimensions, …) to plot elements of a VegX object.
addSiteCharacteristics()    Adds/replaces static site characteristics (topography, geology, …) to plot elements of a VegX object.
fillProjectInformation()    Fills the information for a given research project.

Project, plot and observation dates

In this subsection we show how to introduce information about plot names and survey dates. We start with the Mokihinui forest data set, by inspecting the data in the data frame moki_site:

head(moki_site, 3)
##    PlotID PlotObsID   Plot Subplot                                     Project
## 38 789033   1161630 LGM08r         MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 39 789034   1161631 LGM08r      1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 40 789035   1161632 LGM08r      2Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
##    PlotObsCurrentName PlotLocationDescription PlotObsStartDate PlotObsStopDate
## 38             LGM08r  Mokihinui, Lower gorge       2011-02-17      2011-02-17
## 39                 1Q                               2011-02-17      2011-02-17
## 40                 2Q                               2011-02-17      2011-02-17
##    MeanTopHeight MeanTopHeightUnits PlotPermanence PlotObsCanopyPercentage
## 38             3                  m           True                      20
## 39            NA                                                        NA
## 40            NA                                                        NA
##    PlotArea AreaUnits  Shape Altitude AltitudeUnits Drainage
## 38      400         m Square       95             m     Good
## 39      100         m Square       NA                       
## 40      100         m Square       NA                       
##      DrainageTechniqueName AltitudeDatum PlotSlope SlopeUnits PlotTreatment
## 38 Standard Soil Drainage          a.s.l        40    degrees        Normal
## 39                                              NA                         
## 40                                              NA                         
##    PlotAspect AspectDirection PlotObsIsRelocated Physiography
## 38        360              NA                 NA         Face
## 39         NA              NA                 NA             
## 40         NA              NA                 NA             
##    PhysiographyTechniqueName ParentMaterial ParentMaterialTechniqueName
## 38     Standard Physiography             NA                          NA
## 39                                       NA                          NA
## 40                                       NA                          NA
##    PlotRadius RadiusUnits PlotRectangleLength01 PlotRectangleLength02
## 38         NA          NA                    20                    20
## 39         NA          NA                    10                    10
## 40         NA          NA                    10                    10
##    RectangleUnits                    Placement ParentPlotID ParentPlotObsID
## 38              m Objective: Stratified random       789033         1161630
## 39              m                                    789033         1161630
## 40              m                                    789033         1161630
##    ProjectID pH
## 38      2444  7
## 39      2444  7
## 40      2444  7

The data frame has many columns, encompassing plot shape, site characteristics, experimental treatments, etc. The most important columns to parse in the beginning are Plot, Subplot and PlotObsStartDate, because these specify the space and time context of the vegetation observations. Other columns specify identifiers (IDs), but these are specific to the source database. As Veg-X documents have their own internal IDs, it is not necessary to import the source identifiers.

To import data into Veg-X documents, we almost always need a mapping between the names of elements in the Veg-X standard and the names of columns in the data table used as input. For example, in the following code we define that column "Project" in the source data table contains the information about the projectTitle in Veg-X, column "Plot" contains the information about the plotName element, and so on:

mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
               obsStartDate = "PlotObsStartDate", obsEndDate = "PlotObsStopDate")
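
Because a mapping is just a named list of column names, it can be checked against the source table before importing. The sketch below uses a small stand-in data frame with values copied from the rows of moki_site shown above; missingMappedColumns() is a hypothetical helper written for this example and is not part of the VegX package.

```r
# Toy stand-in for a source table (values copied from the first rows of moki_site)
src <- data.frame(Project = "MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011",
                  Plot = "LGM08r",
                  Subplot = c("", "1Q", "2Q"),
                  PlotObsStartDate = as.Date("2011-02-17"),
                  PlotObsStopDate = as.Date("2011-02-17"),
                  stringsAsFactors = FALSE)

mapping <- list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
                obsStartDate = "PlotObsStartDate", obsEndDate = "PlotObsStopDate")

# Hypothetical helper (not part of VegX): returns the mapping entries whose
# source column is absent from the data frame, named by Veg-X element
missingMappedColumns <- function(mapping, x) {
  cols <- unlist(mapping)
  cols[!(cols %in% names(x))]
}

missingMappedColumns(mapping, src)   # empty: every mapped column exists

# A misspelled column name is reported together with the element it maps to
bad <- mapping
bad$obsEndDate <- "PlotObsEndDate"
missingMappedColumns(bad, src)
```

A check of this kind is optional, but it catches misspelled column names early, before any data is added to the document.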

Once the mapping is defined, we can import the data using addPlotObservations():

moki_vegx = addPlotObservations(moki_vegx, moki_site, mapping = mapping)
##  1 project(s) parsed, 1 new project(s) added.
##  25 plot(s) parsed, 25 new plot(s) added.
##  25 plot observation(s) parsed, 25 new plot observation(s) added.

The console output of the add function informs us of the steps that took place and the modifications of our Veg-X R object (note that we could store the result in a different object instead of replacing moki_vegx). 25 plots were identified, all belonging to the same research project, and one plot observation was read for each plot. If we again call the summary function we will see a change in the number of data elements:

summary(moki_vegx)
## ================================================================
##                     Veg-X object (ver 2.0.0)                   
## ----------------------------------------------------------------
## 
##    Projects: 1
##       1. MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 
##    Plots: 25  [Parent plots: 5 Sub-plots: 20]
## 
##    Individual organisms: 0
## 
##    Organism names: 0
## 
##    Taxon concepts: 0
## 
##    Organism Identities: 0
## 
##    Vegetation strata: 0
## 
##    Surface types: 0
## 
##    Parties: 0
## 
##    Literature citations: 0
## 
##    Methods: 0
## 
##    Plot observations: 25  [in parent plots: 5 in sub-plots: 20]
## 
##    Individual organism observations: 0
## 
##    Aggregated organism observations: 0
## 
##    Stratum observations: 0
## 
##    Community observations: 0
## 
##    Site observations: 0
## 
##    Surface cover observations: 0
## 
## ================================================================

Note that among the 25 plots there are 20 sub-plots (i.e. 4 quadrants for each parent plot). If we want to inspect, at any time, the content of a Veg-X object in more detail, we can use the function showElementTable(), indicating which of the main Veg-X elements we want to inspect:

head(showElementTable(moki_vegx, "plotObservation"),6)
##    plotName obsStartDate obsEndDate                                projectTitle
## 1    LGM08r   2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 2 LGM08r_1Q   2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 3 LGM08r_2Q   2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 4 LGM08r_3Q   2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 5 LGM08r_4Q   2011-02-17 2011-02-17 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 6    LGM16l   2011-02-15 2011-02-15 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011

When sub-plots are added to a VegX object, the package automatically names them by concatenating the name of the plot with the name of the subplot, with an underscore ’_’ to separate both strings.
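
The naming rule can be reproduced in base R as follows. This is only a sketch that mimics the resulting names, not the package's internal code; the input values are the Plot and Subplot entries of the first three rows of moki_site.

```r
plot    <- c("LGM08r", "LGM08r", "LGM08r")
subplot <- c("", "1Q", "2Q")   # an empty string marks the parent plot row

# Join plot and sub-plot names with "_" only where a sub-plot name is present
hasSub   <- !is.na(subplot) & nzchar(subplot)
fullName <- ifelse(hasSub, paste(plot, subplot, sep = "_"), plot)
fullName
```

The three resulting names, LGM08r, LGM08r_1Q and LGM08r_2Q, match the first entries of the plotName column shown above.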

Let’s now read plots and plot observations for the Mt Fyffe forest data set. Since it comes from the same vegetation database (NVS), the data tables have similar column names and we will not show them again. In this case, however, there is no information about the sampling end date, only the start date. We modify our mapping accordingly and we call addPlotObservations():

mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot", 
               obsStartDate = "PlotObsStartDate")
mtfyffe_vegx = addPlotObservations(mtfyffe_vegx, mtfyffe_site, mapping)
##  2 project(s) parsed, 2 new project(s) added.
##  165 plot(s) parsed, 165 new plot(s) added.
##  326 plot observation(s) parsed, 326 new plot observation(s) added.
summary(mtfyffe_vegx)
## ================================================================
##                     Veg-X object (ver 2.0.0)                   
## ----------------------------------------------------------------
## 
##    Projects: 2
##       1. FYFFE, MOUNT FOREST 1980
##       2. FYFFE, MOUNT FOREST 2007-2008
## 
##    Plots: 165  [Parent plots: 4 Sub-plots: 161]
## 
##    Individual organisms: 0
## 
##    Organism names: 0
## 
##    Taxon concepts: 0
## 
##    Organism Identities: 0
## 
##    Vegetation strata: 0
## 
##    Surface types: 0
## 
##    Parties: 0
## 
##    Literature citations: 0
## 
##    Methods: 0
## 
##    Plot observations: 326  [in parent plots: 8 in sub-plots: 318]
## 
##    Individual organism observations: 0
## 
##    Aggregated organism observations: 0
## 
##    Stratum observations: 0
## 
##    Community observations: 0
## 
##    Site observations: 0
## 
##    Surface cover observations: 0
## 
## ================================================================

In this source data set there are many more sub-plots for each parent plot, and each plot was visited twice (in 1980 and the austral summer of 2007-2008). Moreover, each survey corresponds to a different project. Veg-X does not require projects to be equated to surveys, but this data set is structured this way. We now turn our attention to the Takitimu grassland data set.

mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot", 
               obsStartDate = "PlotObsStartDate")
taki_vegx = addPlotObservations(taki_vegx, taki_site, mapping)
##  1 project(s) parsed, 1 new project(s) added.
##  5 plot(s) parsed, 5 new plot(s) added.
##  5 plot observation(s) parsed, 5 new plot observation(s) added.
summary(taki_vegx)
## ================================================================
##                     Veg-X object (ver 2.0.0)                   
## ----------------------------------------------------------------
## 
##    Projects: 1
##       1. TAKITIMU GRASSLAND 1968-1969
## 
##    Plots: 5  [Parent plots: 5 Sub-plots: 0]
## 
##    Individual organisms: 0
## 
##    Organism names: 0
## 
##    Taxon concepts: 0
## 
##    Organism Identities: 0
## 
##    Vegetation strata: 0
## 
##    Surface types: 0
## 
##    Parties: 0
## 
##    Literature citations: 0
## 
##    Methods: 0
## 
##    Plot observations: 5  [in parent plots: 5 in sub-plots: 0]
## 
##    Individual organism observations: 0
## 
##    Aggregated organism observations: 0
## 
##    Stratum observations: 0
## 
##    Community observations: 0
## 
##    Site observations: 0
## 
##    Surface cover observations: 0
## 
## ================================================================

According to this summary, this third data set contains again 5 plots, but with no sub-plots, so even though we specified a mapping for sub-plots, there were no sub-plots in the source data table to populate the VegX object.

Project information

In the previous subsection, we specified a mapping for research project titles, and this led to the creation of project elements in Veg-X documents. However, we did not introduce any data describing the project:

showElementTable(moki_vegx, "project")
##                            title
## 1 MOKIHINUI HYDRO PROPOSAL - ...

The VegX package provides the function fillProjectInformation() to fill project data. It can be used to fill the data for an existing project (identified by its title) or to define a new project. In this case the data is passed directly as text to the parameters of the function, instead of being read from a data frame. As an example, we provide the information for the project that led to the collection of data in the Mokihinui forest:

moki_vegx = fillProjectInformation(moki_vegx, "MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011",
              personnel = c(contributor = "Susan K. Wiser"),
              abstract = paste("Characterise the forest and riparian vegetation",
                               "in the lower Mokihinui gorge,",
                               "and compare this with the vegetation",
                               "in (a) North Branch gorge of Mokihinui",
                               "and (b) Karamea catchment."),
              studyAreaDescription = paste("Mokihinui and Karamea catchments.",
                                           "Forest riparian habitat."))
##  1 new party(ies) added to the document as individuals.
showElementTable(moki_vegx, "project")
##                            title                       abstract
## 1 MOKIHINUI HYDRO PROPOSAL - ... Characterise the forest and...
##             studyAreaDescription
## 1 Mokihinui and Karamea catch...

Note that filling the information about the project led to the definition of personnel involved in the project. In the Veg-X standard any individual/organization/position involved in the creation of a data set is stored in a party element. We may fill contact information for party elements using the function fillPartyInformation().

Plot coordinates

The next piece of information we will introduce is the geographic location of plots (sampling dates were already mapped with addPlotObservations()). We thus take a look at the moki_loc data frame:

head(moki_loc, 3)
##    AbsoluteCoordID   Plot Subplot                                     Project
## 7           201568 LGM45h      NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 11          201531 LGM08r      NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 14          201561 LGM38h      NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
##                     Type AbsoluteCoordXEastLong AbsoluteCoordYNorthLat    Datum
## 7  Grid Coordinate (map)                2429288                5960763 NZGD1949
## 11 Grid Coordinate (map)                2438288                5962438 NZGD1949
## 14 Grid Coordinate (map)                2431238                5962363 NZGD1949
##    MapProjection MapSeries MapSheet                 GPS Method
## 7           NZMG  NZMS 260     L28  Garmin GPSMap 60CSX    GPS
## 11          NZMG  NZMS 260     L28  Garmin GPSMap 60CSX    GPS
## 14          NZMG  NZMS 260     L28  Garmin GPSMap 60CSX    GPS
##                 Source EastingMG NorthingMG Longitude Latitude PrecisionMetres
## 7  Original Coordinate   2429288    5960763  172.0326 -41.5559               5
## 11 Original Coordinate   2438288    5962438  172.1407 -41.5417               5
## 14 Original Coordinate   2431238    5962363  172.0562 -41.5417               5
##    PrecisionShape ParentPlotID ParentPlotObsID PlotCoordinateID PlotObsID
## 7          Circle       789218         1161815           287136   1161815
## 11         Circle       789033         1161630           287099   1161630
## 14         Circle       789183         1161780           287129   1161780
##    ProjectID
## 7       2444
## 11      2444
## 14      2444

Locations are expressed using different coordinate systems, but the easiest and most common way of exchanging geographic information is by using latitude and longitude. Hence, we define a new mapping and use the function addPlotLocations():

mapping = list(plotName = "Plot", x = "Longitude", y = "Latitude")
moki_vegx = addPlotLocations(moki_vegx, moki_loc, mapping, 
                             proj4string = "+proj=longlat +datum=WGS84")
##  5 plot(s) parsed, 0 new plot(s) added.
##  5 record(s) parsed.

When defining the mapping, x and y are used to map coordinates. We should also include plotName, because otherwise the function does not know how to match coordinates with the plots already defined in moki_vegx (subPlotName should be included if coordinates are available for subplots). Parameter proj4string is used to supply the spatial reference system of the coordinates. The console output indicates that no new plots have been added (they were previously defined), but they would have been if we had started populating an empty Veg-X object using addPlotLocations(). We can inspect the data recently entered using the following command:

head(showElementTable(moki_vegx, "plot"),3)
##    plotName   coordX   coordY           spatialReference relatedPlotName
## 1    LGM08r 172.1407 -41.5417 +proj=longlat +datum=WGS84            <NA>
## 2 LGM08r_1Q       NA       NA                       <NA>          LGM08r
## 3 LGM08r_2Q       NA       NA                       <NA>          LGM08r
##   plotRelationship
## 1             <NA>
## 2          subplot
## 3          subplot

When calling showElementTable() for plot elements we are showing the plot/sub-plot relationships. Note that sub-plots have no explicit coordinates associated to them (they are not given in moki_loc). It is up to the user to provide them in the source data. Using the same mapping we can parse the coordinates of the Mt Fyffe forest data set:

mtfyffe_vegx = addPlotLocations(mtfyffe_vegx, mtfyffe_loc, mapping)
##  4 plot(s) parsed, 0 new plot(s) added.
##  8 record(s) parsed.

In this example, 8 records were parsed, but coordinates are available for four plots only. Coordinate records are duplicated in mtfyffe_loc, because they are provided independently for each survey. The function addPlotLocations() will only keep the most recently read location record of each plot. Finally, we parse plot coordinates for the Takitimu grassland data set, and find that they are missing for three of the plots.

taki_vegx = addPlotLocations(taki_vegx, taki_loc, mapping)
##  5 plot(s) parsed, 0 new plot(s) added.
##  5 record(s) parsed.
##  3 record(s) with missing value(s) not added.
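
The two situations met above, duplicated location records and records with missing coordinates, can be anticipated by inspecting the source table beforehand. The following base-R sketch uses a made-up location table (plot names and coordinates are invented) and only mimics the behaviour described in the text; it does not reproduce the package's implementation.

```r
# Toy location table: plot names and values are made up for illustration
loc <- data.frame(Plot = c("A", "B", "A", "B", "C"),
                  Longitude = c(173.60, 173.61, 173.62, 173.63, NA),
                  Latitude  = c(-42.30, -42.31, -42.32, -42.33, NA))

# Records with a missing coordinate
incomplete <- !complete.cases(loc[, c("Longitude", "Latitude")])
sum(incomplete)

# Among complete records, keep the last one read for each plot
ok     <- loc[!incomplete, ]
latest <- ok[!duplicated(ok$Plot, fromLast = TRUE), ]
latest
```

Here one record is incomplete, and for plots A and B only the later of their two records is retained.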

The function addPlotLocations() accepts coordinates in any spatial reference system (which is specified using the parameter proj4string). Setting toWGS84 = TRUE will indicate to the function that it should attempt to translate the input coordinates into longitude and latitude, but this was not required in our examples.

Plot elevation

While x and y specify horizontal plot position, the vertical position of a plot is specified using elevation (normally above sea level). Since plot elevation is a measurement, it is important to specify a measurement method (i.e. instruments) and a measurement scale (i.e. measurement units), because this metadata reduces potential errors when pooling data from different sources. In the Veg-X standard, this information is specified by defining method and attribute elements, whereas the VegX package has an S4 class named VegXMethod that encapsulates both. Users can define their own methods, but the package provides the function predefinedMeasurementMethod() to easily define methods for the most common variables. For example, we can define the measurement method for elevation in meters above sea level using:

elevMethod = predefinedMeasurementMethod("Elevation/m")

Plot elevation is added to Veg-X documents using addPlotLocations() as before. However, in our Mokihinui data set elevation is included in the data frame moki_site (and not moki_loc), so we could not add it using the same call that we used for plot coordinates. Having defined our elevation method, we call addPlotLocations() again:

mapping = list(plotName = "Plot", elevation = "Altitude")
moki_vegx = addPlotLocations(moki_vegx, moki_site, mapping, 
                             methods = list(elevation = elevMethod))
##  Measurement method 'Elevation/m' added for 'elevation'.
##  5 plot(s) parsed, 0 new plot(s) added.
##  25 record(s) parsed.
##  20 record(s) with missing value(s) not added.

Only the parent plots have elevation data (i.e., the records of sub-plots are missing). If we inspect again the plot elements of our document we find that elevation data has been added to plot coordinates:

head(showElementTable(moki_vegx, "plot"),3)
##    plotName   coordX   coordY           spatialReference elevation_method
## 1    LGM08r 172.1407 -41.5417 +proj=longlat +datum=WGS84      Elevation/m
## 2 LGM08r_1Q       NA       NA                       <NA>             <NA>
## 3 LGM08r_2Q       NA       NA                       <NA>             <NA>
##   elevation_value relatedPlotName plotRelationship
## 1              95            <NA>             <NA>
## 2              NA          LGM08r          subplot
## 3              NA          LGM08r          subplot

Analogous calls to addPlotLocations() can be made to fill elevation data for the Mt Fyffe forest and the Takitimu grassland data sets:

mtfyffe_vegx = addPlotLocations(mtfyffe_vegx, mtfyffe_site, mapping,
                                methods = list(elevation = elevMethod))
##  Measurement method 'Elevation/m' added for 'elevation'.
##  4 plot(s) parsed, 0 new plot(s) added.
##  326 record(s) parsed.
##  318 record(s) with missing value(s) not added.
taki_vegx = addPlotLocations(taki_vegx, taki_site, mapping, 
                             methods = list(elevation = "Elevation/m"))
##  Measurement method 'Elevation/m' added for 'elevation'.
##  5 plot(s) parsed, 0 new plot(s) added.
##  5 record(s) parsed.

Note that in this case we specified the method for elevation using a string directly. This avoids having to call function predefinedMeasurementMethod().

Plot geometry

By plot geometry, we refer to plot area, shape and dimensions. Veg-X allows different plot shapes (circle, rectangle, line or polygon), and each plot shape implies different dimensions. Plot geometry is specified using the function addPlotGeometries() and, analogously to addPlotLocations(), the function will replace any previous information regarding geometry. We start by looking at the plot geometry fields in the Mokihinui forest data set table moki_site:

names(moki_site)
##  [1] "PlotID"                      "PlotObsID"                  
##  [3] "Plot"                        "Subplot"                    
##  [5] "Project"                     "PlotObsCurrentName"         
##  [7] "PlotLocationDescription"     "PlotObsStartDate"           
##  [9] "PlotObsStopDate"             "MeanTopHeight"              
## [11] "MeanTopHeightUnits"          "PlotPermanence"             
## [13] "PlotObsCanopyPercentage"     "PlotArea"                   
## [15] "AreaUnits"                   "Shape"                      
## [17] "Altitude"                    "AltitudeUnits"              
## [19] "Drainage"                    "DrainageTechniqueName"      
## [21] "AltitudeDatum"               "PlotSlope"                  
## [23] "SlopeUnits"                  "PlotTreatment"              
## [25] "PlotAspect"                  "AspectDirection"            
## [27] "PlotObsIsRelocated"          "Physiography"               
## [29] "PhysiographyTechniqueName"   "ParentMaterial"             
## [31] "ParentMaterialTechniqueName" "PlotRadius"                 
## [33] "RadiusUnits"                 "PlotRectangleLength01"      
## [35] "PlotRectangleLength02"       "RectangleUnits"             
## [37] "Placement"                   "ParentPlotID"               
## [39] "ParentPlotObsID"             "ProjectID"                  
## [41] "pH"
table(moki_site$Shape)
## 
## Square 
##     25

After realizing that plot/subplot shapes are rectangular and both length and width are available, we define the following mapping for rectangular (or square) plots:

mapping = list(plotName = "Plot", subPlotName = "Subplot",
               area = "PlotArea", shape = "Shape",
               length = "PlotRectangleLength01", width = "PlotRectangleLength02")
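
Before importing geometry, it can be worth checking that the mapped columns agree with each other. The sketch below verifies that area equals length times width for rectangular plots, using values copied from the first three rows of moki_site shown earlier. The check itself is a suggestion of this example and is not something the package is stated to perform.

```r
# Values copied from the first three rows of moki_site shown earlier
geom <- data.frame(Plot = "LGM08r",
                   Subplot = c("", "1Q", "2Q"),
                   PlotArea = c(400, 100, 100),
                   PlotRectangleLength01 = c(20, 10, 10),
                   PlotRectangleLength02 = c(20, 10, 10),
                   stringsAsFactors = FALSE)

# For rectangular plots the area should equal length x width
expected   <- geom$PlotRectangleLength01 * geom$PlotRectangleLength02
consistent <- isTRUE(all.equal(geom$PlotArea, expected))
consistent

# Rows that disagree (none in this toy table)
mismatch <- geom[abs(geom$PlotArea - expected) > 1e-8, c("Plot", "Subplot")]
nrow(mismatch)
```

The areas are consistent with side lengths given in meters, which suggests that the values in PlotArea are square meters even though the AreaUnits column of the source table reads m.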

Like elevation, plot area and plot dimensions are measurements, so we need to define measurement methods for them. We are now ready to import plot geometry using addPlotGeometries(), where we specify both the mapping and the list of methods corresponding to the Veg-X element names of the mapping (i.e. area, length and width for rectangular plots):

moki_vegx = addPlotGeometries(moki_vegx, moki_site, mapping,
              list(area = "Plot area/m2", width = "Plot dimension/m", length = "Plot dimension/m"))
##  Measurement method 'Plot area/m2' added for 'area'.
##  Measurement method 'Plot dimension/m' added for 'width'.
##  Measurement method 'Plot dimension/m' for 'length' already included.
##  25 plot(s) parsed, 0 new plot(s) added.
##  25 record(s) parsed.
head(showElementTable(moki_vegx, "plot"),3)
##    plotName  area_method area_value     shape    length_method length_value
## 1    LGM08r Plot area/m2        400 rectangle Plot dimension/m           20
## 2 LGM08r_1Q Plot area/m2        100 rectangle Plot dimension/m           10
## 3 LGM08r_2Q Plot area/m2        100 rectangle Plot dimension/m           10
##       width_method width_value   coordX   coordY           spatialReference
## 1 Plot dimension/m          20 172.1407 -41.5417 +proj=longlat +datum=WGS84
## 2 Plot dimension/m          10       NA       NA                       <NA>
## 3 Plot dimension/m          10       NA       NA                       <NA>
##   elevation_method elevation_value relatedPlotName plotRelationship
## 1      Elevation/m              95            <NA>             <NA>
## 2             <NA>              NA          LGM08r          subplot
## 3             <NA>              NA          LGM08r          subplot

As before, no new plots were added, because previous function calls had already defined them. The table returned by showElementTable() now shows the plot geometry alongside the plot locations and the plot/subplot relationships. Importing plot geometry for the Takitimu grassland data set is analogous:

taki_vegx = addPlotGeometries(taki_vegx, taki_site, mapping,
              list(area = "Plot area/m2", width = "Plot dimension/m", length = "Plot dimension/m"))
##  Measurement method 'Plot area/m2' added for 'area'.
##  Measurement method 'Plot dimension/m' added for 'width'.
##  Measurement method 'Plot dimension/m' for 'length' already included.
##  5 plot(s) parsed, 0 new plot(s) added.
##  5 record(s) parsed.
head(showElementTable(taki_vegx, "plot"))
##            plotName  area_method area_value     shape    length_method
## 1 WB2 2 (WB2 2_TRA) Plot area/m2        400 rectangle Plot dimension/m
## 2   R1 1 (R1 1_TRA) Plot area/m2        400 rectangle Plot dimension/m
## 3   T2 2 (T2 2_TRA) Plot area/m2        400 rectangle Plot dimension/m
## 4   T1 2 (T1 2_TRA) Plot area/m2        400 rectangle Plot dimension/m
## 5   R3 1 (R3 1_TRA) Plot area/m2        400 rectangle Plot dimension/m
##   length_value     width_method width_value   coordX   coordY elevation_method
## 1           20 Plot dimension/m          20 167.9438 -45.6321      Elevation/m
## 2           20 Plot dimension/m          20       NA       NA      Elevation/m
## 3           20 Plot dimension/m          20       NA       NA      Elevation/m
## 4           20 Plot dimension/m          20       NA       NA      Elevation/m
## 5           20 Plot dimension/m          20 167.8307 -45.7138      Elevation/m
##   elevation_value
## 1            1463
## 2            1097
## 3             914
## 4            1097
## 5            1067
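
In both calls above, the names of the methods list (area, width, length) match names used in the mapping. A quick way to check that correspondence before importing is sketched below in base R. The vector `measured`, which lists the mapped elements that are measurements, is an assumption made for this illustration and is not something provided by the package.

```r
# Mapping and methods as used for the rectangular plots
mapping <- list(plotName = "Plot", subPlotName = "Subplot",
                area = "PlotArea", shape = "Shape",
                length = "PlotRectangleLength01", width = "PlotRectangleLength02")
methods <- list(area = "Plot area/m2", width = "Plot dimension/m",
                length = "Plot dimension/m")

# Methods that refer to an element absent from the mapping (expected: none)
unmapped <- setdiff(names(methods), names(mapping))

# Mapped measurement elements without a method (expected: none).
# 'measured' is a hand-written assumption about which elements are measurements.
measured <- c("area", "length", "width")
missing_method <- setdiff(intersect(measured, names(mapping)), names(methods))

unmapped
missing_method
```

Both results are empty here, so every method has a mapped column and every mapped measurement has a method.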

In the Mt Fyffe forest data set, shape is missing for many plots. The remaining plots are circular, but the radius is not available in the mtfyffe_site data frame:

names(mtfyffe_site)
##  [1] "PlotID"                    "PlotObsID"                
##  [3] "Plot"                      "Subplot"                  
##  [5] "Project"                   "PlotObsCurrentName"       
##  [7] "PlotLocationDescription"   "PlotObsStartDate"         
##  [9] "PlotObsStopDate"           "MeanTopHeight"            
## [11] "MeanTopHeightUnits"        "PlotPermanence"           
## [13] "PlotArea"                  "AreaUnits"                
## [15] "Shape"                     "Altitude"                 
## [17] "AltitudeUnits"             "Drainage"                 
## [19] "DrainageTechniqueName"     "PlotSlope"                
## [21] "SlopeUnits"                "PlotTreatment"            
## [23] "PlotAspect"                "Physiography"             
## [25] "PhysiographyTechniqueName" "Placement"                
## [27] "ParentPlotID"              "ParentPlotObsID"          
## [29] "ProjectID"
table(mtfyffe_site$Shape)
## 
##        Circle 
##    135    191
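
The unlabelled category in the table above suggests that missing shapes are stored as empty strings rather than as NA, although this is an inference from the printed output. Under that assumption, a minimal base R sketch for making the missing values explicit on a toy vector looks like this:

```r
# Toy vector mimicking a Shape column with empty strings (values are illustrative)
shape <- c("", "Circle", "Circle", "", "Circle")

# Empty strings appear as a category with a blank label
table(shape)

# Recode blank entries as NA so that they are counted as missing
shape_clean <- shape
shape_clean[trimws(shape_clean) == ""] <- NA

# Tabulate again, now showing NA explicitly
table(shape_clean, useNA = "ifany")
n_missing <- sum(is.na(shape_clean))
```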

Hence, we define the following mapping, and a call to addPlotGeometries() produces the following result:

mapping = list(plotName = "Plot", subPlotName = "Subplot",
               area = "PlotArea", shape = "Shape")
mtfyffe_vegx = addPlotGeometries(mtfyffe_vegx, mtfyffe_site, mapping,
                                 list(area = "Plot area/m2"))
##  Measurement method 'Plot area/m2' added for 'area'.
##  165 plot(s) parsed, 0 new plot(s) added.
##  326 record(s) parsed.
##  1 record(s) with missing value(s) not added.
head(showElementTable(mtfyffe_vegx, "plot"), 3)
##   plotName  area_method area_value   coordX   coordY elevation_method
## 1      6 4 Plot area/m2        400 173.6370 -42.3276      Elevation/m
## 2     12 1 Plot area/m2        400 173.6479 -42.2728      Elevation/m
## 3     14 2 Plot area/m2        400 173.6532 -42.2563      Elevation/m
##   elevation_value relatedPlotName plotRelationship shape
## 1             480            <NA>             <NA>  <NA>
## 2             575            <NA>             <NA>  <NA>
## 3             879            <NA>             <NA>  <NA>

Other static site characteristics

To finish with static plot information, the next data to add to our Veg-X documents is plot topography. This can be done using function addSiteCharacteristics(), which also accepts other site attributes that are considered static at the time scales of vegetation dynamics (e.g. geological parent material). Our earlier inspection of data frame moki_site suggests that an appropriate mapping is:

sitemapping = list(plotName = "Plot", subPlotName = "Subplot",
                   slope = "PlotSlope", aspect = "PlotAspect")

Since slope and aspect are again measurements, we also need to provide methods for them. After checking the units in the source data we are ready to import the data:

moki_vegx = addSiteCharacteristics(moki_vegx, moki_site, mapping = sitemapping,
                measurementMethods = list(slope = "Slope/degrees", aspect = "Aspect/degrees"))
##  Measurement method 'Slope/degrees' added for 'slope'.
##  Measurement method 'Aspect/degrees' added for 'aspect'.
##  25 plot(s) parsed, 0 new plot(s) added.
##  25 record(s) parsed.
##  40 measurement(s) with missing value(s) not added.
head(showElementTable(moki_vegx, "plot"), 3)
##    plotName  area_method area_value     shape    length_method length_value
## 1    LGM08r Plot area/m2        400 rectangle Plot dimension/m           20
## 2 LGM08r_1Q Plot area/m2        100 rectangle Plot dimension/m           10
## 3 LGM08r_2Q Plot area/m2        100 rectangle Plot dimension/m           10
##       width_method width_value   coordX   coordY           spatialReference
## 1 Plot dimension/m          20 172.1407 -41.5417 +proj=longlat +datum=WGS84
## 2 Plot dimension/m          10       NA       NA                       <NA>
## 3 Plot dimension/m          10       NA       NA                       <NA>
##   elevation_method elevation_value  slope_method slope_value  aspect_method
## 1      Elevation/m              95 Slope/degrees          40 Aspect/degrees
## 2             <NA>              NA          <NA>          NA           <NA>
## 3             <NA>              NA          <NA>          NA           <NA>
##   aspect_value relatedPlotName plotRelationship
## 1          360            <NA>             <NA>
## 2           NA          LGM08r          subplot
## 3           NA          LGM08r          subplot

Again, no new plots are added, and missing values correspond to subplots. When calling showElementTable() the topography information is shown along with the plot information previously added. Since the site data frames for Mt Fyffe forest and Takitimu grassland data sets have the same structure as that of Mokihinui, adding topography information for the former data sets is rather straightforward:

mtfyffe_vegx = addSiteCharacteristics(mtfyffe_vegx, mtfyffe_site, mapping = sitemapping,
                measurementMethods = list(slope = "Slope/degrees", aspect = "Aspect/degrees"))
##  Measurement method 'Slope/degrees' added for 'slope'.
##  Measurement method 'Aspect/degrees' added for 'aspect'.
##  165 plot(s) parsed, 0 new plot(s) added.
##  326 record(s) parsed.
##  636 measurement(s) with missing value(s) not added.
taki_vegx = addSiteCharacteristics(taki_vegx, taki_site, mapping = sitemapping,
                measurementMethods = list(slope = "Slope/degrees", aspect = "Aspect/degrees"))
##  Measurement method 'Slope/degrees' added for 'slope'.
##  Measurement method 'Aspect/degrees' added for 'aspect'.
##  5 plot(s) parsed, 0 new plot(s) added.
##  5 record(s) parsed.
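
The number of skipped measurements reported in these messages can be cross-checked against the source data. The sketch below uses a toy data frame with the column names of the site tables (one parent plot and four subplots, with made-up values) and counts missing slope and aspect values in base R. For the Mokihinui data, the reported 40 skipped measurements would be consistent with 20 subplot records each lacking both slope and aspect.

```r
# Toy site table: topography recorded for the parent plot only (illustrative values)
site <- data.frame(
  Plot = rep("LGM08r", 5),
  Subplot = c(NA, "1Q", "2Q", "3Q", "4Q"),
  PlotSlope = c(40, NA, NA, NA, NA),
  PlotAspect = c(360, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Total number of missing slope/aspect measurements
n_missing <- sum(is.na(site[, c("PlotSlope", "PlotAspect")]))

# Number expected if every subplot lacks both measurements
is_subplot <- !is.na(site$Subplot)
n_expected <- 2 * sum(is_subplot)

c(missing = n_missing, expected = n_expected)
```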

Adding observation data

At the beginning of the previous section we specified plot observation dates for the plots of our examples, using function addPlotObservation(). While this function defines survey events for plots, it does not add any observation or measurement made during plot visits. In this section we show how to add such information, using the following functions:

| Function | Description |
|---|---|
| addIndividualOrganismObservations() | Adds individual organism observation records (e.g. tree diameters or heights) to a VegX object. |
| addAggregateOrganismObservations() | Adds aggregate organism observation records (e.g. % cover of a particular taxon) to a VegX object. |
| addStratumObservations() | Adds stratum observation records (e.g. % cover of plants in the tree layer) to a VegX object. |
| addCommunityObservations() | Adds community observation records (e.g. stand age or total basal area) to a VegX object. |
| addSiteObservations() | Adds site observation records (e.g. abiotic measurements such as pH) to a VegX object. |
| addSurfaceCoverObservations() | Adds surface cover observation records (e.g. percent of ground covered by bare soil or rocks) to a VegX object. |

Individual organism observations

First we focus on observations made on individual organisms (e.g. diameter values measured on individual trees). Since individual organisms can be labelled and re-measured in different plot surveys, Veg-X uses the element individualOrganism to keep track of the organism itself. Then, different elements individualOrganismObservations can be used to contain measurements made on the individual organism each time there was an observation of the plot (i.e. each time the plot was revisited). The individual organism (e.g. a particular tree) is uniquely identified using the plot name and an organism label (i.e. a tag on the specimen). Thus, the same label can be repeated in different plots without causing data integrity problems. Individual organisms and their observations are added to Veg-X using the function addIndividualOrganismObservations(). We first show how it works using the data frame moki_dia, which contains diameter measurements for trees in the Mokihinui forest data set:

head(moki_dia, 3)
##     DiameterID   Plot Subplot                                     Project
## 912    2965105 LGM38h      1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 920    2965112 LGM38h      1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 921    2965114 LGM38h      1Q MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
##     EntryNo ItemCurrentIdentifier Identifier AliveState    MethodName
## 912       1                    NA         NA      Alive Stem Diameter
## 920       8                    NA         NA      Alive Stem Diameter
## 921      10                    NA         NA      Alive Stem Diameter
##     Verbatim.Species TaxonID CurrentTaxonID NVSCode       NVSSpeciesName
## 912           WEIRAC    1747           1747  WEIRAC  Weinmannia racemosa
## 920           COPGRA    2396           2396  COPGRA Coprosma grandifolia
## 921           CYASMI    1146           1146  CYASMI      Cyathea smithii
##     Diameter DiameterValueUnits AssociationNo Association HeightOrLength
## 912     10.6                 cm            NA                         NA
## 920      4.0                 cm            NA                         NA
## 921     11.3                 cm            NA                         NA
##     LinearDimensionUnits ItemObsNote  ItemID ItemObsID ParentPlotID
## 912                   NA             2321109   3016862       789183
## 920                   NA             2321116   3016869       789183
## 921                   NA             2321118   3016871       789183
##     ParentPlotObsID PlotMethodID PlotObsID ProjectID SampleMethodID
## 912         1161780      1392927   1161781      2444          13406
## 920         1161780      1392927   1161781      2444          13406
## 921         1161780      1392927   1161781      2444          13406
##     PlotObsStartDate
## 912       2011-02-20
## 920       2011-02-20
## 921       2011-02-20
unique(moki_dia$Identifier)
## [1] NA

Note that there is a column called Identifier but no data in it. Fortunately, the data set includes a single survey, so that there is no need to provide labels for individual organisms. Hence, we can define our mapping as follows:

mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate",
               taxonName = "NVSSpeciesName", diameterMeasurement = "Diameter")

If no mapping is provided for individualOrganismLabel, function addIndividualOrganismObservations() will assume that each record corresponds to a different organism. To define the identity of organisms we can use mapping for either organismName or taxonName. The first option is used to specify names that are not taxa (e.g. “tree #1”, “tree #2”, or morphospecies), while the second option explicitly identifies names as taxa. The call to the function produces the following output:

moki_vegx = addIndividualOrganismObservations(moki_vegx, moki_dia, mapping = mapping,
                                      methods = list(diameterMeasurement = "DBH/cm"))
##  Measurement method 'DBH/cm' added for 'diameterMeasurement'.
##  23 plot(s) parsed, 0 new added.
##  18 plot observation(s) parsed, 0 new added.
##  28 organism names(s) parsed, 28 new added.
##  0 taxon concept(s) parsed, 0 new added.
##  28 organism identitie(s) parsed, 28 new added.
##  0 individual organism(s) parsed, 643 new added.
##  643 record(s) parsed, 643 new individual organism observation(s) added.

where we see that the number of individual organisms is equal to the number of observations. We can inspect the added individual organism observations using:

head(showElementTable(moki_vegx, "individualOrganismObservation"), 3)
##    plotName obsStartDate individualOrganismLabel organismIdentityName
## 1 LGM38h_1Q   2011-02-20                    ind1  Weinmannia racemosa
## 2 LGM38h_1Q   2011-02-20                    ind2 Coprosma grandifolia
## 3 LGM38h_1Q   2011-02-20                    ind3      Cyathea smithii
##   diameter_method diameter_value
## 1          DBH/cm           10.6
## 2          DBH/cm            4.0
## 3          DBH/cm           11.3
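
The plot names and the labels ind1, ind2, ... in this table are generated during the import. The sketch below illustrates one way such values could be built in base R: a plot name formed by joining plot and subplot with an underscore, and a running number restarted within each plot name. It is an illustration on made-up records, not the package's actual implementation.

```r
# Toy diameter records (plot and subplot codes borrowed from the example, values illustrative)
dia <- data.frame(
  Plot = c("LGM38h", "LGM38h", "LGM38h", "LGM08r", "LGM08r"),
  Subplot = c("1Q", "1Q", "2Q", "1Q", "1Q"),
  Diameter = c(10.6, 4.0, 11.3, 7.2, 5.5),
  stringsAsFactors = FALSE
)

# Combined plot/subplot name, e.g. "LGM38h_1Q"
dia$plotName <- paste(dia$Plot, dia$Subplot, sep = "_")

# Running number within each plot name, turned into a label
idx <- ave(seq_along(dia$plotName), dia$plotName, FUN = seq_along)
dia$label <- paste0("ind", idx)

dia[, c("plotName", "label", "Diameter")]
```

Because the numbering restarts in every plot, a label such as ind1 is only unique in combination with the plot name.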

Note that the column individualOrganismLabel contains labels created by the function itself, by numbering all individuals of each plot. The call to function addIndividualOrganismObservations() also led to the definition of organismName elements (used to store the different organism/taxon names that are used in the Veg-X document) and organismIdentity elements (which define the identity of organisms, with links to organism names and taxon concepts). Let’s inspect the latter:

head(showElementTable(moki_vegx, "organismIdentity"), 3)
##           identityName originalOrganismName taxon
## 1  Weinmannia racemosa  Weinmannia racemosa  TRUE
## 2 Coprosma grandifolia Coprosma grandifolia  TRUE
## 3      Cyathea smithii      Cyathea smithii  TRUE

In this case, the identity is simply the species name coming from the source data, but it could be another name considered nomenclaturally more valid for the same species.

The Mt Fyffe forest data set also contains tree diameter measurements, but in this case there have been two surveys, so in order to add individual tree observations we need the mapping individualOrganismLabel to specify which column identifies each tree in each plot:

head(mtfyffe_dia, 3)
##    DiameterID Plot Subplot                  Project EntryNo
## 17     119220 17 3       L FYFFE, MOUNT FOREST 1980      91
## 20     121700  6 4       I FYFFE, MOUNT FOREST 1980     154
## 29     121685  6 4       D FYFFE, MOUNT FOREST 1980     139
##    ItemCurrentIdentifier Identifier AliveState            MethodName
## 17                  5167       5167      Alive Quadrat tree diameter
## 20                  7778       7778      Alive Quadrat tree diameter
## 29                  7701       7701      Alive Quadrat tree diameter
##    Verbatim.Species TaxonID CurrentTaxonID NVSCode       NVSSpeciesName
## 17           COPLUC    2400           2400  COPLUC      Coprosma lucida
## 20           RUBCIS    2756           2756  RUBCIS      Rubus cissoides
## 29           PSEARB    2568           2568  PSEARB Pseudopanax arboreus
##    Diameter DiameterValueUnits AssociationNo Association ItemObsNote ItemID
## 17     7.90                 cm            NA                      NA  13431
## 20     2.90                 cm            NA                      NA  11093
## 29     4.60                 cm            NA                      NA  11049
##    ItemObsID ParentPlotID ParentPlotObsID PlotMethodID PlotObsID ProjectID
## 17    902441          229           52221        89460    145010       867
## 20    902572          217           52241        67343    149638       867
## 29    902750          217           52241        24783    149634       867
##    SampleMethodID PlotObsStartDate
## 17           4629       1980-02-07
## 20           4629       1980-02-07
## 29           4629       1980-02-07
mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate",
               taxonName = "NVSSpeciesName", individualOrganismLabel = "Identifier", 
               diameterMeasurement = "Diameter")

Since the diameter measurement method is the same as before, we can directly run addIndividualOrganismObservations() and inspect the result:

mtfyffe_vegx = addIndividualOrganismObservations(mtfyffe_vegx, mtfyffe_dia, 
                                  mapping = mapping,
                                  methods = list(diameterMeasurement = "DBH/cm"))
##  Measurement method 'DBH/cm' added for 'diameterMeasurement'.
##  67 plot(s) parsed, 0 new added.
##  117 plot observation(s) parsed, 0 new added.
##  30 organism names(s) parsed, 30 new added.
##  0 taxon concept(s) parsed, 0 new added.
##  30 organism identitie(s) parsed, 30 new added.
##  725 individual organism(s) parsed, 725 new added.
##  1082 record(s) parsed, 1082 new individual organism observation(s) added.
##  105 individual organism observation(s) with missing diameter value(s) not added.
head(showElementTable(mtfyffe_vegx, "individualOrganismObservation"), 3)
##   plotName obsStartDate individualOrganismLabel organismIdentityName
## 1   17 3_L   1980-02-07                    5167      Coprosma lucida
## 2    6 4_I   1980-02-07                    7778      Rubus cissoides
## 3    6 4_D   1980-02-07                    7701 Pseudopanax arboreus
##   diameter_method diameter_value
## 1          DBH/cm            7.9
## 2          DBH/cm            2.9
## 3          DBH/cm            4.6

Note that in this case the number of individual tree observations is higher than the number of trees, because of the repeated measurements. Although we do not show it in any example here, it is possible to associate individual organism observations with the particular heights or strata where organisms were observed, by linking them to stratum observations in the same way as is done for aggregate organism observations in the next section.
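
The relationship between individuals and their observations can be illustrated with a small base R sketch. An individual is identified by the combination of plot name and label, so counting distinct combinations gives the number of organisms, while the number of rows gives the number of observations. The records below are made up, and the second survey date is purely hypothetical.

```r
# Toy observation records; the 2007 date is a hypothetical resurvey
obs <- data.frame(
  plotName = c("6 4_I", "6 4_I", "6 4_D", "17 3_L", "17 3_L"),
  label = c("7778", "7778", "7778", "5167", "5167"),
  obsStartDate = as.Date(c("1980-02-07", "2007-03-01", "1980-02-07",
                           "1980-02-07", "2007-03-01")),
  diameter = c(2.9, 3.4, 4.6, 7.9, 8.8),
  stringsAsFactors = FALSE
)

# One row per distinct (plot, label) pair, i.e. per individual organism
individuals <- unique(obs[, c("plotName", "label")])

n_individuals <- nrow(individuals)
n_observations <- nrow(obs)
c(individuals = n_individuals, observations = n_observations)
```

Label 7778 appears in two different plots and is therefore counted as two individuals, which is why reusing labels across plots does not cause integrity problems.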

Aggregate organism observations

Aggregate organism observations include measurements that apply to a set of organisms collectively, normally all organisms of the same species identity. The most common examples are abundance values (e.g. cover) for species. Function addAggregateOrganismObservations() can be used to import such data into a VegX document. We first inspect the Mokihinui forest data frame moki_tcv to decide what information should be mapped:

head(moki_tcv,3)
##     TaxonCategoryValueID   Plot Subplot
## 920              4302767 LGM38h      NA
## 921              4302768 LGM38h      NA
## 922              4302769 LGM38h      NA
##                                         Project EntryNo      MethodName
## 920 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011       1 Recce Inventory
## 921 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011       1 Recce Inventory
## 922 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011       1 Recce Inventory
##     Verbatim.Species TaxonID CurrentTaxonID NVSCode        NVSSpeciesName
## 920           DACCUP    1178           1178  DACCUP Dacrydium cupressinum
## 921           DACCUP    1178           1178  DACCUP Dacrydium cupressinum
## 922           DACCUP    1178           1178  DACCUP Dacrydium cupressinum
##       Tier TierDescription TierTechniqueName TierLower TierUpper
## 920 Tier 1          > 25 m    Standard Recce        25      50.0
## 921 Tier 2       12 - 25 m    Standard Recce        12      25.0
## 922 Tier 6         < 30 cm    Standard Recce         0       0.3
##     TierHeightUnits Category CategoryDescription  CategoryTechniqueName
## 920               m        3               6-25% Standard Recce (Allen)
## 921               m        3               6-25% Standard Recce (Allen)
## 922               m        1                 <1% Standard Recce (Allen)
##     TaxonObsNote ParentPlotID ParentPlotObsID PlotMethodID PlotObsID ProjectID
## 920                    789183         1161780      1392859   1161780      2444
## 921                    789183         1161780      1392859   1161780      2444
## 922                    789183         1161780      1392859   1161780      2444
##     SampleMethodID TaxonObsID  TierID PlotObsStartDate
## 920          13405    7448485 2133018       2011-02-20
## 921          13405    7448485 2133019       2011-02-20
## 922          13405    7448485 2133023       2011-02-20

As before, taxon names can be drawn from column NVSSpeciesName. Column Tier contains information about the stratum where species were recorded, whereas column Category contains cover values coded on an ordinal cover scale. First, we define a mapping for these variables as well as for plot and observation start date, which together specify a plotObservation (in this data set, aggregate organism observations were not made in subplots):

mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate", 
               taxonName = "NVSSpeciesName",
               stratumName = "Tier", cover = "Category")

In order to parse cover values, we could use a method of percent cover, but in this data set cover is specified using cover classes. Thus, we need to define an ordinal scale that can be used to interpret Category values; this can be done with function defineOrdinalScaleMethod():

coverscale = defineOrdinalScaleMethod(name = "Recce cover scale",
                   description = "Recce recording method by Hurst/Allen",
                   subject = "plant cover",
                   citation = paste("Hurst, JM and Allen, RB. (2007) The Recce method for describing",
                                    "New Zealand vegetation – Field protocols. Landcare Research, Lincoln."),
                   codes = c("P","1","2","3", "4", "5", "6"),
                   quantifiableCodes = c("1","2","3", "4", "5", "6"),
                   breaks = c(0, 1, 5, 25, 50, 75, 100),
                   midPoints = c(0.5, 3, 15, 37.5, 62.5, 87.5),  # arithmetic midpoints of the breaks
                   definitions = c("Presence", "<1%", "1-5%","6-25%", "26-50%", 
                                   "51-75%", "76-100%"))

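To make the role of an ordinal scale more concrete, the following base R sketch translates cover class codes into percent cover values. It takes the class limits of the scale and uses the arithmetic midpoint of each class, which is one possible convention and an assumption of this example. The presence code "P" is not quantifiable and is mapped to NA.

```r
# Quantifiable codes and class limits (in percent) of the cover scale
codes  <- c("1", "2", "3", "4", "5", "6")
breaks <- c(0, 1, 5, 25, 50, 75, 100)

# Arithmetic midpoint of each class: mean of its lower and upper limit
mid <- (breaks[-length(breaks)] + breaks[-1]) / 2
names(mid) <- codes

# Toy vector of recorded categories, including a presence-only record
category <- c("3", "3", "1", "P")

# Look up the midpoint for each code; codes without a midpoint yield NA
cover_pct <- unname(mid[category])
cover_pct
```

Such a lookup is only a sketch of how coded values can be made comparable with percent cover data; the values stored in the Veg-X document remain the original codes.
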
As the source data specifies taxon abundances within vegetation strata, we also need to supply information on how the strata are defined. The VegX R package provides three different ways of defining strata: by heights, by categories and using a mixed approach. This last option is used in the following code:

moki_strataDef = defineMixedStrata(name = "Recce strata",
                   description = "Standard Recce stratum definition",
                   citation = paste("Hurst, JM and Allen, RB. (2007) The Recce method for describing",
                                    "New Zealand vegetation – Field protocols. Landcare Research, Lincoln."),
                   heightStrataBreaks = c(0, 0.3,2.0,5, 12, 25, 50),
                   heightStrataNames = paste0("Tier ", 6:1),  # Tier 6 is the lowest layer (0-0.3 m), Tier 1 the tallest (25-50 m)
                   categoryStrataNames = "Tier 7",
                   categoryStrataDefinition = "Epiphytes")
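
As an illustration of what a height-based stratum definition expresses, the base R sketch below assigns plant heights (in metres) to tiers with cut(). The tier labels follow the TierLower and TierUpper columns of moki_tcv, where Tier 1 spans 25 to 50 m and Tier 6 spans 0 to 0.3 m. Treating each interval as closed on its upper limit is an assumption of this example, and the code is independent of the VegX package.

```r
# Height limits of the strata, in metres, from lowest to tallest
breaks <- c(0, 0.3, 2, 5, 12, 25, 50)

# Labels in the same order as the intervals: the lowest layer is Tier 6
labels <- paste0("Tier ", 6:1)

# Toy plant heights in metres
heights <- c(0.1, 1.5, 18, 30)

# Assign each height to its stratum (upper limits inclusive by assumption)
tier <- as.character(cut(heights, breaks = breaks, labels = labels,
                         right = TRUE, include.lowest = TRUE))
data.frame(height_m = heights, tier = tier)
```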

Having the mapping, the cover scale and the stratum definition we can proceed to import species cover values by strata using function addAggregateOrganismObservations():

moki_vegx = addAggregateOrganismObservations(moki_vegx, moki_tcv, mapping,
                        methods = list(cover=coverscale),
                        stratumDefinition = moki_strataDef)
##  1 additional aggregate organism measurements found: Category.
##  Measurement method 'Recce cover scale' added for 'cover'.
##  Stratum definition method 'Recce strata' added.
##  7 new stratum definitions added.
##  5 plot(s) parsed, 0 new added.
##  5 plot observation(s) parsed, 0 new added.
##  148 organism names(s) parsed, 121 new added.
##  148 organism identitie(s) parsed, 121 new added.
##  33 stratum observation(s) parsed, 33 new added.
##  582 record(s) parsed, 582 new aggregate organism observation(s) added.

Note that both the stratum definition and the cover scale contain methods that are added to the Veg-X document. The strata themselves are also added to the document (i.e. stratum elements). Other elements that are added are organism identities (i.e. taxon names), stratum observations (because species were observed while focusing on particular strata) and, finally, the aggregate organism observations themselves. Fewer organism names and organism identities were added than parsed, because the Veg-X document already contained some of them from the individual organism observations. We can inspect the newly added taxon cover observations using:

head(showElementTable(moki_vegx, "aggregateOrganismObservation"),3)
##   plotName obsStartDate  organismIdentityName stratumName      agg_1_method
## 1   LGM38h   2011-02-20 Dacrydium cupressinum      Tier 1 Recce cover scale
## 2   LGM38h   2011-02-20 Dacrydium cupressinum      Tier 2 Recce cover scale
## 3   LGM38h   2011-02-20 Dacrydium cupressinum      Tier 6 Recce cover scale
##   agg_1_value
## 1           3
## 2           3
## 3           1
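
The import messages above report 148 organism names parsed but only 121 newly added, which means 27 names were already present in the document. The logic behind such counts can be sketched with set operations in base R. The name vectors below are made up, and the code only illustrates the idea, not how the package stores names internally.

```r
# Names already present in the document (e.g. from tree diameter records)
existing <- c("Weinmannia racemosa", "Coprosma grandifolia", "Cyathea smithii")

# Names found in the new records, with repeats across strata
parsed <- c("Dacrydium cupressinum", "Weinmannia racemosa",
            "Cyathea smithii", "Dacrydium cupressinum")

# Distinct names parsed, and those not yet in the document
parsed_unique <- unique(parsed)
new_names <- setdiff(parsed_unique, existing)

c(parsed = length(parsed_unique), new = length(new_names))
```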

The Mt Fyffe forest data set includes individual counts by species and stratum (i.e. another kind of aggregate organism observation) in data frame mtfyffe_counts, which has a structure similar to that of moki_tcv, but with counts stored in column Value:

head(mtfyffe_counts, 3)
##   TaxonSimpleValueID Plot Subplot                  Project EntryNo
## 2            1496938 12 1       E FYFFE, MOUNT FOREST 1980      11
## 3            1496939 12 1       E FYFFE, MOUNT FOREST 1980      12
## 4            1496951 12 1       M FYFFE, MOUNT FOREST 1980      26
##        MethodName Verbatim.Species TaxonID CurrentTaxonID NVSCode
## 2 Quadrat sapling           PSECOL    2588           2588  PSECOL
## 3 Quadrat sapling           RUBCIS    2756           2756  RUBCIS
## 4 Quadrat sapling           PSECRA    2571           2571  PSECRA
##             NVSSpeciesName Tier TierDescription TierTechniqueName Value Units
## 2   Pseudowintera colorata   NA                                       1 count
## 3          Rubus cissoides   NA                                       1 count
## 4 Pseudopanax crassifolius   NA                                       1 count
##     MeasureName TaxonObsNote ParentPlotID ParentPlotObsID PlotMethodID
## 2 Sapling Count                       221           52256       508043
## 3 Sapling Count                       221           52256       508043
## 4 Sapling Count                       221           52256       420498
##   PlotObsID ProjectID SampleMethodID TaxonObsID TierID PlotObsStartDate
## 2    142900       867           3512    2185897     NA       1980-02-07
## 3    142900       867           3512    2185898     NA       1980-02-07
## 4    326011       867           3512    2185912     NA       1980-02-07
mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate", 
               taxonName = "NVSSpeciesName", stratumName = "Tier", counts = "Value")

Analogously to the previous case, we need to specify a measurement method for counts, and in this case we can use function predefinedMeasurementMethod():

countscale = predefinedMeasurementMethod("Individual plant counts")

Then we also need to provide the strata definition, which is different from that of the previous data set. Here all strata are defined by height, so we can use a function called defineHeightStrata():

mtfyffe_strataDef = defineHeightStrata(name = "Standard seedling/sapling strata",
                              description = "Seedling/sapling stratum definition",
                              heightBreaks = c(0, 15, 45, 75, 105, 135, 200),
                              strataNames = as.character(1:6),
                              strataDefinitions = c("0-15 cm", "16-45 cm", "46-75 cm", 
                                                    "76-105 cm", "106-135 cm", "> 135 cm"))

Now, we are ready to import the data:

mtfyffe_vegx = addAggregateOrganismObservations(mtfyffe_vegx, mtfyffe_counts, mapping,
                        methods = list(counts=countscale),
                        stratumDefinition = mtfyffe_strataDef)
##  1 additional aggregate organism measurements found: Value.
##  Measurement method 'Individual plant counts' added for 'counts'.
##  Stratum definition method 'Standard seedling/sapling strata' added.
##  6 new stratum definitions added.
##  131 plot(s) parsed, 0 new added.
##  194 plot observation(s) parsed, 0 new added.
##  55 organism names(s) parsed, 30 new added.
##  55 organism identitie(s) parsed, 30 new added.
##  170 stratum observation(s) parsed, 170 new added.
##  533 record(s) parsed, 533 new aggregate organism observation(s) added.
head(showElementTable(mtfyffe_vegx, "aggregateOrganismObservation"),3)
##   plotName obsStartDate     organismIdentityName            agg_1_method
## 1   12 1_E   1980-02-07   Pseudowintera colorata Individual plant counts
## 2   12 1_E   1980-02-07          Rubus cissoides Individual plant counts
## 3   12 1_M   1980-02-07 Pseudopanax crassifolius Individual plant counts
##   agg_1_value stratumName
## 1           1        <NA>
## 2           1        <NA>
## 3           1        <NA>

Again, elements of several kinds are added to our Veg-X document. The process for the Takitimu grassland data set is similar, but in this case the observations are not organized by strata and the abundance values are frequencies of occurrence:

head(taki_freq, 3)
##   TaxonSimpleValueID            Plot Subplot                      Project
## 3            7514719 T2 2 (T2 2_TRA)      NA TAKITIMU GRASSLAND 1968-1969
## 6            7566591 T2 2 (T2 2_TRA)      NA TAKITIMU GRASSLAND 1968-1969
## 9            7583612 R3 1 (R3 1_TRA)      NA TAKITIMU GRASSLAND 1968-1969
##   EntryNo         MethodName Verbatim.Species TaxonID CurrentTaxonID NVSCode
## 3       0 Transect frequency             MOSS    2184           2184    MOSS
## 6       0 Transect frequency           SCHPAU    2843           2843  SCHPAU
## 9       0 Transect frequency           HELICH     843            843  HELICH
##         NVSSpeciesName TierDescription TierTechniqueName Value Units
## 3         Moss species              NA                NA    38     %
## 6 Schoenus pauciflorus              NA                NA     9     %
## 9  Helichrysum species              NA                NA     2     %
##            MeasureName TaxonObsNote ParentPlotID ParentPlotObsID PlotMethodID
## 3 Percentage Frequency           NA       781855         1153877      1386555
## 6 Percentage Frequency           NA       781855         1153877      1386555
## 9 Percentage Frequency           NA       781860         1153882      1385873
##   PlotObsID ProjectID SampleMethodID TaxonObsID TierID PlotObsStartDate
## 3   1153877       500          13184    7262934     NA       1968-02-07
## 6   1153877       500          13184    7262217     NA       1968-02-07
## 9   1153882       500          13184    7247662     NA       1968-02-07
mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate", 
               taxonName = "NVSSpeciesName", freq = "Value")

Hence, we define the new measurement scale and call addAggregateOrganismObservations() again:

taki_vegx = addAggregateOrganismObservations(taki_vegx, taki_freq, mapping,
                        methods = list(freq="Plant frequency/%"))
##  1 additional aggregate organism measurements found: Value.
##  Measurement method 'Plant frequency/%' added for 'freq'.
##  5 plot(s) parsed, 0 new added.
##  5 plot observation(s) parsed, 0 new added.
##  38 organism names(s) parsed, 38 new added.
##  38 organism identitie(s) parsed, 38 new added.
##  94 record(s) parsed, 94 new aggregate organism observation(s) added.
head(showElementTable(taki_vegx, "aggregateOrganismObservation"), 3)
##          plotName obsStartDate organismIdentityName      agg_1_method
## 1 T2 2 (T2 2_TRA)   1968-02-07         Moss species Plant frequency/%
## 2 T2 2 (T2 2_TRA)   1968-02-07 Schoenus pauciflorus Plant frequency/%
## 3 R3 1 (R3 1_TRA)   1968-02-07  Helichrysum species Plant frequency/%
##   agg_1_value
## 1          38
## 2           9
## 3           2

As expected, neither stratum definitions nor stratum observations are added to the Veg-X document, but we still see the addition of organism names, organism identities and aggregate organism observations.

While aggregate organism observations are often related to strata, it is also possible to indicate that cover or count measurements were made at a particular height, by mapping to heightMeasurement instead of using stratumName.
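Every add... function relies on a mapping whose values must be column names of the source data frame. The following is a minimal base R sketch of a check that could be run before importing. The data frame src is a toy stand-in for a source table and missingMappedColumns() is a hypothetical helper, not a function of the VegX package.

```r
# Toy stand-in for a source table (hypothetical data, not taki_freq itself)
src <- data.frame(Plot = c("A", "B"),
                  PlotObsStartDate = c("1968-02-07", "1968-02-07"),
                  NVSSpeciesName = c("Moss species", "Schoenus pauciflorus"),
                  Value = c(38, 9),
                  stringsAsFactors = FALSE)

mapping <- list(plotName = "Plot", obsStartDate = "PlotObsStartDate",
                taxonName = "NVSSpeciesName", freq = "Value")

# Hypothetical helper: mapped column names that are absent from the data
missingMappedColumns <- function(mapping, x) {
  setdiff(unlist(mapping, use.names = FALSE), names(x))
}

# An empty result means that every mapped column exists in the source table
missing_cols <- missingMappedColumns(mapping, src)
missing_cols
```

A non-empty result lists the column names that are misspelled in the mapping or absent from the source data.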

Stratum observations

In the previous subsections we stated that both individual and aggregate organism observations can be positioned in a particular vegetation stratum (e.g. the moss layer). However, some measurements apply to the stratum itself, like the overall cover or basal area of all organisms in the stratum, regardless of their identity. Other common stratum measurements are those that define its vertical limits (e.g. at which height does the tree layer start?). Veg-X stores this information in stratumObservation elements. We showed that elements of this kind were automatically created and added when dealing with aggregate organism observations, but here we show how to add measurements that specifically refer to strata using function addStratumObservations().

To illustrate how to add stratum observations to a Veg-X document, we take again the Mokihinui forest data set as data source and inspect the data frame moki_str, which contains strata cover measurements:

head(moki_str, 3)
##     TierID   Plot Subplot                                     Project
## 26 2133018 LGM38h      NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 27 2133019 LGM38h      NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 28 2133020 LGM38h      NA MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
##         MethodName   Tier TierDescription TierTechniqueName TierLower TierUpper
## 26 Recce Inventory Tier 1          > 25 m    Standard Recce        25        50
## 27 Recce Inventory Tier 2       12 - 25 m    Standard Recce        12        25
## 28 Recce Inventory Tier 3        5 - 12 m    Standard Recce         5        12
##    TierHeightUnits CoverClass CoverClassDescription ParentPlotID
## 26               m          3                    NA       789183
## 27               m          3                    NA       789183
## 28               m          4                    NA       789183
##    ParentPlotObsID PlotObsID ProjectID SampleMethodID PlotMethodID
## 26         1161780   1161780      2444          13405      1392859
## 27         1161780   1161780      2444          13405      1392859
## 28         1161780   1161780      2444          13405      1392859
##    PlotObsStartDate
## 26       2011-02-20
## 27       2011-02-20
## 28       2011-02-20

The data table also contains stratum height limits, although the strata definition we used to import taxon cover data already contained height limits for most strata. We will assume that the limits in moki_str are actual measurements and define the mapping accordingly:

mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate", stratumName = "Tier",
               lowerLimitMeasurement = "TierLower", upperLimitMeasurement = "TierUpper",
               cover = "CoverClass")
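Since lower and upper limits are imported as measurements, it may be worth screening them first. The sketch below is a base R illustration on a toy table whose column names mimic those of moki_str. The values are made up, and nothing here is part of the VegX package.

```r
# Toy strata table (hypothetical values modelled on the columns of moki_str)
strata <- data.frame(Tier = c("Tier 1", "Tier 2", "Tier 3"),
                     TierLower = c(25, 12, 5),
                     TierUpper = c(50, 25, NA),
                     stringsAsFactors = FALSE)

# Rows where at least one limit is missing and cannot be checked
incomplete <- is.na(strata$TierLower) | is.na(strata$TierUpper)

# Rows where both limits are present but the lower one is not below the upper one
inverted <- !incomplete & strata$TierLower >= strata$TierUpper

strata[incomplete, ]
strata[inverted, ]
```

Rows flagged as incomplete correspond to the kind of records for which a missing measurement cannot be stored.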

Both the cover ordinal scale and the strata definitions have been used before, so we do not need to redefine them. We do need, however, to create a definition of the method applying to height measurements, before calling addStratumObservations():

heightMethod = predefinedMeasurementMethod("Stratum height/m")

moki_vegx = addStratumObservations(moki_vegx, moki_str, mapping = mapping,
                        methods = list(lowerLimitMeasurement = heightMethod,
                                       upperLimitMeasurement = heightMethod,
                                       cover=coverscale),
                        stratumDefinition = moki_strataDef)
##  1 stratum measurement variables found.
##  Measurement method 'Stratum height/m' added for 'lowerLimitMeasurement'.
##  Measurement method 'Stratum height/m' for 'upperLimitMeasurement' already included.
##  Measurement method 'Recce cover scale' for 'cover' already included.
##  Stratum definition 'Recce strata' already included.
##  5 plot(s) parsed, 0 new added.
##  5 plot observation(s) parsed, 0 new added.
##  35 record(s) parsed, 2 new stratum observation(s) added.
##  7 measurement(s) with missing value(s) not added.

Note that no new stratum definitions are added, as they were already included when adding aggregate organism observations. We do have some new stratum observations. The current state of the stratum observations can be shown using:

head(showElementTable(moki_vegx, "stratumObservation"), 3)
##   plotName obsStartDate stratumName lowerLimit_method lowerLimit_value
## 1   LGM38h   2011-02-20      Tier 1  Stratum height/m               25
## 2   LGM38h   2011-02-20      Tier 2  Stratum height/m               12
## 3   LGM38h   2011-02-20      Tier 6  Stratum height/m                0
##   upperLimit_method upperLimit_value      str_1_method str_1_value
## 1  Stratum height/m             50.0 Recce cover scale           3
## 2  Stratum height/m             25.0 Recce cover scale           3
## 3  Stratum height/m              0.3 Recce cover scale           3

Community observations

Veg-X stores in communityObservation elements all biotic observations and measurements that are naturally defined at the plant community (or vegetation stand) level, such as basal area, species richness or stand age. Since our example source data sets did not include any such measurements, we start by adding a column BA with simulated basal area values to the moki_site data frame, drawing from a Normal distribution:

moki_site$BA = pmax(0, rnorm(nrow(moki_site), 10, 5))
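Because the values are simulated, they change on every run unless a random seed is fixed. The following base R sketch shows the same construction with a seed. The seed value and the number of rows n are arbitrary choices made for illustration and stand in for nrow(moki_site).

```r
set.seed(42)  # arbitrary seed, chosen only for illustration
n <- 25       # hypothetical number of rows, standing in for nrow(moki_site)

# Normal draws with mean 10 and standard deviation 5, truncated at zero
BA <- pmax(0, rnorm(n, mean = 10, sd = 5))

# pmax() replaces negative draws with zero, so all values are non-negative
all(BA >= 0)

# Resetting the same seed reproduces exactly the same values
set.seed(42)
BA2 <- pmax(0, rnorm(n, mean = 10, sd = 5))
identical(BA, BA2)
```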

Adding community observations requires, as usual, a mapping. In addition to the plot and survey fields, the mapping can include any measurement defined at the community level:

# Define mapping
mapping = list(plotName = "Plot", subPlotName = "Subplot",
               obsStartDate = "PlotObsStartDate", basal_area = "BA")

Of course, for each measurement we will need to provide a method that describes the measured subject, units, etc. Function addCommunityObservations() is used to add community observations to a VegX object:

# Add basal area measurements to the VegX object
moki_vegx = addCommunityObservations(moki_vegx, moki_site, mapping = mapping,
                        methods = list(basal_area = "basal area"))
##  Measurement method 'Basal area/m2*ha-1' added for 'basal_area'.
##  25 plot(s) parsed, 0 new plot(s) added.
##  25 plot observation(s) parsed, 0 new plot observation(s) added.
##  25 record(s) parsed, 25 new community observation(s) added.
# Inspect the result
head(showElementTable(moki_vegx, "communityObservation"),3)
##    plotName obsStartDate      comm_1_method comm_1_value
## 1    LGM08r   2011-02-17 Basal area/m2*ha-1    0.6565694
## 2 LGM08r_1Q   2011-02-17 Basal area/m2*ha-1    9.5813046
## 3 LGM08r_2Q   2011-02-17 Basal area/m2*ha-1   12.0541588

Site observations

Veg-X stores in siteObservation elements all observations and measurements that do not refer to vegetation itself, i.e. abiotic measurements, soil type classifications, etc. Since our example source data sets did not include any such measurements, we created a column pH with constant values in the moki_site data frame. The function that adds site observations is addSiteObservations(), and the following code should be rather self-explanatory by now:

mapping = list(plotName = "Plot", subPlotName = "Subplot", obsStartDate = "PlotObsStartDate")
moki_vegx = addSiteObservations(moki_vegx, moki_site,
                         plotObservationMapping = mapping,
                         soilMeasurementMapping = list(a = "pH"),
                         soilMeasurementMethods = list(a = "pH/0-14"))
##  Measurement method 'pH/0-14' added for 'a'.
##  25 plot(s) parsed, 0 new plot(s) added.
##  25 plot observation(s) parsed, 0 new plot observation(s) added.
##  25 record(s) parsed, 25 new site observation(s) added.

In contrast with other add... functions, the name ‘a’ is only used within the call to addSiteObservations() (i.e., there will be no variable called ‘a’ in the Veg-X document). When displaying site observations, the column prefixes soil_1_*, soil_2_* only indicate the numbering of soil variables:

head(showElementTable(moki_vegx, "siteObservation"))
##    plotName obsStartDate soil_1_method soil_1_value
## 1    LGM08r   2011-02-17       pH/0-14            7
## 2 LGM08r_1Q   2011-02-17       pH/0-14            7
## 3 LGM08r_2Q   2011-02-17       pH/0-14            7
## 4 LGM08r_3Q   2011-02-17       pH/0-14            7
## 5 LGM08r_4Q   2011-02-17       pH/0-14            7
## 6    LGM16l   2011-02-15       pH/0-14            7

It is important to distinguish the subject of a method from the method itself. For example, the subject could be the pH of the upper soil solution, whereas particular methods for this subject would be measurement in water or measurement in 0.01 mol CaCl. In the previous example we added the variable ‘pH’ of the input data to the VegX document and defined the measurement method as pH/0-14, which simply specifies the measurement of pH (the subject) on a 0-14 scale. Let’s look at its definition:

## An object of class "VegXMethodDefinition"
## Slot "name":
## [1] "pH/0-14"
## 
## Slot "description":
## [1] "pH scale from 0 to 14"
## 
## Slot "citationString":
## [1] ""
## 
## Slot "DOI":
## [1] ""
## 
## Slot "subject":
## [1] "pH"
## 
## Slot "attributeType":
## [1] "quantitative"
## 
## Slot "attributes":
## $`1`
## $`1`$type
## [1] "quantitative"
## 
## $`1`$unit
## [1] NA
## 
## $`1`$lowerLimit
## [1] 0
## 
## $`1`$upperLimit
## [1] 14
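The lowerLimit and upperLimit attributes of a quantitative method suggest a simple screening step for source values before they are imported. The following is a minimal base R sketch with made-up pH readings. It does not call any VegX function, and it makes no assumption about whether the package itself rejects out-of-range values.

```r
# Hypothetical pH readings; the last one lies outside the declared scale
pH <- c(7, 5.4, 6.8, 15.2)

# Limits copied from the method definition printed above
lowerLimit <- 0
upperLimit <- 14

# Flag non-missing values that fall outside [lowerLimit, upperLimit]
outOfRange <- !is.na(pH) & (pH < lowerLimit | pH > upperLimit)
which(outOfRange)
```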

Surface cover observations

Surface cover observations are measurements of the percentage of the plot’s surface that is covered (i.e. when projected onto the ground) by different surface types, such as rocks, bare soil, vegetation, etc. Veg-X allows defining surface types as surfaceType elements, and storing cover values for them in surfaceCoverObservation elements. We use the Mt Fyffe forest data set to illustrate how observations of this kind are added to a Veg-X document. First we inspect table mtfyffe_groundcover and define a mapping:

head(mtfyffe_groundcover, 3)
##    PlotGroundCoverID Plot Subplot                  Project PlotGroundCover
## 14              6157  6 4      NA FYFFE, MOUNT FOREST 1980      Vegetation
## 17              4176 17 3      NA FYFFE, MOUNT FOREST 1980      Vegetation
## 21              1353 14 2      NA FYFFE, MOUNT FOREST 1980      Vegetation
##                     TechniqueName           MeasureName Value ParentPlotID
## 14 Standard Ground Cover Estimate Ground Cover Category    20          217
## 17 Standard Ground Cover Estimate Ground Cover Category    20          229
## 21 Standard Ground Cover Estimate Ground Cover Category    20          226
##    ParentPlotObsID PlotObsID ProjectID PlotObsStartDate
## 14           52241     52241       867       1980-02-07
## 17           52221     52221       867       1980-02-07
## 21           52261     52261       867       1980-02-07
mapping = list(plotName = "Plot", obsStartDate = "PlotObsStartDate",
               surfaceName = "PlotGroundCover", coverMeasurement = "Value")

In this case, cover values are specified as percent cover of ground surface, so we need to define an appropriate method:

coverMethod = predefinedMeasurementMethod("Surface cover/%")

We inspect the surface types used in the data set and call function defineSurfaceTypes() as we have done previously for strata:

unique(mtfyffe_groundcover$PlotGroundCover)
## [1] "Vegetation"   "Moss"         "Litter"       "Exposed Soil" "Rock"
surfaceTypes = defineSurfaceTypes(name = "Default surface types",
                     description = "Five surface categories",
                     surfaceNames = c("Vegetation", "Moss", "Litter", "Exposed Soil", 
                                      "Rock"))

We can now import surface cover observations using function addSurfaceCoverObservations():

mtfyffe_vegx = addSurfaceCoverObservations(mtfyffe_vegx, mtfyffe_groundcover, mapping,
                                coverMethod, surfaceTypes)
##  Measurement method 'Surface cover/%' added for 'coverMeasurement'.
##  Surface type definition method 'Default surface types' added.
##  5 new surface type definitions added.
##  4 plot(s) parsed, 0 new added.
##  8 plot observation(s) parsed, 0 new added.
##  40 record(s) parsed, 40 new surface cover observation(s) added.
head(showElementTable(mtfyffe_vegx, "surfaceCoverObservation", 3))
##   plotObservationID plotName obsStartDate surfaceTypeID surfaceName cover_attID
## 1                 1      6 4   1980-02-07             1  Vegetation           8
## 2                 4     17 3   1980-02-07             1  Vegetation           8
## 3                 3     14 2   1980-02-07             1  Vegetation           8
## 4                 2     12 1   1980-02-07             1  Vegetation           8
## 5                 1      6 4   1980-02-07             2        Moss           8
## 6                 3     14 2   1980-02-07             2        Moss           8
##      cover_method cover_value
## 1 Surface cover/%          20
## 2 Surface cover/%          20
## 3 Surface cover/%          20
## 4 Surface cover/%          20
## 5 Surface cover/%          20
## 6 Surface cover/%          20

Analogously to the case of strata, the function added surface type definitions to the Veg-X document, in addition to the cover values themselves.

The Takitimu grassland data set also includes surface cover observations, although the surface types are slightly different:

head(taki_groundcover, 3)
##    PlotGroundCoverID            Plot Subplot                      Project
## 26            396242 R3 1 (R3 1_TRA)      NA TAKITIMU GRASSLAND 1968-1969
## 27            396243 R3 1 (R3 1_TRA)      NA TAKITIMU GRASSLAND 1968-1969
## 28            396244 R3 1 (R3 1_TRA)      NA TAKITIMU GRASSLAND 1968-1969
##    PlotGroundCover                  TechniqueName           MeasureName Value
## 26          Litter Standard Ground Cover Estimate Ground Cover Category    25
## 27            Rock Standard Ground Cover Estimate Ground Cover Category    25
## 28      Vegetation Standard Ground Cover Estimate Ground Cover Category    25
##    ParentPlotID ParentPlotObsID PlotObsID ProjectID PlotObsStartDate
## 26       781860         1153882   1153882       500       1968-02-07
## 27       781860         1153882   1153882       500       1968-02-07
## 28       781860         1153882   1153882       500       1968-02-07
unique(taki_groundcover$PlotGroundCover)
## [1] "Litter"           "Rock"             "Vegetation"       "Erosion Pavement"
## [5] "Soil"

Therefore, we must define a new set of surface types before calling function addSurfaceCoverObservations():

surfaceTypes = defineSurfaceTypes(name = "Default surface types",
                     description = "Five surface categories",
                     surfaceNames = c("Vegetation", "Soil", "Erosion Pavement", "Litter",
                                      "Rock"))

taki_vegx = addSurfaceCoverObservations(taki_vegx, taki_groundcover, mapping,
                                coverMethod, surfaceTypes)
##  Measurement method 'Surface cover/%' added for 'coverMeasurement'.
##  Surface type definition method 'Default surface types' added.
##  5 new surface type definitions added.
##  5 plot(s) parsed, 0 new added.
##  5 plot observation(s) parsed, 0 new added.
##  17 record(s) parsed, 17 new surface cover observation(s) added.
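A quick way to see why a second set of surface types was needed is to compare the names defined for Mt Fyffe with the names observed in the Takitimu data. This is a base R sketch that uses only the surface names printed above. The variable names defined and observed are chosen here for illustration.

```r
# Surface types defined for the Mt Fyffe data set
defined <- c("Vegetation", "Moss", "Litter", "Exposed Soil", "Rock")

# Surface names found in the Takitimu ground cover table
observed <- c("Litter", "Rock", "Vegetation", "Erosion Pavement", "Soil")

# Names present in the data but absent from the first definition
uncovered <- setdiff(observed, defined)
uncovered

# Names defined for Mt Fyffe that do not occur in the Takitimu data
unused <- setdiff(defined, observed)
unused
```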

Combining and harmonizing Veg-X documents

One of the purposes of importing data into Veg-X is to make it possible to combine and harmonize documents from different sources. In this section we illustrate how documents should be merged, and present some functions that can be used to harmonize their contents.

Adding unique identifiers

When combining VegX objects from different sources it is important to pay attention to plot names, because plots from two different sources may have been given the same name while in fact they correspond to different sampled areas. To combine two vegetation sources while avoiding confusion in plot identity one should use plot unique identifiers, i.e. sub-element plotUniqueIdentifier of plot. When populating a Veg-X object from a single source data set, unique identifiers are not normally available nor needed, and the functions that add observations to the object will only look at plotName to identify plots uniquely. However, when merging VegX objects unique identifiers should be defined, and two plots should be considered to be the same only if they share both plot name and unique identifier. While less critical than plot unique identifiers, the Veg-X standard also allows unique identifiers for plot observations, via the sub-element plotObservationUniqueIdentifier of plotObservation.
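The risk described above can be illustrated with a few lines of base R. The plot names and the source prefixes below are hypothetical and serve only to show how a name shared by two sources stops being ambiguous once an identifier is attached to it.

```r
# Hypothetical plot names from two independent sources
plots_a <- c("P1", "P2", "P3")
plots_b <- c("P2", "P4")

# Names used by both sources: candidates for mistaken identity
shared <- intersect(plots_a, plots_b)
shared

# Hypothetical source-specific identifiers remove the ambiguity
ids_a <- paste("sourceA", plots_a, sep = ":")
ids_b <- paste("sourceB", plots_b, sep = ":")

# anyDuplicated() returns 0 when no identifier occurs twice
anyDuplicated(c(ids_a, ids_b)) == 0
```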

The VegX package provides two ways to supply unique identifiers. The function addPlotObservations() allows specifying mappings for both plotUniqueIdentifier and plotObservationUniqueIdentifier:

mapping = list(projectTitle = "Project", plotName = "Plot", subPlotName = "Subplot",
               obsStartDate = "PlotObsStartDate", obsEndDate = "PlotObsStopDate",
               plotUniqueIdentifier = "PlotID", plotObservationUniqueIdentifier = "PlotObsID")
vegx_ids = addPlotObservations(newVegX(), moki_site, mapping = mapping, verbose = FALSE)

head(showElementTable(vegx_ids, "plot"), 3)
##    plotName plotUniqueIdentifier relatedPlotName plotRelationship
## 1    LGM08r               789033            <NA>             <NA>
## 2 LGM08r_1Q               789034          LGM08r          subplot
## 3 LGM08r_2Q               789035          LGM08r          subplot
head(showElementTable(vegx_ids, "plotObservation"), 3)
##    plotName obsStartDate obsEndDate plotObservationUniqueIdentifier
## 1    LGM08r   2011-02-17 2011-02-17                         1161630
## 2 LGM08r_1Q   2011-02-17 2011-02-17                         1161631
## 3 LGM08r_2Q   2011-02-17 2011-02-17                         1161632
##                                  projectTitle
## 1 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 2 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011
## 3 MOKIHINUI HYDRO PROPOSAL - LOWER GORGE 2011

We could use function addPlotObservations() to define unique identifiers because these were available from our source data. Note, however, that IDs coming from NVS are only unique within the context of this data bank. In cases where the source data do not include unique identifiers, or where those available may not be unique in all situations, one can generate universally unique identifiers (or replace the current identifiers) using function fillUniqueIdentifiers():

moki_vegx = fillUniqueIdentifiers(target = moki_vegx, element = "plot")
head(showElementTable(moki_vegx, "plot"),3)
##    plotName                 plotUniqueIdentifier  area_method area_value
## 1    LGM08r 165a4d88-2e3e-4606-9234-7965efd90c6e Plot area/m2        400
## 2 LGM08r_1Q 9edd1675-0c30-45ea-af60-e5620c9ee0bc Plot area/m2        100
## 3 LGM08r_2Q 5fda70fd-df56-459d-8845-8787dbf62044 Plot area/m2        100
##       shape    length_method length_value     width_method width_value   coordX
## 1 rectangle Plot dimension/m           20 Plot dimension/m          20 172.1407
## 2 rectangle Plot dimension/m           10 Plot dimension/m          10       NA
## 3 rectangle Plot dimension/m           10 Plot dimension/m          10       NA
##     coordY           spatialReference elevation_method elevation_value
## 1 -41.5417 +proj=longlat +datum=WGS84      Elevation/m              95
## 2       NA                       <NA>             <NA>              NA
## 3       NA                       <NA>             <NA>              NA
##    slope_method slope_value  aspect_method aspect_value relatedPlotName
## 1 Slope/degrees          40 Aspect/degrees          360            <NA>
## 2          <NA>          NA           <NA>           NA          LGM08r
## 3          <NA>          NA           <NA>           NA          LGM08r
##   plotRelationship
## 1             <NA>
## 2          subplot
## 3          subplot

A UUID (Universally Unique Identifier) is a 128-bit number used to uniquely identify some object or entity. When generated according to the standard methods, UUIDs are for practical purposes unique, without depending for their uniqueness on a central registration authority or coordination between the parties generating them, unlike most other numbering schemes. While the probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible. If we are interested in merging different documents, it is important to ensure that unique identifiers are defined for plots. Function fillUniqueIdentifiers() generates UUIDs by calling function UUIDgenerate() from the R package uuid.
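To give a sense of how small the duplication probability is, the sketch below applies the usual birthday approximation. It assumes purely random (version 4) UUIDs, which carry 122 random bits out of 128. Whether the identifiers produced in a given session are of that version is not assumed here, and collisionProbability() is a hypothetical helper written for this illustration.

```r
# Approximate probability of at least one collision among n random UUIDs,
# using the birthday approximation 1 - exp(-n(n-1) / (2N)) with N = 2^bits.
# Assumption: version-4 UUIDs with 122 random bits.
collisionProbability <- function(n, bits = 122) {
  # -expm1(-x) computes 1 - exp(-x) accurately for very small x
  -expm1(-n * (n - 1) / (2 * 2^bits))
}

collisionProbability(1e6)  # one million plots
collisionProbability(1e9)  # one billion plots
```

Even for a billion identifiers the result remains many orders of magnitude below any practical concern.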

As we did for the Mokihinui VegX object, we generate universally unique identifiers for the other two VegX objects:

mtfyffe_vegx = fillUniqueIdentifiers(target = mtfyffe_vegx, element = "plot")
taki_vegx = fillUniqueIdentifiers(target = taki_vegx, element = "plot")

Updating taxon nomenclature

head(showElementTable(moki_vegx, "organismIdentity"),10)
##                identityName     originalOrganismName taxon
## 1       Weinmannia racemosa      Weinmannia racemosa  TRUE
## 2      Coprosma grandifolia     Coprosma grandifolia  TRUE
## 3           Cyathea smithii          Cyathea smithii  TRUE
## 4  Dacrycarpus dacrydioides Dacrycarpus dacrydioides  TRUE
## 5       Dicksonia squarrosa      Dicksonia squarrosa  TRUE
## 6   Pseudowintera axillaris  Pseudowintera axillaris  TRUE
## 7      Quintinia acutifolia     Quintinia acutifolia  TRUE
## 8           Coprosma lucida          Coprosma lucida  TRUE
## 9      Melicytus ramiflorus     Melicytus ramiflorus  TRUE
## 10         Myrsine salicina         Myrsine salicina  TRUE
moki_vegx = setPreferredTaxonNomenclature(moki_vegx, moki_lookup,
                   c(originalOrganismName = "NVSSpeciesName", preferredTaxonName = "PreferredSpeciesName"))
##  Preferred taxon name was set on 149 organism identities.
##  Preferred taxon name is now different than original organism name on 22 organism identities.
##  21 new organism name(s) added.
a = showElementTable(moki_vegx, "organismIdentity")
a[which(a$identityName!= a$originalOrganismName),]
##                    identityName     originalOrganismName taxon
## 13             Fuscospora fusca         Nothofagus fusca  TRUE
## 15          Fuscospora truncata      Nothofagus truncata  TRUE
## 22            Podocarpus laetus        Podocarpus hallii  TRUE
## 28         Lophozonia menziesii     Nothofagus menziesii  TRUE
## 36           Carex horizontalis        Uncinia rupestris  TRUE
## 37          Veronica leiophylla          Hebe leiophylla  TRUE
## 41             Carex corynoidea          Uncinia clavata  TRUE
## 45               Carex uncinata         Uncinia uncinata  TRUE
## 55         Phlegmariurus varius           Huperzia varia  TRUE
## 61      Dendrobium cunninghamii      Winika cunninghamii  TRUE
## 66                Carex species          Uncinia species  TRUE
## 67   Notogrammitis billardierei   Grammitis billardierei  TRUE
## 68  Hymenophyllum nephrophyllum    Trichomanes reniforme  TRUE
## 73         Veronica salicifolia         Hebe salicifolia  TRUE
## 76            Melicytus species     Hymenanthera species  TRUE
## 87          Erigeron bilbaoanus         Conyza bilbaoana  TRUE
## 90             Veronica lyallii         Parahebe lyallii  TRUE
## 95          Leontodon saxatilis   Leontodon taraxacoides  TRUE
## 97          Lolium arundinaceum Schedonorus arundinaceus  TRUE
## 113  Notogrammitis heterophylla Ctenopteris heterophylla  TRUE
## 127       Austroderia richardii     Cortaderia richardii  TRUE
## 136             Coprosma dumosa       Coprosma tayloriae  TRUE
##              preferredTaxonName
## 13             Fuscospora fusca
## 15          Fuscospora truncata
## 22            Podocarpus laetus
## 28         Lophozonia menziesii
## 36           Carex horizontalis
## 37          Veronica leiophylla
## 41             Carex corynoidea
## 45               Carex uncinata
## 55         Phlegmariurus varius
## 61      Dendrobium cunninghamii
## 66                Carex species
## 67   Notogrammitis billardierei
## 68  Hymenophyllum nephrophyllum
## 73         Veronica salicifolia
## 76            Melicytus species
## 87          Erigeron bilbaoanus
## 90             Veronica lyallii
## 95          Leontodon saxatilis
## 97          Lolium arundinaceum
## 113  Notogrammitis heterophylla
## 127       Austroderia richardii
## 136             Coprosma dumosa
mtfyffe_vegx = setPreferredTaxonNomenclature(mtfyffe_vegx, mtfyffe_lookup,
                   c(originalOrganismName = "NVSSpeciesName", preferredTaxonName = "PreferredSpeciesName"))
##  Preferred taxon name was set on 59 organism identities.
##  Preferred taxon name is now different than original organism name on 13 organism identities.
##  9 new organism name(s) added.
a = showElementTable(mtfyffe_vegx, "organismIdentity")
a[which(a$identityName!= a$originalOrganismName),]
##                 identityName       originalOrganismName taxon
## 6            Coprosma dumosa         Coprosma tayloriae  TRUE
## 9       Coprosma grandifolia         Coprosma australis  TRUE
## 12         Podocarpus laetus          Podocarpus hallii  TRUE
## 15          Fuscospora fusca           Nothofagus fusca  TRUE
## 19     Prumnopitys taxifolia        Podocarpus spicatus  TRUE
## 30       Veronica leiophylla            Hebe gracillima  TRUE
## 37  Blechnum novae-zelandiae           Blechnum capense  TRUE
## 41            Carex uncinata           Uncinia uncinata  TRUE
## 42           Raukaua simplex        Pseudopanax simplex  TRUE
## 44     Microsorum pustulatum Phymatosorus diversifolius  TRUE
## 48 Polystichum neozelandicum      Polystichum richardii  TRUE
## 49              Carex healyi             Uncinia scabra  TRUE
## 58     Veronica brachysiphon          Hebe brachysiphon  TRUE
##           preferredTaxonName
## 6            Coprosma dumosa
## 9       Coprosma grandifolia
## 12         Podocarpus laetus
## 15          Fuscospora fusca
## 19     Prumnopitys taxifolia
## 30       Veronica leiophylla
## 37  Blechnum novae-zelandiae
## 41            Carex uncinata
## 42           Raukaua simplex
## 44     Microsorum pustulatum
## 48 Polystichum neozelandicum
## 49              Carex healyi
## 58     Veronica brachysiphon
taki_vegx = setPreferredTaxonNomenclature(taki_vegx, taki_lookup,
                   c(originalOrganismName = "NVSSpeciesName", preferredTaxonName = "PreferredSpeciesName"))
##  Preferred taxon name was set on 38 organism identities.
##  Preferred taxon name is now different than original organism name on 5 organism identities.
##  5 new organism name(s) added.
a = showElementTable(taki_vegx, "organismIdentity")
a[which(a$identityName!= a$originalOrganismName),]
##               identityName    originalOrganismName taxon
## 9   Veronica lycopodioides      Hebe lycopodioides  TRUE
## 14 Rytidosperma setifolium Notodanthonia setifolia  TRUE
## 15        Zotovia colensoi      Petriella colensoi  TRUE
## 17        Veronica species            Hebe species  TRUE
## 32        Lobelia angulata         Pratia angulata  TRUE
##         preferredTaxonName
## 9   Veronica lycopodioides
## 14 Rytidosperma setifolium
## 15        Zotovia colensoi
## 17        Veronica species
## 32        Lobelia angulata

Merging two Veg-X documents

Function mergeVegX() is used to merge two Veg-X documents into a single one. This function puts all the input elements into the same containers and, whenever elements are considered to be the same, they are merged. Each element kind has its own way to determine when two instances refer to the same entity. For example, two plots will be considered to be equal if they have the same plot name and, if defined, they have the same plotUniqueIdentifier. By default, however, plots and organism identities are not merged. This is a safety measure to prevent plots with the same name but from different sources from being identified as equal. A call to mergeVegX() to merge the moki_vegx and mtfyffe_vegx VegX objects produces the following output:

comb_vegx = mergeVegX(moki_vegx, mtfyffe_vegx)
##  Final number of party(ies): 1. Data pooled for 0 party(ies).
##  Final number of literature citations: 1. Data pooled for 0 citation(s).
##  Final number of methods: 15. Data pooled for 5 method(s).
##  Final number of attributes: 20.
##  Final number of strata: 13. Data pooled for 0 strata.
##  Final number of surface types: 5. Data pooled for 0 surface types.
##  Final number of organism names: 209. Data pooled for 30 organism name(s).
##  Final number of taxon concepts: 0. Data pooled for 0 organism name(s).
##  Final number of organism identities: 209. Data pooled for 0 organism identitie(s).
##  Final number of projects: 3. Data pooled for 0 project(s).
##  Final number of plots: 190. Data pooled for 0 plot(s).
##  Final number of individual organisms: 1368. Data pooled for 0 individual organism(s).
##  Final number of plot observations: 351. Data pooled for 0 plot observation(s).
##  Final number of stratum observations: 205. Data pooled for 0 stratum observation(s).
##  Final number of individual organism observations: 1725. Data pooled for 0 individual organism observation(s).
##  Final number of aggregate organism observations: 1115. Data pooled for 0 aggregate organism observation(s).
##  Final number of site observations: 25. Data pooled for 0 site observation(s).
##  Final number of community observations: 25. Data pooled for 0 community observation(s).
##  Final number of surface cover observations: 40. Data pooled for 0 surface cover observation(s).

Plots were all kept separate because allowMergePlots = FALSE by default. In this case, however, all plots had different names and unique identifiers, so they would have been kept separate even with allowMergePlots = TRUE. The two objects shared some methods, so the function identified them as equal and avoided duplicates (data were pooled for 5 methods). The decisions to merge (i.e. pool) information for other elements can be interpreted similarly. A special case concerns organismName vs. organismIdentity. Note that some organism names were merged, but identities were not. While merging equal names is always safe, merging identities should be done with extreme caution, because two data sets may have used the same taxon name with different taxon concepts. Therefore, by default mergeVegX() does not merge organism identities. To allow identities to be merged (when considered equal), set parameter allowMergeOrganismIdentities = TRUE:

# comb_vegx = mergeVegX(moki_vegx, mtfyffe_vegx, allowMergeOrganismIdentities = TRUE)

With this setting, the number of merges reported for organismName is the same as for organismIdentity, because equal names have been forced to mean equal identities.

Note that function mergeVegX() can also be used to merge two documents that refer to the same data source, for example when different parts of the same source data have been imported into different Veg-X objects. In this case the user should specify allowMergePlots = TRUE.

Merging rules

Users of the VegX package should be aware that organism identities are kept separate by default when merging Veg-X objects. If the user chooses to merge identities, the decision to actually merge two given organism identities is complex and depends on both nomenclature and taxon concepts. Let ‘1’ and ‘2’ be the two organism identities being compared. If original taxon concepts (i.e. element originalIdentificationConcept) are missing for both of them, the following table gives the decision according to nomenclature and, in case of merging, the nomenclature of the resulting identity. Orig. and Pref. denote the original and preferred organism names, a dash denotes a missing name, and an asterisk indicates that the merging function will raise a warning:

Case  Orig.1  Pref.1  Orig.2  Pref.2  Merge  Orig.Res.  Pref.Res.
1     X       -       Y       -       No     -          -
2     X       -       X       -       Yes    X          -
3     X       Y       X       -       No*    -          -
4     X       Y       Y       -       Yes    -          Y
5     X       Y       X       Z       No*    -          -
6     X       Y       V       Z       No     -          -
7     X       Y       Z       Y       Yes    -          Y
8     X       Y       X       Y       Yes    X          Y
9     -       X       -       Y       No     -          -
10    -       X       -       X       Yes    -          X
11    X       -       -       X       Yes    -          X
12    X       -       -       Y       No     -          -
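
One way to read the table above is that each identity is represented by its preferred name when one exists and by its original name otherwise, and that two identities merge when those names coincide. The base R sketch below encodes that reading. nomenclatureMerge() is a hypothetical helper written for illustration, not a function of the VegX package, and the rule is inferred from the twelve cases rather than taken from the package source. Missing names are encoded as NA.

```r
# Hypothetical helper reproducing the nomenclature table; not VegX internals.
# All four arguments are character strings, NA_character_ when missing.
nomenclatureMerge <- function(orig1, pref1, orig2, pref2) {
  # Name each identity currently stands for: preferred if given, else original
  eff1 <- if (is.na(pref1)) orig1 else pref1
  eff2 <- if (is.na(pref2)) orig2 else pref2
  sameOrig <- !is.na(orig1) && !is.na(orig2) && orig1 == orig2
  merge <- !is.na(eff1) && !is.na(eff2) && eff1 == eff2
  list(
    merge = merge,
    # Warning when equal original names are not merged (cases 3 and 5)
    warn = !merge && sameOrig,
    # The original name survives only if both identities share it
    orig = if (merge && sameOrig) orig1 else NA_character_,
    # The preferred name is whichever one is defined
    pref = if (!merge) NA_character_ else if (!is.na(pref1)) pref1 else pref2
  )
}

# Case 4 of the table: merged, original name dropped, preferred name "Y"
str(nomenclatureMerge("X", "Y", "Y", NA_character_))
```

Encoding the table as a function makes it easy to check all twelve cases at once before relying on a given merge setting.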

In the more general case, where original taxon concepts may have been specified for identity ‘1’, identity ‘2’ or both, the following table gives the decision to merge the two organism identities or not, depending on their taxon concepts and on whether merging is possible according to nomenclature (column Nom.Merge?). A dash denotes a missing taxon concept, and an asterisk indicates that the merging function will raise a warning:

Case  Nom.Merge?  Tax.Con.1  Tax.Con.2  Merge  Tax.Con.Res.
1     No          -          -          No     -
2     No          X          -          No     -
3     No          X          Y          No     -
4     No          X          X          No*    -
5     Yes         -          -          Yes    -
6     Yes         X          -          Yes    -
7     Yes         X          Y          No*    -
8     Yes         X          X          Yes    X
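
The second table can be sketched in the same spirit. conceptMerge() below is again a hypothetical helper, not part of VegX. It takes the outcome of the nomenclature comparison and the two taxon concepts (NA when no original concept was asserted) and returns the decision, the warning flag and the resulting concept as listed in the table.

```r
# Hypothetical helper mirroring the taxon concept table; not VegX internals.
# nomMerge: TRUE if nomenclature alone would allow merging.
# tc1, tc2: taxon concept labels, NA_character_ when missing.
conceptMerge <- function(nomMerge, tc1, tc2) {
  bothDefined <- !is.na(tc1) && !is.na(tc2)
  sameConcept <- bothDefined && tc1 == tc2
  differentConcept <- bothDefined && tc1 != tc2
  # Merge needs agreement on names and no conflicting concepts
  merge <- nomMerge && !differentConcept
  # Warnings flag disagreement between names and concepts (cases 4 and 7)
  warn <- (!nomMerge && sameConcept) || (nomMerge && differentConcept)
  list(merge = merge, warn = warn,
       concept = if (merge && sameConcept) tc1 else NA_character_)
}

# Case 7: names agree but concepts differ, so no merge and a warning
str(conceptMerge(TRUE, "X", "Y"))
```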

Adding observations to VegX objects that have identities defined

The general workflow when working with the VegX package is: (1) create a new VegX document; (2) add plot data and plot observations; (3) perform nomenclature corrections; (4) merge documents. However, users may attempt to add observations to a VegX object that already contains identities, possibly with nomenclatural revisions and associated taxon concepts. When organism observations are added to a VegX object, the add function assumes that the supplied name is an originalOrganismName and applies the rules explained above to determine whether it refers to an existing identity; if not, a new organism identity is created. For example, even if the supplied name matches the original organism name of an existing identity, a new identity will be created if an original taxon concept has been asserted for the existing one, because there is no way to check that the two organisms involved do have the same identity. If the existing identity does not have an original taxon concept, the decision to create a new organism identity or not follows the nomenclature rules of the first table above.

Transforming quantitative scales

# Convert stratum heights from metres to centimetres (1 m = 100 cm)
heightMethod2 = predefinedMeasurementMethod("Stratum height/cm")
trans_vegx = transformQuantitativeScale(comb_vegx, "Stratum height/m", heightMethod2,
                               function(x){return(x*100)}, replaceValues = TRUE)
##  Target method: 'Stratum height/m'
##  Number of quantitative attributes: 1
##  Measurement method 'Stratum height/cm' added.
##  70 transformation(s) were applied on stratum observations.
head(showElementTable(trans_vegx, "stratumObservation"),3)
##   plotName obsStartDate stratumName lowerLimit_method lowerLimit_value
## 1   LGM38h   2011-02-20      Tier 1 Stratum height/cm             2500
## 2   LGM38h   2011-02-20      Tier 2 Stratum height/cm             1200
## 3   LGM38h   2011-02-20      Tier 6 Stratum height/cm                0
##   upperLimit_method upperLimit_value      str_1_method str_1_value
## 1 Stratum height/cm             5000 Recce cover scale           3
## 2 Stratum height/cm             2500 Recce cover scale           3
## 3 Stratum height/cm               30 Recce cover scale           3
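
The effect of replaceValues = TRUE can be pictured with a toy data frame: values measured with the source method are passed through the conversion function and relabelled with the new method, while values measured with other methods are left alone. The sketch uses base R only; obs and m_to_cm are hypothetical names, and this is an illustration of the idea rather than the package's implementation.

```r
# Toy stand-in for stratum observations; not a VegX object.
obs <- data.frame(
  method = c("Stratum height/m", "Stratum height/m", "Recce cover scale"),
  value  = c(25, 0.3, 3),
  stringsAsFactors = FALSE
)

# Conversion function of the kind passed to the transformation: 1 m = 100 cm
m_to_cm <- function(x) x * 100

# Convert and relabel only the rows measured with the source method
sel <- obs$method == "Stratum height/m"
obs$value[sel]  <- m_to_cm(obs$value[sel])
obs$method[sel] <- "Stratum height/cm"
print(obs)
```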

Transforming ordinal scales

percentScale = predefinedMeasurementMethod("Plant cover/%")
trans_vegx = transformOrdinalScale(comb_vegx, "Recce cover scale", percentScale)
##  Target method: 'Recce cover scale'
##  Number of attributes: 7
##  Number of quantifiable attributes with midpoints: 6
##  Limits of the new attribute: [0, 100]
##  Measurement method 'Plant cover/%' added.
##  531 transformation(s) were applied on aggregate organism observations.
##  28 transformation(s) were applied on stratum observations.
head(showElementTable(trans_vegx, "stratumObservation"),3)
##   plotName obsStartDate stratumName lowerLimit_method lowerLimit_value
## 1   LGM38h   2011-02-20      Tier 1  Stratum height/m               25
## 2   LGM38h   2011-02-20      Tier 2  Stratum height/m               12
## 3   LGM38h   2011-02-20      Tier 6  Stratum height/m                0
##   upperLimit_method upperLimit_value      str_1_method str_1_value
## 1  Stratum height/m             50.0 Recce cover scale           3
## 2  Stratum height/m             25.0 Recce cover scale           3
## 3  Stratum height/m              0.3 Recce cover scale           3
##    str_2_method str_2_value
## 1 Plant cover/%          15
## 2 Plant cover/%          15
## 3 Plant cover/%          15
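
In the output above, Recce cover class 3 is translated to a plant cover of 15%, which is consistent with replacing each ordinal class by the midpoint of its cover interval. The base R sketch below illustrates that idea. The class limits in the table are assumed for illustration only (just the result for class 3 is grounded in the output above), and limits and toPercent() are hypothetical names, not VegX objects.

```r
# Illustrative class limits in percent cover; assumed, not taken from VegX.
limits <- data.frame(
  code  = c("1", "2", "3", "4", "5", "6"),
  lower = c(0, 1, 5, 25, 50, 75),
  upper = c(1, 5, 25, 50, 75, 100),
  stringsAsFactors = FALSE
)
limits$midpoint <- (limits$lower + limits$upper) / 2

# Map ordinal codes to class midpoints; codes without limits give NA
toPercent <- function(codes) limits$midpoint[match(codes, limits$code)]

toPercent(c("3", "3", "6"))
```

Codes that have no quantifiable interval cannot be given a midpoint, which is why a scale may have more attributes than quantifiable attributes.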

head(showElementTable(, "organismIdentity"))

Writing and reading Veg-X documents

The Veg-X exchange standard is currently implemented as an XML schema (although other physical implementations of the standard are possible). The VegX package provides functions writeVegX() and readVegX(), which are used, respectively, to write and read XML files containing Veg-X documents. An advantage of XML is that it is text that humans can read and understand; its disadvantage is that files tend to be very large, because the text is highly redundant. One way to mitigate this is to compress XML files (into zip or tar.gz files) for more efficient storage. However, compression does not solve the problem that writing and reading XML files can be slow for large data sets.
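
The storage gain from compression can be checked with base R alone. The sketch below writes the same repetitive XML-like text to a plain file and to a gzip-compressed file and compares their sizes. The text is a stand-in, not a valid Veg-X document, and gzip is used here only because base R supports it through gzfile().

```r
# Stand-in XML text with repeated markup; not a real Veg-X document.
xml <- c("<plots>",
         rep("<plot><plotName>P1</plotName></plot>", 1000),
         "</plots>")

plain  <- tempfile(fileext = ".xml")
packed <- tempfile(fileext = ".xml.gz")

writeLines(xml, plain)        # uncompressed text file
con <- gzfile(packed, "w")    # gzip-compressed connection
writeLines(xml, con)
close(con)

# Sizes in bytes: the compressed file is much smaller
sizes <- file.size(c(plain, packed))
print(sizes)

# Reading back through gzfile() recovers the original lines
con <- gzfile(packed, "r")
roundtrip <- readLines(con)
close(con)
identical(roundtrip, xml)
```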

An alternative to XML is to save and read Veg-X documents directly as R objects, using functions saveRDS() and readRDS(). This option is fast and produces much smaller files. The only drawback of saving R objects may arise if the S4 definition of Veg-X documents changes in future versions of the package. We tried to avoid this potential problem by defining S4 Veg-X objects as lists of the main elements, without defining the internal structure of each main element. If the version of the standard changes, functions to convert R objects from old to new versions of the standard should be made available to preserve backwards compatibility, in the same way that function readVegX() should be modified to allow reading XML documents that follow old versions of the standard.
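
A minimal round trip with saveRDS() and readRDS() looks as follows. The object doc is a plain list standing in for a Veg-X document (its shape is assumed for illustration); the same two calls apply to any R object, including a VegX object.

```r
# Stand-in for a Veg-X document: a plain list of main elements (assumed shape).
doc <- list(
  plots   = list(list(plotName = "P1")),
  methods = list(list(name = "Plant cover/%"))
)

path <- tempfile(fileext = ".rds")
saveRDS(doc, path)           # serialised and compressed by default
restored <- readRDS(path)

# The restored object is an exact copy of the original
identical(doc, restored)
```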