Containers in riskyData
If you have seen the previous vignette, riskyData-Welcome, you might have noticed that data pulled from the API aren’t in a conventional format. When data are scraped using the loadAPI() function, gauge metadata are pulled at the same time.
Using the bewdley dataset we can investigate:
library(riskyData)
data(bewdley)
bewdley
#>
#> ── Class: HydroImport ──────────────────────────────────────────────────────────
#>
#> ── Metadata: ──
#>
#> Data Type: Raw Import
#> Station name: Bewdley
#> WISKI ID: 2001
#> Parameter Type: Flow
#> Modifications: NA
#> Start: 2008-10-01 09:00:00
#> End: 2022-10-01 08:45:00
#> Time Step: 900
#> Observations: 490848
#> Easting: 378235
#> Northing: 276165
#> Longitude: -2.321186
#> Latitude: 52.383072
#>
#> ── Observed data: ──
#>
#> dateTime value quality qcode
#> <POSc> <num> <char> <char>
#> 1: 2008-10-01 09:00:00 25.3 Good <NA>
#> 2: 2008-10-01 09:15:00 25.5 Good <NA>
#> 3: 2008-10-01 09:30:00 25.6 Good <NA>
#> 4: 2008-10-01 09:45:00 25.6 Good <NA>
#> 5: 2008-10-01 10:00:00 25.7 Good <NA>
#> ---
#> 490844: 2022-10-01 07:45:00 12.0 Good <NA>
#> 490845: 2022-10-01 08:00:00 11.8 Good <NA>
#> 490846: 2022-10-01 08:15:00 11.6 Good <NA>
#> 490847: 2022-10-01 08:30:00 11.6 Good <NA>
#> 490848: 2022-10-01 08:45:00 11.6 Good <NA>
#> For more details use the $methods() function, the format should be as
#> `Object_name`$methods()
When the dataset is printed there are two sections: Metadata, which holds the private fields, and Observed data, which holds the public fields. Both are grouped into a container called bewdley, which has the classes HydroImport and R6.
class(bewdley)
#> [1] "HydroImport" "R6"
Using R6 allows you to define private fields and methods in addition to public ones. Private fields and methods can only be accessed from within the class, not from outside. Public fields and methods, by contrast, can be accessed and modified directly.
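The public/private split can be illustrated with a minimal sketch. The class Gauge, its fields and the stationName() method are hypothetical names invented for this example; they are not part of riskyData, and the sketch assumes only that the R6 package is installed.

```r
library(R6)

# Hypothetical container for illustration; not the riskyData implementation
Gauge <- R6Class("Gauge",
  public = list(
    data = NULL,
    initialize = function(data, station) {
      self$data <- data
      private$station <- station
    },
    # Public method that gives read-only access to a private field
    stationName = function() private$station
  ),
  private = list(
    station = NULL
  )
)

g <- Gauge$new(data = data.frame(value = c(25.3, 25.5)), station = "Bewdley")
g$data$value      # public field: directly accessible
g$stationName()   # private field: reachable only through a method
g$station         # NULL: private members are not visible from outside
```

The private field is still stored inside the object, but the only route to it from outside is a method that the class chooses to expose.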
R6 is an implementation of encapsulated object-oriented programming for R, and is a simpler, faster, lighter-weight alternative to R's built-in reference classes. This style of programming is also sometimes referred to as classical object-oriented programming.
Some features of R6:
R6 objects have reference semantics.
R6 cleanly supports inheritance across packages.
R6 classes have public and private members.
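Reference semantics are the feature most likely to surprise users of ordinary R objects, so here is a short sketch. Counter is a hypothetical class made up for this example; it assumes only the R6 package.

```r
library(R6)

# Hypothetical class used only to demonstrate reference semantics
Counter <- R6Class("Counter",
  public = list(
    n = 0,
    add = function() {
      self$n <- self$n + 1
      invisible(self)
    }
  )
)

a <- Counter$new()
b <- a            # b refers to the same object, it is not a copy
b$add()
a$n               # 1: the change made through b is visible through a

d <- a$clone()    # clone() creates an independent copy
d$add()
a$n               # still 1
d$n               # 2
```

Assigning an R6 object to a new name does not copy it, so use $clone() when an independent copy is needed.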
In contrast to R's reference classes, R6 is not built on the S4 class system, so it does not require the methods package. Unlike reference classes, R6 classes can be cleanly inherited across different packages.
Public data
Public members
Public members are accessible from outside the class and can be used and modified directly.
To call the raw data in riskyData use $data. This adds one level of complexity compared with a normal data frame, but the benefits outweigh the costs.
To call up the data in the bewdley dataset use:
bewdley$data
#> dateTime value quality qcode
#> <POSc> <num> <char> <char>
#> 1: 2008-10-01 09:00:00 25.3 Good <NA>
#> 2: 2008-10-01 09:15:00 25.5 Good <NA>
#> 3: 2008-10-01 09:30:00 25.6 Good <NA>
#> 4: 2008-10-01 09:45:00 25.6 Good <NA>
#> 5: 2008-10-01 10:00:00 25.7 Good <NA>
#> ---
#> 490844: 2022-10-01 07:45:00 12.0 Good <NA>
#> 490845: 2022-10-01 08:00:00 11.8 Good <NA>
#> 490846: 2022-10-01 08:15:00 11.6 Good <NA>
#> 490847: 2022-10-01 08:30:00 11.6 Good <NA>
#> 490848: 2022-10-01 08:45:00 11.6 Good <NA>
As with a normal data frame, we can use functions from outside the riskyData package:
mean(bewdley$data$value, na.rm = TRUE)
#> [1] 61.3803
max(bewdley$data$value, na.rm = TRUE)
#> [1] 523
min(bewdley$data$value, na.rm = TRUE)
#> [1] 7.61
with(bewdley$data, plot(x = dateTime, y = value, type = 'l'))
Private data
Private members are only accessible from within the class, and they are encapsulated to ensure data integrity.
All the gauge metadata are stored within the private section, so you cannot directly interact with or edit these data. For example, let’s say I wished to change the start time of the bewdley dataset:
bewdley$start <- now()
#> Error in bewdley$start <- now() : cannot add bindings to a locked environment
The data are protected, which ensures that they can always be used by other dependent functions. The only way to interact with the private metadata is through the functions built into the container.
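The locked-environment behaviour can be reproduced without riskyData. The sketch below uses a hypothetical Station class (not the package's own) and assumes the default R6 setting in which objects are locked after creation. The exact wording of the error may differ between R versions, so it is captured rather than quoted.

```r
library(R6)

# Hypothetical class; the start time lives in the private section
Station <- R6Class("Station",
  public = list(
    initialize = function(start) private$start <- start,
    startTime = function() private$start
  ),
  private = list(start = NULL)
)

s <- Station$new(start = as.POSIXct("2008-10-01 09:00:00", tz = "GMT"))

# 'start' is not a public member, so this tries to add a new binding
# to a locked object and fails
msg <- tryCatch(
  {
    s$start <- Sys.time()
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
s$startTime()   # the private value is unchanged
```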
Active Bindings
Active bindings are special methods that allow you to compute values on-the-fly when accessing an attribute.
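As a rough sketch of the idea, the hypothetical Series class below defines an active binding called peak. Neither name comes from riskyData; the example assumes only the R6 package.

```r
library(R6)

# Hypothetical class with one active binding
Series <- R6Class("Series",
  public = list(
    values = NULL,
    initialize = function(values) self$values <- values
  ),
  active = list(
    # Recomputed on every access; used like a field, without parentheses
    peak = function(value) {
      if (!missing(value)) stop("peak is read-only", call. = FALSE)
      max(self$values, na.rm = TRUE)
    }
  )
)

x <- Series$new(c(25.3, 25.5, 25.6))
x$peak                          # 25.6
x$values <- c(x$values, 30.1)
x$peak                          # 30.1, recomputed from the updated values
```

Because the value is computed on access, it can never fall out of step with the underlying data.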
Container functions
Functions specific to the data stored within R6 containers are called methods. You interact with them differently from ordinary R functions: they are all applied using the $ operator, so you don’t have to wrap the object in parentheses.
To find all the methods available to you, use $methods() after the object name. For example:
bewdley$methods()
#> ┌ Methods ───────────────────────────────────────────────────────────────────────────────────┐
#> │ obj$data → Returns the raw data imported via the API │
#> │ obj$rating → Returns the user imported rating details │
#> │ obj$meta() → Returns the metadata associated with the object │
#> │ obj$asVol() → Calculates the volume of water relative to the time step, see ?asVol │
#> │ obj$hydroYearDay() → Calculates the hydrological year and day, see ?hydroYearDay │
#> │ obj$rmVol() → Removes the volume column │
#> │ obj$rmHY() → Removes the hydroYear column │
#> │ obj$rmHYD() → Removes the hydroYearDay column │
#> │ obj$summary() → Provides a quick summary of the raw data │
#> │ obj$coords() → Returns coordinates from the metadata │
#> │ obj$nrfa() → Returns the NRFA data from the metadata │
#> │ obj$dataAgg() → Aggregate data by, hour, day, month calendar year and hydroYear │
#> │ obj$rollingAggs() → Uses user specified aggregation timings, see ?rollingAggs │
#> │ obj$dayStats() → Daily statistics of flow, carried out on hydrological or calendar … │
#> │ obj$quality() → Provides a quick summary table of the data quality flags │
#> │ obj$missing() → Quickly finds the positions of missing data points │
#> │ obj$exceed → Show how many times observed data exceed a given threshold │
#> │ obj$plot() → Create a plot of each year of data, by hydrological year │
#> │ obj$window() → Extracts the subset of data observed between the times start and end │
#> │ obj$rateFlow() → Converts stage into a rated flow using the specified rating table │
#> │ obj$rateStage() → Converts flow into a rated stage using the specified rating table │
#> └────────────────────────────────────────────────────────────────────────────────────────────┘
Using the methods built into the containers, you can interact with the private metadata:
# Return NRFA details
bewdley$nrfa()
#> WISKI codeNRFA urlNRFA
#> <char> <char> <char>
#> 1: 2001 54001 https://nrfa.ceh.ac.uk/data/station/info/54001.html
# Return gauge coordinate data
bewdley$coords()
#> stationName WISKI Easting Northing Latitude Longitude
#> <char> <char> <int> <int> <num> <num>
#> 1: Bewdley 2001 378235 276165 52.38307 -2.321186
# Return all the metadata
bewdley$meta()
#> dataType modifications stationName riverName WISKI RLOID stationGuide
#> <char> <lgcl> <char> <char> <char> <char> <lgcl>
#> 1: Raw Import NA Bewdley River Severn 2001 2001 NA
#> baseURL
#> <char>
#> 1: http://environment.data.gov.uk/hydrology/id/measures/
#> dataURL
#> <char>
#> 1: 8820d897-a09e-4857-8095-5834fee6962f-flow-i-900-m3s-qualified/readings.json?_limit=2000000&mineq-date=2008-10-01&max-date=2022-10-02
#> measureURL idNRFA
#> <char> <char>
#> 1: 8820d897-a09e-4857-8095-5834fee6962f-flow-i-900-m3s-qualified 54001
#> urlNRFA easting northing
#> <char> <int> <int>
#> 1: https://nrfa.ceh.ac.uk/data/station/info/54001.html 378235 276165
#> latitude longitude area parameter unitName
#> <num> <num> <num> <char> <char>
#> 1: 52.38307 -2.321186 4325 Flow m3/s
#> unit datum boreholeDepth
#> <char> <lgcl> <lgcl>
#> 1: http://qudt.org/1.1/vocab/unit#CubicMeterPerSecond NA NA
#> aquifer start end timeStep timeZone records
#> <lgcl> <POSc> <POSc> <num> <char> <int>
#> 1: NA 2008-10-01 09:00:00 2022-10-01 08:45:00 900 GMT 490848
More details on methods will be covered in the methods vignette.
Inherited Members
Inherited members are methods and fields inherited from the parent class if your R6 class inherits from another.
When data are imported with the loadAPI() function, the container used is HydroImport. This is the parent class. If an aggregation method is applied, it can fundamentally change the data structure. For this reason a child class called HydroAggs was developed. Most functionality is inherited from the HydroImport parent class, but some methods had to be amended to ensure that they would still work.
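The parent/child relationship can be sketched with two hypothetical classes, Import and Aggs. These stand in for the idea only and are not the riskyData classes; the method names are also invented for the example.

```r
library(R6)

# Hypothetical parent class
Import <- R6Class("Import",
  public = list(
    data = NULL,
    initialize = function(data) self$data <- data,
    nObs = function() nrow(self$data),
    describe = function() paste("Raw import with", self$nObs(), "rows")
  )
)

# Hypothetical child class: inherits everything, amends one method
Aggs <- R6Class("Aggs",
  inherit = Import,
  public = list(
    # Overridden method; nObs() and initialize() are inherited unchanged
    describe = function() paste("Aggregated data with", self$nObs(), "rows")
  )
)

agg <- Aggs$new(data.frame(hourlyMax = c(25.6, 25.8)))
agg$describe()
class(agg)
```

The child only has to redefine the methods whose behaviour changes; everything else is picked up from the parent.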
To generate a HydroAggs dataset we can use the $dataAgg() method. Here we calculate the hourly maximum from the bewdley dataset and export it as class HydroAggs:
bewdley$dataAgg(type = "hourly", method = "max")
#>
#> ── Class: HydroAggs ────────────────────────────────────────────────────────────
#>
#> ── Private: ──
#>
#> Data Type: Aggregated - hourly max
#> Station name: Bewdley
#> WISKI ID: 2001
#> Data Type: Flow
#> Modifications: hourly max
#> Start: 2008-10-01 9
#> End: 2022-10-01 8
#> Time Step: Hourly Unstable
#> Easting: 378235
#> Northing: 276165
#> Longitude: -2.321186
#> Latitude: 52.383072
#>
#> ── Public: ──
#>
#> dateTime hourlyMax
#> <char> <num>
#> 1: 2008-10-01 9 25.6
#> 2: 2008-10-01 10 25.8
#> 3: 2008-10-01 11 25.8
#> 4: 2008-10-01 12 25.8
#> 5: 2008-10-01 13 25.8
#> ---
#> 122708: 2022-10-01 4 11.2
#> 122709: 2022-10-01 5 11.4
#> 122710: 2022-10-01 6 11.9
#> 122711: 2022-10-01 7 12.0
#> 122712: 2022-10-01 8 11.8
#> For more details use the $methods() function, the format should be as
#> `Object_name`$methods()
Though there are quite a lot of changes under the hood, the output looks very similar to class HydroImport. The public data are different because we have applied an aggregation, and in the private metadata the start and end times of the series differ. One of the key changes is the modifications line: all the containers used in riskyData track changes made to the data, and in this case it indicates that an “hourly max” aggregation has been applied.
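To make clear what an hourly maximum does to a 15-minute series, here is a base R sketch on a small toy data frame. It is not how $dataAgg() is implemented; the values are illustrative, and the hour labels are zero-padded, unlike the package output.

```r
# Toy 15-minute series; values are illustrative only
obs <- data.frame(
  dateTime = as.POSIXct("2008-10-01 09:00:00", tz = "GMT") + 900 * 0:7,
  value = c(25.3, 25.5, 25.6, 25.6, 25.7, 25.8, 25.8, 25.7)
)

# Label each observation with its date and hour, then take the maximum per hour
obs$hour <- format(obs$dateTime, "%Y-%m-%d %H")
hourly <- aggregate(value ~ hour, data = obs, FUN = max)
names(hourly) <- c("dateTime", "hourlyMax")
hourly
```

Eight 15-minute observations collapse into two hourly rows, which is why an aggregated series has far fewer records than the raw import.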