Containers in riskyData

If you have seen the previous vignette, riskyData-Welcome, you may have noticed that data pulled from the API are not in a conventional format. When data are retrieved with the loadAPI() function, the gauge metadata are pulled at the same time.

Using the bewdley dataset, we can investigate:

library(riskyData)
data(bewdley)
bewdley
#> 
#> ── Class: HydroImport ──────────────────────────────────────────────────────────
#> 
#> ── Metadata: ──
#> 
#> Data Type: Raw Import
#> Station name: Bewdley
#> WISKI ID: 2001
#> Parameter Type: Flow
#> Modifications: NA
#> Start: 2008-10-01 09:00:00
#> End: 2022-10-01 08:45:00
#> Time Step: 900
#> Observations: 490848
#> Easting: 378235
#> Northing: 276165
#> Longitude: -2.321186
#> Latitude: 52.383072
#> 
#> ── Observed data: ──
#> 
#>                    dateTime value quality  qcode
#>                      <POSc> <num>  <char> <char>
#>      1: 2008-10-01 09:00:00  25.3    Good   <NA>
#>      2: 2008-10-01 09:15:00  25.5    Good   <NA>
#>      3: 2008-10-01 09:30:00  25.6    Good   <NA>
#>      4: 2008-10-01 09:45:00  25.6    Good   <NA>
#>      5: 2008-10-01 10:00:00  25.7    Good   <NA>
#>     ---                                         
#> 490844: 2022-10-01 07:45:00  12.0    Good   <NA>
#> 490845: 2022-10-01 08:00:00  11.8    Good   <NA>
#> 490846: 2022-10-01 08:15:00  11.6    Good   <NA>
#> 490847: 2022-10-01 08:30:00  11.6    Good   <NA>
#> 490848: 2022-10-01 08:45:00  11.6    Good   <NA>
#> For more details use the $methods() function, the format should be as
#> `Object_name`$methods()

When the dataset is printed there are two sections: the metadata, which are private, and the observed data, which are public. Both are grouped into a single container called bewdley, which has the classes HydroImport and R6.

class(bewdley)
#> [1] "HydroImport" "R6"

R6 allows you to define private fields and methods in addition to public ones. Private fields and methods can only be accessed from within the class, not from outside it, whereas public fields can be read and modified directly.
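The difference between public and private members can be sketched with a small R6 class. The Gauge class and its stationName() method are hypothetical names invented for this illustration; they are not the riskyData implementation.

```r
library(R6)

# Hypothetical class for illustration only; not how riskyData is written
Gauge <- R6Class("Gauge",
  public = list(
    data = NULL,
    initialize = function(data, station) {
      self$data <- data
      private$station <- station
    },
    # A public method is the only route to the private field
    stationName = function() {
      private$station
    }
  ),
  private = list(
    station = NULL
  )
)

g <- Gauge$new(data = c(25.3, 25.5, 25.6), station = "Bewdley")

g$data                 # public field: readable from outside
g$data <- g$data * 2   # ...and modifiable
g$station              # NULL: the private field is not visible from outside
g$stationName()        # "Bewdley", returned through a public method
```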

R6 is an implementation of encapsulated object-oriented programming for R, and is a simpler, faster, lighter-weight alternative to R's built-in reference classes. This style of programming is also sometimes referred to as classical object-oriented programming.

Some features of R6:

  • R6 objects have reference semantics.

  • R6 cleanly supports inheritance across packages.

  • R6 classes have public and private members.
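Reference semantics are the feature most likely to surprise users of ordinary R objects. The sketch below uses a hypothetical Counter class, not part of riskyData, to show that assigning an R6 object to a new name does not copy it, and that clone() is needed for an independent copy.

```r
library(R6)

# Hypothetical class for illustration only
Counter <- R6Class("Counter",
  public = list(
    n = 0,
    add = function() {
      self$n <- self$n + 1
      invisible(self)
    }
  )
)

a <- Counter$new()
b <- a             # b refers to the same object; nothing is copied
b$add()
a$n                # 1: the change made through b is visible through a

copy <- a$clone()  # clone() creates an independent copy
copy$add()
a$n                # still 1
copy$n             # 2
```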

In contrast to R's reference classes, R6 is not built on the S4 class system, so it does not require the methods package. Unlike reference classes, R6 classes can be cleanly inherited across different packages.

Public data

Public members are accessible from outside the class and can be used and modified directly.

To call the raw data in riskyData, use $data. This adds one level of complexity compared with a normal data frame, but the benefits outweigh the drawbacks.

To call up the data in the bewdley dataset, use:

bewdley$data
#>                    dateTime value quality  qcode
#>                      <POSc> <num>  <char> <char>
#>      1: 2008-10-01 09:00:00  25.3    Good   <NA>
#>      2: 2008-10-01 09:15:00  25.5    Good   <NA>
#>      3: 2008-10-01 09:30:00  25.6    Good   <NA>
#>      4: 2008-10-01 09:45:00  25.6    Good   <NA>
#>      5: 2008-10-01 10:00:00  25.7    Good   <NA>
#>     ---                                         
#> 490844: 2022-10-01 07:45:00  12.0    Good   <NA>
#> 490845: 2022-10-01 08:00:00  11.8    Good   <NA>
#> 490846: 2022-10-01 08:15:00  11.6    Good   <NA>
#> 490847: 2022-10-01 08:30:00  11.6    Good   <NA>
#> 490848: 2022-10-01 08:45:00  11.6    Good   <NA>

As with a normal data frame, the data can be passed to functions from outside the riskyData package:

mean(bewdley$data$value, na.rm = TRUE)
#> [1] 61.3803
max(bewdley$data$value, na.rm = TRUE)
#> [1] 523
min(bewdley$data$value, na.rm = TRUE)
#> [1] 7.61
with(bewdley$data, plot(x = dateTime, y = value, type = 'l'))

Private data

Private members are only accessible from within the class, and they are encapsulated to ensure data integrity.

All the gauge metadata are stored within the private section, so you cannot directly interact with or edit them. For example, let’s say I wished to change the start time of the bewdley dataset:

bewdley$start <- now()
#> Error in bewdley$start <- now() : cannot add bindings to a locked environment

The data are protected, which ensures that they can be used at all times in other dependent functions. The only way to interact with the private metadata is through functions built into the container.
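The same behaviour can be reproduced with a small stand-alone R6 class. This is a sketch under assumptions: the Station class and its meta() method are hypothetical and only mimic the pattern described here. By default R6 locks the object environment, so adding a new binding from outside fails, while the private value remains readable through a method.

```r
library(R6)

# Hypothetical class for illustration only; not the riskyData source
Station <- R6Class("Station",
  public = list(
    initialize = function(start) {
      private$start <- start
    },
    # Public method that exposes the private metadata read-only
    meta = function() {
      list(start = private$start)
    }
  ),
  private = list(
    start = NULL
  )
)

s <- Station$new(start = as.POSIXct("2008-10-01 09:00:00", tz = "GMT"))

# Capture the error message rather than stopping the script
res <- tryCatch(
  {
    s$start <- Sys.time()
    "assigned"
  },
  error = function(e) conditionMessage(e)
)
res              # an error message about a locked environment

s$meta()$start   # the private value is unchanged
```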

Active Bindings

Active bindings look like fields but are backed by a function, so their value is computed on the fly each time they are accessed.
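A minimal sketch of an active binding, using a hypothetical Series class with a peak binding (neither is taken from riskyData). The binding is accessed without parentheses, like a field, but is recomputed on every access.

```r
library(R6)

# Hypothetical class for illustration only
Series <- R6Class("Series",
  public = list(
    value = NULL,
    initialize = function(value) {
      self$value <- value
    }
  ),
  active = list(
    # Called with no argument on read, and with one argument on assignment
    peak = function(x) {
      if (!missing(x)) stop("peak is read-only", call. = FALSE)
      max(self$value, na.rm = TRUE)
    }
  )
)

s <- Series$new(c(25.3, 25.5, 25.6))
s$peak                        # 25.6
s$value <- c(s$value, 30.1)
s$peak                        # 30.1: recomputed from the updated data
```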

Container functions

Functions specific to the data stored within R6 containers are called methods. You call them differently from ordinary R functions: they are applied with the $ operator after the object name, so you don’t have to pass the object as an argument inside the parentheses.

To find all the methods available to you, use $methods() after the object name. For example:

bewdley$methods()
#> ┌ Methods ───────────────────────────────────────────────────────────────────────────────────┐
#> │ obj$data    →    Returns the raw data imported via the API                                 │
#> │ obj$rating    →    Returns the user imported rating details                                │
#> │ obj$meta()    →    Returns the metadata associated with the object                         │
#> │ obj$asVol()    →    Calculates the volume of water relative to the time step, see ?asVol   │
#> │ obj$hydroYearDay()    →    Calculates the hydrological year and day, see ?hydroYearDay     │
#> │ obj$rmVol()    →    Removes the volume column                                              │
#> │ obj$rmHY()    →    Removes the hydroYear column                                            │
#> │ obj$rmHYD()    →    Removes the hydroYearDay column                                        │
#> │ obj$summary()    →    Provides a quick summary of the raw data                             │
#> │ obj$coords()    →    Returns coordinates from the metadata                                 │
#> │ obj$nrfa()    →    Returns the NRFA data from the metadata                                 │
#> │ obj$dataAgg()    →    Aggregate data by, hour, day, month calendar year and hydroYear      │
#> │ obj$rollingAggs()    →    Uses user specified aggregation timings, see ?rollingAggs        │
#> │ obj$dayStats()    →    Daily statistics of flow, carried out on hydrological or calendar … │
#> │ obj$quality()    →    Provides a quick summary table of the data quality flags             │
#> │ obj$missing()    →    Quickly finds the positions of missing data points                   │
#> │ obj$exceed    →    Show how many times observed data exceed a given threshold              │
#> │ obj$plot()    →    Create a plot of each year of data, by hydrological year                │
#> │ obj$window()    →    Extracts the subset of data observed between the times start and end  │
#> │ obj$rateFlow()    →    Converts stage into a rated flow using the specified rating table   │
#> │ obj$rateStage()    →    Converts flow into a rated stage using the specified rating table  │
#> └────────────────────────────────────────────────────────────────────────────────────────────┘

Using the methods built into the containers, you can interact with the private metadata:

# Return NRFA details
bewdley$nrfa()
#>     WISKI codeNRFA                                             urlNRFA
#>    <char>   <char>                                              <char>
#> 1:   2001    54001 https://nrfa.ceh.ac.uk/data/station/info/54001.html

# Return gauge coordinate data
bewdley$coords()
#>    stationName  WISKI Easting Northing Latitude Longitude
#>         <char> <char>   <int>    <int>    <num>     <num>
#> 1:     Bewdley   2001  378235   276165 52.38307 -2.321186

# Return all the metadata
bewdley$meta()
#>      dataType modifications stationName    riverName  WISKI  RLOID stationGuide
#>        <char>        <lgcl>      <char>       <char> <char> <char>       <lgcl>
#> 1: Raw Import            NA     Bewdley River Severn   2001   2001           NA
#>                                                  baseURL
#>                                                   <char>
#> 1: http://environment.data.gov.uk/hydrology/id/measures/
#>                                                                                                                                 dataURL
#>                                                                                                                                  <char>
#> 1: 8820d897-a09e-4857-8095-5834fee6962f-flow-i-900-m3s-qualified/readings.json?_limit=2000000&mineq-date=2008-10-01&max-date=2022-10-02
#>                                                       measureURL idNRFA
#>                                                           <char> <char>
#> 1: 8820d897-a09e-4857-8095-5834fee6962f-flow-i-900-m3s-qualified  54001
#>                                                urlNRFA easting northing
#>                                                 <char>   <int>    <int>
#> 1: https://nrfa.ceh.ac.uk/data/station/info/54001.html  378235   276165
#>    latitude longitude  area parameter unitName
#>       <num>     <num> <num>    <char>   <char>
#> 1: 52.38307 -2.321186  4325      Flow     m3/s
#>                                                  unit  datum boreholeDepth
#>                                                <char> <lgcl>        <lgcl>
#> 1: http://qudt.org/1.1/vocab/unit#CubicMeterPerSecond     NA            NA
#>    aquifer               start                 end timeStep timeZone records
#>     <lgcl>              <POSc>              <POSc>    <num>   <char>   <int>
#> 1:      NA 2008-10-01 09:00:00 2022-10-01 08:45:00      900      GMT  490848

More details on methods will be covered in the methods vignette.

Inherited Members

Inherited members are the methods and fields that an R6 class receives from its parent class when it inherits from another class.

When data are imported with the loadAPI() function, the container used is HydroImport. This is the parent class. If an aggregation method is applied, it can fundamentally change the data structure. For this reason a child class called HydroAggs was developed. Most functionality is inherited from the HydroImport parent class; however, some methods had to be amended to ensure that they would still work.

To generate a HydroAggs dataset we can use the $dataAgg() method. Here we calculate the hourly maximum from the bewdley dataset and return it as class HydroAggs:
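The parent and child relationship can be sketched with two small R6 classes. The Import and Aggs classes below are hypothetical stand-ins, not the HydroImport and HydroAggs source: the child inherits one method unchanged and overrides another, calling the parent version through super$.

```r
library(R6)

# Hypothetical parent class for illustration only
Import <- R6Class("Import",
  public = list(
    data = NULL,
    initialize = function(data) {
      self$data <- data
    },
    summary = function() {
      range(self$data)
    },
    describe = function() {
      "Raw import"
    }
  )
)

# Hypothetical child class: inherits summary(), amends describe()
Aggs <- R6Class("Aggs",
  inherit = Import,
  public = list(
    describe = function() {
      paste("Aggregated from:", super$describe())
    }
  )
)

a <- Aggs$new(c(11.6, 25.3, 523))
class(a)       # "Aggs" "Import" "R6"
a$summary()    # inherited from the parent unchanged
a$describe()   # overridden in the child
```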

bewdley$dataAgg(type = "hourly", method = "max")
#> 
#> ── Class: HydroAggs ────────────────────────────────────────────────────────────
#> 
#> ── Private: ──
#> 
#> Data Type: Aggregated - hourly max
#> Station name: Bewdley
#> WISKI ID: 2001
#> Data Type: Flow
#> Modifications: hourly max
#> Start: 2008-10-01 9
#> End: 2022-10-01 8
#> Time Step: Hourly Unstable
#> Easting: 378235
#> Northing: 276165
#> Longitude: -2.321186
#> Latitude: 52.383072
#> 
#> ── Public: ──
#> 
#>              dateTime hourlyMax
#>                <char>     <num>
#>      1:  2008-10-01 9      25.6
#>      2: 2008-10-01 10      25.8
#>      3: 2008-10-01 11      25.8
#>      4: 2008-10-01 12      25.8
#>      5: 2008-10-01 13      25.8
#>     ---                        
#> 122708:  2022-10-01 4      11.2
#> 122709:  2022-10-01 5      11.4
#> 122710:  2022-10-01 6      11.9
#> 122711:  2022-10-01 7      12.0
#> 122712:  2022-10-01 8      11.8
#> For more details use the $methods() function, the format should be as
#> `Object_name`$methods()

Though there are quite a lot of changes under the hood, the output looks very similar to that of class HydroImport. The public data are different because an aggregation has been applied, and in the private metadata the start and end times of the series differ. One of the key changes is the Modifications line: all the containers used in riskyData track changes made to the data, and in this case it indicates that an “hourly max” aggregation has been applied.