Channel: Recent Questions - Stack Overflow

Delimit precipitation events from xts objects using `split.zoo`


I'd like to split an xts object of precipitation depths into chunks so that I can delimit individual events in a continuous record of observations for subsequent analysis.

Let's assume I'm working with this data:

datetime <- seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
                by = "1 min",
                length.out = 1440)
vals <- rep(0, length(datetime))
x <- xts::xts(vals, order.by = datetime)

# fill xts object with some random values
# event #1
zoo::coredata(x["2020-01-01 00:30/2020-01-01 02:35"]) <- runif(126, min = 0.01, max = 0.2)
# event #2
zoo::coredata(x["2020-01-01 08:45/2020-01-01 12:50"]) <- runif(246, min = 0.01, max = 0.2)
# event #3
zoo::coredata(x["2020-01-01 17:15/2020-01-01 17:30"]) <- runif(16, min = 0.01, max = 0.2)
zoo::coredata(x["2020-01-01 18:15/2020-01-01 19:00"]) <- runif(46, min = 0.01, max = 0.2)
zoo::coredata(x["2020-01-01 22:30/2020-01-01 23:00"]) <- runif(31, min = 0.01, max = 0.2)

In order to delimit events, I'd like them to meet the following criterion:

An event starts with the first value > 0 and ends with the last value > 0, provided no further precipitation is recorded during the following 4 hours.

From my research, I'd need a factor vector of length `length(x)` whose levels correspond to the "event id", to be used as input for `split.zoo`. Event #1 would be characterized by level 1, event #2 by level 2, and so on. I'm not interested in precipitation breaks per se, so they could simply be mapped to 0 (or even dismissed at this point).
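One way to construct such a grouping factor (a sketch, not a definitive answer): run-length encode the wet/dry mask and absorb dry spells shorter than the 4 h threshold into the surrounding event. `make_event_ids` and its `gap` parameter are hypothetical names; `gap` is the dry-spell length in observations (240 for 4 h of 1-minute data):

```r
# Sketch: build a grouping factor from a wet/dry run-length encoding.
make_event_ids <- function(x, gap = 240) {
  wet <- as.numeric(zoo::coredata(x)) > 0
  r <- rle(wet)
  # dry runs shorter than `gap` that sit between two wet runs
  # are absorbed into the surrounding event
  inside <- !r$values & r$lengths < gap
  inside[c(1, length(inside))] <- FALSE  # leading/trailing dry spells stay dry
  r$values <- r$values | inside
  merged <- inverse.rle(r)
  # number the wet runs consecutively; dry periods get id 0
  r2 <- rle(merged)
  ids <- ifelse(r2$values, cumsum(r2$values), 0)
  factor(inverse.rle(list(lengths = r2$lengths, values = ids)))
}
```

With that in place, `split(x, make_event_ids(x))` should yield one list element per event plus one for the remaining dry record; whether `rle()` is fast enough on decades of minutely data would need benchmarking.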

My expected result would be a list of zoo/xts objects encompassing the individual events:

# looking for `g` in the end, making use of some efficient rolling approach
split(x, g) |> str()
#> List of 4
#>  $ 0:'zoo' series from 2020-01-01 to 2020-01-01 23:59:00
#>   Data: num [1:722, 1] 0 0 0 0 0 0 0 0 0 0 ...
#>   Index:  POSIXct[1:722], format: "2020-01-01 00:00:00" "2020-01-01 00:01:00" ...
#>  $ 1:'zoo' series from 2020-01-01 00:30:00 to 2020-01-01 02:35:00
#>   Data: num [1:126, 1] 0.1737 0.0958 0.0491 0.1861 0.1877 ...
#>   Index:  POSIXct[1:126], format: "2020-01-01 00:30:00" "2020-01-01 00:31:00" ...
#>  $ 2:'zoo' series from 2020-01-01 08:45:00 to 2020-01-01 12:50:00
#>   Data: num [1:246, 1] 0.1136 0.1473 0.0433 0.1311 0.1741 ...
#>   Index:  POSIXct[1:246], format: "2020-01-01 08:45:00" "2020-01-01 08:46:00" ...
#>  $ 3:'zoo' series from 2020-01-01 17:15:00 to 2020-01-01 23:00:00
#>   Data: num [1:346, 1] 0.1614 0.0632 0.1216 0.1888 0.0967 ...
#>   Index:  POSIXct[1:346], format: "2020-01-01 17:15:00" "2020-01-01 17:16:00" ...

Since my real-world minutely data span decades, I'd prefer a fast approach. Ideally, the solution would also work with other temporal resolutions, e.g. 5-minute or hourly data.
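To keep the approach resolution-agnostic, the 4 h threshold could be converted into a number of observations from the sampling interval instead of being hard-coded; this assumes a regular interval, and `gap_steps` is a hypothetical name:

```r
# derive the number of observations corresponding to a 4 h dry spell
# from the (assumed regular) sampling interval of the index
idx <- seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"), by = "5 min", length.out = 288)
step_sec  <- as.numeric(median(diff(idx)), units = "secs")
gap_steps <- ceiling(4 * 3600 / step_sec)
gap_steps  # 48 for 5-minutely data (240 for 1-min, 4 for hourly)
```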

