I have a set of customers, their daily activity logs, and a flag for whether an issue was raised on their account. A customer can have an issue raised multiple times over their time as a customer, and an issue can stay raised for multiple days. I want to work out how long each issue stayed raised on each account. The dates can start anywhere and an issue can occur at any time; the flag is stored as TRUE = 1 and FALSE = 0 in the sample below.
Some sample data:
    df <- data.frame(
      customer = c('AB','AB','AB','AB','AB','BC','BC','BC','CD','CD','CD','CD'),
      date = as.Date(c("11/09/2000","12/09/2000","13/09/2000","14/09/2000","15/09/2000",
                       "13/09/2000","14/09/2000","15/09/2000",
                       "23/05/2001","24/05/2001","25/05/2001","26/05/2001"), "%d/%m/%Y"),
      issue = c(0,1,1,1,1,0,0,1,1,0,1,1)
    )
I tried making an index counter, along with some other variations found in the thread "Calculate days since last event in R", but none of them count consecutive days (i.e. AB kept showing a duration of 1 on each day instead of 1, 2, 3, 4), as shown below:
customer | date | issue | duration |
---|---|---|---|
AB | 2000-09-11 | 0 | 0 |
AB | 2000-09-12 | 1 | 1 |
AB | 2000-09-13 | 1 | 1 |
AB | 2000-09-14 | 1 | 1 |
AB | 2000-09-15 | 1 | 1 |
I need the output to look something like this:
customer | date | issue | duration |
---|---|---|---|
AB | 2000-09-11 | 0 | 0 |
AB | 2000-09-12 | 1 | 1 |
AB | 2000-09-13 | 1 | 2 |
AB | 2000-09-14 | 1 | 3 |
AB | 2000-09-15 | 1 | 4 |
BC | 2000-09-13 | 0 | 0 |
BC | 2000-09-14 | 0 | 0 |
BC | 2000-09-15 | 1 | 1 |
CD | 2001-05-23 | 1 | 1 |
CD | 2001-05-24 | 0 | 0 |
CD | 2001-05-25 | 1 | 1 |
CD | 2001-05-26 | 1 | 2 |
Any help would be great. Thanks!
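For reference, here is a minimal base R sketch of one way the running count could be produced. It is not taken from the linked thread. It assumes there is exactly one row per customer per calendar day with no gaps, so counting rows is the same as counting days. The helper name `run_length` is hypothetical and only used for illustration.

```r
df <- data.frame(
  customer = c("AB","AB","AB","AB","AB","BC","BC","BC","CD","CD","CD","CD"),
  date = as.Date(c("11/09/2000","12/09/2000","13/09/2000","14/09/2000","15/09/2000",
                   "13/09/2000","14/09/2000","15/09/2000",
                   "23/05/2001","24/05/2001","25/05/2001","26/05/2001"), "%d/%m/%Y"),
  issue = c(0,1,1,1,1,0,0,1,1,0,1,1)
)

# Sort so that each customer's rows are in date order (already true for the sample).
df <- df[order(df$customer, df$date), ]

# Hypothetical helper: running count of consecutive 1s, resetting to 0 at every 0.
run_length <- function(x) {
  # Every 0 starts a new stretch, so cumsum(x == 0) gives each stretch its own id.
  run_id <- cumsum(x == 0)
  # Within a stretch, the cumulative sum counts the 1s seen so far.
  ave(x, run_id, FUN = cumsum)
}

# Apply the helper separately to each customer's issue column.
df$duration <- ave(df$issue, df$customer, FUN = run_length)

print(df)
```

If days can be missing from the log, this row-based count would no longer equal elapsed days, and the stretches would also need to be split wherever the gap between consecutive dates is more than one day.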