How to convert yearly time data into specific hourly interval data in R
I have a time series dataset containing 10000 rows covering 1 year of information. It looks like this:

    2012-01-01 06:23:02 c d10
    2012-01-01 08:12:12 d d2
    ...........................
    2012-12-31 08:22:24 s d5
It has 3 fields: date_time, category1, category2. category1 contains 4 categorical values (c, v, d, s); category2 contains 10 categorical values (d1, ..., d10).

I want to calculate the individual count of each categorical value c, v, d, s with respect to each categorical value d1, ..., d10. That is, it should show how many c, v, d, s are present in d1, d2, ..., d10 for each time frame 0-1, 1-2, ..., 22-23.
How can I represent the above information as a time series starting 1-2, 2-3, 3-4, ..., 23-24?
The sample output should look like this:

        1-2 2-3 3-4 ........23-24
    d1  c=2,d=3,v=3
        s=4
    d2  c=3 d=3,v=2,s=2
    ..................
    d10

I have tried using the lubridate and data.table packages but couldn't find the expected solution.
The expected result is not clear. Maybe this helps:
    # Hour of day (0-23) for each row, taken from the hourly bin it falls into
    indx <- with(dat1, as.numeric(format(as.POSIXct(cut(date_time, breaks='hour')), '%H')))
    # Interval label such as "6-7", ordered by hour, unused levels dropped
    dat1$indx1 <- interaction(indx, indx+1, sep="-", lex.order=TRUE, drop=TRUE)
    dat1$date_time <- as.character(dat1$date_time)
    library(reshape2)
    # One row per category1/category2 pair, one column per hour interval
    res1 <- dcast(dat1, category1+category2~indx1, value.var='date_time')
    res1[,-(1:2)] <- lapply(res1[,-(1:2)], as.POSIXct)
    head(res1,2)
    #  category1 category2  0-1                 1-2  2-3  3-4  4-5  5-6  6-7  7-8
    #1        c1        d1 <NA> 2012-01-03 01:43:02 <NA> <NA> <NA> <NA> <NA> <NA>
    #2        c1       d10 <NA>                <NA> <NA> <NA> <NA> <NA> <NA> <NA>
    #   8-9                9-10 10-11               11-12 12-13 13-14 14-15 15-16
    #1 <NA> 2012-01-01 09:13:02  <NA>                <NA>  <NA>  <NA>  <NA>  <NA>
    #2 <NA> 2012-01-02 09:43:02  <NA> 2012-01-02 11:03:02  <NA>  <NA>  <NA>  <NA>
    #  16-17 17-18 18-19 19-20 20-21 21-22 22-23 23-24
    #1  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
    #2  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
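The hour-interval labelling step can also be looked at in isolation. The following is a minimal base R sketch, not the answer's own method: it reads the hour with `format(x, "%H")` instead of `cut()`, and it fixes the factor levels so that all 24 intervals keep their chronological order even when some hours have no rows. The three timestamps are the sample rows from the question; the `tz = "UTC"` setting and the variable names `x`, `hr` and `lab` are assumptions made for illustration.

```r
# Timestamps from the sample rows in the question (time zone is an assumption)
x <- as.POSIXct(c("2012-01-01 06:23:02", "2012-01-01 08:12:12",
                  "2012-12-31 08:22:24"), tz = "UTC")

# Hour of day as an integer in 0..23
hr <- as.integer(format(x, "%H"))

# Interval label "h-(h+1)"; explicit levels keep the 24 intervals in
# chronological order, including hours with no observations
lab <- factor(paste(hr, hr + 1, sep = "-"),
              levels = paste(0:23, 1:24, sep = "-"))

# Number of rows per interval for the two intervals that occur here
table(lab)[c("6-7", "8-9")]
```

Keeping the empty levels means a later cross-tabulation has one column per hour regardless of which hours actually occur in the data.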
Update: if you want the counts:
    res2 <- dcast(dat1, category1+category2~indx1, value.var='date_time', length)
    res2[1:3,1:3]
    #  category1 category2 0-1
    #1        c1        d1   0
    #2        c1       d10   0
    #3        c1       d11   0
Data:

    set.seed(24)
    dat1 <- data.frame(
      date_time = seq(as.POSIXct('2012-01-01 06:23:02', format='%Y-%m-%d %H:%M:%S'),
                      length.out=300, by='10 min'),
      category1 = sample(paste0('c', 1:20), 300, replace=TRUE),
      category2 = sample(paste0('d', 1:20), 300, replace=TRUE))
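The question's sample output suggests one row per category2 value, one column per hour interval, and cells such as `c=2,d=3,v=3` that list the count of each category1 value. Since the expected layout is not fully clear, the following base R sketch is only one possible reading of it. The six-row data frame `dat` is made up for illustration, and the names `cnt`, `cells` and `wide` are hypothetical.

```r
# Small made-up data set in the shape described in the question
dat <- data.frame(
  date_time = as.POSIXct(c("2012-01-01 01:10:00", "2012-01-01 01:40:00",
                           "2012-01-02 01:05:00", "2012-01-01 02:30:00",
                           "2012-01-03 02:45:00", "2012-01-01 03:15:00"),
                         tz = "UTC"),
  category1 = c("c", "c", "d", "s", "s", "v"),
  category2 = c("d1", "d1", "d1", "d1", "d2", "d2"),
  stringsAsFactors = FALSE
)

# Hour-interval label with all 24 levels in chronological order
hr <- as.integer(format(dat$date_time, "%H"))
dat$interval <- factor(paste(hr, hr + 1, sep = "-"),
                       levels = paste(0:23, 1:24, sep = "-"))

# Count rows per (category2, interval, category1) and drop empty combinations
cnt <- as.data.frame(table(category2 = dat$category2,
                           interval  = dat$interval,
                           category1 = dat$category1),
                     stringsAsFactors = FALSE)
cnt <- cnt[cnt$Freq > 0, ]
cnt$cell <- paste(cnt$category1, cnt$Freq, sep = "=")

# Collapse the per-category1 counts into one string per (category2, interval)
cells <- aggregate(cell ~ category2 + interval, data = cnt,
                   FUN = paste, collapse = ",")

# Fill a character matrix: rows are category2 values, columns are intervals
rows <- sort(unique(dat$category2))
cols <- levels(dat$interval)
wide <- matrix("", nrow = length(rows), ncol = length(cols),
               dimnames = list(rows, cols))
wide[cbind(as.character(cells$category2),
           as.character(cells$interval))] <- cells$cell

# Show only the intervals that contain data in this toy example
wide[, c("1-2", "2-3", "3-4")]
```

Cells with no observations are left as empty strings here; using `NA` instead would be an equally reasonable choice.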
r time-series data.table timespan