Sunday, 15 July 2012

How to convert yearly time data into specific hourly interval data in R




I have a time series dataset containing 10000 rows covering 1 year of data. It looks like this:

2012-01-01 06:23:02 c d10
2012-01-01 08:12:12 d d2
...........................
2012-12-31 08:22:24 s d5

It has 3 fields: date_time, category1, category2.

category1 contains 4 categorical values (c, v, d, s).
category2 contains 10 categorical values (d1 ... d10).

I want to calculate the individual counts of the categorical values c, v, d, s with respect to each of the categorical values d1 ... d10. That is, it should show how many c, v, d, s are present in d1, d2, ..., d10 for each hourly time frame 0-1, 1-2, ..., 22-23.

How can I represent the above data as a time series starting 1-2, 2-3, 3-4, ..., 23-24?

The sample output should look like this:

     1-2               2-3   3-4   ........   23-24
d1   c=2,d=3,v=3,s=4
d2   c=3,d=3,v=2,s=2
..................
d10

I have tried using the lubridate and data.table packages but couldn't find the expected solution.

The expected result is not clear. Maybe this helps:

    # Hour of day (0-23) for each timestamp
    indx <- with(dat1, as.numeric(format(as.POSIXct(cut(date_time,
                       breaks='hour')), '%H')))
    # Interval label such as "6-7"
    dat1$indx1 <- interaction(indx, indx+1, sep="-", lex.order=TRUE, drop=TRUE)
    dat1$date_time <- as.character(dat1$date_time)
    library(reshape2)
    # One row per category1/category2 pair, one column per hourly interval
    res1 <- dcast(dat1, category1+category2~indx1, value.var='date_time')
    res1[,-(1:2)] <- lapply(res1[,-(1:2)], as.POSIXct)
    head(res1,2)
    #  category1 category2  0-1                 1-2  2-3  3-4  4-5  5-6  6-7  7-8
    #1        c1        d1 <NA> 2012-01-03 01:43:02 <NA> <NA> <NA> <NA> <NA> <NA>
    #2        c1       d10 <NA>                <NA> <NA> <NA> <NA> <NA> <NA> <NA>
    #   8-9                9-10 10-11               11-12 12-13 13-14 14-15 15-16
    #1 <NA> 2012-01-01 09:13:02  <NA>                <NA>  <NA>  <NA>  <NA>  <NA>
    #2 <NA> 2012-01-02 09:43:02  <NA> 2012-01-02 11:03:02  <NA>  <NA>  <NA>  <NA>
    #  16-17 17-18 18-19 19-20 20-21 21-22 22-23 23-24
    #1  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
    #2  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>

Update

If you want the counts:

    # Use length as the aggregation function to count rows per cell
    res2 <- dcast(dat1, category1+category2~indx1, value.var='date_time', length)
    res2[1:3,1:3]
    #  category1 category2 0-1
    #1        c1        d1   0
    #2        c1       d10   0
    #3        c1       d11   0

Data

    set.seed(24)
    # format= is named explicitly; passed positionally it would be taken as tz
    dat1 <- data.frame(date_time=seq(as.POSIXct('2012-01-01 06:23:02',
                       format='%Y-%m-%d %H:%M:%S'), length.out=300, by='10 min'),
                       category1 = sample(paste0('c',1:20), 300, replace=TRUE),
                       category2 = sample(paste0('d', 1:20), 300, replace=TRUE))
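If the goal is the count of each category1 value (c, v, d, s) within each category2 value per hour-of-day interval, a three-way contingency table may be closer to the output sketched in the question. The following is only a minimal base R sketch under stated assumptions: the toy data frame `dat`, its seed, its size, and the UTC time zone are made up for illustration and are not from the question.

```r
# Hypothetical toy data shaped like the question's three fields (an assumption)
set.seed(1)
n <- 1000
dat <- data.frame(
  date_time = as.POSIXct("2012-01-01 00:00:00", tz = "UTC") +
    sample(0:(366 * 24 * 3600 - 1), n),          # random seconds within 2012
  category1 = sample(c("c", "v", "d", "s"), n, replace = TRUE),
  category2 = sample(paste0("d", 1:10), n, replace = TRUE)
)

# Hour of day (0-23), labelled as intervals "0-1", "1-2", ..., "23-24"
hr <- as.integer(format(dat$date_time, "%H"))
dat$interval <- factor(paste(hr, hr + 1, sep = "-"),
                       levels = paste(0:23, 1:24, sep = "-"))

# Three-way table of counts: category2 x interval x category1
counts <- xtabs(~ category2 + interval + category1, data = dat)

# Counts of c, d, s, v for d1 in the 6-7 interval
counts["d1", "6-7", ]

# Flat view: one row per category2/interval pair, one column per category1 value
flat <- ftable(counts, row.vars = c("category2", "interval"))
print(head(as.data.frame(counts)))
```

Declaring `interval` as a factor with all 24 levels keeps empty hours in the table as zero counts instead of dropping them, which matters if the result is later treated as a regular hourly series.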

r time-series data.table timespan
