Saturday, 15 June 2013

r - Fill in NA's based on logic for the whole day




I did a poor job of asking this question the first time around, and I apologize. I've simplified the question so that it makes more sense.

My goal is to create a script that assigns the NA's in master_df_ex$ls_flag so that the sum of ls_flag is 0 for each asof_dt.

I have an algorithm whose output has four columns: date (asof_dt), rank (rank_mag), updn_flag, and ls_flag. The rank and updn_flag are determined by the algorithm. ls_flag takes the value of updn_flag if the rank is in the top 50% (in this case, since there are 4 rows per day, if the rank is less than or equal to 2, updn_flag is used as ls_flag).

    library(dplyr)

    asof_dt <- c("2014-10-01", "2014-10-01", "2014-10-01", "2014-10-01",
                 "2014-10-02", "2014-10-02", "2014-10-02", "2014-10-02",
                 "2014-10-03", "2014-10-03", "2014-10-03", "2014-10-03")
    rank_mag <- c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
    updn_flag <- c(-1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, -1)
    ls_flag <- c(-1, -1, NA, NA, 1, 1, NA, NA, -1, 1, NA, NA)
    master_df_ex <- data.frame(asof_dt, rank_mag, updn_flag, ls_flag)
    master_df_ex <- group_by(master_df_ex, asof_dt)
    arrange(master_df_ex, asof_dt, rank_mag)

    > arrange(master_df_ex, asof_dt, rank_mag)
          asof_dt rank_mag updn_flag ls_flag
    1  2014-10-01        1        -1      -1
    2  2014-10-01        2        -1      -1
    3  2014-10-01        3         1      NA
    4  2014-10-01        4        -1      NA
    5  2014-10-02        1         1       1
    6  2014-10-02        2         1       1
    7  2014-10-02        3         1      NA
    8  2014-10-02        4        -1      NA
    9  2014-10-03        1        -1      -1
    10 2014-10-03        2         1       1
    11 2014-10-03        3        -1      NA
    12 2014-10-03        4        -1      NA
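The top-50% rule described above can be sketched in base R as follows. The helper name derive_ls_flag is hypothetical and not part of the original question, and the cutoff "rank <= number of rows per day / 2" is an assumption about how "top 50%" should be read when a day has an odd number of rows.

```r
# Hypothetical helper (not from the original post): keep updn_flag for the
# top half of the ranks within each day and leave the rest as NA.
derive_ls_flag <- function(rank_mag, updn_flag, day) {
  # Number of rows in each day, repeated for every row of that day
  n_per_day <- ave(rank_mag, day, FUN = length)
  # Assumption: "top 50%" means rank <= half of the day's row count
  ifelse(rank_mag <= n_per_day / 2, updn_flag, NA)
}

asof_dt   <- rep(c("2014-10-01", "2014-10-02", "2014-10-03"), each = 4)
rank_mag  <- rep(1:4, times = 3)
updn_flag <- c(-1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, -1)

derived <- derive_ls_flag(rank_mag, updn_flag, asof_dt)
print(derived)
```

With the example data this should reproduce the ls_flag column shown above, without hardcoding 4 rows per day.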

Again, my goal is to create a script that assigns the NA's in master_df_ex$ls_flag so that the sum of ls_flag is 0 for each asof_dt.

For 2014-10-01, since both assigned ls_flags are -1, both NA's should be -1.

For 2014-10-02, since both assigned ls_flags are 1, both NA's should be -1.

For 2014-10-03, since there's 1 of each, I want rank 3 to take its updn_flag's -1 first, and then rank 4 to have whatever makes the sum on that day 0 (in this case, 1).

One caveat to note is that I don't want to hardcode 4 per day. The number of rows may vary from day to day.

I am not sure if I need a loop or a work table to make this work. Please let me know. Thank you!

I think there is an error in the question: if "the sum of ls_flag is 0 for each asof_dt", then the NA's for 2014-10-01 should be 1, shouldn't they?

If I am right, you can use the following function:

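    library(dplyr)

    flag_function <- function(ls_flag) {
      ind <- which(is.na(ls_flag))         # positions of the NA's to fill
      na_count <- length(ind)
      count <- sum(ls_flag, na.rm = TRUE)  # imbalance of the flags already assigned
      # First cancel the imbalance, then pad the remaining NA's with alternating -1, 1
      ls_flag[ind] <- c(rep(-sign(count), abs(count)),
                        rep_len(c(-1, 1), na_count - abs(count)))
      ls_flag
    }

    master_df_ex %>%
      group_by(asof_dt) %>%
      mutate(ls_flag = flag_function(ls_flag))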
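To see how the fill values are built, here is a small trace of the same arithmetic on a single day's vector. The helper trace_fill is hypothetical and only exposes the intermediate pieces; it is a sketch, not part of the original answer.

```r
# Hypothetical helper: return the intermediate pieces of the fill logic
# for one day's ls_flag vector instead of the filled vector itself.
trace_fill <- function(ls_flag) {
  ind <- which(is.na(ls_flag))          # positions still to be filled
  count <- sum(ls_flag, na.rm = TRUE)   # imbalance of the assigned flags
  # Values of the opposite sign that cancel the imbalance
  offset <- rep(-sign(count), abs(count))
  # Alternating -1, 1 for whatever NA's are left over
  padding <- rep_len(c(-1, 1), length(ind) - abs(count))
  list(count = count, offset = offset, padding = padding)
}

one_sided <- trace_fill(c(-1, -1, NA, NA))  # a day like 2014-10-01
mixed     <- trace_fill(c(-1, 1, NA, NA))   # a day like 2014-10-03
str(one_sided)
str(mixed)
```

Note that the approach assumes a day has at least as many NA's as the imbalance of its assigned flags, and that the number of NA's left over is even. Otherwise no fill can make the sum exactly 0.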

Result:

    Source: local data frame [12 x 4]
    Groups: asof_dt

          asof_dt rank_mag updn_flag ls_flag
    1  2014-10-01        1        -1      -1
    2  2014-10-01        2        -1      -1
    3  2014-10-01        3         1       1
    4  2014-10-01        4        -1       1
    5  2014-10-02        1         1       1
    6  2014-10-02        2         1       1
    7  2014-10-02        3         1      -1
    8  2014-10-02        4        -1      -1
    9  2014-10-03        1        -1      -1
    10 2014-10-03        2         1       1
    11 2014-10-03        3        -1      -1
    12 2014-10-03        4        -1       1

Checking that the sum is 0 for each day:

    master_df_ex %>%
      group_by(asof_dt) %>%
      mutate(ls_flag = flag_function(ls_flag)) %>%
      summarise(sum(ls_flag))

It works:

         asof_dt sum(ls_flag)
    1 2014-10-01            0
    2 2014-10-02            0
    3 2014-10-03            0
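The function above never looks at updn_flag, while the question asks for the NA's to take updn_flag first where possible. A possible variant that follows that preference is sketched below. The name fill_with_updn is hypothetical, and the sketch assumes each day's rows are already sorted by rank_mag and that a zero sum is reachable for that day.

```r
# Hypothetical variant: fill NA's in rank order, preferring updn_flag and
# flipping the sign only when updn_flag would make a zero sum unreachable.
fill_with_updn <- function(ls_flag, updn_flag) {
  ind <- which(is.na(ls_flag))
  for (k in seq_along(ind)) {
    i <- ind[k]
    remaining <- length(ind) - k          # NA's left after this one
    current <- sum(ls_flag, na.rm = TRUE) # running sum of the filled flags
    candidate <- updn_flag[i]
    # Keep updn_flag only if the remaining NA's can still cancel the sum
    if (abs(current + candidate) <= remaining) {
      ls_flag[i] <- candidate
    } else {
      ls_flag[i] <- -candidate
    }
  }
  ls_flag
}

day1 <- fill_with_updn(c(-1, -1, NA, NA), c(-1, -1, 1, -1))
day2 <- fill_with_updn(c(1, 1, NA, NA), c(1, 1, 1, -1))
day3 <- fill_with_updn(c(-1, 1, NA, NA), c(-1, 1, -1, -1))
print(rbind(day1, day2, day3))
```

On the example data this gives the same filled values as the result shown above. It could be applied per day with mutate(ls_flag = fill_with_updn(ls_flag, updn_flag)) after group_by(asof_dt) and arrange(asof_dt, rank_mag).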

r algorithm na
