r - Use group_by to filter specific cases while keeping NAs
I want to filter a dataset to keep the cases that have observations in a specific column. To illustrate:

    help <- data.frame(
      deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
      score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA)
    )
This creates

       deid score.a
    1     5      NA
    2     5       1
    3     5       1
    4     5       1
    5     5      NA
    6    12      NA
    7    12      NA
    8    12      NA
    9    12      NA
    10   17      NA
    11   17       1
    12   17      NA
I want to tell dplyr to keep every case (deid) that has at least one observation in score.a, including that case's NA rows. Thus, I want it to return:

      deid score.a
    1    5      NA
    2    5       1
    3    5       1
    4    5       1
    5    5      NA
    6   17      NA
    7   17       1
    8   17      NA
I ran the code

    help %>% group_by(deid) %>% filter(score.a > 0)

but it drops the NA rows as well. Thank you for your assistance.
Edit: A similar question was asked here: "How to remove groups of observation with dplyr::filter()". However, the answer there uses an 'all' condition, whereas this case requires an 'any' condition.
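To make the cause of the problem concrete, here is a minimal sketch of why the row-wise filter loses the NA rows. It rebuilds the `help` data frame from the question; the names `na_comparison` and `row_wise` are only illustrative.

```r
library(dplyr)

help <- data.frame(
  deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
  score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA)
)

# Comparing NA with a number yields NA, not FALSE
na_comparison <- NA > 0

# filter() keeps only the rows where the condition is TRUE, so rows
# where the condition is NA are dropped along with the FALSE ones.
# The condition is evaluated row by row, even on a grouped data frame.
row_wise <- help %>%
  group_by(deid) %>%
  filter(score.a > 0)

nrow(row_wise)  # 4 rows: three from deid 5 and one from deid 17
```

To keep whole groups, the condition has to collapse to a single TRUE or FALSE per group, which is what wrapping it in `any()` or `all()` does.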
Try

    library(dplyr)
    help %>%
      group_by(deid) %>%
      filter(any(score.a > 0 & !is.na(score.a)))
    #  deid score.a
    #1    5      NA
    #2    5       1
    #3    5       1
    #4    5       1
    #5    5      NA
    #6   17      NA
    #7   17       1
    #8   17      NA
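As a possible variation on the 'any' condition, the NA check can be handed to the `na.rm` argument of `any()` instead of being spelled out with `!is.na()`. This is a sketch under the assumption that the data are the `help` data frame from the question; `kept` is just an illustrative name.

```r
library(dplyr)

help <- data.frame(
  deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
  score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA)
)

# any(..., na.rm = TRUE) ignores NA comparisons; for a group that is
# entirely NA nothing is left to test, so any() returns FALSE and the
# whole group is dropped.
kept <- help %>%
  group_by(deid) %>%
  filter(any(score.a > 0, na.rm = TRUE))

unique(kept$deid)  # groups 5 and 17 remain, NA rows included
```

Because the condition is a single value per group, `filter()` applies it to every row of the group, which is why the NA rows of groups 5 and 17 are retained.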
Or a similar approach with data.table:

    library(data.table)
    setDT(help)[, if (any(score.a > 0 & !is.na(score.a))) .SD, deid]
    #   deid score.a
    #1:    5      NA
    #2:    5       1
    #3:    5       1
    #4:    5       1
    #5:    5      NA
    #6:   17      NA
    #7:   17       1
    #8:   17      NA
If the condition is instead to subset the 'deid's in which all of the non-NA values in 'score.a' are > 0, the above code can be modified to

    setDT(help)[, if (!all(is.na(score.a)) & all(score.a[!is.na(score.a)] > 0)) .SD, deid]
    #   deid score.a
    #1:    5      NA
    #2:    5       1
    #3:    5       1
    #4:    5       1
    #5:    5      NA
    #6:   17      NA
    #7:   17       1
    #8:   17      NA
Suppose one of the 'score.a' values in a 'deid' group is less than 0,

    help$score.a[3] <- -1

The above code would then return

    setDT(help)[, if (!all(is.na(score.a)) & all(score.a[!is.na(score.a)] > 0)) .SD, deid]
    #   deid score.a
    #1:   17      NA
    #2:   17       1
    #3:   17      NA
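For readers who prefer to stay in dplyr, the stricter 'all' condition could be sketched as follows. This is not part of the original answer; it assumes the same `help` data with the third score changed to -1, and `strict` is an illustrative name.

```r
library(dplyr)

help <- data.frame(
  deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
  score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA)
)
help$score.a[3] <- -1  # one negative score in group 5

# Keep a group only if it has at least one non-NA score
# and every non-NA score is greater than 0.
strict <- help %>%
  group_by(deid) %>%
  filter(any(!is.na(score.a)) & all(score.a > 0, na.rm = TRUE))

unique(strict$deid)  # only group 17 remains
```

The `any(!is.na(score.a))` part is needed because `all()` on a group with no non-NA values returns TRUE, which would otherwise let the all-NA group 12 through.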