Sort data by factor and output into a matrix (or df) in R
I have looked through other posts and think I have an idea of what to do, but I want to be clear.
I have a large data frame that contains 4 variables and a number of rows.
  chain resid resnum    energy
1     c   o17    500 -37.03670
2     a   arg      8  -0.84560
3     a   leu     24  -0.56739
4     a   asp     25  -0.98583
5     b   arg      8  -0.64880
6     b   leu     24  -0.58380
7     b   asp     25  -0.85930
Each row contains chain (a, b, or c), resid, resnum, and energy. I want to sort the data so that all of the energy values belonging to a specific resid and resnum in each chain are clustered together. By cluster I mean that all of the values for "arg 8" are grouped, or all of the rows containing "arg 8" are grouped. I don't know which is more efficient. Ideally, the output would list the residues like this:
arg 8 0.000 0.000 0.000
where "0.000" stands for the energy values for arg 8 or o17, and so on.
Sorry about the header breaks; I wanted the data to be clean, but I can't insert images.
data
df <- structure(list(
  chain = structure(c(3L, 1L, 1L, 1L, 2L, 2L, 2L),
    .Label = c("a", "b", "c"), class = "factor"),
  resid = structure(c(4L, 1L, 3L, 2L, 1L, 3L, 2L),
    .Label = c("arg", "asp", "leu", "o17"), class = "factor"),
  resnum = c(500L, 8L, 24L, 25L, 8L, 24L, 25L),
  energy = c(-37.0367, -0.8456, -0.56739, -0.98583, -0.6488, -0.5838, -0.8593)),
  .Names = c("chain", "resid", "resnum", "energy"),
  class = "data.frame", row.names = c(NA, -7L))
If you want to convert it to wide format:

library(reshape2)
# One row per residue (resid + resnum), one energy column per chain
dcast(df, resid + resnum ~ paste0('energy.', chain), value.var = 'energy')
#  resid resnum energy.a energy.b energy.c
#1   arg      8 -0.84560  -0.6488       NA
#2   asp     25 -0.98583  -0.8593       NA
#3   leu     24 -0.56739  -0.5838       NA
#4   o17    500       NA       NA -37.0367
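If you would rather keep the long format and just have all rows for the same residue sit next to each other (the other option mentioned in the question), a minimal base R sketch is to sort with order(). The name df_sorted is only illustrative, and the example data is rebuilt here with plain character columns instead of factors, which is an assumption made to keep the snippet self-contained.

```r
# Example data rebuilt from the question's dput output
# (assumption: plain character columns rather than factors).
df <- data.frame(
  chain  = c("c", "a", "a", "a", "b", "b", "b"),
  resid  = c("o17", "arg", "leu", "asp", "arg", "leu", "asp"),
  resnum = c(500L, 8L, 24L, 25L, 8L, 24L, 25L),
  energy = c(-37.0367, -0.8456, -0.56739, -0.98583, -0.6488, -0.5838, -0.8593)
)

# Sort so that rows for the same residue (resid + resnum) are adjacent,
# with chains in order within each residue.
df_sorted <- df[order(df$resid, df$resnum, df$chain), ]
rownames(df_sorted) <- NULL  # drop the original row numbers
df_sorted
```

This leaves the energy values in a single column, so it may be easier to feed into later per-residue summaries than the wide table.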