Monday, 15 September 2014

r - plyr or dplyr in Python -




This is more of a conceptual question; I do not have a specific problem.

I am learning Python for data analysis, but I am very familiar with R. One of the great things about R is plyr (and of course ggplot2), and even better, dplyr. pandas of course has split-apply as well. However, in R I can do things like this (in dplyr; it is a bit different in plyr, and I can see how dplyr mimics the . notation from object-oriented programming):

info %.% group_by(c(.....)) %.% summarise(new1 = ...., new2 = ...., ..... newn=....)

in which I create multiple summary calculations at the same time.

How do I do that in Python? Because

df[...].groupby(.....).sum() only sums columns,

while in R I can have one mean, one sum, one special function, etc. in a single call.

I realize I can do all my operations separately and merge them, and that is fine if I am using Python. But when it comes down to choosing a tool, every line of code that I do not have to type, check, and validate adds up in time.

In addition, in dplyr you can add mutate statements as well, so it seems to me that it is way more powerful. What am I missing about pandas or Python?

My goal is to learn. I have spent a lot of effort learning Python and it is a worthy investment, but the question still remains.

Thanks in advance.

I think you're looking for the agg function, which is applied to groupby objects.

From the docs:

In [48]: grouped = df.groupby('a')

In [49]: grouped['c'].agg([np.sum, np.mean, np.std])
Out[49]:
          sum      mean       std
bar  0.443469  0.147823  0.301765
foo  2.529056  0.505811  0.96
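To make the comparison with summarise and mutate a bit more concrete, here is a minimal sketch, assuming a reasonably recent pandas (named aggregation needs 0.25 or later). The toy data and the output names c_mean, d_sum, c_range and c_centered are made up for illustration; they are not from the question or the docs.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "a": ["foo", "bar", "foo", "bar"],
    "c": [1.0, 2.0, 3.0, 4.0],
    "d": [10, 20, 30, 40],
})

# Rough analogue of group_by + summarise:
# several named summaries, on different columns, in a single agg call.
summary = df.groupby("a").agg(
    c_mean=("c", "mean"),
    d_sum=("d", "sum"),
    c_range=("c", lambda s: s.max() - s.min()),  # a custom function
)

# Rough analogue of group_by + mutate:
# transform returns a per-group value aligned with the original rows.
df = df.assign(c_centered=df["c"] - df.groupby("a")["c"].transform("mean"))

print(summary)
print(df)
```

So one agg call can mix a mean, a sum and a custom function, and transform covers the common case of a grouped mutate. Whether this reads as nicely as a dplyr chain is a matter of taste.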

python r pandas plyr dplyr
