python - Use numpy.average with weights for resampling a pandas array
I need to resample some data using NumPy's weighted average function, and it just doesn't work.
This is my test case:
    import datetime
    import numpy as np
    import pandas as pd

    time_vec = [datetime.datetime(2007, 1, 1, 0, 0),
                datetime.datetime(2007, 1, 1, 0, 1),
                datetime.datetime(2007, 1, 1, 0, 5),
                datetime.datetime(2007, 1, 1, 0, 8),
                datetime.datetime(2007, 1, 1, 0, 10)]
    df = pd.DataFrame([2, 3, 1, 7, 4], index=time_vec)
A normal resampling without weights works fine (using a lambda function as the how parameter,
as suggested here: "Pandas resampling using numpy percentile?" Thanks!):
    df.resample('5min', how=lambda x: np.average(x[0]))
But if I try to use weights, it returns "TypeError: Axis must be specified when shapes of a and weights differ":
    df.resample('5min', how=lambda x: np.average(x[0], weights=[1, 2, 3, 4, 5]))
I tried this with many different numbers of weights, but it did not get any better:
    for i in xrange(20):
        try:
            print range(i)
            print df.resample('5min', how=lambda x: np.average(x[0], weights=range(i)))
            print i
            break
        except TypeError:
            print i, 'typeerror'
I'd be glad for any suggestions.
The short answer here is that the weights in your lambda need to be created dynamically, based on the length of the series that is being averaged. In addition, you need to be careful about the types of the objects you're manipulating.
The code that I got to compute what I think you're trying to do is as follows:
    df.resample('5min', how=lambda x: np.average(x, weights=1 + np.arange(len(x))))
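There are two differences compared to the line that was giving you problems:

1. x[0] is now x. The x object in the lambda is a pd.Series, so x[0] gives just the first value in the series. This worked without raising an exception in the first example (without weights) because np.average(c) just returns c when c is a scalar. But I think it was computing the wrong averages even in that case, because each of the sampled subsets was returning its first value as the "average". It is also why the weighted call failed: a scalar has no shape that a list of five weights could match.

2. The weights are created dynamically, based on the length of the data in the Series being resampled. You need to do this because the x in your lambda might be a Series of a different length for each time interval being computed.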
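
Both points can be checked outside of resample with plain NumPy. This is only a rough sketch: the two-element array stands in for a single bin, and the variable names are mine, not anything from pandas or from your code.

```python
import numpy as np

# Stand-in for the values that fall into one 5-minute bin.
values = np.array([2, 3])

# A scalar "averages" to itself, so np.average(x[0]) just echoes the first value.
first_only = np.average(values[0])

# Weights whose shape differs from the data raise the TypeError from the question.
try:
    np.average(values[0], weights=[1, 2, 3, 4, 5])
    mismatch_error = None
except TypeError as exc:
    mismatch_error = str(exc)

# Weights built from the bin's own length always have a matching shape.
weighted = np.average(values, weights=1 + np.arange(len(values)))
```

Here first_only is just the first element rather than a real mean, while weighted is (2*1 + 3*2) / (1 + 2).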
The way I figured this out was through some simple type debugging, by replacing the lambda with a proper function definition:
    def avg(x):
        # Show what resample actually passes in for each time bin.
        print(type(x), x.shape, type(x[0]))
        return np.average(x, weights=np.arange(1, 1 + len(x)))

    df.resample('5min', how=avg)
This let me have a look at what was happening with the x variable. Hope that helps!
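
One caveat if you are on a newer pandas: the how= argument to resample was deprecated and later removed, so the calls above will not run there. The sketch below is how I would expect the same weighting to look with the resample(...).apply(...) form. The function name weighted_avg and the empty-bin guard are my own additions, and I use a Series instead of a one-column DataFrame to keep it simple.

```python
import datetime

import numpy as np
import pandas as pd

time_vec = [datetime.datetime(2007, 1, 1, 0, 0),
            datetime.datetime(2007, 1, 1, 0, 1),
            datetime.datetime(2007, 1, 1, 0, 5),
            datetime.datetime(2007, 1, 1, 0, 8),
            datetime.datetime(2007, 1, 1, 0, 10)]
s = pd.Series([2, 3, 1, 7, 4], index=time_vec)


def weighted_avg(x):
    # Hypothetical helper: x is the Series of values in one time bin.
    if len(x) == 0:
        # An empty bin has no weights to normalise by, so return NaN instead.
        return np.nan
    # Weights 1, 2, ..., len(x), rebuilt for every bin.
    return np.average(x, weights=1 + np.arange(len(x)))


result = s.resample('5min').apply(weighted_avg)
```

With this data the three bins hold [2, 3], [1, 7] and [4], so result should come out as roughly 2.67, 5.0 and 4.0.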
python numpy pandas weighted-average