Tuesday, 15 July 2014

python - Numpy arrays with compound keys; find subset in both


I have two NumPy arrays with the following shapes:

  (19133L, 12L) (248L, 6L)

In each case, the first 3 fields make an identifier.

I want to reduce the larger matrix so that it contains only the rows whose identifier also exists in the second matrix, giving a shape of (248L, 12L). How can I do this?

Then I would like to sort it so that the array is ordered, and can be indexed, by the first value, then the second value, then the third value: (3 3 4) (3 3) etc. Is there a multi-field sorting function?

Edit:

I have tried pandas:

  from pandas import DataFrame, merge

  df1 = DataFrame(arr1.astype(str))
  df2 = DataFrame(arr2.astype(str))
  df1.set_index([0, 1, 2])
  df2.set_index([0, 1, 2])
  out = merge(df1, df2, how="inner")
  print(out.shape)

But the result has shape (0, 13).

Use pandas.

It allows multiple keys.
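As a rough sketch of what a multi-key merge could look like: the tiny arrays below are made-up stand-ins for the real (19133, 12) and (248, 6) arrays, and passing on=[0, 1, 2] is one way to name the three identifier columns explicitly. Dropping duplicate identifiers from the smaller frame first is an extra precaution I am assuming you want, so the merge cannot multiply rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two arrays; the first 3 columns are the identifier.
arr1 = np.array([
    [3, 3, 5, 10, 11],
    [1, 2, 3, 20, 21],
    [3, 3, 4, 30, 31],
    [9, 9, 9, 40, 41],
])
arr2 = np.array([
    [3, 3, 4, 0],
    [1, 2, 3, 0],
])

df1 = pd.DataFrame(arr1)
df2 = pd.DataFrame(arr2)

# Keep only the identifier columns of the smaller frame, one row per identifier.
keys = df2[[0, 1, 2]].drop_duplicates()

# Inner merge on the three key columns: only rows of df1 whose
# identifier also appears in df2 survive, with all of df1's columns.
out = df1.merge(keys, on=[0, 1, 2], how="inner")
print(out.shape)  # (2, 5)
```

Naming the key columns with on= avoids relying on whatever columns the two frames happen to share.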

To index into your DataFrame by identifier, first set the index to the first three columns (use drop=False, inplace=True).
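A minimal sketch of that step, using a made-up four-row array in place of the real data. The sort_index call and the .loc lookups are my assumption about how you would then sort and index by first, second and third value; they are not spelled out above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first 3 columns form the identifier.
arr = np.array([
    [3, 3, 5, 10],
    [1, 2, 3, 20],
    [3, 3, 4, 30],
    [9, 9, 9, 40],
])
df = pd.DataFrame(arr)

# Use the first three columns as a MultiIndex; drop=False keeps them
# available as ordinary columns as well.
df.set_index([0, 1, 2], drop=False, inplace=True)

# Sort rows by first key, then second, then third.
df.sort_index(inplace=True)

# Look up one full identifier, or every row sharing a two-value prefix.
row = df.loc[(3, 3, 4), :]
block = df.loc[(3, 3), :]
print(block)
```

Sorting the index before doing lookups also keeps pandas from warning about slow lookups on an unsorted MultiIndex.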

In general, NumPy runs out of steam very quickly for arbitrary DataFrame-style manipulation; pandas should be the default thing you try, and it is also a lot more performant.

