My Blog: python - append rows to a Pandas groupby object

Sunday, 15 May 2011


I am trying to figure out the best way to insert the means back into a multi-indexed pandas DataFrame.

Suppose I have a DataFrame like this:

           metric 1    metric 2
              R   P       R   P
  foo a       0   1       2   3
      b       4   5       6   7
  bar a       8   9      10  11
      b      12  13      14  15

I would like to get the following results:

           metric 1    metric 2
              R   P       R   P
  foo a       0   1       2   3
      b       4   5       6   7
      AVG     2   3       4   5
  bar a       8   9      10  11
      b      12  13      14  15
      AVG    10  11      12  13

Please note, I know I can use df.mean(level=0) to get the level 0 group means as a separate DataFrame. This is not exactly what I want: I want to insert the group means as rows back into each group.
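For reference, here is a minimal sketch of that separate-frame result, assuming the sample frame from this post (rebuilt here so the snippet stands alone). Recent pandas releases (2.0 and later) no longer accept `level` in `DataFrame.mean`, so the sketch uses the equivalent `groupby(level=0).mean()`. The names `rows`, `cols` and `group_means` are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame: two row levels (foo/bar, a/b), two column levels.
rows = pd.MultiIndex.from_tuples(
    [("foo", "a"), ("foo", "b"), ("bar", "a"), ("bar", "b")]
)
cols = pd.MultiIndex.from_tuples(
    [("metric 1", "R"), ("metric 1", "P"), ("metric 2", "R"), ("metric 2", "P")]
)
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=rows, columns=cols)

# One row of means per level-0 label; the original a/b rows are not part of it.
group_means = df.groupby(level=0).mean()
print(group_means)
```

The result has only two rows, one per group, which is why it does not answer the question on its own.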

I am able to get the result I want, but I feel like I am doing this wrong, or that there is a one-liner I am missing that does this without the expensive Python iteration. Here is my example code:

  import numpy as np
  import pandas as pd

  data = np.arange(16).reshape(4, 4)
  row_index = [("foo", "a"), ("foo", "b"), ("bar", "a"), ("bar", "b")]
  col_index = [("metric 1", "R"), ("metric 1", "P"),
               ("metric 2", "R"), ("metric 2", "P")]
  col_multiindex = pd.MultiIndex.from_tuples(col_index)
  df = pd.DataFrame(data, index=pd.MultiIndex.from_tuples(row_index),
                    columns=col_multiindex)

  new_row_index = []
  data = []
  for name, group in df.groupby(level=0):
      # Copy every existing row of the group, then add the group's mean row.
      for index_tuple, row in group.iterrows():
          new_row_index.append(index_tuple)
          data.append(row.tolist())
      new_row_index.append((name, "AVG"))
      data.append(group.mean().tolist())

  print(pd.DataFrame(data, index=pd.MultiIndex.from_tuples(new_row_index),
                     columns=col_multiindex))

Which results in:

           metric 1    metric 2
              R   P       R   P
  bar a       8   9      10  11
      b      12  13      14  15
      AVG    10  11      12  13
  foo a       0   1       2   3
      b       4   5       6   7
      AVG     2   3       4   5

For some reason it flips the order of the groups, but it is more or less what I want.
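The flipped order is most likely because groupby sorts the group keys by default (sort=True), which puts bar ahead of foo. Below is a hedged sketch of one way to keep the original order: pass sort=False, build a one-row frame of means per group, and concatenate the pieces. It still loops over groups, though no longer over individual rows. The names `pieces`, `avg_row` and `with_means` are illustrative, and the sample frame is rebuilt so the snippet stands alone.

```python
import numpy as np
import pandas as pd

rows = pd.MultiIndex.from_tuples(
    [("foo", "a"), ("foo", "b"), ("bar", "a"), ("bar", "b")]
)
cols = pd.MultiIndex.from_tuples(
    [("metric 1", "R"), ("metric 1", "P"), ("metric 2", "R"), ("metric 2", "P")]
)
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=rows, columns=cols)

pieces = []
# sort=False keeps the groups in order of first appearance (foo, then bar).
for name, group in df.groupby(level=0, sort=False):
    # Column means as a one-row frame, labelled (name, "AVG") to match df's index.
    avg_row = group.mean().to_frame().T
    avg_row.index = pd.MultiIndex.from_tuples([(name, "AVG")])
    pieces.append(group)
    pieces.append(avg_row)

with_means = pd.concat(pieces)
print(with_means)
```

Because the mean rows are floats, the concatenated frame ends up with float columns.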

The main thing you need to do here is append your means to the main dataset. The main trick before doing that is to make the indexes conform (with reset_index() and set_index()), so that after you append them they are more or less lined up and ready to sort on the same keys.

  In [35]: df2 = df.groupby(level=0).mean()

  In [36]: df2['index2'] = 'AVG'

  In [37]: df2 = df2.reset_index().set_index(['index', 'index2']).append(df).sort()

  In [38]: df2
  Out[38]:
                metric 1    metric 2
                   R   P       R   P
  index index2
  bar   AVG       10  11      12  13
        a          8   9      10  11
        b         12  13      14  15
  foo   AVG        2   3       4   5
        a          0   1       2   3
        b          4   5       6   7

As far as ordering the rows goes, the best thing is probably to set the names so that sorting puts them in the right place (e.g. A, B, avg). For small numbers of rows you could just use fancy indexing:

  In [39]: df2.ix[[4, 5, 3, 1, 2, 0]]
  Out[39]:
                metric 1    metric 2
                   R   P       R   P
  index index2
  foo   a          0   1       2   3
        b          4   5       6   7
        AVG        2   3       4   5
  bar   a          8   9      10  11
        b         12  13      14  15
        AVG       10  11      12  13
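The session above relies on DataFrame.append, DataFrame.sort and .ix, none of which exist in recent pandas releases. A rough equivalent, offered only as a sketch, swaps in pd.concat, sort_index and iloc. Instead of the reset_index()/set_index() round trip, it gives the means a second index level directly with MultiIndex.from_product, which is a substitution on my part. The names `means`, `combined` and `reordered` are illustrative.

```python
import numpy as np
import pandas as pd

rows = pd.MultiIndex.from_tuples(
    [("foo", "a"), ("foo", "b"), ("bar", "a"), ("bar", "b")]
)
cols = pd.MultiIndex.from_tuples(
    [("metric 1", "R"), ("metric 1", "P"), ("metric 2", "R"), ("metric 2", "P")]
)
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=rows, columns=cols)

# Group means, re-indexed as (group, "AVG") so they line up with df's two levels.
means = df.groupby(level=0).mean()
means.index = pd.MultiIndex.from_product([means.index, ["AVG"]])

# Append the mean rows and sort on both index levels.
# Uppercase "AVG" sorts ahead of lowercase "a" and "b" within each group.
combined = pd.concat([df, means]).sort_index()
print(combined)

# Positional reordering, mirroring the fancy-indexing step.
reordered = combined.iloc[[4, 5, 3, 1, 2, 0]]
print(reordered)
```

The positional list only suits small frames; for anything larger, choosing labels that sort into the desired order is the less fragile option.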




Posted by Unknown at 03:22