Wednesday, 15 January 2014

awk - Selecting columns using specific patterns, then finding sum and ratio


I want to calculate sum and ratio values from the data given below. (The actual data has more than 200,000 columns and 45,000 rows.)

For clarity, I have given only a simplified sample of the data format.

  #Frame BMR_42@O22 BMR_49@O13 BMR_59@O13 BMR_23 oo BMR_10@O13 BMR_61@O26 BMR_23OO
  1 1 1 1 1 1 1 1 1 0 0 0 0 0 1 1 1 0 1 1 1 1 0 1 1 1 4 1 1 0 0 1 0 1 5 0 0 0 0 6 1 0 1 1 0 1 0 7 1 1 1 1 0 0 1 1 1 1 1 0 0 0 0 9 1 1 1 1 1 1 1 10 0 0 0 0 0 0

Columns should be selected according to a criterion.

Here I want only the columns whose header contains "@O13". Below are the columns selected from the example above.

  BMR_49@O13 BMR_59@O13 BMR_10@O13
  1 0 1 1 0 1 1 1 0 1 0 1 0 0 1 0 1 1 1 0 1 1 0 1 1 1 0 0 0 0
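The selection step described above can be sketched as follows. This is only an illustration, not the asker's code: the `sample` data and the function name `select_o13` are made up for the example, and the sketch assumes whitespace-separated fields with the header on the first line.

```shell
# Hypothetical sample: a header row plus three data rows (not the data from the post).
sample='#Frame A@O22 B@O13 C@O13 D@O26
1 1 1 0 1
2 0 0 0 1
3 1 0 1 0'

# Print only the columns whose header field matches @O13.
select_o13() {
  awk '
    NR == 1 {
      # Header row: remember the indices of matching columns, in order.
      for (i = 1; i <= NF; i++)
        if ($i ~ /@O13/) cols[++n] = i
    }
    {
      # Every row (header included): print just the remembered columns.
      line = ""; sep = ""
      for (k = 1; k <= n; k++) { line = line sep $(cols[k]); sep = " " }
      print line
    }
  '
}

printf '%s\n' "$sample" | select_o13
```

Storing the matching indices once, while reading the header, means each later row only touches the selected fields instead of re-testing every column name.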

With the selected columns, I want to calculate:

1) The sum of all "1" values; in this example the value is 16.

2) The number of rows that contain "1" at least once; in the example above, 8 rows have at least one occurrence of "1".

Finally, I want the ratio of the two, i.e. (total of all "1") / (total rows with an occurrence of "1"), for the "@O13" columns only. For this example that is 16 / 8 = 2.

This is what I tried:

  awk '{for (i = 1; i <= NF; i++) if (i ~ /@O13/); print ""}' $file2

It runs, but it does not print the values.

Any help would be appreciated.

This should do it:

  awk 'NR==1{for(i=1;i<=NF;i++)if($i~/@O13/)a[i];next}{f=0;for(i in a)if($i){s++;f++};if(f)r++}END{print "number of 1="s"\nrows with 1="r"\nratio="s/r}' file

A more readable version:

  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i ~ /@O13/) a[i]; next }
       { f = 0; for (i in a) if ($i == "1") { s++; f++ }; if (f) r++ }
       END { print "number of 1=" s "\nrows with 1=" r "\nratio=" s / r }' file
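To try the approach from the answer on something small, here is a self-contained sketch. The `sample` data and the function name `o13_stats` are hypothetical, and the guard against a zero row count is an addition that is not part of the answer.

```shell
# Hypothetical sample (not the data from the question).
sample='#Frame A@O22 B@O13 C@O13 D@O26
1 1 1 1 1
2 0 0 0 1
3 1 0 1 0'

o13_stats() {
  awk '
    # Header row: record the indices of the columns whose name matches @O13.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i ~ /@O13/) a[i]; next }
    {
      f = 0                                   # set when this row has a "1" in a selected column
      for (i in a) if ($i == "1") { s++; f = 1 }
      if (f) r++                              # count the row once, however many 1s it has
    }
    END {
      printf "number of 1=%d\nrows with 1=%d\n", s, r
      if (r > 0) { printf "ratio=%g\n", s / r }
      else       { print "ratio=undefined" }  # added guard: avoid dividing by zero
    }
  '
}

printf '%s\n' "$sample" | o13_stats
```

On this sample the selected columns are B@O13 and C@O13, which hold three 1s spread over two rows, so the script reports a ratio of 1.5. Note that the header test compares `$i`, the text of field i, against the pattern; comparing the bare index `i`, as in the attempt in the question, can never match "@O13".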
