awk - How to track lines in a large log file that don't appear in the expected order?
I have a big log file that contains lines in the following format:
id_number message_type
Here is an example of a log file whose lines appear in the expected order:
1 a
2 a
1 b
1 c
2 b
2 c
However, not all lines appear in the expected order in the log file, and I'd like a list of the id numbers whose lines don't appear in the expected order. For the following file:
1 a
2 a
1 c
1 b
2 b
2 c
I would like output indicating that id number 1 has lines that don't appear in the expected order. How can I do this using grep, sed, or awk?
This works for me:
awk -v "a=abc" 'substr(a, b[$1]++ + 1, 1) != $2 {print $1}' logfile
When you run this, the id number of each out-of-order line is printed. If there are no out-of-order lines, nothing is printed.
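As a quick check, the command can be run against the out-of-order sample from the question. This is only a minimal sketch: the temporary file created with mktemp and the variable names logfile and result are assumptions for illustration, and the six log lines are assumed to be the out-of-order sample (1 a, 2 a, 1 c, 1 b, 2 b, 2 c).

```shell
# Build the out-of-order sample log in a temporary file (hypothetical name).
logfile=$(mktemp)
printf '%s\n' '1 a' '2 a' '1 c' '1 b' '2 b' '2 c' > "$logfile"

# Run the one-liner and capture the ids it prints.
result=$(awk -v "a=abc" 'substr(a, b[$1]++ + 1, 1) != $2 {print $1}' "$logfile")
printf '%s\n' "$result"

rm -f "$logfile"
```

With this input, id 1 is reported twice, once for each of its two out-of-order lines (1 c and 1 b), while id 2 is not reported at all.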
How it works:
-v "a=abc"
This defines the variable a as the list of characters in the expected order.
substr(a, b[$1]++ + 1, 1) != $2 {print $1}
For each id number, the array b keeps track of where we are in the expected sequence. Initially, b is 0 for all ids. With that initial value, b[$1]==0, the expression substr(a, b[$1] + 1, 1) returns a, which is our first expected output. The condition substr(a, b[$1] + 1, 1) != $2 checks whether the expected output, as given by the substr function, differs from the actual output shown in the second field, $2. If they differ, the id value, $1, is printed.
After the substr expression is computed, the trailing ++ in the expression b[$1]++ increments the value of b[$1] by 1, so that the value of b[$1] is ready for the next time that id $1 is encountered.
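The same logic can be spelled out step by step, which may make the one-liner easier to follow. The sketch below is an illustrative rewrite, not the answer's original code: the function name check_order and the awk variable names expected, seen, pos and want are all hypothetical.

```shell
# check_order FILE: print the id of every line whose message type is not
# the next one expected for that id (hypothetical helper, for illustration).
check_order() {
    awk -v expected="abc" '
    {
        id = $1
        pos = seen[id] + 1                # 1-based position of the next expected type
        want = substr(expected, pos, 1)   # the type this id should show now
        if (want != $2)                   # actual type is in the second field
            print id
        seen[id]++                        # advance this id to its next position
    }' "$1"
}
```

Calling check_order logfile should behave like the one-liner: an unset seen[id] counts as 0, so the first line for each id is compared against the first character of expected.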
The above prints the id number every time an out-of-order line is encountered. If you want each bad id printed only once, not multiple times, use:
awk -v "a=abc" 'substr(a, b[$1]++ + 1, 1) != $2 {bad[$1]++} END{for (n in bad) print n}' logfile
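A for (n in bad) loop visits the ids in an unspecified order, and nothing is printed until the whole file has been read. If that matters, one possible variant prints each bad id the first time it is seen to be out of order. This is a sketch of an alternative, not part of the original answer, and the function name first_bad_ids is hypothetical.

```shell
# first_bad_ids FILE: print each out-of-order id once, in the order in which
# the first offending line for that id appears (hypothetical helper).
first_bad_ids() {
    awk -v "a=abc" '
    # The left-hand test always runs, so b[$1] is advanced on every line;
    # !bad[$1]++ is true only the first time a given id is found to be bad.
    substr(a, b[$1]++ + 1, 1) != $2 && !bad[$1]++ {print $1}
    ' "$1"
}
```

For a log containing the lines 2 b, 1 a, 1 c, 2 a, this prints 2 and then 1: id 2 is flagged at its first line, id 1 at its second line, and the later bad line for id 2 is not reported again.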