regex - Clean up a file of phone numbers that are not properly formatted
I have a file with 10,000 phone numbers in it, many of them not formatted properly (the proper format is, e.g., 123-456-7890). Although I've cleaned up most of them, I still have one pattern I'm not sure how to handle. I used sed to clean up most of the file, and I don't mind using either sed or awk, although I use sed more than awk.
One of the last groups (2306 lines) still isn't formatted properly.
Example: 123 4567890 (3 digits, a tab, 7 digits) needs to become 123-456-7890 (3 digits, dash, 3 digits, dash, 4 digits).
I know I can find the pattern and replace the tab with a dash using:
sed "/^[0-9][0-9][0-9]\t[0-9][0-9][0-9][0-9][0-9][0-9][0-9]/s/\t/-/" infile.txt > outfile.txt
However, if I could extend the command to also split the 7 digits that are grouped together at the same time, it would be easier for me to clean up what's left after this round. I've done a fair amount of searching, but I couldn't find an answer in the list of suggestions shown when I typed in the subject, or anywhere else, before posting this question.
Use extended regular expressions and capturing groups:
sed -E 's/^([0-9]{3})\t([0-9]{3})([0-9]{4})$/\1-\2-\3/' infile.txt > outfile.txt
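Since the question is tagged osx, one caveat may be worth a sketch: some sed implementations do not treat `\t` inside a pattern as a tab character. The variant below sidesteps that by placing a literal tab in the pattern through a shell variable. The function name `fix_tabbed_phones` and the sample input are made up for illustration; the substitution itself is the same idea as the command above.

```shell
# Hypothetical helper: rewrite "NNN<TAB>NNNNNNN" lines as "NNN-NNN-NNNN".
# Lines that do not match the pattern pass through unchanged.
fix_tabbed_phones() {
  # Hold a literal tab so the pattern does not depend on sed understanding \t.
  tab=$(printf '\t')
  # -E enables extended regular expressions: unescaped ( ) group, {n} repeats.
  sed -E "s/^([0-9]{3})${tab}([0-9]{3})([0-9]{4})\$/\1-\2-\3/"
}

# Sample input invented for this example: one tabbed line, one already
# clean line, and one space-separated line that should be left alone.
printf '123\t4567890\n123-456-7890\n555 1234567\n' | fix_tabbed_phones
```

With that sample input, only the first line should change (to 123-456-7890); the other two are printed as they are. To run it against the real file, something like `fix_tabbed_phones < infile.txt > outfile.txt` should work.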
regex osx bash awk sed