linux - Regex replace on specific column with SED/AWK
I have data that looks like this (tab-delimited):
organ  k     clustno  analysis
ln     k200  c12      gene ontology
ln     k200  c116     gene ontology
cn     k200  c2       gene ontology
What I want to do is remove the c from every row of the 3rd column, except for the header row:
organ  k     clustno  analysis
ln     k200  12       gene ontology
ln     k200  116      gene ontology
cn     k200  2        gene ontology
This won't work because it also affects other columns and the header row:
sed 's/c//'
What's the right way to do it?
Using awk
awk is a good tool for this:
$ awk -F'\t' -v OFS='\t' 'NR>=2{sub(/^c/, "", $3)} 1' file
organ  k     clustno  analysis
ln     k200  12       gene ontology
ln     k200  116      gene ontology
cn     k200  2        gene ontology
How it works
-F'\t'
Use tab as the field delimiter on input.
-v OFS='\t'
Use tab as the field delimiter on output.
NR>=2 {sub(/^c/, "", $3)}
Remove the initial c from field 3, but only for lines after the first line.
1
This is awk's cryptic shorthand for print-the-line.
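To try the awk command without touching real data, here is a minimal sketch that recreates the sample rows in a temporary file and runs the one-liner on it. The temporary file and the helper name strip_c_awk are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Temporary sample file (hypothetical; any writable path works).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT

# Recreate the tab-delimited data from the question.
printf 'organ\tk\tclustno\tanalysis\n'    >  "$sample"
printf 'ln\tk200\tc12\tgene ontology\n'   >> "$sample"
printf 'ln\tk200\tc116\tgene ontology\n'  >> "$sample"
printf 'cn\tk200\tc2\tgene ontology\n'    >> "$sample"

# strip_c_awk is a hypothetical wrapper around the awk one-liner:
# on every line after the header, drop a leading "c" from field 3.
strip_c_awk() {
    awk -F'\t' -v OFS='\t' 'NR>=2{sub(/^c/, "", $3)} 1' "$1"
}

strip_c_awk "$sample"
```

Because the pattern is anchored with ^ and applied only to $3, a c in the first column (as in the cn row) or in the header is left alone.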
Using sed
$ sed -r '2,$ s/(([^\t]+\t){2})c/\1/' file
organ  k     clustno  analysis
ln     k200  12       gene ontology
ln     k200  116      gene ontology
cn     k200  2        gene ontology
-r
Use extended regular expressions. (On Mac OS X or another BSD platform, use -E instead.)
2,$ s/(([^\t]+\t){2})c/\1/
This substitution is applied only to lines 2 through the end of the file.
(([^\t]+\t){2})
This matches the first 2 tab-separated columns. It assumes that exactly 1 tab separates each column. Because the regex is enclosed in parens, what it matches is available later as \1.
c
This matches c.
\1
This replaces the matched text with just the first 2 columns, not the c.
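Some sed implementations may not interpret \t inside a regex, so the following is a sketch of the same substitution with a literal tab passed in from the shell. It uses -E, which both GNU and BSD sed accept. The temporary file, the tab variable, and the helper name strip_c_sed are assumptions made for illustration.

```shell
#!/bin/sh
# Temporary sample file (hypothetical; any writable path works).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT

# Recreate the tab-delimited data from the question.
printf 'organ\tk\tclustno\tanalysis\n'    >  "$sample"
printf 'ln\tk200\tc12\tgene ontology\n'   >> "$sample"
printf 'ln\tk200\tc116\tgene ontology\n'  >> "$sample"
printf 'cn\tk200\tc2\tgene ontology\n'    >> "$sample"

# A literal tab character, so the regex does not rely on sed understanding \t.
tab=$(printf '\t')

# strip_c_sed is a hypothetical wrapper: on lines 2 through the end,
# match two tab-terminated columns followed by "c" and keep only the columns.
strip_c_sed() {
    sed -E "2,\$ s/(([^${tab}]+${tab}){2})c/\\1/" "$1"
}

strip_c_sed "$sample"
```

The pattern is not anchored, so on a row whose third column does not start with c it could match further to the right. Adding a leading ^ to the regex is one way to restrict it to the first three columns.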