regex - I'm using PSPad and need a regular expression that will find 4 numbers in a six number set
I have pages of data in the following format:
{1,2,3,4,5,6} {1,3,4,5,6,7} {1,2,4,5,6,7} {1,2,3,5,6,7}
For clarification, I'm going to call each instance of 6 comma-separated numbers surrounded by {} a "word".
So in the example above, {1,2,3,4,5,6} is a word.
So I'm trying to find each word that contains 4 numbers of my choosing. For example, in the sample above, find all of the words that contain the numbers 1, 2, 6, and 7. The key point here is AND. I know how to find 1, 2, 6 or 7. I need AND. If possible, I'd like to replace the whole word; or, if it finds the numbers, it can delete the remaining 2 numbers of the word with the regular expression.
Some more information on the data: the numbers range from 1 to 25. (So I need something that is capable of finding 1 without including 21 or 10-19, or 2 without including 12 or 20-25.) There is never a repeat of a number within a single word. The numbers within a word are in order from lowest (1) to highest (25).
Update:
You said that you are using PSPad and want to do this in the editor. I don't think that is possible using a regular expression. I would use awk or a programming language of your choice.
Here is an example using awk:
awk '{for(i=1;i<=NF;i++)if($i~/\y1\y/&&$i~/\y2\y/&&$i~/\y6\y/&&$i~/\y7\y/)$i=""}1' input.txt
Explanation:
The for loop iterates through the fields of a line, and the if condition checks whether a field matches all of the required numbers. If it does, the field gets truncated to an empty string. The trailing 1 is an awk idiom for printing the line, with its fields joined by the output delimiter (a single space by default).
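To make the truncation and the trailing 1 concrete, here is a minimal sketch on made-up input (the letters a, b, c are placeholders, not part of the question's data). Assigning to a field makes awk rebuild the line from its fields, joined by the output delimiter, so the runs of spaces in the input collapse to single spaces and the blanked field leaves two adjacent delimiters behind.

```shell
# Blank the second field; the pattern 1 is always true, so the
# rebuilt line is printed with the default action.
printf 'a   b   c\n' | awk '{ $2 = "" } 1'
# prints "a  c" (two spaces where the second field used to be)
```

This is why the output of the one-liner above keeps stray spaces where removed words were.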
About the number matching: I'm using the escape sequence \y before and after the number:
$i~/\y1\y/
\y matches a word boundary, which in this case is either {, a comma, or the closing }. This makes sure the pattern above matches 1 but does not match 11, for example.
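Note that \y is a GNU awk (gawk) extension, so the patterns above may not work in other awk implementations. A possible portable variant, sketched here under the assumption that every word has exactly the {n,n,...} form shown in the question, spells out the delimiters with bracket expressions instead. The function name filter_words is made up for this example.

```shell
# filter_words [FILE...]: blank every word that contains 1, 2, 6 and 7.
# [{,] and [,}] stand in for the \y word boundary: a number must be
# preceded by "{" or "," and followed by "," or "}".
filter_words() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /[{,]1[,}]/ &&
                $i ~ /[{,]2[,}]/ &&
                $i ~ /[{,]6[,}]/ &&
                $i ~ /[{,]7[,}]/)
                $i = ""
    } 1' "$@"
}

# The first word contains 1, 2, 6 and 7 and is blanked; the second
# has 11, 12 and 21 but no standalone 1, so it is kept.
printf '%s\n' '{1,2,6,7,11,12} {2,6,7,11,12,21}' | filter_words
```

With no file argument the function reads standard input; with one, it behaves like the one-liner, e.g. filter_words input.txt.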
Output:
{1,2,3,4,5,6} {1,3,4,5,6,7}
By the way, the above script is far more readable and maintainable if you save it to a file:
remove.awk:
# Applies to every line of input
{
    for(i=1;i<=NF;i++) {
        # Check whether the field matches all required numbers
        if( \
            $i~/\y1\y/ \
            && $i~/\y2\y/ \
            && $i~/\y6\y/ \
            && $i~/\y7\y/ \
        ) {
            # Truncate the field
            $i=""
        }
    }
    # Print the modified line
    print
}
You can call the script like this:
awk -f remove.awk input.txt