java - Regular Expression negate whole regex
I want to parse the following string into groups with a regular expression:
{4: :35b:isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (hsbc t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word :16s:confdet :16r:setdet :22f::setr//trad :11a::fxib//eur :16r:amt :19a::deal//eur222, :16s:amt :16r:amt :19a::loco//eur555 :16s:amt :16r:amt :19a::othr//eur444 :16s:amt :16r:amt :19a::sett//eur333,33 :16s:amt :16s:setdet -}
I created this regex: (:\d\d[a-zA-Z]:*(\w*\/\/)?|:\d\d:)([^:]+)
It matches in most cases, but not in this one. I want to extract the groups like this:
:35b: => isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (xxxx t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word
:16s: => confdet
:16r: => setdet
...
I expected that there would be no ':' in the second group. Maybe someone can help me. I need to extract the whole string up to the next :\d\d\w: block.
Edit: The input string has a key-value structure. For example, :35b: is a key, and everything behind it up to the next key is its value (in this example the value is 'isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (xxxx t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word'). I want to extract the key-value pairs from the input string. Here is a small code example of what I want:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

CharSequence swiftMessage = "{4: :35b:isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (hsbc t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word :16s:confdet :16r:setdet :22f::setr//trad :11a::fxib//eur :16r:amt :19a::deal//eur222, :16s:amt :16r:amt :19a::loco//eur555 :16s:amt :16r:amt :19a::othr//eur444 :16s:amt :16r:amt :19a::sett//eur333,33 :16s:amt :16s:setdet -}";
// group 1 is the key, group 4 is the value
Pattern pattern = Pattern.compile("(:\\d\\d([a-zA-Z]):*(\\w*//)?|:\\d\\d:)([^:]+)");
Matcher matcher = pattern.matcher(swiftMessage);
while (matcher.find()) {
    String key = matcher.group(1);
    String value = matcher.group(4);
    System.out.println(key + "=>" + value);
}
Expected output (the structure is key=>value):
:35b:=>isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (hsbc t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word
:16s:=>confdet
:16r:=>setdet
:22f::setr//=>trad
:11a::fxib//=>eur
:16r:=>amt
:19a::deal//=>eur222,
:16s:=>amt
:16r:=>amt
:19a::loco//=>eur555
:16s:=>amt
:16r:=>amt
:19a::othr//=>eur444
:16s:=>amt
:16r:=>amt
:19a::sett//=>eur333,33
:16s:=>amt
:16s:=>setdet -}
With my regex the value of the key :35b: is 'isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (hsbc t r.? b.) /f', because the regex only looks as far as the next colon. The expected value is 'isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (hsbc t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word'.
Hopefully it's easier to understand now.
It looks like you want to find tokens separated by " :" (a space followed by a colon), and to treat the part before the first : in each token as the key and the rest as the value.
In that case you can try
(?<key>(?<=\\s):\\d\\d[a-zA-Z]):(?<value>.*?)(?=\\s:|$)
which will try to
- find the :\\d\\d[a-zA-Z] part that is preceded by a space, (?<=\\s), and put it in the group named key
- find the minimal (since the *? quantifier is reluctant) set of characters until the next \\s: or the end of the string is found, and place that part in the group named value.
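To illustrate why the reluctant quantifier matters here, the following is a minimal sketch that runs the same lookbehind and lookahead with both .*? and the greedy .* on a shortened, made-up sample. The class name ReluctantVsGreedyDemo, the helper values and the sample string are illustrative assumptions, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantVsGreedyDemo {
    // Builds the pattern with the given value part (".*?" or ".*")
    // and returns every captured value found in the input.
    static List<String> values(String valuePart, String input) {
        Pattern p = Pattern.compile(
                "(?<=\\s):\\d\\d[a-zA-Z]:(" + valuePart + ")(?=\\s:|$)");
        Matcher m = p.matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample in the shape of the question's input.
        String sample = " :35b:isin /f:1 :16s:confdet";
        System.out.println(values(".*?", sample)); // [isin /f:1, confdet]
        System.out.println(values(".*", sample));  // [isin /f:1 :16s:confdet]
    }
}
```

With the greedy .* the first value swallows the rest of the input, because the $ alternative of the lookahead is satisfied at the very end; with .*? each value stops at the first space followed by a colon.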
So your code can look like
Pattern pattern = Pattern.compile("(?<key>(?<=\\s):\\d\\d[a-zA-Z]):(?<value>.*?)(?=\\s:|$)");
Matcher matcher = pattern.matcher(swiftMessage);
while (matcher.find()) {
    // named groups are read by name instead of by index
    String key = matcher.group("key");
    String value = matcher.group("value");
    System.out.println(key + "=>" + value);
}
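For a version that can be compiled and run on its own, here is a small sketch that wraps the same pattern in a class and collects the pairs in a list. The class name NamedGroupDemo, the method extract and the shortened sample string are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // key   = ":NNx" that is preceded by whitespace
    // value = shortest run of characters up to the next " :" or the end of input
    private static final Pattern FIELD = Pattern.compile(
            "(?<key>(?<=\\s):\\d\\d[a-zA-Z]):(?<value>.*?)(?=\\s:|$)");

    static List<String> extract(CharSequence message) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = FIELD.matcher(message);
        while (matcher.find()) {
            pairs.add(matcher.group("key") + "=>" + matcher.group("value"));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample in the shape of the question's input.
        String sample =
                "{4: :35b:isin /f:12345/r:n /xx/any word :16s:confdet :22f::setr//trad -}";
        extract(sample).forEach(System.out::println);
        // :35b=>isin /f:12345/r:n /xx/any word
        // :16s=>confdet
        // :22f=>:setr//trad -}
    }
}
```

Note that with this pattern the key comes out as :35b, without the trailing colon, and a qualifier such as :setr// stays in the value, so the output differs slightly from the layout shown in the question.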
Another approach is to split on \\s: first, which breaks your data into parts like
{4:
35b:isin de000xxxxxxx disc.z 11.11.11 xxxx90 1234 (hsbc t r.? b.) /f:12345/r:n/w:n/c:n/s:n/g:n/a:n/f:n /xx/any word
16s:confdet
...
16s:setdet -}
and then to split each part again on : with the number of resulting pieces limited to 2 (so "foo:bar:baz".split(":", 2) becomes ["foo", "bar:baz"]).
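The effect of the limit argument can be checked in isolation. This is a minimal sketch; the class name SplitLimitDemo and the helper keyAndValue are illustrative names, not part of the original answer.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // With limit 2 the array has at most two elements, so only the
    // first ':' acts as a separator and later colons stay in the value.
    static String[] keyAndValue(String token) {
        return token.split(":", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(keyAndValue("foo:bar:baz")));
        // [foo, bar:baz]
        System.out.println(Arrays.toString("foo:bar:baz".split(":")));
        // [foo, bar, baz]
        System.out.println(Arrays.toString(keyAndValue("35b:isin /f:12345/r:n")));
        // [35b, isin /f:12345/r:n]
    }
}
```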
With this approach your code can look like
for (String token : swiftMessage.toString().split("\\s:")) {
    //System.out.println(token);
    // let's ignore the first `{4:` part, maybe like this
    if (token.length() <= 3) continue;
    // split into key and value at the first ':' only
    String[] keyValue = token.split(":", 2);
    System.out.println(":" + keyValue[0] + "=>" + keyValue[1]);
}
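A self-contained sketch of the splitting approach might look like the following. Instead of the length check it skips any token whose first piece does not look like a two-digit, one-letter key, which is an assumption about the data rather than something stated in the question; the class name SplitApproachDemo, the method extract and the shortened sample are likewise illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class SplitApproachDemo {
    static List<String> extract(String message) {
        List<String> pairs = new ArrayList<>();
        // First split: every whitespace followed by ':' starts a new token.
        for (String token : message.split("\\s:")) {
            // Second split: only the first ':' separates key from value.
            String[] keyValue = token.split(":", 2);
            // Skip tokens without a "key:value" shape, e.g. the leading "{4:".
            if (keyValue.length < 2 || !keyValue[0].matches("\\d\\d[a-zA-Z]")) {
                continue;
            }
            pairs.add(":" + keyValue[0] + "=>" + keyValue[1]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample in the shape of the question's input.
        String sample =
                "{4: :35b:isin /f:12345/r:n /xx/any word :16s:confdet :22f::setr//trad -}";
        extract(sample).forEach(System.out::println);
        // :35b=>isin /f:12345/r:n /xx/any word
        // :16s=>confdet
        // :22f=>:setr//trad -}
    }
}
```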