regex - OCaml parse large text
In OCaml, how can I parse large multiline text data using the Str module, ignoring the newline character at the start of each new line?
let get_info content =
  let re = Str.regexp "\\(.+?\\)" in
  match Str.string_match re content 0 with
  | true -> print_endline ("-->" ^ Str.matched_group 1 content ^ "<--")
  | false -> print_endline "not found";;

This example returns only the first line; I need the text across multiple lines.
According to http://pleac.sourceforge.net/pleac_ocaml/patternmatching.html:
- Str's regexps lack a whitespace-matching pattern.
So, here is the workaround suggested on that page:
#load "str.cma";;
...
let whitespace_chars =
  String.concat ""
    (List.map (String.make 1)
       [
         Char.chr 9;  (* HT *)
         Char.chr 10; (* LF *)
         Char.chr 11; (* VT *)
         Char.chr 12; (* FF *)
         Char.chr 13; (* CR *)
         Char.chr 32; (* space *)
       ])
and then:

let re = Str.regexp "\\((?:[^" ^ whitespace_chars ^ "]|" ^ whitespace_chars ^ ")+?\\)" in
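
A note on why the attempts above stop at the first line, with a possible alternative. In Str, `.` matches any character except a newline, alternation is written `\|` rather than `|`, and Perl-style constructs such as `(?:...)` and the non-greedy `+?` are not part of the Str syntax. A minimal sketch, assuming the goal is simply to capture the whole multi-line input, is to alternate `.` with a literal newline. The helper name `match_all` and the sample string are hypothetical, and for very large inputs this backtracking pattern may be memory-hungry, so treat it as a starting point rather than a tuned solution:

```ocaml
(* In the toplevel, load the library first: #load "str.cma";; *)

(* Hypothetical helper: returns Some text when the input, starting at
   position 0, matches one or more characters of any kind. *)
let match_all content =
  (* "." does not match '\n' in Str, so alternate it with a literal newline.
     Group 1 wraps the whole repetition. *)
  let re = Str.regexp "\\(\\(.\\|\n\\)+\\)" in
  if Str.string_match re content 0
  then Some (Str.matched_group 1 content)
  else None

let () =
  (* Sample input is made up for illustration. *)
  match match_all "first line\nsecond line\nthird line" with
  | Some text -> print_endline ("-->" ^ text ^ "<--")
  | None -> print_endline "not found"
```

Returning an option instead of printing inside the helper keeps the match result usable by the caller; the printing in `let () = ...` mirrors the original `get_info`.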