C# - Regular expression for replacing extra characters between markers
Suppose I have the sample text below:
; </span><year><o:p></o:p> </span><</span><span style=3d'font-size:9.0pt;mso-bidi-font-family:arial'>manufacturer></span><span style=3d'mso-bidi-font-family:arial'> </span><model><o:p> </span><<span class=3dspelle>serial_number</span>><o:p> </span><<span class=3dspelle>accessories_value</span>><o:p></o:p></span> </span><<span class=3dspelle>accessories_list</span>> p; </span><<span class=3dspelle>worldwide_yn</span>> </span><</b><span class=3dspelle><span style=3d'mso-no-proof:yes'>pet_name</span></span><span style=3d'mso- no-proof:yes'>></span><o:p></o:p></p>
I am looking to find and replace every occurrence of the following pattern:
< any_html_tags markers_text any_html_tags >
Here:
html_tags: optional; they may be of both opening and closing type, may occur zero or more times, and may be any HTML tag.
markers_text: can be in one of two formats, either xxxxx (any number of characters) or xxxx_xxxxxx (the text can be of any length).
For example, I want to be able to find the following texts in the sample file:
1) <year>
2) <</span><span style=3d'font-size:9.0pt;mso-bidi-font-family:arial'>manufacturer>
3) <model>
4) <<span class=3dspelle>serial_number</span>>
5) <<span class=3dspelle>accessories_value</span>>
6) <<span class=3dspelle>accessories_list</span>>
7) <<span class=3dspelle>worldwide_yn</span>>
8) <</b><span class=3dspelle><span style=3d'mso-no-proof:yes'>pet_name</span></span><span style=3d'mso-no-proof:yes'>>
and replace them with the corresponding items, like:
1) <year>
2) </span><span style=3d'font-size:9.0pt;mso-bidi-font-family:arial'><manufacturer>
3) <model>
4) <span class=3dspelle></span><serial_number>
5) <span class=3dspelle></span><accessories_value>
6) <span class=3dspelle></span><accessories_list>
7) <span class=3dspelle></span><worldwide_yn>
8) </b><span class=3dspelle><span style=3d'mso-no-proof:yes'></span></span><span style=3d'mso-no-proof:yes'><pet_name>
So, between < and >, I want every tag except the marker_text to be removed and placed before the <. I am doing this using C# regex methods.
Can anyone please suggest a proper regular expression to achieve this?
The final sample result should look like:
; </span><year><o:p></o:p> </span></span><span style=3d'font-size:9.0pt;mso-bidi-font-family:arial'><manufacturer></span><span style=3d'mso-bidi-font-family:arial'> </span><model><o:p> </span><span class=3dspelle></span><serial_number><o:p> </span><span class=3dspelle></span><accessories_value><o:p></o:p></span> </span><span class=3dspelle></span><accessories_list> p; </span><span class=3dspelle></span><worldwide_yn> </b><span class=3dspelle><span style=3d'mso-no-proof:yes'></span></span><span style=3d'mso-no- proof:yes'><pet_name>
This is the search/replace you are looking for:
pattern:
<((?:</?span[^>]*>)*)(\w+)((?:</?span[^>]*>)*)>
replacement:
$1<$2>$3
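In C#, this pattern and replacement can be applied with `Regex.Replace`. The following is a minimal sketch, assuming the text contains literal `<` and `>` characters as shown in the question. The class and method names (`MarkerCleanup`, `MoveTagsOut`) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerCleanup
{
    // Pattern exactly as given above:
    //   group 1 = span tags between '<' and the marker text
    //   group 2 = the marker text (\w+ also matches the underscore)
    //   group 3 = span tags between the marker text and '>'
    private static readonly Regex MarkerPattern =
        new Regex(@"<((?:</?span[^>]*>)*)(\w+)((?:</?span[^>]*>)*)>");

    // Hypothetical helper name; applies the replacement "$1<$2>$3".
    public static string MoveTagsOut(string input)
    {
        return MarkerPattern.Replace(input, "$1<$2>$3");
    }

    public static void Main()
    {
        // Item 4 from the question, followed by an ordinary tag.
        string sample = "<<span class=3dspelle>serial_number</span>><o:p>";
        Console.WriteLine(MoveTagsOut(sample));
        // prints: <span class=3dspelle><serial_number></span><o:p>
    }
}
```

Because `\w` includes the underscore, the single group `(\w+)` covers both marker formats from the question (xxxxx and xxxx_xxxxxx). Ordinary tags such as `<o:p>` or `</span>` are left alone, since they do not consist of `<`, optional span tags, a word, optional span tags, and `>`.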
Online demo (see the "Context" tab)
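The expected output in the question places every stripped tag in front of the marker, and item 8 also contains a `</b>` tag, which a span-only pattern does not cover. One possible variation, offered as a sketch and not as part of the original answer, is to accept any tag name and to emit both tag groups before the marker (`$1$3<$2>`). The names `MarkerCleanupVariant` and `MoveAllTagsBefore` are hypothetical, and the pattern assumes that attribute values never contain a `>` character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerCleanupVariant
{
    // Assumption: any opening or closing tag may appear around the marker,
    // so "span" is replaced by "a letter followed by anything up to '>'".
    private static readonly Regex MarkerPattern =
        new Regex(@"<((?:</?[A-Za-z][^>]*>)*)(\w+)((?:</?[A-Za-z][^>]*>)*)>");

    // Emits the tags found before and after the marker text first,
    // then the clean marker.
    public static string MoveAllTagsBefore(string input)
    {
        return MarkerPattern.Replace(input, "$1$3<$2>");
    }

    public static void Main()
    {
        // Items 4 and 8 from the question.
        string item4 = "<<span class=3dspelle>serial_number</span>>";
        string item8 = "<</b><span class=3dspelle><span style=3d'mso-no-proof:yes'>pet_name</span></span><span style=3d'mso-no-proof:yes'>>";

        Console.WriteLine(MoveAllTagsBefore(item4));
        // prints: <span class=3dspelle></span><serial_number>
        Console.WriteLine(MoveAllTagsBefore(item8));
        // prints: </b><span class=3dspelle><span style=3d'mso-no-proof:yes'></span></span><span style=3d'mso-no-proof:yes'><pet_name>
    }
}
```

This variation has only been checked against the items listed in the question. A plain tag such as `<b>` also matches the pattern, but it is replaced by itself, so the text is unchanged in that case.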