python - Searching basic comments in C++ by regex


I'm writing a Python program that searches for comments in a C++ program using regex. I wrote the following code:

    import re

    regex = re.compile(r'(\/\/(.*?))\n|(\/\*(.|\n)*\*\/)')
    comments = []
    text = ""
    # Read the whole C++ source from stdin until end of file.
    while True:
        try:
            x = raw_input()
            text = text + "\n" + x
        except EOFError:
            break

    z = regex.finditer(text)
    for match in z:
        print match.group(1)

This code should detect comments of the form //I'm a comment and /*blah blah blah blah blah*/, but I'm getting the following output:

    //  program in c++
    None
    //use cout

which is not what I'm expecting. I thought match.group(1) should capture the first parenthesis of (\/\*(.|\n)*\*\/), but it does not. The C++ program I'm testing on is:

    //  program in c++

    #include <iostream>
    /** love c++
        awesome **/
    using namespace std;

    int main ()
    {
      cout << "hello world"; //use cout
      return 0;
    }

You didn't use the right order: an inline comment can appear inside a multiline comment, so the pattern needs to begin with the multiline alternative. Example:

    /\*[\s\S]*?\*/|//.*
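A minimal sketch of this pattern in use, written for Python 3 (the question's code is Python 2) and with the sample program pasted in as a string instead of being read from stdin. The two alternatives are not wrapped in capture groups, so the sketch reads the whole match with group(0). For comparison it also runs the question's pattern: there group(1) belongs only to the // alternative, which is why it is None whenever the /* ... */ alternative is the one that matched. The names SAMPLE, COMMENT_RE and QUESTION_RE are illustrative.

```python
import re

# Multiline alternative first, then the inline one.
COMMENT_RE = re.compile(r'/\*[\s\S]*?\*/|//.*')

# The pattern from the question, kept unchanged for comparison.
QUESTION_RE = re.compile(r'(\/\/(.*?))\n|(\/\*(.|\n)*\*\/)')

SAMPLE = '''//  program in c++

#include <iostream>
/** love c++
    awesome **/
using namespace std;

int main ()
{
  cout << "hello world"; //use cout
  return 0;
}
'''

# group(0) is the whole match, whichever alternative produced it.
comments = [m.group(0) for m in COMMENT_RE.finditer(SAMPLE)]

# group(1) is only set when the // alternative matched.
question_groups = [m.group(1) for m in QUESTION_RE.finditer(SAMPLE)]

for comment in comments:
    print(repr(comment))
print(question_groups)
```

With the question's pattern, the block comment is available as group(3), or as group(0) like every other match.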

Note that you can improve this pattern if you have long multiline comments (this syntax is an emulation of the atomic group feature, which the re module did not support before Python 3.11):

    /\*(?:(?=([^*]+|\*(?!/)))\1)*\*/|//.*
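As a quick sanity check, the sketch below (Python 3, on a made-up input string) compares the simple pattern with the atomic-group emulation. The idiom is (?=(X))\1: the lookahead captures X in group 1, the engine does not backtrack into a lookahead once it has succeeded, and \1 then consumes exactly what was captured. The names SIMPLE, ATOMIC and source are illustrative.

```python
import re

SIMPLE = re.compile(r'/\*[\s\S]*?\*/|//.*')

# (?=(X))\1 emulates an atomic group around X, where X is either a run of
# non-star characters or a star that is not followed by a slash.
ATOMIC = re.compile(r'/\*(?:(?=([^*]+|\*(?!/)))\1)*\*/|//.*')

# Made-up input: a block comment containing a lone star, then an inline comment.
source = 'int a; /* first * star\n second line */ int b; // tail\n'

simple_hits = [m.group(0) for m in SIMPLE.finditer(source)]
atomic_hits = [m.group(0) for m in ATOMIC.finditer(source)]

print(simple_hits == atomic_hits)
print(atomic_hits)
```

Both patterns find the same comments here; the difference is only in how much backtracking the engine can do inside a long comment.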

But note that there are other traps, such as a string that contains /*...*/ or //...
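A small illustration of that trap, in Python 3 and on a made-up line of C++: the // inside the string literal is taken for the start of a comment, and the match runs to the end of the line.

```python
import re

COMMENT_RE = re.compile(r'/\*[\s\S]*?\*/|//.*')

# Made-up example: a URL inside a string literal, followed by a real comment.
line = 'const char *url = "http://example.com"; // real comment'

hits = [m.group(0) for m in COMMENT_RE.finditer(line)]
print(hits)
```

The single hit starts in the middle of the string literal, so part of the code is reported as a comment.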

So if you want to avoid these cases, for example because you want to make a replacement, you need to capture strings first and use a backreference in the replacement string, like this:

    (pattern for strings)|/\*[\s\S]*?\*/|//.*

Replacement: $1 (written \1 or \g<1> with Python's re.sub)
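A sketch of that replacement in Python 3. The string sub-pattern is an assumption standing in for "pattern for strings": it covers ordinary double-quoted literals with backslash escapes, and not raw string literals or character literals. A replacement function is used instead of a \1 template because, before Python 3.5, re.sub raised an error when the template referred to a group that did not take part in the match. The names STRING, STRIP_RE and strip_comments are hypothetical.

```python
import re

# Assumed stand-in for "pattern for strings": double-quoted literals with
# backslash escapes. Raw strings and character literals are not handled.
STRING = r'"(?:\\.|[^"\\])*"'

# Strings are tried first, so comment markers inside them are never examined.
STRIP_RE = re.compile(r'(' + STRING + r')|/\*[\s\S]*?\*/|//.*')


def strip_comments(code):
    # Keep group 1 (a string literal) when it matched; a comment leaves
    # group 1 unset, so it is replaced by the empty string.
    return STRIP_RE.sub(lambda m: m.group(1) or '', code)


code = 'puts("http://x /* not a comment */"); /* gone */ // also gone\nint y;\n'
stripped = strip_comments(code)
print(stripped)
```

The string literal survives intact while both comments are removed; the spaces that surrounded the comments are left in place.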

