Extract php code from a php file using RegEx

rizwan6feb
I am trying to extract the PHP code from a PHP file (the file also contains HTML, CSS and JavaScript). I am using the following regex for this:
    <\?[\w\W]*?\?>
But this doesn't cater for quotation marks (single and double quotes) or comments. In other words, how can I skip PHP tags that appear inside a string or a comment? Please have a look at the following code:
    <?php
        include("db.php");
        $name=$_REQUEST['name'];
        /* the regular expression to extract code inside php tags (i.e  <? and ?>) is  */
        $str='|<\?[\w\W]*?\?>|';
        # The regex can also be written using double quotes
        $str="|<\?[\w\W]*?\?>|";
    ?>
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>

    <title>Sample PHP File</title>
    </head>

    <body>
    <?="<h1>Some output from PHP?>
    </body>
    </html>
I need a regular expression that extracts the two blocks of PHP code from the sample code above.
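To make the problem concrete, here is a minimal sketch of how the simple pattern behaves. The `$source` string is a shortened, hypothetical stand-in for the sample file (with the string in the second block closed), and the variable names are illustrative only.

```php
<?php
/* Shortened stand-in for the sample file in the post (an assumption, not the full file). */
$source = <<<'EOT'
<?php
    /* the tags are <? and ?> */
    $str='|<\?[\w\W]*?\?>|';
?>
<body>
<?="<h1>Some output</h1>"?>
</body>
EOT;

// The original pattern, wrapped in "/" delimiters for preg_match_all().
preg_match_all('/<\?[\w\W]*?\?>/', $source, $naive);

foreach ($naive[0] as $i => $block) {
    echo "--- match $i ---\n", $block, "\n";
}
```

The first match stops at the closing tag inside the comment, so the `$str` assignment never makes it into the extracted block.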
Jun 8 '09 #1
Dormilich
@rizwan6feb
Well, when I try it, the RegEx gets it all. Your problem is that it is extremely difficult to determine whether a "?>" sits inside a comment, sits inside a string, or actually ends a processing instruction (i.e. you more or less need to parse the code).

Maybe it's easier to do the reverse: instead of extracting what's between <? and ?>, strip out what's between ?> and <? (though this may also fail in special circumstances).
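One way to do the parsing mentioned above without writing a parser is to let PHP's own tokenizer decide where a block starts and ends. The sketch below is not from the thread: it assumes the tokenizer extension is available (it normally is), and `extract_php_blocks()` and the shortened `$source` sample are hypothetical names introduced for illustration. Note that a bare `<?` only counts as an opening tag when `short_open_tag` is enabled, so the sample sticks to `<?php` and `<?=`.

```php
<?php
/**
 * Collect every PHP block in $source, relying on token_get_all()
 * to tell real tags apart from tags inside strings and comments.
 */
function extract_php_blocks(string $source): array
{
    $blocks  = [];
    $current = null; // null while we are outside a PHP block

    foreach (token_get_all($source) as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        $id   = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;

        if ($id === T_OPEN_TAG || $id === T_OPEN_TAG_WITH_ECHO) {
            $current = '';
        }
        if ($current !== null) {
            $current .= $text;
        }
        if ($id === T_CLOSE_TAG) {
            $blocks[] = $current;
            $current  = null;
        }
    }
    if ($current !== null) { // the file ended without a closing tag
        $blocks[] = $current;
    }
    return $blocks;
}

/* Shortened, well-formed stand-in for the sample file (an assumption). */
$source = <<<'EOT'
<?php
    /* tags are <? and ?> */
    $str = '|<\?[\w\W]*?\?>|';
?>
<body>
<?= "<h1>Some output</h1>" ?>
</body>
EOT;

$blocks = extract_php_blocks($source);
print_r($blocks);
```

The closing-tag token includes the newline that directly follows it, so each block may need a `rtrim()`. Inline HTML never reaches `$blocks` because it arrives while `$current` is `null`, which also covers the "reverse" idea of dropping everything outside the tags.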
Jun 11 '09 #2
rizwan6feb
Found a solution at http://regexadvice.com/forums/thread/53756.aspx

The regex below is what I needed:
<\?(\x22[^\x22]*?\x22|\x27[^\x27]*?\x27|/\*.*?\*/|.)*?\?>
Jun 12 '09 #3
Dormilich
Are you sure? The RegEx fails for me (that is, it doesn't get the whole first block, only chunks of it).

Translated into literal characters, it reads: between the tags, match anything that's a string, a comment, or any single char.

    <\?("[^"]*?"|'[^\']*?'|/\*.*?\*/|.)*?\?>
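One possible explanation for the "chunks" is an assumption, since the thread does not say which engine or options were used: without dot-all mode, `.` does not match a newline, so the pattern cannot get past the first line of a multi-line block. The sketch below runs the same pattern in PHP with and without the `s` modifier. The `$source` sample is a shortened, hypothetical stand-in for the file in the first post.

```php
<?php
/* Shortened, well-formed stand-in for the sample file (an assumption). */
$source = <<<'EOT'
<?php
    include("db.php");
    /* tags are <? and ?> */
    $str='|<\?[\w\W]*?\?>|';
?>
<body>
<?="<h1>Some output</h1>"?>
</body>
EOT;

// The pattern from the thread, using "~" as delimiter because it contains "/".
$pattern = '~<\?("[^"]*?"|\'[^\']*?\'|/\*.*?\*/|.)*?\?>~';

preg_match_all($pattern, $source, $without);      // "." stops at newlines
preg_match_all($pattern . 's', $source, $with);   // "s" lets "." match newlines too

echo "without s: ", $without[0][0], "\n";
echo "with s:\n", $with[0][0], "\n";
```

Without the modifier, the first match is only the `<? and ?>` fragment inside the comment. With it, the first match runs from `<?php` to the real closing tag. The pattern still ignores `#` and `//` comments and backslash-escaped quotes, so an apostrophe in a one-line comment or an escaped quote in a string can still throw it off.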
Jun 12 '09 #4
