How can I grab my tables and display them on another page using regex?

For some reason

page1.html (gets its contents from page2.html):

function get_tag($htmlelement, $attr, $value, $html)
{
    $attr = preg_quote($attr);
    $value = preg_quote($value);

    if ($attr != '' && $value != '')
    {
        $tag_regex = '/<'.$htmlelement.'[^>]*'.$attr.'="'.$value.'" width="100%">(.*?)<\\/'.$htmlelement.'>/si';

        $matchCount = preg_match($tag_regex, $html, $matches);

        if ($matchCount > 0)
        {
            echo("$matchCount matches found.\n");
        }
        else
        {
            echo("no records");
        }
    }
}

$htmlcontent = file_get_contents("http://page2.html/");
$extract = get_tag("table", "class", "tablemenu", $htmlcontent);

echo $extract;

page2.html

I would like to grab all of the tables from the page and display them on another one.

// more code

<table width="100%" class="tablemenu">
<tbody>
<tr>
<td>

</td>
</tr>
</tbody>
</table>

<table width="100%" class="tablemenu">
<tbody>
<tr>
<td>
   // some data
</td>
</tr>
</tbody>
</table>

<table width="100%" class="tablemenu">
<tbody>
<tr>
<td>
   // some data
</td>
</tr>
</tbody>
</table>

<table width="100%" class="tablemenu">
<tbody>
<tr>
<td>
   // some data
</td>
</tr>
</tbody>
</table>

// more code

I have kept trying but have had no luck :(
Aug 26 '13 #1

Atli
5,058 Expert 4TB
Regular expressions are not a good way to find things in an HTML document. HTML syntax is too irregular for them to be reliable. You need an actual parser if you want this to work properly.
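
To illustrate the point, here is a minimal sketch of how a pattern like the one in the question breaks as soon as the attribute order changes. The two sample strings are hypothetical; the pattern is modeled on the question's regex, which expects class="tablemenu" to be followed directly by width="100%", while page2.html writes the attributes the other way around.

```php
<?php
// Pattern modeled on the one in the question: it expects class="..."
// to be immediately followed by width="100%".
$pattern = '/<table[^>]*class="tablemenu" width="100%">(.*?)<\/table>/si';

// Hypothetical sample markup: the same table with its attributes in two orders.
$classFirst = '<table class="tablemenu" width="100%"><tr><td>A</td></tr></table>';
$widthFirst = '<table width="100%" class="tablemenu"><tr><td>A</td></tr></table>';

$matchesClassFirst = preg_match($pattern, $classFirst);
$matchesWidthFirst = preg_match($pattern, $widthFirst);

var_dump($matchesClassFirst); // int(1): attribute order happens to fit the pattern
var_dump($matchesWidthFirst); // int(0): same table, different attribute order
```

Both strings describe the same table to a browser, but only one of them matches. A parser does not care about attribute order, quoting style, or extra whitespace.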

PHP has built-in libraries that can be used to parse HTML documents. Most of them are more focused on XML, but many can also deal with HTML. The DOM extension, for example, has a DOMDocument class with a loadHTMLFile method. With that, you can traverse the HTML structure in much the same way you would in JavaScript.

For example:
<?php

$filePath = "/path/to/HTML/file.html";

// Load the HTML file.
$dom = new DOMDocument();
$dom->loadHTMLFile($filePath);

// Find all the tables.
$tables = $dom->getElementsByTagName("table");

// Go through the table list and look for tables
// with the class: "tablemenu"
if ($tables && $tables->length > 0) {
    for ($i = 0; $i < $tables->length; ++$i) {
        $item = $tables->item($i);
        $class = $item->attributes->getNamedItem("class")->nodeValue;
        if ($class == "tablemenu") {
            // The $item here would be one of the
            // tables we are looking for!
        }
    }
}

Aug 26 '13 #2
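
Since the original goal was to display the tables on another page, one possible way to finish the example is sketched below. It is not part of the answer above: it swaps the manual loop for a DOMXPath query and uses DOMDocument::saveHTML() with a node argument to get each table's markup back as a string. The $html sample is a hypothetical stand-in for the fetched page, and the XPath test matches only when the class attribute is exactly "tablemenu".

```php
<?php
// Hypothetical stand-in for the fetched page; in practice this string
// would come from file_get_contents() or similar.
$html = <<<HTML
<html><body>
<table width="100%" class="tablemenu"><tbody><tr><td>first</td></tr></tbody></table>
<table width="100%"><tbody><tr><td>no class</td></tr></tbody></table>
<table width="100%" class="tablemenu"><tbody><tr><td>second</td></tr></tbody></table>
</body></html>
HTML;

// Collect parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select only the tables whose class attribute is exactly "tablemenu".
$xpath = new DOMXPath($dom);
$tables = $xpath->query('//table[@class="tablemenu"]');

// Serialize each matching table, tags included, so it can be echoed elsewhere.
$output = [];
foreach ($tables as $table) {
    $output[] = $dom->saveHTML($table);
}

echo implode("\n", $output);
```

Passing a node to saveHTML() requires PHP 5.3.6 or later. The table without a class attribute is simply skipped by the query.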
luke
14
I am getting the error "Trying to get property of non-object" in C:\wamp\www

It refers to this line:

$class = $item->attributes->getNamedItem("class")->nodeValue;

My table does print out, though.

$filePath = "http://msn.net/";

// Load the HTML file.
$dom = new DOMDocument();
@$dom->loadHTML(file_get_contents($filePath));

// Find all the tables.
$tables = $dom->getElementsByTagName("table");

if ($tables && $tables->length > 0) {
    for ($i = 0; $i < $tables->length; ++$i) {
        $item = $tables->item($i);
        $class = $item->attributes->getNamedItem("class")->nodeValue;

        if ($class == "tablemenu") {

           echo $item->nodeValue;
           echo "<br />";
        }
    }
}

Aug 27 '13 #3
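
A likely cause of the notice, assuming the fetched page also contains tables without a class attribute: getNamedItem("class") returns null for those tables, so reading ->nodeValue on the result fails, while the tables that do have the class still print. A minimal sketch of a null-safe version follows; the sample markup is hypothetical, and DOMElement::getAttribute() is used because it returns an empty string when the attribute is missing.

```php
<?php
// Hypothetical sample markup; the first table deliberately has no class attribute.
$html = '<table><tr><td>plain</td></tr></table>'
      . '<table class="tablemenu"><tr><td>menu</td></tr></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html); // suppress parser warnings, as in the post above

$found = [];
foreach ($dom->getElementsByTagName('table') as $item) {
    // getAttribute() returns "" for a missing attribute, so no null check is needed.
    if ($item->getAttribute('class') === 'tablemenu') {
        $found[] = trim($item->nodeValue);
    }
}

print_r($found);
```

Note that nodeValue yields only the text inside the table. To output the table markup itself, $dom->saveHTML($item) could be echoed instead.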
