Extract records from HTML of another site

All,
Can someone help me with the next step?

First of all, let me say I'm new to PHP. I pieced the following code together from samples I found on the net and from a book I bought called PHP Cookbook, so please forgive me if this isn't the best approach. I'm open to suggestions.

I finally got my code working: it logs into another site and pulls the order status page to my server.

<?php
/*
Login to site
*/
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookieFileName");
curl_setopt($ch, CURLOPT_URL, "https://www.homier.com/default.asp?page=signin");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "EM****************@swbell.net&Password=1040ez ");
ob_start(); // prevent any output
curl_exec($ch); // execute the curl command
ob_end_clean(); // stop preventing output
curl_close($ch);
unset($ch); // the cookie jar is written once the handle is released

/*
Dump html of orderstatus page into a file on my server
*/
$fh = fopen('raw_orderstatus.html', 'w') or die('Cannot open raw_orderstatus.html for writing');
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookieFileName");
curl_setopt($ch, CURLOPT_URL, "https://www.homier.com/default.asp?page=orderstatus");
curl_setopt($ch, CURLOPT_FILE, $fh); // write the response body straight to $fh
curl_exec($ch);
curl_close($ch);
fclose($fh); // flush and close the output file
?>

My problem: how can I capture only the data in the <td class='n8n_CCCCCC_default'> tags?
Is there a way to do this at file creation?
I checked with my ISP and I can't use lynx -dump file.html.

The goal here is to load these records into a MySQL database.

Thanks in advance,
Steve

The HTML code looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>homier.com</title>
<LINK REL='stylesheet' TYPE='text/css' HREF='hdcstyle.css'>
<meta http-equiv='Content-Type' content='text/html; charset=iso-8859-1'>
<meta name='description' content=' '>
<meta name='keywords' content=' '>
<meta name='revisit-after' content='7 days'>
<meta name='robots' content='all, index, follow'>
<meta http-equiv="Pragma" content="no-cache">
<script language="JavaScript"
src="https://www.thawte.com/html/certdetails.js"
type="text/javascript"></script>

</head>
<body leftmargin='0' topmargin='0' rightmargin='0' bottommargin='0'
marginwidth='0' marginheight='0'>
<table cellspacing="0" cellpadding="0" border="0" width='770'>
<tr>
<td align='left' valign='top'>
<img src='https://www.homier.com/graphics/hdclogo3.jpg' border='0'
width='370' height='58' alt='Homier Distributing Company, Inc.'></td>

<td align='right' valign='top'>
<table cellspacing="0" cellpadding="0" border="0">
<tr><td align='right' valign='middle' class='menu'>
<a href='https://www.homier.com/default.asp?page=cart' class='menu'><img
src='https://www.homier.com/graphics/cart.gif' border='0' width='17'
height='21' align='absmiddle'>Shopping Cart</a>
| <a href='https://www.homier.com/default.asp?page=stores' class='menu'>Sale
Locations</a>
| <a href='https://www.homier.com/default.asp?page=about' class='menu'>About
Us</a>
| <a href='https://www.homier.com/default.asp?page=contacts'
class='menu'>Contact Us</a>
| <a href='https://www.homier.com/default.asp?page=faq' class='menu'>FAQ</a>
</td></tr>
<tr><td align='right' valign='middle' class='menu'>
<a href='https://www.homier.com/default.asp?page=myprofile' class='menu'>My
Account</a>
| <a href='https://www.homier.com/default.asp?page=orderstatus'
class='menu'>Order Status</a> |
<a href='https://www.homier.com/default.asp?page=dealers'
class='menu'>Dealer Extranet</a>
</td></tr>

</table>
</td>
</tr>
</table>
<table cellspacing="0" cellpadding="0" border="0" width='770'>
<tr>
<td colspan='2' valign='top' align='center'
background='https://www.homier.com/graphics/hdcbk3.jpg'>
<table cellspacing='0' cellpadding='0' border='0'>
<tr><table cellspacing='0' cellpadding='0' border='0'><tr>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_start.gif' border='0' width='5'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=0' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Home</a></td>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=1' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Tools</a></td>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=2' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Automotive</a></td>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=4' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Electronics</a></td>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=6' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Collectibles </a></td>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_end.gif' border='0' width='5'
height='21'></td>
</tr>
</table>
<table cellspacing='0' cellpadding='0' border='0'><tr>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_start.gif' border='0' width='5'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=3' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Outdoor Living</a></td>
<td align='middle' valign='top' class='tab_ends'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=5' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">Home Furnishings</a></td>
<td align='middle' valign='top' class='tab_ends'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=7' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">General Merchandise</a></td>
<td align='middle' valign='top' class='tab_ends'><img
src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
height='21'></td>
<td align='middle' valign='middle' class='tabs' style='background-image:
url(https://www.homier.com/graphics/tab_bg.gif);'><a
href='https://www.homier.com/default.asp?dpt=99' class='link'
onmouseover="this.style.color='yellow'"
onmouseout="this.style.color='white'">See All</a></td>
<td align='middle' valign='top'><img
src='https://www.homier.com/graphics/tab_end.gif' border='0' width='5'
height='21'></td>
</tr>
</table>
</td>
</tr>
</table>
<table cellspacing='0' cellpadding='0' border='0' width='770'>
<form action='https://www.homier.com/default.asp' method='post'>
<tr>
<td class='tab_background' height='28' valign='middle' align='left'
width='500' nowrap>
<input type='hidden' name='page' value='search'>
<input type='hidden' name='pgndx' value='1'>
&nbsp;&nbsp;Search in:
<select name='SearchIn' class='search'>
<option value='0'
SELECTED
>All Departments</option>
<option value='99'>Catalog Number</option>
<option value='1'>Tools</option>
<option value='2'>Automotive</option>
<option value='3'>Outdoor Living</option>
<option value='4'>Electronics</option>
<option value='5'>Home Furnishings</option>
<option value='6'>Collectibles</option>
<option value='7'>General Merchandise</option>

</select>
&nbsp;for:
<input type='text' name='SearchFor' size='15' maxlength='30' class='search'
value=''>
<input type='image' src='https://www.homier.com/graphics/go.gif' border='0'
align='absmiddle' alt='Click to search'>
</td>
<td align='left' class='b8n_white_000f46' width='100%'
nowrap>800-348-5004</td>
<td align='right' class='b8n_white_000f46' width='100' nowrap>
<a href='https://www.homier.com/default.asp?page=logout'
class='service'><img src='https://www.homier.com/graphics/lock.gif'
width='11' height='15' border='0' align='absmiddle'>&nbsp;Sign Out&nbsp;
</a></td>
</tr>
</form>
</table>
<table cellspacing='0' cellpadding='0' border='0' width='770'>
<tr><td valign='top' align='center'>
<table cellpadding='0' cellspacing='0' border='0' width='750'>
<tr><td class='e16n_000f46_default'>Order Status & Tracking</td></tr>
<tr><td align='center'><img src='https://www.homier.com/graphics/grey.gif'
border='0' width='750' height='1'></td></tr>
<tr><td class='b9in_default_default' align='right'>Orders 8/1/2004 -
10/30/2004</td></tr>
<tr><td>&nbsp;</td></tr>
</table>
<table cellpadding='2' cellspacing='0' border='0' width='750'>
<tr>
<td class='b8n_default_default'>Order #</td>
<td class='b8n_default_default'>Ref</td>
<td class='b8n_default_default'>Order Date</td>
<td class='b8n_default_default'>Shipped To</td>
<td class='b8n_default_default'>Status</td>
<td class='b8n_default_default'>Tracking</td>
</tr>
<tr>
<td class='n8n_CCCCCC_default'><a
href='https://www.homier.com/default.asp?page=orderdetail&orderid=307377'>16
0710SE</a></td>
<td class='n8n_CCCCCC_default'>307377</td>
<td class='n8n_CCCCCC_default'>10/29/2004</td>
<td class='n8n_CCCCCC_default'>Stan Johnson Blue Springs, MO</td>
<td class='n8n_CCCCCC_default'>AR Processing</td>
<td class='n8n_CCCCCC_default'><a
href='https://www.homier.com/default.asp?page=tracking&trackingnumber='></a>
</td>
</tr>
</table>

</td>
</tr>
<tr><td colspan='2' align='center'><table cellpadding='0' cellspacing='0'
border='0'>
<tr><td align='center'>&nbsp;</td></tr>
<tr><td align='center'><img src='https://www.homier.com/graphics/grey.gif'
border='0' width='350' height='1'></td></tr>
<tr><td align='center'>&nbsp;</td></tr>
<tr><td align='center' class='n7n_default_default'>
<a href='https://www.homier.com/default.asp?page=privacy'
class='menu'>Privacy & Security</a>
| <a href='https://www.homier.com/default.asp?page=terms' class='menu'>Terms
of Use</a>
| <a href='https://www.homier.com/default.asp?page=pressreleases'
class='menu'>Press Releases</a>
</td></tr>
<tr><td align='center' class='n7n_default_default'>
<a href='https://www.homier.com/default.asp?page=sitemap' class='menu'>Site
Map</a>
| <a href='https://www.homier.com/default.asp?page=warranty'
class='menu'>Warranty & Returns</a>
| <a href='https://www.homier.com/default.asp?page=shipping'
class='menu'>Shipping Policy</a>
</td></tr>
<tr><td>&nbsp;</td></tr>
<tr><td align='center' class='copyright'><a
href='https://www.homier.com/default.asp?page=copyright'>Copyright</a>&nbsp;
&copy;2004, Homier Distributing Company. All rights reserved.</td></tr>
</table>
</td></tr>
</table>

</body>
</html>
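
One possible way to pull out only the cells asked about is to parse the page with PHP's DOM extension and select them with XPath. The following is a minimal sketch, not a tested solution for this site: the function name extractOrderRows is made up for illustration, and it assumes the dom extension is enabled on the server and that every data cell carries exactly the class n8n_CCCCCC_default, as in the sample above.

```php
<?php
/**
 * Hypothetical helper: return one array of cell texts per table row,
 * keeping only <td> cells whose class is exactly 'n8n_CCCCCC_default'.
 */
function extractOrderRows(string $html): array
{
    $doc = new DOMDocument();
    // The page is loose HTML 4.01, so collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];
    // Match only rows that directly contain at least one cell of the target class.
    foreach ($xpath->query("//tr[td[@class='n8n_CCCCCC_default']]") as $tr) {
        $cells = [];
        foreach ($xpath->query("td[@class='n8n_CCCCCC_default']", $tr) as $td) {
            // Collapse whitespace left over from line breaks inside the markup.
            $cells[] = trim(preg_replace('/\s+/', ' ', $td->textContent));
        }
        $rows[] = $cells;
    }
    return $rows;
}
```

With the file written by the script above, this could be called as extractOrderRows(file_get_contents('raw_orderstatus.html')). To do the extraction "at file creation" instead, one option is to set CURLOPT_RETURNTRANSFER rather than CURLOPT_FILE, so that curl_exec() returns the page as a string that can be passed to the function directly. Each returned row is a plain array of column values, which should map naturally onto one INSERT per row when loading into MySQL.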
Jul 17 '05 #1