
extract web data

#1
February 25th, 2007, 01:35 PM
caine
I want to extract data from a news feed page,
http://everling.nierchi.net/mmubulletins.php.
I only need the text between the opening and closing tags of
<title>, <category> and <link>. Whenever I run the extraction, the
first news title stored is always "MMU Bulletin Board RSS Feed". It is
stored with the correct bulletin link, but the title is not the actual
news title.

The information I need appears only between <item> and </item>, which
contain the <title>, <category> and <link> tags.

<?php

include 'connect.php';

$URL="http://everling.nierchi.net/mmubulletins.php";

// Read the whole feed into $source.
$source = "";
$f = fopen($URL, "r");
if ($f) {
    while (!feof($f)) {
        $source .= fread($f, 1000);
    }
    fclose($f);
} else {
    echo 'Unable to open '.$URL.'.';
    die;
}




//extract the date into database
$datetime = date("Y-n-j");

$total= substr_count($source, "<item>");

//extract necessary information into database
$pos=0;
for($loop=0;$loop<$total;$loop++)
{
$line1 = strpos($source, "<title>", $pos);
$end1 = strpos($source, "</title>", $line1);
$line1 = $line1 + 7;
$end1 = $end1 - $line1;
$title = substr($source, $line1, $end1);
$title = convert($title);

$line2 = $line1 + $end1 + 1;
$line2 = strpos($source, "<category>", $line2);
$end2 = strpos($source , "</category>" , $line2);
$line2 = $line2 + 10;
$end2 = $end2 - $line2;
$category = substr($source , $line2, $end2);
$category = convert($category);

$line3 = $line2 + $end2 + 1;
$line3 = strpos($source , "<link>" , $line3);
$end3 = strpos($source , "</link>" , $line3);
$line3 = $line3 + 6;
$end3 = $end3 - $line3;
$link = substr($source , $line3 , $end3);
$link = convert($link);

$pos = $line3 + $end3 + 1;

$qry = "INSERT INTO `bul_data` (`DATE`, `TITLE`,
`DEPARTMENT`,`CAMPUS`, `LINK`) VALUES
( '$datetime','$title','$category','', '$link')";

$res = mysql_query($qry) OR die(mysql_error());

}

function convert($string)
{

$string = htmlspecialchars($string,ENT_QUOTES);
return $string;
}

?>
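
The symptom described in the post is consistent with the loop starting its search at offset 0: the first <title> in an RSS document belongs to <channel>, not to an <item>, so the channel title gets paired with the first item's category and link. Below is a minimal sketch of one possible fix that keeps the same strpos/substr approach but limits each search to a single <item> block. The helper names extract_tag and extract_items and the sample feed are hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: return the text between <$tag> and </$tag> in $block,
// or null if either tag is missing.
function extract_tag($block, $tag)
{
    $open = "<$tag>";
    $close = "</$tag>";
    $start = strpos($block, $open);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);
    $end = strpos($block, $close, $start);
    if ($end === false) {
        return null;
    }
    return trim(substr($block, $start, $end - $start));
}

// Hypothetical helper: collect title/category/link from each <item> only,
// so the channel-level <title> and <link> are never picked up.
function extract_items($source)
{
    $items = array();
    $pos = 0;
    while (($start = strpos($source, "<item>", $pos)) !== false) {
        $end = strpos($source, "</item>", $start);
        if ($end === false) {
            break; // unterminated item: stop rather than guess
        }
        $block = substr($source, $start, $end - $start);
        $items[] = array(
            'title'    => extract_tag($block, 'title'),
            'category' => extract_tag($block, 'category'),
            'link'     => extract_tag($block, 'link'),
        );
        $pos = $end + strlen("</item>");
    }
    return $items;
}

// Made-up sample feed, for illustration only.
$sample = "<rss><channel><title>MMU Bulletin Board RSS Feed</title>"
        . "<link>http://bulletin.mmu.edu.my/</link>"
        . "<item><title>First bulletin</title><category>Dept A</category>"
        . "<link>http://example.com/1</link></item>"
        . "<item><title>Second bulletin</title><category>Dept B</category>"
        . "<link>http://example.com/2</link></item>"
        . "</channel></rss>";

$items = extract_items($sample);
echo $items[0]['title'], "\n";
```

Because each tag is looked up inside its own <item> block, the order of <title>, <category> and <link> within an item no longer matters, and a missing tag yields null instead of shifting every later value by one.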

#2
February 25th, 2007, 02:05 PM
McKirahan

re: extract web data

"caine" <thensiujing@gmail.com> wrote in message
news:1172409876.997638.19400@p10g2000cwp.googlegroups.com...
Quote:
I want to extract data from a news feed page,
http://everling.nierchi.net/mmubulletins.php.
I only need the text between the opening and closing tags of
<title>, <category> and <link>. Whenever I run the extraction, the
first news title stored is always "MMU Bulletin Board RSS Feed". It is
stored with the correct bulletin link, but the title is not the actual
news title.

The information I need appears only between <item> and </item>, which
contain the <title>, <category> and <link> tags.
[snip]

Pull up that page in a browser and all you'll see is:

<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0">
<channel>
<title>MMU Bulletin Board RSS Feed</title>
<link>http://bulletin.mmu.edu.my/</link>
<description>Yet another MMU Bulletin Board RSS Feed developed by everling</description>
<ttl>15</ttl>
</channel>
</rss>

Visiting the <link> specified, you'll (in part) see this:

"If any query needed , please contact webmaster@mmu.edu.my"

Contact them.
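
The feed quoted above has channel metadata but no <item> elements, so an extraction loop has nothing to store. A minimal sketch of how a script might detect that case before inserting rows, assuming PHP's SimpleXML extension is available; the helper name feed_items is hypothetical, and the test string is modeled on the output shown above with the XML declaration omitted.

```php
<?php
// Hypothetical helper: parse an RSS string and return its items,
// or an empty array when parsing fails or the feed has no <item> elements.
function feed_items($xml)
{
    $rss = simplexml_load_string($xml);
    if ($rss === false || !isset($rss->channel)) {
        return array();
    }
    $items = array();
    foreach ($rss->channel->item as $item) {
        $items[] = array(
            'title'    => (string) $item->title,
            'category' => (string) $item->category,
            'link'     => (string) $item->link,
        );
    }
    return $items;
}

// Modeled on the feed shown above: channel metadata only, no items.
$empty_feed = '<rss version="2.0"><channel>'
    . '<title>MMU Bulletin Board RSS Feed</title>'
    . '<link>http://bulletin.mmu.edu.my/</link>'
    . '<description>Yet another MMU Bulletin Board RSS Feed developed by everling</description>'
    . '<ttl>15</ttl>'
    . '</channel></rss>';

$items = feed_items($empty_feed);
if (count($items) === 0) {
    echo "Feed has no <item> elements; nothing to insert.\n";
}
```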

