
extract web data

I want to extract data from a news feed page,
http://everling.nierchi.net/mmubulletins.php.
I only need the text between the opening and closing tags of
<title>, <category> and <link>. Whenever I run the extraction, the
first news title stored is always "MMU Bulletin Board RSS Feed". It is
paired with the correct link for the first bulletin, but that
bulletin's own title is not stored.

The information I need only appears between <item> and </item>, which
contain those <title>, <category> and <link> tags.

<?php

include 'connect.php';

$URL="http://everling.nierchi.net/mmubulletins.php";

$f = fopen($URL, "r");
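if($f){

// start with an empty buffer so the first append does not hit an undefined variable
$source = "";
while(!feof($f))
{
// append the next chunk (up to 1000 bytes) to what has been read so far
$source .= fread($f, 1000);
}
fclose($f);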
}
else
{
echo 'Unable to open '.$URL.'.';
die;
}


//extract the date into database
$datetime = date("Y-n-j");

$total= substr_count($source, "<item>");

//extract necessary information into database
$pos=0;
for($loop=0;$loop<$total;$loop++)
{
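// jump to the next <item> first so the channel-level <title> is never matched
$pos = strpos($source, "<item>", $pos);
$line1 = strpos($source, "<title>", $pos);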
$end1 = strpos($source, "</title>", $line1);
$line1 = $line1 + 7;
$end1 = $end1 - $line1;
$title = substr($source, $line1, $end1);
$title = convert($title);

$line2 = $line1 + $end1 + 1;
$line2 = strpos($source, "<category>", $line2);
$end2 = strpos($source , "</category>" , $line2);
$line2 = $line2 + 10;
$end2 = $end2 - $line2;
$category = substr($source , $line2, $end2);
$category = convert($category);

$line3 = $line2 + $end2 + 1;
$line3 = strpos($source , "<link>" , $line3);
$end3 = strpos($source , "</link>" , $line3);
$line3 = $line3 + 6;
$end3 = $end3 - $line3;
$link = substr($source , $line3 , $end3);
$link = convert($link);

$pos = $line3 + $end3 + 1;

$qry = "INSERT INTO `bul_data` (`DATE`, `TITLE`,
`DEPARTMENT`,`CAMPUS`, `LINK`) VALUES
( '$datetime','$title','$category','', '$link')";

$res = mysql_query($qry) OR die(mysql_error());

}

function convert($string)
{

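// htmlspecialchars() only encodes HTML characters; escape for SQL as well
$string = htmlspecialchars($string,ENT_QUOTES);
return mysql_real_escape_string($string);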
}

?>

Feb 25 '07 #1


"caine" <th*********@gmail.com> wrote in message
news:11*********************@p10g2000cwp.googlegroups.com...
I want to extract data from a news feed page,
http://everling.nierchi.net/mmubulletins.php.
I only need the text between the opening and closing tags of
<title>, <category> and <link>. Whenever I run the extraction, the
first news title stored is always "MMU Bulletin Board RSS Feed". It is
paired with the correct link for the first bulletin, but that
bulletin's own title is not stored.

The information I need only appears between <item> and </item>, which
contain those <title>, <category> and <link> tags.
[snip]

Pull up that page in a browser and all you'll see is:

<?xml version="1.0" encoding="iso-8859-1" ?>
- <rss version="2.0">
- <channel>
<title>MMU Bulletin Board RSS Feed</title>
<link>http://bulletin.mmu.edu.my/</link>
<description>Yet another MMU Bulletin Board RSS Feed developed by
everling</description>
<ttl>15</ttl>
</channel>
</rss>

Visiting the <link> specified, you'll see (in part) this:

"If any query needed , please contact we*******@mmu.edu.my"

Contact them.
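
For what it's worth, once the feed does contain <item> elements again, the channel-level <title> can be kept out of the results by reading each <item> on its own. Below is a minimal sketch, assuming PHP 5 or later with the SimpleXML extension enabled. The helper name extract_items and the sample feed are made up for illustration; they are not part of the original script or the real feed.

```php
<?php
// Hypothetical helper (not from the original script): returns one array
// per <item>, so the channel-level <title> and <link> are never picked up.
function extract_items($xml)
{
    $rss = simplexml_load_string($xml);
    if ($rss === false) {
        return array(); // input was not well-formed XML
    }

    $items = array();
    foreach ($rss->channel->item as $item) {
        $items[] = array(
            'title'    => trim((string) $item->title),
            'category' => trim((string) $item->category),
            'link'     => trim((string) $item->link),
        );
    }
    return $items;
}

// Made-up sample feed with a single item, for illustration only.
$sample = '<rss version="2.0"><channel>
<title>MMU Bulletin Board RSS Feed</title>
<link>http://bulletin.mmu.edu.my/</link>
<item>
<title>Example bulletin</title>
<category>Example department</category>
<link>http://bulletin.mmu.edu.my/example</link>
</item>
</channel></rss>';

$items = extract_items($sample);
echo count($items), "\n";
echo $items[0]['title'], "\n";
```

With a feed like the one shown above, which has a <channel> but no <item> elements, the same helper returns an empty array, so nothing would be inserted into the table.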
Feb 25 '07 #2
