Connecting Tech Pros Worldwide Help | Site Map

extract web data

 
LinkBack Thread Tools Search this Thread
  #1  
Old February 25th, 2007, 12:35 PM
caine
Guest
 
Posts: n/a
Default extract web data

I want to extract web data from a news feed page
http://everling.nierchi.net/mmubulletins.php.
Just want to extract necessary info between open n closing tags of
<title>, <categoryand <link>. Whenever I initiated the extraction,
first news title is always "MMU Bulletin Board RSS Feed" with the
proper bulletin's link stored, but not the correct news title being
stored.

Necessary info only appears within <itemand </itemwhich consists
those <title>, <categoryand <link>.

<?php

include 'connect.php';

$URL="http://everling.nierchi.net/mmubulletins.php";

$f = fopen($URL, "r");
if($f){

$pre = "";
while(!feof($f))
{

$pre= fread($f, 1000);
$source = $source.$pre;
}
}
else
{
echo 'Unable to open '.$URL.'.';
die;
}




//extract the date into database
$datetime = date("Y-n-j");

$total= substr_count($source, "<item>");

//extract necessary information into database
$pos=0;
for($loop=0;$loop<$total;$loop++)
{
$line1 = strpos($source, "<title>", $pos);
$end1 = strpos($source, "</title>", $line1);
$line1 = $line1 + 7;
$end1 = $end1 - $line1;
$title = substr($source, $line1, $end1);
$title = convert($title);

$line2 = $line1 + $end1 + 1;
$line2 = strpos($source, "<category>", $line2);
$end2 = strpos($source , "</category>" , $line2);
$line2 = $line2 + 10;
$end2 = $end2 - $line2;
$category = substr($source , $line2, $end2);
$category = convert($category);

$line3 = $line2 + $end2 + 1;
$line3 = strpos($source , "<link>" , $line3);
$end3 = strpos($source , "</link>" , $line3);
$line3 = $line3 + 6;
$end3 = $end3 - $line3;
$link = substr($source , $line3 , $end3);
$link = convert($link);

$pos = $line3 + $end3 + 1;

$qry = "INSERT INTO `bul_data` (`DATE`, `TITLE`,
`DEPARTMENT`,`CAMPUS`, `LINK`) VALUES
( '$datetime','$title','$category','', '$link')";

$res = mysql_query($qry) OR die(mysql_error());

}

function convert($string)
{

$string = htmlspecialchars($string,ENT_QUOTES);
return $string;
}

?>


  #2  
Old February 25th, 2007, 01:05 PM
McKirahan
Guest
 
Posts: n/a
Default Re: extract web data

"caine" <thensiujing@gmail.comwrote in message
news:1172409876.997638.19400@p10g2000cwp.googlegro ups.com...
Quote:
I want to extract web data from a news feed page
http://everling.nierchi.net/mmubulletins.php.
Just want to extract necessary info between open n closing tags of
<title>, <categoryand <link>. Whenever I initiated the extraction,
first news title is always "MMU Bulletin Board RSS Feed" with the
proper bulletin's link stored, but not the correct news title being
stored.
>
Necessary info only appears within <itemand </itemwhich consists
those <title>, <categoryand <link>.
[snip]

Pull up that page in a browser and all you'll see is:

<?xml version="1.0" encoding="iso-8859-1" ?>
- <rss version="2.0">
- <channel>
<title>MMU Bulletin Board RSS Feed</title>
<link>http://bulletin.mmu.edu.my/</link>
<description>Yet another MMU Bulletin Board RSS Feed developed by
everling</description>
<ttl>15</ttl>
</channel>
</rss

Visiting the <linkspecified you'll (in part) see this:

"If any query needed , please contact webmaster@mmu.edu.my"

Contact them.


 

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

Popular Articles

What is Bytes?

We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights. Get the best answers to your questions from over 220,989 network members.