KEYWORDS from a string

Nel
Hi all,

Before I re-invent the wheel here, is anyone willing to share a basic
script to extract META keywords from a string? I have a string, let's say
$pageText, that contains the dynamic contents of the page.

Ideally, I don't just want to explode the string and remove "and", "or" and
"the" etc., because some of the repeated keywords may be more than one word
long.

Also, it would be good to be able to rank the keywords according to the
frequency.

I have searched Google and HotScripts etc., but can only find web sites that
create METAs to copy & paste.

Thanx in advance.

Nel
Jul 17 '05 #1
See documentation for get_meta_tags().
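
For reference, get_meta_tags() reads from a file or URL rather than from a string such as $pageText, and it returns META tags that already exist in a page. A minimal sketch, assuming a made-up sample page written to a temporary file:

```php
<?php
// Hypothetical sample page, written to a temporary file because
// get_meta_tags() takes a filename or URL, not a string.
$html = "<html><head>\n"
      . "<meta name=\"keywords\" content=\"php, meta tags, keywords\">\n"
      . "<meta name=\"description\" content=\"A sample page\">\n"
      . "</head><body></body></html>\n";

$file = tempnam(sys_get_temp_dir(), "meta");
file_put_contents($file, $html);

// The result is an array keyed by the lower-cased name attribute
$tags = get_meta_tags($file);
unlink($file);

// Split the keywords content into a trimmed list
$keywords = isset($tags["keywords"])
    ? array_map("trim", explode(",", $tags["keywords"]))
    : array();

print_r($keywords);
```

This only helps when the page already carries a keywords tag; it does not derive keywords from body text.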
Jul 17 '05 #2
The reply above answers your question if you are looking for the strict
definition of meta tags in HTML.

If by meta you mean keywords that are important and somewhat unique in
the body of the text, then I suggest that you first define a list of
common words, and then remove them to arrive at the "meta" keywords.
The way I do it is to start with the MySQL stop words (search the web
for the list). Then add words that are common in your domain (e.g. "html"
may be a common word on the web). Now remove all of these words from
the string using regular expressions, and what remains is pretty much
the unique words.
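
The stop-word approach described above might be sketched like this. The function name remove_stop_words() and the short word list are assumptions for illustration only; a real list would start from the MySQL stop words plus your own domain words.

```php
<?php
// Hypothetical helper: strip stop words from $text with one regular expression.
function remove_stop_words($text, array $stopwords)
{
    if (count($stopwords) == 0) {
        return trim($text);   // nothing to remove
    }
    // Escape each word so it is safe inside the pattern
    $escaped = array();
    foreach ($stopwords as $word) {
        $escaped[] = preg_quote($word, "/");
    }
    // \b anchors keep "the" from matching inside "theme"
    $pattern = "/\\b(?:" . implode("|", $escaped) . ")\\b/i";
    $text = preg_replace($pattern, " ", $text);
    // Collapse the gaps left behind by the removed words
    return trim(preg_replace("/\\s+/", " ", $text));
}

// Illustrative list only: a few general stop words plus the domain word "html"
$stopwords = array("the", "a", "and", "or", "of", "is", "in", "html");

echo remove_stop_words("The theme of the HTML page is keywords in PHP", $stopwords);
```

Matching on word boundaries matters here: a plain str_replace() of "the" would also damage words such as "theme".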

--
http://www.dbForumz.com/ This article was posted by author's request
Articles individually checked for conformance to usenet standards
Topic URL: http://www.dbForumz.com/PHP-KEYWORDS...ict132415.html
Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=442256
Jul 17 '05 #3
Nel
Here is the final script I put together thanks to your help and suggestions.
It will automatically work through a string and remove duplicates, new lines
and punctuation before listing the keywords within a meta tag.

If anyone can offer any improvements I am open to suggestions.

Nel.
_____________________________________________

<?php // metatags.inc.php
// Create keyword META tags from dynamic page content

// test string from BBC News
echo metatags("Tony Blair has nominated long-time ally Peter Mandelson as
Britain's next European commissioner.
The announcement was made after Mr Blair spoke to new European Commission
President Jose Manuel Durao Barroso on Friday morning.

The appointment represents a remarkable comeback for Mr Mandelson, who has
twice resigned from the Cabinet in controversial

circumstances.

It will also trigger a Westminster by-election in his Hartlepool seat.

'Positive response'

In a statement, Mr Mandelson said he was \"delighted\" to have been
nominated for the post by the prime minister, but confirmed that

he had \"agonised\" over whether the job was right for him.");

function metatags($pagetext)
{
    // Define variables for this web site
    $websitename = "Example's Web Site";
    $metadescription = "This web site's description";
    $metakeywords = cleankeywords($pagetext);

    // Build up META TAGS
    $metatags = " <meta name=\"Name\" content=\"$websitename\">\n";
    $metatags .= " <meta name=\"Rating\" content=\"General\">\n";
    $metatags .= " <meta name=\"Robots\" content=\"Index\">\n";
    $metatags .= " <meta name=\"Revisit-After\" content=\"14 days\">\n";
    $metatags .= " <meta name=\"DESCRIPTION\" content=\"$metadescription\">\n";
    $metatags .= " <meta name=\"KEYWORDS\" content=\"$websitename,$metakeywords\">\n";

    return $metatags;
}
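function cleankeywords($term)
{
    // Text file containing stop words (one on each line)
    $stopwords_file = "stopwords.txt";

    // Remove punctuation; turn \n and \r into spaces so that words on
    // adjacent lines are not glued together
    $term = preg_replace('/[.,"\']/', "", $term);
    $term = preg_replace('/[\r\n]+/', " ", $term);

    // Load list of common words, trimmed and lower-cased
    $common = file($stopwords_file);
    $total = count($common);
    for ($x = 0; $x < $total; $x++)   // "<" rather than "<=": do not read past the end
    {
        $common[$x] = trim(strtolower($common[$x]));
    }

    // Make array of search terms
    $_terms = explode(" ", $term);

    $cleanterm = array();   // initialise so implode() always gets an array
    foreach ($_terms as $line)
    {
        $line = trim($line);
        // Skip empty strings left by repeated spaces, and skip stop words
        if ($line != "" && !in_array(strtolower($line), $common))
        {
            $cleanterm[$line] = $line;   // keyed by word, so duplicates collapse
        }
    }
    $cleanwords = implode(", ", $cleanterm);
    return $cleanwords;
}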
?>
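
The script above lists each keyword once but does not rank by frequency or keep multi-word keywords, both of which the original question asked about. One possible extension, as a rough sketch: rank_keywords() and its parameters are hypothetical names, and treating any two adjacent non-stop words as a phrase is an assumption, not an established method.

```php
<?php
// Hypothetical helper: count single words and adjacent two-word phrases,
// then return the $limit most frequent entries, highest count first.
function rank_keywords($text, array $stopwords, $limit = 10)
{
    // Lower-case, then split on anything that is not a letter, digit or apostrophe
    $words = preg_split("/[^a-z0-9']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $counts = array();
    $previous = null;
    foreach ($words as $word) {
        if (in_array($word, $stopwords)) {
            $previous = null;   // a stop word breaks a phrase
            continue;
        }
        $counts[$word] = isset($counts[$word]) ? $counts[$word] + 1 : 1;
        if ($previous !== null) {
            $phrase = "$previous $word";
            $counts[$phrase] = isset($counts[$phrase]) ? $counts[$phrase] + 1 : 1;
        }
        $previous = $word;
    }

    arsort($counts);   // sort by count, descending, keeping the keys
    return array_slice($counts, 0, $limit, true);
}

$ranked = rank_keywords("Mr Mandelson said Mr Mandelson was delighted",
                        array("said", "was"), 3);
echo implode(", ", array_keys($ranked));
```

The keys of the returned array are the keywords and the values are their counts, so the keys can be passed to implode() for the KEYWORDS tag. Punctuation is discarded before phrases are built, so a phrase can span a sentence boundary unless a stop word intervenes.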
Jul 17 '05 #5
