Array problems 
July 17th, 2005, 12:32 AM
Hello,
How can I accomplish this?
I have this code:
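<?php
$url = "http://www.URL.com";
$content = file($url);
// Matches absolute URLs that end in a page-like extension.
// The hyphen is escaped so the class is not read as a range.
$pattern =
"/([\w]+:\/\/[\w\-?&;#~=\.\/\@]+[\w\/](\.(html|php|shtml|htm|xhtml|xml)))/i";
$count = 0; // initialise once, before the loop, so it is not reset per line
foreach($content as $line){
    if(preg_match_all($pattern,$line,$urls_back_array) ){
        foreach($urls_back_array[0] as $url_back){
            $count++;
            echo $url_back;
        }
    }
}
?>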
Now I want to make a loop: my script should count all links on all of my
*.html pages. But the script must not count a link twice! The script should
also count all links on the HTML pages correctly!
Example:
Home
|-Web
|-Forum
|--Site 1
|--Site 2
|--Site 3
|-Download
It should count 7 and list them all! =)
Greetings from Germany.
July 17th, 2005, 12:32 AM
Re: Array problems
"Sven Dzepina" <mail@styleswitch.de> wrote in message
news:3f8c1b25$0$11958$9b4e6d93@newsread4.arcor-online.net...
> [...]
I'm playing around here trying to do what you want to do... I'm not good
with regular expressions and the preg functions, so I am using a mixture of
implode and explode to get at the URL of each link (i.e. the "href=" part of
the <A HREF="..."> tag)... Once I have the address that the link points to,
I plan on using a mix of parse_url() and pathinfo() to identify HTML-type
files. To avoid duplicates, each address will be written into an array
which I will then pass through array_unique().
Do these ideas help any?
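A minimal sketch of that plan, under a few assumptions: the function name extract_page_links() and the example.com URLs are made up for illustration, and the href values are pulled out with preg_match_all() rather than the implode/explode mix described above. parse_url(), pathinfo() and array_unique() are used as described.

```php
<?php
// Hypothetical helper: return the unique links in $html that point at
// page-like files (the extension list mirrors the original pattern).
function extract_page_links($html)
{
    $extensions = array('html', 'htm', 'shtml', 'xhtml', 'php', 'xml');
    $links = array();

    // Grab the value of every quoted href attribute (case-insensitive).
    preg_match_all('/href\s*=\s*["\']([^"\']+)["\']/i', $html, $matches);

    foreach ($matches[1] as $href) {
        // Only the path component matters for the extension check.
        $path = parse_url($href, PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // malformed URL or no path component
        }
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (in_array($ext, $extensions, true)) {
            $links[] = $href;
        }
    }

    // array_unique() keeps the first occurrence of each value;
    // array_values() reindexes the result from 0.
    return array_values(array_unique($links));
}

// Made-up sample markup standing in for a fetched page.
$html = '<a href="http://www.example.com/forum/site1.html">1</a>'
      . '<a href="http://www.example.com/forum/site1.html">1 again</a>'
      . '<a href="http://www.example.com/logo.png">logo</a>'
      . '<A HREF="/download.php?id=3">dl</A>';

$links = extract_page_links($html);
echo count($links), "\n";
print_r($links);
```

Checking the extension on the parsed path rather than on the raw link means a query string such as ?id=3 does not hide the file type.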
July 17th, 2005, 12:32 AM
Re: Array problems
Sven Dzepina wrote:
[...]
> $count = 0;
> if(preg_match_all($pattern,$line,$urls_back_array) ){
> foreach($urls_back_array[0] as $url_back){
> $count++;
> echo $url_back;
[...]
I didn't check your regex.
I'd do it somewhat differently:
After preg_match_all(), use the URLs as the keys of an array
(set $large_url_array = array(); once, before the outer foreach):
### should this be __1__ ?
        foreach ($urls_back_array[0] as $url_back) {
            ## start each new URL at 0 to avoid an undefined-index notice
            if (!isset($large_url_array[$url_back])) {
                $large_url_array[$url_back] = 0;
            }
            $large_url_array[$url_back]++;
            ## no echo
        }
    } ## if
} ## foreach
## echo now!
$total_count = 0;
$unique_urls = 0;
foreach ($large_url_array as $url=>$count) {
    echo $url, ' : appears ', $count, ' times<br />';
    $total_count += $count;
    $unique_urls++;
}
echo '<br />Unique URLs: ', $unique_urls, '<br />';
echo '<br />Total links: ', $total_count, '<br />';
NOTE: This was typed directly in the editor and not tested.
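For reference, here is a self-contained sketch of the same idea that can be run as it stands. It is an illustration under assumptions, not a tested drop-in: the $content array and its example.com URLs are made up and stand in for the lines returned by file($url), and the pattern is the one from the original post with the hyphen escaped inside the character class.

```php
<?php
// Pattern from the original post, hyphen escaped inside the character class.
$pattern = '/([\w]+:\/\/[\w\-?&;#~=\.\/\@]+[\w\/](\.(html|php|shtml|htm|xhtml|xml)))/i';

// Stand-in for the lines returned by file($url); the URLs are made up.
$content = array(
    '<a href="http://www.example.com/web.html">Web</a>',
    '<a href="http://www.example.com/forum.php">Forum</a> <a href="http://www.example.com/web.html">Web</a>',
    '<a href="http://www.example.com/download.html">Download</a>',
);

$large_url_array = array(); // URL => number of times it was seen

foreach ($content as $line) {
    if (preg_match_all($pattern, $line, $urls_back_array)) {
        foreach ($urls_back_array[0] as $url_back) {
            // Array keys are unique, so a repeated URL only bumps its counter.
            if (!isset($large_url_array[$url_back])) {
                $large_url_array[$url_back] = 0;
            }
            $large_url_array[$url_back]++;
        }
    }
}

$unique_urls = count($large_url_array);     // one key per distinct URL
$total_count = array_sum($large_url_array); // sum of all occurrences

foreach ($large_url_array as $url => $count) {
    echo $url, ' : appears ', $count, " times\n";
}
echo 'Unique URLs: ', $unique_urls, "\n";
echo 'Total links: ', $total_count, "\n";
```

Because the URL is the array key, deduplication and counting happen in the same step, and no separate array_unique() pass is needed.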
July 17th, 2005, 12:33 AM
Re: Array problems
Hello Randell,
Perhaps I explained my goal imprecisely.
I want to count all pages that are linked from a homepage and list them.
My earlier solution was to scan all links and store them in a database.
But this ran in a loop, so I also fetched all links back from the database
in order to scan them, too!
I didn't think of this problem:
If I scan all links, insert them into a database, and fetch them again in
the same loop, then I always get the same links.
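One common way around this is to remember which URLs have already been queued and to scan each one only once. The sketch below is only an illustration under assumptions: the $site array is a made-up stand-in for the real pages (modelled on the Home/Web/Forum/Download example), and get_links() and crawl() are hypothetical helpers, where a real get_links() would download the page and extract its links.

```php
<?php
// Made-up site: page URL => links found on that page.
$site = array(
    '/home.html'     => array('/web.html', '/forum.html', '/download.html'),
    '/web.html'      => array('/home.html'),
    '/forum.html'    => array('/home.html', '/site1.html', '/site2.html', '/site3.html'),
    '/site1.html'    => array('/forum.html'),
    '/site2.html'    => array('/forum.html', '/site1.html'),
    '/site3.html'    => array('/home.html'),
    '/download.html' => array(),
);

// Hypothetical helper; a real version would fetch $url and parse its links.
function get_links($site, $url)
{
    return isset($site[$url]) ? $site[$url] : array();
}

// Hypothetical helper: visit every page reachable from $start exactly once.
function crawl($site, $start)
{
    $seen  = array($start => true); // every URL that was ever queued
    $queue = array($start);         // URLs still waiting to be scanned

    while (count($queue) > 0) {
        $url = array_shift($queue);
        foreach (get_links($site, $url) as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true; // mark first, so it is queued only once
                $queue[] = $link;
            }
        }
    }
    return array_keys($seen);
}

$pages = crawl($site, '/home.html');
echo count($pages), "\n";
echo implode("\n", $pages), "\n";
```

With a database, the $seen array would correspond to something like a "scanned" flag on each stored link, so that the loop only fetches rows that have not been scanned yet.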
Greetings.
"Randell D." <you.can.email.me.at.randelld@yahoo.com> wrote in message
news:NQ_ib.98843$6C4.43373@pd7tw1no...
> [...]