Bytes IT Community

Remove duplicate entries from TXT file?

All,
Is it possible to use PHP to open/read a TXT file (e.g. IP.TXT) that
contains IP addresses (one per line), remove any duplicates found, and
write the result back out to IP_NEW.TXT?

The reason is, I have IPs coming from several sources, e.g. URLSCAN logs,
MySQL DBs, etc. I can use DISTINCT on the individual queries, but I
then combine all the results into one file, which may contain duplicates.

Thanks.
Jul 17 '05 #1
3 Replies


"m|sf|t" <m|****@ampsycho.com> wrote in message
news:10*************@corp.supernews.com...
> Is it possible to use PHP to open/read a TXT file (e.g. IP.TXT) that
> contains IP addresses (one per line), remove any duplicates found, and
> write the result back out to IP_NEW.TXT?


Something like this...

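<?php

// file() takes a filename, not a handle from fopen(). The flags strip the
// trailing newline from each line and skip blank lines.
$file_data = file("ip.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$file_data = array_unique($file_data);

$new_handle = fopen("ip_new.txt", "w");
$new_file_data = implode("\n", $file_data) . "\n";
fwrite($new_handle, $new_file_data);
fclose($new_handle);

?>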

I'm pretty sure this works, but correct me if I'm wrong.
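
One thing to watch with this approach: array_unique() compares lines as exact strings, and file() keeps each line's trailing newline unless told otherwise. A last line without a final newline, mixed line endings, or stray spaces can therefore make two copies of the same address look different. Below is a minimal sketch of the normalising step. It works on a string so it can be tried without any files, and the function name unique_lines is made up for illustration.

```php
<?php

// Hypothetical helper: split text into lines, trim each one, drop blanks,
// and return the distinct lines in first-seen order.
function unique_lines($text)
{
    // \R matches \n, \r\n and \r, so Windows and Unix line endings both work.
    $lines = preg_split('/\R/', $text);
    $lines = array_map('trim', $lines);
    // Drop empty lines (for example the one after a trailing newline).
    $lines = array_filter($lines, 'strlen');
    // array_unique() keeps the first occurrence; array_values() reindexes from 0.
    return array_values(array_unique($lines));
}

// Mixed line endings, and a duplicate on a last line with no newline.
$sample = "134.58.127.1\r\n200.12.123.45\n134.58.127.1";
print_r(unique_lines($sample));
```

With the sample string, the duplicate on the last line is removed even though its line ending differs from the first occurrence.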

Chris Finke
Jul 17 '05 #2

> Something like this...


Many thanks - I will give yours a try. A helluva lot simpler than the
copy/pasting I did from the PHP.NET manuals. This is what I came up with
(don't laugh at how complex it is ;)

<?php

// Remove duplicates and reindex the result from 0.
function my_array_unique($somearray)
{
    $tmparr = array_unique($somearray);
    $newarr = array();
    $i = 0;
    foreach ($tmparr as $v) {
        $newarr[$i] = $v;
        $i++;
    }
    return $newarr;
}

// Despite the name, this only reads the file into an array of lines;
// the duplicates are removed afterwards by my_array_unique().
function remdupe($filename, $return_max_lines = 0, $callback_func = null,
                 $do_rtrim = true, $buffer_size = 1024)
{
    $open = fopen($filename, 'rb');
    $open_data = array();
    $line = 0;
    // Loop on the fgets() result instead of feof(), which can append a
    // bogus empty entry once the end of the file is reached.
    while (($data = fgets($open, $buffer_size)) !== false) {
        $open_data[$line] = $do_rtrim ? rtrim($data) : $data;

        if ($callback_func != null && function_exists($callback_func)) {
            // A variable function call does the same job as the eval().
            $callback_func($open_data[$line]);
        }
        $line++;

        if ($return_max_lines > 0 && $line >= $return_max_lines) {
            break;
        }
    }
    fclose($open);
    return $open_data;
}

$open = remdupe("ip_combined.txt");
$open = my_array_unique($open);

$fp = fopen("ip.txt", "w");
$numElements = count($open);

// Write every element in full; $open[$counter][0] would be only the
// first character of each address.
for ($counter = 0; $counter < $numElements; $counter++) {
    fwrite($fp, $open[$counter] . "\r\n");
}
fclose($fp);

?>
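
The same read, de-duplicate, write steps can also be done in a single pass, which avoids holding a second copy of the data when the combined file is large. This is only a sketch under a few assumptions: dedupe_file is a made-up name, blank lines are dropped, and lines are compared after trim().

```php
<?php

// Hypothetical helper: copy $in to $out, skipping lines already seen.
// Returns the number of distinct lines written, or false on failure.
function dedupe_file($in, $out)
{
    $src = fopen($in, 'rb');
    if ($src === false) {
        return false;
    }
    $dst = fopen($out, 'wb');
    if ($dst === false) {
        fclose($src);
        return false;
    }

    $seen = array();   // line => true, so isset() can test membership quickly
    $written = 0;
    while (($line = fgets($src)) !== false) {
        $line = trim($line);
        if ($line === '' || isset($seen[$line])) {
            continue;  // skip blank lines and repeats
        }
        $seen[$line] = true;
        fwrite($dst, $line . "\n");
        $written++;
    }

    fclose($src);
    fclose($dst);
    return $written;
}
```

For the files used above, the call would be dedupe_file("ip_combined.txt", "ip.txt"). The output keeps the order in which each address first appeared.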
Jul 17 '05 #3

On 2005-01-17, m|sf|t <m|****@ampsycho.com> wrote:
> All,
> Is it possible to use PHP to open/read a TXT file (e.g. IP.TXT) that
> contains IP addresses (one per line), remove any duplicates found, and
> write the result back out to IP_NEW.TXT?


On a Linux system:

timvw@madoka:~$ cat ip.txt
200.12.123.45
134.58.127.1
134.58.126.3
134.58.127.1
timvw@madoka:~$ sort ip.txt | uniq
134.58.126.3
134.58.127.1
200.12.123.45
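
Where no shell is available, roughly the same result can be had in PHP. Note that sort compares lines as text, so an address such as 9.0.0.1 would land after 10.0.0.1; the sketch below orders by numeric value via ip2long() instead. The name sorted_unique_ips is made up for illustration, the code assumes PHP 7 or later on a 64-bit build (where ip2long() is never negative), and lines that are not valid IPv4 addresses are dropped.

```php
<?php

// Hypothetical helper: distinct IPv4 addresses in ascending numeric order.
// Entries that ip2long() cannot parse are dropped.
function sorted_unique_ips(array $ips)
{
    $ips = array_unique(array_map('trim', $ips));
    $ips = array_filter($ips, function ($ip) {
        return ip2long($ip) !== false;
    });
    // Compare the integer forms of the addresses; usort() also reindexes from 0.
    usort($ips, function ($a, $b) {
        return ip2long($a) <=> ip2long($b);
    });
    return $ips;
}

// Same addresses as in the shell session above.
$ips = array('200.12.123.45', '134.58.127.1', '134.58.126.3', '134.58.127.1');
echo implode("\n", sorted_unique_ips($ips)), "\n";
```

For the four addresses in the session above this prints the same three lines as sort ip.txt | uniq.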

--
Met vriendelijke groeten,
Tim Van Wassenhove <http://www.timvw.info>
Jul 17 '05 #4
