Remove duplicate entries from a TXT file?

All,
Is it possible to use PHP to open/read a TXT file (e.g. IP.TXT) that
contains IP addresses (1 per line), remove any duplicates found, and
write the result back out to IP_NEW.TXT?

The reason is that I have IPs coming from several sources, e.g. URLSCAN logs,
MySQL databases, etc. I can use DISTINCT on the individual queries, but I
then combine all the results into one file, which may contain duplicates.

Thanks.
Jul 17 '05 #1
"m|sf|t" <m|****@ampsycho.com> wrote in message
news:10*************@corp.supernews.com...
> Is it possible to use PHP to open/read a TXT file (e.g. IP.TXT) that
> contains IP addresses (1 per line), remove any duplicates found, and
> write the result back out to IP_NEW.TXT?


Something like this...

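<?php

// file() takes a filename, not a handle, so no fopen()/fclose() is needed to read.
// The flags strip the trailing newline from each line and skip blank lines.
$file_data = file("ip.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$file_data = array_unique($file_data);

$new_handle = fopen("ip_new.txt", "w");
// One address per line, with a newline after the last one.
$new_file_data = implode("\n", $file_data) . "\n";
fwrite($new_handle, $new_file_data);
fclose($new_handle);

?>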

I'm pretty sure this works, but correct me if I'm wrong.

Chris Finke
Jul 17 '05 #2
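
For larger files, the same idea can be applied line by line instead of loading everything with file(). What follows is only a sketch and not part of the reply above: the function name dedupe_file is made up for illustration, and it assumes one address per line, treats surrounding whitespace as insignificant, and drops blank lines.

```php
<?php

// Hypothetical helper (not from the thread): copy $source to $target,
// keeping only the first occurrence of each line.
// Returns the number of lines written, or false if a file cannot be opened.
function dedupe_file($source, $target)
{
    $in = fopen($source, 'r');
    if ($in === false) {
        return false;
    }
    $out = fopen($target, 'w');
    if ($out === false) {
        fclose($in);
        return false;
    }

    $seen = array();   // lines already written, stored as array keys
    $written = 0;

    while (($line = fgets($in)) !== false) {
        $line = trim($line);   // strips "\n", "\r\n" and stray spaces
        if ($line === '' || isset($seen[$line])) {
            continue;          // blank line or duplicate
        }
        $seen[$line] = true;
        fwrite($out, $line . "\n");
        $written++;
    }

    fclose($in);
    fclose($out);
    return $written;
}
```

Called as dedupe_file("ip.txt", "ip_new.txt"), this keeps the addresses in their original order. Only the distinct lines are held in memory, not the whole file.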
> Something like this...


Many thanks - I will give yours a try. A helluva lot simpler than the
copy/pasting I did from the PHP.NET manuals. This is what I came up with
(don't laugh at how complex it is ;)

<?php

// Remove duplicates and reindex the result from 0.
function my_array_unique($somearray)
{
    return array_values(array_unique($somearray));
}

// Read a file into an array of lines (despite the name, no duplicates are removed here).
function remdupe($filename, $return_max_lines = 0, $callback_func = null,
                 $do_rtrim = true, $buffer_size = 1024)
{
    $open = fopen($filename, 'rb');
    $open_data = array();
    $line = 0;
    // Test the result of fgets() rather than feof(), so no empty entry is added at end of file.
    while (($text = fgets($open, $buffer_size)) !== false) {
        $open_data[$line] = $do_rtrim ? rtrim($text) : $text;

        if ($callback_func != null && function_exists($callback_func)) {
            // A variable function call replaces eval().
            $callback_func($open_data[$line]);
        }
        $line++;

        if ($return_max_lines > 0 && $line >= $return_max_lines) {
            break;
        }
    }
    fclose($open);
    return $open_data;
}

$open = remdupe("ip_combined.txt");
$open = my_array_unique($open);

$fp = fopen("ip.txt", "w");

// Write every entry in full; $open[$counter][0] would be only its first character.
foreach ($open as $ip) {
    fwrite($fp, $ip . "\r\n");
}
fclose($fp);

?>
Jul 17 '05 #3
On 2005-01-17, m|sf|t <m|****@ampsycho.com> wrote:
> All,
> Is it possible to use PHP to open/read a TXT file (e.g. IP.TXT) that
> contains IP addresses (1 per line), remove any duplicates found, and
> write the result back out to IP_NEW.TXT?


On a Linux system, sort and uniq do this from the shell:

timvw@madoka:~$ cat ip.txt
200.12.123.45
134.58.127.1
134.58.126.3
134.58.127.1
timvw@madoka:~$ sort ip.txt | uniq
134.58.126.3
134.58.127.1
200.12.123.45

--
Kind regards,
Tim Van Wassenhove <http://www.timvw.info>
Jul 17 '05 #4
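
When shelling out is not an option, roughly the same result can be had in PHP. This is a sketch, not something posted in the thread: the function name sorted_unique_lines is hypothetical, and it compares lines as plain strings, so the order is a byte-wise string order and not a numeric ordering of the addresses.

```php
<?php

// Hypothetical helper (not from the thread): return the distinct entries
// of $lines in plain string order, similar to `sort | uniq`.
function sorted_unique_lines(array $lines)
{
    $lines = array_unique($lines);
    sort($lines, SORT_STRING);   // sorts in place and reindexes from 0
    return $lines;
}

// Same sample data as the shell session above.
$ips = array('200.12.123.45', '134.58.127.1', '134.58.126.3', '134.58.127.1');
print_r(sorted_unique_lines($ips));
```

For a real file, the input array could come from file("ip.txt", FILE_IGNORE_NEW_LINES), and the result could be written back with implode("\n", ...) and fwrite().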
