remove duplicates from file

Hello all,

Excuse my English. I'm a beginner in Java, and I want to remove duplicate cities and rearrange the elements of the lines in a 150 MB file. My file contains this:

"68586240","68586367","IRVING"
"68586368","68586431","DENTON"
"68586432","68586495","DENTON"

and I want it to become

"68586240","68586367","IRVING"
"68586368","68586495","DENTON"

I mean

1 2 x
3 4 x
5 6 y

becomes

1 4 x
5 6 y

I have already written the program, but I'm stuck. Can anyone help me?

import java.util.regex.*;
import java.io.*;

public class test {

    private static final String REGEX = ",";
    private static Pattern pattern;

    public static void main(String[] argv) {

        pattern = Pattern.compile(REGEX);

        try {
            FileInputStream fstream = new FileInputStream("test.txt");
            DataInputStream in = new DataInputStream(fstream);
            BufferedReader br = new BufferedReader(new InputStreamReader(in));
            String strLine;

            int i = 1;

            // seed the temp file with a dummy "previous line"
            BufferedWriter make_file = new BufferedWriter(new FileWriter("file_temp.txt", true));
            String make_test = "\"TEST\",\"TEST\",\"TEST\"";
            make_file.write(make_test);
            make_file.close();

            BufferedWriter out = new BufferedWriter(new FileWriter("good.txt", true));

            while ((strLine = br.readLine()) != null) {

                // split the current line on commas
                String[] items = pattern.split(strLine);

                // read the previous line back from the temp file
                FileReader input = new FileReader("file_temp.txt");
                BufferedReader bufRead = new BufferedReader(input);
                String line = "";
                String temp = "";
                line = bufRead.readLine();
                while (line != null) {
                    temp = temp + line;
                    line = bufRead.readLine();
                }
                bufRead.close();

                // split the previous line on commas
                String[] items_temp = pattern.split(temp);

                // delete the temp file
                new File("file_temp.txt").delete();

                // store the current line as the new "previous line"
                BufferedWriter out_temp = new BufferedWriter(new FileWriter("file_temp.txt", true));
                out_temp.write(strLine);
                out_temp.close();

                String x1 = items_temp[2];
                String x2 = items[2];

                if (x1.equalsIgnoreCase(x2)) {
                    // ??????????
                }
                else {
                    out.write(items[0] + "," + items_temp[1] + "," + items[2] + "\n");
                    /*
                     input   output  correct
                     1 2 x   1 2 x   1 4 x
                     3 4 x   5 6 y   5 6 y
                     5 6 y
                    */
                }

                System.out.println(i + "-" + strLine);
                i++;
            }
            out.close();
            in.close();
        }
        catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }
    }
}

Thanks.
Oct 4 '07 #1
dmjpro
Sorry, I don't have much time to look through your code thoroughly :-)
Have a look at my code:

// needs: import java.io.*; import java.util.*;
BufferedReader in = new BufferedReader(new FileReader("src_file"));
FileWriter out = new FileWriter("tgt_file");
String line;
HashSet<String> hs = new HashSet<String>();
String op = "";
while ((line = in.readLine()) != null) {
    StringTokenizer s = new StringTokenizer(line, ",");
    String city = "";
    // walk to the last token, which is the city
    while (s.hasMoreTokens()) city = s.nextToken();
    // add() returns false for a city already seen, so
    // here it ignores duplicate items
    op += hs.add(city) ? (line + System.getProperty("line.separator")) : "";
}
out.write(op);
in.close();
out.close();

Enjoy this code.
Good Luck :-)

Kind regards,
Dmjpro.
Oct 4 '07 #2
r035198x
@OP: Do not use InputStreams to read text files. Use FileReader or Scanner.
@dmjpro: Do not use StringTokenizer.
Oct 4 '07 #3
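
As a rough illustration of the advice in post #3, here is a minimal sketch that reads text through a Reader and splits each line with String.split instead of StringTokenizer. The names LineSplitDemo, splitFields and readRecords are made up for this example, and it assumes that no field contains a comma inside its quotes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class LineSplitDemo {

    // Splits one line on commas; the -1 limit keeps trailing empty fields.
    // Assumption: fields never contain a comma inside their quotes.
    static String[] splitFields(String line) {
        return line.split(",", -1);
    }

    // Reads all non-empty lines from a character stream.
    // For a real file, pass new FileReader("test.txt") as the source.
    static List<String[]> readRecords(Reader source) throws IOException {
        List<String[]> records = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (!line.isEmpty()) {
                    records.add(splitFields(line));
                }
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample taken from the first post, so the demo needs no file.
        String sample = "\"68586240\",\"68586367\",\"IRVING\"\n"
                      + "\"68586368\",\"68586431\",\"DENTON\"\n";
        for (String[] fields : readRecords(new StringReader(sample))) {
            System.out.println(fields[0] + " .. " + fields[1] + " -> " + fields[2]);
        }
    }
}
```

The list is only there to keep the demo short. For a 150 MB file it is probably better to handle each line inside the read loop than to collect everything in memory first.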
JosAH
@OP: what should be done with the following file contents:

1 2 x
3 4 x
5 6 x

or with this?

1 2 x
3 4 y
5 6 x

I think the problem description should be less ambiguous and more complete
to start with before we can start thinking of a solution.

kind regards,

Jos
Oct 4 '07 #4
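
To make the ambiguity raised in post #4 concrete, here is a small sketch of two possible readings of the request. Neither is confirmed as the intended one at this point in the thread. The class name MergeReadings and both method names are invented for illustration, and the input uses the space-separated shorthand from the posts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeReadings {

    // Reading A: merge only neighbouring lines that share the same label.
    static List<String> mergeAdjacent(List<String> lines) {
        List<String> out = new ArrayList<>();
        String[] cur = null; // {first, second, label} of the run being built
        for (String line : lines) {
            String[] f = line.split(" ");
            if (cur != null && cur[2].equals(f[2])) {
                cur[1] = f[1]; // extend the current run
            } else {
                if (cur != null) {
                    out.add(String.join(" ", cur));
                }
                cur = f;
            }
        }
        if (cur != null) {
            out.add(String.join(" ", cur));
        }
        return out;
    }

    // Reading B: merge every line with the same label, wherever it appears.
    static List<String> mergeAnywhere(List<String> lines) {
        Map<String, String[]> byLabel = new LinkedHashMap<>(); // keeps first-seen order
        for (String line : lines) {
            String[] f = line.split(" ");
            String[] seen = byLabel.get(f[2]);
            if (seen == null) {
                byLabel.put(f[2], f);
            } else {
                seen[1] = f[1]; // keep the first start, take the latest end
            }
        }
        List<String> out = new ArrayList<>();
        for (String[] f : byLabel.values()) {
            out.add(String.join(" ", f));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> second = Arrays.asList("1 2 x", "3 4 y", "5 6 x");
        System.out.println(mergeAdjacent(second));
        System.out.println(mergeAnywhere(second));
    }
}
```

For the first sample (all three lines labelled x) both readings give the single line 1 6 x. For the second sample, reading A leaves the three lines unchanged, while reading B gives 1 6 x and 3 4 y, so the two only differ when equal labels are not next to each other.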
More info:

input:
1 2 x
3 4 x
5 6 y

output (what my program writes):
1 2 x
5 6 y

correct (what I want):
1 4 x
5 6 y

Compare the third element of line 1 with the third element of line 2.
If they are equal, join the first element of line 1 with the second element of line 2 and the common third element, and write that to the new file.
If they are not equal, write line 2 to the new file.

Compare the third element of line 2 with the third element of line 3.
If they are equal, join the first element of line 2 with the second element of line 3 and the common third element, and write that to the new file.
If they are not equal, write line 3 to the new file.
...

Compare the third element of line n-1 with the third element of line n.
If they are equal, join the first element of line n-1 with the second element of line n and the common third element, and write that to the new file.
If they are not equal, write line n to the new file.

Thanks, all.
Oct 4 '07 #5
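
The rule in post #5 compares each line only with its neighbour, so one way to sketch it is a single pass that remembers the run currently being built and needs no temp file. This is an illustration under assumptions, not the solution the thread settled on: the class name RangeMerger and the method mergeRuns are made up, fields are assumed to contain no embedded commas, a run of more than two equal lines is assumed to collapse into one line, and the file mode assumes UTF-8 input.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.stream.Stream;

class RangeMerger {

    // Collapses each run of consecutive lines with the same third field into one
    // line: first field of the run's first line, second field of its last line.
    static void mergeRuns(Iterator<String> lines, Consumer<String> sink) {
        String start = null, end = null, city = null;
        while (lines.hasNext()) {
            String[] f = lines.next().split(",");
            if (f.length < 3) {
                continue; // skip blank or malformed lines
            }
            if (city != null && city.equalsIgnoreCase(f[2])) {
                end = f[1]; // same city as the previous line: extend the run
            } else {
                if (city != null) {
                    sink.accept(start + "," + end + "," + city); // run finished
                }
                start = f[0];
                end = f[1];
                city = f[2];
            }
        }
        if (city != null) {
            sink.accept(start + "," + end + "," + city); // flush the last run
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // File mode: java RangeMerger test.txt good.txt
            try (Stream<String> in = Files.lines(Paths.get(args[0]));
                 PrintWriter out = new PrintWriter(Files.newBufferedWriter(Paths.get(args[1])))) {
                mergeRuns(in.iterator(), out::println);
                if (out.checkError()) {
                    throw new IOException("could not write " + args[1]);
                }
            }
        } else {
            // Demo mode: the three sample lines from the first post.
            mergeRuns(Arrays.asList(
                    "\"68586240\",\"68586367\",\"IRVING\"",
                    "\"68586368\",\"68586431\",\"DENTON\"",
                    "\"68586432\",\"68586495\",\"DENTON\"").iterator(),
                    System.out::println);
        }
    }
}
```

Only three strings are held between lines, so memory use should stay roughly constant however large the input is. The comparison uses equalsIgnoreCase because the original program does; the spelling written out is the one from the first line of each run.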
dmjpro

Did you try my code? :-)

Debasis Jana
Oct 5 '07 #6
JosAH
I noticed that you're spoonfeeding quite a bit of (incorrect) code lately; please don't do that.

Jos
Oct 5 '07 #7
