how to read large CSV file

jx2
228 100+
Hi everyone,
I'm trying to read data from a CSV file. The file is about 1 MB (or bigger).
I figured out how to read it quickly from disk, but it takes 3000 milliseconds to convert it into arrays. So my question is:
is there a faster way of doing it?
Here is my class:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class ReadWrite {

    private static String filename = "myFile.csv";
    private static File file;
    private static Scanner in;
    private static String str = "";
    private int[] time = new int[100000];
    private float[] value = new float[100000];
    private String[] date = new String[100000];
    private String[] update = new String[100000];

    public static int count = 0;
    public boolean hasNext = true;

    static private String str2 = "";

    /** Creates a new instance of ReadWrite */
    public ReadWrite() throws FileNotFoundException, IOException {
        file = new File(filename);
        fileReader = new FileReader(file);

        readFile();

        // Scan the in-memory copy of the file, splitting on tabs and newlines
        in = new Scanner(str);
        in.useDelimiter("(\t|\n)");
    }

    // This is where the problems are: this method is too slow,
    // slower than PHP :/
    // (in PHP I would use explode("\n", str) and explode("\t", str))
    public void read() {
        while (in.hasNextInt()) {
            time[count] = in.nextInt();
            value[count] = in.nextFloat();
            date[count] = in.next();
            update[count] = in.next();
            count++;
        }
    }
    // ------------------------ end of problems ------------------------

    public String toString(int i) {
        return time[i] + " " + value[i] + " " + date[i];
    }

    FileReader fileReader;

    void readFile() throws IOException {
        char[] c = new char[4096];
        while (hasNext) {
            int n = fileReader.read(c);
            if (n > -1) {
                // Append only the n chars actually read, not the whole buffer
                str += String.valueOf(c, 0, n);
            } else {
                hasNext = false;
            }
        }
        fileReader.close();
    }
}
Feb 3 '08 #1
BigDaddyLH
1,216 Expert 1GB
I don't know about the speed of code, but first copying the entire file into a string is a whacky thing to do. Why not just read from the file?
    in = new Scanner(new File(filename));
Also, your use of static fields needs to be rethought, but on another day.

Finally, why are you loading the entire file into memory at once? Can you avoid this?
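
To illustrate the suggestion above of reading straight from the file instead of copying it into one big string, here is a minimal sketch. It assumes every row has four tab-separated fields (time, value, date, update), as the arrays in the original class suggest. The names StreamingTsvReader, Row, and readRows are made up for this example and are not from the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for this example only.
class StreamingTsvReader {

    /** One parsed row; the fields mirror the four arrays in the original post. */
    static final class Row {
        final int time;
        final float value;
        final String date;
        final String update;

        Row(int time, float value, String date, String update) {
            this.time = time;
            this.value = value;
            this.date = date;
            this.update = update;
        }
    }

    /** Parses rows one line at a time, so the file is never held as a single String. */
    static List<Row> readRows(BufferedReader reader) throws IOException {
        List<Row> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // Assumption: exactly four tab-separated fields per row.
            // The -1 limit keeps a trailing empty field instead of dropping it.
            String[] f = line.split("\t", -1);
            rows.add(new Row(Integer.parseInt(f[0].trim()),
                             Float.parseFloat(f[1].trim()),
                             f[2],
                             f[3]));
        }
        return rows;
    }

    /** Convenience overload that opens the file and closes it when done. */
    static List<Row> readRows(Path path) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            return readRows(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample so the sketch runs without a file on disk.
        String sample = "1\t2.5\t2008-02-03\tA\n2\t3.75\t2008-02-04\tB\n";
        List<Row> rows = readRows(new BufferedReader(new StringReader(sample)));
        System.out.println(rows.size() + " rows, first value = " + rows.get(0).value);
    }
}
```

A list is used here instead of fixed 100000-element arrays so the row count does not have to be known up front. Whether this is actually faster than Scanner on a given file would need to be measured.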
Feb 4 '08 #2
jx2
228 100+
Actually, I used Scanner, but it was the slowest option.
Perhaps I'm missing something important:
    str = in.nextLine();
Is there a better (I mean faster) way to read the whole file in?
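
One possible answer to the question above, sketched under assumptions: read the whole file in a single call and then split it the way PHP's explode would. It assumes Java 7 or later (for java.nio.file.Files), UTF-8 text, and "\n" line endings as in the original delimiter. The class and method names (WholeFileSplit, explodeRows, readAll) are invented for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named for this example only.
class WholeFileSplit {

    /** Splits text into rows on '\n', then each row into fields on '\t'. */
    static String[][] explodeRows(String text) {
        // A trailing newline does not produce an extra row, because
        // split() with the default limit drops trailing empty strings.
        String[] lines = text.split("\n");
        String[][] rows = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            // The -1 limit keeps empty trailing fields within a row.
            rows[i] = lines[i].split("\t", -1);
        }
        return rows;
    }

    /** Reads the entire file in one call, then splits it in memory. */
    static String[][] readAll(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return explodeRows(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file so the sketch runs on its own.
        Path tmp = Files.createTempFile("sample", ".csv");
        try {
            Files.write(tmp, "1\t2.5\t2008-02-03\tA\n2\t3.75\t2008-02-04\tB\n"
                    .getBytes(StandardCharsets.UTF_8));
            String[][] rows = readAll(tmp);
            System.out.println(rows.length + " rows, " + rows[0].length + " fields each");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The fields come back as strings, so Integer.parseInt and Float.parseFloat would still be needed to fill the time and value arrays. For a file of about 1 MB, holding everything in memory like this is usually not a concern, but it will not scale to files much larger than available heap.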
Feb 5 '08 #3
BigDaddyLH
1,216 Expert 1GB
There are libraries for working with CSV files. Google for them.
Feb 5 '08 #4
Rather than using str += String.valueOf(c), try using a StringBuffer and its append method. Each += copies the whole string built so far, so the cost grows with every chunk.

Something like:
    sb.append(String.valueOf(c))
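
A fuller sketch of that idea, with two changes that are my own and not from the post: it uses StringBuilder (the unsynchronized counterpart of StringBuffer) and appends only the characters that read() actually returned, so a partly filled buffer does not add stray characters. The class name BufferedAppend and the method readAll are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class, named for this example only.
class BufferedAppend {

    /** Reads everything from the reader using one reusable 4096-char buffer. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        // read() returns the number of chars placed in buf, or -1 at end of input.
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n); // append only the n chars just read
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample; a FileReader could be passed in the same way.
        String text = readAll(new StringReader("1\t2.5\n2\t3.75\n"));
        System.out.println("read " + text.length() + " chars");
    }
}
```

Passing new FileReader(file) to readAll would replace the readFile() loop from the first post. The caller is responsible for closing the reader.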
Feb 20 '08 #5
