Bytes | Software Development & Data Engineering Community

how to read large CSV file

228 New Member
Hi everyone,
I'm trying to read data from a CSV file. The file is about 1 MB (or bigger).
I figured out how to read it quickly from disk, but it takes 3000 milliseconds to convert it into arrays. So my question is:
is there a faster way of doing it?
Here's my class:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class ReadWrite {

    private static String filename = "myFile.csv";
    private static File file;
    private static Scanner in;
    private static String str = "";
    private int[] time = new int[100000];
    private float[] value = new float[100000];
    private String[] date = new String[100000];
    private String[] update = new String[100000];

    public static int count = 0;
    public boolean hasNext = true;

    static private String str2 = "";

    /** Creates a new instance of ReadWrite */
    public ReadWrite() throws FileNotFoundException, IOException {
        file = new File(filename);
        fileReader = new FileReader(file);

        readFile();

        // Tokenize the in-memory copy of the file on tabs and newlines.
        in = new Scanner(str);
        in.useDelimiter("(\t|\n)");
    }

    // This is where the problems are: this method is too slow,
    // slower than PHP :/
    // (in PHP I would use explode("\n", str) and explode("\t", str))
    public void read() {
        while (in.hasNextInt()) {
            time[count] = in.nextInt();
            value[count] = in.nextFloat();
            date[count] = in.next();
            update[count] = in.next();
            count++;
        }
    }
    // ------------------------ end of problems ------------------------

    public String toString(int i) {
        return time[i] + " " + value[i] + " " + date[i];
    }

    FileReader fileReader;

    // Reads the whole file into str, 4096 characters at a time.
    void readFile() throws IOException {
        while (hasNext) {
            char[] c = new char[4096];
            int n = fileReader.read(c);
            if (n > -1) {
                // Append only the n characters actually read, not the whole buffer.
                str += String.valueOf(c, 0, n);
            } else {
                hasNext = false;
            }
        }
    }
}
Feb 3 '08 #1
BigDaddyLH
1,216 Recognized Expert Top Contributor
I don't know about the speed of code, but first copying the entire file into a string is a whacky thing to do. Why not just read from the file?
    in = new Scanner(new File(filename));
Also, your use of static fields needs to be rethought, but on another day.

Finally, why are you loading the entire file into memory at once? Can you avoid this?
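One way to avoid holding the whole file in a single string is to process it one line at a time. The following is only a sketch, not the poster's code: the class name `StreamingCsvSketch`, the `Row` holder, and the assumption of exactly four tab-separated columns (int, float, two strings, as in the arrays of the original class) are all illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class StreamingCsvSketch {

    /** One parsed row; field names mirror the arrays in the original post (hypothetical holder). */
    static final class Row {
        final int time;
        final float value;
        final String date;
        final String update;

        Row(int time, float value, String date, String update) {
            this.time = time;
            this.value = value;
            this.date = date;
            this.update = update;
        }
    }

    /** Reads tab-separated rows line by line, so only one line is buffered at a time. */
    static List<Row> readRows(Reader source) throws IOException {
        List<Row> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] f = line.split("\t", -1);
                if (f.length < 4) {
                    continue; // assumption: rows with fewer than four fields are skipped
                }
                rows.add(new Row(Integer.parseInt(f[0].trim()),
                                 Float.parseFloat(f[1].trim()),
                                 f[2], f[3]));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // For a real file, pass new java.io.FileReader("myFile.csv") instead.
        List<Row> rows = readRows(new StringReader(
                "1\t2.5\t2008-02-03\tyes\n2\t3.0\t2008-02-04\tno\n"));
        System.out.println(rows.size() + " rows read");
    }
}
```

A growable list also removes the fixed 100000-element arrays, though the parsed rows still end up in memory; if they are only needed once, each row can be handled inside the loop instead of being stored.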
Feb 4 '08 #2
jx2
228 New Member
Actually, I used Scanner, but it was the slowest option.
Perhaps I'm missing something important:
    str = in.nextLine();
Is there a better (I mean faster) way to read the whole file in?
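For a file of around 1 MB, reading everything in one call and then splitting, much like PHP's explode, is one option. This is a hedged sketch rather than a benchmarked answer: it assumes Java 7 or later (for `java.nio.file.Files`), UTF-8 content, and a file that fits comfortably in memory; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WholeFileSketch {

    /** Reads the whole file in one call; assumes UTF-8 and a file that fits in memory. */
    static String readAll(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    /** Splits text into lines, then each line into tab-separated fields. */
    static String[][] explode(String text) {
        String[] lines = text.split("\r?\n"); // trailing empty lines are dropped by split
        String[][] table = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            table[i] = lines[i].split("\t", -1); // -1 keeps empty trailing fields
        }
        return table;
    }

    public static void main(String[] args) throws IOException {
        // The file name defaults to the one used in the original class.
        Path path = Paths.get(args.length > 0 ? args[0] : "myFile.csv");
        String[][] table = explode(readAll(path));
        System.out.println(table.length + " lines");
    }
}
```

The numeric columns can then be converted with `Integer.parseInt` and `Float.parseFloat` on the individual fields.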
Feb 5 '08 #3
BigDaddyLH
1,216 Recognized Expert Top Contributor
There are libraries for working with CSV files. Google for them.
Feb 5 '08 #4
mattwarren
1 New Member
Rather than using str += String.valueOf(c), try using a StringBuffer and its append method.

Something like:
sb.append(String.valueOf(c))
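The suggestion above could be sketched as follows. Note the swap: this uses `StringBuilder`, the unsynchronized sibling of `StringBuffer` (available since Java 5), and it appends only the characters actually read instead of the whole buffer. The class name `AppendSketch` and the method `readAll` are illustrative, not part of the original post.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class AppendSketch {

    /** Drains a Reader into a String, appending each chunk to one StringBuilder. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096]; // allocated once and reused for every chunk
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n); // append only the n characters actually read
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // For a real file, pass new java.io.FileReader("myFile.csv") instead.
        String text = readAll(new StringReader("1\t2.5\t2008-02-03\tyes\n"));
        System.out.println(text.length() + " characters read");
    }
}
```

Repeated `str +=` copies the entire accumulated string on every chunk, so the cost grows roughly quadratically with file size; appending to a builder avoids that copying.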
Feb 20 '08 #5
