HELP!! FAST way to read Parts of Big Files

Hi,

I have files I need to read, which contain records with a variable length.
What I need to do is copy a part of such a file to a new file, based on a
begin and end record.

I used this code:
Dim intMyFile As Integer = FreeFile()
FileOpen(intMyFile, MakePathFile(strDirS, strFileS), OpenMode.Input, _
         OpenAccess.Read, OpenShare.Shared, -1)
Do While Not EOF(intMyFile)
    strLine = LineInput(intMyFile)
    If (intX >= intStartRec) And (intX <= intEndRec) Then
        strNew = strNew & strLine & vbCrLf
    End If
    intX = intX + 1
Loop

It worked fine until I ran into some really big files. I have some files of
10 MB, containing 75,000 records... After 20 minutes my application still
hasn't finished reading the part I need.
I tried this:
Dim fsFile As New FileStream(MakePathFile(strDirS, strFileS), FileMode.Open, _
                             FileAccess.Read)
Dim brFile As New StreamReader(fsFile, _
                               System.Text.Encoding.GetEncoding(1252), False, fsFile.Length - 1)
intX = 0
Do While intX <= intEndRec
    strLine = brFile.ReadLine
    If (intX >= intStartRec) And (intX <= intEndRec) Then
        strNew = strNew & strLine & vbCrLf
    End If
    intX = intX + 1
Loop

But it's as slow as the other one.

Only one thing was really quick (only 10 seconds):
Dim fsFile As New FileStream(MakePathFile(strDirS, strFileS), _
                             FileMode.Open, FileAccess.Read)
Dim brFile As New StreamReader(fsFile, _
                               System.Text.Encoding.GetEncoding(1252), False, fsFile.Length - 1)
Dim strChar((intEndRec * 128) - 1) As Char
For intX = 0 To strChar.Length - 1
    strChar(intX) = " "
Next
brFile.ReadBlock(strChar, intStartRec * 128, (intEndRec - intStartRec) * 128)

But here I had several stupid problems for which I didn't really find a
solution:
- first of all: I'm having real trouble converting the Char array to a
String (I tried filling the Char array with spaces and then trimming it, but
what about spaces at the end of my file?)
- ReadBlock works with character positions, not with lines. Is there a way
to convert a line position to a character position, or to do a ReadBlock
with lines, or something like that?
Any help would really be appreciated!! I'm really stuck with this problem,
and it's kind of urgent!! Any help regarding ReadBlock or the StreamReader
stuff, or on another (FAST!) way to do this, would really be appreciated!!

Thanks a lot in advance!!

Pieter


Nov 20 '05 #1
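
On the two ReadBlock questions in the post above, a minimal sketch rather than code from the thread. `ReadBlockSketch` and `ReadChars` are made-up names for illustration. Two things it relies on: the second argument of `ReadBlock` is the offset into the Char array (not a position in the file; reading always starts at the reader's current position), and the return value is the number of characters actually read, so the array never needs to be padded with spaces and trimmed. With variable-length records there is no direct way to turn a line number into a character position without scanning the lines before it.

```vb
Imports System.IO

Module ReadBlockSketch
    ' Hypothetical helper: reads up to intCount characters from the
    ' reader's current position and returns them as a String.
    Function ReadChars(ByVal reader As TextReader, ByVal intCount As Integer) As String
        Dim buffer(intCount - 1) As Char
        ' The second argument is the offset into buffer, not into the file.
        Dim intRead As Integer = reader.ReadBlock(buffer, 0, intCount)
        ' Build the String from only the characters that were really read,
        ' so trailing spaces in the file are kept and nothing extra is added.
        Return New String(buffer, 0, intRead)
    End Function
End Module
```

For example, `ReadChars(New StringReader("abcdef"), 4)` returns "abcd". A StreamReader can be passed in the same way, since it is also a TextReader.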
DraguVaso <pi**********@hotmail.com> wrote:
> I have files I need to read, which contain records with a variable length.
> What I need to do is copy a part of such a file to a new file, based on a
> begin and end record.
>
> I used this code:
> Dim intMyFile As Integer = FreeFile()
> FileOpen(intMyFile, MakePathFile(strDirS, strFileS), OpenMode.Input, _
>          OpenAccess.Read, OpenShare.Shared, -1)
> Do While Not EOF(intMyFile)
>     strLine = LineInput(intMyFile)
>     If (intX >= intStartRec) And (intX <= intEndRec) Then
>         strNew = strNew & strLine & vbCrLf
>     End If
>     intX = intX + 1
> Loop
>
> It worked fine until I ran into some really big files. I have some files of
> 10 MB, containing 75,000 records... After 20 minutes my application still
> hasn't finished reading the part I need.

The problem is your string concatenation. Either create a StreamWriter
to start with, and just copy the relevant lines straight to disk, or
use a StringBuilder to build up the string in memory.

Just reading through the file should be very fast indeed.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 20 '05 #2
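
A minimal sketch of the two options from the reply above, not code that was posted in the thread. Strings are immutable, so every `strNew = strNew & strLine & vbCrLf` copies everything collected so far, and the total work grows roughly with the square of the number of lines kept. The names `RecordCopy`, `CopyRecords` and `ReadRecords` are made up for illustration, records are assumed to be 0-based and inclusive as in the original loop, and `Using` and `IsNot` assume VB 2005 or later.

```vb
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module RecordCopy
    ' Option 1: copy lines intStartRec..intEndRec straight to a new file.
    ' Nothing is accumulated in memory, so the cost per line stays constant.
    Sub CopyRecords(ByVal strSource As String, ByVal strDest As String, _
                    ByVal intStartRec As Integer, ByVal intEndRec As Integer)
        ' Code page 1252 as in the original post. On .NET Core / .NET 5+ the
        ' System.Text.Encoding.CodePages provider must be registered first.
        Dim enc As Encoding = Encoding.GetEncoding(1252)
        Using reader As New StreamReader(strSource, enc)
            Using writer As New StreamWriter(strDest, False, enc)
                writer.NewLine = vbCrLf
                Dim intX As Integer = 0
                Dim strLine As String = reader.ReadLine()
                ' Stop at end of file or as soon as the end record is passed.
                Do While strLine IsNot Nothing AndAlso intX <= intEndRec
                    If intX >= intStartRec Then
                        writer.WriteLine(strLine)
                    End If
                    intX += 1
                    strLine = reader.ReadLine()
                Loop
            End Using
        End Using
    End Sub

    ' Option 2: build the result in memory with a StringBuilder, which
    ' appends into a growing buffer instead of copying the whole string.
    Function ReadRecords(ByVal strSource As String, _
                         ByVal intStartRec As Integer, _
                         ByVal intEndRec As Integer) As String
        Dim sb As New StringBuilder()
        Using reader As New StreamReader(strSource, Encoding.GetEncoding(1252))
            Dim intX As Integer = 0
            Dim strLine As String = reader.ReadLine()
            Do While strLine IsNot Nothing AndAlso intX <= intEndRec
                If intX >= intStartRec Then
                    sb.Append(strLine).Append(vbCrLf)
                End If
                intX += 1
                strLine = reader.ReadLine()
            Loop
        End Using
        Return sb.ToString()
    End Function
End Module
```

Since the goal in the original post is a new file, `CopyRecords` is probably the better fit; `ReadRecords` only makes sense if the text is needed in memory afterwards.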
Shit :-S
*enormous blush* :-(
I spent 2 hours trying to get it faster and getting frustrated, and I didn't
think about the string concatenation :-(

Thanks a lot! You made my day! hehe :-)

Pieter

Nov 20 '05 #4