Reading a large text file line by line backwards

Is there a way in .NET to read a very large text file (400MB+)
backwards, line by line? In System.IO the FileStream class has a Seek
method, but the only read method requires you to know how many bytes to read
in.

My problem is that the line length of this log file is not constant so there
is no easy way to read one line in. The only thing that is constant is that
each line is terminated by a carriage return.

If my only option is to use the FileStream class, I suppose I could read the
data back in chunks and parse it to find the line breaks...

Is there a better way?
Amy.
Jul 21 '05 #1
About all you can do is seek to, say, 1024 bytes before the end; read 1024
bytes; work through them byte by byte to chop up into lines; seek to 1024
bytes before that; read 1024 bytes; and keep going.
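
The steps above can be sketched roughly as follows. This is only an illustration, not code from the thread: `ReadLinesBackwards`, `filePath` and `chunkSize` are made-up names, and the sketch assumes a single-byte (ASCII) file whose lines end in LF or CR+LF, plus a compiler that supports `Iterator`/`Yield` (VB 11 or later). The unfinished piece at the front of each chunk is carried over and glued onto the end of the next chunk read, which is the chunk that comes earlier in the file.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module ReverseLineReader
    ' Hypothetical helper (not from the thread): yields the lines of a file, last line first.
    Public Iterator Function ReadLinesBackwards(filePath As String,
                                                Optional chunkSize As Integer = 1024) As IEnumerable(Of String)
        Using fs As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            Dim position As Long = fs.Length
            Dim carry As String = ""            ' tail of a line that begins in an earlier chunk
            Dim firstChunk As Boolean = True

            While position > 0
                ' Step back one chunk (or whatever is left at the start of the file).
                Dim wanted As Integer = CInt(Math.Min(CLng(chunkSize), position))
                position -= wanted
                fs.Seek(position, SeekOrigin.Begin)

                ' Read may return fewer bytes than asked for, so loop until the chunk is full.
                Dim bytes(wanted - 1) As Byte
                Dim got As Integer = 0
                While got < wanted
                    Dim n As Integer = fs.Read(bytes, got, wanted - got)
                    If n = 0 Then Throw New EndOfStreamException()
                    got += n
                End While

                ' Assumption: ASCII text. Split on LF; a trailing CR is trimmed per line.
                Dim parts() As String = (Encoding.ASCII.GetString(bytes) & carry).Split(ControlChars.Lf)
                Dim upper As Integer = parts.Length - 1
                ' A file that ends with a line break leaves one empty trailing piece; skip it.
                If firstChunk AndAlso parts(upper) = "" Then upper -= 1
                firstChunk = False

                ' Piece 0 may be incomplete, so it is carried into the next pass instead.
                For i As Integer = upper To 1 Step -1
                    Yield parts(i).TrimEnd(ControlChars.Cr)
                Next
                carry = parts(0)
            End While

            ' Whatever is left over is the first line of the file.
            If fs.Length > 0 Then Yield carry.TrimEnd(ControlChars.Cr)
        End Using
    End Function

    Sub Main(args As String())
        ' Usage: pass the path of the log file as the first command-line argument.
        For Each textLine As String In ReadLinesBackwards(args(0))
            Console.WriteLine(textLine)
        Next
    End Sub
End Module
```

Because the function is an iterator, the caller can stop after the first few lines and the rest of the file is never read.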

The problem is that, as you know, "lines" mean nothing to the low-level disk
functions in the operating system. This isn't like an IBM mainframe where
files have a "record length" and are divided into "card images".

I suggested 1024 because the cluster size (allocation unit) is probably a
multiple of 1024, so you will be synchronized with the disk sector
boundaries; that should be mildly advantageous as regards speed.
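
One caveat on the alignment point: a read that starts 1024 bytes before the end of the file only lands on a 1024-byte boundary if the file length happens to be a multiple of 1024. A possible way around that, sketched below, is to make the first (tail) read short so that every later read starts on a multiple of the chunk size. `ChunkRanges` is a hypothetical helper, and the sketch assumes that file offsets line up with allocation units on disk; whether that gives a measurable speed-up is not something this thread establishes.

```vbnet
Imports System
Imports System.Collections.Generic

Module AlignedChunks
    ' Hypothetical helper: returns (offset, length) pairs, last chunk first,
    ' where every offset is a multiple of chunkSize.
    Public Function ChunkRanges(fileLength As Long,
                                Optional chunkSize As Integer = 1024) As List(Of KeyValuePair(Of Long, Integer))
        Dim ranges As New List(Of KeyValuePair(Of Long, Integer))
        Dim position As Long = fileLength
        While position > 0
            ' Start of the aligned block that contains the byte just before position.
            Dim blockStart As Long = ((position - 1) \ chunkSize) * chunkSize
            ranges.Add(New KeyValuePair(Of Long, Integer)(blockStart, CInt(position - blockStart)))
            position = blockStart
        End While
        Return ranges
    End Function

    Sub Main()
        ' For a 2500-byte file this prints:
        '   2048 452
        '   1024 1024
        '   0 1024
        For Each r In ChunkRanges(2500)
            Console.WriteLine("{0} {1}", r.Key, r.Value)
        Next
    End Sub
End Module
```

Only the first read is shorter than the chunk size; each pair can be fed straight into a Seek followed by a Read.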
Jul 21 '05 #2
ehubbell
Old thread, but Google grabs it so I'll add to it.

Here's how I did it in VB.NET. We have text files of around 4 GB, and we usually only want the bottom half of them. This ran faster than just calling ReadLine over the whole file (although on a local drive I didn't see any difference at all). Just the basic framework:

    ' These two Imports go at the top of the source file.
    Imports System.IO
    Imports System.Text

    Public Sub ReadTextFileBackwards(ByVal sFilePath As String, ByVal sSearchString As String)
        Dim stringArray() As String
        Dim sBuffer As String = ""   ' partial line carried over from the previous block

        ' The ideal block size is something of a holy grail, I guess. After some
        ' testing of sizes from 1K to 75K, it looked like speed really dropped off
        ' after the mid 60Ks or so. When testing for the minimum time to finish a task,
        ' the numbers 41K and 42K kept coming up, with an occasional 34K thrown in for fun.
        ' There may be an ideal size for each file - I'm not really sure. What I do know
        ' is that over a network drive, this was 7X faster than just doing ReadLine until
        ' the cows come home. On a local drive, it doesn't make as much of a difference
        ' because the file access calls are similar in load to the overhead necessary
        ' when pulling large blocks. But when it is a network drive, the file access gets
        ' slower, so we're much better off minimizing the number of times we access
        ' the file.
        Dim iBlockSize As Integer = 41000
        Dim iFirstElement As Integer = 1

        ' Using closes the stream even if an exception is thrown.
        Using streamTextFile As Stream = File.OpenRead(sFilePath)
            Dim lPosition As Long = streamTextFile.Length

            While lPosition > 0
                If lPosition <= iBlockSize Then
                    ' Final block (start of the file): its first element is a complete line.
                    iBlockSize = CInt(lPosition)
                    iFirstElement = 0
                End If
                lPosition -= iBlockSize
                streamTextFile.Seek(lPosition, SeekOrigin.Begin)

                ' Read can return fewer bytes than requested, so loop until the block is full.
                Dim byteArray(iBlockSize - 1) As Byte
                Dim iRead As Integer = 0
                While iRead < iBlockSize
                    Dim iCount As Integer = streamTextFile.Read(byteArray, iRead, iBlockSize - iRead)
                    If iCount = 0 Then Throw New EndOfStreamException()
                    iRead += iCount
                End While

                ' Append the carried-over fragment before splitting, so that a CRLF
                ' straddling a block boundary still counts as a line break.
                stringArray = Split(Encoding.ASCII.GetString(byteArray) & sBuffer, vbCrLf)
                For i As Integer = stringArray.GetUpperBound(0) To iFirstElement Step -1
                    If stringArray(i).Contains(sSearchString) Then
                        MsgBox("Found It!")
                    End If
                Next
                ' Element 0 may be only the tail of a line; finish it on the next pass.
                sBuffer = stringArray(0)
            End While
        End Using
    End Sub
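
The block-size testing described in the comments could be repeated with something like the sketch below. It is not the harness used for the numbers above: `CountLineFeedsBackwards` is a hypothetical stand-in that reads the whole file backwards in blocks and counts line-feed bytes, and the list of sizes is just a sample. "K" is taken to mean 1000 bytes, matching the 41000 used in the code. Bear in mind that the operating system's file cache can make every run after the first look faster than it really is.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO

Module BlockSizeTiming
    ' Hypothetical helper: reads the file backwards in blocks of blockSize bytes
    ' and returns the number of line-feed bytes (value 10) it saw.
    Public Function CountLineFeedsBackwards(filePath As String, blockSize As Integer) As Long
        Dim count As Long = 0
        Using fs As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            Dim position As Long = fs.Length
            Dim bytes(blockSize - 1) As Byte
            While position > 0
                Dim wanted As Integer = CInt(Math.Min(CLng(blockSize), position))
                position -= wanted
                fs.Seek(position, SeekOrigin.Begin)

                ' Loop because Read may return fewer bytes than requested.
                Dim got As Integer = 0
                While got < wanted
                    Dim n As Integer = fs.Read(bytes, got, wanted - got)
                    If n = 0 Then Throw New EndOfStreamException()
                    got += n
                End While

                For i As Integer = 0 To wanted - 1
                    If bytes(i) = 10 Then count += 1
                Next
            End While
        End Using
        Return count
    End Function

    Sub Main(args As String())
        ' Usage: pass the path of the file to test as the first command-line argument.
        For Each kb As Integer In {1, 8, 34, 41, 64, 75}
            Dim timer As Stopwatch = Stopwatch.StartNew()
            Dim lineFeeds As Long = CountLineFeedsBackwards(args(0), kb * 1000)
            Console.WriteLine("{0}K: {1} line feeds in {2} ms", kb, lineFeeds, timer.ElapsedMilliseconds)
        Next
    End Sub
End Module
```

Every block size should report the same line-feed count; only the elapsed time should differ.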

I probably won't check back, but comments are always welcome. ~Ed
May 3 '06 #3
