how to efficiently find line number k

I have a big text file with millions of lines, and given any number k
as input, I want to output line k. What is the most efficient way to
do this, other than checking end-of-line k times?
Thanks
Nov 14 '05 #1


b83503104 wrote:
I have a big text file with millions of lines, and given any number k
as input, I want to output line k. What is the most efficient way to
do this, other than checking end-of-line k times?
Thanks


No matter how you do it, you will have to read every character before it.
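
For illustration, here is a minimal sketch of that one-pass scan in Python (the thread names no language, so the language and the helper name `read_line_k` are assumptions, not anything the posters wrote). It reads and discards the first k-1 lines, then returns line k as raw bytes.

```python
from itertools import islice


def read_line_k(path, k):
    """Return line k (1-based) of the file at `path` as bytes,
    or None if the file has fewer than k lines.

    Hypothetical helper for illustration only.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    with open(path, "rb") as f:
        # islice reads and discards the first k-1 lines, then yields line k.
        # Every byte before line k is still read from the file.
        return next(islice(f, k - 1, k), None)
```

The cost grows with k on every call, which is why this only makes sense for a one-off lookup.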

BTW, this question is off-topic here.
Nov 14 '05 #2
b8*******@yahoo.com (b83503104) wrote:
I have a big text file with millions of lines, and given any number k
as input, I want to output line k. What is the most efficient way to
do this, other than checking end-of-line k times?


For a one-shot run, that _is_ the most efficient way. If you run this
search repeatedly, and you know that the text file is not going to
change, you can create an index file (or even an index in memory, if you
want to waste several million bytes on it). If you run this search
repeatedly, and the text file does change, probably the most efficient
solution is to get the program that changes it to also update the index.
Basically, you'd be turning your text file into a database of sorts.
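
A rough sketch of the in-memory index idea, again in Python and under the assumption that the file does not change between building the index and using it. The function names `build_line_index` and `get_line` are made up for this example. The index stores the byte offset at which each line starts, so a later lookup is one seek plus one line read.

```python
def build_line_index(path):
    """Scan the file once and return a list of byte offsets,
    one per line, giving where each line starts.

    Hypothetical helper for illustration only.
    """
    offsets = []
    pos = 0
    with open(path, "rb") as f:  # binary mode so offsets are exact byte counts
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def get_line(path, offsets, k):
    """Return line k (1-based) as bytes using the prebuilt index,
    or None if k is out of range."""
    if k < 1 or k > len(offsets):
        return None
    with open(path, "rb") as f:
        f.seek(offsets[k - 1])  # jump straight to the start of line k
        return f.readline()
```

Building the index still costs one full pass over the file, and the list holds one integer per line. If the file is modified, the offsets go stale and the index has to be rebuilt or updated by whatever writes the file.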

Richard
Nov 14 '05 #3
