randomly write to a file

hi,
I am developing a desktop search. For the index of the files I have
developed an algorithm with which I should be able to read and write
a line if I know its line number.
I can read a specified line using the linecache module, but I am stuck
on how to write to the nth line of a file EFFICIENTLY, which means I
don't want to traverse the file sequentially to reach the nth line.

Please help.
Regards
Rohit

May 7 '07 #1
Hi,

Looking through the archives, it looks like some recommend reading the
file into a list and doing it that way. And if the file is too big,
then use a database. See links below:

http://mail.python.org/pipermail/tut...ch/045571.html
http://mail.python.org/pipermail/tut...ch/045572.html

I also found this interesting idea that explains what would be needed
to accomplish this task:

http://mail.python.org/pipermail/pyt...il/076890.html

Have fun!

Mike

May 7 '07 #2
You can only replace a line in-place with another of exactly the same
length. If the lengths differ, you have to write the modified line and all
the following ones.
If all your lines are of fixed length, you have a "record". To read record
N (counting from 0):
    a_file.seek(N * record_length)
    return a_file.read(record_length)
And then you are reinventing ISAM.
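
To make the fixed-length idea concrete, here is a minimal sketch, not a definitive implementation. It assumes every record is padded with spaces to a fixed size; RECORD_LENGTH, pack(), read_record() and write_record() are names made up for the illustration.

```python
import os
import tempfile

RECORD_LENGTH = 32  # assumed fixed record size in bytes, newline included


def pack(text):
    """Pad text with spaces to exactly RECORD_LENGTH bytes (hypothetical helper)."""
    data = text.encode("utf-8")
    if len(data) > RECORD_LENGTH - 1:
        raise ValueError("record too long for the fixed size")
    return data.ljust(RECORD_LENGTH - 1) + b"\n"


def read_record(a_file, n):
    """Return record n (counting from 0) without scanning earlier records."""
    a_file.seek(n * RECORD_LENGTH)
    return a_file.read(RECORD_LENGTH).rstrip().decode("utf-8")


def write_record(a_file, n, text):
    """Overwrite record n in place; the file size does not change."""
    a_file.seek(n * RECORD_LENGTH)
    a_file.write(pack(text))


# Build a small demo file with three records.
path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "wb") as f:
    for name in ("alpha", "beta", "gamma"):
        f.write(pack(name))

# Replace the second record, then read records back by number.
with open(path, "r+b") as f:
    write_record(f, 1, "BETA-2")
    second = read_record(f, 1)
    third = read_record(f, 2)
```

The price is the padding: every record occupies RECORD_LENGTH bytes no matter how short its content is.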

--
Gabriel Genellina

May 7 '07 #3
Rohit,

Consider using an SQLite database. It comes with Python 2.5 and
higher. SQLite will do a nice job keeping track of the index. You can
easily find the line you need with a SQL query and you can write to
it as well. When you have a file and you write to one line of the
file, all of the rest of the lines will have to be shifted to
accommodate the potentially larger new line.
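
A rough sketch of what this could look like with the standard sqlite3 module. The table name, column names and the two helper functions are assumptions made for the example, not something from the original application.

```python
import sqlite3

# ":memory:" keeps the demo self-contained; a file path would make it persistent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lines (line_no INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO lines (line_no, content) VALUES (?, ?)",
    [(1, "/home/user/a.txt"), (2, "/home/user/b.txt"), (3, "/home/user/c.txt")],
)
conn.commit()


def read_line(n):
    """Fetch the content stored for line number n, or None if it is missing."""
    row = conn.execute(
        "SELECT content FROM lines WHERE line_no = ?", (n,)
    ).fetchone()
    return row[0] if row else None


def write_line(n, content):
    """Replace the content of line n; the new text may be any length."""
    conn.execute("UPDATE lines SET content = ? WHERE line_no = ?", (content, n))
    conn.commit()


write_line(2, "/home/user/a/much/longer/path/b.txt")
```

Because line_no is the primary key, both the lookup and the update go through SQLite's own index rather than a sequential scan.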

-Nick Vatamaniuc

May 7 '07 #4
nick,
I just wanted to ask: for time-constrained applications like searching,
won't SQLite be an expensive approach?
I mean, searching and editing the files directly is less expensive in
terms of time taken.
So I need an approach that lets me write randomly to a line in a file
without using a database.
May 7 '07 #5
hi gabriel,
I am storing file names and their paths, each written on a single line
of a file.
If I use records, that would waste too much space, as there is no upper
limit on the number of characters in a path.
The next best approach I can think of is reading the file into memory,
editing it, and writing back the portion that has just been altered and
the following lines.
But is there a better approach you can highlight?
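
For reference, the "rewrite the altered line and everything after it" approach might be sketched like this. replace_line() is a hypothetical helper; it still walks the first n lines to find the start position, which a stored byte offset could replace with a single seek.

```python
import os
import tempfile


def replace_line(path, n, new_text):
    """Replace line n (counting from 0), rewriting only from that line onward."""
    with open(path, "r+b") as f:
        # Skip the first n lines to find where line n starts.
        for _ in range(n):
            if not f.readline():
                raise IndexError("file has fewer than %d lines" % (n + 1))
        start = f.tell()
        if not f.readline():  # consume the old version of line n
            raise IndexError("file has fewer than %d lines" % (n + 1))
        tail = f.read()  # everything after the old line
        f.seek(start)
        f.write(new_text.encode("utf-8") + b"\n")
        f.write(tail)
        f.truncate()  # drop leftover bytes if the file became shorter


path = os.path.join(tempfile.mkdtemp(), "index.txt")
with open(path, "wb") as f:
    f.write(b"/a/one\n/b/two\n/c/three\n")

replace_line(path, 1, "/b/a/longer/path/two")
with open(path, "rb") as f:
    after_longer = f.read()

replace_line(path, 0, "/x")
with open(path, "rb") as f:
    after_shorter = f.read()
```

The bytes before the altered line are never rewritten, but everything after it is, so edits near the start of a large file stay expensive.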

May 7 '07 #6
Unless you are lucky enough to be using an OS that natively supports
random access to text files by line number, if such a thing even exists,
you can't, because you don't know how long each line will be.

If you can guarantee fixed-length lines, then you can use file.seek() to
jump to the appropriate byte position.

If the lines vary in length, but you can control access to the files
so other applications can't write to them, you can keep an index table,
which you update as needed.

Otherwise, if the files are small enough, say up to 20 or 40MB each, just
read them entirely into memory.

Otherwise, you're out of luck.
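
As an illustration of the index-table option above, here is a minimal sketch. It assumes nobody else modifies the file behind your back; build_index() and read_line() are hypothetical names.

```python
import os
import tempfile


def build_index(path):
    """Scan the file once and return the byte offset where each line starts."""
    offsets = []
    position = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(position)
            position += len(line)
    return offsets


def read_line(path, offsets, n):
    """Jump straight to line n (counting from 0) using the index table."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().rstrip(b"\r\n").decode("utf-8")


path = os.path.join(tempfile.mkdtemp(), "index.txt")
with open(path, "wb") as f:
    f.write(b"/a/one\n/b/two\n/c/three\n")

index = build_index(path)
third_line = read_line(path, index, 2)
```

The table has to be rebuilt or adjusted whenever a line changes length, since every later line then starts at a different byte.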
--
Steven.
May 8 '07 #7
On Mon, 07 May 2007 14:41:02 -0700, Nick Vatamaniuc wrote:
> Rohit,
>
> Consider using an SQLite database. It comes with Python 2.5 and higher.
> SQLite will do a nice job keeping track of the index. You can easily
> find the line you need with a SQL query and you can write to it as
> well. When you have a file and you write to one line of the file, all of
> the rest of the lines will have to be shifted to accommodate the
> potentially larger new line.

Using a database for tracking line number and byte position -- isn't
that a bit overkill?

I would have thought something as simple as a list of line lengths would
do:

    offsets = [35,  # first line is 35 bytes long
               19,  # second line is 19 bytes long...
               45, 12, 108, 67]

To get to the nth line, you have to seek to byte position:

    sum(offsets[:n])
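
Summing the list on every lookup walks n entries each time. One possible refinement, sketched here with the example lengths from above, is to precompute the running totals once. The names line_lengths and starts are just for this illustration.

```python
from itertools import accumulate

line_lengths = [35, 19, 45, 12, 108, 67]  # the example lengths listed above

# starts[n] is the byte position where line n (counting from 0) begins.
# The last running total is the file size, not a line start, so drop it.
starts = [0] + list(accumulate(line_lengths))[:-1]

# Seeking to line 3 is now a single list lookup instead of a sum.
position_of_line_3 = starts[3]
```

If a line changes length, every later entry in starts has to be shifted by the difference.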

--
Steven.
May 8 '07 #8
Steven D'Aprano <st****@REMOVE.THIS.cybersource.com.au> wrote:
> On Mon, 07 May 2007 14:41:02 -0700, Nick Vatamaniuc wrote:
>> Rohit,
>>
>> Consider using an SQLite database. It comes with Python 2.5 and higher.
>> SQLite will do a nice job keeping track of the index. You can easily
>> find the line you need with a SQL query and you can write to it as
>> well. When you have a file and you write to one line of the file, all of
>> the rest of the lines will have to be shifted to accommodate the
>> potentially larger new line.
>
> Using a database for tracking line number and byte position -- isn't
> that a bit overkill?
>
> I would have thought something as simple as a list of line lengths would
> do:
>
>     offsets = [35,  # first line is 35 bytes long
>                19,  # second line is 19 bytes long...
>                45, 12, 108, 67]
>
> To get to the nth line, you have to seek to byte position:
>
>     sum(offsets[:n])

...and then you STILL can't write there (without reading and rewriting
all the succeeding part of the file) unless the line you're writing is
always the same length as the one you're overwriting, which doesn't seem
to be part of the constraints in the OP's original application. I'm
with Nick in recommending SQLite for the purpose -- it _IS_ quite
"lite", as its name suggests. BSD-DB (a DB that's much more complicated
to use, being far lower-level, but by the same token affords you
extremely fine-grained control of operations) might be an alternative
IF, after first having coded the application with SQLite, you can indeed
prove, profiler in hand, that it's a serious bottleneck. However,
premature optimization is the root of all evil in programming.

Alex
May 8 '07 #9
On Mon, 07 May 2007 20:00:57 -0700, Alex Martelli wrote:
> Steven D'Aprano <st****@REMOVE.THIS.cybersource.com.au> wrote:
>> On Mon, 07 May 2007 14:41:02 -0700, Nick Vatamaniuc wrote:
>>> Rohit,
>>>
>>> Consider using an SQLite database. It comes with Python 2.5 and
>>> higher. SQLite will do a nice job keeping track of the index. You can
>>> easily find the line you need with a SQL query and you can write to
>>> it as well. When you have a file and you write to one line of the
>>> file, all of the rest of the lines will have to be shifted to
>>> accommodate the potentially larger new line.
>>
>> Using a database for tracking line number and byte position -- isn't
>> that a bit overkill?
>>
>> I would have thought something as simple as a list of line lengths
>> would do:
>>
>>     offsets = [35,  # first line is 35 bytes long
>>                19,  # second line is 19 bytes long...
>>                45, 12, 108, 67]
>>
>> To get to the nth line, you have to seek to byte position:
>>
>>     sum(offsets[:n])
>
> ...and then you STILL can't write there (without reading and rewriting
> all the succeeding part of the file) unless the line you're writing is
> always the same length as the one you're overwriting, which doesn't seem
> to be part of the constraints in the OP's original application. I'm
> with Nick in recommending SQLite for the purpose -- it _IS_ quite
> "lite", as its name suggests.

Hang on, as I understand it, Nick was just suggesting using SQLite for
holding indexes into the file! That's why I said it was overkill. So
whether the indexes are in a list or a database, you've _still_ got to
deal with writing to the file.

If I've misunderstood Nick's suggestion, if he actually meant to read the
entire text file into the database, well, that's just a heavier version
of reading the file into a list of strings, isn't it? If the database
gives you more and/or better functionality than file.readlines(), then I
have no problem with using the right tool for the job.
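
For comparison, the plain "list of strings" version might look like this minimal sketch. It assumes the file fits comfortably in memory; replace_line_in_memory() is a hypothetical helper name.

```python
import os
import tempfile


def replace_line_in_memory(path, n, new_text):
    """Load every line, swap line n (counting from 0), and write the file back."""
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()  # each entry keeps its trailing newline
    lines[n] = new_text + "\n"
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.writelines(lines)


path = os.path.join(tempfile.mkdtemp(), "index.txt")
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("first\nsecond\nthird\n")

replace_line_in_memory(path, 1, "a much longer second line")

with open(path, "r", encoding="utf-8") as f:
    result = f.read()
```

This rewrites the whole file on every edit, which is exactly the cost a database avoids.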
--
Steven.
May 8 '07 #10
