User "Andy Dingley" wrote:
>Don't make XML files that are 250MB in size.
I didn't create this file. It contains about 100,000 records, which I
import into my program. Everything works. Unfortunately, several records
in the file contain errors that I want to correct. I don't want to write
additional code just to correct imported data; I would rather make a few
changes in the source file. Of course I could write code for editing imported
data, but I don't need that functionality for anything except correcting
these errors. I also have no access to the editor that exported the XML file.
User "Juergen Kahrs" wrote:
>Use vim, the improved vi editor. I have edited such
large XML files with vi several times ....
Thanks! I've tried it and it's a good solution for me.
With this configuration:
- set enc=utf-8 (UTF-8 encoding)
- set undolevels=-1 (disables undo; maybe vim is faster with this ...)
the timings for individual editing tasks in gvim are:
- opening the 250MB XML file: 15 seconds
- searching for a word (case sensitive): up to 20 seconds (depending on its
position in the file)
In my opinion this could be better; for example, Total Commander's
default viewer takes only 2 seconds!
But it is acceptable, because I only want to make a few dozen
changes.
- going to a specific line of the file, either by entering the line number or
by dragging the vertical scrollbar with the mouse: veeeery slow, so don't do this!
- making small changes (for example inserting and deleting some lines of
text; writing something): smooth
- writing the changes to the file (for example once all changes are done): 15
seconds
I have an Athlon 2500 with 1GB of RAM. gvim used only 300MB, so 512MB of RAM
remained free.
User "Juergen Kahrs" wrote:
>... and you hardly
notice the difference between 10 MB and 200 MB files.
Current versions of vim (when configured properly)
can also edit any UTF-8 characters, for example Japanese.
I can notice the difference between searches that take 2 seconds and 20
seconds :) But you are right that "making small changes (for example
inserting and deleting some lines of text; writing something)" is very fast.
User "Joe Kesselman" wrote:
>Another alternative is a stream editor -- the Unix tool "sed" or
something equivalent. Downside of that is that it isn't interactive; you
have to essentially write a program that tells it how to find the points
you want changed and what you want done with them.
I would prefer something interactive, because every change will be different
.... I don't want to write a program every time ...
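For anyone who does want to try the non-interactive route, the stream-editing idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `stream_replace`, the file names, and the example strings are hypothetical, and it assumes every fix is a literal piece of text that fits on a single line of the file.

```python
def stream_replace(src_path, dst_path, fixes, encoding="utf-8"):
    """Copy src_path to dst_path line by line, applying literal replacements.

    fixes is a dict {wrong_text: corrected_text} (hypothetical example data).
    Only one line is held in memory at a time, so the size of the file
    does not matter. Returns the number of lines that were changed.
    """
    changed = 0
    # newline="" keeps the original line endings (\n or \r\n) untouched.
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            fixed = line
            for wrong, right in fixes.items():
                fixed = fixed.replace(wrong, right)
            if fixed != line:
                changed += 1
            dst.write(fixed)
    return changed
```

A call could look like `stream_replace("export.xml", "export_fixed.xml", {"<name>Jon</name>": "<name>John</name>"})`. The original file is never modified, so a wrong fix costs only another run.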
>Or find/write a tool that will handle your document in chunks, either
text-based or SAX-based. Again, that presumes that what you're doing
divides up nicely.
Unfortunately I can't find such a tool ...
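If no ready-made tool turns up, a SAX-based one can be sketched with Python's standard library. The sketch below is not a finished tool: the names `FixText` and `fix_file` and the element name used in the example are hypothetical, and it assumes the element being corrected contains only text (no child elements). Comments and any DOCTYPE in the input are not carried over by this approach.

```python
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class FixText(XMLFilterBase):
    """Pass every SAX event through, correcting the text of one element type."""

    def __init__(self, parent, element, fixes):
        super().__init__(parent)
        self.element = element   # hypothetical element name, e.g. "city"
        self.fixes = fixes       # {wrong_text: corrected_text}
        self.buffer = None       # collects text while inside `element`

    def startElement(self, name, attrs):
        if name == self.element:
            self.buffer = []
        super().startElement(name, attrs)

    def characters(self, content):
        if self.buffer is not None:
            self.buffer.append(content)  # hold the text until the element ends
        else:
            super().characters(content)

    def endElement(self, name):
        if name == self.element and self.buffer is not None:
            text = "".join(self.buffer)
            # Emit the corrected text, or the original if no fix applies.
            super().characters(self.fixes.get(text, text))
            self.buffer = None
        super().endElement(name)


def fix_file(src_path, dst_path, element, fixes):
    """Stream src_path through the filter and write the result to dst_path."""
    with open(src_path, "rb") as src, \
         open(dst_path, "w", encoding="utf-8") as out:
        reader = FixText(xml.sax.make_parser(), element, fixes)
        reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
        reader.parse(src)
```

For example, `fix_file("export.xml", "export_fixed.xml", "city", {"Warsawa": "Warszawa"})` would rewrite only `<city>` elements whose whole text is exactly `Warsawa`. Because the parser works on events, memory use stays small regardless of the file size.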
User ac*******@yahoo.co.uk wrote:
>If you're on Windows you could try TextPad (you can get a full-featured
evaluation version to test) or EmEditor (free standard version with
most features).
Here are the statistics with the default configuration ;)
- opening the 250MB XML file: 70 seconds
- searching for a word at the end of the file: 45 seconds
- dragging the vertical scrollbar with the mouse: smooth :)
- making small changes (for example inserting and deleting some lines of
text; writing something): sometimes 0.5 seconds, sometimes 30 seconds :(((
30 seconds is long, but maybe it will be acceptable for someone ...
- writing the changes to the file (for example once all changes are done): not
tested ;)
P.S. Sorry for errors, my English isn't good.