pkirk25 wrote:
Quote:
My data is in a big file that I have no control over. Sometimes it's
over 30 MB and often there are several of them.
>
It is machine generated and is nicely formatted. Example text follows:
>
AuctioneerSnapshotDB = {
["nordrassil-neutral"] = {
["nextAuctionId"] = 20,
["version"] = 1,
["updates"] = {
[1] = "15416.012;;0;0;0;0;0;0",
},
["auctions"] = {
[1] = "16717;0;0;0;1;1650000;1650000;Boneglay;0;0;3;1159391569;1159420369",
[2] = "6661;0;0;0;1;399900;599900;Krius;0;0;2;1159391569;1159398769",
[3] = "6657;0;0;1289192110;1;7300;7900;Bootyboy;0;0;4;1159391569;1159477969",
[19] = "9865;1191;0;680935487;1;5013;8000;Warmist;0;0;1;1159391569;1159393369",
},
["ahKey"] = "nordrassil-neutral",
>
I think I will be able to find what I want and populate my structs by
looking for keywords like "nordrassil-neutral" and "ahKey". The code
is not pretty. In fact, it seems to have words like "Fragile - Handle
with Care" stamped all over it.
>
A pseudocode version might read:
do {
    Copy the next line into a temporary string
    if we have found the keyword "nordrassil-neutral" and have found the
    keyword "auctions"
        if the line contains 10 ";"
            populate the struct
} while we have not found the keyword "ahKey"
>
I can tell that there are 3 contiguous "\t" before each numbered line.
But my question is if this is the right approach to a structured
document or is there a better way? I can see that there is a rational
structure but can't see how to use the formatted text better than my
brute-force counting approach.
It looks like the file is a collection of key-value pairs. The
keys are either identifiers (the top level only, apparently), or
strings or numbers enclosed in square brackets. The values are
either numbers, strings, or lists of key-value pairs enclosed in
curly braces and separated by commas. Your best bet would be to
parse the file based on that structure.
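
To make that concrete, here is a minimal recursive-descent sketch in C.
It assumes the grammar is exactly what the sample shows: one top-level
"identifier = value", keys in square brackets, values that are numbers,
double-quoted strings, or brace-enclosed lists, and no comments. The
names parse_table, leaf_fn, auction_stats and count_auctions are made
up for illustration, and the fixed buffer sizes are arbitrary.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Called once per string or number value; path is the chain of keys
   leading to it, joined with '/'. */
typedef void (*leaf_fn)(const char *path, const char *value, void *ctx);

typedef struct {
    const char *p;   /* current read position */
    char path[256];  /* key path of the value being parsed */
    leaf_fn on_leaf;
    void *ctx;
} parser;

/* Skip whitespace, then consume c if it is next. Returns 1 if consumed. */
static int expect(parser *ps, char c) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
    if (*ps->p != c) return 0;
    ps->p++;
    return 1;
}

/* Read a quoted string, or a bare number/identifier, into buf.
   Text that does not fit in buf is silently truncated. */
static int scalar(parser *ps, char *buf, size_t cap) {
    size_t n = 0;
    if (expect(ps, '"')) {
        for (; *ps->p && *ps->p != '"'; ps->p++) {
            if (*ps->p == '\\' && ps->p[1]) ps->p++;  /* keep escaped char */
            if (n + 1 < cap) buf[n++] = *ps->p;
        }
        if (*ps->p != '"') return 0;  /* unterminated string */
        ps->p++;
    } else {
        while (*ps->p && !isspace((unsigned char)*ps->p) &&
               !strchr("{}[]=,\"", *ps->p)) {
            if (n + 1 < cap) buf[n++] = *ps->p;
            ps->p++;
        }
        if (n == 0) return 0;
    }
    buf[n] = '\0';
    return 1;
}

static int value(parser *ps);

/* pair := '[' key ']' '=' value. The key extends the path for its value. */
static int pair(parser *ps) {
    char key[64];
    size_t len = strlen(ps->path);
    if (!expect(ps, '[') || !scalar(ps, key, sizeof key) ||
        !expect(ps, ']') || !expect(ps, '='))
        return 0;
    snprintf(ps->path + len, sizeof ps->path - len, "/%s", key);
    if (!value(ps)) return 0;
    ps->path[len] = '\0';  /* drop the key again */
    return 1;
}

/* value := scalar | '{' [ pair { ',' pair } [ ',' ] ] '}' */
static int value(parser *ps) {
    char buf[512];
    if (!expect(ps, '{')) {
        if (!scalar(ps, buf, sizeof buf)) return 0;
        ps->on_leaf(ps->path, buf, ps->ctx);
        return 1;
    }
    for (;;) {
        if (expect(ps, '}')) return 1;
        if (!pair(ps)) return 0;
        if (!expect(ps, ',')) return expect(ps, '}');
    }
}

/* file := identifier '=' value. Returns 1 if the text parsed cleanly. */
int parse_table(const char *text, leaf_fn on_leaf, void *ctx) {
    parser ps = { text, "", on_leaf, ctx };
    assert(text != NULL && on_leaf != NULL);
    if (!scalar(&ps, ps.path, sizeof ps.path) || !expect(&ps, '=')) return 0;
    return value(&ps);
}

/* Example callback: count the values under an "auctions" table. */
typedef struct { int count; char last[512]; } auction_stats;

static void count_auctions(const char *path, const char *val, void *ctx) {
    auction_stats *s = ctx;
    if (strstr(path, "/auctions/") == NULL) return;
    s->count++;
    snprintf(s->last, sizeof s->last, "%s", val);
}
```

With the sample data, the callback would see paths such as
"AuctioneerSnapshotDB/nordrassil-neutral/auctions/1", so the struct can
be filled by matching on the path rather than on indentation or on
which keyword happened to come before. A real file may hold several
top-level assignments or use syntax this sketch does not handle, so
treat a 0 return as "look at the input", not as proof that it is bad.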
If you rely on counting tabs and semi-colons or the order of the
keys, you can be sure that the people maintaining the program
that generates the file will change those details without
bothering to tell you. The basic structure could change, too, of
course, but that is less likely.
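
The same caution applies inside each record string. Rather than assume
a fixed number of semicolons, a small splitter can report how many
fields it actually found, so the caller can reject a record whose count
is not what it expects. This is only a sketch; split_fields is a
made-up name. It scans with strchr instead of strtok because strtok
merges adjacent separators and would lose empty fields such as the
second one in the "updates" line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split rec in place on ';'. Pointers to the first max fields are stored
   in fields[]. Returns the total number of fields found, which may be
   larger than max; empty fields are kept. */
size_t split_fields(char *rec, char **fields, size_t max) {
    size_t n = 0;
    char *start = rec;
    assert(rec != NULL && fields != NULL);
    for (;;) {
        char *sep = strchr(start, ';');
        if (n < max) fields[n] = start;
        n++;
        if (sep == NULL) break;
        *sep = '\0';      /* terminate this field */
        start = sep + 1;  /* next field begins after the separator */
    }
    return n;
}
```

The sample auction lines appear to hold 13 fields each, so a caller
might populate its struct only when split_fields returns 13 and log
anything else, which turns a silent format change into a visible one.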
--
Thomas M. Sommers --
tms@nj.net -- AB2SB