473,474 Members | 1,592 Online
Bytes | Software Development & Data Engineering Community
Create Post

Home Posts Topics Members FAQ

Large XML file and some kind of indexing?

Hi Everyone,

I have a very large XML file (~1GB). I would like to essentially
pre-navigate the entire structure using an XmlReader and somehow index the
positions of important elements.

I suppose I'd hoped that I could access the exact file position of an xml
element, store that position with a unique id in a hashtable and then be
able to quickly seek back to that position at a later point to get the data
out.

The only way I can think of achieving this is to implement my own Stream
Reader which will allow seeking as well as returning accurate position
information (incorporating buffering etc).

I've seen documentation about special XPath Indexed Navigators but these
only work when the XML is in memory and that's definitely something I need
to avoid.

I don't suppose anyone has come across a problem like this, or perhaps have
some ideas about solving the problem?

Thanks,
Andrew
Jun 25 '07 #1
0 1857

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

36
by: Andrea Griffini | last post by:
I did it. I proposed python as the main language for our next CAD/CAM software because I think that it has all the potential needed for it. I'm not sure yet if the decision will get through, but...
3
by: chandra.somesh | last post by:
Hi I am having problem indexing large amount of textual data(in range of 200,000 to 1 million). I tried using map container but the compiler just hanged upon entering the data. On bit of...
6
by: Greg | last post by:
I am working on a project that will have about 500,000 records in an XML document. This document will need to be queried with XPath, and records will need to be updated. I was thinking about...
17
by: Luc Mercier | last post by:
Hi Folks, I'm new here, and I need some advice for what tool to use. I'm using XML for benchmarking purposes. I'm writing some scientific programs which I want to analyze. My program generates...
20
by: mike | last post by:
I help manage a large web site, one that has over 600 html pages... It's a reference site for ham radio folks and as an example, one page indexes over 1.8 gb of on-line PDF documents. The site...
7
by: =?Utf-8?B?TW9iaWxlTWFu?= | last post by:
Hello everyone: I am looking for everyone's thoughts on moving large amounts (actually, not very large, but large enough that I'm throwing exceptions using the default configurations). We're...
14
by: mfrsousa | last post by:
hi there, i have a huge large text file (350.000 lines) that i want to import to a MS Acccess Database, of course i don't want to use Access, but do it with C#. i already have tried the...
19
by: rmr531 | last post by:
First of all I am very new to c++ so please bear with me. I am trying to create a program that keeps an inventory of items. I am trying to use a struct to store a product name, purchase price,...
10
by: nflacco | last post by:
I'm tinkering around with a data collection system, and have come up with a very hackish way to store my data- for reference, I'm anticipating collecting at least 100 million different dataId...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
1
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
0
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.