Bytes | Software Development & Data Engineering Community

Huge XML data needed

I want to perform some performance measurements on an XML database. For
this reason I need some huge XML sample data. The data should not be
too structured, and a lot of reasonable queries should make sense.
Any idea where I can get this data?
Jul 20 '05 #1
On 1 Apr 2004 06:45:29 -0800, bc******@web.de (Beda Christoph
Hammerschmidt) wrote:
> Any idea, where I can get this data ??


Make it yourself. That way you can control the size and the
distribution of certain features. If this process is automated, then
you can easily run tests over and over with different parameters.
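
As a rough illustration of the "make it yourself" approach, here is a minimal Python sketch of a seeded generator. The element names (`catalog`, `item`, `note`, `section`, `entry`), the word list, and the parameters are all made up for the example; the random nesting is only one assumed way to keep the data from being too regular.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical vocabulary and element names, chosen only for illustration.
WORDS = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
TAGS = ["note", "section", "entry"]


def generate_catalog(num_items, max_depth=3, max_children=4, seed=0):
    """Build an irregular XML tree with num_items top-level <item> elements."""
    rng = random.Random(seed)  # a fixed seed makes every run repeatable
    root = ET.Element("catalog")

    def add_children(parent, depth):
        # Vary the fan-out and the nesting so the shape is not uniform.
        for _ in range(rng.randint(1, max_children)):
            child = ET.SubElement(parent, rng.choice(TAGS))
            child.set("weight", str(rng.randint(1, 100)))
            if depth < max_depth and rng.random() < 0.5:
                add_children(child, depth + 1)
            else:
                n_words = rng.randint(1, 20)
                child.text = " ".join(rng.choice(WORDS) for _ in range(n_words))

    for i in range(num_items):
        item = ET.SubElement(root, "item", id=str(i))
        add_children(item, 1)
    return ET.ElementTree(root)
```

Something like `generate_catalog(100000).write("sample.xml", encoding="utf-8")` would then produce a file. This version builds the whole tree in memory, so for truly huge output a generator that writes element by element would probably be needed.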

It's often useful (but rarely done) to test not just that "it works",
but also how sensitive the system is to different sorts of load. Does
performance change with many small items, or with few large items?
Does sorted/unsorted input data make a difference?
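
A sensitivity test along these lines could be sketched as follows. This is only an assumed setup: Python's ElementTree stands in for the XML database under test, and the document layout (`items`/`item` with a `key` attribute) and the even-key query are invented for the example.

```python
import random
import time
import xml.etree.ElementTree as ET


def make_document(num_items, words_per_item, sorted_keys, seed=0):
    """Return XML bytes with num_items <item> elements (hypothetical layout)."""
    rng = random.Random(seed)
    keys = [rng.randint(0, 10**6) for _ in range(num_items)]
    if sorted_keys:
        keys.sort()
    root = ET.Element("items")
    for key in keys:
        item = ET.SubElement(root, "item", key=str(key))
        item.text = " ".join("w%d" % rng.randint(0, 999) for _ in range(words_per_item))
    return ET.tostring(root)


def time_load_and_query(xml_bytes):
    """Stand-in for the system under test: parse, then run one query."""
    start = time.perf_counter()
    root = ET.fromstring(xml_bytes)
    matches = [e for e in root.iter("item") if int(e.get("key")) % 2 == 0]
    return time.perf_counter() - start, len(matches)


def sweep():
    """Time every combination of document shape and key ordering."""
    results = []
    # Same total word count (100,000), split into many small or few large items.
    for num_items, words in [(10_000, 10), (100, 1_000)]:
        for sorted_keys in (True, False):
            doc = make_document(num_items, words, sorted_keys)
            seconds, hits = time_load_and_query(doc)
            results.append((num_items, words, sorted_keys, seconds, hits))
    return results


if __name__ == "__main__":
    for row in sweep():
        print("items=%d words=%d sorted=%s time=%.4fs hits=%d" % row)
```

Against a real database, `time_load_and_query` would be replaced by the actual load and query calls, and each measurement would ideally be repeated several times.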

Another source of "real world" data in a large corporate is to connect
to something like an LDAP server and use that. I've also done much of
my own testing with lists of endangered species from the WCMC. You may
also find the W3C site useful, particularly the RDF test cases (not
large, but they do demonstrate many obscure conditions).

--
Smert' spamionam
Jul 20 '05 #2
>>>>> "Beda" == Beda Christoph Hammerschmidt <bc******@web.de> writes:

Beda> I want to perform some performance measurements on an XML database. For
Beda> this reason I need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where I can get this data ??

You might get some RSS feed. RSS is a format used by several news servers to
distribute news. So by definition, there is not much structure, but you can
make reasonable queries, like what happened (some terrorist act), what was the
score (some soccer game), etc.
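
To give an idea of the kind of query meant here, a small Python sketch follows. The feed content is made up, and it assumes the plain RSS 2.0 layout (`channel`/`item` with `title` and `category` children); real feeds vary and may use other elements or namespaces.

```python
import xml.etree.ElementTree as ET

# A tiny made-up RSS 2.0 document; a real test would use saved feeds.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Tampere wins 2-1</title>
      <category>sports</category>
      <description>Late goal decides the match.</description>
    </item>
    <item>
      <title>New library opens</title>
      <category>local</category>
      <description>The city opened a new library.</description>
    </item>
  </channel>
</rss>"""


def titles_in_category(rss_text, category):
    """Return the titles of all items whose <category> equals category."""
    root = ET.fromstring(rss_text)
    return [
        item.findtext("title")
        for item in root.iter("item")
        if item.findtext("category") == category
    ]
```

For example, `titles_in_category(SAMPLE_RSS, "sports")` picks out the soccer result.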

--
Arto V. Viitanen av@cs.uta.fi
University of Tampere, Department of Computer Sciences
Tampere, Finland http://www.cs.uta.fi/~av/
Jul 20 '05 #3
Beda Christoph Hammerschmidt wrote in message news:<c2******* *************** ***@posting.google.com>...
> I want to perform some performance measurements on an XML database. For
> this reason I need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.
> Any idea, where I can get this data ??


I'm not sure what you mean by "huge", but there is a good amount of
data that might be interesting to query at:
http://www.ibiblio.org/xml/examples/shakespeare/

Toivo Lainevool
http://www.XMLPatterns.com - Develop effective DTDs and XML Schema
documents for your XML using structural design patterns.
Jul 20 '05 #4
Why don't you generate them?
Use a free database like MySQL...
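
One way to read this suggestion is to fill a database with random rows and export them as XML. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL, purely so that it runs without a server; the `person` table and its columns are hypothetical.

```python
import random
import sqlite3
import xml.etree.ElementTree as ET


def build_database(num_rows, seed=0):
    """Fill an in-memory table with random rows (hypothetical schema)."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    rows = [(i, "name%d" % rng.randint(0, 9999), rng.randint(18, 90)) for i in range(num_rows)]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def export_to_xml(conn):
    """Turn each row into a <person> element; columns become child elements."""
    root = ET.Element("people")
    cursor = conn.execute("SELECT id, name, age FROM person ORDER BY id")
    for row_id, name, age in cursor:
        person = ET.SubElement(root, "person", id=str(row_id))
        ET.SubElement(person, "name").text = name
        ET.SubElement(person, "age").text = str(age)
    return ET.tostring(root, encoding="unicode")
```

Note that data exported from a relational table is very regular, so on its own it may not satisfy the "not too structured" requirement.
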
bc******@web.de (Beda Christoph Hammerschmidt) wrote in message news:<c2******* *************** ***@posting.google.com>...
> I want to perform some performance measurements on an XML database. For
> this reason I need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.
> Any idea, where I can get this data ??

Jul 20 '05 #5
Arto Viitanen wrote:
>>"Beda" == Beda Christoph Hammerschmidt <bc******@web.de> writes:

Beda> I want to perform some performance measurements on an XML database. For
Beda> this reason I need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where I can get this data ??

> You might get some RSS feed.


But RSS - by definition - is not "huge XML data".
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 20 '05 #6
>>>>> "Johannes" == Johannes Koch <ko**@w3development.de> writes:

Beda> this reason I need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where I can get this data ??
> You might get some RSS feed.


Johannes> But RSS - by definition - is not "huge XML data".

But I got two out of three: it is not too structured and there can be
reasonable queries!
--
Arto V. Viitanen av@cs.uta.fi
University of Tampere, Department of Computer Sciences
Tampere, Finland http://www.cs.uta.fi/~av/
Jul 20 '05 #7
Arto Viitanen wrote:
>>"Johannes" == Johannes Koch <ko**@w3development.de> writes:

Beda> this reason I need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where I can get this data ??
>> You might get some RSS feed.


Johannes> But RSS - by definition - is not "huge XML data".

> But I got two out of three: it is not too structured and there can be
> reasonable queries !


That's right :-)
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 20 '05 #8
On 1 Apr 2004, Fabien R wrote:
> Why don't you generate them ?

Good idea. There are 5 major XML DB Benchmark efforts. Some include data
generators. See:

http://www.rpbourret.com/xml/XMLDBLinks.htm#Benchmarks

Ron Bourret has a link to a benchmark page that I used to maintain, but I
no longer have time to maintain it.

> Use a free-db like MySQL...
> bc******@web.de (Beda Christoph Hammerschmidt) wrote in message news:<c2******* *************** ***@posting.google.com>...
> > I want to perform some performance measurements on an XML database. For
> > this reason I need some huge XML sample data. The data should be not
> > too structured and a lot of reasonable queries should make sense.
> > Any idea, where I can get this data ??



Some benchmarks and performance issues are also covered in the book I
helped edit:

A.B. Chaudhri, A. Rashid and R. Zicari (eds.) (2003) XML data management:
native XML and XML-enabled database systems (Reading, Massachusetts:
Addison-Wesley)

http://www.awprofessional.com/titles/0201844524/

HTH

akmal

Jul 20 '05 #9
bc******@web.de (Beda Christoph Hammerschmidt) writes:
> I want to perform some performance measurements on an XML database. For
> this reason I need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.


This is somewhat structured, but large:

http://rdf.dmoz.org/rdf/content.rdf.u8.gz
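
A file of this size is usually easier to handle as a stream than as one in-memory tree. The sketch below shows one way to do that in Python with `gzip` and `ElementTree.iterparse`; the function name `count_tags` is made up, and nothing here depends on the actual contents of the dump.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(gz_path):
    """Count element tags in a gzipped XML file without loading it whole."""
    counts = Counter()
    with gzip.open(gz_path, "rb") as stream:
        # "end" events fire once an element has been read completely.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            counts[elem.tag] += 1
            elem.clear()  # drop text and children that were already counted
    return counts
```

Elements in a namespace are reported by ElementTree in `{uri}name` form. `elem.clear()` frees text and children, but the emptied elements stay attached to their parents, so memory use still grows slowly on very large inputs.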

Jul 20 '05 #10
