Bytes | Software Development & Data Engineering Community

Huge XML data needed

I want to perform some performance measurements on an XML database. For
this reason I need some huge XML sample data. The data should not be
too structured, and a lot of reasonable queries should make sense.
Any idea where I can get this data?
Jul 20 '05 #1
On 1 Apr 2004 06:45:29 -0800, bc******@web.de (Beda Christoph
Hammerschmidt) wrote:
> Any idea, where i can get this data ??


Make it yourself. That way you can control the size and the
distribution of certain features. If this process is automated, then
you can easily run tests over and over with different parameters.
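
As a rough illustration of the "make it yourself" approach, here is a minimal Python sketch of a generator. Everything in it (the function name `write_records`, the `<dataset>` and `<record>` element names, the word list and the parameters) is made up for the example and not taken from any benchmark. A fixed seed makes runs repeatable, and writing straight to a file object means the output size is limited by disk rather than by memory.

```python
import random
from xml.sax.saxutils import escape, quoteattr

# Hypothetical vocabulary, used both for element names and for text content.
WORDS = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta"]


def write_records(out, n_records, max_fields=6, max_depth=3, seed=0):
    """Write n_records irregular <record> elements to a text file-like object."""
    rng = random.Random(seed)  # fixed seed: same parameters give the same file
    out.write('<?xml version="1.0" encoding="UTF-8"?>\n<dataset>\n')
    for i in range(n_records):
        out.write(f"<record id={quoteattr(str(i))}>")
        _write_fields(out, rng, max_fields, max_depth)
        out.write("</record>\n")
    out.write("</dataset>\n")


def _write_fields(out, rng, max_fields, depth):
    # Each level gets a random number of fields, and some fields nest
    # further, so the records are only loosely structured.
    for _ in range(rng.randint(1, max_fields)):
        name = rng.choice(WORDS)
        out.write(f"<{name}>")
        if depth > 1 and rng.random() < 0.3:
            _write_fields(out, rng, max_fields, depth - 1)
        else:
            text = " ".join(rng.choices(WORDS, k=rng.randint(1, 8)))
            out.write(escape(text))
        out.write(f"</{name}>")
```

Calling `write_records(f, 1_000_000)` on a file opened for writing in UTF-8 would give a document with a million records; changing `max_fields`, `max_depth` or `seed` gives a differently shaped file for the next test run.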

It's often useful (but rarely done) to test, not just that "it works",
but to test for sensitivity to different sorts of load. Does
performance change with many small items, or with few large items ?
Does sorted/unsorted input data make a difference ?
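
A sensitivity test along these lines could be sketched as below. This is only an assumption about how one might set it up: `workload` is a placeholder that parses the document in memory and sorts the keys, and it would have to be replaced by calls to the actual database under test. The idea is to hold the total payload roughly constant while varying the number of items and the input order.

```python
import random
import time
import xml.etree.ElementTree as ET


def make_doc(n_items, item_chars, shuffled, seed=0):
    """Build one document with n_items <item> elements of item_chars text each."""
    keys = list(range(n_items))
    if shuffled:
        random.Random(seed).shuffle(keys)
    body = "".join(
        f'<item key="{k:08d}">{"x" * item_chars}</item>' for k in keys
    )
    return f"<items>{body}</items>"


def workload(xml_text):
    """Placeholder for the system under test: parse, then sort items by key."""
    root = ET.fromstring(xml_text)
    return sorted(item.get("key") for item in root)


def time_workload(xml_text, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload(xml_text)
        best = min(best, time.perf_counter() - start)
    return best


def sweep(total_chars=200_000):
    """Time the same payload size in different shapes and input orders."""
    results = {}
    for n_items in (100, 10_000):
        for shuffled in (False, True):
            doc = make_doc(n_items, total_chars // n_items, shuffled)
            results[(n_items, shuffled)] = time_workload(doc)
    return results
```

`sweep()` returns a dictionary keyed by `(n_items, shuffled)`, so the four timings can be compared side by side.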

Another source of "real world" data in a large corporation is to connect
to something like an LDAP server and use that. I've also done much of
my own testing with lists of endangered species from the WCMC. You may
also find the W3C site useful, particularly the RDF test cases (not
large, but they do demonstrate many obscure conditions).

--
Smert' spamionam
Jul 20 '05 #2
>>>>> "Beda" == Beda Christoph Hammerschmidt <bc******@web.de> writes:

Beda> I wat to perform some performance measurements on an XML database. FOr
Beda> this reason i need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where i can get this data ??

You might get some RSS feed. RSS is a format used by several news servers to
distribute news. So by definition, there is not much structure, but you can
make reasonable queries, like what happened (some terrorist act), what was the
score (some soccer game) etc.
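
To give an idea of such a query, here is a small Python sketch using the standard library's ElementTree. The feed text in `SAMPLE_RSS` is invented for the example, and `find_items` is a hypothetical helper; a real feed would be fetched from a news site and might need more careful handling.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; note that the last item has no <description>.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example News</title>
<item><title>City wins cup final 2-1</title>
<description>Late goal decides the soccer match.</description></item>
<item><title>Storm closes harbour</title>
<description>Ferries cancelled until Monday.</description></item>
<item><title>Transfer rumours</title></item>
</channel></rss>"""


def find_items(rss_text, keyword):
    """Return titles of items whose title or description mentions keyword."""
    root = ET.fromstring(rss_text)
    keyword = keyword.lower()
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        if keyword in f"{title} {description}".lower():
            hits.append(title)
    return hits
```

With the sample above, `find_items(SAMPLE_RSS, "soccer")` picks out the cup final item only.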

--
Arto V. Viitanen av@cs.uta.fi
University of Tampere, Department of Computer Sciences
Tampere, Finland http://www.cs.uta.fi/~av/
Jul 20 '05 #3
Beda Christoph Hammerschmidt wrote in message news:<c2*************************@posting.google.com>...
> I wat to perform some performance measurements on an XML database. FOr
> this reason i need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.
> Any idea, where i can get this data ??


I'm not sure what you mean by "huge", but there is a good amount of
data that might be interesting to query at:
http://www.ibiblio.org/xml/examples/shakespeare/
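
A query over play markup might look like the Python sketch below. It assumes the element names PLAY, ACT, SCENE, SPEECH, SPEAKER and LINE; check the downloaded files before relying on that. The tiny `SAMPLE_PLAY` document is invented so the example runs without a download.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Invented miniature play using the assumed element names.
SAMPLE_PLAY = """<PLAY><TITLE>Sample</TITLE>
<ACT><TITLE>ACT I</TITLE><SCENE><TITLE>SCENE I</TITLE>
<SPEECH><SPEAKER>FIRST</SPEAKER><LINE>one</LINE><LINE>two</LINE></SPEECH>
<SPEECH><SPEAKER>SECOND</SPEAKER><LINE>three</LINE></SPEECH>
<SPEECH><SPEAKER>FIRST</SPEAKER><LINE>four</LINE></SPEECH>
</SCENE></ACT></PLAY>"""


def lines_per_speaker(play_xml):
    """Count how many LINE elements each SPEAKER is credited with."""
    counts = Counter()
    for speech in ET.fromstring(play_xml).iter("SPEECH"):
        n_lines = len(speech.findall("LINE"))
        # A speech may name more than one speaker; credit each of them.
        for speaker in speech.findall("SPEAKER"):
            counts[speaker.text] += n_lines
    return counts
```

For a whole play, the same function could be fed the file contents, and `counts.most_common(5)` would list the five biggest parts.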

Toivo Lainevool
http://www.XMLPatterns.com - Develop effective DTDs and XML Schema
documents for your XML using structural design patterns.
Jul 20 '05 #4
Why don't you generate them ?
Use a free-db like MySQL...
bc******@web.de (Beda Christoph Hammerschmidt) wrote in message news:<c2*************************@posting.google.com>...
> I wat to perform some performance measurements on an XML database. FOr
> this reason i need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.
> Any idea, where i can get this data ??
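
The suggestion above, generating the XML from a free database, could be sketched roughly as follows. Note the swap: the example uses Python's built-in `sqlite3` module instead of MySQL so that it runs without a server, and `export_table` is a hypothetical helper. It also assumes that the table and column names are trusted and happen to be valid XML element names.

```python
import sqlite3
from xml.sax.saxutils import escape


def export_table(conn, table, out):
    """Write every row of `table` as a <row> element, one child per column."""
    # The table name is interpolated directly, so it must come from a
    # trusted source (assumption of this sketch).
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    out.write(f"<{table}>\n")
    for row in cursor:
        out.write("<row>")
        for name, value in zip(columns, row):
            if value is not None:  # NULL columns are left out of the row
                out.write(f"<{name}>{escape(str(value))}</{name}>")
        out.write("</row>\n")
    out.write(f"</{table}>\n")
```

Leaving out NULL columns means rows differ in shape, which moves the result a little towards the "not too structured" data the original question asks for.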

Jul 20 '05 #5
Arto Viitanen wrote:
>>"Beda" == Beda Christoph Hammerschmidt <bc******@web.de> writes:

Beda> I wat to perform some performance measurements on an XML database. FOr
Beda> this reason i need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where i can get this data ??

You might get some RSS feed.


But RSS - by definition - is not "huge XML data".
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 20 '05 #6
>>>>> "Johannes" == Johannes Koch <ko**@w3development.de> writes:

Beda> this reason i need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where i can get this data ??
You might get some RSS feed.


Johannes> But RSS - by definition - is not "huge XML data".

But I got two out of three: it is not too structured and there can be
reasonable queries !
--
Arto V. Viitanen av@cs.uta.fi
University of Tampere, Department of Computer Sciences
Tampere, Finland http://www.cs.uta.fi/~av/
Jul 20 '05 #7
Arto Viitanen wrote:
>>"Johannes" == Johannes Koch <ko**@w3development.de> writes:

Beda> this reason i need some huge XML sample data. The data should be not
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea, where i can get this data ??
>> You might get some RSS feed.


Johannes> But RSS - by definition - is not "huge XML data".

But I got two out of third: it is not too structured and there can be
reasonable queries !


That's right :-)
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 20 '05 #8
On 1 Apr 2004, Fabien R wrote:
> Why don't you generate them ?
Good idea. There are 5 major XML DB Benchmark efforts. Some include data
generators. See:

http://www.rpbourret.com/xml/XMLDBLinks.htm#Benchmarks

Ron Bourret has a link to a benchmark page that I used to maintain, but I
no longer have time to maintain it.
> Use a free-db like MySQL...
> bc******@web.de (Beda Christoph Hammerschmidt) wrote in message news:<c2*************************@posting.google.com>...
>> I wat to perform some performance measurements on an XML database. FOr
>> this reason i need some huge XML sample data. The data should be not
>> too structured and a lot of reasonable queries should make sense.
>> Any idea, where i can get this data ??



Some benchmarks and performance issues are also covered in the book I
helped edit:

A.B. Chaudhri, A. Rashid and R. Zicari (eds.) (2003) XML data management:
native XML and XML-enabled database systems (Reading, Massachusetts:
Addison-Wesley)

http://www.awprofessional.com/titles/0201844524/

HTH

akmal

Jul 20 '05 #9
bc******@web.de (Beda Christoph Hammerschmidt) writes:
> I wat to perform some performance measurements on an XML database. FOr
> this reason i need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.


This is somewhat structured, but large:

http://rdf.dmoz.org/rdf/content.rdf.u8.gz
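
A compressed dump of that size is usually better streamed than loaded whole. The Python sketch below shows one way to do that with `gzip` and `ElementTree.iterparse`; `count_tags` is a hypothetical helper, and the sketch assumes the file is well-formed XML from start to finish, which should be checked for any particular dump.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(path):
    """Count element tags in a gzip-compressed XML file without loading it whole."""
    counts = Counter()
    with gzip.open(path, "rb") as stream:
        # "end" events fire once an element and all its children are parsed.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            counts[elem.tag] += 1
            # Drop text and children that have been counted already. The
            # parent still keeps an empty shell per child, so memory use
            # grows slowly rather than staying flat.
            elem.clear()
    return counts
```

Namespaced elements show up in the result with the namespace URI in braces in front of the local name, which is how ElementTree reports them.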

Jul 20 '05 #10
On Fri, 02 Apr 2004 10:25:31 +0200, Johannes Koch
<ko**@w3development.de> wrote:
> But RSS - by definition - is not "huge XML data".


Much RSS isn't even XML !

Today's RSS feed bug was this
http://www.littlefluffy.com/index.php?a=rss

<description>
A more aptly named game you are not likely to find. [...] a great
game for drug users &lt;em>and&lt;/em> kids.
</description>
Jul 20 '05 #11
In article <f9********************************@4ax.com>,
Andy Dingley <di*****@codesmiths.com> wrote:

% Today's RSS feed bug was this
% http://www.littlefluffy.com/index.php?a=rss
%
% <description>
% A more aptly named game you are not likely to find. [...] a great
% game for drug users &lt;em>and&lt;/em> kids.
% </description>

So, what's wrong with it? That <em> should appear as mark-up, or that
you think > shouldn't be there?
--

Patrick TJ McPhee
East York Canada
pt**@interlog.com
Jul 20 '05 #12
On Sun, 4 Apr 2004 03:00:54 +0200 (MEST), pt**@interlog.com (Patrick
TJ McPhee) wrote:
>% game for drug users &lt;em>and&lt;/em> kids.
>
>So, what's wrong with it? That <em> should appear as mark-up, or that
>you think > shouldn't be there?


There's no valid encoding for HTML in RSS where the opening character
of a tag is escaped, but not the closing character.

Jul 20 '05 #13
In article <j0********************************@4ax.com>,
Andy Dingley <di*****@codesmiths.com> wrote:
% On Sun, 4 Apr 2004 03:00:54 +0200 (MEST), pt**@interlog.com (Patrick
% TJ McPhee) wrote:
%
% >% game for drug users &lt;em>and&lt;/em> kids.
%
% >So, what's wrong with it? That <em> should appear as mark-up, or that
% >you think > shouldn't be there?
%
% There's no valid encoding for HTML in RSS where the opening character
% of a tag is escaped, but not the closing character.

OK, perhaps it's not valid RSS, but it's valid XML.
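
The XML side of this can be checked with a short Python sketch. XML requires "<" and "&" to be escaped in character data, while a literal ">" is allowed except as part of "]]>". The fragment below is the sentence quoted in the thread, wrapped in a single element; `decoded_text` is a hypothetical helper name, and the sketch says nothing about what RSS readers expect.

```python
import xml.etree.ElementTree as ET

# "<" is escaped as &lt; while ">" is left as a literal character.
FRAGMENT = (
    "<description>a great game for drug users "
    "&lt;em>and&lt;/em> kids.</description>"
)


def decoded_text(xml_fragment):
    """Parse the fragment and return its character data after entity decoding."""
    return ET.fromstring(xml_fragment).text
```

The parser accepts the fragment, and `decoded_text(FRAGMENT)` returns the sentence with `<em>and</em>` in it as plain characters.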

--

Patrick TJ McPhee
East York Canada
pt**@interlog.com
Jul 20 '05 #14
http://www.pir.uniprot.org/database/download.shtml

up to a gig in size
"Beda Christoph Hammerschmidt" <bc******@web.de> wrote in message
news:c2*************************@posting.google.com...
> I wat to perform some performance measurements on an XML database. FOr
> this reason i need some huge XML sample data. The data should be not
> too structured and a lot of reasonable queries should make sense.
> Any idea, where i can get this data ??

Jul 20 '05 #15
> too structured and a lot of reasonable queries should make sense.
> Any idea, where i can get this data ??

I also suggest that you create the data on your own. I recommend ToXgene:
http://www.cs.toronto.edu/tox/toxgene

hth
Torsten
Jul 20 '05 #16
