How to search XML? Are there special libs?

I am now using XML to store my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to look for a popular and mature XML search
engine before I start programming one myself:

I need the following functions:
Search and get all the records that share the same tree structure.
Search for records containing a specific node or subtree structure.

Does anyone know of something suitable? Or any other powerful XML search
engine for Python?
Thank you so much for your help.

PS: I do not care about the learning curve, as long as it is not a
cliff to climb.

Mar 17 '06 #1
Sullivan WxPyQtKinter wrote:
> I am now using XML to store my lab records in quite a complex way, in
> which about 80 tags are used to identify different types of data. I
> think it is a good idea to look for a popular and mature XML search
> engine before I start programming one myself:
>
> I need the following functions:
> Search and get all the records that share the same tree structure.
> Search for records containing a specific node or subtree structure.
>
> Does anyone know of something suitable? Or any other powerful XML search
> engine for Python?
> Thank you so much for your help.


XPath is your friend. It is available at least in the 4Suite XML libraries.
I'm not aware of anything that takes a more database-like approach, but at
least you can issue queries against the XML.
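
A minimal sketch of this kind of query, for illustration only. It assumes the
standard-library xml.etree.ElementTree module, which supports only a limited
XPath subset, rather than 4Suite. The document layout, the tag names and the
records_containing helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical lab-record document; the tag names are made up for illustration.
XML = """
<lab>
  <record id="r1">
    <sample><mass unit="mg">12.5</mass></sample>
  </record>
  <record id="r2">
    <sample><volume unit="ml">3.0</volume></sample>
  </record>
  <record id="r3">
    <note>calibration run</note>
  </record>
</lab>
"""

root = ET.fromstring(XML)


def records_containing(root, path):
    """Return the ids of top-level <record> elements that contain a match for `path`."""
    return [
        record.get("id")
        for record in root.findall("record")
        if record.find(path) is not None
    ]


# Records with a <mass> element anywhere below them.
print(records_containing(root, ".//mass"))                    # ['r1']
# Records containing a specific sub-path with an attribute test.
print(records_containing(root, "sample/volume[@unit='ml']"))  # ['r2']
```

A full XPath 1.0 engine could express the same thing in a single expression;
the loop here only works around the reduced predicate support in ElementTree.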

However, you might reconsider your choice of persistence model. Queries are
what SQL is for, and you might well be better off using e.g. SQLite with
an XML import/export, if that is what you need.
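
One way the SQLite route could look, as a rough sketch under stated
assumptions: the standard-library sqlite3 and xml.etree.ElementTree modules,
a made-up single-table schema (node) that stores one row per XML element, and
hypothetical tag names. A real schema would depend on the actual records.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical records; the tag names are made up for illustration.
XML = """<lab>
  <record id="r1"><sample><mass unit="mg">12.5</mass></sample></record>
  <record id="r2"><sample><mass unit="mg">7.0</mass></sample><note>repeat</note></record>
</lab>"""


def iter_nodes(elem, prefix=""):
    """Yield (path, text) for an element and all of its descendants."""
    path = f"{prefix}/{elem.tag}"
    yield path, (elem.text or "").strip()
    for child in elem:
        yield from iter_nodes(child, path)


def import_records(conn, xml_text):
    """Flatten every element below each <record> into one table row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node (record_id TEXT, path TEXT, value TEXT)"
    )
    root = ET.fromstring(xml_text)
    for record in root.findall("record"):
        rows = [
            (record.get("id"), path, value)
            for child in record
            for path, value in iter_nodes(child)
        ]
        conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
import_records(conn, XML)

# Which records contain a <note> node?
with_note = conn.execute(
    "SELECT DISTINCT record_id FROM node WHERE path = ? ORDER BY record_id",
    ("/note",),
).fetchall()
print(with_note)  # [('r2',)]

# Which records have a sample mass above 10?
heavy = conn.execute(
    "SELECT record_id FROM node "
    "WHERE path = '/sample/mass' AND CAST(value AS REAL) > 10"
).fetchall()
print(heavy)  # [('r1',)]
```

Storing the path as a string keeps structural questions ("does this record
have node X?") as plain WHERE clauses, at the cost of losing attributes and
sibling order in this simplified schema.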

Diez
Mar 17 '06 #2
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
even learn a special query language.
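
To illustrate the plain-Python approach for the "same tree structure"
requirement, here is a small sketch. It uses the standard-library
xml.etree.ElementTree module as a stand-in, not Amara's API, and the shape
and group_by_shape helpers as well as the tag names are hypothetical. It
treats two records as structurally equal when their tags nest the same way,
ignoring text, attributes and the order of siblings; that definition is an
assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical records; the tag names are made up for illustration.
XML = """<lab>
  <record id="r1"><sample><mass>12.5</mass><volume>3.0</volume></sample></record>
  <record id="r2"><sample><volume>1.0</volume><mass>8.1</mass></sample></record>
  <record id="r3"><note>calibration run</note></record>
</lab>"""


def shape(elem):
    """Canonical signature of an element's tag structure.

    Text and attributes are ignored; children are sorted so that
    sibling order does not matter.
    """
    return (elem.tag, tuple(sorted(shape(child) for child in elem)))


def group_by_shape(root, record_tag="record"):
    """Group the ids of all <record> elements by their structural signature."""
    groups = defaultdict(list)
    for record in root.iter(record_tag):
        groups[shape(record)].append(record.get("id"))
    return list(groups.values())


print(group_by_shape(ET.fromstring(XML)))  # [['r1', 'r2'], ['r3']]
```

Dropping the sorted() call would make sibling order significant, which may
be closer to what is wanted if the order of measurements matters.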

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (but I am too
sleepy to confirm :-) ). You can also use eXist (Java, but you can use
XML-RPC or SOAP to interface with it from Python). Not optimal, as the
parent post said, but if it is XML that you have to live with ...

Mar 17 '06 #3
Ravi Teja wrote:
> Yes! XPath is a good bet.
> You can also try some Pythonic XML libraries like Amara. You need not
> even learn a special query language.
>
> There are good database approaches to XML too, especially if you are
> going to query a document collection as a whole rather than file by
> file. You can try XQuery. I think 4Suite can do this (but I am too
> sleepy to confirm :-) ). You can also use eXist (Java, but you can use
> XML-RPC or SOAP to interface with it from Python). Not optimal, as the
> parent post said, but if it is XML that you have to live with ...


4Suite does not support XQuery. It does support full XPath plus EXSLT
and enough other extensions to come close to the power of XQuery.
Amara [1] makes it really easy to get XQuery-like power from right
within Python, as I've blogged before (e.g. [2][3]).

I don't know whether full-text indexing of XML is something the OP
needs as well. If so, see [4].

[1] http://uche.ogbuji.net/tech/4Suite/amara/
[2] http://copia.ogbuji.net/blog/2005-06-12/Amara_equi
[3] http://copia.ogbuji.net/blog/2005/Sep/20
[4] http://www.xml.com/pub/a/2004/12/08/py-xml.html

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Mar 17 '06 #4
