473,584 Members | 2,897 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to search XML? Are there special libs?

I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.

PS: I do not care about the learning curve, as long as it is not a
cliff to climb.

Mar 17 '06 #1
3 1064
Sullivan WxPyQtKinter wrote:
I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.


XPath is your friend. Available at least in the 4suite-xml-libraries. I'm
not aware of anything that resembles a more database-like approach, but at
least you can issue queries against the xml.

However you might reconsider your choice of a persistence model. Queries is
what SQL is for, and you very well might be better off using e.g. sqlite
and a XML-im/export if that is what you need.

Diez
Mar 17 '06 #2
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :-) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...

Mar 17 '06 #3
Ravi Teja wrote:
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :-) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...


4Suite does not support XQuery. It does support full XPath plus EXSLT
and enough other extensions to come close to the power of XQuery.
Amara [1] makes it really easy to get XQuery-like power from right
within Python, as I've blogged before (e.g. [2][3]).

I don't know whether full-text indexing of XML is something the OP
needs as well. If so, see [3].

[1] http://uche.ogbuji.net/tech/4Suite/amara/
[2] http://copia.ogbuji.net/blog/2005-06-12/Amara_equi
[3] http://copia.ogbuji.net/blog/2005/Sep/20
[4] http://www.xml.com/pub/a/2004/12/08/py-xml.html

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Mar 17 '06 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
1874
by: Paul Rubin | last post by:
I have a string s, possibly megabytes in size, and two regexps, p and q. I want to find the first occurence of q that occurs after the first occurence of p. Is there a reasonable way to do it? g1 = re.search(p, s) g2 = re.search(q, s) q_offset = g1.end() + g2.start()
3
9706
by: Stephen Ferg | last post by:
I need a little help here. I'm developing some introductory material on Python for non-programmers. The first draft includes this statement. Is this correct? ----------------------------------------------------------------- When loading modules, Python looks for modules in the following places in the following order:
23
6975
by: SeaPlusPlus | last post by:
I want to convert large files of prose to xhtml and so I need a way to remove unwanted line wraps. So, I'm looking for a freebee editor that has the capability of searching for a single "carriage return/line feed" or "line feed/carriage return" and removing them. I quess what I need is an editor that allows non-printable characters in it's...
32
14799
by: tshad | last post by:
Can you do a search for more that one string in another string? Something like: someString.IndexOf("something1","something2","something3",0) or would you have to do something like: if ((someString.IndexOf("something1",0) >= 0) || ((someString.IndexOf("something2",0) >= 0) ||
3
1341
by: Peter Schmitz | last post by:
Hi, one of the main task of my currently developed application is to search in small buffers for given patterns (average buffer size 1000 bytes). Now, for some reasons I can't use the provided C std libs, so I need to implement my own - but unfortunately I cannot find any hints about how to program such an algorithm, so that it is really...
0
1516
by: Luis Corrales | last post by:
Hi all, I have a problem when searching for text with special characters in e-mails in an IMAP server. I'm using imaplib in python 2.4.3 and I can't get this code working: # first connect and login conn = IMAP4(my_server) conn.login(my_user, my_pass)
1
1519
by: Ragavendran | last post by:
Hi, I am using this method for search: Query =org.apache.lucene.queryParser.QueryParser.parse(String arg0) throws ParseException Hits = org.apache.lucene.search.Searcher.search(Query query, Sort sort) throws IOException Problem : I cant able to search the word. If the word contain special character like , %BF It just taken that special...
4
1560
by: neoform | last post by:
I was wondering what would be the best search application for indexing/ searching content? Fulltext in MySQL is too slow once you get a db over 1GB. I played around with Zend's version of Lucene, but it contains a few bugs that need to be fixed before I can use it.. Does anyone know of other search libs/apps out there that would work well...
3
6224
by: thesti | last post by:
hi, how to search a string in a table that contain a special character ? for example i want to search fo a string "D'Crepes" in a field of a table. i've tried using the following sql from vb .net select name from menu where name like '%D'Crepes%' but since ' is a special character, i got an error. is there anyway to do this, like...
0
7897
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
0
7829
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
8190
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
0
8331
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
0
6590
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
0
3824
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3850
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2336
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1441
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.