473,666 Members | 2,039 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to search XML? Are there special libs?

I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.

PS: I do not care about the learning curve, as long as it is not a
cliff to climb.

Mar 17 '06 #1
3 1065
Sullivan WxPyQtKinter wrote:
I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.


XPath is your friend. Available at least in the 4suite-xml-libraries. I'm
not aware of anything that resembles a more database-like approach, but at
least you can issue queries against the xml.

However you might reconsider your choice of a persistence model. Queries is
what SQL is for, and you very well might be better off using e.g. sqlite
and a XML-im/export if that is what you need.

Diez
Mar 17 '06 #2
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :-) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...

Mar 17 '06 #3
Ravi Teja wrote:
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :-) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...


4Suite does not support XQuery. It does support full XPath plus EXSLT
and enough other extensions to come close to the power of XQuery.
Amara [1] makes it really easy to get XQuery-like power from right
within Python, as I've blogged before (e.g. [2][3]).

I don't know whether full-text indexing of XML is something the OP
needs as well. If so, see [3].

[1] http://uche.ogbuji.net/tech/4Suite/amara/
[2] http://copia.ogbuji.net/blog/2005-06-12/Amara_equi
[3] http://copia.ogbuji.net/blog/2005/Sep/20
[4] http://www.xml.com/pub/a/2004/12/08/py-xml.html

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Mar 17 '06 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
1876
by: Paul Rubin | last post by:
I have a string s, possibly megabytes in size, and two regexps, p and q. I want to find the first occurence of q that occurs after the first occurence of p. Is there a reasonable way to do it? g1 = re.search(p, s) g2 = re.search(q, s) q_offset = g1.end() + g2.start()
3
9719
by: Stephen Ferg | last post by:
I need a little help here. I'm developing some introductory material on Python for non-programmers. The first draft includes this statement. Is this correct? ----------------------------------------------------------------- When loading modules, Python looks for modules in the following places in the following order:
23
6985
by: SeaPlusPlus | last post by:
I want to convert large files of prose to xhtml and so I need a way to remove unwanted line wraps. So, I'm looking for a freebee editor that has the capability of searching for a single "carriage return/line feed" or "line feed/carriage return" and removing them. I quess what I need is an editor that allows non-printable characters in it's search strings. Does anyone know of on that will allow this? Thank you...
32
14837
by: tshad | last post by:
Can you do a search for more that one string in another string? Something like: someString.IndexOf("something1","something2","something3",0) or would you have to do something like: if ((someString.IndexOf("something1",0) >= 0) || ((someString.IndexOf("something2",0) >= 0) ||
3
1346
by: Peter Schmitz | last post by:
Hi, one of the main task of my currently developed application is to search in small buffers for given patterns (average buffer size 1000 bytes). Now, for some reasons I can't use the provided C std libs, so I need to implement my own - but unfortunately I cannot find any hints about how to program such an algorithm, so that it is really fast. Is there any (free) algorithm to start with? Any free implementations? Thanks a lot for any...
0
1516
by: Luis Corrales | last post by:
Hi all, I have a problem when searching for text with special characters in e-mails in an IMAP server. I'm using imaplib in python 2.4.3 and I can't get this code working: # first connect and login conn = IMAP4(my_server) conn.login(my_user, my_pass)
1
1521
by: Ragavendran | last post by:
Hi, I am using this method for search: Query =org.apache.lucene.queryParser.QueryParser.parse(String arg0) throws ParseException Hits = org.apache.lucene.search.Searcher.search(Query query, Sort sort) throws IOException Problem : I cant able to search the word. If the word contain special character like , %BF It just taken that special character a empty space and search the remaining character
4
1560
by: neoform | last post by:
I was wondering what would be the best search application for indexing/ searching content? Fulltext in MySQL is too slow once you get a db over 1GB. I played around with Zend's version of Lucene, but it contains a few bugs that need to be fixed before I can use it.. Does anyone know of other search libs/apps out there that would work well with php?
3
6226
by: thesti | last post by:
hi, how to search a string in a table that contain a special character ? for example i want to search fo a string "D'Crepes" in a field of a table. i've tried using the following sql from vb .net select name from menu where name like '%D'Crepes%' but since ' is a special character, i got an error. is there anyway to do this, like using \ in mysql?
0
8448
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8783
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
8552
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
1
6198
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5666
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4198
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4369
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2773
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2011
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.