
How can I modify the XML structure internally so that the program works?

Dear Friends,

I have a Python application that takes an XML document as input. The XML document is supplied externally and I cannot change its structure, but there is a problem with the alignment of the XML. I am using xml.dom.minidom for parsing.

A simple position change would be enough, but I have no idea how to change the elements of the DOM, i.e. self.tree = MD.parse(fichero). Please advise a good way ...

Please refer to the problematic HTML and the normal HTML structure attached here ...
N.B. We have no option to edit the source HTML, because it may also come from a CD.



Thanks
Anes
Attached Files
File Type: txt working html.txt (1.9 KB, 755 views)
File Type: txt problem html.txt (2.2 KB, 801 views)
Jan 15 '16 #1
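For the "simple position change" asked about above, minidom nodes can be detached and re-inserted in memory without touching the source file. The attachments are not reproduced in this thread, so the markup and the helper name `move_before` below are hypothetical; this is only a sketch of the general technique, assuming the fix is to move one element in front of a sibling.

```python
from xml.dom import minidom as MD

# Hypothetical input (not the poster's file): the <h1> is misplaced after the <span>.
SOURCE = "<body><span>intro</span><h1>Title</h1></body>"

def move_before(node, reference):
    """Detach node from its current parent and re-insert it before reference."""
    node.parentNode.removeChild(node)
    reference.parentNode.insertBefore(node, reference)

tree = MD.parseString(SOURCE)          # MD.parse(fichero) would be used for a file
body = tree.documentElement
heading = body.getElementsByTagName("h1")[0]
span = body.getElementsByTagName("span")[0]

move_before(heading, span)             # only the in-memory tree changes
fixed = body.toxml()
print(fixed)
```

The rest of the program can then keep working on `tree` as before, since only the parsed copy is rearranged.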
dwblas
What is the problem, and what do you want to extract? It would possibly be easier to process this as a plain text file and split/group by the <h1>, <h2>, and <span> tags. I will post some code later tonight, time permitting.
Jan 15 '16 #2
dwblas
This code should be self-explanatory. The combined records are printed, but you could also search for a string within each record, or write the records to a file.
def process_group(group_in):
    # Print one combined record; skip the empty group seen before the first tag.
    if group_in:
        print(" ".join(group_in))

# A line starting with any of these tags begins a new group.
starters = ("<h1", "<h2", "<span", "</body")

with open("problem_or_working_html.txt", "r") as fp_in:
    this_group = []
    for rec in fp_in:
        rec = rec.strip()
        # str.startswith accepts a tuple of prefixes
        if rec.startswith(starters):
            process_group(this_group)
            this_group = []
        this_group.append(rec)

# process the last group
process_group(this_group)
Jan 15 '16 #3
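The suggestion above to search within each record, rather than print it, can be sketched by collecting the groups first. The generator name `iter_groups`, the sample markup, and the search string are all hypothetical and only illustrate the idea; the grouping rule is the same as in the post.

```python
import io

STARTERS = ("<h1", "<h2", "<span", "</body")

def iter_groups(lines, starters=STARTERS):
    """Yield each combined record as one string instead of printing it."""
    group = []
    for rec in lines:
        rec = rec.strip()
        # A starter tag closes the previous group (if any) and opens a new one.
        if rec.startswith(starters) and group:
            yield " ".join(group)
            group = []
        group.append(rec)
    if group:
        yield " ".join(group)

# Hypothetical sample standing in for the attached file.
sample = io.StringIO(
    u"<body>\n<h1>Title</h1>\n<span>first\nline</span>\n<h2>Sub</h2>\n</body>\n"
)
records = list(iter_groups(sample))

# Keep only the records containing a search string.
matches = [r for r in records if "first" in r]
print(matches)
```

Passing an open file object instead of `sample` would work the same way, since the function only iterates over lines.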
amskape
Dear dwblas,
Thanks for your fantastic answer. It works fine with small indentation changes.
#!/bin/python
def process_group(group_in):
    print(" ".join(group_in))

with open("problem_html.txt", "r") as fp_in:
    starters = ["<h1", "<h2", "<span", "</body"]
    this_group = []
    for rec in fp_in:
        rec = rec.strip()
        for start_lit in starters:
            if rec.startswith(start_lit):
                process_group(this_group)
            #this_group = []
        this_group.append(rec)

# process last group
process_group(this_group)  # function invoking...

But in my current situation I get the result as a DOM element, which a normal Python print shows as

[<DOM Element: body at 0xb199054c>]

So it is a NodeList element, and we cannot apply the strip() method to a NodeList. Please advise a solution in this case...

With lots of gratitude

Anes
Jan 16 '16 #4
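One possible way around the NodeList problem described above is to serialize the element back to a string with `toxml()` and then split it into lines, which gives plain strings that `strip()` works on. The sample markup below is hypothetical, and this sketch assumes the `body` element comes from `getElementsByTagName`.

```python
from xml.dom import minidom as MD

# Hypothetical input standing in for the real document.
SOURCE = "<html><body>\n<h1>Title</h1>\n<span>first\nline</span>\n</body></html>"

tree = MD.parseString(SOURCE)

# getElementsByTagName returns a NodeList, printed as [<DOM Element: body at 0x...>]
bodies = tree.getElementsByTagName("body")
body = bodies[0]                     # pick the single Element out of the NodeList

markup = body.toxml()                # the element serialized as a string
lines = [rec.strip() for rec in markup.splitlines()]
print(lines)
```

The resulting `lines` list can then be fed to the same grouping loop in place of the open file object.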
