How to parse multi-part content

Suppose that I have content that looks like what I've included at
the end of this message. Is there something in the standard
Python library that will help me parse it, break into the parts
separated by the boundary strings, extract headers from each
sub-part, etc?

Do I need to add something like the following to the beginning?

Content-Type: multipart/related;
    type="multipart/alternative";
    boundary="-----------------------------1646970154570313593966717980"

I've tried working with the email, mimetools, and multifile
modules in the standard library. But my understanding of these
things is dim, and I have not had success.

Is there a beginner's guide somewhere that I should read?

In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.

Here is the content that I need to parse:
-----------------------------1646970154570313593966717980
Content-Disposition: form-data; name="xschemaContent"
-----------------------------1646970154570313593966717980
Content-Disposition: form-data; name="xschemaFile"; filename="po.xsd"
Content-Type: application/octet-stream

<xs:schema targetNamespace="http://openuri.org/easypo"
xmlns:po="http://openuri.org/easypo"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">

<xs:element name="purchase-order">
<xs:complexType>
<xs:sequence>
<xs:element name="customer" type="po:customer"/>
<xs:element name="date" type="xs:dateTime"/>
<xs:element name="line-item" type="po:line-item"
minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="shipper" type="po:shipper"
minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="customer">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="line-item">
<xs:sequence>
<xs:element name="description" type="xs:string"/>
<xs:element name="per-unit-ounces" type="xs:decimal"/>
<xs:element name="price" type="xs:double"/>
<xs:element name="quantity" type="xs:integer"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="shipper">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="per-ounce-rate" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:schema>

-----------------------------1646970154570313593966717980
Content-Disposition: form-data; name="which"

superclass
-----------------------------1646970154570313593966717980
Content-Disposition: form-data; name="Submit"

Submit
-----------------------------1646970154570313593966717980--

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
Jul 18 '05 #1
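A minimal sketch of the "add a Content-Type header to the beginning" idea from the question, using the standard-library email package. It assumes the raw request body is available as bytes and that the boundary is known (normally it comes from the request's own Content-Type header). Per RFC 2046, each delimiter line in the body is the boundary value prefixed with two hyphens, so the boundary passed in is the delimiter line without its first two hyphens. The function name parse_form_data and the sample boundary "XYZ" are illustrative assumptions, not something from the thread. The cgi module discussed in the replies was deprecated in Python 3.11 and removed in 3.13, which is why this sketch relies on email instead.

```python
from email.parser import BytesParser


def parse_form_data(body: bytes, boundary: str) -> dict:
    """Return {field name: (filename or None, payload bytes)}.

    Hypothetical helper: prepends a Content-Type header so that the
    email parser treats the body as a multipart message.
    """
    header = (
        'Content-Type: multipart/form-data; boundary="%s"\r\n\r\n' % boundary
    ).encode("ascii")
    message = BytesParser().parsebytes(header + body)
    if not message.is_multipart():
        raise ValueError("body did not parse as multipart; check the boundary")

    fields = {}
    for part in message.get_payload():
        # The field name lives in the Content-Disposition header of each part.
        name = part.get_param("name", header="content-disposition")
        fields[name] = (part.get_filename(), part.get_payload(decode=True))
    return fields


if __name__ == "__main__":
    sample = (
        b"--XYZ\r\n"
        b'Content-Disposition: form-data; name="xschemaFile"; filename="po.xsd"\r\n'
        b"Content-Type: application/octet-stream\r\n"
        b"\r\n"
        b"<xs:schema/>\r\n"
        b"--XYZ\r\n"
        b'Content-Disposition: form-data; name="which"\r\n'
        b"\r\n"
        b"superclass\r\n"
        b"--XYZ--\r\n"
    )
    for field, (filename, payload) in parse_form_data(sample, "XYZ").items():
        print(field, filename, payload)
```

Note that this dictionary keeps only the last part for any repeated field name.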
Dave Kuhlman <dk******@rexx.com> writes:
[...]
In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.

[...]

*Surely* Zope has a standard way of doing this. Try a Zope list?
John
Jul 18 '05 #2
Dave Kuhlman <dk******@rexx.com> wrote:

Suppose that I have content that looks like what I've included at
the end of this message. Is there something in the standard
Python library that will help me parse it, break into the parts
separated by the boundary strings, extract headers from each
sub-part, etc?
...
In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.


Actually, you get this because your <form> header has
enctype="multipart/form-data". It happens that file upload only works with
that enctype, but you can use it without a file upload.

That's why cgi.py knows how to parse this. Look at cgi.parse_multipart.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #3
John J. Lee wrote:
Dave Kuhlman <dk******@rexx.com> writes:
[...]
In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.

[...]

*Surely* Zope has a standard way of doing this. Try a Zope list?


That's a good suggestion. Thanks. Zope people are Python people,
so they would give me the kind of help I'd need. I'll ask on the
Zope users list.

However, there is nothing Zope-specific about this. The content
was produced by my Web browser (actually two Web browsers that I
test with: Opera and Firefox).

Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
Jul 18 '05 #4
Tim Roberts wrote:
Dave Kuhlman <dk******@rexx.com> wrote:

Suppose that I have content that looks like what I've included at
the end of this message. Is there something in the standard
Python library that will help me parse it, break into the parts
separated by the boundary strings, extract headers from each
sub-part, etc?
...
In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.


Actually, you get this because your <form> header has
enctype="multipart/form-data". It happens that file upload only works
with that enctype, but you can use it without a file upload.

That's why cgi.py knows how to parse this. Look at cgi.parse_multipart.


Ah. A clue. I think you're telling me that it's the CGI
specification that I need to be reading, right? I'll read some of
that.

Per your suggestion, I tried cgi.parse_multipart() and also
class cgi.FieldStorage. They don't work. Or more correctly, I
don't know how to use them.

I guess I'll have to concede defeat, which in Python-speak means:
"It was easier to write it myself."

Basically, I wrote a little parser class ContentParser which
exposes a method get_content_by_name. This method returns the
body (what follows two carriage returns, up to the next
boundary line) for a given name, where name is the value of the
"name" field in the line:

Content-Disposition: form-data; name="xschemaFile"

I was in a bit of a hurry, so my solution (class ContentParser) is
not very elegant. But if anyone needs it, let me know.

And, thanks for the suggestions.

Dave
--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
Jul 18 '05 #5
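Dave's ContentParser was not posted, so the following is only a hypothetical sketch of what a hand-written parser with a get_content_by_name method might look like. It assumes the content is available as text and that the caller passes the full delimiter line exactly as it appears in the body. Everything apart from the class and method names is an assumption.

```python
import re


class ContentParser:
    """Hypothetical sketch of a hand-rolled multipart/form-data parser."""

    def __init__(self, content: str, delimiter: str):
        # delimiter is the separator line as it appears in the body,
        # i.e. including its leading hyphens.
        self.parts = {}
        normalized = content.replace("\r\n", "\n")
        for chunk in normalized.split(delimiter):
            chunk = chunk.strip("\n")
            # Skip the empty preamble and the "--" left by the closing line.
            if not chunk or chunk == "--":
                continue
            # Headers end at the first blank line; the rest is the body.
            headers, _, body = chunk.partition("\n\n")
            # \b keeps the pattern from matching inside filename="...".
            match = re.search(r'\bname="([^"]*)"', headers)
            if match:
                self.parts[match.group(1)] = body

    def get_content_by_name(self, name):
        """Return the body of the part with this field name, or None."""
        return self.parts.get(name)
```

This is deliberately simple: it normalizes line endings, drops blank lines at the edges of each body, and would misbehave if the delimiter text occurred inside a part, so it suits text fields better than binary uploads.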
Dave Kuhlman <dk******@rexx.com> wrote in message news:<2r*************@uni-berlin.de>...
Tim Roberts wrote:
Dave Kuhlman <dk******@rexx.com> wrote:

Suppose that I have content that looks like what I've included at
the end of this message. Is there something in the standard
Python library that will help me parse it, break into the parts
separated by the boundary strings, extract headers from each
sub-part, etc?
...
In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.


Actually, you get this because your <form> header has
enctype="multipart/form-data". It happens that file upload only works
with that enctype, but you can use it without a file upload.

That's why cgi.py knows how to parse this. Look at cgi.parse_multipart.


Ah. A clue. I think you're telling me that it's the CGI
specification that I need to be reading, right? I'll read some of
that.

Per your suggestion, I tried cgi.parse_multipart() and also
class cgi.FieldStorage. They don't work. Or more correctly, I
don't know how to use them.

I guess I'll have to concede defeat, which in Python-speak means:
"It was easier to write it myself."

Basically, I wrote a little parser class ContentParser which
exposes a method get_content_by_name. This method returns the
body (what follows two carriage returns, up to the next
boundary line) for a given name, where name is the value of the
"name" field in the line:

Content-Disposition: form-data; name="xschemaFile"

I was in a bit of a hurry, so my solution (class ContentParser) is
not very elegant. But if anyone needs it, let me know.

And, thanks for the suggestions.

Dave


If you are receiving this data in a Python script on a server from an
HTML form (i.e. a CGI script), then it's straightforward to do.

import cgi
theform = cgi.FieldStorage()

parses the contents of the form into a dictionary-like object.
The HTML form that posted the information will assign each file (or
element of the form) a name.
You can access the saved data using:

thedata = theform['name'].value

Look under the cgi documentation for other attributes that uploaded
files will have. (Potential pitfall with 'list values' as well, where
several values have the same name - again see the docs to see ways
round this).

Regards,

Fuzzyman
http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #6
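The "list values" pitfall mentioned above comes up whenever a form sends several parts with the same name. As a rough illustration that avoids the cgi module (removed in Python 3.13), the sketch below gathers every value for a name into a list. The function name collect_values and the sample field names are assumptions made for this example.

```python
from collections import defaultdict
from email.parser import BytesParser


def collect_values(body: bytes, boundary: str) -> dict:
    """Return {field name: [payload bytes, ...]}, keeping repeated names."""
    header = (
        'Content-Type: multipart/form-data; boundary="%s"\r\n\r\n' % boundary
    ).encode("ascii")
    message = BytesParser().parsebytes(header + body)
    if not message.is_multipart():
        raise ValueError("body did not parse as multipart; check the boundary")

    values = defaultdict(list)
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        # Append rather than assign, so earlier values are not overwritten.
        values[name].append(part.get_payload(decode=True))
    return dict(values)
```

With this shape, a field that appears once still yields a one-element list, so calling code does not need to special-case single values.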
Dave Kuhlman wrote:
John J. Lee wrote:
Dave Kuhlman <dk******@rexx.com> writes:
[...]
In case you are curious, this is content posted to my Zope server
when I include an element '<input type="file" .../>' in my form.

[...]

*Surely* Zope has a standard way of doing this. Try a Zope list?


That's a good suggestion. Thanks. Zope people are Python people,
so they would give me the kind of help I'd need. I'll ask on the
Zope users list.

However, there is nothing Zope-specific about this. The content
was produced by my Web browser (actually two Web browsers that I
test with: Opera and Firefox).


I was wrong. You were right. There is a Zope way to do this.
Thanks for pushing me to dig deeper. It's a much easier way, too.

If there are any Zopesters reading, here is how to do it:

def my_external_method(request, ...):
    # Retrieve a stream-like object.
    myStream = request['myFileData']
    # Read the data from the stream object.
    data = myStream.read()

My problem was that I was so sure that I had to retrieve and parse
the content in the body of the request.

Thanks for help.

Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
Jul 18 '05 #7
