Python parsing iTunes XML/COM

I'm trying to convert the URLs contained in iTunes' XML file into a
form comparable with the filenames returned by iTunes' COM interface.

I'm writing a podcast sorter in Python; I'm using iTunes under Windows
right now. iTunes' COM provides most of my data input and all of my
mp3/aac editing capabilities; the one thing I can't access through COM
is the Release Date, which is my primary sorting field. So I read
everything in through COM, then read all the release dates from the
iTunes XML file, then try to join the two together... But so far I
have zero success.

Is there _any_ way to match up tracks between iTunes COM and iTunes
XML? I've spent far too much effort on this. I'm not stuck on using
filenames, if that's a bad idea... But I haven't found anything else
that works, and filenames seem like an obvious solution.

-Wm
Jul 28 '08 #1
To ask another way: how do I convert from a file:// URL to a local
path in a standard way, so that filepaths from two different sources
will work the same way in a dictionary?

Right now I'm using the following source:

track_id = url2pathname(urlparse(track_id).path)

url2pathname is from urllib; urlparse is from the urlparse module.

The problems occur when the filenames have non-ascii characters in
them -- I suspect that the URLs are having some encoding placed on
them that Python's decoder doesn't know about.
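
For readers on Python 3, where urlparse and unquote live in urllib.parse, the conversion asked about here might be sketched as follows. The URL is a made-up example in the style of an iTunes Location entry, and file_url_to_windows_path is a hypothetical helper name, not a library function. The sketch assumes the percent-escapes encode UTF-8 bytes, which is what unquote decodes by default.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urlparse


def file_url_to_windows_path(url):
    """Turn a file:// URL into a Windows-style path string (hypothetical helper)."""
    parsed = urlparse(url)
    # unquote() turns %XX escapes into characters, decoding them as UTF-8.
    path = unquote(parsed.path)
    # The path part of a Windows file URL looks like "/C:/dir/file";
    # drop the leading slash in front of the drive letter.
    if len(path) >= 3 and path[0] == "/" and path[2] == ":":
        path = path[1:]
    # PureWindowsPath renders the path with backslashes on any platform.
    return str(PureWindowsPath(path))


# Assumed example URL; %C2%A0 is a UTF-8 encoded no-break space.
url = "file://localhost/C:/Music/Annual%20Shareholders%C2%A0L.mp3"
local_path = file_url_to_windows_path(url)
print(repr(local_path))
```

The result is a text string, so it can be compared directly with another text string once both sides use the same case and separators.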

Thank you all in advance, and thank you for Python.

-Wm
Jul 29 '08 #2
On Jul 30, 3:53 am, william tanksley <wtanksle...@gmail.com> wrote:
To ask another way: how do I convert from a file:// URL to a local
path in a standard way, so that filepaths from two different sources
will work the same way in a dictionary?

Right now I'm using the following source:

track_id = url2pathname(urlparse(track_id).path)

url2pathname is from urllib; urlparse is from the urlparse module.

The problems occur when the filenames have non-ascii characters in
them -- I suspect that the URLs are having some encoding placed on
them that Python's decoder doesn't know about.
WHAT problems? WHAT non-ASCII characters?? Consider e.g.

# track_id = url2pathname(urlparse(track_id).path)
print repr(track_id)
parse_result = urlparse(track_id).path
print repr(parse_result)
track_id_replacement = url2pathname(parse_result)
print repr(track_id_replacement)

and copy/paste the results into your next posting.
Jul 29 '08 #3
If you want to convert the file names which use standard URL encoding
(with %20 for space, etc) use:

from urllib import unquote
new_filename = unquote(filename)

I have found this does not convert encoded characters of the form
'&#CC;' so you may have to do that manually. I think these are just
ascii encodings in hexadecimal.
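
A note on the '&#...;' forms: they are XML numeric character references, written in decimal as '&#160;' or in hexadecimal as '&#xA0;', and an XML parser normally resolves them before you see the text. If you do end up with them in a raw string, a minimal Python 3 sketch of undoing both layers could look like this. The sample string is invented for illustration, and html.unescape is used here only as a convenient decoder for the numeric references.

```python
from html import unescape
from urllib.parse import unquote

# Invented sample: a percent-escaped space plus a numeric character
# reference for U+00A0 (no-break space), once in decimal and once in hex.
raw_decimal = "Annual%20Shareholders&#160;L.mp3"
raw_hex = "Annual%20Shareholders&#xA0;L.mp3"


def decode_name(raw):
    # Undo the XML layer first (character references), then the URL
    # layer (percent-escapes), mirroring the order they were applied in.
    return unquote(unescape(raw))


print(repr(decode_name(raw_decimal)))
print(repr(decode_name(raw_hex)))
```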
Jul 30 '08 #4
Thank you for the response. Here's some more info, including a little
that you didn't ask me for but which might be useful.

John Machin <sjmac...@lexicon.net> wrote:
william tanksley <wtanksle...@gmail.com> wrote:
To ask another way: how do I convert from a file:// URL to a local
path in a standard way, so that filepaths from two different sources
will work the same way in a dictionary?
The problems occur when the filenames have non-ascii characters in
them -- I suspect that the URLs are having some encoding placed on
them that Python's decoder doesn't know about.
# track_id = url2pathname(urlparse(track_id).path)
print repr(track_id)
parse_result = urlparse(track_id).path
print repr(parse_result)
track_id_replacement = url2pathname(parse_result)
print repr(track_id_replacement)
The "important" value here is track_id_replacement; it contains the
data that's throwing me. It appears that some UTF-8 characters are
being read as multiple bytes by ElementTree rather than being decoded
into Unicode. Could this be a bug in ElementTree's Unicode support? If
so, can I work around it?

Here's one example. The others are similar -- they have the same
things that look like problems to me.

"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

Note some problems here:

1. This isn't Unicode; it's missing the u"" (I printed using repr).
2. It's got the UTF-8 bytes there in the middle.

I tried doing track_id.encode("utf-8"), but it doesn't seem to make
any difference at all.

Of course, my ultimate goal is to compare the track_id to the track_id
I get from iTunes' COM interface, including hashing to the same value
for dict lookups.
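
One way to get both sources hashing to the same dictionary key is to push every path through a single normalizing function before using it. The sketch below is Python 3 and rests on assumptions not stated in this thread: that byte strings are UTF-8, that the paths are Windows paths, and that case and Unicode composition differences should be ignored. path_key is a hypothetical helper name.

```python
import ntpath
import unicodedata


def path_key(path):
    """Return a normalized text key for a Windows path (hypothetical helper)."""
    # Assumption: byte strings hold UTF-8, as in the example above.
    if isinstance(path, bytes):
        path = path.decode("utf-8")
    # Compose characters so "e" + combining accent equals the single "é".
    path = unicodedata.normalize("NFC", path)
    # normpath collapses redundant separators and converts "/" to "\";
    # normcase lowercases, since Windows paths are case-insensitive.
    return ntpath.normcase(ntpath.normpath(path))


# Two spellings of the same file, one as UTF-8 bytes and one as text.
from_xml = b"C:/Music/Caf\xc3\xa9.mp3"
from_com = "c:\\music\\Cafe\u0301.mp3"

tracks = {path_key(from_com): "track object from COM"}
print(path_key(from_xml) in tracks)
```

Whether NFC and lowercasing are the right choices depends on how iTunes actually stores the names, so treat the normalization steps as a starting point to verify against real data.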
and copy/paste the results into your next posting.
In addition to the above results, while trying to get more diagnostic
printouts I got the following warning from Python:

C:\projects\podcasts\podstrand\podcast.py:280: UnicodeWarning: Unicode
equal comparison failed to convert both arguments to Unicode -
interpreting them as being unequal
  return track.databaseID == trackLocation

The code that triggered this is as follows:

if trackLocation in self.podcasts:
    track = self.podcasts[trackLocation]
    if trackRelease:
        track.release_date = trackRelease
    elif track.is_podcast:
        print "No release date:", repr(track.name)
else:
    # For the sake of diagnostics, try to find the track.
    def track_has_location(track):
        return track.databaseID == trackLocation
    fillers = filter(track_has_location, self.fillers)
    if len(fillers):
        return
    disabled = filter(track_has_location, self.deferred)
    if len(disabled):
        return
    print "Location not known:", repr(trackLocation)

-Wm
Jul 30 '08 #5
On Wed, Jul 30, 2008 at 10:58 AM, william tanksley
<wt*********@gmail.com> wrote:
Here's one example. The others are similar -- they have the same
things that look like problems to me.

"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

Note some problems here:

1. This isn't Unicode; it's missing the u"" (I printed using repr).
2. It's got the UTF-8 bytes there in the middle.

I tried doing track_id.encode("utf-8"), but it doesn't seem to make
any difference at all.
I don't have anything to say about your iTunes problems, but encode()
is the wrong method to turn a byte string into a unicode string.
Instead, use decode(), like this:
>>> track_id = "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
>>> utrack_id = track_id.decode('utf-8')
>>> type(utrack_id)
<type 'unicode'>
>>> print utrack_id
Buffett Time - Annual Shareholders L.mp3
>>> print repr(utrack_id)
u'Buffett Time - Annual Shareholders\xa0L.mp3'
>>>
--
Jerry
Jul 30 '08 #6
william tanksley wrote:
Okay, so you decode to go from raw
bytes into a given encoding, and you encode to go from a given encoding
to raw bytes.
No, decoding goes from a byte sequence to a Unicode string and encoding goes
from a Unicode string to a byte sequence.

Unicode is not an encoding. A Unicode string is a character sequence, not a
byte sequence.
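
The direction of each operation can be checked with a small round trip. This sketch uses Python 3, where the two types are named bytes and str rather than the str and unicode of Python 2.5; the file name is the example from earlier in the thread.

```python
# A byte sequence: the UTF-8 encoding of a name containing U+00A0.
raw = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

# decode: bytes -> text. The codec name says how the bytes are encoded.
text = raw.decode("utf-8")

# encode: text -> bytes. The codec name says which encoding to produce.
back = text.encode("utf-8")

print(type(raw).__name__, type(text).__name__, type(back).__name__)
# The two bytes \xc2\xa0 became one character, so the text is one unit shorter.
print(len(raw), len(text))
```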

Stefan
Jul 30 '08 #7
On Wed, Jul 30, 2008 at 2:27 PM, william tanksley <wt*********@gmail.com> wrote:
Awesome... Thank you! I had my mental model of Python turned around
backwards. That's an odd feeling. Okay, so you decode to go from raw
bytes into a given encoding, and you encode to go from a given encoding
to raw bytes. Not what I thought it was, but that's cool, makes sense.
That's not quite right. Decoding takes a byte string that is already
in a particular encoding and transforms it to unicode. Unicode isn't
an encoding of its own. Encoding takes a unicode string (which
doesn't have any encoding associated with it), and gives you back a
sequence of bytes in a particular encoding.

This article isn't specific to Python, but it provides a good overview
of unicode and character encodings that may be useful:
http://www.joelonsoftware.com/articles/Unicode.html

--
Jerry
Jul 30 '08 #8
"Jerry Hill" <malaclyp...@gmail.com> wrote:
On Wed, Jul 30, 2008 at 2:27 PM, william tanksley <wtanksle...@gmail.com> wrote:
Awesome... Thank you! I had my mental model of Python turned around
backwards. That's an odd feeling. Okay, so you decode to go from raw
bytes into a given encoding, and you encode to go from a given encoding
to raw bytes. Not what I thought it was, but that's cool, makes sense.
That's not quite right. Decoding takes a byte string that is already
in a particular encoding and transforms it to unicode. Unicode isn't
an encoding of its own. Encoding takes a unicode string (which
doesn't have any encoding associated with it), and gives you back a
sequence of bytes in a particular encoding.
Okay, this is useful. Thank you for straightening out my mental model.
It makes sense to define strings as just naturally Unicode... and
anything else is in some ways not really a string, although it's
something that might have many of the same methods. I guess this
mental model is being implemented more thoroughly in Py3K... Anyhow,
it makes sense.

I'm still puzzled why I'm getting some non-Unicode out of an
ElementTree's text, though.
-Wm
Jul 31 '08 #9
william tanksley <wtanksle...@gmail.com> wrote:
I'm still puzzled why I'm getting some non-Unicode out of an
ElementTree's text, though.
Now I know.

Okay, my answer is that cElementTree (in Python 2.5) is simply
deranged when it comes to Unicode. It assumes everything's ASCII.

Reference: http://codespeak.net/lxml/compatibility.html

(Note that the lxml version also doesn't handle Unicode correctly; it
errors when XML declares its encoding.)

This is unpleasant, but at least now I know WHY it was driving me
insane.
-Wm
Jul 31 '08 #10
