Bytes | Software Development & Data Engineering Community

codec to parse raw UCS data?



Where can I find a list of the available codecs and their documentation?
What I want to do is build a unicode string out of raw Unicode data.
For example, I am parsing NTFS metadata, which stores filenames as
UCS-2, so I end up with a binary string that holds UCS-2 data.
Currently I use a hand-written module, a wrapper around the iconv library, to
translate this data.
However, I am aware that Python's codecs are supposed to do the same
thing in a much prettier way. Since I can write things like
st = u'\uxxxx', I could construct such a literal and exec it, but that is ugly...
Shouldn't there be something simple like <bin.string>.decode("UCS-2")
that returns a Python unicode string?
Or perhaps I've missed something in the documentation...
Jul 18 '05 #1
Oleg Leschov wrote:
Where can I find a list of the available codecs and their documentation?
What I want to do is build a unicode string out of raw Unicode data.
For example, I am parsing NTFS metadata, which stores filenames as
UCS-2, so I end up with a binary string that holds UCS-2 data.


the "utf-16-le" codec is probably what you want.
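
A minimal sketch of that suggestion, assuming Python 3 (where the binary string is a `bytes` object; in the Python 2 of this thread the same `.decode` call applies to a plain `str`). The sample bytes are made up for illustration and are not taken from a real NTFS record:

```python
# Hypothetical sample: the name "héllo.txt" laid out as 16-bit
# little-endian code units with no byte-order mark.
raw = b"h\x00\xe9\x00l\x00l\x00o\x00.\x00t\x00x\x00t\x00"

# Decode the raw bytes into a text string with the built-in codec.
name = raw.decode("utf-16-le")

# Encoding the result with the same codec reproduces the input bytes.
round_trip = name.encode("utf-16-le")
```

The codec name has to carry the byte order explicitly here: plain "utf-16" expects a byte-order mark at the start of the data and falls back to the platform's native order when there is none.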

(utf-16 is basically ucs-2 plus a mechanism, surrogate pairs, to encode characters
outside the 16-bit BMP set; IIRC, Windows 2000 and later use utf-16, not ucs-2).
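
To illustrate the difference, here is a small sketch, again assuming Python 3. A character inside the BMP takes one 16-bit unit, exactly as it would in UCS-2, while a character outside the BMP is encoded as two units (a surrogate pair). The sample characters are arbitrary choices:

```python
bmp_char = "A"              # U+0041, inside the BMP
astral_char = "\U0001F600"  # U+1F600, outside the BMP

bmp_bytes = bmp_char.encode("utf-16-le")        # one 16-bit unit
astral_bytes = astral_char.encode("utf-16-le")  # two 16-bit units

# Decoding recombines the surrogate pair into a single character.
decoded = astral_bytes.decode("utf-16-le")
```

Data that really is pure UCS-2 contains no surrogate pairs, so decoding it as "utf-16-le" gives the same result either way.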

</F>


Jul 18 '05 #2
