newbie: how do I test a byte string?

How do I test a byte string in Python? I want to manually convert (no libraries or functions) a UTF-8 string into UTF-16.

My basic solution is to read from the stream some number of UTF-8 bytes, convert them into codepoints, then convert those codepoints into UTF-16 bytes. I want to code this myself, but I don't understand how to test the actual byte sequence.
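For reference, the pipeline described above could be sketched roughly as follows. This is only an illustration under some assumptions: the function names `utf8_to_codepoints` and `codepoints_to_utf16be` are made up for this example, the output is big-endian UTF-16 without a byte order mark, and the decoder skips the stricter checks (overlong forms, surrogate code points) that a production decoder would need. `bytearray` is used so that each byte comes back as an integer.

```python
def utf8_to_codepoints(data):
    """Decode UTF-8 bytes into a list of integer code points (simplified)."""
    raw = bytearray(data)  # iterating or indexing a bytearray yields integers
    codepoints = []
    i = 0
    while i < len(raw):
        lead = raw[i]
        # The high bits of the lead byte give the length of the sequence.
        if lead < 0x80:                # 0xxxxxxx: one byte (ASCII)
            cp, extra = lead, 0
        elif (lead & 0xE0) == 0xC0:    # 110xxxxx: two bytes
            cp, extra = lead & 0x1F, 1
        elif (lead & 0xF0) == 0xE0:    # 1110xxxx: three bytes
            cp, extra = lead & 0x0F, 2
        elif (lead & 0xF8) == 0xF0:    # 11110xxx: four bytes
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError("invalid lead byte 0x%02x at offset %d" % (lead, i))
        if i + extra >= len(raw):
            raise ValueError("truncated sequence at offset %d" % i)
        # Each continuation byte (10xxxxxx) contributes six more bits.
        for j in range(1, extra + 1):
            cont = raw[i + j]
            if (cont & 0xC0) != 0x80:
                raise ValueError("invalid continuation byte at offset %d" % (i + j))
            cp = (cp << 6) | (cont & 0x3F)
        codepoints.append(cp)
        i += extra + 1
    return codepoints


def codepoints_to_utf16be(codepoints):
    """Encode code points as big-endian UTF-16 bytes, without a BOM."""
    out = bytearray()
    for cp in codepoints:
        if cp < 0x10000:
            units = [cp]               # one 16-bit code unit
        else:
            cp -= 0x10000              # split into a surrogate pair
            units = [0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF)]
        for unit in units:
            out.append(unit >> 8)      # high byte first (big-endian)
            out.append(unit & 0xFF)
    return bytes(out)
```

With this sketch, the two UTF-8 bytes 0xC5 0x84 decode to the single code point 0x144, which encodes to the two UTF-16 bytes 0x01 0x44.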

Let's say I use the following code to ensure I have a UTF-8 encoding (from Evan Jones' Scratch Pad: http://evanjones.ca/python-utf8.html)

    # Python 2: decode the byte string to unicode, then encode it back to UTF-8 bytes
    s = "hello normal string"
    u = unicode(s, "utf-8")
    backToBytes = u.encode("utf-8")

Now, I need to test the lead byte of the sequence for each character in "backToBytes", right? Is there a function that does this? Any help would be appreciated.
Jul 30 '08 #1
Slippy27
I guess I get to solve my own thread (thanks again to the Natural Language Toolkit's online tutorial). The function repr() appears to give me what I need:

    # Python 2: encode U+0144 as UTF-8 and show the raw bytes
    line = u'\u0144'
    line_utf = line.encode('utf8')

    print 'line = ', line_utf
    print 'line repr = ', repr(line_utf)

Output:
line = Å„
line repr = '\xc5\x84'

It's the '\xc5\x84' part that I needed.
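repr() only shows the bytes; to actually test them in code, one option is to turn each byte into an integer and mask its high bits. Here is a minimal sketch of that idea. The helper name `describe_utf8_bytes` and the role labels are invented for this example, and `bytearray` is just one convenient way to get integer byte values.

```python
def describe_utf8_bytes(data):
    """Return a (hex value, role) pair for each byte of a UTF-8 byte string."""
    result = []
    for b in bytearray(data):          # each b is an integer from 0 to 255
        if (b & 0x80) == 0x00:
            role = "single byte (ASCII)"
        elif (b & 0xC0) == 0x80:
            role = "continuation byte"
        elif (b & 0xE0) == 0xC0:
            role = "lead byte of a 2-byte sequence"
        elif (b & 0xF0) == 0xE0:
            role = "lead byte of a 3-byte sequence"
        elif (b & 0xF8) == 0xF0:
            role = "lead byte of a 4-byte sequence"
        else:
            role = "not valid in UTF-8"
        result.append(("0x%02x" % b, role))
    return result
```

For the '\xc5\x84' bytes above, this reports 0xc5 as the lead byte of a 2-byte sequence and 0x84 as a continuation byte.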
Jul 31 '08 #2
