
Stripping ASCII codes when parsing

I am working with a text format that advises stripping any ASCII control
characters (0 - 30), as well as the ASCII pipe character (124), from the
data as part of parsing. I think many of these characters are from a
different time. Since I have never seen most of them in text, I am not
sure how all of these control characters are represented (other than,
say, tab (\t), newline (\n), and carriage return (\r)). What should I do
to remove these characters if they are ever encountered? Many thanks.
Oct 17 '05 #1
David Pratt <fa*******@eastlink.ca> wrote:
> I am working with a text format that advises to strip any ascii control
> characters (0 - 30) as part of parsing data and also the ascii pipe
> character (124) from the data. I think many of these characters are
> from a different time. Since I have never seen most of these characters
> in text I am not sure how these first 30 control characters are all
> represented (other than say tab (\t), newline(\n), line return(\r) ) so
> what should I do to remove these characters if they are ever
> encountered. Many thanks.

Most of those characters are hard to see.

Represent arbitrary characters in a string in hex: "\x00\x01\x02" or
with chr(n).
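
As a small illustration of the point above (a sketch in Python 3, which postdates this thread; the names `sample` and `visible` are made up for the example), hex escapes and chr(n) produce the same characters, and repr() makes otherwise invisible control characters visible:

```python
# Three ways to write the same control characters (NUL, SOH, STX).
from_escapes = "\x00\x01\x02"
from_chr = chr(0) + chr(1) + chr(2)
from_join = "".join(chr(n) for n in range(3))

# A hypothetical sample string with a tab, a unit separator (31), and a pipe.
sample = "a\tb\x1fc|d"

# repr() shows control characters as escapes instead of printing them raw.
visible = repr(sample)
print(visible)
```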

If you just want to remove some characters, look into "".translate().

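# Python 2: identity translation table (all 256 characters, unchanged)
nullxlate = "".join([chr(n) for n in xrange(256)])
# Characters to delete: codes 0-30 plus the pipe (124)
delchars = nullxlate[:31] + chr(124)
outputstr = inputstr.translate(nullxlate, delchars)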
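
The two-argument `translate(table, deletechars)` call above is the Python 2 form. A rough Python 3 equivalent, assuming the input is a `str` (the names `DELCHARS`, `DELETE_TABLE`, and `strip_codes` are hypothetical), builds a deletion table with `str.maketrans`:

```python
# Characters to delete: codes 0-30 (as the format describes) plus the pipe (124).
DELCHARS = "".join(chr(n) for n in range(31)) + chr(124)

# With three arguments, str.maketrans maps every character in the third
# argument to None, which makes str.translate drop it.
DELETE_TABLE = str.maketrans("", "", DELCHARS)


def strip_codes(text: str) -> str:
    """Return text without the characters listed in DELCHARS."""
    return text.translate(DELETE_TABLE)


print(strip_codes("name|value\twith\x00junk\n"))
```

Note that this range also removes tab, newline, and carriage return, so if line structure matters it is probably safer to split the input into lines before stripping.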
________________________________________________________________________
TonyN.:' *firstname*nlsnews@georgea*lastname*.com
' <http://www.georgeanelson.com/>
Oct 17 '05 #2
Use ''.translate. Pass in the identity mapping as the first argument,
and as the second argument, pass a string containing all the characters
you wish to delete. This would probably be something like:

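# Python 2: 256-character identity table, so translate() changes nothing
IDENTITY_MAP = ''.join([chr(x) for x in range(256)])
# Characters to delete: control codes 0-31 plus the pipe (124)
BAD_MAP = ''.join([chr(x) for x in range(32) + [124]])

aNewString = aString.translate(IDENTITY_MAP, BAD_MAP)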

Note that ASCII 31 (US, unit separator) is also a control character, so
the control range is 0-31 rather than 0-30.
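
To make the range point concrete, here is a sketch in Python 3 for data that is still raw bytes (an assumption; `BAD_BYTES` and `strip_bytes` are illustrative names). `bytes.translate` accepts `None` as the table plus a set of bytes to delete, and `range(32)` covers codes 0 through 31:

```python
# Bytes to delete: all control codes 0-31 plus the pipe (124).
BAD_BYTES = bytes(range(32)) + b"|"


def strip_bytes(data: bytes) -> bytes:
    """Delete the bytes in BAD_BYTES; None means no other byte is remapped."""
    return data.translate(None, BAD_BYTES)


assert 31 in BAD_BYTES          # US (unit separator) is included
assert len(BAD_BYTES) == 33     # 32 control codes + the pipe
print(strip_bytes(b"a\x1fb|c\r\n"))
```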

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
The believer is happy; the doubter is wise.
-- (an Hungarian proverb)
Oct 17 '05 #3
