Stripping ASCII codes when parsing

David Pratt

I am working with a text format that advises stripping any ASCII control
characters (0 - 30), as well as the ASCII pipe character (124), from the
data during parsing. I think many of these characters are from a
different time. Since I have never seen most of them in text, I am not
sure how these first 30 control characters are represented (other than,
say, tab (\t), newline (\n), and carriage return (\r)), so what should I
do to remove them if they are ever encountered? Many thanks.

Oct 17 '05 #1

Tony Nelson

Most of those characters are hard to see.

Represent arbitrary characters in a string in hex: "\x00\x01\x02" or
with chr(n).
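
As a small illustration of the two notations (written for Python 3, which is an assumption, since this thread predates it), a hex escape and chr() produce the same one-character string, and repr() shows how otherwise invisible characters are spelled. The variable names are only for illustration.

```python
# Two equivalent ways to write control characters that have no visible glyph.
nul = "\x00"              # hex escape for code 0 (NUL)
bel = chr(7)              # chr(n) builds the character with code n (BEL)

# Both notations spell the same character, here code 31 (US).
same = "\x1f" == chr(31)

# Tab, newline and carriage return are control characters too; they simply
# have short escapes of their own.
short_escapes = [ord("\t"), ord("\n"), ord("\r")]

# repr() renders invisible characters as escapes so they can be inspected.
visible = repr("a\x01b|c")
print(visible)
```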

If you just want to remove some characters, look into "".translate().

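# Python 2: identity table mapping each of the 256 byte values to itself
nullxlate = "".join([chr(n) for n in xrange(256)])
# characters to delete: codes 0-30 plus the pipe (124)
delchars = nullxlate[:31] + chr(124)
outputstr = inputstr.translate(nullxlate, delchars)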

Oct 17 '05 #2

Erik Max Francis

Use ''.translate. Pass the identity mapping as the first argument and,
as the second, a string of all the characters you wish to delete. This
would probably be something like:

# Python 2: translate(table, deletechars) on a byte string
IDENTITY_MAP = ''.join([chr(x) for x in range(256)])
# codes 0-31 plus the pipe (124)
BAD_MAP = ''.join([chr(x) for x in range(32) + [124]])

aNewString = aString.translate(IDENTITY_MAP, BAD_MAP)

Note that ASCII 31 is also a control character (US).
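
In Python 3 the two-argument form of translate shown above no longer exists for str. A rough equivalent, offered as a sketch and not as the posters' own code, builds the table with str.maketrans, whose third argument lists the characters to delete. The names BAD_CHARS, DELETE_TABLE and strip_control are made up for illustration, and the range here covers 0 through 31 plus the pipe.

```python
# Characters to drop: ASCII control codes 0-31 and the pipe (124).
BAD_CHARS = "".join(chr(n) for n in range(32)) + chr(124)

# str.maketrans("", "", chars) maps every character in chars to None,
# which str.translate treats as "delete this character".
DELETE_TABLE = str.maketrans("", "", BAD_CHARS)


def strip_control(text):
    """Return text without ASCII control characters or the pipe."""
    return text.translate(DELETE_TABLE)


cleaned = strip_control("name|\tvalue\x00\x1f\r\n")
print(repr(cleaned))
```

This also removes tab, newline and carriage return, so if line structure matters, split the input into lines before stripping.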

Oct 17 '05 #3
