473,748 Members | 7,571 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Stripping ASCII codes when parsing

I am working with a text format that advises to strip any ascii control
characters (0 - 30) as part of parsing data and also the ascii pipe
character (124) from the data. I think many of these characters are
from a different time. Since I have never seen most of these characters
in text I am not sure how these first 30 control characters are all
represented (other than say tab (\t), newline(\n), line return(\r) ) so
what should I do to remove these characters if they are ever
encountered. Many thanks.
Oct 17 '05 #1
2 4351
In article <ma************ *************** **********@pyth on.org>,
David Pratt <fa*******@east link.ca> wrote:
I am working with a text format that advises to strip any ascii control
characters (0 - 30) as part of parsing data and also the ascii pipe
character (124) from the data. I think many of these characters are
from a different time. Since I have never seen most of these characters
in text I am not sure how these first 30 control characters are all
represented (other than say tab (\t), newline(\n), line return(\r) ) so
what should I do to remove these characters if they are ever
encountered. Many thanks.


Most of those characters are hard to see.

Represent arbitrary characters in a string in hex: "\x00\x01\x 02" or
with chr(n).

If you just want to remove some characters, look into "".translat e().

nullxlate = "".join([chr(n) for n in xrange(256)])
delchars = nullxlate[:31] + chr(124)
outputstr = inputstr.transl ate(nullxlate, delchars)
_______________ _______________ _______________ _______________ ____________
TonyN.:' *firstname*nlsn ews@georgea*las tname*.com
' <http://www.georgeanels on.com/>
Oct 17 '05 #2
David Pratt wrote:
I am working with a text format that advises to strip any ascii control
characters (0 - 30) as part of parsing data and also the ascii pipe
character (124) from the data. I think many of these characters are
from a different time. Since I have never seen most of these characters
in text I am not sure how these first 30 control characters are all
represented (other than say tab (\t), newline(\n), line return(\r) ) so
what should I do to remove these characters if they are ever
encountered. Many thanks.


Use ''.translate. Pass in the identity mapping for the first argument,
and for the second parameter, specify the list of all the characters you
wish to delete. This would probably be something like:

IDENTITY_MAP = ''.join([chr(x) for x in range(256)])
BAD_MAP = ''.join([chr(x) for x in range(32) + [124])

aNewString = aString.transla te(IDENTITY_MAP , BAD_MAP)

Note that ASCII 31 is also a control character (US).

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
The believer is happy; the doubter is wise.
-- (an Hungarian proverb)
Oct 17 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
6203
by: Martn Marconcini | last post by:
Hello there, I'm writting (or trying to) a Console Application in C#. I has to be console. I remember back in the old days of Cobol (Unisys), Clipper and even Basic, I used to use a program (its name i cannot recall now...) where I designed the "screen" using this "program" and then saved it into an ASCII file. (thus, using 'extended' ASCII's like Lines, Corners, etc. and making screens look nicer and more professional). Then reading a...
11
17416
by: Kai Bohli | last post by:
Hi all ! I need to translate a string to Ascii and return a string again. The code below dosen't work for Ascii (Superset) codes above 127. Any help are greatly appreciated. protected internal string StringToAscii(string S) { byte strArray = Encoding.UTF7.GetBytes(S); string NewString = Encoding.UTF7.GetString(strArray);
3
24526
by: JSM | last post by:
Hi, I am just trying to port an existing simple encryption routine to C#. this routine simply adds/substracts 10 ascii characters to each character in a text file (except quotes). The routine for decrypting the file works fine however when I encrypt the file, several characters are corrupted. when I looked into it they are always extended ascii characters (eg "x" which is ascii character 120 gets translated to ascii character 130 which...
4
6674
by: Lu | last post by:
Hi, i am currently working on ASP.Net v1.0 and is encountering the following problem. In javascript, I'm passing in: "somepage.aspx?QSParameter=<RowID>Chèques</RowID>" as part of the query string. However, in the code behind when I tried to get the query string value by calling Request.QueryString("QSParameter"), the value I got is: "<RowID>Chques</RowID>". The special character "è" has been stripped out. The web.config file is...
9
8262
by: simchajoy2000 | last post by:
Hi, I know what the ASCII Character Codes are for the 2nd and 3rd powers in VB.NET but I can't find the 6th power anywhere - does anyone know what it might be or if it even exists? Joy
6
2099
by: McHenry | last post by:
When parsing HTML is it possible to have all the ASCII codes converted to their real values first so that I do not need to search for them to exclude them. For example the following is retrieved as a price however it would be easier to extract using a regex if the code was first converted to a dollar sign: <h3> $249,000
7
3094
by: FFMG | last post by:
Hi, I have a form that allows users to comment, add entries and so on. But what a lot of them do is copy and paste directly from MS Word to my forms. almost all browsers will accept the post and give the impression that everything is saved properly. But, that is not the case when it comes time to displaying the message
6
5394
by: Andy Leese | last post by:
Beginner Question: ASCII Symbols I am using Borland C++ and programming under DOS. I wish to display the symbols of the early ASCII character set... For example: cout << char(7); Obviously this is assigned to the BELL signal and therefore sounds the beep
9
4145
by: =?Utf-8?B?RGFu?= | last post by:
I have the following code section that I thought would strip out all the non-ascii characters from a string after decoding it. Unfortunately the non-ascii characters are still in the string. What am I doing wrong? Dim plainText As String plainText = "t═e" Dim plainTextBytes() As Byte Dim enc As Encoding = Encoding.ASCII plainTextBytes = enc.GetBytes(plainText)
0
8991
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, well explore What is ONU, What Is Router, ONU & Routers main usage, and What is the difference between ONU and Router. Lets take a closer look ! Part I. Meaning of...
0
9544
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9372
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
9247
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8243
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development projectplanning, coding, testing, and deploymentwithout human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6796
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupr who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6074
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4874
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3313
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.