Stripping ASCII codes when parsing

I am working with a text format that advises stripping any ASCII control
characters (0 - 30), as well as the ASCII pipe character (124), from the
data as part of parsing. I think many of these characters are from a
different time. Since I have never seen most of them in text, I am not
sure how these first 30 control characters are all represented (other
than, say, tab (\t), newline (\n), and carriage return (\r)). What should
I do to remove these characters if they are ever encountered? Many thanks.
Oct 17 '05 #1
In article <ma*************************************@python.org>,
David Pratt <fa*******@eastlink.ca> wrote:
I am working with a text format that advises stripping any ASCII control
characters (0 - 30), as well as the ASCII pipe character (124), from the
data as part of parsing. I think many of these characters are from a
different time. Since I have never seen most of them in text, I am not
sure how these first 30 control characters are all represented (other
than, say, tab (\t), newline (\n), and carriage return (\r)). What should
I do to remove these characters if they are ever encountered? Many thanks.


Most of those characters are hard to see.

Represent arbitrary characters in a string in hex: "\x00\x01\x02" or
with chr(n).
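
As a small illustration of the two spellings above (a sketch only; the variable names are made up for this example), a hex escape and chr(n) produce the same one-character string, and repr() is a handy way to see characters that would otherwise be invisible:

```python
# Two equivalent spellings of the same control character (ASCII 1, SOH).
soh_escape = "\x01"
soh_chr = chr(1)

# A sample string containing NUL (0), tab (9) and US (31).
sample = "a\x00b\tc\x1fd"

# repr() shows non-printable characters as escapes instead of printing them raw.
visible = repr(sample)
print(soh_escape == soh_chr)
print(visible)
```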

If you just want to remove some characters, look into "".translate().

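# Python 2: identity table that maps every byte to itself
nullxlate = "".join([chr(n) for n in xrange(256)])
# characters to delete: ASCII 0-30 plus the pipe (124)
delchars = nullxlate[:31] + chr(124)
outputstr = inputstr.translate(nullxlate, delchars)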
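
The two-argument translate() call above is the Python 2 string API. For anyone reading this on Python 3, a rough equivalent might look like the sketch below. It assumes the same 0-30 plus pipe range from the question; DELETE_CHARS, DELETE_TABLE and strip_chars are names invented for this example, not part of the original answer.

```python
# Python 3 sketch: str.translate() takes a single table, built by str.maketrans().
# The third argument to maketrans() lists the characters to delete.
DELETE_CHARS = "".join(chr(n) for n in range(31)) + chr(124)  # ASCII 0-30 plus '|'
DELETE_TABLE = str.maketrans("", "", DELETE_CHARS)


def strip_chars(text):
    """Return text with ASCII 0-30 and the pipe character removed."""
    return text.translate(DELETE_TABLE)


print(strip_chars("na|me\x00\tva\x07lue\r\n"))
```

Note that tab, newline and carriage return fall inside the deleted range, so they are removed along with the rarer control codes.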
________________________________________________________________________
TonyN.:' *firstname*nlsnews@georgea*lastname*.com
' <http://www.georgeanelson.com/>
Oct 17 '05 #2
David Pratt wrote:
I am working with a text format that advises stripping any ASCII control
characters (0 - 30), as well as the ASCII pipe character (124), from the
data as part of parsing. I think many of these characters are from a
different time. Since I have never seen most of them in text, I am not
sure how these first 30 control characters are all represented (other
than, say, tab (\t), newline (\n), and carriage return (\r)). What should
I do to remove these characters if they are ever encountered? Many thanks.


Use ''.translate. Pass in the identity mapping as the first argument,
and as the second argument, pass a string containing all the characters
you wish to delete. This would probably be something like:

# Python 2: range() returns a list, so range(32) + [124] concatenates two lists
IDENTITY_MAP = ''.join([chr(x) for x in range(256)])
BAD_MAP = ''.join([chr(x) for x in range(32) + [124]])

aNewString = aString.translate(IDENTITY_MAP, BAD_MAP)

Note that ASCII 31 is also a control character (US, the unit separator),
so the full control range is 0-31, not 0-30.
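
If the intent is to drop the whole 0-31 range plus the pipe, a regular expression is another way to write it. This is only a sketch of an alternative to translate(), not what the text format's specification prescribes; CONTROL_OR_PIPE and strip_control_and_pipe are hypothetical names.

```python
import re

# Character class: 0x00-0x1F (all 32 ASCII control codes below space)
# plus the pipe character, which is literal inside a character class.
CONTROL_OR_PIPE = re.compile(r"[\x00-\x1f|]")


def strip_control_and_pipe(text):
    """Return text with ASCII 0-31 and '|' removed."""
    return CONTROL_OR_PIPE.sub("", text)


print(strip_control_and_pipe("a\x1fb|c\x00d\n"))
```

For a fixed set of single characters, translate() is usually the more direct tool; the regex form is mainly convenient when the set to strip is easier to describe as a range.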

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
The believer is happy; the doubter is wise.
-- (an Hungarian proverb)
Oct 17 '05 #3
