Detect non-standard characters in string

Hi

I have a project that takes an MS Word document and reformats the text into
text files that are built into my app.

The only issue is that sometimes the Word text contains characters that are
not printable when viewed in Notepad. I usually catch them by looking at the
text in my app. The usual culprits are:
an extra-long hyphen, such as an em dash (rendered here as --)
a dagger (rendered here as +)

When I debug the string, I usually see a square block in place of the
character.

Is there some way to trap the characters that will not be printable/viewable
in, say, Notepad?

Thanks
Oct 24 '07 #1
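You could probably use Char.IsSymbol() in this case, together with
Char.IsPunctuation(), since Unicode classifies the em dash and the dagger as
punctuation rather than symbols.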

--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#
Oct 24 '07 #2
I would just check each character's numeric value to see whether it falls
outside the ASCII range (0 to 127). Most likely, the text is being placed on
the clipboard as Unicode, and when you paste it into Notepad (which is using
ASCII), Notepad does its best by showing the square character to indicate
that it couldn't perform a conversion.
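
A minimal sketch of the range check described above, assuming C# 7 or later. The class and method names (AsciiCheck, FindNonAscii) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;

static class AsciiCheck
{
    // Hypothetical helper: returns the position and value of every char
    // whose numeric value is above 127, i.e. outside the ASCII range.
    public static List<(int Index, char Value)> FindNonAscii(string text)
    {
        var hits = new List<(int Index, char Value)>();
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 127)
            {
                hits.Add((i, text[i]));
            }
        }
        return hits;
    }

    static void Main()
    {
        // \u2014 is an em dash and \u2020 is a dagger.
        foreach (var hit in FindNonAscii("price\u2014list\u2020"))
        {
            Console.WriteLine($"index {hit.Index}: U+{(int)hit.Value:X4}");
        }
    }
}
```

This flags every non-ASCII character, including ones that a given font or code page can display, so it is a conservative filter rather than an exact test of what Notepad will show.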

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com
Oct 24 '07 #3
I don't know how the OP has configured Notepad or Word, but Notepad supports
Unicode.

The "square character" could be the glyph that is displayed for a Unicode
character not supported by the current font. A Char.IsSymbol or
Char.IsPunctuation check should still catch it, at least in the case of the
dagger and em dash (both are punctuation in Unicode terms). I don't know how
well most fonts support "printable" characters, but what counts as
"printable/viewable" does depend on the font.
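
To see which character-class test fires for a given character, a small sketch like the following can help. The class and method names (CategoryCheck, Describe) are hypothetical; char.GetUnicodeCategory, char.IsSymbol and char.IsPunctuation are the standard .NET calls. Unicode assigns the em dash (U+2014) to DashPunctuation and the dagger (U+2020) to OtherPunctuation, so for those two it is Char.IsPunctuation, not Char.IsSymbol, that returns true.

```csharp
using System;
using System.Globalization;

static class CategoryCheck
{
    // Hypothetical helper: summarises how .NET classifies one character.
    public static string Describe(char c)
    {
        UnicodeCategory category = char.GetUnicodeCategory(c);
        return $"U+{(int)c:X4} {category} " +
               $"IsSymbol={char.IsSymbol(c)} IsPunctuation={char.IsPunctuation(c)}";
    }

    static void Main()
    {
        // Em dash, dagger, and an ASCII plus sign for comparison.
        foreach (char c in "\u2014\u2020+")
        {
            Console.WriteLine(Describe(c));
        }
    }
}
```

Neither test says anything about whether the current font has a glyph for the character, so this only narrows down candidates.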

--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#
Oct 24 '07 #4
You need to use an Encoding object obtained via the static
Encoding.GetEncoding method. One of its overloads lets you specify the
EncoderFallback object to use (the default is an EncoderReplacementFallback,
which simply replaces unencodable characters with ?).

If you supply an EncoderExceptionFallback object instead, then any character
that the target encoding cannot represent causes an EncoderFallbackException
to be thrown when you use the Encoding to convert your content.

The EncoderFallbackException has properties that you can use to discover
which character caused the problem and where it is.
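
The approach above could be sketched roughly as follows. The class and method names (StrictEncodingCheck, FirstUnencodable) and the choice of "us-ascii" as the target encoding are assumptions for illustration; substitute whatever encoding your text files actually use. Encoding.GetEncoding(name, encoderFallback, decoderFallback) and the CharUnknown and Index properties are the standard .NET members.

```csharp
using System;
using System.Text;

static class StrictEncodingCheck
{
    // Hypothetical helper: returns null when every character can be encoded,
    // otherwise a short description of the first character that cannot.
    public static string FirstUnencodable(string text, string encodingName)
    {
        // Ask for an encoding that throws instead of substituting '?'.
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return null;
        }
        catch (EncoderFallbackException ex)
        {
            // CharUnknown holds the offending char; for surrogate pairs use
            // CharUnknownHigh and CharUnknownLow instead.
            return $"U+{(int)ex.CharUnknown:X4} at index {ex.Index}";
        }
    }

    static void Main()
    {
        Console.WriteLine(FirstUnencodable("plain text", "us-ascii") ?? "ok");
        Console.WriteLine(FirstUnencodable("price\u2014list", "us-ascii"));
    }
}
```

This reports only the first failure. To list every problem character you would need to encode piece by piece, or fall back to a per-character check.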

--
Anthony Jones - MVP ASP/ASP.NET
Oct 24 '07 #5
I agree with Anthony here.

Some more references:

#I'm not a Klingon: Best Fit in WideCharToMultiByte and System.Text.Encoding Should be Avoided
http://blogs.msdn.com/shawnste/archi...19/515047.aspx

#Fallback Encoding Application Sample
http://msdn2.microsoft.com/en-us/lib...00(VS.80).aspx

Hope this helps.
Regards,
Walter Wang (wa****@online.microsoft.com, remove 'online.')
Microsoft Online Community Support

Oct 25 '07 #6
