strstr for Unicode characters

Hi all,

How could one write an strstr function to work with unicode characters?

Are there existing implementations/solutions/api for doing so?

Any pointers would be appreciated.

Thanks ..

Sep 4 '06 #1

"Kelvin Moss" <km**********@y ahoo.com> wrote in message
news:11******** **************@ p79g2000cwp.goo glegroups.com.. .
> Hi all,
>
> How could one write an strstr function to work with unicode characters?
>
> Are there existing implementations/solutions/api for doing so?
>
> Any pointers would be appreciated.
>
> Thanks ..
For Windows, there is wcsstr(). See:
http://msdn.microsoft.com/library/de...c_._mbsstr.asp
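
For what it is worth, wcsstr() is declared in <wchar.h> in standard C, so it is not limited to Windows. A minimal sketch of how it might be used; wide_find is a hypothetical helper name, not part of any library:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not a library function): index, in wchar_t units,
   of the first occurrence of needle in haystack, or -1 if it is absent. */
long wide_find(const wchar_t *haystack, const wchar_t *needle)
{
    const wchar_t *hit = wcsstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1L;
}
```

Something like wide_find(L"na\u00EFve caf\u00E9", L"caf\u00E9") would report the position in wide characters rather than bytes. Keep in mind that the width of wchar_t varies: it is 16 bits on Windows and commonly 32 bits elsewhere, so this only helps if the text is already held as wchar_t strings.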

--
Papastefanos Serafeim
Sep 4 '06 #2
Kelvin Moss wrote:
> Hi all,
>
> How could one write an strstr function to work with unicode characters?
>
> Are there existing implementations/solutions/api for doing so?
>
> Any pointers would be appreciated.
>
> Thanks ..
You could always use memcpy() or memmove().
Sep 5 '06 #3
# Hi all,
#
# How could one write an strstr function to work with unicode characters?
#
# Are there existing implementations/solutions/api for doing so?

String functions should work just fine on UTF-8 encoded Unicode
characters, bearing in mind that non-ASCII characters will have byte
values greater than 127 (or less than zero where char is signed) and
may be represented by multiple bytes. Something like strstr, which only
looks for byte sequences without embedded zeros, should be fine, while
strchr can be problematic because it searches for a single byte rather
than a multibyte character. There are also the wide character type
(wchar_t) and the wcs... functions becoming available, which will
probably be 16-bit or wider Unicode characters.
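
To illustrate the point about strstr on UTF-8, here is a minimal sketch. It assumes both strings are valid, NUL-terminated UTF-8; utf8_find and utf8_count are hypothetical helper names, not library functions:

```c
#include <stddef.h>
#include <string.h>

/* Byte offset of the first occurrence of needle in haystack, or -1.
   Assumption: both arguments are valid, NUL-terminated UTF-8. */
long utf8_find(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1L;
}

/* Number of code points in the first n bytes of s. Continuation bytes
   have the bit pattern 10xxxxxx, so every other byte starts a character. */
size_t utf8_count(const char *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

Note that the value strstr gives you is a byte offset, not a character index. If a character position is needed, pass the offset to something like utf8_count. To look for a single non-ASCII character, pass it to strstr as a short string instead of using strchr.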

--
SM Ryan http://www.rawbw.com/~wyrmwif/
Don't say anything. Especially you.
Sep 6 '06 #4
SM Ryan <wy*****@tang o-sierra-oscar-foxtrot-tango.fake.org> wrote:
> There is also wide character (wc...) type
> and functions becoming available which will probably be 16 bit or
> wider unicode characters.
For example, like the UTF-16 used by the Mac OS X file system?
--
une bvue
Sep 6 '06 #5

SM Ryan wrote:
> # Hi all,
> #
> # How could one write an strstr function to work with unicode characters?
> #
> # Are there existing implementations/solutions/api for doing so?
>
> String functions should work just fine on UTF-8 encoded unicode
> characters - minding that nonASCII characters will have codes greater
> than 127 (or less than zero) and might be represented by multiple bytes.
> For something like strstr which should only be looking for byte
> sequences without embedded zeros, it should be fine, while strchr
> can be problematic.
Yes, I am dealing with UTF-8 encoded Unicode characters. So you mean to
say that as long as I don't have embedded zeroes in the strings, strstr
should be fine. Right? I think this assumption may not hold up well
in real applications. Your thoughts?

Thanks ..

Sep 6 '06 #6
"Kelvin Moss" <km**********@y ahoo.com> wrote in message
news:11******** **************@ p79g2000cwp.goo glegroups.com.. .
> SM Ryan wrote:
>> String functions should work just fine on UTF-8 encoded unicode
>> characters - minding that nonASCII characters will have codes greater
>> than 127 (or less than zero) and might be represented by multiple
>> bytes.
>> For something like strstr which should only be looking for byte
>> sequences without embedded zeros, it should be fine, while strchr
>> can be problematic.
>
> Yes, I am dealing with UTF8 encoded Unicode characters. So you mean to
> say that as long as I don't have embedded zeroes in the strings strstr
> should be fine. Right? I think this assumption may not work quite
> well in real applications. Your thoughts?
UTF-8 won't have any embedded zeroes by definition (the only code point
that encodes to a zero byte is U+0000 itself); the encoding was
specifically designed to work transparently with C code that assumed
ASCII or some 8-bit ASCII-based encoding.
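
The reason is visible in the bit layout: every byte of a multibyte UTF-8 sequence has its high bit set, so a zero byte can only come from U+0000. A rough sketch, where utf8_encode and utf8_contains_zero are hypothetical helpers written for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if cp is not a valid
   Unicode scalar value (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Returns 1 if the UTF-8 encoding of cp contains a zero byte, else 0. */
int utf8_contains_zero(uint32_t cp)
{
    unsigned char buf[4];
    size_t n = utf8_encode(cp, buf);
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 0)
            return 1;
    }
    return 0;
}
```

Compare this with UTF-16, where a character such as U+0100 contains a zero byte and would cut a C string short.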

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

--
Posted via a free Usenet account from http://www.teranews.com

Sep 6 '06 #7
Ioannis Papadopoulos wrote:
> Kelvin Moss wrote:
>> Hi all,
>>
>> How could one write an strstr function to work with unicode characters?
>>
>> Are there existing implementations/solutions/api for doing so?
>>
>> Any pointers would be appreciated.
>>
>> Thanks ..
>
> You could always use memcpy() or memmove().
I do not know what I was thinking at the time. I thought you wanted a
function for strcpy. Please ignore my previous reply.

I tried strstr() with Unicode characters and it seems to work.
Sep 6 '06 #8
Kelvin Moss wrote:
> Hi all,
>
> How could one write an strstr function to work with unicode characters?
>
> Are there existing implementations/solutions/api for doing so?
Do you want to deal with issues such as normalization? For example,
combining characters can be represented in (many) different ways. In
that case, I've previously worked on a project that used the (IBM) ICU
libraries (licensed under the X license, GPL compatible).
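
To make the normalization issue concrete, here is a small sketch of how a plain byte search can miss text that looks identical on screen. contains_bytes and the two constants are hypothetical names used only for illustration, and the sketch does not call ICU:

```c
#include <string.h>

/* U+00E9 LATIN SMALL LETTER E WITH ACUTE in precomposed (NFC) form. */
static const char E_ACUTE_NFC[] = "\xC3\xA9";

/* 'e' followed by U+0301 COMBINING ACUTE ACCENT, the decomposed (NFD) form. */
static const char E_ACUTE_NFD[] = "e\xCC\x81";

/* Returns 1 if needle occurs in haystack byte for byte, else 0.
   No normalization is performed. */
int contains_bytes(const char *haystack, const char *needle)
{
    return strstr(haystack, needle) != NULL;
}
```

Searching a decomposed "cafe\xCC\x81" for the precomposed E_ACUTE_NFC finds nothing, even though both render as the same accented letter. Going the other way, a search for plain "cafe" matches inside the decomposed string and stops in the middle of what the user sees as one character. Normalizing both strings to the same form before searching is the usual remedy, which is where a library like ICU comes in.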

Stijn

Sep 6 '06 #9
"Kelvin Moss" <km**********@y ahoo.com> wrote:
#
# SM Ryan wrote:
# # Hi all,
# #
# # How could one write an strstr function to work with unicode characters?
# #
# # Are there existing implementations/solutions/api for doing so?
# >
# String functions should work just fine on UTF-8 encoded unicode
# characters - minding that nonASCII characters will have codes greater
# than 127 (or less than zero) and might be represented by multiple bytes.
# For something like strstr which should only be looking for byte
# sequences without embedded zeros, it should be fine, while strchr
# can be problematic.
#
# Yes, I am dealing with UTF8 encoded Unicode characters. So you mean to
# say that as long as I don't have embedded zeroes in the strings strstr
# should be fine. Right? I think this assumption may not work quite well
# in real applications. Your thoughts?
#
# Thanks ..
#
#
#

--
SM Ryan http://www.rawbw.com/~wyrmwif/
I love the smell of commerce in the morning.
Sep 6 '06 #10
