
A small program to filter non-Chinese characters out of a file.

// My email is jv*****@hotmail.com
// I want to make some friends and discuss programming.

#include "stdafx.h"
#include <malloc.h>
#include <string.h>
#define sec1 (c=buffer[i])&&(c>0x80&&c<0xA1||c>0xA9&&c<0xFF)
#define sec2 (c=buffer[i])&&(c>0xA0&&c<0xAA)

void Convert2PureChinese(unsigned char* buffer)
{
    unsigned char c;
    for(int i=0;i<(int)strlen((char*)buffer);)
        if(sec1)
        {
            printf("%c%c", buffer[i], buffer[i+1]);
            i+=2;
        }
        else if(sec2)
            i+=2;
        else
            i++;
}

void getFileBuffer(FILE* fp)
{
    if(fp==NULL)
        return;
    fseek(fp, 0L, SEEK_END);
    long len=ftell(fp);
    rewind(fp);
    unsigned char* buffer=(unsigned char*)malloc(len);
    fread(buffer, len, 1, fp);
    fclose(fp);
    Convert2PureChinese(buffer);
    free(buffer);
}

int main(int argc, char* argv[])
{
    getFileBuffer(fopen("test.txt", "r+b"));
    return 0;
}

Sep 7 '05 #1
jv*****@gmail.com wrote:
// My email is jv*****@hotmail.com
// I want to make some friends and discuss programming.
Great - nice to meet you. Have a C++ question ?

Nice "C" program, not C++.

#include "stdafx.h"
Microsoft precompiled header nonsense - why do you need it here
exactly? Turn off precompiled headers and get rid of it if you want to
write truly portable code.
#include <malloc.h>
malloc.h is nonstandard and not commonly used in C++; you should probably use

#include <cstdlib>
#include <string.h>
In C++ you should include:

#include <cstring>
#define sec1 (c=buffer[i])&&(c>0x80&&c<0xA1||c>0xA9&&c<0xFF)
#define sec2 (c=buffer[i])&&(c>0xA0&&c<0xAA)
It would be MUCH, MUCH better to write inline functions instead of macros.

void Convert2PureChinese(unsigned char* buffer)
{
unsigned char c;
for(int i=0;i<(int)strlen((char*)buffer);)
Why do you call strlen for every character ?
if(sec1)
{
printf("%c%c", buffer[i], buffer[i+1]);
i+=2;
}
else if(sec2)
i+=2;
else
i++;
}

Use std::istream instead of FILE* for C++
void getFileBuffer(FILE* fp)
{
if(fp==NULL)
return;
fseek(fp, 0L, SEEK_END);
long len=ftell(fp);
rewind(fp);
unsigned char* buffer=(unsigned char*)malloc(len);
This is a C-style cast rather than a C++-style cast. It also looks like you
need to learn about std::string (or std::basic_string).
fread(buffer, len, 1, fp);
fclose(fp);
Convert2PureChinese(buffer);
free(buffer);
Memory management is best left to the standard library. Use std::string.
}

int main(int argc, char* argv[])
{
getFileBuffer(fopen("test.txt", "r+b"));
No error report if fopen fails?
return 0;
}


The code below would look more like C++.

#include <iostream>
#include <ostream>
#include <istream>
#include <fstream>

inline bool Sec1( unsigned char c )
{
    return ( c > 0x80 && c < 0xA1 ) || ( c > 0xA9 && c < 0xFF );
}

inline bool Sec2( unsigned char c )
{
    return ( c > 0xA0 ) && ( c < 0xAA );
}

std::ostream & Convert2PureChinese(
    std::istream & i_i,
    std::ostream & i_o
) {

    char c;

    // Test the read itself, so that a failed get() at end of file
    // does not process the previous character a second time.
    while ( i_i.get( c ) )
    {
        if ( Sec1( c ) )
        {
            i_o << c;
            // Only write the trailing byte if it was actually read.
            if ( i_i.get( c ) )
            {
                i_o << c;
            }
        }
        else if ( Sec2( c ) )
        {
            i_i.get( c );
        }
    }

    return i_o;
}

int main(int argc, char* argv[])
{
    std::ifstream i_file( "test.txt", std::ios_base::binary );

    if ( i_file.fail() )
    {
        std::cerr << "Failed to open test.txt\n";
        return 1;
    }

    Convert2PureChinese( i_file, std::cout );
}
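
For what it's worth, here is a minimal sketch of the std::string route suggested above: read the whole file into a string, then filter it in memory. The helper names (IsKeptLead, IsDroppedLead, FilterKeptPairs, ReadWholeFile) are made up for this illustration and are not from the original post. It assumes the same byte ranges as the sec1/sec2 macros. Unlike the original, it does not call strlen, does not stop at an embedded zero byte, and skips a lead byte that has no trailing byte instead of reading past the end of the buffer.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Lead byte of a two-byte character that the original code prints
// (0x81-0xA0 and 0xAA-0xFE, the same range as the sec1 macro).
inline bool IsKeptLead(unsigned char c)
{
    return (c > 0x80 && c < 0xA1) || (c > 0xA9 && c < 0xFF);
}

// Lead byte of a two-byte character that the original code skips
// (0xA1-0xA9, the same range as the sec2 macro).
inline bool IsDroppedLead(unsigned char c)
{
    return c > 0xA0 && c < 0xAA;
}

// Returns only the two-byte sequences whose lead byte is "kept".
std::string FilterKeptPairs(const std::string& in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        const bool hasTrail = (i + 1 < in.size());
        if (IsKeptLead(c) && hasTrail)
        {
            out.append(in, i, 2);   // copy lead byte and trailing byte
            i += 2;
        }
        else if (IsDroppedLead(c) && hasTrail)
        {
            i += 2;                 // skip the whole two-byte character
        }
        else
        {
            ++i;                    // single byte or truncated pair: skip it
        }
    }
    return out;
}

// Reads a whole file in binary mode.
// Returns an empty string if the file cannot be opened.
std::string ReadWholeFile(const char* path)
{
    std::ifstream file(path, std::ios_base::binary);
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}
```

A caller would then write something like `std::cout << FilterKeptPairs(ReadWholeFile("test.txt"));`, and the string frees its own memory when it goes out of scope.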
Sep 7 '05 #2

"Gianni Mariani" <gi*******@mariani.ws> wrote in message
news:-O********************@speakeasy.net...
jv*****@gmail.com wrote:
// My email is jv*****@hotmail.com
// I want to make some friends and discuss programming.


Great - nice to meet you. Have a C++ question ?

Nice "C" program, not C++.

#include "stdafx.h"

Microsoft precompiled header nonsense - why do you need it here exactly?
Turn off precompiled headers and get rid of it if you want to write truly
portable code.


Why bitch about it? If it makes no difference then why make an issue? Just
because you don't like Microsoft doesn't mean you need to make this an issue
when it's not. It definitely speeds up the compiling process, so it is
needed... If he wants to port the code (which is usually never the case
anyway), you know what? It takes about 2 seconds to turn off precompiled
headers (if that) and about 1 second to remove the #include line. (And if it's
a very large project, I'm sure he can make it so he can remove this without
ill effect.)

And I promise you that just turning off precompiled headers isn't "going
to make it truly portable".
It seems like one should bitch more about it being a C program in
a C++ newsgroup than about including an MS-specific header. (Actually, I
just rename my precompiled headers to Headers.h ;)

Jon
Sep 7 '05 #3
Jon Slaughter wrote:
And I promise you that just turning off precompiled headers isn't "going
to make it truly portable".


Bad logic.

"Programs are better if you don't use many gotos"

"Hey, just by limiting the use of gotos you don't have a good program"

--
Salu2
Sep 7 '05 #4

"Julián Albo" <JU********@terra.es> wrote in message
news:43********@x-privat.org...
Jon Slaughter wrote:
And I promise you that just turning off precompiled headers isn't
"going
to make it truly portable".


Bad logic.

"Programs are better if you don't use many gotos"

"Hey, just by limiting the use of gotos you don't have a good program"

--
Salu2


um, actually not. Your logic must be worse than mine then, because you're not
following the logic of his comment.

He said that if you want to make your code "truly" portable then you must
remove the stdafx header... true enough, but he is implying this is basically
the only cause, that it is so significant that all other issues of
portability are inconsequential. This is simply not true: if it were the
only portability issue then it's not even an issue, since it's so easy to fix.

i.e., it's very simple. Why would someone make such an issue out of
something that's so easy to change? I'm sure if the OP ever wanted to make
his code portable he wouldn't have that big a deal with the stdafx... it would
take a few minutes to fix at most. Oh, but I'm sure it's not possible he could
have some real portability problems that might take him hours to fix or even
longer?

The fact is that he's only bitching about it because he doesn't like
Microsoft and he wants to make it an issue with everyone that does (not that
I do), instead of ignoring it since it has nothing to do with the actual
question (well, it could, say, if there are a lot of headers included in the
stdafx that were not mentioned, but still).

So, here's some logic that I guess you won't understand? Since you are
"sticking up" for him you too must hate Microsoft? (And I bet you my
conclusion is right whether you agree with my logic or not.)
I mean, I just don't really see why you guys, who surely must hold your time
valuable, would bitch about something that is irrelevant to the problem asked
when it can simply be ignored or a 5 word sentence can be used to say it's
not a good idea. What this implies is that there is some ulterior reason why
someone would take some extra time to make it an issue.

Now I could understand if the OP asked "MY CODE ISN'T COMPILING IN GC++
PLEASE HELP!!" but when he is asking for something completely different, and
the solution to the problem is irrelevant to his code being portable, and
given the "tone" that is used, then I can only come to three conclusions:
either he's just very arrogant and has a huge ego and likes to find ways to
let it be known, he is obsessed with portability, or he just hates anything
that has to do with Microsoft and anyone that uses a Microsoft product.
I'll bet it's a combination of the first and last, but that's just my guess.

The problem I have with it, and it's not ulterior, is that I assume this
group is supposed to be a support group and the whole point is to help if you
want. Help != throwing in your own egotistical comments that are irrelevant
to the solution. So I think it's fair game to be bitched at if you think it's
good to bitch at someone. If you are going to combine your help with an
attitude then I think you deserve what you get. I don't know what happened
to this NG but it used to be so much better.

Anyways, I got better things to do than talk about this crap.
Sep 7 '05 #5

"Jon Slaughter" <Jo***********@Hotmail.com> wrote in message
news:11*************@corp.supernews.com...

"Julián Albo" <JU********@terra.es> wrote in message
news:43********@x-privat.org...
Jon Slaughter wrote:
And I promise you that just turning off precompiled headers isn't
"going
to make it truly portable".


Bad logic.

"Programs are better if you don't use many gotos"

"Hey, just by limiting the use of gotos you don't have a good program"

--
Salu2


um, actually not. Your logic must be worse than mine then, because you're not
following the logic of his comment.

<snip>
I guess I should say that if I'm wrong then I apologize... I don't want to
misinterpret what someone says, but usually if it walks like a duck, swims
like a duck, quacks like a duck, craps like a duck, smells like a duck,
etc... then it's usually a duck.

Jon
Sep 7 '05 #6
Jon Slaughter wrote:
"Gianni Mariani" <gi*******@mariani.ws> wrote in message
news:-O********************@speakeasy.net...
jv*****@gmail.com wrote:
// My email is jv*****@hotmail.com
// I want to make some friends and discuss programming.
Great - nice to meet you. Have a C++ question ?

Nice "C" program, not C++.

#include "stdafx.h"


Microsoft precompiled header nonsense - why do you need it here exactly?
Turn off precompiled headers and get rid of it if you want to write truly
portable code.

Why bitch about it?


I have run into more problems using stdafx.h than any other header file.
For most applications, it does not make any noticeable difference to
compilation times. It also usually includes non-portable header files.
Ever since I adopted the practice of removing stdafx.{h,cpp} and
turning off precompiled headers, I have never had to deal with those
issues. Time not dealing with those issues is time saved.

If it makes no difference then why make an issue? Just because you don't like
Microsoft doesn't mean you need to make this an issue when it's not.
Asserting what I like and don't like is a practice that usually leads
you to the wrong conclusions.

It definitely speeds up the compiling process, so it is needed...
I have never seen it do so in any appreciable way, and certainly not for
the OP's code.

if he wants to port the code (which is usually never the case anyway),
The practice of writing portable code usually speeds up the development
process. In my current job, our automated builds cover both Win32 and
Linux (IA32 and AMD64). The number of
times that a bug introduced in the Win32 code was not caught by
the MS compiler (and vice versa) is astonishing, right down to race
conditions that appeared on only one platform but not the others.
Finding bugs early means lower development cost.

Doing this means you need a compatibility library.
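
As a rough illustration of what such a compatibility layer might look like, here is a minimal sketch that hides one platform difference (the path separator) behind a single helper. The `compat` namespace, `kPathSeparator` and `JoinPath` are hypothetical names invented for this example; a real library would cover far more (threads, sockets, file APIs).

```cpp
#include <string>

// Hypothetical compatibility helpers; every name here is illustrative only.
namespace compat
{
#if defined(_WIN32)
    const char kPathSeparator = '\\';   // Windows builds
#else
    const char kPathSeparator = '/';    // Linux and other POSIX builds
#endif

    // Joins a directory and a file name using the platform's separator.
    inline std::string JoinPath(const std::string& dir, const std::string& name)
    {
        if (dir.empty())
            return name;
        if (dir[dir.size() - 1] == kPathSeparator)
            return dir + name;          // separator already present
        return dir + kPathSeparator + name;
    }
}
```

The rest of the code calls compat::JoinPath and never mentions the platform, so the #if lives in exactly one place.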

you know what? It takes about 2 seconds to turn off precompiled headers (if that) and about 1 second to remove the #include line.
Yep - so do it.

(And if it's a very large project, I'm sure he can make it so he can remove this without
ill effect.)

And I promise you that just turning off precompiled headers isn't "going
to make it truly portable".
It is if it contains OS-specific headers.


It seems like one should bitch more about it being a C program in
a C++ newsgroup than about including an MS-specific header. (Actually, I
just rename my precompiled headers to Headers.h ;)


I suspect the OP's code would probably compile just as well in C++ as it
would in C, so technically, it's "standard" C++ except for stdafx.h and
malloc.h.

Sep 7 '05 #7
Jon Slaughter wrote:
Bad logic.

(snip)
So, here's some logic that I guess you won't understand? Since you are
"sticking up" for him you too must hate Microsoft? (And I bet you my
conclusion is right whether you agree with my logic or not.)


I must correct myself. Your logic is not bad... it is not logic at all.

--
Salu2
Sep 7 '05 #8
<jv*****@gmail.com> wrote in message
news:11**********************@g44g2000cwa.googlegroups.com...
// My email is jv*****@hotmail.com
// I want to make some friends and discuss programming.

#include "stdafx.h"
#include <malloc.h>
#include <string.h>
#define sec1 (c=buffer[i])&&(c>0x80&&c<0xA1||c>0xA9&&c<0xFF)
#define sec2 (c=buffer[i])&&(c>0xA0&&c<0xAA)
It appears that sec2 is a subset of sec1; if sec1 is false, sec2 will be
false as well. Thus, the else condition marked below will never be
executed.

void Convert2PureChinese(unsigned char* buffer)
{
    unsigned char c;
    for(int i=0;i<(int)strlen((char*)buffer);)
        if(sec1)
        {
            printf("%c%c", buffer[i], buffer[i+1]);
            i+=2;
        }
never executed?
        else if(sec2)
            i+=2;
        else
            i++;
}

void getFileBuffer(FILE* fp)
{
    if(fp==NULL)
        return;
    fseek(fp, 0L, SEEK_END);
    long len=ftell(fp);
    rewind(fp);
    unsigned char* buffer=(unsigned char*)malloc(len);
    fread(buffer, len, 1, fp);
    fclose(fp);
    Convert2PureChinese(buffer);
    free(buffer);
}

int main(int argc, char* argv[])
{
    getFileBuffer(fopen("test.txt", "r+b"));
    return 0;
}

Sep 7 '05 #9
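
The subset claim above can be checked by brute force over all 256 byte values. A minimal sketch, assuming the two macros are rewritten as the predicates Sec1 and Sec2 (as in the earlier reply); CountOverlap and CountSec2Only are hypothetical helper names added for this check. Because && binds tighter than ||, sec1 covers 0x81-0xA0 and 0xAA-0xFE while sec2 covers 0xA1-0xA9, so the two ranges look disjoint rather than nested, and the `else if(sec2)` branch would be reached for lead bytes 0xA1 through 0xA9.

```cpp
// Same test as the sec1 macro, with the implicit grouping made explicit.
inline bool Sec1(unsigned char c)
{
    return (c > 0x80 && c < 0xA1) || (c > 0xA9 && c < 0xFF);
}

// Same test as the sec2 macro.
inline bool Sec2(unsigned char c)
{
    return c > 0xA0 && c < 0xAA;
}

// Counts byte values for which both predicates hold.
inline int CountOverlap()
{
    int both = 0;
    for (int v = 0; v <= 0xFF; ++v)
    {
        const unsigned char c = static_cast<unsigned char>(v);
        if (Sec1(c) && Sec2(c))
            ++both;
    }
    return both;
}

// Counts byte values matched by Sec2 but not by Sec1,
// i.e. the values that reach the `else if` branch.
inline int CountSec2Only()
{
    int only = 0;
    for (int v = 0; v <= 0xFF; ++v)
    {
        const unsigned char c = static_cast<unsigned char>(v);
        if (!Sec1(c) && Sec2(c))
            ++only;
    }
    return only;
}
```

If CountOverlap() is zero and CountSec2Only() is nonzero, the second branch is live code.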
