Hi all,
I have this sample data in a .txt file:
-- Ne doit pas faire
-- é
select PARC.CONTYP
, PARC.CONSTA
, count(*)
from ParcNonCede PARC
, PLAN P
and PARC.CONSTA not in ('Stock', 'Détourné')
group by CONTYP, CONSTA
;
I am saving this in UTF-8 (not necessarily using Notepad).
I have this sample code to detect the code page of a buffer:
void
ExamineData(String &codepage, String &buffer)
{
IMultiLanguage2 *mlang;
DetectEncodingInfo info = {0};
MIMECPINFO codepageinfo;
int length = 0, cnt = 1;
String codepagestr; // String Datatype is my class
unsigned int len =1;
if(buffer.GetString())
length = buffer.GetLength();
HRESULT hr = S_OK;
SUCCEEDED(CoInitialize(NULL)); // init COM
hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);
hr = mlang->DetectInputCodepage(0,0, (wyChar *)buffer.GetString(), &length, &info, &cnt);
if(SUCCEEDED(hr))
{
hr = mlang->GetCodePageInfo(info.nCodePage, info.nLangID, &codepageinfo);
if(SUCCEEDED(hr))
codepage.SetAs(codepageinfo.wszDescription);
}
mlang->Release();
CoUninitialize();
}
This function detects the code page of the buffer as "Western European (Windows)" even though the buffer is UTF-8. Notepad, however, correctly detects the file as UTF-8.
I cannot work out whether the problem is in my code or in the API.
Please help me.
Thanks in Advance,
Xoinki
This looks odd:
HRESULT hr = S_OK;
SUCCEEDED(CoInitialize(NULL)); // init COM
Shouldn't it be:
HRESULT hr = CoInitialize(NULL); // init COM
if (!SUCCEEDED(hr))
{
//bail. COM did not initialize
}
Further, you should be using CoInitializeEx. Otherwise, you are forced to marshal, and that slows everything down. Probably, you should code:
HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED); // init COM
if (!SUCCEEDED(hr))
{
//bail. COM did not initialize
}
Then here:
hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);
You don't check that your instance was created. You need to:
hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);
if (!SUCCEEDED(hr))
{
//bail the instance was not created
}
This also looks odd:
hr = mlang->DetectInputCodepage(0,0, (wyChar *)buffer.GetString(), &length, &info, &cnt);
This is the prototype for DetectInputCodepage:
HRESULT DetectInputCodepage( DWORD dwFlag,
DWORD dwPrefWinCodePage,
CHAR *pSrcStr,
INT *pcSrcSize,
DetectEncodingInfo *lpEncoding,
INT *pnScores
);
Your third argument is a wyChar* and not a CHAR* as far as I can see. The fifth argument is a DetectEncodingInfo pointer. DetectEncodingInfo is a struct and you have
DetectEncodingInfo info = {0};
but you never use the results from DetectInputCodepage().
Hi, I changed the code as you suggested:
<code>
void
ExamineData(String &codepage, String &buffer)
{
IMultiLanguage2 *mlang;
DetectEncodingInfo info;
MIMECPINFO codepageinfo;
int length = 0, cnt = 1;
String codepagestr;
unsigned int len =1;
if(buffer.GetString())
length = buffer.GetLength();
HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED); // init COM
if(SUCCEEDED(hr) == false)
{
codepage.SetAs("Unicode (UTF-8)");
return;
}
hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);
if(SUCCEEDED(hr) == false)
{
codepage.SetAs("Unicode (UTF-8)");
CoUninitialize();
return;
}
hr = mlang->DetectInputCodepage(0,0, (char *)buffer.GetString(), &length, &info, &cnt);
if(SUCCEEDED(hr) == true)
{
hr = mlang->GetCodePageInfo(info.nCodePage, info.nLangID, &codepageinfo);
if(SUCCEEDED(hr) == true)
codepage.SetAs(codepageinfo.wszDescription);
}
else
{
codepage.SetAs("Unicode (UTF-8)");
mlang->Release();
CoUninitialize();
return;
}
mlang->Release();
CoUninitialize();
}
</code>
It is still detecting UTF-8 as "Western European (Windows)" for the buffer I mentioned in my first post.
Also, I am saving that file (with the sample accented characters shown in my first post) through a program, so I am not writing any BOM. If I open the same file in Notepad, choose Save As, select UTF-8 from the encoding combo box and save, then the code page is detected properly, since Notepad inserts a BOM. Is this API able to detect UTF-8 properly without a BOM, or is there a mistake in my code?
Please help me.
Thanks in advance,
Xoinki
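One way to cross-check a BOM-less buffer independently of MLang is to test whether its bytes are well-formed UTF-8. The following is only a minimal sketch in portable C++, not a description of what Notepad or DetectInputCodepage do internally; the helper names HasUtf8Bom and IsValidUtf8 are hypothetical, and std::string is assumed in place of the custom String class used above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the buffer starts with the UTF-8 BOM (EF BB BF).
bool HasUtf8Bom(const std::string &buf)
{
    return buf.size() >= 3 &&
           static_cast<unsigned char>(buf[0]) == 0xEF &&
           static_cast<unsigned char>(buf[1]) == 0xBB &&
           static_cast<unsigned char>(buf[2]) == 0xBF;
}

// Hypothetical helper: true if every byte sequence in the buffer is
// well-formed UTF-8 (no stray lead or continuation bytes, no overlong forms,
// no surrogates, nothing above U+10FFFF). Pure ASCII also passes.
bool IsValidUtf8(const std::string &buf)
{
    const std::size_t n = buf.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(buf[i]);
        if (c < 0x80) {            // single-byte (ASCII) character
            ++i;
            continue;
        }
        std::size_t extra = 0;     // number of continuation bytes expected
        unsigned long cp = 0;      // decoded code point
        if (c >= 0xC2 && c <= 0xDF)      { extra = 1; cp = c & 0x1F; }
        else if (c >= 0xE0 && c <= 0xEF) { extra = 2; cp = c & 0x0F; }
        else if (c >= 0xF0 && c <= 0xF4) { extra = 3; cp = c & 0x07; }
        else return false;         // invalid lead byte (includes C0, C1, F5..FF)

        if (n - i <= extra)        // sequence is truncated at end of buffer
            return false;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(buf[i + k]);
            if ((cc & 0xC0) != 0x80)   // continuation bytes must be 10xxxxxx
                return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (extra == 2 && cp < 0x800) return false;                       // overlong
        if (extra == 3 && (cp < 0x10000 || cp > 0x10FFFF)) return false;  // overlong or out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;                   // UTF-16 surrogate
        i += extra + 1;
    }
    return true;
}
```

With these helpers, the bytes of 'Détourné' saved as UTF-8 (é stored as C3 A9) pass IsValidUtf8, while the same word saved as Windows-1252 (é stored as the single byte E9) does not, because E9 is then followed by a non-continuation byte. A buffer that passes and contains at least one multi-byte sequence is very likely UTF-8; a buffer that is pure ASCII passes too and stays ambiguous.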