Detect Code page using IMultiLanguage2 Interface (MS VC++ (Win32))

Hi all,

I have this sample data in a .txt file,

-- Ne doit pas faire
-- é
select PARC.CONTYP
, PARC.CONSTA
, count(*)
from ParcNonCede PARC
, PLAN P
and PARC.CONSTA not in ('Stock', 'Détourné')
group by CONTYP, CONSTA
;

I am saving this in UTF-8 (not necessarily using Notepad).

I have this sample code to detect the code page of a buffer:
void
ExamineData(String &codepage, String &buffer)
{
    IMultiLanguage2 *mlang;
    DetectEncodingInfo info = {0};
    MIMECPINFO codepageinfo;
    int length = 0, cnt = 1;
    String codepagestr; // String Datatype is my class
    unsigned int len =1;

    if(buffer.GetString())
        length = buffer.GetLength();

    HRESULT hr = S_OK;
    SUCCEEDED(CoInitialize(NULL)); // init COM

    hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);

    hr = mlang->DetectInputCodepage(0,0, (wyChar *)buffer.GetString(), &length, &info, &cnt);

    if(SUCCEEDED(hr))
    {
        hr = mlang->GetCodePageInfo(info.nCodePage, info.nLangID, &codepageinfo);
        if(SUCCEEDED(hr))
            codepage.SetAs(codepageinfo.wszDescription);
    }

    mlang->Release();
    CoUninitialize();
}

This function detects the code page of the buffer as "Western European (Windows)" even though the data is UTF-8. Notepad, however, correctly detects it as UTF-8.

I cannot tell whether the problem is in my code or in the API.

Please help me.

Thanks in Advance,
Xoinki
Jun 27 '07 #1
weaknessforcats
This looks odd:
HRESULT hr = S_OK;
SUCCEEDED(CoInitialize(NULL)); // init COM
Shouldn't it be:
HRESULT hr = CoInitialize(NULL); // init COM
if (!SUCCEEDED(hr))
{
    //bail. COM did not initialize
}
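Further, consider using CoInitializeEx, which lets you choose the threading model explicitly. CoInitialize always puts the thread in a single-threaded apartment, and calls to an object that lives in a different apartment have to be marshaled, which slows things down. Probably, you should code: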

HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED ); // init COM
if (!SUCCEEDED(hr))
{
    //bail. COM did not initialize
}
Then here:
hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);
You don't check that your instance was created. You need to:
hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);
if (!SUCCEEDED(hr))
{
    //bail the instance was not created
}
This also looks odd:
hr = mlang->DetectInputCodepage(0,0, (wyChar *)buffer.GetString(), &length, &info, &cnt);
This is the prototype for DetectInputCodepage:
HRESULT DetectInputCodepage(
    DWORD dwFlag,
    DWORD dwPrefWinCodePage,
    CHAR *pSrcStr,
    INT *pcSrcSize,
    DetectEncodingInfo *lpEncoding,
    INT *pnScores
);
Your third argument is cast to wyChar* rather than CHAR*, as far as I can see. The fifth argument is a pointer to a DetectEncodingInfo struct, and you have
DetectEncodingInfo info = {0};
but you only read nCodePage and nLangID from it. You never look at the nConfidence or nDocPercent values that DetectInputCodepage() also fills in.
Jun 27 '07 #2
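One subtlety behind the HRESULT checks suggested above: SUCCEEDED() accepts every non-negative HRESULT, and that includes S_FALSE, which the documentation for DetectInputCodepage lists as the "cannot determine the code page" result. The sketch below is only an illustration of that rule. It uses stand-in definitions (HResult, kOk, kFalse, kFail, Succeeded, DetectedSomething are hypothetical names, not Windows identifiers) so that it compiles without the Windows headers.

```cpp
#include <cstdint>

// Stand-ins for the Windows definitions so the sketch compiles anywhere.
// On Windows the real HRESULT, S_OK, S_FALSE and E_FAIL come from the SDK
// headers; every name in this block is hypothetical.
using HResult = std::int32_t;

constexpr HResult kOk    = 0;                                  // value of S_OK
constexpr HResult kFalse = 1;                                  // value of S_FALSE
constexpr HResult kFail  = static_cast<HResult>(0x80004005u);  // value of E_FAIL

// Same rule the SUCCEEDED() macro applies: non-negative means success.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }

// Stricter check for calls where S_FALSE means "no answer was produced".
constexpr bool DetectedSomething(HResult hr) { return hr == kOk; }

// Compile-time confirmation of the behaviour described above.
static_assert(Succeeded(kFalse), "S_FALSE counts as success");
static_assert(!DetectedSomething(kFalse), "but it carries no detection result");
```

In the asker's function this would mean comparing hr against S_OK after DetectInputCodepage, instead of relying on SUCCEEDED(hr) alone, before trusting info.nCodePage.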
xoinki
Hi, I changed the code as you suggested:

<code>
void
ExamineData(String &codepage, String &buffer)
{
    IMultiLanguage2 *mlang;
    DetectEncodingInfo info;
    MIMECPINFO codepageinfo;
    int length = 0, cnt = 1;
    String codepagestr;
    unsigned int len =1;

    if(buffer.GetString())
        length = buffer.GetLength();

    HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED); // init COM
    if(SUCCEEDED(hr) == false)
    {
        codepage.SetAs("Unicode (UTF-8)");
        return;
    }

    hr = CoCreateInstance(CLSID_CMultiLanguage, NULL, CLSCTX_INPROC_SERVER, IID_IMultiLanguage2,(void **)&mlang);

    if(SUCCEEDED(hr) == false)
    {
        codepage.SetAs("Unicode (UTF-8)");
        CoUninitialize();
        return;
    }

    hr = mlang->DetectInputCodepage(0,0, (char *)buffer.GetString(), &length, &info, &cnt);

    if(SUCCEEDED(hr) == true)
    {
        hr = mlang->GetCodePageInfo(info.nCodePage, info.nLangID, &codepageinfo);

        if(SUCCEEDED(hr) == true)
            codepage.SetAs(codepageinfo.wszDescription);
    }
    else
    {
        codepage.SetAs("Unicode (UTF-8)");
        mlang->Release();
        CoUninitialize();
        return;
    }

    mlang->Release();
    CoUninitialize();
}
</code>

It still detects the UTF-8 buffer from my first post as "Western European (Windows)".

Also, I save the file (with the accented characters shown in my first post) from a program, so no BOM is written. If I open the same file in Notepad, choose Save As, select UTF-8 from the encoding combo box and save, the code page is detected properly, because Notepad inserts a BOM. Can this API detect UTF-8 properly without a BOM, or is there a mistake in my code?
Please help me.
Thanks in advance,
Xoinki
Jun 28 '07 #3
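On the BOM question in post #3: a common workaround, not confirmed anywhere in this thread, is to look at the raw bytes before asking MLang at all. The sketch below checks for the UTF-8 byte order mark (EF BB BF) and, if there is none, tests whether the buffer is well-formed UTF-8 containing at least one multi-byte sequence. The helper names HasUtf8Bom, IsValidUtf8 and LooksLikeUtf8 are hypothetical, std::string stands in for the asker's own String class, and this is only a heuristic: a short text in an ANSI code page can occasionally pass the same test.

```cpp
#include <cstddef>
#include <string>

// True if the buffer starts with the UTF-8 byte order mark EF BB BF.
bool HasUtf8Bom(const std::string& data) {
    return data.size() >= 3 &&
           static_cast<unsigned char>(data[0]) == 0xEF &&
           static_cast<unsigned char>(data[1]) == 0xBB &&
           static_cast<unsigned char>(data[2]) == 0xBF;
}

// True if every byte sequence in the buffer is well-formed UTF-8.
// Pure ASCII input also passes this check.
bool IsValidUtf8(const std::string& data) {
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(data[i]);
        if (lead < 0x80) { ++i; continue; }  // plain ASCII byte

        std::size_t extra = 0;     // continuation bytes expected
        unsigned long cp = 0;      // decoded code point
        unsigned long lowest = 0;  // smallest code point allowed at this length
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; lowest = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; lowest = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; lowest = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i <= extra) return false;  // sequence is cut off at the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(data[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, values past U+10FFFF, and surrogates.
        if (cp < lowest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += extra + 1;
    }
    return true;
}

// Heuristic: a BOM, or valid UTF-8 with at least one non-ASCII byte.
bool LooksLikeUtf8(const std::string& data) {
    if (HasUtf8Bom(data)) return true;
    bool has_non_ascii = false;
    for (char ch : data) {
        if (static_cast<unsigned char>(ch) >= 0x80) { has_non_ascii = true; break; }
    }
    return has_non_ascii && IsValidUtf8(data);
}
```

With the sample data, the UTF-8 bytes of 'Détourné' (C3 A9 for each é) pass LooksLikeUtf8, while the same word saved as Windows-1252 (a single E9 byte for each é) fails, so the function could report "Unicode (UTF-8)" directly and fall back to DetectInputCodepage only when the check fails.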
