Bytes IT Community

Reading in from MS Word files using MFC.

P: 79
Well, the title pretty much describes what I want to do. I want to be able to read the contents of a Word document (*.doc). I also want to be able to read it to a CString object, and then search that CString object for specific substrings.

The problem I am having is reading the Word file itself. I have tried numerous things but no luck. Also, all the sources I found on the net are for older versions of VC++ and MS Word.

I am using Visual Studio 2005 and Microsoft Word 2003.

And no, this is not a homework assignment in case you're wondering. It's for a program I am writing for work, and I'd love to start on the next project instead of wasting all my time on this one. Everything else is done in the program except for this one part.

Thanks.
Jun 20 '07 #1
3 Replies


weaknessforcats
Expert Mod 5K+
P: 9,197
The easiest thing to do is save the Word document as plain text.

Now you have a Notepad file.

Read each record into a CString and enjoy.

Any other approach will have you deciphering Word file formats.
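
As a rough sketch of the "read each record and search" step, assuming the document has already been saved as plain text: the function name `findLineContaining` is made up for illustration, and standard C++ is used so the example builds without MFC. In an MFC program, `CStdioFile::ReadString` and `CString::Find` would take the place of `std::getline` and `std::string::find`.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the 1-based number of the first line that
// contains `needle`, or 0 if the file cannot be opened or nothing matches.
std::size_t findLineContaining(const std::string& path, const std::string& needle)
{
    assert(!needle.empty());           // an empty search term would match every line

    std::ifstream in(path);            // text mode is fine for a plain-text export
    if (!in) {
        return 0;                      // file missing or unreadable
    }

    std::string line;
    std::size_t lineNo = 0;
    while (std::getline(in, line)) {   // one "record" per line
        ++lineNo;
        if (line.find(needle) != std::string::npos) {
            return lineNo;
        }
    }
    return 0;                          // no line contained the substring
}
```

This only works on the exported text file. Pointing it at the original .doc would search through the binary container rather than the visible text.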
Jun 21 '07 #2

P: 79
I know that, I wrote code for that already. The problem is the Word files are already there, and they aren't supposed to be opened and saved again.

Is there a way to read a Word file as a plain text file or in binary mode? I've been told that it should work.
Jun 21 '07 #3

weaknessforcats
Expert Mod 5K+
P: 9,197
You have limited options. If you read the doc file byte by byte, you have to know the Word format. The information about fonts, styles, etc. is scattered all through the file alongside the text. You really need Word to read it. Plus the record format varies between versions of Word.
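
To see why opening the file in binary mode does not give you plain text, here is a small sketch that only looks at the first 8 bytes. Word 97-2003 .doc files are OLE compound files, and those start with a fixed signature. The function name `looksLikeBinaryDoc` is invented for this example, and a matching signature only says the file is an OLE container, not that it is definitely a Word document.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// First 8 bytes of every OLE compound file, the container format
// that Word 97-2003 uses for .doc files.
const std::array<unsigned char, 8> kOleSignature = {
    0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1
};

// Hypothetical helper: returns true if the file starts with the OLE signature.
bool looksLikeBinaryDoc(const std::string& path)
{
    assert(!path.empty());

    std::ifstream in(path, std::ios::binary);   // binary mode: no newline translation
    if (!in) {
        return false;                           // file missing or unreadable
    }

    std::array<char, 8> header{};
    in.read(header.data(), static_cast<std::streamsize>(header.size()));
    if (in.gcount() != static_cast<std::streamsize>(header.size())) {
        return false;                           // shorter than 8 bytes
    }

    for (std::size_t i = 0; i < header.size(); ++i) {
        if (static_cast<unsigned char>(header[i]) != kOleSignature[i]) {
            return false;
        }
    }
    return true;
}
```

Everything after that header is sectors, streams and formatting tables, so a substring search over the raw bytes is unreliable at best.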

If the doc files exist and are write-protected or version-managed with something like SourceSafe, you should be able to open them and do a Save As to plain text.

The only other option is to Save As RTF. But here, you would need to know how to read and interpret an RTF file.

The Save As just runs a converter. Office has many converters and these are in the Office SDK. Would you be permitted to run the doc file through a converter? It's the same as Save As but without the mouse.

This is about all I can offer.
Jun 21 '07 #4
