Word 6.0 files to text

I am working on a project to extract the text from a Word 6.0
binary file. The format is very messy, and I am unable to calculate where
the text begins from the FIB (File Information Block). How can I do
that? Which other data structures would I have to consider to locate
all the text in a Word 6.0 binary file?

Help!!!

PS: If I am not in the right group, please tell me which is the right
group to get help on this topic.

Sep 10 '06 #1
On 10 Sep 2006 01:51:40 -0700, in comp.lang.c , vp****@gmail.com
wrote:
>I am working on a project to extract the text from a Word 6.0
>binary file.
>....
>
>PS: If I am not in the right group, please tell me which is the right
>group to get help on this topic.

You're right, this is the wrong group. You could try one of the MS
programming groups, but I think you would be better off searching
MSDN and googling for "word document format decode". I know that MS
publish some libraries to let you read the doc format, and wotsit.org
probably has some stuff.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Sep 10 '06 #2
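
For readers who land on this thread with the same question, a rough sketch in C of the first step may help. It assumes the "WordDocument" stream has already been extracted from the compound file into memory, and it reads a few FIB fields from the start of that stream. The field offsets, the 0xA5DC magic value, and the names fib_summary and parse_fib are assumptions made for illustration (the offsets follow commonly circulated descriptions of the Word 6.0/95 layout, not anything stated in this thread), so check them against the format documentation and real files before relying on them.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMED offsets into the FIB at the start of the WordDocument stream.
   Verify against the format documentation before relying on them. */
#define FIB_OFF_WIDENT 0x00 /* magic number, assumed 0xA5DC for Word 6.0/95 */
#define FIB_OFF_NFIB   0x02 /* format version number */
#define FIB_OFF_FLAGS  0x0A /* bit 0x0004 assumed to be fComplex (fast-saved) */
#define FIB_OFF_FCMIN  0x18 /* stream offset of the first text character */
#define FIB_OFF_FCMAC  0x1C /* stream offset just past the last text character */
#define FIB_MIN_SIZE   0x20 /* bytes needed to read the fields above */

/* Hypothetical summary structure, not part of any library. */
struct fib_summary {
    uint16_t wIdent;
    uint16_t nFib;
    int      is_complex;
    uint32_t fcMin;
    uint32_t fcMac;
};

/* Read little-endian values byte by byte, so host endianness does not matter. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill *out from the stream. Returns 0 on success, -1 if the buffer is too
   short, the magic number is not the assumed one, or the text range is
   not plausible for the given stream length. */
int parse_fib(const unsigned char *stream, size_t len, struct fib_summary *out)
{
    if (stream == NULL || out == NULL || len < FIB_MIN_SIZE)
        return -1;

    out->wIdent     = rd16(stream + FIB_OFF_WIDENT);
    out->nFib       = rd16(stream + FIB_OFF_NFIB);
    out->is_complex = (rd16(stream + FIB_OFF_FLAGS) & 0x0004) != 0;
    out->fcMin      = rd32(stream + FIB_OFF_FCMIN);
    out->fcMac      = rd32(stream + FIB_OFF_FCMAC);

    if (out->wIdent != 0xA5DC)
        return -1;
    if (out->fcMin > out->fcMac || out->fcMac > len)
        return -1;
    return 0;
}
```

If is_complex comes back 0, the text would be expected to lie between fcMin and fcMac in the stream as 8-bit characters. A fast-saved ("complex") file would also need the piece table, which this sketch does not attempt to handle.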
vp****@gmail.com said:
>I am working on a project to extract the text from a Word 6.0
>binary file. The format is very messy, and I am unable to calculate where
>the text begins from the FIB (File Information Block). How can I do
>that? Which other data structures would I have to consider to locate
>all the text in a Word 6.0 binary file?
>
>Help!!!
>
>PS: If I am not in the right group, please tell me which is the right
>group to get help on this topic.

Um, yes, you're in completely the wrong group. You need to look for a group
that discusses Microsoft's "compound document" format, also known as
"structured storage".

In the meantime, and without any guarantees that the information is correct,
you could take a look at the following URL, which will at least give you a
starting point, admittedly in C++ rather than C, and in Microsoft C++ at
that. If you find it helpful, great, but don't be surprised if, when you
eventually find an expert on the subject, he laughs at my code!

http://www.cpax.org.uk/prg/windows/structstor.php
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 10 '06 #3
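
As a small illustration of the "compound document" point above, here is a minimal sketch in C that checks whether a buffer begins with the 8-byte signature used by compound ("structured storage") files. The function name is_compound_file is made up for this example. The check only identifies the container; it does not parse the directory or locate any stream inside it.

```c
#include <stddef.h>
#include <string.h>

/* The 8-byte signature at the start of a compound (structured storage) file. */
static const unsigned char CFB_SIGNATURE[8] = {
    0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1
};

/* Hypothetical helper: returns 1 if buf starts with the compound-file
   signature, 0 otherwise (including when buf is NULL or too short). */
int is_compound_file(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < sizeof CFB_SIGNATURE)
        return 0;
    return memcmp(buf, CFB_SIGNATURE, sizeof CFB_SIGNATURE) == 0;
}
```

A caller would typically read the first 8 bytes of the .doc file and pass them in. If the check fails, the file is probably not a compound document, and looking for a FIB inside a "WordDocument" stream would not apply.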
