
Storing the content of doc\pdf into MS SQL2005 DB

Newbie
 
Join Date: Jul 2008
Posts: 7
#1: Jul 9 '08
Friends,
I need to read the content of a doc\pdf file and store it in a DB. I am using fstream to do this. Currently it only works for txt files. I am not sure whether fstream handles doc\pdf. When I read such a file I can see some values in the variable, but I don't know whether they are junk. The DB column type is 'text'. Can anyone suggest a resolution for this?

Banfa's Avatar
AdministratorVoR
 
Join Date: Feb 2006
Location: South West UK
Posts: 6,188
#2: Jul 9 '08

re: Storing the content of doc\pdf into MS SQL2005 DB


The doc format is a binary format, so when using fstream you need to make sure that you are doing a binary read, and when storing it in the database you need to store it in a field with a binary type, like image.
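
A minimal sketch of the binary read described above, assuming standard C++ only. The function name readBinaryFile is hypothetical, and the bytes are held in a std::vector<char> rather than a string type so that the length is tracked separately from the content:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads an entire file as raw bytes (hypothetical helper).
// Returns an empty vector if the file cannot be opened or fully read.
std::vector<char> readBinaryFile(const std::string& path)
{
    assert(!path.empty());  // precondition: caller supplies a file name

    // ios::binary disables newline translation; ios::ate starts at the end
    // so that tellg() reports the file size.
    std::ifstream file(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file.is_open())
        return {};

    const std::streamoff size = file.tellg();
    if (size <= 0)
        return {};

    std::vector<char> bytes(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    file.read(bytes.data(), size);
    if (!file)          // short read or I/O error
        return {};
    return bytes;
}
```

The size of the data is bytes.size(), which stays correct even when the file contains zero bytes.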
Newbie
 
Join Date: Jul 2008
Posts: 7
#3: Jul 9 '08

re: Storing the content of doc\pdf into MS SQL2005 DB


Quote:

Originally Posted by Banfa

The doc format is a binary format, so when using fstream you need to make sure that you are doing a binary read, and when storing it in the database you need to store it in a field with a binary type, like image.

Banfa,
Yes, I am treating it as binary. I am using the fstream flags ios::in|ios::binary|ios::ate, but it still does not read the doc completely. It fails somewhere while reading the header info of the doc; only some 8 chars are read.
The following is the method I am using to read it:

ifstream::pos_type size;
char * pszBuffer;
ifstream file (strFileName, ios::in|ios::binary|ios::ate);
if (file.is_open())
{
    size = file.tellg();
    pszBuffer = strAttachmentBuffer.GetBufferSetLength(size);
    file.seekg (0, ios::beg);
    file.read (pszBuffer, size);
    file.close();
}
Any idea why it's not reading completely? Are there any other flags I need to add?
Banfa's Avatar
AdministratorVoR
 
Join Date: Feb 2006
Location: South West UK
Posts: 6,188
#4: Jul 9 '08

re: Storing the content of doc\pdf into MS SQL2005 DB


GetBufferSetLength looks like it's a member of CSimpleStringT.

If so, then you need to change that. CSimpleStringT is for holding strings, and you are not dealing with string data; you want to hold an array of binary bytes. Use something like CByteArray.
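
To illustrate why a string type is a poor fit here, the following sketch assumes the usual 8-byte signature of an older Word .doc file followed by zero bytes. The array kDocHeader and the function names are hypothetical; only standard C++ is used, not the MFC/ATL classes mentioned above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed first 12 bytes of a .doc file: an 8-byte signature, then zeros.
static const unsigned char kDocHeader[12] = {
    0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1,
    0x00, 0x00, 0x00, 0x00
};

// Length as a C-string API sees it: counting stops at the first zero byte.
std::size_t lengthAsCString()
{
    return std::strlen(reinterpret_cast<const char*>(kDocHeader));
}

// Length as a byte container sees it: the size is stored explicitly.
std::size_t lengthAsByteArray()
{
    std::vector<unsigned char> bytes(kDocHeader, kDocHeader + sizeof kDocHeader);
    return bytes.size();
}

// The string view loses everything after the signature.
void demonstrateTruncation()
{
    assert(lengthAsCString() == 8);
    assert(lengthAsByteArray() == 12);
}
```

Any code path that measures the data with string semantics would report 8 bytes for such a file, regardless of how much was actually read from disk.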
Newbie
 
Join Date: Jul 2008
Posts: 7
#5: Jul 10 '08

re: Storing the content of doc\pdf into MS SQL2005 DB


Quote:

Originally Posted by Banfa

GetBufferSetLength looks like it's a member of CSimpleStringT.

If so, then you need to change that. CSimpleStringT is for holding strings, and you are not dealing with string data; you want to hold an array of binary bytes. Use something like CByteArray.

Yes, that's true. But I am trying to read and hold the value in pszBuffer, which is a char pointer. The value is not read correctly even there. What really confuses me now is whether fstream can read doc\pdf files at all.
I am trying to read a Word doc\pdf.
Banfa's Avatar
AdministratorVoR
 
Join Date: Feb 2006
Location: South West UK
Posts: 6,188
#6: Jul 10 '08

re: Storing the content of doc\pdf into MS SQL2005 DB


Quote:

Originally Posted by San07

Yes, that's true. But I am trying to read and hold the value in pszBuffer, which is a char pointer. The value is not read correctly even there. What really confuses me now is whether fstream can read doc\pdf files at all.
I am trying to read a Word doc\pdf.

I see no reason why it wouldn't, but if you are worried, read it and then write it back out immediately to convince yourself.

There is nothing special about doc/pdf; each is just another binary file format, and fstream can read binary data when set up correctly (which you seem to have done).
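
The read-then-write check suggested above could be sketched as follows. The helper names loadBytes, saveBytes and roundTripMatches are hypothetical, and the sketch assumes the file fits in memory:

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Loads every byte of a file; returns an empty vector if it cannot be opened.
std::vector<char> loadBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::vector<char>((std::istreambuf_iterator<char>(in)),
                             std::istreambuf_iterator<char>());
}

// Writes the bytes to a file in binary mode, replacing any existing content.
bool saveBytes(const std::string& path, const std::vector<char>& bytes)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    if (!bytes.empty())
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    out.close();
    return !out.fail();
}

// Copies source to copy and checks that the copy is byte-for-byte identical.
// An empty or unreadable source is treated as a failure in this sketch.
bool roundTripMatches(const std::string& source, const std::string& copy)
{
    assert(source != copy);  // never overwrite the file being tested
    const std::vector<char> original = loadBytes(source);
    if (original.empty() || !saveBytes(copy, original))
        return false;
    return loadBytes(copy) == original;
}
```

If roundTripMatches returns true for a doc or pdf file, the stream handling is sound and any data loss is happening later, for example where the bytes are handed to a string or to the database.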
Newbie
 
Join Date: Jul 2008
Posts: 7
#7: Jul 10 '08

re: Storing the content of doc\pdf into MS SQL2005 DB


Quote:

Originally Posted by Banfa

I see no reason why it wouldn't, but if you are worried, read it and then write it back out immediately to convince yourself.

There is nothing special about doc/pdf; each is just another binary file format, and fstream can read binary data when set up correctly (which you seem to have done).


Banfa,
Thanks for your time and suggestions.
OK, the problem with writing a doc\pdf does not occur right after reading. If I write to a file immediately after reading, it works fine. But I need to store the data in a SQL 2005 DB, and when I retrieve the content from the DB, writing it back fails.

What exactly happens is that when I read a doc file (e.g. a 160 KB file), the content in the char pointer is shown as "ÐÏࡱá" (instead of "ÐÏࡱá >  þÿ....."). It is stored in the DB as is, and when I retrieve it, the size is reported as 8 and only those bytes are copied to the file.
So, immediately after reading the doc, I wrote the content to a text file. Then I edited the text file (I deleted the white space after the 8th character up to the '>' symbol, though there were still a lot of spaces in the file). To my surprise, when I read it back from the DB the content was 160 KB (which means the complete data was stored). But the file was corrupted anyway, since I had manually modified the header content.
1. Is there any flag I am missing because of which I fail to read the header completely?
2. Do I need to use anything more, like serialization?


In the DB (MS SQL 2005), the column I am using is of data type 'text' (which can hold up to 2 GB).
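
'text' is a character type, whereas the earlier advice in this thread was to use a binary type such as image. As a hedged illustration only, the sketch below turns a byte buffer into a T-SQL binary constant (0x followed by hex digits), so that zero bytes survive being embedded in a statement. The table name Attachments and column name Content are hypothetical, and in real code binding the bytes as a binary parameter through the database API would normally be preferable to building SQL text:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Formats raw bytes as a T-SQL binary constant, e.g. 0xD0CF11E0.
std::string toBinaryLiteral(const std::vector<unsigned char>& bytes)
{
    static const char kHex[] = "0123456789ABCDEF";
    std::string literal = "0x";
    literal.reserve(2 + bytes.size() * 2);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        literal += kHex[bytes[i] >> 4];    // high nibble
        literal += kHex[bytes[i] & 0x0F];  // low nibble
    }
    assert(literal.size() == 2 + bytes.size() * 2);
    return literal;
}

// Builds an INSERT for a hypothetical table with a binary-typed column.
std::string buildInsert(const std::vector<unsigned char>& bytes)
{
    return "INSERT INTO Attachments (Content) VALUES ("
           + toBinaryLiteral(bytes) + ")";
}
```

Because every byte, including 0x00, becomes two printable hex digits, nothing in the statement can be cut short by string handling along the way.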