Parse OLE Object - C#

I have an Access database (images.mdb) with two columns: the id of the picture (an integer) and a column named picture of type OLE Object, which contains an image stored as an OLE Object. It can hold jpg, bmp, or gif, but I don't know which kind of image is stored inside.

I want to retrieve the picture identified by a given id and display it in a web page (.aspx). I write in Visual C#, but it does not matter; VB answers are just as welcome.

My problem is that the OLE Object field does not contain just the raw array of bytes that form the image, so I cannot simply read the bytes and output them to the browser. The OLE Object also contains some extra information about the type of the stored file (which would be good to know, so I can tell what kind of image it is), but I don't know how to get this information. I also don't know how to separate this information from the actual image.

Does anyone know how to solve this?
Nov 22 '07 #1
kenobewan
I notice that you copied a post from 2005:

    using System;
    using System.Data.OleDb;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    namespace OleImages
    {
        public partial class Form1 : Form
        {
            public Form1()
            {
                InitializeComponent();
            }

            private void Form1_Load(object sender, EventArgs e)
            {
                String strConn = @"Provider = Microsoft.Jet.OLEDB.4.0;Data Source = C:\Nwind.mdb;";
                String strCmd = "Select Picture From Categories where CategoryID=1";
                try
                {
                    Byte[] byPicture;
                    // Dispose the connection and command even if the query fails.
                    using (OleDbConnection conn = new OleDbConnection(strConn))
                    using (OleDbCommand cmd = new OleDbCommand(strCmd, conn))
                    {
                        conn.Open();
                        byPicture = (Byte[]) cmd.ExecuteScalar();
                    }

                    // Skip the first 78 bytes, which this code assumes are the OLE header.
                    // The stream is left open because the Bitmap needs it for its lifetime.
                    MemoryStream ms = new MemoryStream(byPicture, 78, byPicture.Length - 78);
                    Bitmap bm = new Bitmap(ms);
                    pictureBox1.Image = bm;

                    String strPath = Path.Combine(
                        Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.jpg");
                    bm.Save(strPath, ImageFormat.Jpeg);
                }
                catch (Exception ex)
                {
                    // Report the failure instead of silently swallowing it.
                    MessageBox.Show(ex.Message);
                }
            }
        }
    }
Nov 22 '07 #2
Hi,

Thanks for the reply!

Yes, I copied that post because I could not find any solution there, and I am currently facing the same issue.

I tried the solution you provided, but I get a "Parameter is not valid" error while initializing the Bitmap. Moreover, the OLE Object field in my case can contain any file: Word, Excel, jpg, bmp, gif. So I am really not sure whether the offset will be 78 in every case.

I can't find any way forward and would appreciate any help on this.
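One workaround for the image case, when a fixed offset such as 78 cannot be trusted, is to search the blob for the leading bytes of a known image format and start reading from there. The following is only a heuristic sketch, not something from this thread: the class and method names (ImageSniffer, FindImageStart) are made up for illustration, and the two-byte BMP signature in particular can match by accident inside a header.

```csharp
using System;

public static class ImageSniffer
{
    // Leading bytes ("magic numbers") of common image formats.
    private static readonly byte[] Jpeg = { 0xFF, 0xD8, 0xFF };
    private static readonly byte[] Png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    private static readonly byte[] Gif = { 0x47, 0x49, 0x46, 0x38 }; // "GIF8"
    private static readonly byte[] Bmp = { 0x42, 0x4D };             // "BM"

    // Returns the offset of the first signature found, or -1 if none is found.
    // mimeType receives the matching content type, or null.
    public static int FindImageStart(byte[] blob, out string mimeType)
    {
        for (int i = 0; i < blob.Length; i++)
        {
            if (StartsWithAt(blob, i, Jpeg)) { mimeType = "image/jpeg"; return i; }
            if (StartsWithAt(blob, i, Png)) { mimeType = "image/png"; return i; }
            if (StartsWithAt(blob, i, Gif)) { mimeType = "image/gif"; return i; }
            if (StartsWithAt(blob, i, Bmp)) { mimeType = "image/bmp"; return i; }
        }
        mimeType = null;
        return -1;
    }

    // True when signature occurs in data starting exactly at offset.
    private static bool StartsWithAt(byte[] data, int offset, byte[] signature)
    {
        if (offset + signature.Length > data.Length) return false;
        for (int j = 0; j < signature.Length; j++)
        {
            if (data[offset + j] != signature[j]) return false;
        }
        return true;
    }
}
```

If an offset is found, the bytes from that offset onward can be handed to a MemoryStream in place of the hard-coded 78. This does not help with Word or Excel objects, which need real header parsing.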
Nov 23 '07 #3
stucky
I recently had to fight this battle, and found no answers online. I normally don't write code at the bit level, so excuse any novice mistakes, but lacking a library to deal with OLE Files from Access I had to parse it as well as I could.

In C# I retrieve my document from Access (or SQL Server if the Access Database has been upsized) as a byte array. Access encapsulates files with its own header, which isn't an OLE file structure. It's something different...

    byte[] doc = ld.GetSupportingDocument(docID);

    // Wrap the blob in a stream positioned at the start.
    MemoryStream ms = new MemoryStream(doc);
    int firstByte = ms.ReadByte();
    int secondByte = ms.ReadByte();

    // Reject the blob unless both signature bytes match.
    if (firstByte != 0x15 || secondByte != 0x1C) {
        ErrorResponse("Stored object is not an Access File.");
        return;
    }

The first two bytes are a signature; if they are not 0x15 followed by 0x1C, it's not an Access OLE file. The next short (2 bytes) is the offset at which the file type ends:

    int fileTypeLoc = 20; // start of the file type
    short offset; // end of the file type

    // The stream is now at byte 2; read the 16-bit offset stored there.
    byte[] buffer = new byte[2];
    ms.Read(buffer, 0, 2);
    offset = BitConverter.ToInt16(buffer, 0);

Keeping track of how far I've read into the file, I store a portion of the bytes as a string, starting at offset 0x14 (byte index 20) and running up to the offset I retrieved in the previous block.

    long seekTotal = 0;
    seekTotal += offset;

    // Collect the header bytes between index 20 and the offset as text.
    string docType = String.Empty;
    for (int i = fileTypeLoc; i < offset; i++) {
        docType += (char)doc[i];
    }

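The steps so far (signature check, 16-bit offset, type string) could be folded into one helper with bounds checks. This is a minimal sketch that assumes the layout described above holds for your data; AccessOleHeader and ReadTypeName are hypothetical names, and the 16-bit value is decoded explicitly as little-endian, which is an assumption based on BitConverter having worked on an x86 machine.

```csharp
using System;
using System.Text;

public static class AccessOleHeader
{
    // Byte index where the type text starts, as described in the post above.
    private const int TypeNameStart = 20;

    // Returns the header text between index 20 and the stored offset.
    public static string ReadTypeName(byte[] doc)
    {
        if (doc == null || doc.Length < TypeNameStart)
            throw new ArgumentException("Blob is too short to hold an Access header.");
        if (doc[0] != 0x15 || doc[1] != 0x1C)
            throw new ArgumentException("Blob does not start with the 0x15 0x1C signature.");

        // 16-bit little-endian offset stored at bytes 2 and 3 (assumed byte order).
        int headerEnd = doc[2] | (doc[3] << 8);
        if (headerEnd < TypeNameStart || headerEnd > doc.Length)
            throw new ArgumentException("Header offset is out of range.");

        string raw = Encoding.ASCII.GetString(doc, TypeNameStart, headerEnd - TypeNameStart);
        // Strip NUL padding at the ends; NULs in the middle are left alone.
        return raw.Trim('\0');
    }
}
```

The returned string may still hold more than one NUL-separated name, so a Contains test like the one used in the next step remains a reasonable way to inspect it.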
The next bit is how I'm figuring out what type of file it is, so that when I serve it via HTTP I can set the file name and content type properly. There's no real parsing going on in this block, with the exception of the Package type. A package can be anything: a zip file, a gif, a pdf, whatever. When you have a package, the original file name is stored in the Access header, so I read in 256 bytes (an arbitrary number I selected based on trial and error) and pluck the original file extension from it. Because of my database I have no worries that it's anything but a pdf, but if you can't guarantee that, you need to do a better job of parsing than I am.

    bool packageIsPdf = false;
    string ext = "dat";
    string filename = "supporting-document";
    string contentType = "application/octet-stream";
    if (docType.Contains("Word.Document.8")) {
        ext = "doc";
        contentType = "application/msword";
    } else if (docType.Contains("AcroExch.Document.7")) {
        contentType = "application/pdf";
        ext = "pdf";
    } else if (docType.Contains("Package")) {
        // packages are generic and require more processing:
        // look for the original file name in the first 256 bytes
        string packageBuffer = String.Empty;
        for (int i = 20; i < 256 && i < doc.Length; i++) {
            packageBuffer += (char)doc[i];
        }
        if (packageBuffer.Contains(".pdf")) {
            contentType = "application/pdf";
            ext = "pdf";
            packageIsPdf = true;
        } else if (packageBuffer.Contains(".zip")) {
            contentType = "application/zip";
            ext = "zip";
        } else {
            ext = "dat";
        }
    } else if (docType.Contains("Excel.Sheet.8")) {
        ext = "xls";
        contentType = "application/vnd.ms-excel";
    } else if (docType.Contains("PowerPoint.Show.8")) {
        ext = "ppt";
        contentType = "application/vnd.ms-powerpoint";
    } else if (docType.Contains("Word.Document.12")) {
        ext = "docx";
        contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    } else if (docType.Contains("PowerPoint.Show.12")) {
        ext = "pptx";
        contentType = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
    } else if (docType.Contains("Excel.Sheet.12")) {
        ext = "xlsx";
        contentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    }

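If the list of types keeps growing, the if/else chain above could be replaced with a small lookup table. This is just one possible arrangement, not what my code does; OleContentTypes and TryLookup are hypothetical names, and the Package case is left out on purpose because it needs the extra file-name check.

```csharp
using System;

public static class OleContentTypes
{
    // Each row: type-name fragment, file extension, content type.
    private static readonly string[][] Known =
    {
        new[] { "Word.Document.8", "doc", "application/msword" },
        new[] { "AcroExch.Document.7", "pdf", "application/pdf" },
        new[] { "Excel.Sheet.8", "xls", "application/vnd.ms-excel" },
        new[] { "PowerPoint.Show.8", "ppt", "application/vnd.ms-powerpoint" },
        new[] { "Word.Document.12", "docx",
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
        new[] { "PowerPoint.Show.12", "pptx",
                "application/vnd.openxmlformats-officedocument.presentationml.presentation" },
        new[] { "Excel.Sheet.12", "xlsx",
                "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" }
    };

    // Returns true and fills ext/contentType when docType contains a known fragment;
    // otherwise returns false with generic "dat" / octet-stream defaults.
    public static bool TryLookup(string docType, out string ext, out string contentType)
    {
        if (docType != null)
        {
            foreach (string[] row in Known)
            {
                if (docType.Contains(row[0]))
                {
                    ext = row[1];
                    contentType = row[2];
                    return true;
                }
            }
        }
        ext = "dat";
        contentType = "application/octet-stream";
        return false;
    }
}
```

Adding a new document type then means adding one row instead of another else-if branch.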
Read 8 more bytes. These bytes should always be 01 05 00 00 02 00 00 00.

    // magic eight bytes 01 05 00 00 02 00 00 00
    ms.Seek(seekTotal, SeekOrigin.Begin);
    buffer = new byte[8];
    ms.Read(buffer, 0, 8);
    seekTotal += 8;

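The block above reads the eight bytes but never checks them. If you want to fail early on unexpected data, a check along these lines might help. It is a sketch only: OleMagic and HasExpectedMarker are made-up names, and the byte values are simply the ones I listed above.

```csharp
using System;

public static class OleMagic
{
    // The eight bytes expected right after the Access header.
    private static readonly byte[] Expected = { 0x01, 0x05, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00 };

    // True when the expected eight bytes are present at the given position.
    public static bool HasExpectedMarker(byte[] doc, int position)
    {
        if (doc == null || position < 0 || position + Expected.Length > doc.Length)
            return false;

        for (int i = 0; i < Expected.Length; i++)
        {
            if (doc[position + i] != Expected[i]) return false;
        }
        return true;
    }
}
```

Calling HasExpectedMarker(doc, offset) before continuing would let you return an error instead of reading garbage lengths from a blob that uses a different layout.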
Read the next long (a 4-byte integer) and skip forward by that many bytes.

    // Second offset to move to
    buffer = new byte[4];
    ms.Read(buffer, 0, 4);
    seekTotal += 4;
    long offset2 = BitConverter.ToInt32(buffer, 0);
    seekTotal += offset2;
    ms.Seek(seekTotal, SeekOrigin.Begin);

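One caveat with BitConverter: it uses the byte order of the machine the code runs on. The values here appear to be stored little-endian (an assumption, based on this code working on x86), so decoding the bytes by hand avoids any dependence on the host. A minimal sketch, with LittleEndian and ReadInt32 as hypothetical names:

```csharp
using System;

public static class LittleEndian
{
    // Decodes a 32-bit little-endian integer starting at index.
    public static int ReadInt32(byte[] data, int index)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (index < 0 || index + 4 > data.Length)
            throw new ArgumentOutOfRangeException("index");

        return data[index]
             | (data[index + 1] << 8)
             | (data[index + 2] << 16)
             | (data[index + 3] << 24);
    }
}
```

For example, the bytes 2C 01 00 00 decode to 300.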
Read 8 empty bytes.

    // eight empty bytes
    buffer = new byte[8];
    ms.Read(buffer, 0, 8);
    seekTotal += 8;

The next long (4 bytes) tells you how many bytes your encapsulated file is.

    // next 4 bytes are the length of the file
    buffer = new byte[4];
    ms.Read(buffer, 0, 4);
    seekTotal += 4;
    long fileByteLength = BitConverter.ToInt32(buffer, 0);

The next N bytes consist of your file. Create a new buffer of this length and read from your memory stream into the buffer.

    // next N bytes are the file
    byte[] data = new byte[fileByteLength];

    // store file bytes in data buffer
    ms.Read(data, 0, Convert.ToInt32(fileByteLength));

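Putting the walk through the header together, the same sequence can be written more compactly with a BinaryReader, which always reads little-endian. Treat this as a sketch under the assumptions described in this post (signature, 16-bit offset, eight marker bytes, a 4-byte skip length, eight empty bytes, a 4-byte payload length); AccessOleExtractor and ExtractPayload are names I made up for the example.

```csharp
using System;
using System.IO;

public static class AccessOleExtractor
{
    // Returns the embedded payload bytes, following the layout described above.
    public static byte[] ExtractPayload(byte[] doc)
    {
        using (MemoryStream ms = new MemoryStream(doc, false))
        using (BinaryReader reader = new BinaryReader(ms))
        {
            // Two-byte signature, then the 16-bit offset where the Access header ends.
            if (reader.ReadByte() != 0x15 || reader.ReadByte() != 0x1C)
                throw new InvalidDataException("Missing 0x15 0x1C signature.");
            short headerEnd = reader.ReadInt16();
            ms.Seek(headerEnd, SeekOrigin.Begin);

            reader.ReadBytes(8);            // marker bytes 01 05 00 00 02 00 00 00
            int skip = reader.ReadInt32();  // length of the block to skip
            ms.Seek(skip, SeekOrigin.Current);
            reader.ReadBytes(8);            // eight empty bytes
            int size = reader.ReadInt32();  // payload length in bytes

            // Refuse lengths that run past the end of the blob.
            if (size < 0 || size > ms.Length - ms.Position)
                throw new InvalidDataException("Payload length exceeds the blob.");
            return reader.ReadBytes(size);
        }
    }
}
```

Truncated input surfaces as an EndOfStreamException from ReadInt16 or ReadInt32, so the caller can catch that together with InvalidDataException and report that the stored object could not be parsed.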
If your file is a PDF, you have another headache to deal with: OLE2 Compound Files. I extract the pdf from the OLE2 file in a separate method, using the Gembox Compound File 1.1 library.

    if (contentType == "application/pdf" && !packageIsPdf) {
        data = GetPdfFromOle(data, Convert.ToInt32(seekTotal), Convert.ToInt32(fileByteLength));
    }

If everything went well, I can write the byte array to my Response object:

    if (data == null) {
        Response.Write("Unable to retrieve file");
        Response.End();
    }

    string contentDisposition = String.Format("attachment; filename={0}.{1}", filename, ext);
    Response.ContentType = contentType;
    Response.AppendHeader("Content-Disposition", contentDisposition);
    Response.BinaryWrite(data);
    Response.End();

A code sample of extracting the pdf from an OLE2 Compound file using Gembox's library:

    private byte[] GetPdfFromOle(byte[] data, int offset, int length) {
        // Write the blob to a temporary file for the library to load.
        string tmpFileName = Path.GetTempFileName();
        File.WriteAllBytes(tmpFileName, data);

        byte[] pdfData = null;
        Ole2CompoundFile ole2file = new Ole2CompoundFile();

        try {
            ole2file.Load(tmpFileName, false);
            foreach (Ole2Stream entry in ole2file.Root) {
                log.Debug(entry.Name);

                // Take the stream named "contents" (compared case-insensitively).
                if (entry.Name.ToLower() == "contents") {
                    pdfData = entry.GetData();
                    break;
                }
            }
        } catch (Exception ex) {
            ErrorResponse(ex.Message);
        } finally {
            // Close the compound file before deleting the temp file it was loaded from.
            ole2file.Close();
            File.Delete(tmpFileName);
        }
        return pdfData;
    }

Hope this helps.
Nov 29 '07 #4
