Parse OLE Object - C#

I have an Access database (images.mdb) with two columns: one is the id of the picture (an integer), and the other (named picture) is a field of type OLE Object that contains an image stored as an OLE Object. It can store jpg, bmp, or gif, but I don't know which kind of image is stored inside.
I want to retrieve the picture identified by a given id from the database and display it in a web page (.aspx).
I write in Visual C#, but it does not matter; VB answers are just as welcome.
My problem is that this OLE Object field does not contain just the raw array of bytes that form the image, so I cannot simply read the array of bytes and output it to the browser.
Instead, the OLE Object contains some extra information about the type of the file stored (which would be good to know, so I can tell what kind of image it is), but I don't know how to get this information.
I also don't know how to separate this information from the actual image.
Does anyone know how to solve this?
Nov 22 '07 #1
kenobewan
4,871 Expert 4TB
I notice that you copied a post from 2005:

    using System;
    using System.Data.OleDb;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    namespace OleImages
    {
        public partial class Form1 : Form
        {
            public Form1()
            {
                InitializeComponent();
            }

            private void Form1_Load(object sender, EventArgs e)
            {
                string strConn = @"Provider = Microsoft.Jet.OLEDB.4.0;Data Source = C:\Nwind.mdb;";
                string strCmd = "Select Picture From Categories where CategoryID=1";
                try
                {
                    byte[] byPicture;
                    // Dispose the connection and command even if the query fails.
                    using (OleDbConnection conn = new OleDbConnection(strConn))
                    using (OleDbCommand cmd = new OleDbCommand(strCmd, conn))
                    {
                        conn.Open();
                        byPicture = (byte[])cmd.ExecuteScalar();
                    }

                    // Skip the first 78 bytes, which this sample treats as the OLE
                    // header in front of the image. The offset is not guaranteed
                    // for other data. Wrapping the array directly leaves the stream
                    // positioned at the start of the image; the stream must stay
                    // open for the lifetime of the Bitmap.
                    MemoryStream ms = new MemoryStream(byPicture, 78, byPicture.Length - 78);
                    Bitmap bm = new Bitmap(ms);
                    pictureBox1.Image = bm;

                    string strPath = Path.Combine(
                        Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.jpg");
                    bm.Save(strPath, ImageFormat.Jpeg);
                }
                catch (Exception ex)
                {
                    // Report the failure instead of silently swallowing it.
                    MessageBox.Show(ex.Message);
                }
            }
        }
    }

Nov 22 '07 #2
Hi,

Thanks for the reply!

Yes, I copied that post because I was not able to find any solution in it, and I am currently facing the same issue.

I have tried the solution you provided, but I get a "Parameter is not valid" error while initializing the Bitmap. Moreover, the OLE Object field in my case can contain any file: Word, Excel, jpg, bmp, gif. So I am really not sure whether the offset will be 78 in every case.

I can't find any way forward, and would appreciate any help on this.
Nov 23 '07 #3
stucky
1
I recently had to fight this battle, and found no answers online. I normally don't write code at the bit level, so excuse any novice mistakes, but lacking a library to deal with OLE Files from Access I had to parse it as well as I could.

In C# I retrieve my document from Access (or SQL Server if the Access Database has been upsized) as a byte array. Access encapsulates files with its own header, which isn't an OLE file structure. It's something different...

    byte[] doc = ld.GetSupportingDocument(docID);

    MemoryStream ms = new MemoryStream();
    ms.Write(doc, 0, doc.Length);
    int firstByte;
    int secondByte;
    ms.Seek(0, SeekOrigin.Begin);
    firstByte = ms.ReadByte();
    secondByte = ms.ReadByte();

    // Reject the data if either signature byte is wrong (|| rather than &&).
    if (firstByte != 0x15 || secondByte != 0x1C) {
        ErrorResponse("Stored object is not an Access File.");
        return;
    }

The first two bytes are a signature: if they are not 0x15 and 0x1C, it's not an Access OLE file. The next short is the offset of the end of the file type:

    int fileTypeLoc = 20; // begin of the file type
    short offset; // end of the file type

    byte[] buffer = new byte[2];
    ms.Read(buffer, 0, 2);
    offset = BitConverter.ToInt16(buffer, 0);

Keeping track of how far I've read into the file, I store a portion of the bytes as a string, starting from offset 0x14 (byte 20, counting from zero) up to the offset I retrieved in the previous block.

    long seekTotal = 0;
    seekTotal += offset;

    string docType = String.Empty;
    for (int i = fileTypeLoc; i < offset; i++) {
        docType += (char)doc[i];
    }

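The signature check, the header-end offset, and the type string described so far can be pulled into one helper. This is only a sketch under the assumptions stated in this post (signature bytes 0x15 0x1C, a little-endian short at bytes 2-3, type name starting at byte 20); the class and method names are made up for illustration, and it reads the offset as unsigned and byte by byte so the result does not depend on the machine's endianness.

```csharp
using System;
using System.Text;

public static class AccessOleHeader
{
    // Start of the type name, as used in the post (assumed fixed at 20).
    private const int TypeNameStart = 20;

    // Returns true and fills in the type name if the buffer starts with
    // the 0x15 0x1C signature and the header offset is plausible.
    public static bool TryReadTypeName(byte[] doc, out string typeName, out int headerEnd)
    {
        typeName = string.Empty;
        headerEnd = 0;

        if (doc == null || doc.Length < TypeNameStart) return false;
        if (doc[0] != 0x15 || doc[1] != 0x1C) return false;

        // Bytes 2-3: little-endian offset where the type name section ends.
        headerEnd = doc[2] | (doc[3] << 8);
        if (headerEnd < TypeNameStart || headerEnd > doc.Length) return false;

        // Any embedded NUL bytes are turned into spaces so Contains() checks
        // on the result behave like the char-by-char loop above.
        typeName = Encoding.ASCII
            .GetString(doc, TypeNameStart, headerEnd - TypeNameStart)
            .Replace("\0", " ")
            .Trim();
        return true;
    }
}
```

A caller would test the return value first and only then look at `typeName`, for example with `typeName.Contains("Package")`.
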
The next bit is how I figure out what type of file it is, so that when I serve it via HTTP I can set the file name and content type properly. There's no real parsing going on in this block, with the exception of the Package type. A package can be anything: a zip file, a gif, a pdf, whatever. When you have a package, the original file name is stored in the Access header, so I read in 256 bytes (an arbitrary number I selected based on trial and error) and pluck the original file extension from it. Because of my database I have no worries that it's anything but a pdf, but if you can't guarantee that, you need to do a better job of parsing than I am.

    bool packageIsPdf = false;
    string ext = "dat";
    string filename = "supporting-document";
    string contentType = "application/octet-stream";
    if (docType.Contains("Word.Document.8")) {
        ext = "doc";
        contentType = "application/msword";
    } else if (docType.Contains("AcroExch.Document.7")) {
        contentType = "application/pdf";
        ext = "pdf";
    } else if (docType.Contains("Package")) {
        // packages are generic and require more processing
        string packageBuffer = String.Empty;
        // stop at the end of the array if the document is shorter than 256 bytes
        for (int i = 20; i < 256 && i < doc.Length; i++) {
            packageBuffer += (char)doc[i];
        }
        if (packageBuffer.Contains(".pdf")) {
            contentType = "application/pdf";
            ext = "pdf";
            packageIsPdf = true;
        } else if (packageBuffer.Contains(".zip")) {
            contentType = "application/zip";
            ext = "zip";
        } else {
            ext = "dat";
        }
    } else if (docType.Contains("Excel.Sheet.8")) {
        ext = "xls";
        contentType = "application/vnd.ms-excel";
    } else if (docType.Contains("PowerPoint.Show.8")) {
        ext = "ppt";
        contentType = "application/vnd.ms-powerpoint";
    } else if (docType.Contains("Word.Document.12")) {
        ext = "docx";
        contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    } else if (docType.Contains("PowerPoint.Show.12")) {
        ext = "pptx";
        contentType = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
    } else if (docType.Contains("Excel.Sheet.12")) {
        ext = "xlsx";
        contentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    }

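If the list of types keeps growing, the same mapping can be kept in a table instead of an if/else chain. The following is a rough sketch, not what my production code does: `OleTypeMap` and `TryLookup` are hypothetical names, the table only covers the non-Package types listed above, and the content types are the registered media types for those formats. The Package case still needs its own handling.

```csharp
using System;

public static class OleTypeMap
{
    // Each row: type-name fragment, file extension, content type.
    // The first row whose fragment occurs in the type string wins.
    private static readonly string[][] Known =
    {
        new[] { "Word.Document.8",     "doc",  "application/msword" },
        new[] { "Excel.Sheet.8",       "xls",  "application/vnd.ms-excel" },
        new[] { "PowerPoint.Show.8",   "ppt",  "application/vnd.ms-powerpoint" },
        new[] { "AcroExch.Document.7", "pdf",  "application/pdf" },
        new[] { "Word.Document.12",    "docx",
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
        new[] { "Excel.Sheet.12",      "xlsx",
                "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" },
        new[] { "PowerPoint.Show.12",  "pptx",
                "application/vnd.openxmlformats-officedocument.presentationml.presentation" },
    };

    // Returns false and the generic "dat" / octet-stream pair for unknown types.
    public static bool TryLookup(string docType, out string ext, out string contentType)
    {
        foreach (string[] row in Known)
        {
            if (docType != null && docType.IndexOf(row[0], StringComparison.Ordinal) >= 0)
            {
                ext = row[1];
                contentType = row[2];
                return true;
            }
        }
        ext = "dat";
        contentType = "application/octet-stream";
        return false;
    }
}
```

Adding a new type is then a one-line change to the table rather than another branch.
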
Read 8 more bytes. These bytes should always be 01 05 00 00 02 00 00 00.

    // magic eight bytes 01 05 00 00 02 00 00 00
    ms.Seek(seekTotal, SeekOrigin.Begin);
    buffer = new byte[8];
    ms.Read(buffer, 0, 8);
    seekTotal += 8;

Read the next 4-byte integer and skip ahead by that many bytes.

    // Second offset to move to
    buffer = new byte[4];
    ms.Read(buffer, 0, 4);
    seekTotal += 4;
    long offset2 = BitConverter.ToInt32(buffer, 0);
    seekTotal += offset2;
    ms.Seek(seekTotal, SeekOrigin.Begin);

Read 8 empty bytes.

    // eight empty bytes
    buffer = new byte[8];
    ms.Read(buffer, 0, 8);
    seekTotal += 8;

The next 4-byte integer tells you how many bytes long your encapsulated file is.

    // next 4 bytes are the length of the file
    buffer = new byte[4];
    ms.Read(buffer, 0, 4);
    seekTotal += 4;
    long fileByteLength = BitConverter.ToInt32(buffer, 0);

The next N bytes consist of your file. Create a new buffer of this length and read from your memory stream into the buffer.

    // next N bytes are the file
    byte[] data = new byte[fileByteLength];

    // store file bytes in data buffer
    ms.Read(data, 0, Convert.ToInt32(fileByteLength));

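Putting the steps together, the whole walk through the header can be written as one function that works directly on the byte array and checks bounds at each step. Treat this as a sketch of the layout as I worked it out by trial and error, not as a reference parser: `AccessOleExtractor` and its methods are hypothetical names, and the function returns null whenever the data does not fit the layout described here.

```csharp
using System;

public static class AccessOleExtractor
{
    // Follows the steps described above: signature, header end offset,
    // 8 fixed bytes, a length-prefixed block to skip, 8 empty bytes,
    // a 4-byte payload length, then the payload itself.
    public static byte[] ExtractPayload(byte[] doc)
    {
        if (doc == null || doc.Length < 20 || doc[0] != 0x15 || doc[1] != 0x1C)
            return null;

        long pos = doc[2] | (doc[3] << 8);   // end of the Access header
        pos += 8;                            // skip 01 05 00 00 02 00 00 00

        int skip;
        if (!TryReadInt32(doc, pos, out skip) || skip < 0) return null;
        pos += 4 + (long)skip;               // skip the length-prefixed block
        pos += 8;                            // skip the eight empty bytes

        int length;
        if (!TryReadInt32(doc, pos, out length) || length < 0) return null;
        pos += 4;
        if (pos + length > doc.Length) return null;

        byte[] data = new byte[length];
        Array.Copy(doc, (int)pos, data, 0, length);
        return data;
    }

    // Reads a little-endian 32-bit integer, or returns false if it would
    // run past the end of the buffer.
    private static bool TryReadInt32(byte[] buf, long pos, out int value)
    {
        value = 0;
        if (pos < 0 || pos + 4 > buf.Length) return false;
        int p = (int)pos;
        value = buf[p] | (buf[p + 1] << 8) | (buf[p + 2] << 16) | (buf[p + 3] << 24);
        return true;
    }
}
```

Working on the array with an explicit position avoids having to keep a running seek total in sync with a stream, and a truncated or unexpected record yields null instead of an exception.
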
If your file is a PDF, you have another headache to deal with: OLE2 Compound Files. I extract the pdf from the OLE2 file in a separate method, using the Gembox Compound File 1.1 library.

    if (contentType == "application/pdf" && !packageIsPdf) {
        data = GetPdfFromOle(data, Convert.ToInt32(seekTotal), Convert.ToInt32(fileByteLength));
    }

If everything went well, I can return the byte array to my Response object:

    if (data == null) {
        Response.Write("Unable to retrieve file");
        Response.End();
    }

    string contentDisposition = String.Format("attachment; filename={0}.{1}", filename, ext);
    Response.ContentType = contentType;
    Response.AppendHeader("Content-Disposition", contentDisposition);
    Response.BinaryWrite(data);
    Response.End();

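For the original question, where the stored file is an image of unknown kind, the content type can also be guessed from the first bytes of the extracted data rather than from the header. A minimal sketch, assuming `data` already holds the raw image file; `ImageSniffer` is a hypothetical helper, and it only recognizes the well-known leading bytes of JPEG, PNG, GIF and BMP files.

```csharp
using System;

public static class ImageSniffer
{
    // Returns a content type based on the file's leading bytes,
    // or null if none of the known signatures match.
    public static string GuessImageContentType(byte[] data)
    {
        if (data == null) return null;
        if (StartsWith(data, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (StartsWith(data, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) return "image/png";
        if (StartsWith(data, 0x47, 0x49, 0x46, 0x38)) return "image/gif"; // "GIF8"
        if (StartsWith(data, 0x42, 0x4D)) return "image/bmp";             // "BM"
        return null;
    }

    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

If the result is null, falling back to "application/octet-stream" keeps the behaviour of the code above.
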
Here is a code sample for extracting the pdf from an OLE2 Compound File using Gembox's library:

    private byte[] GetPdfFromOle(byte[] data, int offset, int length) {
        // The library loads from disk, so write the bytes to a temp file first.
        string tmpFileName = Path.GetTempFileName();

        FileStream fstmp = new FileStream(tmpFileName, FileMode.Create, FileAccess.Write, FileShare.None);
        fstmp.Write(data, 0, data.Length);
        fstmp.Close();

        byte[] pdfData = null;
        Ole2CompoundFile ole2file = new Ole2CompoundFile();

        try {
            ole2file.Load(tmpFileName, false);
            foreach (Ole2Stream entry in ole2file.Root) {
                log.Debug(entry.Name);

                // The pdf bytes live in the stream named "contents".
                if (entry.Name.ToLower() == "contents") {
                    pdfData = entry.GetData();
                    break;
                }
            }
        } catch (Exception ex) {
            ErrorResponse(ex.Message);
        } finally {
            // Close the compound file before deleting the temp file it was loaded from.
            ole2file.Close();
            File.Delete(tmpFileName);
        }
        return pdfData;
    }

Hope this helps.
Nov 29 '07 #4
