Parse OLE Object - C# - .NET Framework

I have an Access database (images.mdb) with two columns: the id of the picture (an integer) and a column named picture of type OLE Object, which contains an image stored as an OLE Object. It can store jpg, bmp or gif, but I don't know which kind of image is stored inside.
I want to retrieve the picture identified by a given id and display it in a web page (.aspx).
I write in Visual C#, but that does not matter; VB answers are just as welcome.
My problem is that the OLE Object field does not contain just the raw array of bytes that form the image, so I cannot simply read the bytes and output them to the browser. The OLE Object also contains some extra information about the type of file stored (which would be good to have, so I can tell what kind of image it is), but I don't know how to get this information. I also don't know how to separate it from the actual image.
Does anyone know how to solve this?

Nov 22 '07 #1

kenobewan

I notice that you copied a post from 2005:

using System;
using System.Data.OleDb;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Windows.Forms;

namespace OleImages
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            string strConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Nwind.mdb;";
            string strCmd = "SELECT Picture FROM Categories WHERE CategoryID = 1";
            try
            {
                byte[] byPicture;
                using (OleDbConnection conn = new OleDbConnection(strConn))
                using (OleDbCommand cmd = new OleDbCommand(strCmd, conn))
                {
                    conn.Open();
                    byPicture = (byte[])cmd.ExecuteScalar();
                }

                // Skip the 78-byte OLE header in front of the bitmap data.
                // GDI+ needs the stream to stay open for the lifetime of the
                // Bitmap, so ms is deliberately not disposed here.
                MemoryStream ms = new MemoryStream(byPicture, 78, byPicture.Length - 78);
                Bitmap bm = new Bitmap(ms);
                pictureBox1.Image = bm;

                string strPath = Path.Combine(
                    Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.jpg");
                bm.Save(strPath, ImageFormat.Jpeg);
            }
            catch (Exception ex)
            {
                // Report the failure instead of silently swallowing it.
                MessageBox.Show(ex.Message);
            }
        }
    }
}

Nov 22 '07 #2

narinder2

Hi,

Thanks for the reply !!!

Yes, I copied that post because I could not find any solution in it, and I am currently facing the same issue.

I have tried the solution you provided, but I get a "Parameter is not valid" error while initializing the Bitmap. Moreover, the OLE Object field in my case can contain any file: Word, Excel, jpg, bmp, gif. So I am really not sure whether the offset will be 78 in every case.

I am not finding any way forward and would appreciate any help on this.
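
One possible way around a fixed offset, offered only as a sketch and not something taken from the posts above, is to scan the byte array for the well-known leading bytes of common image formats. The class and method names here (`ImageSignatureScanner`, `FindImageStart`) are made up for illustration. The approach only helps when the original file bytes are embedded unchanged, and BMP is left out because its two-byte "BM" signature is too short to search for reliably.

```csharp
using System;

public static class ImageSignatureScanner
{
    // Leading bytes of common image formats.
    private static readonly byte[] Jpeg = { 0xFF, 0xD8, 0xFF };
    private static readonly byte[] Png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    private static readonly byte[] Gif = { 0x47, 0x49, 0x46, 0x38 }; // "GIF8"

    // Returns the offset of the first signature found and sets contentType,
    // or returns -1 (contentType = null) when nothing matches.
    public static int FindImageStart(byte[] blob, out string contentType)
    {
        for (int i = 0; i < blob.Length; i++)
        {
            if (MatchesAt(blob, i, Jpeg)) { contentType = "image/jpeg"; return i; }
            if (MatchesAt(blob, i, Png)) { contentType = "image/png"; return i; }
            if (MatchesAt(blob, i, Gif)) { contentType = "image/gif"; return i; }
        }
        contentType = null;
        return -1;
    }

    // True when every byte of signature appears in blob starting at index.
    private static bool MatchesAt(byte[] blob, int index, byte[] signature)
    {
        if (index + signature.Length > blob.Length) return false;
        for (int j = 0; j < signature.Length; j++)
        {
            if (blob[index + j] != signature[j]) return false;
        }
        return true;
    }
}
```

If a match is found, the bytes from that offset onward could be passed to `new MemoryStream(blob, start, blob.Length - start)`. A short signature can also occur by chance inside header data, so a hit is a hint rather than proof.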

Nov 23 '07 #3

stucky

I recently had to fight this battle, and found no answers online. I normally don't write code at the bit level, so excuse any novice mistakes, but lacking a library to deal with OLE Files from Access I had to parse it as well as I could.

In C# I retrieve my document from Access (or SQL Server if the Access Database has been upsized) as a byte array. Access encapsulates files with its own header, which isn't an OLE file structure. It's something different...

byte[] doc = ld.GetSupportingDocument(docID);

MemoryStream ms = new MemoryStream(doc);

// The first two bytes are the signature of the Access wrapper.
int firstByte = ms.ReadByte();
int secondByte = ms.ReadByte();

// Reject the object if either signature byte is wrong.
if (firstByte != 0x15 || secondByte != 0x1C) {
    ErrorResponse("Stored object is not an Access File.");
    return;
}

The first two bytes are a signature: if they are not 0x15 and 0x1C, it's not an Access OLE file. The next short (two bytes) gives the offset at which the file type ends:

int fileTypeLoc = 20; // offset where the file type begins
short offset;         // offset where the file type ends

byte[] buffer = new byte[2];
ms.Read(buffer, 0, 2);
offset = BitConverter.ToInt16(buffer, 0);

Keeping track of how far I've read into the file, I store a portion of the bytes as a string, starting from byte offset 20 (0x14) and running up to the offset I retrieved in the previous block.

long seekTotal = 0;
seekTotal += offset;

string docType = String.Empty;
for (int i = fileTypeLoc; i < offset; i++) {
    docType += (char)doc[i];
}
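
The three steps so far (signature check, end offset, type name) can be gathered into one helper. This is only a sketch of the layout as described in this post: the names `AccessOleHeader` and `ReadTypeName` are hypothetical, little-endian byte order is assumed for the short at bytes 2-3, and trailing NUL characters are trimmed in case the name is stored with a terminator.

```csharp
using System;
using System.Text;

public static class AccessOleHeader
{
    private const int FileTypeLoc = 20; // start of the type name, per the post

    // Returns the type name stored in the wrapper header, or null when the
    // signature or the offsets do not match the layout described above.
    public static string ReadTypeName(byte[] doc)
    {
        if (doc == null || doc.Length < FileTypeLoc) return null;
        if (doc[0] != 0x15 || doc[1] != 0x1C) return null;

        // Bytes 2-3 hold the offset at which the type name ends
        // (assumed little-endian, assembled by hand).
        int end = doc[2] | (doc[3] << 8);
        if (end <= FileTypeLoc || end > doc.Length) return null;

        // Decode the name and drop any trailing NUL characters.
        return Encoding.ASCII.GetString(doc, FileTypeLoc, end - FileTypeLoc).TrimEnd('\0');
    }
}
```

Returning null rather than throwing lets the caller decide how to report an object that does not follow this layout.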

The next bit is how I'm figuring out what type of file it is, so that when I serve it via HTTP I can set the file name and content type properly. There's no real parsing going on in this block, with the exception of the Package type. A package can be anything: a zip file, a gif, a pdf, whatever. When you have a package, the original file name is stored in the Access header, so I read in the first 256 bytes (an arbitrary number I selected based on trial and error) and pluck the original file extension from them. Because of my database I have no worries that it's anything but a pdf, but if you can't guarantee that, you need to do a better job of parsing than I am.

bool packageIsPdf = false;
string ext = "dat";
string filename = "supporting-document";
string contentType = "application/octet-stream";

// Content types below are the registered media types for each format.
if (docType.Contains("Word.Document.8")) {
    ext = "doc";
    contentType = "application/msword";
} else if (docType.Contains("AcroExch.Document.7")) {
    contentType = "application/pdf";
    ext = "pdf";
} else if (docType.Contains("Package")) {
    // packages are generic and require more processing
    string packageBuffer = String.Empty;
    // stop early if the object is shorter than 256 bytes
    for (int i = 20; i < 256 && i < doc.Length; i++) {
        packageBuffer += (char)doc[i];
    }
    if (packageBuffer.Contains(".pdf")) {
        contentType = "application/pdf";
        ext = "pdf";
        packageIsPdf = true;
    } else if (packageBuffer.Contains(".zip")) {
        contentType = "application/zip";
        ext = "zip";
    } else {
        ext = "dat";
    }
} else if (docType.Contains("Excel.Sheet.8")) {
    ext = "xls";
    contentType = "application/vnd.ms-excel";
} else if (docType.Contains("PowerPoint.Show.8")) {
    ext = "ppt";
    contentType = "application/vnd.ms-powerpoint";
} else if (docType.Contains("Word.Document.12")) {
    ext = "docx";
    contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
} else if (docType.Contains("PowerPoint.Show.12")) {
    ext = "pptx";
    contentType = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
} else if (docType.Contains("Excel.Sheet.12")) {
    ext = "xlsx";
    contentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
}
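
The if/else chain above could also be written as a small lookup table, which keeps each type name, extension and content type on one row. The sketch below is just one way to lay that out: `OleTypeMap` and `Lookup` are invented names, the content types are the IANA-registered ones, and the Package case is left out on purpose because it needs the extra file-name inspection described above.

```csharp
using System;

public static class OleTypeMap
{
    // Each row: type name fragment, file extension, content type.
    private static readonly string[][] Known =
    {
        new string[] { "Word.Document.8", "doc", "application/msword" },
        new string[] { "AcroExch.Document.7", "pdf", "application/pdf" },
        new string[] { "Excel.Sheet.8", "xls", "application/vnd.ms-excel" },
        new string[] { "PowerPoint.Show.8", "ppt", "application/vnd.ms-powerpoint" },
        new string[] { "Word.Document.12", "docx",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
        new string[] { "PowerPoint.Show.12", "pptx",
            "application/vnd.openxmlformats-officedocument.presentationml.presentation" },
        new string[] { "Excel.Sheet.12", "xlsx",
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" }
    };

    // Sets ext and contentType from the first row whose fragment occurs in
    // docType; falls back to a generic binary download when none matches.
    public static void Lookup(string docType, out string ext, out string contentType)
    {
        foreach (string[] row in Known)
        {
            if (docType.Contains(row[0]))
            {
                ext = row[1];
                contentType = row[2];
                return;
            }
        }
        ext = "dat";
        contentType = "application/octet-stream";
    }
}
```

Adding support for another embedded type then means adding one row instead of another branch.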

Read 8 more bytes. These bytes should always be 01 05 00 00 02 00 00 00.

// magic eight bytes 01 05 00 00 02 00 00 00
ms.Seek(seekTotal, SeekOrigin.Begin);
buffer = new byte[8];
ms.Read(buffer, 0, 8);
seekTotal += 8;

Read the next long (a 4-byte integer) and skip ahead by that many bytes.

// Second offset to move to
buffer = new byte[4];
ms.Read(buffer, 0, 4);
seekTotal += 4;
long offset2 = BitConverter.ToInt32(buffer, 0);
seekTotal += offset2;
ms.Seek(seekTotal, SeekOrigin.Begin);

Read 8 empty bytes.

// eight empty bytes
buffer = new byte[8];
ms.Read(buffer, 0, 8);
seekTotal += 8;

The next long (again 4 bytes) tells you how many bytes long your encapsulated file is.

// next 4 bytes are the length of the file
buffer = new byte[4];
ms.Read(buffer, 0, 4);
seekTotal += 4;
long fileByteLength = BitConverter.ToInt32(buffer, 0);

The next N bytes consist of your file. Create a new buffer of this length and read from your memory stream into the buffer.

// next N bytes are the file
byte[] data = new byte[fileByteLength];

// store file bytes in data buffer
ms.Read(data, 0, Convert.ToInt32(fileByteLength));
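
The walk from the end of the type name to the file bytes can be sketched as a single method with bounds checks, so a truncated or unexpected object yields null instead of a partly filled buffer. This is an illustration of the steps described in this post, not a verified description of the format: `AccessOlePayload` and `Extract` are hypothetical names, and `typeNameEnd` stands for the `offset` value read earlier.

```csharp
using System;
using System.IO;

public static class AccessOlePayload
{
    // Follows the layout described in the post: 8 marker bytes, a 4-byte
    // skip count, the skipped region, 8 empty bytes, a 4-byte length, then
    // the file itself. Returns null when the data does not fit that layout.
    public static byte[] Extract(byte[] doc, int typeNameEnd)
    {
        if (doc == null || typeNameEnd < 0 || typeNameEnd > doc.Length) return null;

        using (MemoryStream ms = new MemoryStream(doc))
        using (BinaryReader reader = new BinaryReader(ms))
        {
            try
            {
                ms.Seek(typeNameEnd, SeekOrigin.Begin);
                reader.ReadBytes(8);               // 01 05 00 00 02 00 00 00
                int skip = reader.ReadInt32();     // second offset (little-endian)
                if (skip < 0 || skip > ms.Length - ms.Position) return null;
                ms.Seek(skip, SeekOrigin.Current);
                reader.ReadBytes(8);               // eight empty bytes
                int length = reader.ReadInt32();   // length of the embedded file
                if (length < 0 || length > ms.Length - ms.Position) return null;
                return reader.ReadBytes(length);
            }
            catch (EndOfStreamException)
            {
                // The object ended before a 4-byte field could be read.
                return null;
            }
        }
    }
}
```

`BinaryReader.ReadInt32` always reads little-endian, which matches what `BitConverter.ToInt32` does on the usual x86 hosts.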

If your file is a PDF, you have another headache to deal with: OLE2 Compound Files. I extract the pdf from the OLE2 file in another method, using the Gembox Compound File 1.1 library.

if (contentType == "application/pdf" && !packageIsPdf) {
    data = GetPdfFromOle(data, Convert.ToInt32(seekTotal), Convert.ToInt32(fileByteLength));
}

If everything went well, I can return the byte array to my Response object:

if (data == null) {
    Response.Write("Unable to retrieve file");
    Response.End();
}

string contentDisposition = String.Format("attachment; filename={0}.{1}", filename, ext);
Response.ContentType = contentType;
Response.AppendHeader("Content-Disposition", contentDisposition);
Response.BinaryWrite(data);
Response.End();

Here is a code sample that extracts the pdf from an OLE2 Compound file using Gembox's library:

private byte[] GetPdfFromOle(byte[] data, int offset, int length) {
    // The library loads from a file, so write the bytes to a temp file first.
    string tmpFileName = Path.GetTempFileName();

    FileStream fstmp = new FileStream(tmpFileName, FileMode.Create, FileAccess.Write, FileShare.None);
    fstmp.Write(data, 0, data.Length);
    fstmp.Close();

    byte[] pdfData = null;
    Ole2CompoundFile ole2file = new Ole2CompoundFile();

    try {
        ole2file.Load(tmpFileName, false);
        foreach (Ole2Stream entry in ole2file.Root) {
            log.Debug(entry.Name);

            // The pdf bytes live in the stream named "contents".
            if (entry.Name.ToLower() == "contents") {
                pdfData = entry.GetData();
                break;
            }
        }
    } catch (Exception ex) {
        ErrorResponse(ex.Message);
    } finally {
        // Release the compound file before deleting the temp file behind it.
        ole2file.Close();
        File.Delete(tmpFileName);
    }
    return pdfData;
}

Hope this helps.

Nov 29 '07 #4
