Bytes | Software Development & Data Engineering Community
Couldn't get equations in HTML when converting a Word .docx file to an HTML file in C#

I have a .NET C# application that extracts data from a Word file and saves it in a database. To store all of the Word content exactly as it is, I first convert the whole Word file to HTML and then check the HTML paragraphs one by one.
During the conversion from Word to HTML, the equations in the Word document are converted into images.
    // Requires: using System.IO; using Microsoft.Office.Interop.Word;
    Globals.ThisAddIn.Application.ActiveDocument.Select();
    Microsoft.Office.Interop.Word.Document doc = Globals.ThisAddIn.Application.ActiveDocument;

    string result = Path.GetTempPath();

    // Remember the original path so the document can be reopened afterwards.
    string tmpFileName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
    doc.SaveEncoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUSASCII;
    if (File.Exists(result + "temp.html"))
    {
        File.Delete(result + "temp.html");
    }
    // Save as filtered HTML, then close the document without saving changes.
    doc.SaveAs(result + "temp.html", WdSaveFormat.wdFormatFilteredHTML);

    doc.Close(Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges);

    // Round-trip the file through HtmlAgilityPack.
    HtmlAgilityPack.HtmlDocument mangledHTML = new HtmlAgilityPack.HtmlDocument();
    mangledHTML.Load(result + "temp.html");

    if (File.Exists(result + "newtemp.html"))
    {
        File.Delete(result + "newtemp.html");
    }

    mangledHTML.Save(result + "newtemp.html");

    // Remove standalone CRLF: keep blank-line breaks, join single line breaks.
    string badHTML = File.ReadAllText(result + "newtemp.html");
    badHTML = badHTML.Replace("\r\n\r\n", "ackThbbtt ");
    badHTML = badHTML.Replace("\r\n", " ");
    badHTML = badHTML.Replace("ackThbbtt ", "\r\n");
    // U+FFFD (replacement character) is the character that appears in the post.
    badHTML = badHTML.Replace('\uFFFD', ' ');
    if (File.Exists(result + "finaltemp.html"))
    {
        File.Delete(result + "finaltemp.html");
    }
    File.WriteAllText(result + "finaltemp.html", badHTML);

    // Clean up temp files; the finished result is in finaltemp.html.
    File.Delete(result + "temp.html");
    File.Delete(result + "newtemp.html");

    // Reopen the original document that was closed above.
    Microsoft.Office.Interop.Word.Document orignalDoc = Globals.ThisAddIn.Application.Documents.Open(tmpFileName);

Basically, I want to store each paragraph of the Word document separately in the database, together with all of its properties, such as font size, font width, font name and font style, so that my application can display it exactly as I wrote it in the Word document.
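
As a rough illustration of the per-paragraph data described above, here is a minimal sketch that reads each paragraph's text and font properties through the Word interop object model instead of going through HTML. `ParagraphRecord` and `ParagraphReader` are hypothetical names made up for this sketch, and it assumes a reference to Microsoft.Office.Interop.Word. Note that Word reports a single value only when the whole paragraph shares one format; mixed formatting comes back as an undefined value.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical record type for this sketch; not part of any library.
public class ParagraphRecord
{
    public string Text;
    public string FontName;
    public float FontSize;
    public bool? Bold;    // null when the paragraph mixes bold and non-bold text
    public bool? Italic;  // null when the paragraph mixes italic and non-italic text
}

public static class ParagraphReader
{
    // Word reports Bold and Italic as an int: -1 (true), 0 (false),
    // or another value when the range has mixed formatting.
    public static bool? ToNullableBool(int wordTriState)
    {
        if (wordTriState == -1) return true;
        if (wordTriState == 0) return false;
        return null;
    }

    // Collects one record per paragraph of the given document.
    public static List<ParagraphRecord> Read(Word.Document doc)
    {
        var records = new List<ParagraphRecord>();
        foreach (Word.Paragraph paragraph in doc.Paragraphs)
        {
            Word.Range range = paragraph.Range;
            records.Add(new ParagraphRecord
            {
                Text = range.Text,
                FontName = range.Font.Name,
                FontSize = range.Font.Size,
                Bold = ToNullableBool(range.Font.Bold),
                Italic = ToNullableBool(range.Font.Italic)
            });
        }
        return records;
    }
}
```

In the add-in this could be called as `ParagraphReader.Read(Globals.ThisAddIn.Application.ActiveDocument)`. Paragraphs with mixed formatting would need to be read run by run, which this sketch does not attempt.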

To display it as it is, I need to convert the document to HTML format; then, by separating the paragraphs, I can store them in the database. But when a paragraph in my Word document contains equations, the conversion code above turns every equation into an image, and because they are images I can't show the equations properly in my application.

So I tried to convert these equations to MathML form, but I couldn't get it to work.
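
One route that may be worth trying for the MathML conversion (it is not something the original code does) is to take the equation markup that Word stores as OMML (`m:oMath` elements) and run it through an XSLT stylesheet. The sketch below assumes the Open XML of a paragraph is available as a string, for example from the interop property `Range.WordOpenXML`, and that an OMML-to-MathML stylesheet is available on disk. Office installations typically ship one named OMML2MML.XSL, but its location varies by version, so `xslPath` is left as a parameter. `EquationExport` is a hypothetical class name.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

public static class EquationExport
{
    // Namespace of Office Math Markup Language (OMML) elements such as m:oMath.
    private static readonly XNamespace M =
        "http://schemas.openxmlformats.org/officeDocument/2006/math";

    // Returns every m:oMath element found in an Open XML string,
    // for example the value of Range.WordOpenXML for one paragraph.
    public static List<XElement> ExtractOmml(string wordOpenXml)
    {
        return XDocument.Parse(wordOpenXml).Descendants(M + "oMath").ToList();
    }

    // Transforms one m:oMath element using the stylesheet at xslPath
    // (assumed here to be an OMML-to-MathML stylesheet) and returns the result.
    public static string OmmlToMathMl(XElement omml, string xslPath)
    {
        var transform = new XslCompiledTransform();
        transform.Load(xslPath);

        var output = new StringWriter();
        using (XmlReader reader = omml.CreateReader())
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(reader, writer);
        }
        return output.ToString();
    }
}
```

For a paragraph obtained through interop, usage might look like `EquationExport.ExtractOmml(paragraph.Range.WordOpenXML)` followed by `OmmlToMathMl` on each result; the MathML strings could then be stored alongside the paragraph's HTML in place of the exported images.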
Apr 25 '24 #1