Parsing HTML to find format of each word

PhilOfWalton
1,430 Expert 1GB
I have a rich textbox on a form into which I type a Rich Text message. I am trying to determine how big (height and width) it needs to be to show the complete message.
To do this, I need to check the formatting of each piece of text in the message (Font, size, bold, italic .. I think that is all)
I have tried various methods – height is easy by having a report with a Rich text box on it with CanShrink & CanGrow set to True and seeing how high the text box on the report ends up. This doesn’t help with the width.
Method 2 which is showing promise is to have a second rich text box on the form into which I load the words from the first text box which have the same formatting. I then apply that format to the second box. I have routines that determine the height and width of the second box.
If I load each line into the second text box block by block (a block being a run of words with the same format) and measure the box for each block, the sum of the widths gives me the line length and the maximum height gives me the line height.
After all the text from the first Text Box has been analysed, the maximum width of the second box gives the width I am looking for, and the sum of the heights gives me the height I am looking for.

So the problem is how to extract the text from the first text box and how to read what its format is.
I have a lot of code that is unreliable. It only uses the font and size information, not bold and italic, and it fails when it comes across things like “&” or “*”.

Be grateful for any assistance

Phil
Mar 15 '16 #1
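
The arithmetic described in the post above can be sketched in a few lines. This is a language-neutral illustration in Python, not the poster's VBA; the names `line_size` and `box_size` and the sample measurements are made up, and the units are whatever the measuring routine returns (twips in Access).

```python
from typing import List, Tuple

# Each block is (width, height) as measured in the second text box.
# The numbers used below are invented purely for illustration.
Block = Tuple[int, int]


def line_size(blocks: List[Block]) -> Block:
    """A line is as wide as the sum of its blocks and as tall as its tallest block."""
    width = sum(w for w, _ in blocks)
    height = max((h for _, h in blocks), default=0)
    return width, height


def box_size(lines: List[List[Block]]) -> Block:
    """The box is as wide as its widest line and as tall as the sum of its line heights."""
    sizes = [line_size(blocks) for blocks in lines]
    width = max((w for w, _ in sizes), default=0)
    height = sum(h for _, h in sizes)
    return width, height


lines = [
    [(900, 300), (750, 300)],  # two differently formatted blocks on one line
    [(2400, 450)],
    [(4100, 600)],
]
print(box_size(lines))  # (4100, 1350)
```

Note that this only holds if no line wraps; once the box is narrower than the widest line, the line count changes and so does the height.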
jforbes
1,107 Expert 1GB
I'm still a little unclear on this.

First question is why are you doing all this? This is some pretty serious code to be writing, and it could be that with a different approach to your problem you can accomplish what you want without all the headache.

Second, what does the code you are developing look like? It's hard to fully grasp what you are trying to accomplish talking in abstract terms when you have actual code that you are referring to.

I've seen some code somewhere, I think here on Bytes, that creates a Report with a Rich TextBox to find out how much space some sample text would need to be fully painted on the screen, but I haven't looked at it much so I don't know if that would be helpful. ...This may be the code you are using.

Lastly, a sample of the data you are attempting to wrangle could be helpful. It's hard to give advice on how to break up some data without some frame of reference.
Mar 16 '16 #2
NeoPa
32,556 Expert Mod 16PB
I had a question a while back on RichText TextBoxes. I can't recall all the details, and I never use them myself, but I seem to recall that the tags of the value are included in the .Text property while the displayed data only is stored in .Value.

Hopefully you should be able to read the .Text value and determine the attributes of the enclosed text.
Mar 17 '16 #3
PhilOfWalton
1,430 Expert 1GB
The ultimate goal is to produce a Rich Text version of the standard Msgbox. I am reasonably close to this, but still some bugs to iron out. At the moment, I have a form RichBox and a module that you can drop into any database and use find & replace Msgbox with RichMsgBox. This accepts all the normal parameters of the MsgBox and works normally with the additional advantage that one can copy and paste the displayed message.
However instead of using standard text for the Prompt part of a standard Msgbox, one can use Rich Text for the prompt part of the RichMsgBox.

My problem is working out the correct size of the RichBox. I have one method which works reasonably well provided the fonts used are all of similar size, but if a person wants some fonts at size 10 and others at size 36, the basic calculation fails.

I have a second form (MsgboxDesigner), used only by the programmer, that sets up the Rich Text message, the various buttons, title, Help file etc., and on pressing a command button, shows the RichBox as it will appear in your project, and generates code which can be pasted directly into your project to give a rich text Msgbox.
Hence the original question.

On the MsgboxDesigner form, I have a Rich text box where the message is written and formatted and a second Rich Text Box. I believe that if I can load each bit of the text from the main Rich Text Box into the second box, and apply the formatting of that text to the second box, I can calculate the size of the second box. So on each change of font for each bit of text in the main message, I load the second box with the text. A bit of arithmetic to add widths and heights of the second box should give me the size that I need.

Your remark about using a report to get the size of a Rich Text box works perfectly for getting the height, but gives no information on the width.

Here is the Default Value of the main Rich Text Box
="<div><Font Face=""Comic Sans MS""> <font size=3>abcde&amp;</font><font face=""Forte"" size=3 color=red>fghij</font></div>

<div><font face=""Bodoni MT Black"" size=5 color=#C3D69B><strong>&nbsp;This is line 1</strong></font></div>

<div><font face=""Bauhaus 93"" size=6 color=black>This is line 2. It's quite long </font></div>

<div>&nbsp;</div>"

The output I would like to see is something like
Text                            | Font              | Size | Bold | Italic
abcde&                          | "Comic Sans MS"   | 12   | No   | No
fghij                           | "Forte"           | 12   | No   | No
This is line 1                  | "Bodoni MT Black" | 18   | Yes  | No
This is line 2. It's quite long | "Bauhaus 93"      | 24   | No   | No

I have routines to convert HTML sizes into points.
I still need to do some work on the line feeds.

Phil
Mar 17 '16 #4
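
One way to get from the Default Value above to the table of text, font, size, bold and italic is a small tag-stack parser. The sketch below is in Python rather than VBA and is only an illustration of the logic: `Run`, `RichTextRuns` and `parse_runs` are made-up names, the size-to-points table is an assumed common mapping (it agrees with 3 = 12, 5 = 18, 6 = 24 in the post), a missing size is assumed to mean 3, and formatting is assumed to reset at every `</div>`, which also copes with the unclosed outer font tag on the first line. Whitespace-only runs (including the blank last line) are skipped here, so a real routine would still need to account for their height.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Assumed mapping from <font size=N> to points; not taken from Access documentation.
SIZE_TO_POINTS = {1: 8, 2: 10, 3: 12, 4: 14, 5: 18, 6: 24, 7: 36}


@dataclass
class Run:
    line: int      # 1-based <div> number
    text: str
    face: str
    points: int
    bold: bool
    italic: bool


class RichTextRuns(HTMLParser):
    def __init__(self, default_face="", default_size=3):
        super().__init__()  # the default convert_charrefs=True turns &amp; into &
        self.default = (default_face, default_size)
        self.fonts = []     # stack of (face, size) for currently open <font> tags
        self.bold = 0
        self.italic = 0
        self.line = 0
        self.runs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lower-cased
        if tag == "div":
            self.line += 1
        elif tag == "font":
            # A nested <font> inherits whatever it does not override.
            face, size = self.fonts[-1] if self.fonts else self.default
            face = attrs.get("face") or face
            if (attrs.get("size") or "").isdigit():
                size = int(attrs["size"])
            self.fonts.append((face, size))
        elif tag in ("strong", "b"):
            self.bold += 1
        elif tag in ("em", "i"):
            self.italic += 1

    def handle_endtag(self, tag):
        if tag == "div":
            # Assumption: formatting never carries across lines, so drop unclosed tags.
            self.fonts.clear()
            self.bold = self.italic = 0
        elif tag == "font" and self.fonts:
            self.fonts.pop()
        elif tag in ("strong", "b"):
            self.bold = max(0, self.bold - 1)
        elif tag in ("em", "i"):
            self.italic = max(0, self.italic - 1)

    def handle_data(self, data):
        text = data.replace("\xa0", " ")  # &nbsp; becomes an ordinary space
        if not text.strip():
            return  # skip whitespace-only runs in this sketch
        face, size = self.fonts[-1] if self.fonts else self.default
        points = SIZE_TO_POINTS.get(size, 12)
        self.runs.append(Run(self.line, text, face, points, self.bold > 0, self.italic > 0))


def parse_runs(html):
    parser = RichTextRuns()
    parser.feed(html)
    parser.close()
    return parser.runs


# The sample from the post, with the doubled quotes of the Default Value un-doubled.
SAMPLE = (
    '<div><Font Face="Comic Sans MS"> <font size=3>abcde&amp;</font>'
    '<font face="Forte" size=3 color=red>fghij</font></div>\n'
    '<div><font face="Bodoni MT Black" size=5 color=#C3D69B>'
    '<strong>&nbsp;This is line 1</strong></font></div>\n'
    '<div><font face="Bauhaus 93" size=6 color=black>This is line 2. It\'s quite long </font></div>\n'
    '<div>&nbsp;</div>'
)

for run in parse_runs(SAMPLE):
    print(run)
```

Because the parser decodes entities before the text reaches `handle_data`, characters such as "&" do not need special handling. The same stack idea should port to VBA with a hand-written tag scanner.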
jforbes
1,107 Expert 1GB
If all you want is a slightly fancier Message Box, you might want to look at this first: https://bytes.com/topic/access/insig...-message-boxes

A pretty message box would be greatly welcomed, but I think there might be a fundamental flaw with your approach. I don't think that you will ever definitively be able to determine the size of a TextBox that you need. At least not practically without writing the Message Box from scratch (or inherited from Windows). Examples of the problems that I don't see having a practical solvable answer are:
  • Word Wrap - If a line breaks multiple times due to Word Wrapping, I don't think you're going to find a way to know where the line would break. So you will never know where all the breaks occur, making it impossible to determine the line lengths and, more importantly, how many lines are needed.
  • Mixed Fonts and Font Sizes on the same line. I think from your sample text, "abcde" and "fghij" are on the same line, or at least could be. And they could be in different font sizes.
I'm not saying it can't be done, but I don't think it will be reliable short of creating your own control and creating your own onPaint Event and using DrawText yourself.

A much less bothersome approach is to display the text to the user in a TextBox with the ScrollBar property set to Vertical. It would be much less work. If you wanted, you could support multiple Form sizes as a parameter or write a less in-depth routine to estimate the size needed. I think your users would feel at home seeing a MsgBox of this sort. ...if you really wanted to get fancy, you could use a Web Browser instead and supply HTML with Hyperlinks and Pictures. =)
Mar 18 '16 #5
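
The "less in-depth routine to estimate the size needed" mentioned above could look something like the following sketch. It is Python for illustration only, and everything in it is an assumption: `estimate_width` guesses a glyph width of half the point size per character, `wrap_lines` breaks greedily at word boundaries, and `estimate_height` uses a leading factor of 1.2. The real control measures glyphs properly and may break lines differently, which is exactly the reliability problem described in the post.

```python
def estimate_width(text, points, avg_char_em=0.5):
    """Crude width in points: character count x point size x an assumed average glyph width."""
    return len(text) * points * avg_char_em


def wrap_lines(words, max_width):
    """Greedy wrap. words is a list of (text, points); returns a list of lines of words."""
    lines, current, used = [], [], 0.0
    for text, points in words:
        width = estimate_width(text + " ", points)  # count the trailing space
        if current and used + width > max_width:
            lines.append(current)
            current, used = [], 0.0
        current.append((text, points))
        used += width
    if current:
        lines.append(current)
    return lines


def estimate_height(lines, leading=1.2):
    """Each line is as tall as its largest font times an assumed leading factor."""
    return sum(max(points for _, points in line) * leading for line in lines)


words = [(w, 24) for w in "This is line 2. It's quite long".split()]
lines = wrap_lines(words, max_width=200)
print(len(lines))  # 2
print([[text for text, _ in line] for line in lines])
```

An estimate like this is only good enough to pick a sensible starting form size; a vertical scroll bar or a resizable box covers the cases where it guesses wrong.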
PhilOfWalton
1,430 Expert 1GB
Thanks for your various inputs. I have now come up with a reasonable solution. Basically, if there is no great variation in font sizes, the rich message box comes up at a sensible size to accommodate the message.
With the MessageBoxDesigner form, mentioned earlier, I write my message (I can now incorporate variables from the project the message box will finally be used in), set up buttons, title, help file etc. etc.
On pressing the test button, the Rich MessageBox is displayed exactly as it will finally look (other than any variables that will be incorporated). If the size is unacceptable, I can drag the bottom corner of the box that the original message has been typed in to make it bigger or smaller, and the final Rich MessageBox comes out the same size.
If anyone would like a copy to evaluate, please let me know. All suggestions for improvements gratefully received.
Phil
Mar 20 '16 #6
