RegExp split for Spell Check

Hi,
I am working on a spell checker for my rich-text editor. I cannot use
any open source, and must develop everything myself. I need a RegExp
pattern to split text into a word array. I have been doing it by
splitting on spaces or <p> tags. I run into a problem with the
rich-text part of my editor. When I change the font, it wraps the text
in a tag. The tag looks something like <font face="arial">some words</font>.
This splits the text at font^face, so I need to split on spaces
unless they are within the HTML tag. I am just looking for the
pattern for my RegExp. I know there may be better ways for me to do
it, but right now I just need help with this issue.

Thanks in advance.

Pete
Nov 23 '07 #1
SmokeWilliams wrote on 23 nov 2007 in comp.lang.javascript:
> I am working on a Spell checker for my richtext editor.
> I cannot use any open source, and must develop everything myself.

Why? At least look at all the code you can find. Coming up with complex
code from scratch does not give you the benefit of years of code
experimentation by the collective of the world's programmers.

> I need a RegExp pattern to split text into a word array.

Why? Does it matter how you do it? Parsing seems so much simpler.

> I have been doing it by
> splitting by spaces or <p> tags. I run into a problem with the
> richtext part of my editor. When I change the font, it wraps the text
> in a tag.
> the tag has something like <font face="arial">some words</font>

That is last century's code. Why not use <span> and CSS exclusively?

> This splits the text at font^face so I need to split on spaces
> unless they are within the HTML tag.
> I am just looking for the pattern for my regExp.
> I know there may be better ways for me to do
> it, but right now I just need help with this issue.

I think that by stipulating the above unnecessary constraints, you will
get yourself into much trouble.

However try this:

var wordArrray = textString.replace(/(<[^>]*>)/g,' ').split(/\s+/)
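
A minimal sketch of what that one-liner does, wrapped in a function so it can be tried on the font example from the question. The name wordsFromHtml and the extra filter step are assumptions added here for illustration; they are not part of the suggestion above.

```javascript
// Replace every tag with a space, then split on runs of whitespace.
function wordsFromHtml(html) {
  return html
    .replace(/<[^>]*>/g, ' ')   // each tag becomes a single space
    .split(/\s+/)               // split on one or more whitespace characters
    .filter(Boolean);           // drop empty strings left at the edges
}

const sample = '<font face="arial">some words</font>';
console.log(wordsFromHtml(sample)); // [ 'some', 'words' ]
```

Because the tags are replaced before splitting, the positions of the words in the original string are lost, which matters if the checker has to write corrections back into the markup.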

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Nov 23 '07 #2
Evertjan. said the following on 11/23/2007 1:49 PM:
> SmokeWilliams wrote on 23 nov 2007 in comp.lang.javascript:
> <snip>
>> the tag has something like <font face="arial">some words</font>
>
> That is last century's code. Why not use <span> and CSS exclusively?

Because that is what the browsers put in the code in a contentEditable
element :)

--
Randy
Chance Favors The Prepared Mind
comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/
Nov 23 '07 #3
Randy Webb wrote on 23 nov 2007 in comp.lang.javascript:
> Evertjan. said the following on 11/23/2007 1:49 PM:
>> SmokeWilliams wrote on 23 nov 2007 in comp.lang.javascript:
>
> <snip>
>>> the tag has something like <font face="arial">some words</font>
>>
>> That is last century's code. Why not use <span> and CSS exclusively?
>
> Because that is what the browsers put in the code in a contentEditable
> element :)

So why use contentEditable if you cannot control it?

Wouldn't a simple <div> with onkeypress do?

--
Evertjan.
Nov 23 '07 #4
In comp.lang.javascript message <Xn********************@194.109.133.242>
, Fri, 23 Nov 2007 18:49:24, Evertjan. <ex**************@interxnl.net>
posted:
> However try this:
>
> var wordArrray = textString.replace(/(<[^>]*>)/g,' ').split(/\s+/)

If the page contains <script>...</script> then ISTM that the script
will be spell-checked; likewise the content of any textarea and possibly
others.

Could one write the full text to a page or div as HTML (useful anyway)
and read it back as .innerText for spell-checking ?
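
The script and textarea concern can be illustrated without a browser. The sketch below is a regex-only approximation, not the write-to-a-div-and-read-innerText route suggested above: it removes whole script, style and textarea elements before stripping the remaining tags. The function name visibleWords and the list of excluded elements are assumptions for illustration, and the non-greedy quantifier may not be supported by very old script engines.

```javascript
// Drop elements whose content is not prose, then strip the remaining tags.
function visibleWords(html) {
  return html
    // remove the whole element, content included (non-greedy match)
    .replace(/<(script|style|textarea)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // replace every remaining tag with a space
    .replace(/<[^>]*>/g, ' ')
    .split(/\s+/)
    .filter(Boolean);
}

const page = '<p>Hello wrld</p><script>var x = 1;</script>' +
             '<textarea>raw input</textarea>';
console.log(visibleWords(page)); // [ 'Hello', 'wrld' ]
```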

--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links;
Astro stuff via astron-1.htm, gravity0.htm ; quotings.htm, pascal.htm, etc.
No Encoding. Quotes before replies. Snip well. Write clearly. Don't Mail News.
Nov 23 '07 #5
Evertjan. said the following on 11/23/2007 6:16 PM:
> Randy Webb wrote on 23 nov 2007 in comp.lang.javascript:
>> Evertjan. said the following on 11/23/2007 1:49 PM:
>>> SmokeWilliams wrote on 23 nov 2007 in comp.lang.javascript:
>>> <snip>
>>>> the tag has something like <font face="arial">some words</font>
>>> That is last century's code. Why not use <span> and CSS exclusively?
>> Because that is what the browsers put in the code in a contentEditable
>> element :)
>
> So why use contentEditable if you cannot control it?

That is basically all it is. It isn't so much the contentEditable that
does it but rather the built-in functions (most notably in IE) that do
the formatting. I haven't messed with it in a long time but I do
remember that the styling of text was horrible.

--
Randy
Nov 24 '07 #6
Dr J R Stockton said the following on 11/23/2007 3:01 PM:
> In comp.lang.javascript message <Xn********************@194.109.133.242>
> , Fri, 23 Nov 2007 18:49:24, Evertjan. <ex**************@interxnl.net>
> posted:
>> However try this:
>>
>> var wordArrray = textString.replace(/(<[^>]*>)/g,' ').split(/\s+/)
>
> If the page contains <script>...</script> then ISTM that the script
> will be spell-checked; likewise the content of any textarea and possibly
> others.
>
> Could one write the full text to a page or div as HTML (useful anyway)
> and read it back as .innerText for spell-checking ?

The idea of spell-checking, in the sense of a true spell-checker, is
almost impossible to implement in a browser due to the inherent size of
the dictionary that you must use.

--
Randy
Nov 24 '07 #7
In comp.lang.javascript message <Xp*********************@giganews.com>,
Fri, 23 Nov 2007 19:57:39, Randy Webb <Hi************@aol.com> posted:
>
> The idea of spell-checking, in the sense of a true spell-checker is
> almost impossible to implement in a browser due to the inherent size of
> the dictionary that you must use.

Fifty thousand words is sufficient for ordinary use. I have to hand a
"Universal" pocket dictionary of a language resembling English, with 407
pages of two columns of about 15 words each; so about a quarter of that
size. The Little Oxford dictionary, 606 * 2 * 20, is about 25000 words.

I have to hand the New Testament in Basic English; its Note refers to
Basic English having 850 words, and to the NT using another 150
particular to the topic. It lacks the richness of the King James
version; but the text looks quite normal.

A spell-checker for use by the younger half of school-children would not
need very many words.

An alphabetical list of words, compressed, should not need much more
than two bytes per word.

So the list of words need be no longer than my largest Web page,
currently 105000 bytes; and that's quite acceptable over broadband if
expected and cached properly.

There should be plenty of room to store such data in Javascript, from
what I've read here in other threads.

Lookup needs be no faster than typing, and properly implemented should
need only O(log2(N)) comparisons when using the main dictionary. It
would seem a potentially smart move to cache in a sub-dictionary the
words actually already seen (right or wrong) in the current text, since
words are often repeated. FAQ 2.3 contains about 675 words, but only
about 343 different ones. One third of its words are in the Top 8, "the
to and of in not a is".
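
The lookup described above can be sketched as a binary search over a sorted word array, with a small cache for words already seen in the current text. The tiny word list and the names inDictionary, seen and isKnown are assumptions made for illustration; a real list would be loaded from a file.

```javascript
// Hypothetical sorted word list standing in for the main dictionary.
const dictionary = ['a', 'and', 'in', 'is', 'not', 'of', 'the', 'to'];

// Binary search over a sorted array: about log2(N) comparisons per lookup.
function inDictionary(sortedWords, word) {
  let lo = 0;
  let hi = sortedWords.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (sortedWords[mid] === word) return true;
    if (sortedWords[mid] < word) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}

// Sub-dictionary of words already seen in this text, right or wrong.
const seen = new Map();

function isKnown(word) {
  const key = word.toLowerCase();
  if (!seen.has(key)) seen.set(key, inDictionary(dictionary, key));
  return seen.get(key);
}

console.log(isKnown('The')); // true
console.log(isKnown('teh')); // false
```

Repeated words such as "the" hit the cache after the first lookup, so the main dictionary is searched only once per distinct word.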

The sub-dictionary can be pre-loaded with the commonest good and bad
spellings, if that helps.

Of course, it would be quite wrong to impose the full OED on an
unsuspecting dial-up user.

--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
Do not Mail News to me. Before a reply, quote with ">" or "" (SonOfRFC1036)
Nov 24 '07 #8
Dr J R Stockton said the following on 11/24/2007 4:32 PM:
> In comp.lang.javascript message <Xp*********************@giganews.com>,
> Fri, 23 Nov 2007 19:57:39, Randy Webb <Hi************@aol.com> posted:
>> The idea of spell-checking, in the sense of a true spell-checker is
>> almost impossible to implement in a browser due to the inherent size of
>> the dictionary that you must use.
>
> Fifty thousand words is sufficient for ordinary use. I have to hand a
> "Universal" pocket dictionary of a language resembling English, with 407
> pages of two columns of about 15 words each; so about a quarter of that
> size. The Little Oxford dictionary, 606 * 2 * 20, is about 25000 words.

I found a text file after looking for almost an hour. It has 213,558
words in it. The text file is 2.4 MB. The biggest problem with even a
25,000 word dictionary is going to be lookup time. That can be helped a
lot by splitting it up into 26 dictionaries by beginning letter.

Too bad I can't look up half of those words to find out what they
mean. What the heck is a zakkeu?
--
Randy
Nov 25 '07 #9
Randy Webb said the following on 11/26/2007 8:26 PM:

> <snip>
> if(Dic['word here'])

Testing that with a 215,000 word dictionary, the results were almost
instantaneous. It did tell me that the word list I had is pretty useless
since it didn't have the word "test" in it.
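
A minimal sketch of that kind of property lookup, assuming the word list is already available as an array. The three-word list and the helper name isWord are made up for illustration. Creating the object with Object.create(null) is a choice made here so that inherited names such as "constructor" are not reported as dictionary words, which can happen with a plain object literal.

```javascript
// Hypothetical word list; a real one would hold thousands of entries.
const wordList = ['apple', 'banana', 'test'];

// An object with no prototype, used purely as a lookup table.
const Dic = Object.create(null);
for (const word of wordList) {
  Dic[word] = true;
}

// Property access on the table is a hashed lookup in current engines.
function isWord(word) {
  return Dic[word.toLowerCase()] === true;
}

console.log(isWord('Test'));        // true
console.log(isWord('zakkeu'));      // false
console.log(isWord('constructor')); // false
```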
--
Randy
Nov 27 '07 #10
In comp.lang.javascript message <D4*********************@giganews.com>,
Wed, 28 Nov 2007 00:05:18, Randy Webb <Hi************@aol.com> posted:
>
> I guessed at the 500kb based on 215,000 entries being 4.5Mb. Creating a
> test file with 25,000 entries in it where each entry is 6 characters
> long - to create an "average" word length - the file was 439Kb so I
> wasn't far off. Of course, the actual size would depend on the 25,000
> words you used.

25000 6-character words in 7-bit ASCII, with CRLF separators, need
exactly 200 kB (25000 * 8 bytes). It may use more if created in Word,
or if encoded in a manner allowing letters other than A-Z.

For ordinary English, one only needs A to Z, "-", "'" and a separator, so
5-bit characters could be used by mere packing: 25000*5*6/8 is under
100 kbytes, before any additional compression.
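
The arithmetic can be checked directly. The first two figures are the ones given above; the third is an added assumption, showing what the packed size would be if one 5-bit separator symbol were stored after every word.

```javascript
// Size estimates for a word list; the counts are those quoted in the post.
const wordCount = 25000;
const lettersPerWord = 6;

// 7-bit ASCII stored one character per byte, plus CR and LF after each word.
const asciiBytes = wordCount * (lettersPerWord + 2);

// 5 bits per letter, packed with no separators (the 25000*5*6/8 figure).
const packedBytes = wordCount * lettersPerWord * 5 / 8;

// Assumption: one extra 5-bit separator symbol stored after every word.
const packedWithSeparators = wordCount * (lettersPerWord + 1) * 5 / 8;

console.log(asciiBytes);           // 200000
console.log(packedBytes);          // 93750
console.log(packedWithSeparators); // 109375
```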

Of course, dictionary words are longer than the average.

--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Nov 28 '07 #11
This is exactly what I was afraid of. I know it isn't the best
solution. I know there are better ways. I need a pattern to be used
only in the split because I need to maintain the length of the
string. So again, does anyone know how to make a pattern that splits
text on spaces or carriage returns? "\r| " is the split I am using
now. But as I stated above, I need to ignore the spaces within HTML
tags. Please help me. Just the simple pattern will do. Thanks.

Pete
Dec 3 '07 #12
Hello Evertjan, thanks for replying.
> However try this:
>
> var wordArrray = textString.replace(/(<[^>]*>)/g,' ').split(/\s+/)

I need a pattern that will split without replacing. So I need to
split on spaces or carriage returns, but not spaces that are within
html tags. I know there are better ways, but I am using an IFrame in
IE and I work for a government agency which doesn't allow me to use
open source. I am depending on a RegEx wizard out there to supply me
with the pattern.

So I need a pattern that matches any space or carriage return that is
not within an html tag.

<font face="arial" size=2>test</font><p>yo this is a test

Splitting this text should return an array containing:
1: <font face="arial" size=2>test</font>
2: yo
3: this
4: is
5: a
6: test

Thanks for your help.

Pete

Dec 3 '07 #13
pr
SmokeWilliams wrote:
> <font face="arial" size=2>test</font><p>yo this is a test
>
> Splitting this text should return an array containing:
> 1: <font face="arial" size=2>test</font>
> 2: yo
> 3: this
> 4: is
> 5: a
> 6: test

Try:

var s = '<font face="arial" size=2>test</font><p>yo this is a test';
alert(s.replace(/\s(?=[^<]*>)/g, "~").split(/<p>|\s/).join("\n"));

You can either replace the '~'s or leave them in; either way, your
string lengths are the same as the original HTML (as long as you clear
up the <p> != whitespace issue).
Dec 4 '07 #14
SmokeWilliams wrote:
> I need a pattern that will split without replacing. So I need to
> split on spaces or carriage returns, but not spaces that are within
> html tags. I know there are better ways, but I am using an IFrame in
> IE and I work for a government agency which doesn't allow me to use
> open source. I am depending on a RegEx wizard out there to supply me
> with the pattern.
>
> So I need a pattern that matches any space or carriage return that is
> not within an html tag.
>
> <font face="arial" size=2>test</font><p>yo this is a test
>
> Splitting this text should return an array containing:
> 1: <font face="arial" size=2>test</font>
> 2: yo
> 3: this
> 4: is
> 5: a
> 6: test

Suppose you have

var s = '<font face="arial" size=2>test</font><p>yo this is a test';

Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
or (which is more likely) instead you want the resulting array to be

['', 'test', '', 'yo', 'this', 'is', 'a', 'test']

This could be achieved by using tags as additional delimiters:

var a = s.split(/<[^>]+>|\s+/);

Microsoft JScript will not include the empty strings in the array.
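
A short sketch of that split, followed by a filter that removes the empty strings so the result no longer depends on how a particular engine treats them. The filter step is an addition for illustration and is not part of the pattern above.

```javascript
const s = '<font face="arial" size=2>test</font><p>yo this is a test';

// Tags and whitespace runs both act as delimiters.
const parts = s.split(/<[^>]+>|\s+/);
console.log(parts);
// [ '', 'test', '', 'yo', 'this', 'is', 'a', 'test' ]

// Dropping the empty strings leaves only the words to be checked.
const words = parts.filter(function (part) {
  return part !== '';
});
console.log(words);
// [ 'test', 'yo', 'this', 'is', 'a', 'test' ]
```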
PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann
Dec 4 '07 #15
pr
Thomas 'PointedEars' Lahn wrote:
> pr wrote:
>> alert('<font face="arial" size=2>test</font><p>yo this is a
>> test'.replace(/\s(?=[^<]*>)/g, "~").split(/<p>|\s/).join("\n"));
>        ^^^^^^^
> | I need a pattern that will split without replacing.

Hanged if I can think of a good reason why, but well spotted, Thomas,
this is more efficient in any case:

var s = '<font face="arial" size=2>test</font><p>yo this is a test';
alert(s.split(/\s(?![^<]*>)|<p>/).join("\n"));
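
The lookahead version can be tried outside a browser by logging the array instead of calling alert. This is only a sketch of the pattern above applied to the sample string; the variable names are chosen here for illustration. One caveat, stated as an assumption about the input: a literal ">" in the text (as in "a > b") would make the space before it look as if it were inside a tag.

```javascript
const html = '<font face="arial" size=2>test</font><p>yo this is a test';

// Split on whitespace that is NOT followed by "...>" without an
// intervening "<" (that is, not inside a tag), or on a <p> tag.
const pieces = html.split(/\s(?![^<]*>)|<p>/);

console.log(pieces);
// [ '<font face="arial" size=2>test</font>', 'yo', 'this', 'is', 'a', 'test' ]
```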
Dec 4 '07 #16
Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
> SmokeWilliams wrote:
>> I need a pattern that will split without replacing. So I need to
>> split on spaces or carriage returns, but not spaces that are within
>> html tags. I know there are better ways, but I am using an IFrame in
>> IE and I work for a government agency which doesn't allow me to use
>> open source. I am depending on a RegEx wizard out there to supply me
>> with the pattern.
>>
>> So I need a pattern that matches any space or carriage return that is
>> not within an html tag.
>>
>> <font face="arial" size=2>test</font><p>yo this is a test
>>
>> Splitting this text should return an array containing:
>> 1: <font face="arial" size=2>test</font>
>> 2: yo
>> 3: this
>> 4: is
>> 5: a
>> 6: test
>
> Suppose you have
>
> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
>
> Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
> or (which is more likely) instead you want the resulting array to be
>
> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']

No, that is not what he said. Perhaps you should try reading what he
wrote and the intended results. Your "solution" leaves out the 1. listed
above.

--
Randy
Dec 4 '07 #17
Randy Webb wrote:
> Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
>> SmokeWilliams wrote:
>>> I need a pattern that will split without replacing. So I need to
>>> split on spaces or carriage returns, but not spaces that are within
>>> html tags. I know there are better ways, but I am using an IFrame in
>>> IE and I work for a government agency which doesn't allow me to use
>>> open source. I am depending on a RegEx wizard out there to supply me
>>> with the pattern.
>>>
>>> So I need a pattern that matches any space or carriage return that is
>>> not within an html tag.
>>>
>>> <font face="arial" size=2>test</font><p>yo this is a test
>>>
>>> Splitting this text should return an array containing:
>>> 1: <font face="arial" size=2>test</font>
>>> 2: yo
>>> 3: this
>>> 4: is
>>> 5: a
>>> 6: test
>> Suppose you have
>>
>> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
>>
>> Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
>> or (which is more likely) instead you want the resulting array to be
>>
>> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
>
> No, that is not what he said.

Are you stupid or what? I *know* that this is not what he said. However, I
don't think he really knows what he wants. Because it does not make sense
for a spell checker in a structural editor to ignore HTML element content.
And therefore, I posted my solution as it is.

> Perhaps you should try reading what he wrote

You should read what I wrote, not what you wanted me to have written.

So much for reading.

> and the intended results. Your "solution" leaves out the 1. listed
> above.

I know.
PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16
Dec 4 '07 #18
Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
> Randy Webb wrote:
>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
>>> SmokeWilliams wrote:
>>>> I need a pattern that will split without replacing. So I need to
>>>> split on spaces or carriage returns, but not spaces that are within
>>>> html tags. I know there are better ways, but I am using an IFrame in
>>>> IE and I work for a government agency which doesn't allow me to use
>>>> open source. I am depending on a RegEx wizard out there to supply me
>>>> with the pattern.
>>>>
>>>> So I need a pattern that matches any space or carriage return that is
>>>> not within an html tag.
>>>>
>>>> <font face="arial" size=2>test</font><p>yo this is a test
>>>>
>>>> Splitting this text should return an array containing:
>>>> 1: <font face="arial" size=2>test</font>
>>>> 2: yo
>>>> 3: this
>>>> 4: is
>>>> 5: a
>>>> 6: test
>>> Suppose you have
>>>
>>> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
>>>
>>> Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
>>> or (which is more likely) instead you want the resulting array to be
>>>
>>> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
>> No, that is not what he said.
>
> Are you stupid or what?

If imitation is the sincerest form of flattery, you flatter the shit out
of me sometimes.

> I *know* that this is not what he said. However, I don't think he really
> knows what he wants.

He knows *exactly* what he wants, he just isn't sure how to implement
it. Subtle difference my friend.

EOD.

--
Randy
Dec 4 '07 #19
Randy Webb wrote:
> Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
>> Randy Webb wrote:
>>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
>>>> SmokeWilliams wrote:
>>>>> I need a pattern that will split without replacing. So I need to
>>>>> split on spaces or carriage returns, but not spaces that are within
>>>>> html tags. [...]
>>>>>
>>>>> So I need a pattern that matches any space or carriage return that is
>>>>> not within an html tag.
>>>>>
>>>>> <font face="arial" size=2>test</font><p>yo this is a test
>>>>>
>>>>> Splitting this text should return an array containing:
>>>>> 1: <font face="arial" size=2>test</font>
>>>>> 2: yo
>>>>> 3: this
>>>>> 4: is
>>>>> 5: a
>>>>> 6: test
>>>> Suppose you have
>>>>
>>>> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
>>>>
>>>> Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
>>>> or (which is more likely) instead you want the resulting array to be
>>>>
>>>> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
>>> No, that is not what he said.
>> Are you stupid or what?
>
> If imitation is the sincerest form of flattery, you flatter the shit out
> of me sometimes.

I have tried your lower level of language this time so that you may better
understand me. Obviously, I am not very good at it. Sorry.

>> I *know* that this is not what he said. However, I don't think he really
>> knows what he wants.
>
> He knows *exactly* what he wants,

No, he does not. He has the idea of a spell checker and the problem that he
can not simply split on whitespace because of whitespace in HTML tags:

>>>>> So I need a pattern that matches any space or carriage return that is
>>>>> not within an html tag.

But his example says otherwise. So I assumed that what he really wants
is to exclude the tags from consideration which leaves only the plain text
for the spell check. And that my solution allows. It is also a solution
that works with any script engine that supports regular expressions, while
solutions including negative lookahead or non-greedy matching do not.
However, these solutions posted so far have assumed that he wants exactly
the result he has posted; they did not take the practical application, or
rather the lack thereof, of that result into account, and they did not take
into account that he may have posted merely a bad example.

Of course, much of that remains speculation until he clears that up. But I
have explicitly stated in my posting that my solution was _not_ to provide
the result that he posted last. And so your followup to that was
unnecessary and the style in which it was written was completely uncalled
for. If you only had read not only *his* postings, but also *my* posting
*properly*.

> he just isn't sure how to implement it.

He is pretty much unsure about anything so far.

> Subtle difference my friend.

Don't be familiar with me until you have earned it.
PointedEars
Dec 5 '07 #20
Thomas 'PointedEars' Lahn said the following on 12/4/2007 7:48 PM:
> Randy Webb wrote:
>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
>>> Randy Webb wrote:
>>>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 2:38 PM:
>>>>> SmokeWilliams wrote:
>>>>>> I need a pattern that will split without replacing. So I need to
>>>>>> split on spaces or carriage returns, but not spaces that are within
>>>>>> html tags. [...]
>>>>>>
>>>>>> So I need a pattern that matches any space or carriage return that is
>>>>>> not within an html tag.
>>>>>>
>>>>>> <font face="arial" size=2>test</font><p>yo this is a test
>>>>>>
>>>>>> Splitting this text should return an array containing:
>>>>>> 1: <font face="arial" size=2>test</font>
>>>>>> 2: yo
>>>>>> 3: this
>>>>>> 4: is
>>>>>> 5: a
>>>>>> 6: test
>>>>> Suppose you have
>>>>>
>>>>> var s = '<font face="arial" size=2>test</font><p>yo this is a test';
>>>>>
>>>>> Either you have a weird idea of "html tag" (HTML is an acronym, BTW),
>>>>> or (which is more likely) instead you want the resulting array to be
>>>>>
>>>>> ['', 'test', '', 'yo', 'this', 'is', 'a', 'test']
>>>> No, that is not what he said.
>>> Are you stupid or what?
>> If imitation is the sincerest form of flattery, you flatter the shit out
>> of me sometimes.
>
> I have tried your lower level of language this time so that you may better
> understand me. Obviously, I am not very good at it. Sorry.

You still flatter the shit out of me. You are failing in your endeavor,
but I am still flattered.

>>> I *know* that this is not what he said. However, I don't think he really
>>> knows what he wants.
>> He knows *exactly* what he wants,
>
> No, he does not. He has the idea of a spell checker and the problem that he
> can not simply split on whitespace because of whitespace in HTML tags:

He wants a spell checker. He knows what he wants, he just doesn't know
the best way to implement it. And, the "best solution" doesn't involve a
regular expression, just a simple split on the text.

>> Subtle difference my friend.
>
> Don't be familiar with me until you have earned it.
If I thought, for one minute, that you even came close to understanding
what I wrote really means, then your statement wouldn't be so ludicrous.

I am way more familiar with you than I ever cared to be.

--
Randy
Dec 5 '07 #21
Randy Webb wrote:
> Thomas 'PointedEars' Lahn said the following on 12/4/2007 7:48 PM:
>> Randy Webb wrote:
>>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
>>>> I *know* that this is not what he said. However, I don't think he really
>>>> knows what he wants.
>>> He knows *exactly* what he wants,
>> No, he does not. He has the idea of a spell checker and the problem that he
>> can not simply split on whitespace because of whitespace in HTML tags:
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> He wants a spell checker. He knows what he wants, he just doesn't know
> the best way to implement it. And, the "best solution" doesn't involve a
> regular expression, just a simple split on the text.

Learn to read.
PointedEars
Dec 5 '07 #22
Thomas 'PointedEars' Lahn said the following on 12/5/2007 6:49 PM:
> Randy Webb wrote:
>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 7:48 PM:
>>> Randy Webb wrote:
>>>> Thomas 'PointedEars' Lahn said the following on 12/4/2007 4:25 PM:
>>>>> I *know* that this is not what he said. However, I don't think he really
>>>>> knows what he wants.
>>>> He knows *exactly* what he wants,
>>> No, he does not. He has the idea of a spell checker and the problem that he
>>> can not simply split on whitespace because of whitespace in HTML tags:
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> He wants a spell checker. He knows what he wants, he just doesn't know
>> the best way to implement it. And, the "best solution" doesn't involve a
>> regular expression, just a simple split on the text.
>
> Learn to read.

"He knows what he wants, he just doesn't know how to implement it".
Let's see if I can read what I write and if you can understand what I write.

He wants a spell checker.
He thinks that to implement it he has to split the HTML code.
He doesn't.
To implement a spell checker, you simply read the *text* of the page.
Then you split the text on spaces.
Then you spell check the words from the *text* of the page.
You find mis-spelled words and notify the user.

Now, since he thinks he has to read the HTML code to implement it, then
he doesn't know how to implement it.
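
The steps listed above can be sketched as a function that works on plain text only. In a browser the text would come from the editing element, for example through its innerText or textContent property; here it is passed in as a string so the sketch runs anywhere. The function name findMisspellings, the punctuation trimming and the five-word dictionary are assumptions made for illustration.

```javascript
// Return the distinct words of `text` that are not in `dictionary`.
function findMisspellings(text, dictionary) {
  const misspelled = [];
  for (const token of text.split(/\s+/)) {
    // Trim leading and trailing punctuation, then normalise case.
    const word = token
      .replace(/^[^A-Za-z']+|[^A-Za-z']+$/g, '')
      .toLowerCase();
    if (word && !dictionary.has(word) && !misspelled.includes(word)) {
      misspelled.push(word);
    }
  }
  return misspelled;
}

// Hypothetical dictionary; a real one would be far larger.
const knownWords = new Set(['yo', 'this', 'is', 'a', 'test']);

console.log(findMisspellings('test yo this is a tset.', knownWords));
// [ 'tset' ]
```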

Your problem isn't that you can't read, you refuse to understand what
you read sometimes.

But, you did manage to flatter me again.

--
Randy
Dec 6 '07 #23
