Hello everyone,
I am using a regular expression to split a text string into parts. For example, the string "How do you do" should become an array containing all the words and the whitespace between them.
I am using the following code (copied from the internet):
<html>
<body>
<script type="text/javascript">
var text = "Hello how@are you.com";
var result = tokenize(text, true, true);
document.write(result.join(','));

// Split text on punctuation, "@" and runs of spaces.
// capture:   keep the delimiters in the returned array
// noflatten: skip the lower-casing and accent folding done by flatten_string
function tokenize(text, capture, noflatten)
{
    var _normalizer_regex_str = '(?:(?:^| +)["\'.\\-]+ *)|(?: *[\'".\\-]+(?: +|$)|@| +)';
    var _normalizer_regex = new RegExp(_normalizer_regex_str, 'g');
    var _normalizer_regex_capture = new RegExp('(' + _normalizer_regex_str + ')', 'g');
    return (noflatten ? text : flatten_string(text)).split(capture ? _normalizer_regex_capture : _normalizer_regex);
}

// Lower-case the text and replace accented letters with their plain equivalents.
function flatten_string(text)
{
    var accents = {a:/à|á|â|ã|ä|å/g, c:/ç/g, d:/ð/g, e:/è|é|ê|ë/g, i:/ì|í|î|ï/g, n:/ñ/g, o:/ø|ö|õ|ô|ó|ò/g, u:/ü|û|ú|ù/g, y:/ÿ|ý/g, ae:/æ/g, oe:/œ/g};

    text = text.toLowerCase();
    for (var i in accents)
    {
        text = text.replace(accents[i], i);
    }
    return text;
}
</script>
</body>
</html>
This code works in Mozilla Firefox 2.0 but not in IE 7.0.
If you run it, you will see that the two browsers give different results.
Firefox includes the splitting delimiters in the returned array, while IE 7.0 drops them and returns the array without the delimiters.
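To make the difference concrete, here is a small sketch of what a standards-compliant engine does with the example string from the top of this post. The variable names are only for illustration. When the separator regex contains a capturing group, the captured delimiters are spliced into the result; without the group they are dropped. IE 7 ignores the group, which is the behaviour described above.

```javascript
// With a capturing group, a spec-compliant split() keeps the delimiters.
var withCapture = "How do you do".split(/( +)/);

// Without a capturing group, the delimiters are dropped.
var withoutCapture = "How do you do".split(/ +/);

console.log(withCapture.join(','));    // How, ,do, ,you, ,do
console.log(withoutCapture.join(',')); // How,do,you,do
```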
I am new to regular expressions and cannot work out how this one works (since it was copied from the internet).
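For anyone else puzzling over the pattern, one way to read it is to rebuild it from smaller pieces. The sketch below is only my reading of it, and the names `punct`, `leading`, `trailing` and `pattern` are made up for illustration. The pattern treats four things as delimiters: punctuation at the start of a word, punctuation at the end of a word, an "@" sign, and runs of spaces.

```javascript
// Quote, apostrophe, dot or hyphen, one or more times.
var punct = '["\'.\\-]+';

// Punctuation that opens a word: start of string or spaces, then punctuation.
var leading = '(?:^| +)' + punct + ' *';

// Punctuation that closes a word: punctuation, then spaces or end of string.
var trailing = ' *' + punct + '(?: +|$)';

// Same alternatives as the original pattern: leading | trailing | @ | spaces.
var pattern = '(?:' + leading + ')|(?:' + trailing + '|@| +)';

// Wrapping the whole pattern in (...) asks split() to keep the delimiters.
var tokens = "Hello how@are you.com".split(new RegExp('(' + pattern + ')', 'g'));

console.log(tokens.join(',')); // Hello, ,how,@,are, ,you.com
```

Note that the dot in "you.com" is not a delimiter, because it is neither preceded by a space nor followed by a space or the end of the string. The output shown is what a standards-compliant engine produces.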
If someone can help me fix the code so that it returns the same result in IE 7 and Firefox, that would be a great help.
Thanks,
Rupinder
acoder 16,027
Recognized Expert Moderator MVP
Changed the thread title to better describe the problem.
Read about regular expressions in Javascript here.
Hello Everyone,
I was able to find the solution to the problem. The original author of the code had overridden the String.split function so that it returns the delimiters correctly in every browser.
Sharing it below for others to use:
// Keep a reference to the browser's native split.
String.prototype._split = String.prototype.split;

// Replacement split that returns captured delimiter groups in every browser.
String.prototype.split = function(separator, limit)
{
    var flags = "";
    if (separator === null || limit === null)
    {
        return [];
    }
    else if (typeof separator == 'string')
    {
        return this._split(separator, limit);
    }
    else if (separator === undefined)
    {
        return [this.toString()];
    }
    else if (separator instanceof RegExp)
    {
        // Cache a global copy (_1) and an anchored copy (_2) on the regex object.
        if (!separator._2 || !separator._1)
        {
            flags = separator.toString().replace(/^[\S\s]+\//, "");
            if (!separator._1)
            {
                if (!separator.global)
                {
                    separator._1 = new RegExp(separator.source, "g" + flags);
                }
                else
                {
                    separator._1 = 1;
                }
            }
        }
        var separator1 = separator._1 == 1 ? separator : separator._1;
        var separator2 = (separator._2 ? separator._2 : separator._2 = new RegExp("^" + separator1.source + "$", flags));
        if (limit === undefined || limit < 0)
        {
            limit = false;
        }
        else
        {
            limit = Math.floor(limit);
            if (!limit) return [];
        }
        var match, output = [], lastLastIndex = 0, i = 0;
        while ((limit ? i++ <= limit : true) && (match = separator1.exec(this)))
        {
            // Step back if the browser advanced lastIndex past an empty match.
            if ((match[0].length === 0) && (separator1.lastIndex > match.index))
            {
                separator1.lastIndex--;
            }
            if (separator1.lastIndex > lastLastIndex)
            {
                if (match.length > 1)
                {
                    // Reset groups that did not take part in the match to undefined.
                    match[0].replace(separator2, function(){ for (var j = 1; j < arguments.length - 2; j++) { if (arguments[j] === undefined) match[j] = undefined; } });
                }
                // Append the text before the delimiter, then the captured groups.
                output = output.concat(this.substring(lastLastIndex, match.index), (match.index === this.length ? [] : match.slice(1)));
                lastLastIndex = separator1.lastIndex;
            }
            // Avoid an infinite loop on empty matches.
            if (match[0].length === 0)
            {
                separator1.lastIndex++;
            }
        }
        return (lastLastIndex === this.length) ? (separator1.test("") ? output : output.concat("")) : (limit ? output : output.concat(this.substring(lastLastIndex)));
    }
    else
    {
        return this._split(separator, limit);
    }
};
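If overriding String.prototype.split feels too invasive, the same effect can probably be had with a small standalone function built on RegExp.exec, which behaves consistently across browsers. This is only a minimal sketch and not part of the code above: `splitKeepingDelimiters` is a hypothetical helper name, and it keeps the whole matched delimiter rather than individual capture groups.

```javascript
// Hypothetical helper (not from the code above): split `text` on `pattern`
// and keep each matched delimiter as its own array element.
function splitKeepingDelimiters(text, pattern) {
    // Work on a global copy so lastIndex advances and the caller's regex is untouched.
    var flags = 'g' + (pattern.ignoreCase ? 'i' : '') + (pattern.multiline ? 'm' : '');
    var re = new RegExp(pattern.source, flags);
    var output = [], last = 0, match;

    while ((match = re.exec(text)) !== null) {
        if (match[0].length === 0) {
            // Skip empty matches so the loop always moves forward.
            re.lastIndex++;
            continue;
        }
        // Text before the delimiter, then the delimiter itself.
        output.push(text.substring(last, match.index), match[0]);
        last = re.lastIndex;
    }
    // Whatever is left after the final delimiter.
    output.push(text.substring(last));
    return output;
}

var parts = splitKeepingDelimiters("Hello how@are you.com", /@| +/);
console.log(parts.join(',')); // Hello, ,how,@,are, ,you.com
```

The trade-off is that callers must use the helper explicitly, but no built-in method is replaced, so other scripts on the page are unaffected.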
Thanks,
Rupinder
acoder 16,027
Recognized Expert Moderator MVP
Thanks for posting your solution. Glad to hear that you got it working. Post again any time if you have more questions.