Regular expression to exclude lines?

Sorry to ask what is surely a trivial question. Also sorry that I don't have
my current code version on hand, but... Anyway, must be some problem with
trying to do the negative. It seems like I get into these ruts each time I
try to deal with regular expressions.

All I'm trying to do is delete the lines which don't contain a particular
string. Actually a filter to edit a log file. I can find and replace a thing
with null, but can't figure out how to find the lines which do not contain
the thing.

Going further, I want to generalize and use a JavaScript variable containing
the decision string, but first I need to worry about the not-within-a-line
problem.

Jul 20 '05 #1
Shannon Jacobs wrote:
Sorry to ask what is surely a trivial question.
Hm, I don't think it is this trivial.
All I'm trying to do is delete the lines which don't contain a
particular string. Actually a filter to edit a log file. I can
find and replace a thing with null, but can't figure out how to
find the lines which do not contain the thing.


Here's a quickhack that filters out of three lines the one that
does not contain the word `line':

alert("this is a line\nthis is a\nthis is a line".match(/\n*[^\n]*\n*([^\n]*[^l][^i][^n][^e][^\n]*\n)*[^\n]*\n*/)[1])

But there must be a better way, IIRC there is something
called `negative lookahead', supported from JavaScript 1.5 on,
which I have not yet worked with.
PointedEars
Jul 20 '05 #2
Thomas 'PointedEars' Lahn <Po*********@web.de> writes:
Shannon Jacobs wrote:
Sorry to ask what is surely a trivial question.
Hm, I don't think it is this trivial.


Neither do I. Negative matches in regular expressions rarely are.
Here's a quickhack that filters out of three lines the one that
does not contain the word `line':

alert("this is a line\nthis is a\nthis is a line".match(/\n*[^\n]*\n*([^\n]*[^l][^i][^n][^e][^\n]*\n)*[^\n]*\n*/)[1])
That's purely accidental. If you add a line in front, e.g.,
"bad thing\nthis is a line\nthis is a\n this is a line", it matches
the string containing the second and third line.
But there must be a better way, IIRC there is something
called `negative lookahead', supported from JavaScript 1.5 on,
which I have not yet worked with.


Negative lookahead might be an easier way to do it.

The hard way:

/^([^l\n]*(l[^i]|li[^n]|lin[^e]))*([^l\n])*$/m
(any "l" is not followed by "ine")

With negative lookahead:
/^([^l\n]*l(?!ine))*[^l\n]*$/m

The "m" at the end makes "^" and "$" match beginning/end of line.
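For the plain substring case, a shorter lookahead form also seems to work: anchor at the start of each line and refuse to match if the string occurs anywhere later on that line. This is only a minimal sketch and not the pattern given above; the variable names are assumptions made for illustration, and the sample text is the test string used elsewhere in this thread.

```javascript
// Sample input (the test string used elsewhere in the thread).
var text = "bad thing\nthis is a line\nthis is a\n this is a line";

// ^           start of a line (because of the "m" flag)
// (?!.*line)  fail if "line" occurs anywhere on the rest of this line;
//             "." never matches "\n", so the check stays within one line
// .*$         otherwise match the whole line
var linesWithoutIt = text.match(/^(?!.*line).*$/gm);

console.log(linesWithoutIt); // ["bad thing", "this is a"]
```

The lookahead consumes nothing, so the trailing `.*$` still captures the full line once the check has passed.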

These regexps only check for the letters "line", not whether they
occur as a word. To do that, one must check for word boundaries around it:

Hard:
/^([^l\n]*(\bl([^i]|i[^n]|in[^e]|ine\B)|\Bl))*[^l\n]*$/m
Easy:
/^([^l\n]*(\bl(?!ine\b)|\Bl))*[^l\n]*$/m

Any "l" right after a word boundary is not followed by ine+word boundary.

To test this regexp, try:
---
var regexp = /^([^l\n]*(\bl(?!ine\b)|\Bl))*[^l\n]*$/mg;
var lines = "nonline\nline\nlinefeed\nwith line in the middle\n"+
"no l-word here\n\nprevious l-word was empty\nand ending in line";
var dellines = lines.replace(regexp,"---DELETED---");
alert(lines);
alert(dellines);
---
A longer explanation of:
/^([^l\n]*(\bl(?!ine\b)|\Bl))*[^l\n]*$/m
  ^               beginning of line
  [^l\n]*         some non-l/non-newlines
  \bl(?!ine\b)    either word boundary + l not followed by "ine" + word boundary
  |\Bl            or l not after a word boundary
  ( ... )*        any number of times
  [^l\n]*$        and then some non-l/non-newlines again.
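To see what the word-boundary requirement changes, it may help to compare a plain substring test with a `\b`-delimited one on a few lines from the test string above. This is a small sketch using ordinary positive tests rather than the negated pattern; the array and variable names are assumptions made for illustration.

```javascript
// A few of the lines from the test string above.
var sample = ["nonline", "line", "linefeed", "with line in the middle",
              "no l-word here", "and ending in line"];

var asSubstring = /line/;     // the letters "line" anywhere
var asWord = /\bline\b/;      // "line" as a whole word only

var substringHits = [];
var wordHits = [];
for (var i = 0; i < sample.length; i++) {
  if (asSubstring.test(sample[i])) { substringHits.push(sample[i]); }
  if (asWord.test(sample[i])) { wordHits.push(sample[i]); }
}

console.log(substringHits); // includes "nonline" and "linefeed"
console.log(wordHits);      // ["line", "with line in the middle", "and ending in line"]
```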

Good luck:)
/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
Jul 20 '05 #3
Lasse Reichstein Nielsen wrote on 24 nov 2003 in comp.lang.javascript:
Negative lookahead might be an easier way to do it.


What about this non greedy "*?" form:

<script>

// Deletes every line of t that contains x.
function replLine(x,t){
  t+="\n"
  var re = new RegExp("[^\n]*?"+x+"[^\n]*\n","g");
  t = t.replace(re,"")
  return t.replace(/\n$/,"")
}

tx="bad thing\nthis is a line\nthis is a\n this is a line"

alert(replLine("thing",tx))
alert(replLine("line",tx))

</script>


--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jul 20 '05 #4
"All I'm trying to do is delete the lines which don't contain a
particular string. "

Wow, I missed the "n't"

I will try again later.

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jul 20 '05 #5
This better?

<script>

function replLine(x,t){
  var re = new RegExp(x,"");
  t+="\n"
  t = t.replace(
    /.*?\n/g,
    function($0,$1,$2)
      {return (!re.test($0))?$0:""}
  )
  return t.replace(/\n$/,"")
}

tx="bad thing\nthis is a line\nthis is a\n this is a line"

alert(replLine("thing",tx))
alert(replLine("line",tx))

</script>

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jul 20 '05 #6
Monologue follows.

Damn, forgot to remove the "!"

<script>

// Keeps only the lines of t that match x.
function replLine(x,t){
  var re = new RegExp(x,"");
  t+="\n"
  t = t.replace(
    /.*?\n/g,
    function($0,$1,$2)
      {return (re.test($0))?$0:""}
  )
  return t.replace(/\n$/,"")
}

tx="bad thing\nthis is a line\nthis is a\n this is a line"

alert(replLine("thing",tx))
alert(replLine("line",tx))

</script>

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jul 20 '05 #7
"Evertjan." <ex**************@interxnl.net> writes:
Evertjan. wrote on 24 nov 2003 in comp.lang.javascript:
This better?

Monologue follows.

Damn, forgot to remove the "!"

function replLine(x,t){
  var re = new RegExp(x,"");
  t+="\n"
  t = t.replace(
    /.*?\n/g,
    function($0,$1,$2)
      {return (re.test($0))?$0:""}
  )
  return t.replace(/\n$/,"")
}


This first splits the string into lines, and then replaces each line
based on a second test.
It should work (and seems to).
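Since the second test is built with new RegExp(x), a search string that contains characters such as "." or "(" is interpreted as a pattern rather than as literal text. If the decision string comes from a variable and is meant literally, escaping it first may be safer. The helper name escapeRegExp and the sample strings below are assumptions made for illustration, not part of the code above.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so that the search string is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var needle = "a.b";                                  // hypothetical search string
var loose = new RegExp(needle);                      // "." matches any character
var literal = new RegExp(escapeRegExp(needle));      // "." matches only a dot

console.log(loose.test("GET /aXb"));    // true
console.log(literal.test("GET /aXb"));  // false
console.log(literal.test("GET /a.b"));  // true
```

For a purely literal search, testing with indexOf instead of a regular expression would avoid the escaping step altogether.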

I don't think you need a non-greedy match (.*?) since . doesn't match
a newline character.
Maybe you can get around adding the extra "\n" by using multiline
matching: /^.*$/gm,
It doesn't remove the newlines in the string though. None of my
attempts have done that so far.
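One way to drop the newline together with the line seems to be letting the per-line pattern consume an optional trailing "\n", so that a rejected line disappears along with its line break. This is a minimal sketch under that assumption; the function name keepMatchingLines is hypothetical, and re should not carry the "g" flag, because a global regexp keeps state between calls to test.

```javascript
// Keep only the lines of text that match re; rejected lines are removed
// together with their trailing newline.
function keepMatchingLines(re, text) {
  // ^.*\n? with the "m" flag matches one whole line plus its newline, if any.
  return text.replace(/^.*\n?/gm, function (line) {
    return re.test(line) ? line : "";
  });
}

var tx = "bad thing\nthis is a line\nthis is a\n this is a line";
console.log(keepMatchingLines(/line/, tx));  // "this is a line\n this is a line"
console.log(keepMatchingLines(/thing/, tx)); // "bad thing\n"
```

As the second call shows, a kept line that is followed by removed lines still carries its own newline, so a final trim of a trailing "\n" may be wanted.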

This method uses several regexp matches, not just one (which is
sometimes the better approach :), but the first is really just
splitting into lines. You can use the split method for that.

How about this:

// returns new array containing only those elements that match re
Array.prototype.filter = function filter(re) {
  var res = [];
  for (var i=0;i<this.length;i++) {
    if (re.test(this[i])) {res.push(this[i]);}
  }
  return res;
}

var tx="bad thing\nthis is a line\nthis is a\n this is a line";
alert(tx.split("\n").filter(/line/).join("\n"));

Sadly, adding properties to Array.prototype means that you can't
(easily) use
for (var i in this)
to iterate through a sparse array. The filter method is enumerable,
so it is also included.
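In engines that implement ECMAScript 5 or later, arrays already have a built-in filter method that takes a callback, so the same idea can be written without adding anything to Array.prototype. The following is a sketch along those lines; the function name keepLines is an assumption made for illustration, and re is assumed to be a non-global regexp.

```javascript
// Return the lines of text that match re, joined back with newlines.
// Uses the built-in Array.prototype.filter, which expects a callback.
function keepLines(text, re) {
  return text.split("\n").filter(function (line) {
    return re.test(line);
  }).join("\n");
}

var tx = "bad thing\nthis is a line\nthis is a\n this is a line";
console.log(keepLines(tx, /line/)); // "this is a line\n this is a line"
```

Because nothing is added to the prototype, for (var i in ...) loops over arrays are not affected.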

/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
Jul 20 '05 #8
Lasse Reichstein Nielsen wrote on 24 nov 2003 in comp.lang.javascript:
I don't think you need a non-greedy match (.*?) since . doesn't match
a newline character.

True !

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jul 20 '05 #9
JRS: In article <3F**************@PointedEars.de>, seen in
news:comp.lang.javascript, Thomas 'PointedEars' Lahn
<Po*********@web.de> posted at Mon, 24 Nov 2003 16:39:25 :-
Shannon Jacobs wrote:
Sorry to ask what is surely a trivial question.


Hm, I don't think it is this trivial.
All I'm trying to do is delete the lines which don't contain a
particular string. Actually a filter to edit a log file. I can
find and replace a thing with null, but can't figure out how to
find the lines which do not contain the thing.


Here's a quickhack that filters out of three lines the one that
does not contain the word `line':

alert("this is a line\nthis is a\nthis is a
line".match(/\n*[^\n]*\n*([^\n]*[^l][^i][^n][^e][^\n]*\n)*[^\n]*\n*/)[1])

But there must be a better way, IIRC there is something
called `negative lookahead', supported from JavaScript 1.5 on,
which I have not yet worked with.

AIUI, the OP wants a file which is the previous file minus those lines
which do not contain the string. That code, after broken-string
correction, pops up a box showing the first unwanted line.

Javascript "alone" is not capable of file handling, AFAICS.

If the OP can read and write the file line by line, controlled by
javascript, and apply script to each line, then it is only necessary to
do (pseudo-code follows)

while not EoF(FI) do begin Readln(FI, S) ; // pascal
if ( ! /«string»/.test(S) ) continue // javascript
Writeln(FO, S) end ; // pascal
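The same loop can be sketched entirely in JavaScript if the host environment supplies some way to read and write lines. Here readLine and writeLine are hypothetical callbacks standing in for whatever file access the host offers; the demonstration at the end uses plain arrays instead of files.

```javascript
// Copy to the output only those lines that match re.
// readLine() is assumed to return the next line, or null at end of input;
// writeLine(s) is assumed to append one line to the output.
function copyMatchingLines(readLine, writeLine, re) {
  var line;
  while ((line = readLine()) !== null) {
    if (!re.test(line)) { continue; }  // skip lines without the string
    writeLine(line);
  }
}

// Demonstration with in-memory arrays in place of files.
var input = ["a line", "nothing here", "another line"];
var output = [];
var pos = 0;
copyMatchingLines(
  function () { return pos < input.length ? input[pos++] : null; },
  function (s) { output.push(s); },
  /line/
);
console.log(output); // ["a line", "another line"]
```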


The OP has MSOE, which suggests Windows. If the job is to be run in
DOS, Windows, or UNIX, then the task is trivial using MiniTrue, which
IMHO is a most valuable tool. Example :

mtr -no~ jt.htm - e

will put, on standard output, all those lines of jt.htm which do not
contain the letter e. A RegExp can be used for the search, in place
of e. There may be a way of doing it without using standard output.

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 MIME. ©
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
I find MiniTrue useful for viewing/searching/altering files, at a DOS prompt;
free, DOS/Win/UNIX, <URL:http://www.idiotsdelight.net/minitrue/> Update soon?
Jul 20 '05 #10
