re.search slashes

pyluke

I'm parsing a LaTeX document and want to find lines with equations delimited
by "\[" and "\]", but not other instances of "\[" like "a & b & c \\[5pt]".

So, in short, I want to match "\[" but not "\\[".

To add to this, I also don't want lines that start with comments.
I've tried:
check_eq = re.compile('(?!\%\s*)\\\\\[')
check_eq.search(line)

this works in finding the "\[" but also the "\\["

so I would think this would work
check_eq = re.compile('(?![\%\s*\\\\])\\\\\[')
check_eq.search(line)

but it doesn't. Any tips?

Feb 4 '06 #1

Scott David Daniels

pyluke wrote:
> I'm parsing a LaTeX document and want to find lines with equations blocked
> by "\[" and "\]", but not other instances of "\[" like "a & b & c \\[5pt]"
> so, in short, I want to match "\[" but not "\\[" .... I've tried:
> check_eq = re.compile('(?!\%\s*)\\\\\[')
> check_eq.search(line)
> this works in finding the "\[" but also the "\\["

If you are parsing with regular expressions, you are running a marathon.
If you are doing regular expressions without raw strings, you are running
a marathon barefoot.

Notice: len('(?!\%\s*)\\\\\[') == 13
len(r'(?!\%\s*)\\\\\[') == 15
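
For illustration, a minimal sketch of the raw-string point. The names plain, raw, and check are made up for this example, and the pattern is reduced to the part that matters (a literal backslash followed by a literal "["). It shows the same regex written as a plain string and as a raw string, and that this regex on its own also hits the "\\[" in a table row.

```python
import re

# One regex (a literal backslash followed by a literal '['), written twice.
plain = '\\\\\\['   # plain literal: Python collapses each '\\' to one backslash
raw = r'\\\['       # raw literal: what you type is what the regex engine sees

print(plain == raw)  # compare the two spellings
print(len(raw))      # number of characters the regex engine receives

check = re.compile(raw)
print(check.search(r'\[') is not None)             # display-math opener
print(check.search(r'a & b \\[5pt]') is not None)  # table row with "\\["
```

Both spellings produce the same four-character pattern; the raw form simply needs half as many backslashes in the source.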
> so I would think this would work
> check_eq = re.compile('(?![\%\s*\\\\])\\\\\[')
> check_eq.search(line)
>
> but it doesn't. Any tips?

Give us examples that should work and that should not (test cases),
and the proper results of those tests. Don't make people trying to
help you guess about anything you know.

--Scott David Daniels
sc***********@acm.org

Feb 4 '06 #2

Xavier Morel

To add to what Scott said, two pieces of advice:
1. Use Kodos; it's an RE debugger and an extremely fine tool for building
your regular expressions.
2. Read the module's documentation. Several times. In your case, read the
"negative lookbehind assertion" part "(?<! ... )" several times, until
you understand how it may be of use to you.
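
A minimal sketch of how a negative lookbehind could apply here, assuming the only unwanted case is a "\[" immediately preceded by another backslash. The name check_eq is borrowed from the original post; this sketch does not deal with comment lines.

```python
import re

# (?<!\\) : the position must NOT be preceded by a backslash
# \\\[    : then a literal backslash followed by a literal '['
check_eq = re.compile(r'(?<!\\)\\\[')

print(check_eq.search(r'\[') is not None)             # equation opener
print(check_eq.search(r'a & b \\[4pt]') is not None)  # table row with "\\["
```

A lookahead such as "(?!...)" inspects the text after the current position, which is why it cannot rule out a backslash that comes before the "\[".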

Feb 4 '06 #3

pyluke

Alright, I'll try to clarify. I'm taking a tex file and modifying some
of the content. I want to be able to identify a block like the following:

\[
\nabla \cdot u = 0
\]

I don't want to find the following:

\begin{tabular}{c c}
a & b \\[4pt]
1 & 2 \\[3pt]
\end{tabular}

When I search a line for the first block by looking for "\[", I find it.
The problem is that this also finds the second block due to the "\\[".

I'm not sure what you mean by running a marathon. I do follow your
statement on raw strings, but that doesn't seem to be the problem. The
difference in your length example above is just from the two escaped
slashes... not sure what my point is...

Thanks
Lou

Feb 4 '06 #4

pyluke

> To add to what scott said, two pieces of advice:
> 1. Use Kodos, it's a RE debugger and an extremely fine tool to generate
> your regular expressions.

Ok, just found this. Will be helpful.

> 2. Read the module's documentation. Several times. In your case read the
> "negative lookbehind assertion" part "(?<! ... )" several times, until
> you understand how it may be of use to you.

Quite a teacher. I'll read it several times...

Thanks anyway.

Feb 4 '06 #5

pyluke

> 2. Read the module's documentation. Several times. In your case read the
> "negative lookbehind assertion" part "(?<! ... )" several times, until
> you understand how it may be of use to you.

OK. lookbehind would be more useful/suitable here...

Feb 4 '06 #6

pyluke

Alright, this seems to work:

re.compile('(?<![(\%\s*)(\\\\)])\\\\\[')
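
One caveat worth checking, sketched below: inside "[...]" the parentheses do not group anything, so the class is just a set of single characters (parentheses, "%", whitespace, "*", and backslash), and the lookbehind rejects a "\[" preceded by any one of them. The pattern is rewritten here as an equivalent raw string; luke is a name made up for the example.

```python
import re

# The pattern from the post above, rewritten as an equivalent raw string.
luke = re.compile(r'(?<![(\%\s*)(\\)])\\\[')

print(luke.search(r'\[') is not None)         # "\[" at the start of a line
print(luke.search(r'a \\[4pt]') is not None)  # table row with "\\["
print(luke.search(r'    \[') is not None)     # indented "\["
print(luke.search(r'% x\[') is not None)      # "\[" later in a comment line
```

The first two cases behave as intended. The indented "\[" is rejected because whitespace is in the class, and a "\[" that follows other text in a comment line is still found, so the comment check probably needs to be done separately.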

Feb 4 '06 #7

Scott David Daniels

pyluke wrote:
> Scott David Daniels wrote:
>> pyluke wrote:
>>> I... want to find lines with ... "\[" but not instances of "\\["
>> If you are parsing with regular expressions, you are running a marathon.
>> If you are doing regular expressions without raw strings, you are running
>> a marathon barefoot.
>
> I'm not sure what you mean by running a marathon.

I'm referring to this quote from: http://www.jwz.org/hacks/marginal.html
"(Some people, when confronted with a problem, think ``I know, I'll
use regular expressions.'' Now they have two problems.)"

> I do follow your statement on raw strings, but that doesn't seem
> to be the problem.

It is an issue in the readability of your code, not the cause of the
code behavior that you don't like. In your particular case, this is
all made doubly hard to read since your patterns and search targets
include back slashes.

> \[
> \nabla \cdot u = 0
> \]
>
> I don't want to find the following
>
> \begin{tabular}{c c}
> a & b \\[4pt]
> 1 & 2 \\[3pt]
> \end{tabular}

how about: r'(^|[^\\])\\\['
Which is:
Find something beginning with either start-of-line or a
non-backslash, followed (in either case) by a backslash
and ending with an open square bracket.

Generally, (for the example) I would have said a good test set
describing your problem was:

re.compile(pattern).search(r'\[ ') is not None
re.compile(pattern).search(r' \[ ') is not None
re.compile(pattern).search(r'\\[ ') is None
re.compile(pattern).search(r' \\[ ') is None
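
The test set above can be made runnable along these lines. This is only a sketch: is_equation_start is a hypothetical helper, and the comment-line rule (skip any line whose first non-blank character is "%") is an assumption taken from the original question, not something the pattern itself does.

```python
import re

# The suggested pattern: start-of-line or a non-backslash, then "\[".
check_eq = re.compile(r'(^|[^\\])\\\[')

def is_equation_start(line):
    # Hypothetical helper: skip comment lines, then look for an unescaped "\[".
    if line.lstrip().startswith('%'):
        return False
    return check_eq.search(line) is not None

cases = [
    (r'\[ ', True),
    (r' \[ ', True),
    (r'\\[ ', False),
    (r' \\[ ', False),
    (r'% \[ ', False),  # extra case, assumed from the original question
]
for text, expected in cases:
    assert is_equation_start(text) == expected, text
print('all cases pass')
```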

--Scott David Daniels
sc***********@acm.org

Feb 4 '06 #8
