
Searching for text

I have a batch of files that I am trying to search for specific text in
a specific format. Each file contains several items I want to search
for.

Here is a snippet from the file:
....
/FontName /ACaslonPro-Semibold def
/FontInfo 7 dict dup begin
/Notice (Copyright 2000 Adobe Systems Incorporated. All Rights
Reserved.Adobe Caslon is either a registered trademark or a trademark
of Adobe Systems Incorporated in the United States and/or other
countries.) def
/Weight (Semibold) def
/ItalicAngle 0 def
/FSType 8 def
....

I want to search the file until I find '/FontName /ACaslonPro-Semibold'
and then jump forward 7 lines where I expect to find '/FSType 8'. I
then want to continue searching from *that* point forward for the next
FontName/FSType pair. Unfortunately, I haven't been able to figure out
how to do this in Python, although I could do it fairly easily in a
batch file. Would someone care to enlighten me?

Aug 28 '06 #1
found_fontname = False
font_search = '/FontName /ACaslonPro-Semibold'
type_search = '/FSType 8'
for line in open('foo.txt'):
    if font_search in line:
        found_fontname = True
    if found_fontname and type_search in line:
        print('doing something with %s' % line)
        # reset to look for font_search
        found_fontname = False

or, you could use sed:

sed -n '/\/FontName \/ACaslonPro-Semibold/,/\/FSType 8/{/\/FSType 8/p}'

You omit what you want to do with the results when you find
them, and what should happen when both strings appear on the same
line (you hint that they're a couple of lines apart, but you
don't say that this is always the case).

-tkc
Aug 29 '06 #2
I don't do anything, per se. I just need to verify that I find the
FontName/FSType pair. And they *always* have to be in the same
location in relation to each other, i.e. they should never appear on
the same line or any closer/farther from each other.
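
Since the two lines always sit at a fixed distance, that check can be sketched directly. This is a minimal Python 3 sketch, assuming the offset really is the 7 lines mentioned in the question; the function name check_fixed_offset and the OFFSET constant are made up for illustration, and the offset should be adjusted to whatever the real files use:

```python
FONT = '/FontName /ACaslonPro-Semibold'
FSTYPE = '/FSType 8'
OFFSET = 7  # assumption taken from "jump forward 7 lines"


def check_fixed_offset(lines, offset=OFFSET):
    """Return a list of (line_number, message) problems; empty means OK."""
    problems = []
    expected = set()  # indexes where an FSType line is required
    for i, line in enumerate(lines):
        if FONT in line:
            target = i + offset
            expected.add(target)
            # The FSType line must exist exactly `offset` lines further on.
            if target >= len(lines) or FSTYPE not in lines[target]:
                problems.append(
                    (i + 1, 'no FSType %d lines after FontName' % offset))
        # Any FSType that no FontName asked for is in the wrong place.
        if FSTYPE in line and i not in expected:
            problems.append((i + 1, 'FSType at unexpected position'))
    return problems
```

It takes a list of lines (for example from f.readlines()) and reports 1-based line numbers, so an empty result means every FontName had its FSType in the expected place.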

Aug 29 '06 #3
The other thing I failed to mention is that I need to ensure that I
find the FSType *before* I find the next FontName.

Aug 29 '06 #4
found_fontname = False
font_search = '/FontName /ACaslonPro-Semibold'
type_search = '/FSType 8'
for line in open('foo.txt'):
    if font_search in line:
        if found_fontname:
            # a second FontName arrived before the expected FSType
            print("Uh, oh!")
        else:
            found_fontname = True
    if found_fontname and type_search in line:
        print('doing something with %s' % line)
        # reset to look for font_search
        found_fontname = False

and look for it to report "Uh, oh!" where it has found another
"/FontName /ACaslonPro-Semibold".

You can reduce your font_search to just '/FontName' if that's all
you care about, or if you just want any '/FontName' inside an
'/ACaslonPro-Semibold' block, you can tweak it to be something like:

for line in open('foo.txt'):
    if found_fontname and '/FontName' in line:
        print("Uh, oh!")
    if font_search in line:
        found_fontname = True
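
One case the flag loops above do not report is a FontName that is still waiting for its FSType when the file ends. A small Python 3 sketch that covers it, assuming any '/FontName' and any '/FSType' line counts (the reduced search strings suggested above); the function name unpaired_fontnames is hypothetical:

```python
FONT = '/FontName'
FSTYPE = '/FSType'


def unpaired_fontnames(lines):
    """Return 1-based line numbers of FontName lines that have no FSType
    before the next FontName or the end of the input."""
    unpaired = []
    pending = None  # line number of a FontName still waiting for its FSType
    for number, line in enumerate(lines, 1):
        if FONT in line:
            if pending is not None:
                # The previous FontName never got its FSType.
                unpaired.append(pending)
            pending = number
        elif FSTYPE in line:
            pending = None
    if pending is not None:
        unpaired.append(pending)  # input ended while still waiting
    return unpaired
```

An empty list means every FontName was followed by an FSType before the next FontName appeared.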

-tkc

Aug 29 '06 #5
robinsiebler wrote:
The other thing I failed to mention is that I need to ensure that I
find the FSType *before* I find the next FontName.
Given these requirements, I'd formulate the script something like this:
f = open(filename)  # filename: path of the file to check

NUM_LINES_BETWEEN = 7

Fo = '/FontName /ACaslonPro-Semibold'
FS = '/FSType 8'


def checkfile(f):
    # Get a (index, line) generator on the file.
    G = enumerate(f)

    for i, line in G:

        # make sure we don't find a FSType
        if FS in line:
            print('Found FSType without FontName %i' % i)
            return False

        # Look for FontName.
        if Fo in line:
            print('Found FontName at line %i' % i)

            try:

                # Check the next 7 lines for NO FSType
                # and NO FontName
                n = NUM_LINES_BETWEEN
                while n:
                    i, line = next(G)

                    if FS in line:
                        print('Found FSType prematurely at %i' % i)
                        return False

                    if Fo in line:
                        print("Found '%s' before '%s' at %i" % (Fo, FS, i))
                        return False
                    n -= 1

                # Make sure there's a FSType.
                i, line = next(G)

                if FS in line:
                    print('Found FSType at %i' % i)

                elif Fo in line:
                    print("Found '%s' instead of '%s' at %i" % (Fo, FS, i))
                    return False

                else:
                    print('FSType not found at %i' % i)
                    return False

            except StopIteration:
                print('File ended before FSType found.')
                return False

    return True


if checkfile(f):
    # File passes...
    pass
Be sure to close your file object when you're done with it. And you
might want fewer or different print statements.
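
Because the question involves a batch of files, one way to take care of closing them is a with block around each file. A rough Python 3 sketch, assuming the files are plain text and can be matched by a glob pattern; check_batch is a made-up helper name, and check stands for any function that accepts an open file, such as checkfile above:

```python
import glob


def check_batch(pattern, check):
    """Apply check(f) to every file matching pattern; return {path: result}."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        # The with block closes the file even if check() raises.
        # errors='replace' is an assumption: it tolerates stray non-text bytes.
        with open(path, errors='replace') as f:
            results[path] = check(f)
    return results
```

Calling check_batch('fonts/*.txt', checkfile) would then give a pass/fail result per file; the 'fonts/*.txt' pattern is only a placeholder.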

HTH

Peace,
~Simon

Aug 29 '06 #6
