finding/replacing a long binary pattern in a .bin file

What would be the common-sense way of finding a binary pattern in a
.bin file, say some 200 bytes, and replacing it with an updated pattern
of the same length at the same offset?

Also, the pattern can occur on any byte boundary in the file, so
chunking through the code at 16 bytes a frame may be a problem. The
file itself isn't so large, maybe 32 kbytes is all, and the need for
speed is not so great, but the need for accuracy in the
search/replacement is very important.

Thanks,

--Alan

Jul 18 '05 #1
On 12 Jan 2005 22:36:54 -0800, yaipa <ya***@yahoo.com> wrote:
What would be the common sense way of finding a binary pattern in a
.bin file, say some 200 bytes, and replacing it with an updated pattern
of the same length at the same offset?

Also, the pattern can occur on any byte boundary in the file, so
chunking through the code at 16 bytes a frame maybe a problem. The
file itself isn't so large, maybe 32 kbytes is all and the need for
speed is not so great, but the need for accuracy in the
search/replacement is very important.


Okay, given the requirements.

f = file('mybinfile')
contents = f.read().replace(oldbinstring, newbinstring)
f.close()
f = file('mybinfile','w')
f.write(contents)
f.close()

Will do it, and do it accurately. But it will also read the entire
file into memory.

Stephen.
Jul 18 '05 #2
On Thu, 13 Jan 2005 16:51:46 +1000, Stephen Thorne <st************@gmail.com> wrote:
On 12 Jan 2005 22:36:54 -0800, yaipa <ya***@yahoo.com> wrote:
What would be the common sense way of finding a binary pattern in a
.bin file, say some 200 bytes, and replacing it with an updated pattern
of the same length at the same offset?

Also, the pattern can occur on any byte boundary in the file, so
chunking through the code at 16 bytes a frame maybe a problem. The
file itself isn't so large, maybe 32 kbytes is all and the need for
speed is not so great, but the need for accuracy in the
search/replacement is very important.


Okay, given the requirements.

f = file('mybinfile')
contents = f.read().replace(oldbinstring, newbinstring)
f.close()
f = file('mybinfile','w')
f.write(contents)
f.close()

Will do it, and do it accurately. But it will also read the entire
file into memory.

You must be on linux or such, otherwise you would have shown opening the
_binary_ files (I assume that's what a .bin file is) with 'rb' and 'wb', IWT.

Not sure what system the OP was/is on.

BTW, I'm sure you could write a generator that would take a file name
and oldbinstring and newbinstring as arguments, and read and yield nice
os-file-system-friendly disk-sector-multiple chunks, so you could write

fout = open('mynewbinfile', 'wb')
for buf in updated_file_stream('myoldbinfile','rb', oldbinstring, newbinstring):
    fout.write(buf)
fout.close()

(left as an exercise ;-)
(modifying a file "in place" is another exercise)
(doing the latter with defined maximum memory buffer usage
even when mods increase the length of the file is another ;-)

Regards,
Bengt Richter
Jul 18 '05 #3
[Stephen Thorne]
On 12 Jan 2005 22:36:54 -0800, yaipa <ya***@yahoo.com> wrote:
What would be the common sense way of finding a binary pattern in
a .bin file, say some 200 bytes, and replacing it with an updated
pattern of the same length at the same offset? The file itself
isn't so large, maybe 32 kbytes is all and the need for speed is not
so great, but the need for accuracy in the search/replacement is
very important.

Okay, given the requirements.

f = file('mybinfile')
contents = f.read().replace(oldbinstring, newbinstring)
f.close()
f = file('mybinfile','w')
f.write(contents)
f.close()

Will do it, and do it accurately. But it will also read the entire
file into memory.


32Kb is a small file indeed, reading it in memory is not a problem!

People sometimes like writing long Python programs. Here is about the
same, a bit shorter: :-)

buffer = file('mybinfile', 'rb').read().replace(oldbinstring, newbinstring)
file('mybinfile', 'wb').write(buffer)
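
For readers on a current interpreter, the same one-gulp approach might look like the sketch below. It assumes Python 3, where binary data is `bytes` and `file()` no longer exists. The helper name `replace_in_file` and the same-length check are illustrative additions, not something taken from the posts above.

```python
from pathlib import Path


def replace_in_file(path, old, new):
    """Replace every occurrence of old with new in a small binary file.

    Hypothetical helper: it reads the whole file into memory, which is
    fine for a file of a few tens of kilobytes.
    Returns the number of replacements made.
    """
    if not old:
        raise ValueError("old pattern must not be empty")
    if len(old) != len(new):
        # The original question asks for a same-length patch at the same offset.
        raise ValueError("old and new must have the same length")
    data = Path(path).read_bytes()      # binary read, no newline translation
    count = data.count(old)             # non-overlapping occurrences
    if count:
        Path(path).write_bytes(data.replace(old, new))
    return count
```

Returning the count lets the caller insist on exactly one match before trusting the result, which matters when accuracy is the main requirement.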

--
François Pinard http://pinard.progiciels-bpi.ca
Jul 18 '05 #4
Bengt Richter wrote:
BTW, I'm sure you could write a generator that would take a file name
and oldbinstring and newbinstring as arguments, and read and yield nice
os-file-system-friendly disk-sector-multiple chunks, so you could write

fout = open('mynewbinfile', 'wb')
for buf in updated_file_stream('myoldbinfile','rb', oldbinstring, newbinstring):
    fout.write(buf)
fout.close()


What happens when the bytes to be replaced are broken across a block
boundary? ISTM that neither half would be recognized....

I believe that this requires either reading the entire file into
memory, to scan all at once, or else conditionally matching an
arbitrary fragment of the end of a block against the beginning of the
oldbinstring... Given that the file in question is only a few tens of
kbytes, I'd think that doing it in one gulp is simpler. (For a large
file, chunking it might be necessary, though...)
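
The boundary problem described above is easy to reproduce. The following is a small Python 3 sketch with made-up data; `naive_chunked_replace` is a deliberately flawed, hypothetical helper that treats every chunk independently.

```python
def naive_chunked_replace(data, old, new, chunk_size):
    """Replace within each fixed-size chunk independently (deliberately flawed)."""
    pieces = []
    for i in range(0, len(data), chunk_size):
        pieces.append(data[i:i + chunk_size].replace(old, new))
    return b"".join(pieces)


data = b"....OLDPAT...."
# With 8-byte chunks the pattern occupies bytes 4..9 and straddles the
# boundary at offset 8, so the chunks are b"....OLDP" and b"AT....".
missed = naive_chunked_replace(data, b"OLDPAT", b"NEWPAT", chunk_size=8)
whole = data.replace(b"OLDPAT", b"NEWPAT")

print(missed == data)   # True: the split pattern was never seen
print(whole)            # b'....NEWPAT....'
```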

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #5
On Thu, 13 Jan 2005 11:40:52 -0800, Jeff Shannon <je**@ccvcorp.com> wrote:
Bengt Richter wrote:
BTW, I'm sure you could write a generator that would take a file name
and oldbinstring and newbinstring as arguments, and read and yield nice
os-file-system-friendly disk-sector-multiple chunks, so you could write

fout = open('mynewbinfile', 'wb')
for buf in updated_file_stream('myoldbinfile','rb', oldbinstring, newbinstring):
    fout.write(buf)
fout.close()
What happens when the bytes to be replaced are broken across a block
boundary? ISTM that neither half would be recognized....

That was part of the exercise ;-)

(Hint: use str.find to find unbroken oldbinstrings in current inputbuffer and buffer out
safe changes, then when find fails, delete the safely used front of the input buffer,
and append another chunk from the input file. Repeat until last chunk has been appended
and find finds no more. Then buffer out the tail of the input buffer (if any) that then
won't have an oldbinstring to change).

I believe that this requires either reading the entire file into
memory, to scan all at once, or else conditionally matching an
arbitrary fragment of the end of a block against the beginning of the
oldbinstring... Given that the file in question is only a few tens of
kbytes, I'd think that doing it in one gulp is simpler. (For a large
file, chunking it might be necessary, though...)


It's certainly simpler to do it in one gulp, but it's not really hard to
do it in chunks. You just have to make sure your input buffer/chunksize is/are
larger than oldbinstring ;-)
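
One way to read the hint above, as a minimal Python 3 sketch rather than anyone's posted solution: carry the last `len(old) - 1` bytes of the buffer over to the next chunk, since that is the longest fragment that could still be the start of a match. The name `replace_stream` is hypothetical.

```python
def replace_stream(chunks, old, new):
    """Yield pieces of a byte stream with every old replaced by new.

    At most len(old) - 1 unprocessed bytes are kept between chunks, so a
    pattern split across a chunk boundary is still found.
    """
    if not old:
        raise ValueError("old pattern must not be empty")
    keep = len(old) - 1          # longest prefix of old that may end a buffer
    buf = b""
    for chunk in chunks:
        buf += chunk
        out = []
        pos = 0
        while True:
            hit = buf.find(old, pos)
            if hit < 0:          # 0 is a valid hit; only -1 means "not found"
                break
            out.append(buf[pos:hit])
            out.append(new)
            pos = hit + len(old)
        # Bytes before safe_end can no longer be part of a future match.
        safe_end = max(pos, len(buf) - keep)
        out.append(buf[pos:safe_end])
        buf = buf[safe_end:]
        piece = b"".join(out)
        if piece:
            yield piece
    if buf:
        yield buf                # tail is shorter than old, so it holds no match
```

With a real file the chunk source could be `iter(lambda: f.read(4096), b"")`, the Python 3 spelling of the `iter(callable, sentinel)` idiom. Unlike the fixed-size output buffers discussed in this thread, this sketch yields pieces of varying length.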

Regards,
Bengt Richter
Jul 18 '05 #6
On Thu, 13 Jan 2005 11:40:52 -0800, Jeff Shannon <je**@ccvcorp.com> wrote:
Bengt Richter wrote:
BTW, I'm sure you could write a generator that would take a file name
and oldbinstring and newbinstring as arguments, and read and yield nice
os-file-system-friendly disk-sector-multiple chunks, so you could write

fout = open('mynewbinfile', 'wb')
for buf in updated_file_stream('myoldbinfile','rb', oldbinstring, newbinstring):
    fout.write(buf)
fout.close()


What happens when the bytes to be replaced are broken across a block
boundary? ISTM that neither half would be recognized....

I believe that this requires either reading the entire file into
memory, to scan all at once, or else conditionally matching an
arbitrary fragment of the end of a block against the beginning of the
oldbinstring... Given that the file in question is only a few tens of
kbytes, I'd think that doing it in one gulp is simpler. (For a large
file, chunking it might be necessary, though...)

Might as well post this, in case you're interested... warning, not very tested.
You want to write a proper test? ;-)

----< sreplace.py >-------------------------------------------------
def sreplace(sseq, old, new, retsize=4096):
    """
    iterate through sseq input string chunk sequence treating it
    as a continuous stream, replacing each substring old with new,
    and generating a sequence of retsize returned strings, except
    that the last may be shorter depending on available input.
    """
    inbuf = ''
    endsseq = False
    out = []
    start = 0
    lenold = len(old)
    lennew = len(new)
    while not endsseq:
        start, endprev = (old and [inbuf.find(old, start)] or [-1])[0], start # list wrapper: a hit at offset 0 must not read as "not found"
        if start<0:
            start = endprev # restore find start pos
            for chunk in sseq: inbuf+= chunk; break
            else:
                out.append(inbuf[start:])
                endsseq = True
        else:
            out.append(inbuf[endprev:start])
            start += lenold
            out.append(new)
        if endsseq or sum(map(len, out))>=retsize:
            s = ''.join(out)
            while len(s)>= retsize:
                yield s[:retsize]
                s = s[retsize:]
            if endsseq:
                if s: yield s
            else:
                out = [s]

if __name__ == '__main__':
    import sys
    args = sys.argv[:]
    usage = """
    Test usage: [python] sreplace.py old new retsize [rest of args is string chunks for test]
        where old is old string to find in chunked stream and new is replacement
        and retsize is returned buffer size, except that last may be shorter"""
    if not args[1:]: raise SystemExit, usage
    try:
        args[3] = int(args[3])
        args[0] = iter(sys.argv[4:])
        print '%r\n-----------\n%s\n------------' %(sys.argv[1:], '\n'.join(sreplace(*args[:4])))
    except Exception, e:
        print '%s: %s' %(e.__class__.__name__, e)
        raise SystemExit, usage
--------------------------------------------------------------------

As mentioned, not tested very much beyond what you see:

[ 2:43] C:\pywk\ut>py24 sreplace.py x _XX_ 20 This is x and abcxdef 012x345 zzxx zzz x
['x', '_XX_', '20', 'This', 'is', 'x', 'and', 'abcxdef', '012x345', 'zzxx', 'zzz', 'x']
-----------
Thisis_XX_andabc_XX_
def012_XX_345zz_XX__
XX_zzz_XX_
------------

[ 2:43] C:\pywk\ut>py24 sreplace.py x _XX_ 80 This is x and abcxdef 012x345 zzxx zzz x
['x', '_XX_', '80', 'This', 'is', 'x', 'and', 'abcxdef', '012x345', 'zzxx', 'zzz', 'x']
-----------
Thisis_XX_andabc_XX_def012_XX_345zz_XX__XX_zzz_XX_
------------

[ 2:43] C:\pywk\ut>py24 sreplace.py x _XX_ 4 This is x and abcxdef 012x345 zzxx zzz x
['x', '_XX_', '4', 'This', 'is', 'x', 'and', 'abcxdef', '012x345', 'zzxx', 'zzz', 'x']
-----------
This
is_X
X_an
dabc
_XX_
def0
12_X
X_34
5zz_
XX__
XX_z
zz_X
X_
------------

[ 2:44] C:\pywk\ut>py24 sreplace.py def DEF 80 This is x and abcxdef 012x345 zzxx zzz x
['def', 'DEF', '80', 'This', 'is', 'x', 'and', 'abcxdef', '012x345', 'zzxx', 'zzz', 'x']
-----------
ThisisxandabcxDEF012x345zzxxzzzx
------------

If you wanted to change a binary file, you'd use it something like this (although probably let
the default buffer size be at 4096, not 20, which is pretty silly other than demoing.
At least the input chunks are 512 ;-)

 >>> from sreplace import sreplace
 >>> fw = open('sreplace.py.txt','wb')
 >>> for buf in sreplace(iter(lambda f=open('sreplace.py','rb'):f.read(512), ''),'out','OUT',20):
 ...     fw.write(buf)
 ...
 >>> fw.close()
 >>> ^Z

[ 3:00] C:\pywk\ut>diff -u sreplace.py sreplace.py.txt
--- sreplace.py Fri Jan 14 02:39:52 2005
+++ sreplace.py.txt Fri Jan 14 03:00:01 2005
@@ -7,7 +7,7 @@
     """
     inbuf = ''
     endsseq = False
-    out = []
+    OUT = []
     start = 0
     lenold = len(old)
     lennew = len(new)
@@ -17,21 +17,21 @@
             start = endprev # restore find start pos
             for chunk in sseq: inbuf+= chunk; break
             else:
-                out.append(inbuf[start:])
+                OUT.append(inbuf[start:])
                 endsseq = True
         else:
-            out.append(inbuf[endprev:start])
+            OUT.append(inbuf[endprev:start])
             start += lenold
-            out.append(new)
-        if endsseq or sum(map(len, out))>=retsize:
-            s = ''.join(out)
+            OUT.append(new)
+        if endsseq or sum(map(len, OUT))>=retsize:
+            s = ''.join(OUT)
             while len(s)>= retsize:
                 yield s[:retsize]
                 s = s[retsize:]
             if endsseq:
                 if s: yield s
             else:
-                out = [s]
+                OUT = [s]

 if __name__ == '__main__':
     import sys
Regards,
Bengt Richter
Jul 18 '05 #7
Bengt, and all,

Thanks for all the good input. The problem seems to be that .find()
is good for text files on Windows, but is not much use when it is
binary data. The script is for an Assy Language build tool, so I know
the exact seek address of the binary data that I need to replace, so
maybe I'll just go that way. It just seemed a little more general to
do a search and replace rather than having to type in a seek address.

Of course I could use a Lib function to convert the binary data to
ascii and back, but seems a little over the top in this case.

Cheers,

--Alan

Jul 18 '05 #8
On 14 Jan 2005 15:40:27 -0800, "yaipa" <ya***@yahoo.com> wrote:
Bengt, and all,

Thanks for all the good input. The problem seems to be that .find()
is good for text files on Windows, but is not much use when it is
binary data. The script is for an Assy Language build tool, so I know

Did you try it? Why shouldn't find work for binary data?? At the end of
this, I showed an example of opening and modding a text file _in binary_.

 >>> s= ''.join(chr(i) for i in xrange(256))
 >>> s
 '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\
 x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`ab
 cdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f
 \x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7
 \xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf
 \xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7
 \xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef
 \xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
 >>> for i in xrange(256):
 ...     assert i == s.find(chr(i))
 ...

I.e., all the finds succeeded for all 256 possible bytes. Why wouldn't you think that would work fine
for data from a binary file? Of course, find is case sensitive and fixed, not a regex, so it's
not very flexible. It wouldn't be that hard to expand to a list of old,new pairs as a change spec
though. Of course that would slow it down some.
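
The same check carries over to Python 3, where the data read from a file opened with 'rb' is a `bytes` object. This is a sketch under that assumption; the `blob` contents are made up for illustration.

```python
# Python 3 equivalent of the check above: bytes objects hold raw byte values.
s = bytes(range(256))
for i in range(256):
    # find() locates the single byte i at offset i, including NUL and 0xFF.
    assert s.find(bytes([i])) == i

# A multi-byte pattern containing CR/LF and high bytes is found the same way.
blob = b"\x7fELF" + b"\x00" * 10 + b"\r\n\xde\xad\xbe\xef" + b"\x00" * 4
print(blob.find(b"\r\n\xde\xad"))   # 14
```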

the exact seek address of the binary data that I need to replace, so
maybe I'll just go that way. It just seemed a little more general to
do a search and replace rather than having to type in a seek address.

Except you run the risk of not having a unique search result, unless you
have a really guaranteed unique pattern.

Of course I could use a Lib function to convert the binary data to
ascii and back, but seems a little over the top in this case.

I think you misunderstand Python strings. There is no need to "convert" the result
of open(filename, 'rb').read(chunksize). Re-read the example below ;-)
[...]

If you wanted to change a binary file, you'd use it something like
                          ^^^^^^^^^^^
(although probably let the default buffer size be at 4096, not 20, which is pretty silly
other than demoing. At least the input chunks are 512 ;-)

 >>> from sreplace import sreplace
 >>> fw = open('sreplace.py.txt','wb')

opens a binary output file

 >>> for buf in sreplace(iter(lambda f=open('sreplace.py','rb'):f.read(512), ''),'out','OUT',20):

iter(f, sentinel) is the format above. It creates an iterator that
keeps calling f() until f()==sentinel, which it doesn't return, and that ends the sequence.
f in this case is lambda f=open(inputfilename):f.read(inputchunksize)
and the sentinel is '' -- which is what is returned at EOF.
The old thing to find was 'out', to be changed to 'OUT', and the 20 was a silly small
return chunk size for the sreplace(...) iterator. All these chunks were simply passed to

 ...     fw.write(buf)
 ...
 >>> fw.close()

and closing the file explicitly wrapped it up.

 >>> ^Z


I just typed that in interactively to demo the file change process with the source itself, so the diff
could show the changes. I guess I should have made sreplace.py runnable as a binary file updater, rather
than a cute demo using command line text. The files are no worry, but what is the source of your old
and new binary patterns that you want to use for find and replace? You can't enter them in unescaped format
on a command line, so you may want to specify them in separate binary files, or you could specify them
as Python strings in a module that could be imported. E.g.,

---< old2new.py >------
# example of various ways to specify binary bytes in strings
from binascii import unhexlify as hex2chr
old = (
    'This is plain text.'
    + ''.join(map(chr,[33,44,55, 0xaa])) + '<<-- arbitrary list of binary bytes specified numerically if desired'
    + chr(33)+chr(44)+chr(55)+ '<<-- though this is plainer for a short sequence'
    + hex2chr('4142433031320001ff') + r'<<-- should be ABC012\x00\x01\xff'
)

new = '\x00'*len(old) # replace with zero bytes
-----------------------
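
On Python 3 the patterns would be `bytes` rather than `str`. The sketch below shows a few equivalent spellings, reusing the hex string from the module above; the variable names are illustrative only.

```python
import binascii

# Several equivalent ways to spell a binary pattern as a Python 3 bytes object.
from_literal = b"ABC012\x00\x01\xff"                  # escapes in a bytes literal
from_hex = bytes.fromhex("4142433031320001ff")        # two hex digits per byte
from_unhexlify = binascii.unhexlify("4142433031320001ff")
from_ints = bytes([0x41, 0x42, 0x43, 0x30, 0x31, 0x32, 0x00, 0x01, 0xFF])

assert from_literal == from_hex == from_unhexlify == from_ints

# Text and numerically specified bytes can be concatenated freely.
old = b"This is plain text." + bytes([33, 44, 55, 0xAA])
new = b"\x00" * len(old)                              # same-length zero fill
```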

BTW: Note: changing binaries can be dangerous! Do so at your own risk!!
And this has not been tested worth a darn, so caveat**n.

---< binfupd.py >------
from sreplace import sreplace
def main(infnam, outfnam, old, new):
    infile = open(infnam, 'rb')
    inseq = iter(lambda: infile.read(4096), '')
    outfile = open(outfnam, 'wb')
    try:
        try:
            for buf in sreplace(inseq, old, new):
                outfile.write(buf)
        finally:
            infile.close()
            outfile.close()
    except Exception, e:
        print '%s:%s' %(e.__class__.__name__, e)

if __name__ == '__main__':
    import sys
    try:
        oldnew = __import__(sys.argv[3])
        main(sys.argv[1], sys.argv[2], oldnew.old, oldnew.new)
    except Exception, e:
        print '%s:%s' %(e.__class__.__name__, e)
        raise SystemExit, """
Usage: [python] binfupd.py infname outfname oldnewmodulename
    where infname is read in binary, and outfname is written
    in binary, replacing instances of old binary data with new
    specified as python strings named old and new respectively
    in a module named oldnewmodulename (without .py extension).
"""
-----------------------

REMEMBER: NO WARRANTY FOR ANY PURPOSE! USE AT YOUR OWN RISK!

And, if you know where to seek to, that seems like the best way ;-)
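
If the offset is known, a seek-and-overwrite might be sketched as follows in Python 3. The helper `patch_at` is hypothetical; it checks that the bytes currently at the offset really are the expected old pattern before writing anything, which keeps the accuracy requirement from the original question.

```python
def patch_at(path, offset, old, new):
    """Overwrite len(old) bytes at a known offset, but only if they match old.

    Hypothetical helper; offset, old and new are supplied by the caller.
    """
    if len(old) != len(new):
        raise ValueError("old and new must have the same length")
    with open(path, "r+b") as f:      # read/write, binary, no truncation
        f.seek(offset)
        found = f.read(len(old))
        if found != old:
            raise ValueError(
                "bytes at offset %d do not match the expected pattern" % offset)
        f.seek(offset)                # rewind to the start of the pattern
        f.write(new)
```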

Regards,
Bengt Richter
Jul 18 '05 #9
On Wed, Jan 12, 2005 at 10:36:54PM -0800, yaipa wrote:
What would be the common sense way of finding a binary pattern in a
.bin file, say some 200 bytes, and replacing it with an updated pattern
of the same length at the same offset?

Also, the pattern can occur on any byte boundary in the file, so
chunking through the code at 16 bytes a frame maybe a problem. The
file itself isn't so large, maybe 32 kbytes is all and the need for
speed is not so great, but the need for accuracy in the
search/replacement is very important.


ok, after having read the answers, I feel I must, once again, bring
mmap into the discussion. It's not that I'm any kind of mmap expert,
that I twirl mmaps for a living; in fact I barely have cause to use it
in my work, but give me a break! this is the kind of thing mmap
*shines* at!

Let's say m is your mmap handle, a is the pattern you want to find,
b is the pattern you want to replace, and n is the size of both a and
b.

You do this:

p = m.find(a)
m[p:p+n] = b

and that is *it*. Ok, so getting m to be a mmap handle takes more work
than open() (*) A *lot* more work, in fact, so maybe you're justified
in not using it; some people can't afford the extra

s = os.stat(fn).st_size
m = mmap.mmap(f.fileno(), s)

and now I'm all out of single-letter variables.
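
Put together, the mmap approach might look like this on Python 3. It is a sketch, not the poster's code: the function name `mmap_replace` is hypothetical, a length of 0 is passed to map the whole file instead of calling os.stat, and the result of find() is checked, because -1 would otherwise be used as a slice index.

```python
import mmap


def mmap_replace(path, old, new):
    """Replace the first occurrence of old with same-length new, in place.

    Returns the offset that was patched, or -1 if old was not found.
    """
    if len(old) != len(new):
        raise ValueError("an mmap slice assignment cannot change the file size")
    with open(path, "r+b") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0) as m:
            p = m.find(old)
            if p >= 0:
                m[p:p + len(old)] = new
                m.flush()
            return p
```

Only the first match is patched here, as in the two-line version above; calling `m.find(old, p + 1)` afterwards would be one way to detect a second occurrence.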

*) why isn't mmap easier to use? I've never used it with something
other than the file size as its second argument, and with its access
argument in sync with open()'s second arg.

--
John Lenton (jo**@grulic.org.ar) -- Random fortune:
If the aborigine drafted an IQ test, all of Western civilization would
presumably flunk it.
-- Stanley Garn

Jul 18 '05 #10