473,770 Members | 7,287 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Can you improve this code : Search Byte[] backwards for byte Pattern

Hi,

The code I have posted searches for a pattern of bytes starting from the end
of
Byte[] array *backwords*, if a match is found, return the starting index of
those
found bytes. Since one of the parameters is starting index you can start the
search
anywhere within the Byte[] you are searching.

Example:

Suppose you have a Byte[] that contains the 4 bytes 0x01, 0x01, 0xBA, 0xC8.
Find the starting index for the two bytes 0x01, 0xBA, searching from the
end.
The method would return 0x01, because that is the starting index of the
sequence
of bytes 0x01, 0xBA.

The code I have posted below works, but can you simplfy / optimize the
FindBackward() method?

Thank you.

Russell Mangel
Las Vegas, NV

PS
The Byte[] array in production, will have average size of 3MB - 10MB, they
are .mp3 files.

// Start code
using System;

class Program
{
private Byte[] m_Bytes = { 0x01, 0x01, 0xBA, 0xC8 };
private const UInt32 NOT_FOUND = 0xFFFFFFFF;
private readonly UInt32 END_OF_FILE;

public Program( )
{
END_OF_FILE = (UInt32)m_Bytes .Length - 1;
}

static void Main( string[] args )
{
Program p = new Program( );

Byte[] searchFor = { 0x01, 0xBA };
UInt32 startSearchInde x = 0x03;

UInt32 x = p.FindBackward( searchFor, startSearchInde x );

Console.WriteLi ne( "{0:X8}", x );

Console.ReadLin e( );
}
public UInt32 FindBackward( Byte[] bytes, UInt32 index )
{
if (bytes == null || index END_OF_FILE )
{
return NOT_FOUND;
}

Boolean isMatch = true;
UInt32 i = index;
UInt32 size = ( UInt32 )bytes.Length - 1;
UInt32 j = size;
UInt32 t = 0;

do
{
isMatch = true;
j = size;
t = 0;
do
{
if ( m_Bytes[ i - t ] != bytes[ j ] )
{
isMatch = false;
break;
}
t++;
}
while ( j-- 0 );

if ( isMatch )
{
return ( i - size ); // Success
}
}
while ( i-- 0 );

return NOT_FOUND;
}
}
// End Code
Sep 25 '08 #1
4 1760
On Thu, 25 Sep 2008 14:23:52 -0700, Russell Mangel <ru*****@tymer. net>
wrote:
[...]
The code I have posted below works, but can you simplfy / optimize the
FindBackward() method?
Why? Are you having some problem with it?

Assuming you always just want to search for a single, pre-determined
string of bytes, I don't think you're going to get much better than what
you've posted. There are some techniques that would allow you to only
have to visit each byte in the searched array once, but I think they would
be overkill if you have only the one string of bytes you're looking for.
They have considerable overhead, both in terms of setup and in terms of
cost during iteration, providing very good improvements in the overall
efficiency of the algorithm as measured by "big-O notation", but
increasing the cost of each loop iteration.

You could certainly clean up the code you posted a bit, and if you wanted
to go nuts you could even work the Array.FindLast< T>() method into it.
But I'm not sure what the point would be, if what you have works and you
don't have any specific complaints about it.

Pete
Sep 25 '08 #2
I just wanted to make it as efficient as possible, because that method will
be called a lot.
I am parsing mp3 files, 1.5 Terabytes, hundreds of thousands of files. The
routine
parse's every frame and every tag of the binary mp3 file. So I wanted to
test using
pure byte arrays first and see how performance is. Then I'd like to try an
unsafe version if it makes sense. I avoided using any of the stream and
binaryreader objects, for now cuz I doubt they would increase performance
because
we are forced to read every Frame inside Mp3 file from start to end. However
the stream/binary reader objects would use much less memory than loading
3-10MB into one Byte[].
Thanks for your reply.

Russ

"Peter Duniho" <Np*********@nn owslpianmk.comw rote in message
news:op******** *******@petes-computer.local. ..
On Thu, 25 Sep 2008 14:23:52 -0700, Russell Mangel <ru*****@tymer. net>
wrote:
>[...]
The code I have posted below works, but can you simplfy / optimize the
FindBackward () method?

Why? Are you having some problem with it?

Assuming you always just want to search for a single, pre-determined
string of bytes, I don't think you're going to get much better than what
you've posted. There are some techniques that would allow you to only
have to visit each byte in the searched array once, but I think they would
be overkill if you have only the one string of bytes you're looking for.
They have considerable overhead, both in terms of setup and in terms of
cost during iteration, providing very good improvements in the overall
efficiency of the algorithm as measured by "big-O notation", but
increasing the cost of each loop iteration.

You could certainly clean up the code you posted a bit, and if you wanted
to go nuts you could even work the Array.FindLast< T>() method into it.
But I'm not sure what the point would be, if what you have works and you
don't have any specific complaints about it.

Pete

Sep 25 '08 #3
I'm currious why you made the mp3 byte array a member of the class
containing the function, but pass the search string to the function. I
doubt it
matters much though.
"Russell Mangel" <ru*****@tymer. netwrote in message
news:%2******** ********@TK2MSF TNGP05.phx.gbl. ..
Hi,

The code I have posted searches for a pattern of bytes starting from the
end of
Byte[] array *backwords*, if a match is found, return the starting index
of those
found bytes. Since one of the parameters is starting index you can start
the search
anywhere within the Byte[] you are searching.

Example:

Suppose you have a Byte[] that contains the 4 bytes 0x01, 0x01, 0xBA,
0xC8.
Find the starting index for the two bytes 0x01, 0xBA, searching from the
end.
The method would return 0x01, because that is the starting index of the
sequence
of bytes 0x01, 0xBA.

The code I have posted below works, but can you simplfy / optimize the
FindBackward() method?

Thank you.

Russell Mangel
Las Vegas, NV

PS
The Byte[] array in production, will have average size of 3MB - 10MB,
they are .mp3 files.

// Start code
using System;

class Program
{
private Byte[] m_Bytes = { 0x01, 0x01, 0xBA, 0xC8 };
private const UInt32 NOT_FOUND = 0xFFFFFFFF;
private readonly UInt32 END_OF_FILE;

public Program( )
{
END_OF_FILE = (UInt32)m_Bytes .Length - 1;
}

static void Main( string[] args )
{
Program p = new Program( );

Byte[] searchFor = { 0x01, 0xBA };
UInt32 startSearchInde x = 0x03;

UInt32 x = p.FindBackward( searchFor, startSearchInde x );

Console.WriteLi ne( "{0:X8}", x );

Console.ReadLin e( );
}
public UInt32 FindBackward( Byte[] bytes, UInt32 index )
{
if (bytes == null || index END_OF_FILE )
{
return NOT_FOUND;
}

Boolean isMatch = true;
UInt32 i = index;
UInt32 size = ( UInt32 )bytes.Length - 1;
UInt32 j = size;
UInt32 t = 0;

do
{
isMatch = true;
j = size;
t = 0;
do
{
if ( m_Bytes[ i - t ] != bytes[ j ] )
{
isMatch = false;
break;
}
t++;
}
while ( j-- 0 );

if ( isMatch )
{
return ( i - size ); // Success
}
}
while ( i-- 0 );

return NOT_FOUND;
}
}
// End Code
Sep 25 '08 #4

"Family Tree Mike" <Fa************ @ThisOldHouse.c omwrote in message
news:e5******** ******@TK2MSFTN GP03.phx.gbl...
I'm currious why you made the mp3 byte array a member of the class
containing the function, but pass the search string to the function. I
doubt it
matters much though.

I just threw that class together, without any design, I am only interested
in improving the method.
The method will be used in another class in a .dll assembly.

Thanks for noticing though.

Russell Mangel
Las Vegas, NV
Sep 26 '08 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
2214
by: Sharif T. Karim | last post by:
I am trying to do the following with my search script that looks for records in a mysql table. The following is an example of what I am trying to do. Text being searched: -- The brown fox jumped over the green fence then jumped into the web monitor. It was hurt so it jumped backwards and fell on its! -- The word we're searching for "web".
699
34222
by: mike420 | last post by:
I think everyone who used Python will agree that its syntax is the best thing going for it. It is very readable and easy for everyone to learn. But, Python does not a have very good macro capabilities, unfortunately. I'd like to know if it may be possible to add a powerful macro system to Python, while keeping its amazing syntax, and if it could be possible to add Pythonistic syntax to Lisp or Scheme, while keeping all of the...
14
2506
by: Stuart D. Gathman | last post by:
I have a function that recognizes PTR records for dynamic IPs. There is no hard and fast rule for this - every ISP does it differently, and may change their policy at any time, and use different conventions in different places. Nevertheless, it is useful to apply stricter authentication standards to incoming email when the PTR for the IP indicates a dynamic IP (namely, the PTR record is ignored since it doesn't mean anything except to...
15
3624
by: SK | last post by:
Hey folks, I am searching for a string (say "ABC") backwards in a file. First I seek to the end. Then I try to make a check like - do { file.clear (); file.get(c); file.seekg(-2, std::ios::cur);
10
3140
by: pembed2003 | last post by:
Hi all, I asked this question in the C group but no one seems to be interested in answering it. :-( Basically, I wrote a search and replace function so I can do: char source = "abcd?1234?x"; char search = '?'; char* replace = "***"; char* result = search_and_replace(source,search,replace);
235
11803
by: napi | last post by:
I think you would agree with me that a C compiler that directly produces Java Byte Code to be run on any JVM is something that is missing to software programmers so far. With such a tool one could stay with C and still be able to produce Java byte code for platform independent apps. Also, old programs (with some tweaking) could be re-compiled and ported to the JVM. We have been developing such a tool over the last 2 years and currently...
17
2087
by: Sean Kenwrick | last post by:
I am writing a byte-code interpreter/emulator for a language that exclusively uses strings for variables (i.e all variables are pushed onto the stack as strings). Therefore for arithmetic operations I need to convert the string to a 32 bit integer, carry out the arithmetic operation, then convert it back to a string before pushing it back onto the stack. This can happen millions of times per second so it is essential that I have...
232
13343
by: robert maas, see http://tinyurl.com/uh3t | last post by:
I'm working on examples of programming in several languages, all (except PHP) running under CGI so that I can show both the source files and the actually running of the examples online. The first set of examples, after decoding the HTML FORM contents, merely verifies the text within a field to make sure it is a valid representation of an integer, without any junk thrown in, i.e. it must satisfy the regular expression: ^ *?+ *$ If the...
47
3450
by: Henning_Thornblad | last post by:
What can be the cause of the large difference between re.search and grep? This script takes about 5 min to run on my computer: #!/usr/bin/env python import re row="" for a in range(156000): row+="a"
0
9592
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10230
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10058
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10004
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8886
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6678
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5313
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
3972
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
2817
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.