Searching of byte array with wildcards?

Hello,

I need to search a byte[] array for a sequence of bytes. The sequence
may include wildcards. For example if the array contains 0xAA, 0xBB,
0xAA, 0xDD then I want to be able to search for 0xAA, 0x?? and get two
matches... I've been all around Google but still can't find any
suggestions on how anything like this can be implemented..... Can
someone please give me a clue? Some well-known algorithm maybe?

Thanks!
Jul 22 '05 #1
Richard Berg wrote:
I need to search a byte[] array for a sequence of bytes. The sequence
may include wildcards. For example if the array contains 0xAA, 0xBB,
0xAA, 0xDD then I want to be able to search for 0xAA, 0x?? and get two
matches... I've been all around Google but still can't find any
suggestions on how anything like this can be implemented..... Can
someone please give me a clue? Some well-known algorithm maybe?


Try boost::regex.
Jul 22 '05 #2
"Richard Berg" <bi***@mail.com> wrote in message
I need to search a byte[] array for a sequence of bytes. The sequence
may include wildcards. For example if the array contains 0xAA, 0xBB,
0xAA, 0xDD then I want to be able to search for 0xAA, 0x?? and get two
matches... I've been all around Google but still can't find any
suggestions on how anything like this can be implemented..... Can
someone please give me a clue? Some well-known algorithm maybe?


Your example suggests you're looking for ? or . matches, that is, match
any one character. In this case, you can use the std::search algorithm
with an appropriate comparison function (strstr takes no comparison
function, so it only handles patterns without wildcards).

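#include <algorithm>
#include <string>

// Comparison function for std::search: c1 comes from the source,
// c2 from the expected string. A '?' in the expected string matches
// any one character.
struct eqany {
    bool operator()(char c1, char c2) const {
        if (c2 == '?') return true;
        return c1 == c2;
    }
};

using std::string;
const string source("ababcdabcxd");
const string expected("cx");
// found == source.end() means there is no match.
string::const_iterator found =
    std::search(source.begin(), source.end(),
                expected.begin(), expected.end(),
                eqany());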
The worst case running time of the above algorithm is O(n*m) where
n = length(source) and m = length(expected).

If all the chars in the expected string are unique (and it contains no
wildcards), we can get O(n) running time. Maintain a pointer to the
expected char in the expected string, initially pointing at 'c'. Now
loop over the chars in the source string. When you find a char that
matches, increment the expected pointer. You'll loop through "ababc"
and the last 'c' matches, so you increment the expected pointer to
expect 'x'. But the next char is 'd', so reset the expected pointer
back to 'c' (and test the current char against 'c' again, in case it
starts a new match). Eventually you'll get to "ababcdabc" and the 'c'
matches, so increment the expected pointer to 'x'. The next character
in the sequence is 'x', which matches the expected character, so
increment the expected pointer. At this point there are no more
expected chars, which means you've found a match.
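
The single-pass scan described above can be sketched roughly as follows. This is only an illustration: the function name find_distinct is made up here, and it assumes every char in the expected string is distinct and that there are no wildcards. On a mismatch it also re-tests the current char against the first expected char, a step the description leaves implicit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first occurrence of `expected` in
// `source`, or std::string::npos if there is none.
// Assumption: all chars in `expected` are distinct and none is a wildcard.
std::size_t find_distinct(const std::string& source, const std::string& expected)
{
    if (expected.empty()) return 0;
    std::size_t j = 0;  // index of the char we currently expect
    for (std::size_t i = 0; i < source.size(); ++i) {
        if (source[i] != expected[j]) {
            j = 0;      // mismatch: go back to the first expected char
        }
        if (source[i] == expected[j]) {
            ++j;        // matched the expected char, advance the pointer
            if (j == expected.size()) return i + 1 - j;
        }
    }
    return std::string::npos;
}
```

With the strings from the post, find_distinct("ababcdabcxd", "cx") looks at each source char once and stops at the "cx" near the end.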

To match any number of characters, we could call std::search
repeatedly. Suppose the expected string is "ba*cx". Split this
into "ba" and "cx". Search the source string for "ba". If it is not
found, then "ba*cx" couldn't possibly be found either. But if
"ba" is found, then search for "cx" in the rest of the source string,
after the "ba".

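#include <algorithm>
#include <string>

// Returns true if the pieces of expected (split at '*') occur in
// source in order, each piece after the end of the previous one.
bool contains_with_star(const std::string& source, const std::string& expected)
{
    typedef std::string::const_iterator Iter;
    Iter s = source.begin();         // where the next search starts
    Iter expect = expected.begin();  // start of the current piece
    while (true)
    {
        Iter expectend = std::find(expect, expected.end(), '*');
        Iter hit = std::search(s, source.end(), expect, expectend);
        if (hit == source.end() && expect != expectend) return false; // not found
        if (expectend == expected.end()) return true;
        s = hit + (expectend - expect);  // continue after this piece
        expect = expectend + 1;          // skip the '*'
    }
}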
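If you write the loop by hand, loop over the chars in the source
string s ("ababcdabcxd") and try each one as the start of a match. It
is sufficient to try the first length(s)-length(s2)+1 chars, where s2
is the expected string, because a match cannot start any later.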
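
A hand-written loop with that bound might look like the sketch below. The name find_wild is hypothetical, and '?' is assumed to be the "any one char" marker, as in eqany above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first match of s2 in s, where '?'
// in s2 matches any one char. Returns std::string::npos if no match.
std::size_t find_wild(const std::string& s, const std::string& s2)
{
    if (s2.size() > s.size()) return std::string::npos;
    // Only the first length(s)-length(s2)+1 positions can start a match.
    const std::size_t starts = s.size() - s2.size() + 1;
    for (std::size_t i = 0; i < starts; ++i) {
        std::size_t j = 0;
        while (j < s2.size() && (s2[j] == '?' || s2[j] == s[i + j])) {
            ++j;
        }
        if (j == s2.size()) return i;  // every char of s2 matched
    }
    return std::string::npos;
}
```

This is the same O(n*m) worst case as std::search with a comparison function. It just makes the loop bound explicit.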

Note the std::search algorithm does not assume null terminated arrays
and works with any container, such as std::vector<char>,
std::vector<int>, etc.
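
For the byte array in the original question, a sketch along these lines should work. The names kAny, ByteEq and search_all are invented for this example, and representing 0x?? as -1 in a vector of int is an assumption, chosen so that the wildcard cannot collide with a real byte value the way '?' would.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

const int kAny = -1;  // hypothetical marker standing in for 0x??

// Comparison function: b comes from the data, p from the pattern.
struct ByteEq {
    bool operator()(std::uint8_t b, int p) const {
        return p == kAny || p == b;
    }
};

// Hypothetical helper: start offsets of every match, overlaps included.
std::vector<std::size_t> search_all(const std::vector<std::uint8_t>& data,
                                    const std::vector<int>& pattern)
{
    std::vector<std::size_t> hits;
    if (pattern.empty()) return hits;
    typedef std::vector<std::uint8_t>::const_iterator Iter;
    Iter from = data.begin();
    while (true) {
        Iter hit = std::search(from, data.end(),
                               pattern.begin(), pattern.end(), ByteEq());
        if (hit == data.end()) break;
        hits.push_back(static_cast<std::size_t>(hit - data.begin()));
        from = hit + 1;  // resume one past the last match start
    }
    return hits;
}
```

Searching 0xAA, 0xBB, 0xAA, 0xDD for the pattern 0xAA, kAny gives two matches, at offsets 0 and 2.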
Jul 22 '05 #3
Richard Berg wrote:
Hello,

I need to search a byte[] array for a sequence of bytes. The sequence
may include wildcards. For example if the array contains 0xAA, 0xBB,
0xAA, 0xDD then I want to be able to search for 0xAA, 0x?? and get two
matches... I've been all around Google but still can't find any
suggestions on how anything like this can be implemented..... Can
someone please give me a clue? Some well-known algorithm maybe?

Thanks!


See language theory books such as _Introduction to Automata Theory,
Languages, and Computation_, by Hopcroft and Ullman, 1979, or similar.

See compiler textbooks such as _Compilers: Principles, Techniques, and
Tools_, by Aho, Sethi, and Ullman, 1987, Chapter 3: Lexical Analysis, or
similar.

Topics: regular expressions, regular languages, deterministic finite
automata, lexical analysis.

You would build a little dynamic lexical-analyzer generator. The
generated lexical analyzer would be a little deterministic finite
automaton (DFA). Start off reading about DFAs and regular expressions
and you will rather soon see how to do it.
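
To give a feel for the automaton approach without writing a generator, here is a rough sketch that simulates the pattern's automaton directly with bit operations (the technique usually called Shift-And). Note that this is a swapped-in simplification, not the DFA generator described above: it only handles fixed-length patterns with single-byte wildcards, and only up to 64 pattern bytes. The function name shift_and and the use of -1 for 0x?? are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: start offsets of every match of pattern in data.
// Pattern elements are byte values 0..255, or -1 for "any byte".
std::vector<std::size_t> shift_and(const std::vector<std::uint8_t>& data,
                                   const std::vector<int>& pattern)
{
    std::vector<std::size_t> hits;
    const std::size_t m = pattern.size();
    if (m == 0 || m > 64) return hits;  // one state bit per pattern position

    // mask[c] has bit j set when pattern position j accepts byte c.
    std::vector<std::uint64_t> mask(256, 0);
    for (std::size_t j = 0; j < m; ++j) {
        const std::uint64_t bit = std::uint64_t(1) << j;
        if (pattern[j] < 0) {
            for (std::size_t c = 0; c < 256; ++c) mask[c] |= bit;
        } else {
            mask[static_cast<std::size_t>(pattern[j]) & 0xFF] |= bit;
        }
    }

    const std::uint64_t accept = std::uint64_t(1) << (m - 1);
    std::uint64_t state = 0;  // bit j set: pattern[0..j] matches ending here
    for (std::size_t i = 0; i < data.size(); ++i) {
        state = ((state << 1) | 1) & mask[data[i]];
        if (state & accept) hits.push_back(i + 1 - m);
    }
    return hits;
}
```

Each input byte costs one shift, one OR and one AND, so the scan is linear in the length of the data regardless of where the wildcards sit.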

Jul 22 '05 #4
"Matt" <ma**@themattfella.zzzz.com> wrote in message news:aOpfc.1742
See language theory books such as _Introduction to Automata Theory,
Languages, and Computation_, by Hopcroft and Ullman, 1979, or similar.

See compiler textbooks such as _Compilers: Principles, Techniques, and
Tools_, by Aho, Sethi, and Ullman, 1987, Chapter 3: Lexical Analysis, or
similar.

Topics: regular expressions, regular languages, deterministic finite
automata, lexical analysis.

You would build a little dynamic lexical-analyzer generator. The
generated lexical analyzer would be a little deterministic finite
automaton (DFA). Start off reading about DFAs and regular expressions
and you will rather soon see how to do it.


Fine, that's the real thing. But what if std::search will suffice?
Jul 22 '05 #5
Siemel Naran wrote:
"Matt" <ma**@themattfella.zzzz.com> wrote in message news:aOpfc.1742

See language theory books such as _Introduction to Automata Theory,
Languages, and Computation_, by Hopcroft and Ullman, 1979, or similar.

See compiler textbooks such as _Compilers: Principles, Techniques, and
Tools_, by Aho, Sethi, and Ullman, 1987, Chapter 3: Lexical Analysis, or
similar.

Topics: regular expressions, regular languages, deterministic finite
automata, lexical analysis.

You would build a little dynamic lexical-analyzer generator. The
generated lexical analyzer would be a little deterministic finite
automaton (DFA). Start off reading about DFAs and regular expressions
and you will rather soon see how to do it.

Fine, that's the real thing. But what if std::search will suffice?


For the OP's special purpose, your solution is simpler and better.

Jul 22 '05 #6
