Hi,
Got a query, don't know if anyone would be kind enough to help...
I have to either match a string exactly eg
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F123456789ABCDE F",
or if a packed byte bit mask is set mask1 = 1, mask2 = 4,
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F1234x6789ABCDE x".
The mask means that the character at x can match either the original
number, or a set character ('x'). The original string can be up to 80
chars long with 10 8-bit bytes containg the masks.
I have done this, by a plodding for loop with lots of ifs, alas it's
untidy and slow. Any pointers to tidy & speed would be a great help.
Thank-you. 9 1430
What is the relationship between x and the mask?
Quentin Godfroy <qu************ *@hotmail.comwr ites:
What is the relationship between x and the mask?
Imagine reading the above without having seen the previous article.
Please provide context when you post a followup.
See <http://cfaj.freeshell. org/google/>.
--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
On 23 May 2007 16:35:34 -0700, Quentin Godfroy
<qu************ *@hotmail.comwr ote:
>What is the relationship between x and the mask?
if that byte is masked, the field can either be it's original value
or one other pre-defined character
so string to compare against = "12345678", mask = 2 (00000010),
pre-defined char = "Z"
Strings to compare:-
"12345678" = match
"123456Z8" = match
"12345Z78" = NO match
"12345578" = NO match
Hope that makes it a little clearer.
On 24 May, 00:09, f...@off.spamme rs.org wrote:
Hi,
Got a query, don't know if anyone would be kind enough to help...
I have to either match a string exactly eg
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F123456789ABCDE F",
or if a packed byte bit mask is set mask1 = 1, mask2 = 4,
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F1234x6789ABCDE x".
The mask means that the character at x can match either the original
number, or a set character ('x'). The original string can be up to 80
chars long with 10 8-bit bytes containg the masks.
I have done this, by a plodding for loop with lots of ifs, alas it's
untidy and slow. Any pointers to tidy & speed would be a great help.
As far as I understand the required behaviour, you have a string
("original") of up to 80 bytes which you are examining, and another
string ("pattern") of the same size (?) against which you are
comparing it. In addition to simple character equality, you need to
test bytes at specific locations - bytes at these locations in
"original" can either match the corresponding byte in "pattern" or
match a single specific other byte. Locations for this dual matching
are specified by bit masks in an additional byte array, with one byte
in the array corresponding to 8 bytes in the original string. Is that
correct so far?
My natural instinct here is to split the matching down into 8-byte
"chunks". This was fairly easy to do and IMHO didn't seem hugely ugly
(I'm not going to post my code yet, as it's far from clear that I've
understood the requirement). How ugly is your code? Could you post a
small self-contained example to show what you've currently tried?
I haven't measured speed in my case. Why do you feel it's slow in
yours? What measures have you used? Have you profiled to the code to
see where the time is going? fe**@off.spamme rs.org writes:
Hi,
Got a query, don't know if anyone would be kind enough to help...
I have to either match a string exactly eg
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F123456789ABCDE F",
or if a packed byte bit mask is set mask1 = 1, mask2 = 4,
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F1234x6789ABCDE x".
The mask means that the character at x can match either the original
number, or a set character ('x'). The original string can be up to 80
chars long with 10 8-bit bytes containg the masks.
I have done this, by a plodding for loop with lots of ifs, alas it's
untidy and slow. Any pointers to tidy & speed would be a great help.
It would help more if you posted what *you* have. We could then
decide if we see anything better. Anyway, in case it help, I'd do it
like this in the first instance:
#include <limits.h>
#define BIT_SET(mask, x) ((mask)[(x) / CHAR_BIT] & (1 << (x) % CHAR_BIT))
int quasi_match(con st char *s1, const char *s2,
const unsigned char *mask, char optional)
{
int i = 0;
while (s1[i] && (s1[i] == s2[i] || BIT_SET(mask, i) && s2[i] == optional))
i += 1;
return s1[i] == s2[i];
}
The meaning (numbering) of the mask bits can be changed to reflect
your usage simply by altering the arithmetic in the macro.
There is quite a lot of arithmetic in the macro but:
(a) the compiler should make /CHAR_BIT and %CHAR_BIT fast on most
machines;
(b) it is only used when the characters *don't* match.
--
Ben. fe**@off.spamme rs.org wrote:
>
Got a query, don't know if anyone would be kind enough to help...
I have to either match a string exactly eg
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F123456789ABCDE F",
or if a packed byte bit mask is set mask1 = 1, mask2 = 4,
"123456789ABCDE F123456789ABCDE F" == "123456789ABCDE F1234x6789ABCDE x".
The mask means that the character at x can match either the original
number, or a set character ('x'). The original string can be up to 80
chars long with 10 8-bit bytes containg the masks.
I have done this, by a plodding for loop with lots of ifs, alas it's
untidy and slow. Any pointers to tidy & speed would be a great help.
The optimum will probably depend on your data. I suggest an
initial check with strcmp. If that shows equal, no more fuss. You
might want to write your own str_cmp, and have it return a pointer
to the unequal location. You can then proceed accordingly.
--
If you want to post a followup via groups.google.c om, ensure
you quote enough for the article to make sense. Google is only
an interface to Usenet; it's not Usenet itself. Don't assume
your readers can, or ever will, see any previous articles.
More details at: <http://cfaj.freeshell. org/google/>
--
Posted via a free Usenet account from http://www.teranews.com
On 24 May 2007 02:32:24 -0700, ma**********@po box.com wrote:
>On 24 May, 00:09, f...@off.spamme rs.org wrote:
>Hi,
Got a query, don't know if anyone would be kind enough to help...
I have to either match a string exactly eg "123456789ABCD EF123456789ABCD EF" == "123456789ABCDE F123456789ABCDE F", or if a packed byte bit mask is set mask1 = 1, mask2 = 4, "123456789ABCD EF123456789ABCD EF" == "123456789ABCDE F1234x6789ABCDE x". The mask means that the character at x can match either the original number, or a set character ('x'). The original string can be up to 80 chars long with 10 8-bit bytes containg the masks.
I have done this, by a plodding for loop with lots of ifs, alas it's untidy and slow. Any pointers to tidy & speed would be a great help.
As far as I understand the required behaviour, you have a string ("original") of up to 80 bytes
No, can be up to 40, but actually shorter (37) (5 mask bytes * 8 bits
= 40 'masks').
which you are examining, and another string ("pattern") of the same size (?) against which you are comparing it. In addition to simple character equality, you need to test bytes at specific locations - bytes at these locations in "original" can either match the corresponding byte in "pattern" or match a single specific other byte. Locations for this dual matching are specified by bit masks in an additional byte array, with one byte in the array corresponding to 8 bytes in the original string. Is that correct so far?
100% correct. Either I'm good at explaining (*grin*) or you can read
between the lines (more likely *sigh*).
> My natural instinct here is to split the matching down into 8-byte "chunks". This was fairly easy to do and IMHO didn't seem hugely ugly (I'm not going to post my code yet, as it's far from clear that I've understood the requirement). How ugly is your code?
So ugly the code wears a mask!
>Could you post a small self-contained example to show what you've currently tried?
Of course I can:-
//str1 is the string to compare
//str2 is the const comparison string,
//it has six extra bytes, 5 for the masks and 1 that is the actual
value we need (Action)
//DEF_STR_LEN is the length of str2
//Z is the 'mask' character so the strings can match, or if mask
applies can match to original or the 'Z'.
//Like I said, loads of if's and just looks a mess to me (And I wrote
it!). I HAVEN'T profiled it yet (dreading it!),
//but this is called frequently so the fastest it can be (within
reason).
if (!strncmp((char *)str1,
(char *)str2,
(DEF_STR_LEN - 6))
) Action = str2[(DEF_STR_LEN - 1)];
else if (groupMask)
{
j = 0;
do
{
//Frig for strs that aren't a multiple of 8.
if (j == 4) l = 4; else l = 7;
{
k = 0;
do
{
if (*(str1+(j*8)+k ) == str2[((j*8) +
k)]) match = 1;
else if (str2[(DEF_STR_LEN - 2 + j)]
<< k & 0x80) if(*(str1+(j*8) +k) == 'Z') match = 1;
else match = 0;
} while ((k++ < l) && match);
}
} while ((++j < 5) && match);
if (match) action = str2[(DEF_STR_LEN - 1)];
}
>I haven't measured speed in my case. Why do you feel it's slow in yours? What measures have you used? Have you profiled to the code to see where the time is going?
Not profiled (yet), but I can't see how it's going to wizz!
Kind Regards.
On Thu, 24 May 2007 17:31:42 GMT, fe**@off.spamme rs.org wrote:
SNIP
> if (!strncmp((char *)str1,
(char *)str2, (DEF_STR_LEN - 6))
) Action = str2[(DEF_STR_LEN - 1)]; else if (groupMask) { j = 0; do { //Frig for strs that aren't a multiple of 8. if (j == 4) l = 4; else l = 7; { k = 0; do { if (*(str1+(j*8)+k ) == str2[((j*8) + k)]) match = 1; else if (str2[(DEF_STR_LEN - 2 + j)] << k & 0x80) if(*(str1+(j*8) +k) == 'Z') match = 1;
Oooops, cut & paste! That should read:-
else if ((str2[(DEF_STR_LEN - 2 + j)] << k & 0x80) && (*(str1+(j*8)+k )
== 'Z')) match = 1;
> else match = 0; } while ((k++ < l) && match); } } while ((++j < 5) && match); if (match) action = str2[(DEF_STR_LEN - 1)]; }
SNIP
Here's my off-the-cuff version. I like Ben's version for terseness,
but I wanted to produce something which expressed my understanding of
the problem - which is that the masks naturally split the original and
pattern strings into chunks.
I'd start with this, or something like it, and then look at profiling
it to find out where time was being eaten and focus on how that could
be trimmed.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
int matches(char *original,char *pattern, unsigned char *masks,char
fixed);
int chunkMatches(ch ar *original,char *pattern, unsigned char mask,char
fixed,int chunkSize);
int main(void) {
/* One good and 2 bad strings */
char *original[] = {
"12Z456789ZBCDE F123456789ABCDE F",
"12Z456789ZBCDE F123X56789ABCDE F",
"12Z456789XBCDE F123456789ABCDE F",
NULL
};
char *pattern = "123456789ABCDE F123456789ABCDE F";
/* allow the "fixed" value in char 3 of first 8 byte chunk
char 2 of second 8 byte chunk, but not in the third */
unsigned char masks[3]= { 1 << 5, 1 << 6, 0 };
char fixed = 'Z';
char **o;
for(o=original; *o != NULL;o++) {
printf("[%s] %s match [%s](%c)\n",
*o,matches(*o,p attern,masks,'Z ')?"did":"did
not",pattern,fi xed);
}
}
int matches(char *original,char *pattern, unsigned char *masks,char
fixed) {
int chunks;
int oLen = strlen(original );
int pLen = strlen(pattern) ;
if (oLen != pLen) {
return 0;
}
chunks = oLen/CHAR_BIT + ((oLen % CHAR_BIT) != 0);
while(chunks--) {
int chunkSize = strlen(original );
if (chunkSize CHAR_BIT) {
chunkSize = CHAR_BIT;
}
if (!chunkMatches( original,patter n,*masks,fixed, chunkSize)){
return 0;
}
original += chunkSize;
pattern += chunkSize;
masks += 1;
}
return 1;
}
int chunkMatches(ch ar *original,char *pattern, unsigned char mask,char
fixed,int chunkSize) {
unsigned char currentMask = 1<<(CHAR_BIT - 1);
while(chunkSize--) {
if (*original == *pattern) {
;
} else if ((currentMask & mask) && (*original == fixed)) {
;
} else {
return 0;
}
original++;
pattern++;
currentMask >>= 1;
}
} This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Piotr |
last post by:
Please help me, I can't solve this problem:
I have domain:
www.main_domain.com/index.php?userid=23¶metr1=2¶metr2=3
(and other site files like
www.main_domain.com/page1.php?userid=23&otherparameter=32
www.main_domain.com/page2.php?userid=23&otherparameter=11 etc.)
At this address there is an template site for many users. Personal site
is identified by userid.
|
by: Ginchy |
last post by:
I have uploaded a small 3 page web using MS Publisher 2003 and after
uploading I switched on url masking to cloak the url.
I am certain that it worked fine.
I simply changed the colour scheme in Publisher and reuploaded the page. The
links no longer work?
When I switch off url masking it works as it should?
|
by: Stefan Kowalski |
last post by:
Hi Newsgroup
Does anyone here care to share their thoughts/experiences with using
masking? I am storing categories of books which libraries stock, so that I
might have codes:
015-000-000 = Languages
015-001-000 = European languages
015-001-001 = English
015-001-002 = German
015-001-003 = French
|
by: john sutor |
last post by:
Is there a way in C# to add masking (for a date format) to a textbox?
|
by: Dana Epp |
last post by:
I have a ToolBarButton that when I set it to disabled (button.Enabled =
false;) causes a really ugly gray masking effect to take place. This is
normal and the intended way of the button, but I would like to clean it up.
Instead of using the gray mask, I want to use my own grayscaled high quality
image. I thought by simply setting button.ImageIndex = new_num; (where
new_num is the index of the grayscale image) I could do this. But alas, that...
| |
by: Brian Keating EI9FXB |
last post by:
Hello there,
Does anyone know how to to Bitmap masking in .NET,
In c++ i used,
CreateCompatibleDC
GetBitmap
Then on the bitmap and the bitmap mask i called.
{SelectBitmap
StretchBlt}
|
by: NBB |
last post by:
I can't figure this one out. Here's the situation, should be pretty
for the pros in here.
I have a ListBox that is populated with the DataTextField and
DataValueField flags of the DataSet. Everything populates just fine
and works great.
The DataTextField displays text from a database column called Path,
which is the absolute URL of a file. What I would like it to display
is just the filename, essentially trimming off everything...
|
by: Hai Nguyen |
last post by:
Hi All
I have a textbox want to mask as phone box. How can I do it?
Thanks
|
by: Daniel Smedegaard Buus |
last post by:
Hey all :)
I was wondering about the $error_types (I particularly notice the 's' suffix
when reading the manual) parameter for 'set_error_handler()':
Can be used to mask the triggering of the error_handler function just like
the error_reporting ini setting controls which errors are shown. Without
this mask set the error_handler will be called for every error regardless
to the setting of the error_reporting setting.
|
by: meghla |
last post by:
I can not understand how to implement bit masking on a c code?? while working with array it is taking too much time.................. so i need help.
i need some suggestion and sample code of C dealing with bit masking..............; its argent............. plz help me.
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
| |
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms.
Adolph will...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
| |
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |