Bytes | Software Development & Data Engineering Community

Array for speed or...

I have a substantial random stream of (numerical, long int) data (coming
from several SCADA sources) that needs to be passed through a function
that tallies the occurrences of several different bit patterns in each
number (or, in fact, each 32-bit row, of which a few (leftmost) bits are
bogus).

Speediness of execution is high on the priority list.

I thought of a solution along these lines, avoiding relatively slow
switch/case or several if/elseif combinations:

If the data would have been short (or int) I'd be able to do this:

#include <stdio.h>
#include <limits.h>   /* for USHRT_MAX */

/* assume the 3 leftmost bits are bogus in case we use short */
#define BOGUS_BIT_MASK (USHRT_MAX >> 3)

#define BATCH_SIZE 1024

typedef unsigned long long ull;
typedef FILE* fileptr;

ull frequency[(USHRT_MAX >> 3) + 1];
int batch[BATCH_SIZE];
fileptr stream;

void tally (const int batch[], int batch_size, ull freq[]);
void zero_freq (ull freq[], int n);
int getdata (int batch[], int batch_size, fileptr fp);
void process_results (const ull freq[]);

int main (void)
{
    // zero all cells in frequency array (definition omitted)
    zero_freq (frequency, (USHRT_MAX >> 3) + 1);
    while (getdata (batch, BATCH_SIZE, stream)) {
        // getdata definition omitted here
        tally (batch, BATCH_SIZE, frequency);
    }
    process_results (frequency);
    // function definition omitted here
    return 0;
}

void tally (const int batch[], int batch_size, ull freq[])
{
    int i;
    for (i = 0; i < batch_size; i++) {
        freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_01]++;
        freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_02]++;
        freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_03]++;
        freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_04]++;
        // etc. etc.; in reality the bit patterns will be stored in an
        // array and the bogus mask applied once per loop iteration.
    }
}

Unfortunately, with int32 data, using a mostly empty array of the desired
size (the largest bit-pattern value comes pretty close to 2**29) is
challenging, if not impossible, on a given machine/OS.

In total, it's likely up to a few hundred bitpatterns may be used, but I
can't guarantee yet it won't expand later.

I quickly abandoned the if/then approach, as it became obvious it would
result in a messy, slow, hard to debug/adapt spaghetti.

Has anyone got a better idea for the huge array? Is there an easy (to
maintain) way of squeezing out all the unused cells in the array? Some
kind of mapping perhaps? I tried figuring out a list approach, but that
more or less stranded. Besides, I could not see an easy way to target the
right chain in the list in a way similar to the array approach. If
indeed my approach deserves further development, do you expect a big
further speed gain from coding the tally function in an OS-specific ASM
version? I know this is platform specific; I'd just like to get a feel
for your opinions on how successful modern compilers are at optimizing.

I am fully aware I may be completely off on the wrong foot here, in
which case I am glad to be steered in the right direction. Please don't
let a possible typo in the above semi-code put you off; it's not the real
thing, as it is still in the experimental stage, rather more complicated,
partly handed to me by others, and messy and illegible as hell.

Thanks for your help in advance, much appreciated.
Sh.
PS In case you're interested, the final program will be put to use as a
data hub for an (in-house developed) logistics planning package and
feedback to both SCADA devices and an accounting database. All efforts
so far have stranded; now I am asked to burn my fingers... Interesting
yet challenging.
Aug 23 '06 #1


Schraalhans Keukenmeester wrote on 08/23/06 17:44:
[... much snipped ...]
void tally (const int batch[], int batch_size, ull freq[])
{
int i;
for(i=0;i<batch_size;i++) {
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_01]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_02]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_03]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_04]++;
// etc. etc. in reality bitpatterns will be stored in an array
// and bogus mask is applied once each run of the loop.
}
}

Unfortunately, with int32 data using a mostly empty array of the desired
size (largest bitpattern value comes pretty close to 2**29) is
challenging, if not impossible on a given machine/os.

In total, it's likely up to a few hundred bitpatterns may be used, but I
can't guarantee yet it won't expand later.
Exactly what are you trying to do? You said you were
trying to "tally the occurrences" of bit patterns, but that
is not quite what the code does. For example, suppose there
is just one bit pattern, BIT_PATTERN_01 = 0x1. Then freq[0]
will count all the even data values and freq[1] will count
all the odd values. The freq[1] count will include not only
the data value 0x1 (matching BIT_PATTERN_01), but also all the
values 0x3, 0x51, 0xffff, ...

Things get even more confused if one BIT_PATTERN_xx is a
"subset" of another: BIT_PATTERN_01 = 0x1, BIT_PATTERN_02 = 0x3,
for example.

Please clarify what kind of "tally" you want to compute.

--
Er*********@sun.com

Aug 23 '06 #2
In article <44**********************@news.xs4all.nl>,
Schraalhans Keukenmeester <fi********************@xsfourall.ennel> wrote:
>I have a substantial random stream of (numerical, long int) data (coming
from several SCADA sources) that needs to be passed thru a function that
tallies the occurrences of several different bit patterns in each number
>Speediness of execution is high on the priority list.
>Unfortunately, with int32 data using a mostly empty array of the desired
size (largest bitpattern value comes pretty close to 2**29) is
challenging, if not impossible on a given machine/os.
>In total, it's likely up to a few hundred bitpatterns may be used, but I
can't guarantee yet it won't expand later.
Is it necessary that the bit patterns be easily changeable?

Two approaches come to mind:

1) If it is acceptable to spin a new executable when the patterns
change, then write a program that reads the bit patterns,
and spits out code that matches appropriately. When the
pattern file changes, rebuild that one bit of code and relink.

2) If it is not acceptable to spin a new executable, then you might
be able to find a data table representation suitable for loading
a match function from a data file. This might end up
slower than the above, as you might end up "interpreting" a bit.

freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_01]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_02]++;
That doesn't seem right -- for example, the all-0 bit pattern
is going to increment freq[0] for every BIT_PATTERN_* .
If you have "a few hundred" bit patterns, you are definitely going
to end up with these kinds of clashes.

Don't you need something closer to:

freq[1] += batch[i] & BIT_MASK_01 == BIT_PATTERN_01;
freq[2] += batch[i] & BIT_MASK_02 == BIT_PATTERN_02;
--
Okay, buzzwords only. Two syllables, tops. -- Laurie Anderson
Aug 23 '06 #3
Eric Sosman wrote:
>
Schraalhans Keukenmeester wrote on 08/23/06 17:44:
>[... much snipped ...]
void tally (const int batch[], int batch_size, ull freq[])
{
int i;
for(i=0;i<batch_size;i++) {
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_01]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_02]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_03]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_04]++;
// etc. etc. in reality bitpatterns will be stored in an array
// and bogus mask is applied once each run of the loop.
}
}

Unfortunately, with int32 data using a mostly empty array of the desired
size (largest bitpattern value comes pretty close to 2**29) is
challenging, if not impossible on a given machine/os.

In total, it's likely up to a few hundred bitpatterns may be used, but I
can't guarantee yet it won't expand later.

Exactly what are you trying to do? You said you were
trying to "tally the occurrences" of bit patterns, but that
is not quite what the code does. For example, suppose there
is just one bit pattern, BIT_PATTERN_01 = 0x1. Then freq[0]
will count all the even data values and freq[1] will count
all the odd values. The freq[1] count will include not only
the data value 0x1 (matching BIT_PATTERN_01), but also all the
values 0x3, 0x51, 0xffff, ...

Things get even more confused if one BIT_PATTERN_xx is a
"subset" of another: BIT_PATTERN_01 = 0x1, BIT_PATTERN_02 = 0x3,
for example.

Please clarify what kind of "tally" you want to compute.
You are right, yet in practice either the actual bitmasks are mutually
exclusive, or non-conforming data (per some extra requirements) will be
reworked into another form. As I said, it's more complicated in reality.
In the processing stage only the 'cells' indexed by the bitmasks are
used; the rest is discarded. I am fully aware additional cells will be
incremented. I admit my statement that the data is random isn't
completely true.

If the above requirements cannot be met for all data/masks, and some
combinations lead to other valid pattern counts being incremented, the
stream will have to be split into substreams. Actually the data is already
sorted into substreams based on identifiers in the raw data stream (which
comes in the form source_identifier, packet_id, packetsize, data, data,
[...,] data, alignment, crc)

Right now I am mainly interested in the usability, in general, of the
suggested algorithm with speed in mind. I hope you can help me with that!

Thanks for your words of wisdom and your time invested!
Rgds
Sh.

Aug 23 '06 #4
Walter Roberson wrote:
In article <44**********************@news.xs4all.nl>,
Schraalhans Keukenmeester <fi********************@xsfourall.ennel> wrote:
>I have a substantial random stream of (numerical, long int) data (coming
from several SCADA sources) that needs to be passed thru a function that
tallies the occurrences of several different bit patterns in each number
>Speediness of execution is high on the priority list.
>Unfortunately, with int32 data using a mostly empty array of the desired
size (largest bitpattern value comes pretty close to 2**29) is
challenging, if not impossible on a given machine/os.
>In total, it's likely up to a few hundred bitpatterns may be used, but I
can't guarantee yet it won't expand later.

Is it necessary that the bit patterns be easily changeable?

Two approaches come to mind:

1) If it is acceptable to spin a new executable when the patterns
change, then write a program that reads the bit patterns,
and spits out code that matches appropriately. When the
pattern file changes, rebuild that one bit of code and relink.

2) If it is not acceptable to spin a new executable, then you might
be able to find a data table representation suitable for loading
a match function from a data file. This might end up
slower than the above, as you might end up "interpreting" a bit.

> freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_01]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_02]++;

That doesn't seem right -- for example, the all-0 bit pattern
is going to increment freq[0] for every BIT_PATTERN_* .
If you have "a few hundred" bit patterns, you are definitely going
to end up with these kinds of clashes.

Don't you need something closer to:

freq[1] += batch[i] & BIT_MASK_01 == BIT_PATTERN_01;
freq[2] += batch[i] & BIT_MASK_02 == BIT_PATTERN_02;
I am afraid it may end up being a necessity indeed, which would impact
speed quite seriously. I had not fully thought through the consequences
of just incrementing based on masking given a large(r) set of masks. Too
bad, can't have it all, but this DOES help me limit the array-size. I
think you helped me here. THANKS! Curious how gcc or VisualC are gonna
code/optimize this...
Aug 23 '06 #5
Schraalhans Keukenmeester wrote:
I have a substantial random stream of (numerical, long int) data (coming
from several SCADA sources) that needs to be passed thru a function that
tallies the occurrences of several different bit patterns in each number
(or in fact 32 bit row, of which a few (leftmost) are bogus).
Has anyone got a better idea for the huge array ? Is there an easy (to
maintain) way of squeezing out all the non-used cells in the array? Some
kind of mapping perhaps ? I tried figuring out a list approach, but that
more or less stranded. Besides I could not see an easy way to target the
right chain in the list in a way similar to the array approach. If
I assume you have some idea which bit patterns may be coming. The
following is a slightly more readable (to some) variant of if/else that
may work. It's fairly easy to add a new pattern at the top (or make it
patnum-- and insert at the bottom).

int patnum = 0;
switch (pattern) {
case pat3:
    patnum++;   /* fall through */
case pat2:
    patnum++;   /* fall through */
case pat1:
    patnum++;
}
pattern_counter[patnum]++;

Aug 23 '06 #6


Schraalhans Keukenmeester wrote on 08/23/06 19:05:
Walter Roberson wrote:
>>[...]
Don't you need something closer to:

freq[1] += batch[i] & BIT_MASK_01 == BIT_PATTERN_01;
freq[2] += batch[i] & BIT_MASK_02 == BIT_PATTERN_02;

I am afraid it may end up being a necessity indeed, which would impact
speed quite seriously. I had not fully thought through the consequences
of just incrementing based on masking given a large(r) set of masks. Too
bad, can't have it all, but this DOES help me limit the array-size. I
think you helped me here. THANKS! Curious how gcc or VisualC are gonna
code/optimize this...
Maybe you could just tally the data values as they arrive,
letting the "uninteresting" values increment counters you don't
care about:

for (i = 0; i < value_count; ++i)
    freq[ value[i] & BOGUS_BIT_MASK ]++;

for (i = 0; i < pattern_count; ++i)
    printf ("Pattern %X occurred %d times\n",
            bit_pattern[i], freq[ bit_pattern[i] ]);

Needs a big freq[] array (a big freakin' array?), but it has
the virtue of simplicity.

Take a step back for a moment, though: How much speed do
you actually need? If you've got a large quantity of values
to screen for patterns, they're probably being supplied from
an I/O source of some kind. I/O is usually several orders of
magnitude slower than the CPU (disk service times are in the
milliseconds, CPU cycle times are in the *nano*seconds). If
you squeeze the last drop of performance from your screening
code, you may find that you've done little more than increase
the iteration count of the system idle loop ... It's well to
do things efficiently, but speeding up something that is not
the bottleneck will make only a small overall improvement.

--
Er*********@sun.com

Aug 24 '06 #7

Eric Sosman wrote:
Take a step back for a moment, though: How much speed do
you actually need? If you've got a large quantity of values
to screen for patterns, they're probably being supplied from
an I/O source of some kind. I/O is usually several orders of
magnitude slower than the CPU (disk service times are in the
milliseconds, CPU cycle times are in the *nano*seconds).

Eric's got a good point. CPUs these days are much faster than most
I/O devices.
Disks take milliseconds to find a file, then transfer at anywhere from
10 to 200 megabytes/sec. CPUs can usually keep up with all but the
fastest disks.
But you mentioned SCADA data, which may be coming in from almost any
port.

What is the speed of your SCADA data flow? If it's less than a
megabyte a second, the CPU tallying time is probably not the
bottleneck.

Aug 24 '06 #8
Ancient_Hacker wrote:
Eric Sosman wrote:
> Take a step back for a moment, though: How much speed do
you actually need? If you've got a large quantity of values
to screen for patterns, they're probably being supplied from
an I/O source of some kind. I/O is usually several orders of
magnitude slower than the CPU (disk service times are in the
milliseconds, CPU cycle times are in the *nano*seconds).


Eric's got a good point. CPUs these days are much faster than most
I/O devices.
Disks take milliseconds to find a file, then transfer at anywhere from
10 to 200 megabytes/sec. CPUs can usually keep up with all but the
fastest disks.
But you mentioned SCADA data, which may be coming in from almost any
port.

What is the speed of your SCADA data flow? If it's less than a
megabyte a second, the CPU tallying time is probably not the
bottleneck.
It's not completely clear to me right now what the combined stream
from the data collector will be. I've received some specs of the devices
it gets its data from; I believe the highest POSSIBLE output rate of the
newest single device is 128 kb/sec, but as the pilot plant engineers
assured me, that is never remotely reached. Actual measurements of the
avg/min/max data load on the collector have not been made available to
me yet. Can you spell slow-moving organisations, near-pensioners and
bureaucracy ;-)?
The older machines have been used with ancient 14k4 modems over crappy
POTS lines, so that should not be something to fear.

But it would seem, as far as I can judge from here, the collector never
receives as much as 1 Mbyte/sec. I have yet to see the measurement report,
however, on how much comes OUT of the collector, since it adds overhead
to the data, adding source identifiers and repackaging the mostly antique
protocols it is being talked to in into IP packets. 1 Mbyte/sec seems
quite a huge amount; it would surely show up as a significant load on the
10 Mbit/sec ethernet segment the collector's on, and a quick glance at HP
NetView showed it's not very busy at all.

I may indeed have been overwhelmed initially by the number of probes
throughout the plant, the stack of paper dumped on my desk, and the
urgency management shows in wanting our response (can/will we do it).
Thanks for putting my feet back on earth. Now let me go back to the
floor, oozing coolness and confidence around the place from every pore...

Rgds
Sh
Aug 25 '06 #9


Schraalhans Keukenmeester wrote on 08/25/06 08:06:
[various data rate estimates, all fairly inexact]
Sounds like it's time to get better information about
the data rates. Judging from what's known, though, I don't
think it's likely that your CPU will be a bottleneck and I
doubt that you need to worry about "heroic" performance
tricks.

Of course, the only way to be sure is to measure the
performance you actually get, and this raises something of
a chicken-and-egg problem: You won't know how fast the program
runs until you write it, and you won't know whether you should
write simply or heroically until you know how fast it runs!
You might want to try writing just the "core" of the program
and putting it in a test harness that feeds it a lot of data
(10MB of random numbers 100 times, say) and measures how long
it takes with different pattern counts.

Good luck -- but at the moment, I don't think you need
a lot of C help.

--
Er*********@sun.com

Aug 25 '06 #10
Eric Sosman wrote:
>
Schraalhans Keukenmeester wrote on 08/25/06 08:06:
>[various data rate estimates, all fairly inexact]
Of course, the only way to be sure is to measure the
performance you actually get, and this raises something of
a chicken-and-egg problem: You won't know how fast the program
runs until you write it, and you won't know whether you should
write simply or heroically until you know how fast it runs!
You might want to try writing just the "core" of the program
and putting it in a test harness that feeds it a lot of data
(10MB of random numbers 100 times, say) and measures how long
it takes with different pattern counts.

Good luck -- but at the moment, I don't think you need
a lot of C help.
Point and advice taken! Thx.
Sh.
Aug 25 '06 #11
On Wed, 23 Aug 2006 22:49:14 +0000 (UTC), ro******@ibd.nrc-cnrc.gc.ca
(Walter Roberson) wrote:
In article <44**********************@news.xs4all.nl>,
Schraalhans Keukenmeester <fi********************@xsfourall.ennel> wrote:
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_01]++;
freq[batch[i] & BOGUS_BIT_MASK & BIT_PATTERN_02]++;

That doesn't seem right -- for example, the all-0 bit pattern
is going to increment freq[0] for every BIT_PATTERN_* .
If you have "a few hundred" bit patterns, you are definitely going
to end up with these kinds of clashes.

Don't you need something closer to:

freq[1] += batch[i] & BIT_MASK_01 == BIT_PATTERN_01;
freq[2] += batch[i] & BIT_MASK_02 == BIT_PATTERN_02;
Probably even closer to
freq[1] += ( batch[i] & BIT_MASK_01 ) == BIT_PATTERN_01;
etc. as without the parentheses you always & with constant 1
(which is almost certainly wrong) or 0 (which is useless).

- David.Thompson1 at worldnet.att.net
Sep 4 '06 #12
