Adaptive datatype (or so...)

Ok, I got a big problem. What I want to do is basically read from a
file. This file contains "symbols". The user specifies how many bytes
one symbol needs. For example, in a text file each symbol would need
one byte. So I want to read into an array, say buf[], where buf[0]
contains the first symbol, buf[1] the second, and so on.
How do I implement that? I mean, the program has to be able to handle any
number of bytes per symbol. For example:

file content = "123456789"

example #1. bytes per symbol = 1.
In this case,
buf[0] = 1
buf[1] = 2
buf[2] = 3
etc.

example #2, bytes per symbol = 2.
In this case,
buf[0] = 12
buf[1] = 34
etc.

Any idea how to do this, guys?

Jul 20 '06 #1


Wh********@web.de wrote On 07/20/06 12:46,:
> Ok, I got a big problem. What I want to do is basically read from a
> file. This file contains "symbols". The user specifies how many bytes
> one symbol needs. For example, in a text file each symbol would need
> one byte. So I want to read into an array, say buf[], where buf[0]
> contains the first symbol, buf[1] the second, and so on.
> How do I implement that? I mean, the program has to be able to handle any
> number of bytes per symbol. For example:
>
> file content = "123456789"
>
> example #1. bytes per symbol = 1.
> In this case,
> buf[0] = 1
> buf[1] = 2
> buf[2] = 3
> etc.
>
> example #2, bytes per symbol = 2.
> In this case,
> buf[0] = 12
> buf[1] = 34
> etc.
>
> Any idea how to do this, guys?

Several, but I don't know how to choose among them.
What do you want to do with these "symbols" after they
have been loaded into the array?

--
Er*********@sun.com

Jul 20 '06 #2
Well, I want to insert them into a binary tree. But that's not the
problem.
My problem is to have an array that contains the symbols.

Could you just choose one simple way of doing this job?

Jul 20 '06 #3
Wh********@web.de wrote:
> Well, I want to insert them into a binary tree. But that's not the
> problem.
> My problem is to have an array that contains the symbols.
>
> Could you just choose one simple way of doing this job?

Please quote enough of the previous message for context. See how
everybody else in the group does it.
You haven't explained what you are trying to accomplish. What are these
"symbols"? How will they be used once you create the array?

Brian
Jul 20 '06 #4


Wh********@web.de wrote On 07/20/06 14:50,:
> Well, I want to insert them into a binary tree. But that's not the
> problem.
> My problem is to have an array that contains the symbols.

Please quote enough context so your message can stand on
its own. Message propagation on Usenet is both asynchronous
and uncoordinated, meaning that messages do not arrive at all
news servers at the same time or in the same order. It is
entirely possible for a reply to reach a server before the
message it replies to.

> Could you just choose one simple way of doing this job?

(For those just joining: "this job" is to extract some
kind of "symbols" from a file and store them in an array.
We are told that the symbols are sometimes one character
long and sometimes two, and possibly other lengths, but
that the symbol length is fixed during any given program
execution. We are not told whether these symbols can
just be thought of as strings or are something else; we are
not told whether newlines in the file have any importance
or are just parts of symbols; we are not told very much at
all. I asked what Whatever5k wanted to do with the symbols,
because data structures exist to support the operations to
be performed on the data; without knowing what the operations
are, it is impossible to make an intelligent recommendation.
His response was as you see above, so ...)

Here's one simple way: Allocate a big array of characters
and read the entire file into it. The array will then contain
all the "symbols" from the file.
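
A minimal sketch of that suggestion, for illustration only. The helper names `read_all` and `symbol_at` are hypothetical and not from the thread, and the sketch assumes that every symbol has the same fixed size. Once the whole file sits in one buffer, symbol number i is just the `symbol_size` bytes that start at offset `i * symbol_size`, so no per-symbol allocation is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read everything from fp into one malloc'd buffer and store the number
 * of bytes read in *len. Returns NULL if memory runs out. The caller
 * frees the result. (Hypothetical helper, not from the thread.) */
unsigned char *read_all(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    unsigned char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    while ((got = fread(buf + used, 1, cap - used, fp)) > 0) {
        used += got;
        if (used == cap) {              /* buffer full: double it */
            unsigned char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    *len = used;
    return buf;
}

/* Symbol i is the symbol_size bytes starting at this address. */
const unsigned char *symbol_at(const unsigned char *buf,
                               size_t symbol_size, size_t i)
{
    assert(symbol_size > 0);
    return buf + i * symbol_size;
}
```

With the file content "123456789" and a symbol size of 2, `len / 2` gives four whole symbols and `symbol_at(buf, 2, 1)` points at the bytes '3' and '4'. The symbols are not null-terminated, so they should be compared with memcmp() rather than strcmp().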

--
Er*********@sun.com

Jul 20 '06 #5

Wh********@web.de wrote:
> Ok, I got a big problem. What I want to do is basically read from a
> file. This file contains "symbols". The user specifies how many bytes
> one symbol needs. For example, in a text file each symbol would need
> one byte. So I want to read into an array, say buf[], where buf[0]
> contains the first symbol, buf[1] the second, and so on.
> How do I implement that? I mean, the program has to be able to handle any
> number of bytes per symbol. For example:

It's not exactly clear what you want, but here's one
idea.

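#include <stdio.h>
#include <stdlib.h>
#define MAX_LENGTH 8

/* Print a message to stderr and exit with a failure status. */
void die(const char *a)
{
    fprintf(stderr, "%s\n", a);
    exit(EXIT_FAILURE);
}

/* malloc() that never returns NULL: it exits on failure instead. */
void *xmalloc(size_t size)
{
    void *ret;
    if ((ret = malloc(size)) == NULL)
        die("out of memory");
    return ret;
}

int
main(int argc, char **argv)
{
    int symbol_size;
    char **symbol_array;
    size_t symbol_count;
    size_t array_size;          /* capacity, counted in pointers */
    char **next_symbol;

    /* Symbol size comes from the first argument; the default is 1. */
    symbol_size = (argc > 1) ? atoi(argv[1]) : 1;
    if (symbol_size <= 0 || symbol_size > MAX_LENGTH)
        die("Invalid size");

    array_size = BUFSIZ;
    symbol_array = xmalloc(array_size * sizeof *symbol_array);
    next_symbol = symbol_array;
    *next_symbol = xmalloc(symbol_size);
    symbol_count = 0;
    /* Read one whole symbol per iteration; a trailing partial symbol is dropped. */
    while (fread(*next_symbol, symbol_size, 1, stdin) == 1) {
        symbol_count++;
        /* Grow symbol_array when the next slot would be out of bounds. */
        if (symbol_count == array_size) {
            array_size *= 2;
            symbol_array = realloc(symbol_array,
                                   array_size * sizeof *symbol_array);
            if (symbol_array == NULL)
                die("out of memory");
        }
        next_symbol = symbol_array + symbol_count;
        *next_symbol = xmalloc(symbol_size);
    }
    return EXIT_SUCCESS;
}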

Jul 20 '06 #6
On 20 Jul 2006 09:46:00 -0700, Wh********@web.de wrote:
> Ok, I got a big problem. What I want to do is basically read from a
> file. This file contains "symbols". The user specifies how many bytes
> one symbol needs. For example, in a text file each symbol would need
> one byte. So I want to read into an array, say buf[], where buf[0]
> contains the first symbol, buf[1] the second, and so on.
> How do I implement that? I mean, the program has to be able to handle any
> number of bytes per symbol. For example:
>
> file content = "123456789"
>
> example #1. bytes per symbol = 1.
> In this case,
> buf[0] = 1
> buf[1] = 2
> buf[2] = 3
> etc.
>
> example #2, bytes per symbol = 2.
> In this case,
> buf[0] = 12
> buf[1] = 34
> etc.
>
> Any idea how to do this, guys?

I recommend a dynamic array of pointers to strings. Once you decide
on the number of bytes per symbol (bps), something like the following
will work (error checking of malloc omitted for brevity):

char **ptr;
int i = 0;
ptr = malloc(n * sizeof *ptr);  /* for some initial quantity of strings */
while (/* more strings to process */) {
    ptr[i] = malloc(bps + 1);
    strncpy(ptr[i], /* pointer to starting byte for next symbol */, bps);
    ptr[i++][bps] = '\0';
}

You will need to include a check for i exceeding the number of
pointers ptr points to. When it does, realloc ptr to point to a
larger number and continue.

If your data is binary rather than text, you can do the same thing
with arrays of unsigned char and use memcpy instead of strncpy. The
extra space for the terminating '\0' would not be needed.
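
A sketch of the binary variant described above, with unsigned char and memcpy in place of strncpy. The function names `split_symbols` and `free_symbols` are hypothetical. The sketch also assumes the data is already in memory, so the number of symbols is known up front and one malloc of the pointer array replaces the realloc growth step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split data[0..len) into len / bps symbols of bps bytes each.
 * Returns a malloc'd array of malloc'd copies and stores the number of
 * symbols in *count. Trailing bytes that do not fill a whole symbol are
 * ignored. Returns NULL (with *count == 0) if memory runs out or if
 * there is not even one whole symbol. */
unsigned char **split_symbols(const unsigned char *data, size_t len,
                              size_t bps, size_t *count)
{
    size_t n, i;
    unsigned char **ptr;

    assert(bps > 0);
    n = len / bps;
    *count = 0;
    if (n == 0)
        return NULL;
    ptr = malloc(n * sizeof *ptr);
    if (ptr == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        ptr[i] = malloc(bps);           /* no extra byte for '\0' */
        if (ptr[i] == NULL) {
            while (i > 0)               /* undo the earlier allocations */
                free(ptr[--i]);
            free(ptr);
            return NULL;
        }
        memcpy(ptr[i], data + i * bps, bps);
    }
    *count = n;
    return ptr;
}

/* Release everything split_symbols() allocated. */
void free_symbols(unsigned char **ptr, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        free(ptr[i]);
    free(ptr);
}
```

Because the copies have no terminating '\0', two symbols are compared with `memcmp(a, b, bps)`, which also works when the data contains zero bytes.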

Remove del for email
Jul 21 '06 #7
Thank you for all those replies.
OK, so I have a file, this can be binary or text or anything. What I
want to do is read from that file, symbol by symbol. What I mean by
symbol is just a certain number of bytes. For example, I want to be
able to read from the file with a symbol size of 2 bytes. This would
mean that at the end I would have an array and each entry would contain
2 bytes of information from the file.
Oh and the file can also be binary. Is it more clear now? What I want
to do with the symbols later on is just count them. I want to see how
many different symbols are in the file.

Thanks.

Jul 21 '06 #8
Barry, your example would not work for a binary file.
OK, here is another example. Let's say I have got a binary file, that
contains the year and month number of today. Now, this would be written
with 0 and 1, but it would look like this: 200607. Ok, we would say
that one symbol occupies 4 bytes. So 2006 would be the first symbol and
07 the second. What I want to have is an array so that ptr[0] = 2006
and ptr[1] = 07.
I don't think that would work with your examples, would it?

Jul 21 '06 #9
Wh********@web.de wrote:
> Barry, your example would not work for a binary file.
> OK, here is another example. Let's say I have got a binary file, that
> contains the year and month number of today. Now, this would be written
> with 0 and 1, but it would look like this: 200607.

Your description of the data format is still unclear
(to me, anyhow). Do you mean that the file contains the
number "two hundred thousand six hundred seven" as a
binary integer in the machine's native form (probably four
or eight bytes long)? What does "look like this" mean?

> Ok, we would say
> that one symbol occupies 4 bytes. So 2006 would be the first symbol and
> 07 the second.

It sounds like 07xx would be the second, where the x's
are two more bytes. What do you mean when you say 07 is
a four-byte "symbol?"

> What I want to have is an array so that ptr[0] = 2006
> and ptr[1] = 07.

It seems you don't realize that C supports many different
data types, and can represent 2006 in many different ways.
Some of them are

- As an int. The number two thousand six would look like
...011111010110 in the machine, where the "..." stand
for a machine-dependent number of leading zero bits.

- As another integer type: signed or unsigned long long,
long, int, short, and so on. The value would be as above,
but perhaps with more or fewer leading zeroes.

- As a float. C doesn't prescribe any particular floating-
point format, but on many machines two thousand six would
be represented as {0.9794921875 times two to the eleventh}
and might look like 01000100111110101100000000000000 if
viewed as a sequence of bits.

- As another floating-point type: double or long double.
Again, C doesn't prescribe the exact format, but it is
likely to be somewhat like that shown for float.

- As a string of four digits followed by a fifth all-zero
byte (to mark the end of the string). On many machines
this would look like 00110010 00110000 00110000 00110110
00000000 in five consecutive memory locations.

- As a pointer to a string of the form described above.
Strings are really arrays, and C cannot manipulate arrays
as freely as it handles other kinds of objects, so it is
often desirable to store the strings "elsewhere," leave
them pretty much alone, and work with pointers to them
instead. (Especially given the confusion over the "four-
byte symbol" 07 -- if the symbols actually have different
lengths, it will be cumbersome to work with them directly
as arrays.)

Let me repeat: These are only *some* of the ways you might
represent a "symbol" in a C program. Also, these are variations
on ways to represent just *one* of your "symbols;" there are
additional decisions to be made when you choose how to manage a
collection of many of them. I hope it's clear by now that simply
saying `ptr[0] = 2006' is not an adequate description of what you
are trying to accomplish; you need to be more specific.
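
The bit patterns listed above can be checked with a short sketch. The helper names `bits_of` and `float_bits` are hypothetical, and the float case assumes a 32-bit IEEE 754 float, which C does not guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write the low nbits bits of value into out as '0'/'1' characters,
 * most significant bit first. out must have room for nbits + 1 chars. */
void bits_of(unsigned long value, int nbits, char *out)
{
    int i;

    assert(nbits > 0 && nbits <= 32);
    for (i = 0; i < nbits; i++)
        out[i] = ((value >> (nbits - 1 - i)) & 1UL) ? '1' : '0';
    out[nbits] = '\0';
}

/* Bit pattern of a float. ASSUMPTION: float is a 32-bit IEEE 754 type. */
void float_bits(float f, char *out)
{
    uint32_t raw;

    assert(sizeof f == sizeof raw);
    memcpy(&raw, &f, sizeof raw);   /* reinterpret the bytes safely */
    bits_of(raw, 32, out);
}
```

Under those assumptions, `bits_of(2006UL, 12, out)` yields 011111010110 and `float_bits(2006.0f, out)` yields 01000100111110101100000000000000. On an ASCII system, `bits_of('2', 8, out)` yields 00110010, the first byte of the string form. The same value has three unrelated byte layouts, which is why the choice of type has to come before the choice of container.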

--
Eric Sosman
es*****@acm-dot-org.invalid
Jul 21 '06 #10
On Fri, 21 Jul 2006, Eric Sosman wrote:
> Wh********@web.de wrote:
>> Barry, your example would not work for a binary file.
>> OK, here is another example. Let's say I have got a binary file, that
>> contains the year and month number of today. Now, this would be written
>> with 0 and 1, but it would look like this: 200607.
>
> Your description of the data format is still unclear
> (to me, anyhow). Do you mean that the file contains the
> number "two hundred thousand six hundred seven" as a
> binary integer in the machine's native form (probably four
> or eight bytes long)? What does "look like this" mean?
>> Ok, we would say
>> that one symbol occupies 4 bytes. So 2006 would be the first symbol and
                            ^^^^^^^
>> 07 the second.
>
> It sounds like 07xx would be the second, where the x's
> are two more bytes. What do you mean when you say 07 is
> a four-byte "symbol?"
>> What I want to have is an array so that ptr[0] = 2006
>> and ptr[1] = 07.
>
> It seems you don't realize that C supports many different
> data types, and can represent 2006 in many different ways.
> Some of them are
>
> [snipped]

The OP has explicitly requested that his data type is to be
``4 bytes'' (underlined above) which according to the C standard
means an object containing 4 * CHAR_BIT bits. Therefore, the OP
was trying to say this:

char ptr[2][4] = {{'2', '0', '0', '6'}, {'0', '7'}};

when he/she wrote ``ptr[0] = 2006 and ptr[1] = 07''.

Tak-Shing
Jul 21 '06 #11
Wh********@web.de wrote:
> Thank you for all those replies.
> OK, so I have a file, this can be binary or text or anything. What I
> want to do is read from that file, symbol by symbol. What I mean by
> symbol is just a certain number of bytes. For example, I want to be
> able to read from the file with a symbol size of 2 bytes. This would
> mean that at the end I would have an array and each entry would
> contain 2 bytes of information from the file.

So you just want to hold the raw bytes? That's fairly simple, although
I don't really see how it's useful in the larger scheme.

If I were doing this, I'd have two different functions, one for binary
files and the other for text. Presumably in the text case you will be
ignoring newline characters, but perhaps not.

> Oh and the file can also be binary. Is it more clear now? What I want
> to do with the symbols later on is just count them. I want to see how
> many different symbols are in the file.

Personally, I wouldn't store the "symbols" if all I wanted to do was
count. I'd open the file and count.


Brian
Jul 21 '06 #12
Hi Brian,

ok, you want to count, that's fine. But you have to make sure you count
identical objects as one. So, if you had a text file "12235", you would
count 3 symbols which appear once and 1 symbol that appears twice. So
how do you manage to do that?

Jul 21 '06 #13
Wh********@web.de wrote:
> Hi Brian,
>
> ok, you want to count, that's fine. But you have to make sure you count
> identical objects as one. So, if you had a text file "12235", you would
> count 3 symbols which appear once and 1 symbol that appears twice. So
> how do you manage to do that?

So you want a frequency, essentially. Now I see where you're coming
from. First thing you need is a data structure. Then an algorithm.
For the data, you need some way of representing information about a
symbol:

struct node
{
    char *symbol;
    int count;
};

Then you need to store all the symbol information. A dynamic array is a
possibility, as is a linked list. Is either of those something you
know how to do?

Now you will need a routine to extract a symbol. Your symbols can have
different lengths, so you need a function that takes in the filename
(or a pointer to an open file) and the symbol size.

It's relatively easy to scan the file to put together the symbol.
fgetc() or fscanf() are possibilities. Once you get a symbol assembled,
you need a routine to process the symbol. This will check the data to
see if it already has a record. If so, it increments the count,
otherwise it adds it.
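
One way the steps above might fit together, as a hedged sketch rather than a finished design. It picks the dynamic-array option. The `struct table` wrapper and the `table_*` functions are hypothetical names, and `symbol` is declared as unsigned char so that binary data works too. The lookup is a plain linear search, which is fine for small inputs but slow when there are many distinct symbols.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    unsigned char *symbol;  /* malloc'd copy of the symbol's bytes */
    int count;              /* how many times it has been seen */
};

struct table {
    struct node *nodes;     /* dynamic array of records */
    size_t used, cap;       /* records in use / allocated */
    size_t symbol_size;     /* bytes per symbol, fixed for the run */
};

/* Record one occurrence of sym: increment its count if it already has
 * a record, otherwise add one. Returns 0 on success, -1 if memory runs out. */
int table_add(struct table *t, const unsigned char *sym)
{
    size_t i;

    assert(t->symbol_size > 0);
    for (i = 0; i < t->used; i++) {
        if (memcmp(t->nodes[i].symbol, sym, t->symbol_size) == 0) {
            t->nodes[i].count++;
            return 0;
        }
    }
    if (t->used == t->cap) {                    /* array full: grow it */
        size_t ncap = t->cap ? t->cap * 2 : 16;
        struct node *tmp = realloc(t->nodes, ncap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        t->nodes = tmp;
        t->cap = ncap;
    }
    t->nodes[t->used].symbol = malloc(t->symbol_size);
    if (t->nodes[t->used].symbol == NULL)
        return -1;
    memcpy(t->nodes[t->used].symbol, sym, t->symbol_size);
    t->nodes[t->used].count = 1;
    t->used++;
    return 0;
}

/* Number of times sym has been recorded, or 0 if it was never seen. */
int table_count(const struct table *t, const unsigned char *sym)
{
    size_t i;

    for (i = 0; i < t->used; i++)
        if (memcmp(t->nodes[i].symbol, sym, t->symbol_size) == 0)
            return t->nodes[i].count;
    return 0;
}

/* Free every record and reset the table to empty. */
void table_free(struct table *t)
{
    size_t i;

    for (i = 0; i < t->used; i++)
        free(t->nodes[i].symbol);
    free(t->nodes);
    t->nodes = NULL;
    t->used = t->cap = 0;
}
```

For the text "12235" with a symbol size of 1, calling `table_add` once per byte leaves `used` at 4 (the number of distinct symbols) and gives the symbol '2' a count of 2.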

Brian
Jul 21 '06 #14
Wh********@web.de writes:
> Hi Brian,
>
> ok, you want to count, that's fine. But you have to make sure you count
> identical objects as one. So, if you had a text file "12235", you would
> count 3 symbols which appear once and 1 symbol that appears twice. So
> how do you manage to do that?

Please provide enough context so your followup makes sense on its own.
Google no longer makes this gratuitously difficult. See
<http://cfaj.freeshell.org/google/> for details; see most of the
followups in this newsgroup for examples.

Are you trying to count the number of *distinct* symbols? I don't
think you've mentioned this before.

What exactly do you mean by "symbol"? Is a "symbol" just a sequence
of some specified number of bytes in the input file?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 21 '06 #15
Keith Thompson wrote:
> Wh********@web.de writes:
>> Hi Brian,
>>
>> ok, you want to count, that's fine. But you have to make sure
>> you count identical objects as one. So, if you had a text file
>> "12235", you would count 3 symbols which appear once and 1 symbol
>> that appears twice. So how do you manage to do that?
>
> Are you trying to count the number of distinct symbols? I don't
> think you've mentioned this before.
>
> What exactly do you mean by "symbol"? Is a "symbol" just a sequence
> of some specified number of bytes in the input file?

I guess you're like me, still having trouble figuring out what the goal
is. Getting information is like pulling teeth.


Brian
Jul 21 '06 #16
