470,833 Members | 1,407 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 470,833 developers. It's quick & easy.

pointers to rows in Burrows-Wheeler Transform

To practice some C I have been looking at the Burrows-Wheeler
Transform.

Ive got a small program that builds the initial table of possible
combinations of the original string, used to find the compressed
result, but there are some notes about using BWT without having to
build a table of strings, instead using pointers to the offset in the
original and just sorting those.

My question is how do you return a pointer to "bcda" from "abcd"

I started by doing something like

char * ptr[input_str];

for i =0 ; i < len(input_str); i++) {
ptr[i] = (input_str + i);
}

but then that only gives "abcd" -"bcd " when i = 1.
Jun 27 '08 #1
7 1549
On Wed, May 28, 2008 at 01:56:59AM -0700, ph****@uk-businessdirectory.co.uk wrote:
>To practice some C I have been looking at the Burrows-Wheeler Transform.
[snip]
char * ptr[input_str];
Is input_str a pointer to char or a character array? Then this is unlikely
to do what you might intend. I guess you wanted to write one plus
the maximal expected length of the string. (Or the actual one, in C99)
>
for i =0 ; i < len(input_str); i++) {
Is for a tricky macro, or you just missed a parethesis?
Is len a synonym for strlen? I am just guessing...
ptr[i] = (input_str + i);
}

but then that only gives "abcd" -"bcd " when i = 1.
I wonder how you got that space appended to the end of the string...
By the way: "bcda" never occurs in the memory, so you have to write
it somehow somewhere (either by modifying the original string, or
doing it out of place), just pointer arithmetics won't do that.

You may want to place two adjacent copies of the string in one
character array. And then you may indeed sort the pointers where
the comparison rule that you pass to qsort sensibly accesses the relevant
pair of substrings and calls e.g. memcmp()

Szabolcs

Jun 27 '08 #2
On Wed, May 28, 2008 at 01:56:59AM -0700, ph****@uk-businessdirectory.co.uk wrote:
>To practice some C I have been looking at the Burrows-Wheeler Transform.
[snip]
char * ptr[input_str];
Is input_str a pointer to char or a character array? Then this is unlikely
to do what you might intend. I guess you wanted to write one plus
the maximal expected length of the string. (Or the actual one, in C99)
>
for i =0 ; i < len(input_str); i++) {
Is for a tricky macro, or you just missed a parethesis?
Is len a synonym for strlen? I am just guessing...
ptr[i] = (input_str + i);
}

but then that only gives "abcd" -"bcd " when i = 1.
I wonder how you got that space appended to the end of the string...
By the way: "bcda" never occurs in the memory, so you have to write
it somehow somewhere (either by modifying the original string, or
doing it out of place), just pointer arithmetics won't do that.

You may want to place two adjacent copies of the string in one
character array. And then you may indeed sort the pointers where
the comparison rule that you pass to qsort sensibly accesses the relevant
pair of substrings and calls e.g. memcmp()

Szabolcs

Jun 27 '08 #3
Sorry for double posting. Szabolcs
Jun 27 '08 #4
ph****@uk-businessdirectory.co.uk wrote:
To practice some C I have been looking at the Burrows-Wheeler
Transform.
Ive got a small program that builds the initial table of possible
combinations of the original string, used to find the compressed
result, but there are some notes about using BWT without having to
build a table of strings, instead using pointers to the offset in the
original and just sorting those.
My question is how do you return a pointer to "bcda" from "abcd"
Not directly.
I started by doing something like
char * ptr[input_str];
I have no idea what 'input_str' is ment to be, you use it
here as if it would be some integer value, but in the rest
as if it would be a string. Perhaps you meant

char (*p)[ strlen( input_str ) ];

I will assume that for the rest.
for i =0 ; i < len(input_str); i++) {
I guess you meant 'strlen(input_str)', didn't you?
ptr[i] = (input_str + i);
}
but then that only gives "abcd" -"bcd " when i = 1.
You now have for the input string "abcd" four pointers which
point to strings as in

p[0] - "abcd"
p[1] - "bcd"
p[2] - "cd"
p[3] - "d"

Something like "bcd " can't occur (unless there's some other
mistake in the code you're not showing) since there's no
space in the input string.

By using all combinations of these pointers (and only
using the character pointed to directly, i.e. the first
char of the strings), you can construct all possible com-
binations of the original string successively.

E.g. one possible combiattion could be constructed, by

char r[ strlen( input_str ) + 1 ];

r[0] = *p[1];
r[1] = *p[2];
r[2] = *p[d];
r[3] = *p[0];
r[4] = '\0';

resulting in the 'r' holding the string "bcda". I guess
that's about what you wrote above to intended to do.

Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
Jun 27 '08 #5
ph****@uk-businessdirectory.co.uk said:
To practice some C I have been looking at the Burrows-Wheeler
Transform.

Ive got a small program that builds the initial table of possible
combinations of the original string, used to find the compressed
result, but there are some notes about using BWT without having to
build a table of strings, instead using pointers to the offset in the
original and just sorting those.

My question is how do you return a pointer to "bcda" from "abcd"

I started by doing something like

char * ptr[input_str];

for i =0 ; i < len(input_str); i++) {
ptr[i] = (input_str + i);
}

but then that only gives "abcd" -"bcd " when i = 1.
This looks very shaky. Where is the memory for the strings? Do you "own"
it, or do the pointers (for char *ptr[input_str] is an array of pointers,
not an array of strings) point to string literals? If they do, modifying
the strings invokes undefined behaviour - all you might usefully do is
reorder the pointers.

If you own memory for a string, you can "rotate" it like this:

#include <string.h>

void rotate_string_one_place(char *s, size_t len)
{
char t = *s;
memmove(s, s + 1, len - 1);
s[len - 1] = t;
}

/* sample driver */
#include <stdio.h>
#include <string.h/* not needed if you're doing this all in one file
because it's already included above */
int main(void)
{
char sample[] = "Yes, I own the memory for this string";
size_t len = strlen(sample);
size_t i;
for(i = 0; i < len; i++)
{
rotate_string_one_place(sample, len);
printf("%s\n", sample);
}
return 0;
}

Sample output:

es, I own the memory for this stringY
s, I own the memory for this stringYe
, I own the memory for this stringYes
I own the memory for this stringYes,
I own the memory for this stringYes,
own the memory for this stringYes, I
own the memory for this stringYes, I
wn the memory for this stringYes, I o
n the memory for this stringYes, I ow
the memory for this stringYes, I own
the memory for this stringYes, I own
he memory for this stringYes, I own t
e memory for this stringYes, I own th
memory for this stringYes, I own the
memory for this stringYes, I own the
emory for this stringYes, I own the m
mory for this stringYes, I own the me
ory for this stringYes, I own the mem
ry for this stringYes, I own the memo
y for this stringYes, I own the memor
for this stringYes, I own the memory
for this stringYes, I own the memory
or this stringYes, I own the memory f
r this stringYes, I own the memory fo
this stringYes, I own the memory for
this stringYes, I own the memory for
his stringYes, I own the memory for t
is stringYes, I own the memory for th
s stringYes, I own the memory for thi
stringYes, I own the memory for this
stringYes, I own the memory for this
tringYes, I own the memory for this s
ringYes, I own the memory for this st
ingYes, I own the memory for this str
ngYes, I own the memory for this stri
gYes, I own the memory for this strin
Yes, I own the memory for this string
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jun 27 '08 #6
ph****@uk-businessdirectory.co.uk writes:
To practice some C I have been looking at the Burrows-Wheeler
Transform.
<snip>
there are some notes about using BWT without having to
build a table of strings, instead using pointers to the offset in the
original and just sorting those.
Usually a good idea, yes.
My question is how do you return a pointer to "bcda" from "abcd"
You've have lots of C answers, so I won't repeat them, but there is
one algorithm question that does impact on the C. You have to decide
how to represent the "end marker" because this has to included in the
sort. C gives you an obvious choice (the terminating null of the
string) but if you go that route you must remember that the result of
the BWT will (most often) not be a string, so care will be needed when
do anything with it.

--
Ben.
Jun 27 '08 #7
On May 28, 3:56*pm, phi...@uk-businessdirectory.co.uk wrote:
To practice some C I have been looking at the Burrows-Wheeler
Transform.

My question is how do you return a pointer to "bcda" from "abcd"
Start by duplicating the input string, i.e. get
"abcdabcd".
The substrings won't be null-terminated but
you won't need them to be.

If you wish you can look at my old implementation
http://james.fabpedigree.com/bwt.htm
though I see it has stylistic problems.

James
Jun 27 '08 #8

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

2 posts views Thread by warnockg90 | last post: by
3 posts views Thread by Christoph | last post: by
2 posts views Thread by Hervé Piedvache | last post: by
1 post views Thread by Nathan Gilbert | last post: by
6 posts views Thread by joelperr | last post: by
68 posts views Thread by Martin Joergensen | last post: by
8 posts views Thread by cman | last post: by
13 posts views Thread by arnuld | last post: by
3 posts views Thread by repairman2003 | last post: by
reply views Thread by mihailmihai484 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.