429,480 Members | 771 Online
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,480 IT Pros & Developers. It's quick & easy.

# string letter distribution algorithm

 P: n/a Hi, everyone, I have a simple problem, which bothers me for some while. I will try to explain it - There is some string, whith different letters in it. Is it possible to analyse this string and the reffer to it knowing that it has some certain letters in it using only some integer value? I.e if a string is 'catpow', there is 5 unique letters 'c', 'a', 't', 'p', 'o' and 'w'. word evaluates to some integer value, lets say 1234. I would like to filter out all words except those having letter 't' in it using single integer value and simple arithmetical expression. Thank you in advance. Cheers., Ulterior Jun 20 '07 #1
9 Replies

 P: n/a Ulterior wrote: I have a simple problem, which bothers me for some while. I will try to explain it - There is some string, whith different letters in it. Is it possible to analyse this string and the reffer to it knowing that it has some certain letters in it using only some integer value? I.e if a string is 'catpow', there is 5 unique letters 'c', 'a', 't', 'p', 'o' and 'w'. Just curious, out of the SIX letters, which one is non-unique? word evaluates to some integer value, lets say 1234. I would like to filter out all words except those having letter 't' in it using single integer value and simple arithmetical expression. This is not really a C++ language question, is it? To determine if a word has a certain letter all you need is a bit mask with each bit representing the letter. An unsigned long would do. Then use the bitwise AND operator to see if the bit designated to 't' is set. V -- Please remove capital 'A's when replying by e-mail I do not respond to top-posted replies, please don't ask Jun 20 '07 #2

 P: n/a Ulterior for( std::size_t in(0); in < wrd.size(); ++in){ if( int( wrd.at( in ) ) == int( 't' ) ){ std::cout<<"letter \'t\' found"< I would like to filter out all words except those having letter 't' in it using single integer value and simple arithmetical expression. Good luck with that. { std::vector words.push_back("hello"); words.push_back("world"); words.push_back("catpow"); words.push_back("clarity"); std::string FindThis( "t" ); for( std::size_t in(0); in < words.size(); ++in ){ if( words.at( in ).find( FindThis ) != std::string::npos ){ std::cout<

 P: n/a On Jun 20, 8:10 am, Ulterior , and its to_ulong member for your value. Try comp.programming. - Anand Jun 20 '07 #4

 P: n/a "Ulterior" #include #include int WordLetters( const char* Word ) { int Result = 0; for ( int i = 0; Word[i] != 0; ++i ) { unsigned char LetterValue = static_cast( toupper( Word[i] ) ); // Ignore special characters, only treat characters if ( LetterValue >= 'A' && LetterValue <= 'Z' ) Result |= static_cast( std::pow( 2, LetterValue - 'A' ) ); } return Result; } bool LetterInWord( char Letter, int WordValue ) { unsigned char LetterValue = static_cast( toupper( Letter ) ); // Ignore special characters, only treat characters if ( LetterValue >= 'A' && LetterValue <= 'Z' ) return (WordValue & static_cast( std::pow(2, LetterValue - 'A' ))) != 0; else return false; } int main() { if ( sizeof(int) < 4 ) { std::cout << "int is not big enough. try long int or something" << "\n"; return 1; } int Catpow = WordLetters( "Catpow" ); std::cout << "X in Catpow: " << LetterInWord( 'X', Catpow ) << "\n"; std::cout << "T in Catpow: " << LetterInWord( 'T', Catpow ) << "\n"; std::cout << "C in Catpow: " << LetterInWord( 'C', Catpow ) << "\n"; std::cout << "W in Catpow: " << LetterInWord( 'w', Catpow ) << "\n"; return 0; } Basically all I'm doing is converting each letter 'A' through 'Z' to a number 1 thorugh 26. Then I take it as a binary place. Which means that I need an ordinal value with at least 26 bits, for my system int fits that since it has 32 bits. Then I simply set the bit on for that place. Jun 20 '07 #5

 P: n/a Hi, guys, thanks for your kind answers, but, if you are eager to help me further, I will try to clarify my problem. The point is to evaluate a string to some single value, store it in database field and the, using simple sql compare, retrieve only those records with fields, which have 't' letter in it. There must be no functions, as there is 6mil records and a fetch must be quick. Rgrds, Ulterior Jun 21 '07 #6

 P: n/a "Ulterior"

 P: n/a On Jun 21, 12:47 am, Ulterior

 P: n/a In article <11**********************@o61g2000hsh.googlegroups .com>, le***************@gmail.com says... [ ... ] The point is to evaluate a string to some single value, store it in database field and the, using simple sql compare, retrieve only those records with fields, which have 't' letter in it. There must be no functions, as there is 6mil records and a fetch must be quick. Using (or not using) functions won't make any real difference here. For what you're doing, CPU time is virtually guaranteed to mean nothing compared to I/O time -- if you use a dozen functions to avoid even a little bit of I/O, it'll be a win. Since SQL (as such) is off-topic, I'll discuss this in C++ terms, and leave it to you to translate over to SQL (though the translation is _mostly_ fairly trivial). I'm going to assume that you're willing to consume some time up-front to build an index that gives you really fast retrieval later. In that case, let's represent your database as a vector of records: class record { std::string key; void *data; // this just represents whatever other // data goes with your key. }; std::vector(std::cout, "\n")); Now, as to the conversion to SQL: the main thing is that you don't have record numbers, so you typically include a field that acts as a unique key. You'll want to tell your database to index on that field. IIRC, SQL doesn't have a convenient way of representing a set like I've used either. You can do that pretty easily with a two-column table, with the letter in one column and a single record number in the right column. With the left column indexed, retrieving the set of record numbers with a particular letter is roughly equivalent to what I've written above. I should also point out that if a given query is likely to retrieve most of the records in the database, the index may not do much good. Its whole point is to read only records you care about. The more records it avoids reading, the more good it does. If you retrieve all or most of the records anyway, the index just adds overhead. -- Later, Jerry. The universe is a figment of its own imagination. Jun 23 '07 #9

 P: n/a In article , jc*****@taeus.com says... [ ... ] for (int i=0; i

### This discussion thread is closed

Replies have been disabled for this discussion.