446,131 Members | 2,140 Online
Need help? Post your question and get tips & solutions from a community of 446,131 IT Pros & Developers. It's quick & easy.

# a question on google's algorithm

 P: n/a The title is not exactly a disnomer, so pleaes let me explain... Say I have a large number of 0-1 vectors containing only bits corresponding to 0 or 1. Now given an arbitrary 0-1 vector, I wish to find all the 0-1 vectors that are subsumed by that query bit vector. For example, given bit vector query as 01100, i wish to find all bit vectors that whose 1st, 4th, 5th bits are also zero, leaving the 2nd, 3rd bits as either 0 or 1. What I wish to do is to build an index on the bit vectors, so that I could avoid checking each 0-1 vector in sequence. In another word, I hope this index could help me quickly filter away those vectors that guarunteed not to be subsumed by the given 0-1 vector. I am not sure this is the right place to ask, but the index could be used to solve a search engine problem: the 0-1 vector corresponds to my keywords search (a 0 means that keyword NOT in my query, while a 1 means the keyword is in), and all the webpages corresponds to 0-1 vectors. I wish to find all the webpages that subsumes my query. If anyone happen to know how Google solves this problem, and it is not confidential, pls let me know, thanks a bunch!!!! Feb 21 '07 #1
6 Replies

 P: n/a * %NAME%: [interesting question, cross-posted to numerous groups] No C++ content means OFF-TOPIC in [comp.lang.c++]. Thank you. FUT: [comp.programming] -- A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? Feb 21 '07 #2

 P: n/a %NAME%

 P: n/a On Feb 20, 6:56 pm, "%NAME%"

 P: n/a In article <11**********************@a75g2000cwd.googlegroups .com>, hb*******@gmail.com says... The title is not exactly a disnomer, so pleaes let me explain... Say I have a large number of 0-1 vectors containing only bits corresponding to 0 or 1. Now given an arbitrary 0-1 vector, I wish to find all the 0-1 vectors that are subsumed by that query bit vector. For example, given bit vector query as 01100, i wish to find all bit vectors that whose 1st, 4th, 5th bits are also zero, leaving the 2nd, 3rd bits as either 0 or 1. Assuming your bit-vectors were stored in an integer type, you could use a comparison like this: if ((bit_vector | pattern) == pattern) // found else // not found If you need to store larger bit-vectors than will fit in any integer type (only 32 are guaranteed), you can look at std::bit_vector and/or std::vector. IIRC, bit_vector supports boolean operations directly, but for vector

 P: n/a Jerry Coffin wrote: hb*******@gmail.com says... >The title is not exactly a disnomer, so pleaes let me explain...Say I have a large number of 0-1 vectors containing only bitscorresponding to 0 or 1. Now given an arbitrary 0-1 vector,I wish to find all the 0-1 vectors that are subsumed by thatquery bit vector.For example, given bit vector query as 01100, i wish to findall bit vectors that whose 1st, 4th, 5th bits are also zero,leaving the 2nd, 3rd bits as either 0 or 1. Assuming your bit-vectors were stored in an integer type, you could use a comparison like this: if ((bit_vector | pattern) == pattern) // found else // not found If you need to store larger bit-vectors than will fit in any integer type (only 32 are guaranteed), you can look at std::bit_vector and/or std::vector. IIRC, bit_vector supports boolean operations directly, but for vector you'll probably need to write your own. The fault is basically that of the OP, but please do not post off-topic information on c.l.c. It is easy to overlook the foolish cross-posting, but you should not answer in any group where the answer (and question) are off topic. F'ups set. -- "A man who is right every time is not likely to do very much." -- Francis Crick, co-discover of DNA "There is nothing more amazing than stupidity in action." -- Thomas Matthews Feb 21 '07 #6

 P: n/a On Tue, 20 Feb 2007 18:56:19 -0600, NAME% wrote (in article <11**********************@a75g2000cwd.googlegroups .com>): The title is not exactly a disnomer, so pleaes let me explain... What in the world is a disnomer? Is this another one of those "we made up a word because the mood struck us" deals? -- Randy Howard (2reply remove FOOBAR) "The power of accurate observation is called cynicism by those who have not got it." - George Bernard Shaw Mar 6 '07 #7

### This discussion thread is closed

Replies have been disabled for this discussion.