By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
428,591 Members | 629 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 428,591 IT Pros & Developers. It's quick & easy.

inverted index, postings lists, and linked lists

P: 2
I was reading about the use of inverted indexes and I was trying to build a simple example to better understand how they work (I am most familar with PHP/MySQL). However, I am still unclear what the table structure would look like.

As far as I understand, there is a table called "dictionary" that has a column with each term being indexed (e.g., cat, dog, rat) and another with a reference to a posting list for each term. Fuurther that the posting list points to specific docIDs for each document where the term appears and that it is structured as a linked list (e.g., value/next-pointer chains).

- I assume each term/posting list would be its own table, correct?
- I am very confused about what columns would be in the posting list table. I know docID is one but what else, why? A next-pointer column? why?

I am leaving out term-frequency and position-in-doc stuff to simplify the example.
Feb 29 '12 #1
Share this Question
Share on Google+
2 Replies


Expert 100+
P: 1,035
After reading a bit about 'inverted indexes' here:
http://en.wikipedia.org/wiki/Inverted_index

That page gives a good example, to understand how it works.

Overy posting has an index attached to it like in the example:
Expand|Select|Wrap|Line Numbers
  1. "it":     {(0, 0), (0, 3), (1, 2), (2, 0)} 
i Understand the word "it" is stored, and the list "{(0, 0), (0, 3), (1, 2), (2, 0)} " is linked to that word.

So a table could look like this:
Expand|Select|Wrap|Line Numbers
  1. CREATE TABLE `NewTable` (
  2. `word`  varchar(20) NOT NULL ,
  3. `docnr`  mediumint NOT NULL ,
  4. `position`  mediumint NOT NULL ,
  5. PRIMARY KEY (`word`, `docnr`, `position`)
  6. )
  7. ;
Because for every 'word' you should store in which document it is found ('docnr'), and at what position ('position')
Mar 31 '12 #2

Expert 100+
P: 1,035
since you want to know about inverted-indexes, and someone already create a project about it:
http://code.google.com/p/inverted-index/
Mar 31 '12 #3

Post your reply

Sign in to post your reply or Sign up for a free account.