vic wrote:
My manager wants me to develop a search program that works like the one
at edorado.com.
She made up her requirements after having compared how search works at
different websites, like eBay, Yahoo and others.
This is what she wants my program to be able to do:
(try this test at different websites just for fun).
At eBay:
- enter the word 'television' in a search field -> you will get 2155 items.
- enter the word 'televisions' (in plural) -> you will get only 60 items.
- enter the word 'tv' -> you will get 5147 items.
In other words, entering different variations of the same word gives
different results.
My manager showed me one website (www.edorado.com) where the above problem
didn't occur during her testing. When she searched for 3 different forms of
the word 'tv', she always got the same results.
And another important thing: the result set was very precise: televisions,
not other things. Statistically speaking it had 'low noise', in other words
a very low percentage of unrelated items.
Could anyone who works on search engine development offer me
any advice on how I should approach the design of my algorithm? I don't
have much experience in programming this kind of search, and my manager is a
real snake. According to her, if a search engine returns hundreds of thousands
of results, the user will not be able to browse through all of them, so what
she wants my program to do is return fewer results, and only those that
are most relevant to the search term.
Could anyone help me with advice?
Yes, I can.
I think the problem is that sexually dissatisfied managers
take their frustration out on their
subordinates.
When you see a viper coming at you, you had better escape.
Are you scared of losing your job?
Then what are you working for?
Well, my advice:
Try to explain to her
that she is absolutely right: if search sites returned only the links the
user wants to see,
it would be great.
But search engine relevance is a big problem.
A Russian guy proposed a somewhat better principle of search
and founded a top search engine.
And even that one does not do everything you might expect.
The Internet holds a very large volume of information, and probably the most
effective search of all of it with existing algorithms,
even if all the sand on the earth were turned into a search engine,
would take more time than dark energy needs to destroy this universe.
Tell her that you can code a classic search engine:
1. A crawler (spider) that walks the Internet, determines which words occur
in the first 1 to 5 kilobytes of each page, and puts them into a database.
2. A search engine that just looks things up in this database.
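The two steps above can be sketched roughly as follows. This is only a minimal illustration, not a real crawler: the `pages` dictionary stands in for pages already fetched from the web, and the names `build_index`, `search` and `HEAD_CHARS` are made up for this example.

```python
import re
from collections import defaultdict

# Index only the beginning of each page (the post suggests 1 to 5 KB).
# Assumption: we slice characters rather than bytes, to keep the sketch short.
HEAD_CHARS = 5 * 1024


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Step 1: map each word to the set of URLs whose leading text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text[:HEAD_CHARS]):
            index[word].add(url)
    return index


def search(index, query):
    """Step 2: return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical, already-downloaded pages.
pages = {
    "a": "Cheap television sets",
    "b": "TV stands and television mounts",
    "c": "Garden tools",
}
index = build_index(pages)
print(search(index, "television"))         # pages "a" and "b"
print(search(index, "television stands"))  # page "b" only
```

Note that this plain lookup has exactly the problem described in the question: "tv", "television" and "televisions" are three unrelated keys in the index.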
Synonyms (different words with the same meaning) are also a problem.
You can of course make a list of synonyms.
That works well for "tv" and "television",
but what do you do about "oil" and "shell", or "machine oil" and "machine
shell"?
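A hand-made synonym list could look something like the sketch below. The table contents and the names `SYNONYMS`, `canonical` and `normalize_query` are assumptions for illustration only; a real list would have to be built and maintained by hand.

```python
# Hypothetical hand-made table: every variant points to one canonical term.
SYNONYMS = {
    "tv": "television",
    "telly": "television",
}


def canonical(word):
    """Return the canonical form of a word, or the lower-cased word itself."""
    word = word.lower()
    return SYNONYMS.get(word, word)


def normalize_query(query):
    """Replace each query word by its canonical form before searching."""
    return [canonical(word) for word in query.split()]


print(normalize_query("TV stand"))  # ['television', 'stand']
```

The same normalization would have to be applied to the words when they are put into the database. A flat table like this knows nothing about context, which is why pairs such as "oil" and "shell" cannot be handled this way.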
Yes, there are AI solutions, but they all need a lot of processor resources.
And then there is grammar.
In most Western languages the plural of a noun is marked with the ending
"-s", "-es" or "-en".
From what I can recall at the moment, these may be added to the end of a word:
"-ing" (-n') (English, Danish, Norwegian)
"-ed" (English)
"-en" (Dutch, German, English)
"-t" (Dutch, German)
"-'s" (English, Danish)
"-e" (Danish)
http://www.geocities.com/tsca.geo/dansk/
At the beginning of a word there may be added:
"ge-" (Dutch, German)
Finally, words that came into English from French have in most cases
lost the final mute syllable:
"-que" is replaced with "-c"
republique -> republic
technique -> technics
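The endings listed above can be turned into a very crude stemmer, roughly as below. This is a sketch under stated assumptions, not a proper linguistic stemmer: the function name `crude_stem`, the minimum stem length and the order of the suffixes are my own choices, and blind suffix stripping will mangle some words. The "ge-" prefix is only stripped on request, because in English it would damage words such as "general".

```python
# Endings from the list above, longer ones first so that "-es" wins over "-s".
SUFFIXES = ("ing", "'s", "es", "ed", "en", "s", "t", "e")
MIN_STEM = 3  # assumption: never cut a word down to fewer than 3 letters


def crude_stem(word, strip_ge=False):
    """Reduce a word to a rough stem by stripping one known ending."""
    word = word.lower()

    # French-derived ending: "-que" becomes "-c" (republique -> republic).
    if word.endswith("que") and len(word) - 3 >= MIN_STEM:
        return word[:-3] + "c"

    # Strip at most one suffix.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            word = word[: -len(suffix)]
            break

    # Optional Dutch/German prefix "ge-".
    if strip_ge and word.startswith("ge") and len(word) - 2 >= MIN_STEM:
        word = word[2:]
    return word


print(crude_stem("televisions"))             # television
print(crude_stem("technique"))               # technic
print(crude_stem("technics"))                # technic
print(crude_stem("gemacht", strip_ge=True))  # mach
```

If both the indexed words and the query words go through the same function, "television" and "televisions" end up under one key.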
Optionally, you can apply such grammar rules in sequence.
Some exotic letters can also be simplified:
the glyphs "è", "é", "ê", "ë" can be changed to "e", and so on.
Also "ae", "æ", "ä" can replace each other, and all of them can be replaced
by "e".
"å"
can be replaced by "aa".
These groups can be treated the same way:
"ö", "ø", "oe"
"ü", "ue", "e"
"ij", "y", "u", "i", "ï" and even "ae", "æ"
In American English "a" and "u" may also replace each other.
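The letter simplification can be sketched as a plain lookup table. The table below covers only part of the groups above, and it is one possible choice: I fold "æ"/"ä", "ö"/"ø" and "ü" to the two-letter spellings "ae", "oe" and "ue" rather than all the way to "e". The names `FOLD` and `fold` are made up for this example, and the sketch assumes precomposed characters (it does not handle separate combining accents).

```python
# Partial folding table; which target spelling to use is an assumption.
FOLD = {
    "è": "e", "é": "e", "ê": "e", "ë": "e",
    "æ": "ae", "ä": "ae",
    "å": "aa",
    "ö": "oe", "ø": "oe",
    "ü": "ue",
    "ï": "i",
}


def fold(text):
    """Lower-case the text and replace exotic letters by plain ASCII spellings."""
    return "".join(FOLD.get(ch, ch) for ch in text.lower())


print(fold("Télévision"))                # television
print(fold("Müller") == fold("Mueller")) # True
print(fold("på Ærø"))                    # paa aeroe
```

As with the stemmer, the folding has to be applied both when the words are stored and when the query is read.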
I do not recommend implementing more difficult techniques,
because nobody needs yet another gray search site;
you would be spending your effort on something with no prospects.
Just let her see what can be done; that may be the goal of your job.
That's all.
Bye.
--Michaelo Mitrofanov