warning for google api users

the google web services (aka the google API) are not even close to ready
for any kind of real use yet

if you search for the same term 10 times, you get 3 different totals, 2
different result orders, and one or two "502 bad gateway" errors

i did an extensive comparison of the API against the regular search
service. the most typical set of results:

results 1-10; total: 373000
results 11-20; total: 151000
results 21-30; total: 151000
results 31-40; total: 373000
results 41-50; total: 373000
results 51-60; total: 373000
results 61-70; total: 151000
( 502 bad gateway. retry)
results 71-80; total: 373000
results 81-90; total: 151000
( 502 bad gateway. retry)
results 91-100; total: 373000
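
A minimal sketch of the "retry" step noted in the list above. It assumes the page fetch is a plain callable that raises `urllib.error.HTTPError` on a 502; the names `fetch_with_retry` and `fetch_page` are hypothetical and not part of pygoogle or the google API, whose SOAP client may report the failure differently.

```python
import time
import urllib.error


def fetch_with_retry(fetch_page, start, retries=3, delay=1.0):
    """Call fetch_page(start), retrying only when the server answers 502.

    fetch_page is a hypothetical callable standing in for whatever
    actually requests one page of results starting at offset `start`.
    """
    for attempt in range(retries + 1):
        try:
            return fetch_page(start)
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a 502, or a 502 once we are out of retries.
            if err.code != 502 or attempt == retries:
                raise
            # Wait a little longer after each failed attempt.
            time.sleep(delay * (attempt + 1))
```

This only papers over the gateway errors; it does nothing about the totals changing from page to page.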

on the regular google search, total: 2,050,000 (for every page, of
course)
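
To put numbers on the spread, here is a short sketch that tallies the totals copied from the list above and compares each one with the total from the regular search. The variable names are only illustrative.

```python
from collections import Counter

# Estimated totals the API reported for pages 1-10 of the same query,
# copied from the list above (the two 502 responses are left out).
api_totals = [373000, 151000, 151000, 373000, 373000,
              373000, 151000, 373000, 151000, 373000]
web_total = 2050000  # total shown by the regular google search

# Count how many pages reported each distinct total.
counts = Counter(api_totals)
for total, pages in counts.most_common():
    share = total / web_total
    print(f"{total}: {pages} of {len(api_totals)} pages, {share:.1%} of the regular total")
```

In this sample, six pages report 373000 and four report 151000, and both figures are well under a fifth of what the regular search reports.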

besides that, the first and third results on the regular google search
do not appear in the 100 results from the API for this query. that is
not typical, though; it happens more like 1 time in 10 :-/

So, no matter how much google insists that this parrot is sleeping,
it's simply dead.
now, what i presume is happening is that they have a dozen machine
pools, and each one has a broken snapshot of the production index
(probably they have some process that imports the index, and either it
blows up at some point or they simply kill it after some time). and
they obviously don't run that process very often.

Now... does anyone have an implementation of pygoogle.py that scrapes the
regular html service instead of using SOAP? :)

Gabriel B.
Feb 21 '06 #1
Isn't this because the index that the API uses is (a lot) older than
the index used by www.google.com? Total results are always estimated,
so they are not reliable (as the variance shows).


Feb 22 '06 #2
