
Can I write a crawler in Javascript?

In addition to the question in the subject line, if the answer is yes,
is it possible to locate keywords as part of the functionality of said
crawler (bot, spider)?

Basically, I would like to write a stand-alone form (javascript app.)
to perform a site-specific keyword search.

Can I do the aforementioned in Javascript?

Thanks.
Jan 4 '08 #1
7 Replies


Lee
bd*******@gmail.com said:
> In addition to the question in the subject line, if the answer is yes,
> is it possible to locate keywords as part of the functionality of said
> crawler (bot, spider)?
>
> Basically, I would like to write a stand-alone form (javascript app.)
> to perform a site-specific keyword search.
>
> Can I do the aforementioned in Javascript?

1. Don't assume that people reading your post can see the subject line.
State your entire question in the body.

2. What you think of as Javascript is almost certainly client-side
code, and it cannot see anything on the server. It is possible on
some servers to execute Javascript on the server, but it's not
something you're likely to want to try.
--

Jan 4 '08 #2

On Jan 4, 1:25 pm, bdy120...@gmail.com wrote:
> In addition to the question in the subject line, if the answer is yes,
> is it possible to locate keywords as part of the functionality of said
> crawler (bot, spider)?
>
> Basically, I would like to write a stand-alone form (javascript app.)
> to perform a site-specific keyword search.
>
> Can I do the aforementioned in Javascript?
>
> Thanks.

My question is "why?" Is it because you're familiar with JavaScript
and not server-side languages like PHP, C#, or Ruby?

The typical solution would be to have a PHP crawler on the server. You
can get the info to the server via a form or a JavaScript ajax call.

Even if you could do it in Javascript from the client, cross-site
security in the browser may block your attempts.
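
To make the restriction concrete: browsers generally let a script read another page only when the scheme, host, and port of the two pages match (the same-origin policy). A minimal sketch of that comparison, assuming the standard `URL` class found in modern browsers and Node.js; the function name `isSameOrigin` and the example.com addresses are hypothetical:

```javascript
// Returns true when both URLs share scheme, host, and port,
// which is the comparison the same-origin policy is based on.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Same scheme, host, and port: true
console.log(isSameOrigin("http://example.com/a.html", "http://example.com/b/c.html"));
// Different scheme: false
console.log(isSameOrigin("http://example.com/", "https://example.com/"));
// Different host: false
console.log(isSameOrigin("http://example.com/", "http://www.example.com/"));
```

A client-side crawler restricted to pages on its own site would pass this check; one that follows links to other sites would not.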
Jan 4 '08 #3

Lee wrote on 04 jan 2008 in comp.lang.javascript:
> 2. What you think of as Javascript is almost certainly client-side
> code, and it cannot see anything on the server. It is possible on
> some servers to execute Javascript on the server, but it's not
> something you're likely to want to try.

Why?

Writing serverside javascript is a joy.

Many functions can be written for both client-side
and server-side use without any conversion,
such as input validation that runs on both sides.
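
As a rough illustration of that kind of shared function, here is a sketch of a validator that uses only core language features, so the same source could run in a browser or in a server-side JavaScript engine. The name `validateKeyword`, the 50-character limit, and the allowed character set are assumptions made up for this example:

```javascript
// Hypothetical validator for a search keyword. It relies only on core
// language features, so the same code can run on client and server.
function validateKeyword(input) {
  const keyword = String(input).trim();
  if (keyword.length === 0) {
    return { ok: false, error: "Keyword is empty" };
  }
  if (keyword.length > 50) {
    return { ok: false, error: "Keyword is too long" };
  }
  // Allow letters, digits, underscores, whitespace, and hyphens only.
  if (!/^[\w\s-]+$/.test(keyword)) {
    return { ok: false, error: "Keyword has invalid characters" };
  }
  return { ok: true, value: keyword };
}

// Export for CommonJS-style server environments when available; when
// loaded through a browser script tag the function is simply a global.
if (typeof module !== "undefined" && module.exports) {
  module.exports = { validateKeyword };
}
```

The client would call it before submitting the form, and the server would call it again on the received value, since client-side checks can be bypassed.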

Or do you mean not wanting to try writing a crawler?
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jan 5 '08 #4

Lee
Evertjan. said:
> Lee wrote on 04 jan 2008 in comp.lang.javascript:
>> 2. What you think of as Javascript is almost certainly client-side
>> code, and it cannot see anything on the server. It is possible on
>> some servers to execute Javascript on the server, but it's not
>> something you're likely to want to try.
>
> Why?
>
> Writing serverside javascript is a joy.
>
> Many functions can be written for clientside
> and serverside use without any conversion,
> like dual input verification of data.
>
> Or do you mean not wanting to try writing a crawler?

I mean that somebody who has to ask whether it's possible
to write a crawler in Javascript probably doesn't want to
try to write server-side Javascript.

At least not until they've had more experience in writing
client-side Javascript and have gained working knowledge
of client/server differences.
--

Jan 5 '08 #5

Lee wrote on 05 jan 2008 in comp.lang.javascript:
> Evertjan. said:
>> Lee wrote on 04 jan 2008 in comp.lang.javascript:
>>> 2. What you think of as Javascript is almost certainly client-side
>>> code, and it cannot see anything on the server. It is possible on
>>> some servers to execute Javascript on the server, but it's not
>>> something you're likely to want to try.
>>
>> Why?
>>
>> Writing serverside javascript is a joy.
>>
>> Many functions can be written for clientside
>> and serverside use without any conversion,
>> like dual input verification of data.
>>
>> Or do you mean not wanting to try writing a crawler?
>
> I mean that somebody who has to ask whether it's possible
> to write a crawler in Javascript probably doesn't want to
> try to write server-side Javascript.
>
> At least not until they've had more experience in writing
> client-side Javascript and have gained working knowledge
> of client/server differences.

Agree!
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Jan 5 '08 #6

On Jan 4, 4:25 pm, bdy120...@gmail.com wrote:
> In addition to the question in the subject line, if the answer is yes,
> is it possible to locate keywords as part of the functionality of said
> crawler (bot, spider)?
>
> Basically, I would like to write a stand-alone form (javascript app.)
> to perform a site-specific keyword search.
>
> Can I do the aforementioned in Javascript?
>
> Thanks.

(I'm assuming that you want something that will run completely inside
of your web browser, and not use Adobe AIR, a Java applet, Firefox
plugin, or anything like that.) I am certain that you can do this.
You'd have to have the web crawler/search logic in one window/frame,
and have it pilot a second window/frame to various web pages, and
search their contents. This probably wouldn't be too fast, but if you
were only searching a limited number of pages, it'd probably be fast
enough. For bonus points you could try building an index of crawled
content, and searching that.
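
The indexing idea could look roughly like the sketch below. It leaves out the part that pilots the second window or frame and works on page text that has already been collected. In a browser, that text might come from something like an iframe's `contentDocument.body.textContent`, which is readable only for pages on the same origin. The function names and the sample pages are hypothetical:

```javascript
// Split text into lowercase alphanumeric words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build an inverted index: word -> Set of page URLs containing that word.
function buildIndex(pages) {
  const index = new Map();
  for (const [url, text] of Object.entries(pages)) {
    for (const word of tokenize(text)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(url);
    }
  }
  return index;
}

// Return the sorted URLs of pages that contain every word in the query.
function search(index, query) {
  const words = tokenize(query);
  if (words.length === 0) return [];
  let result = null;
  for (const word of words) {
    const urls = index.get(word) || new Set();
    result = result === null
      ? new Set(urls)
      : new Set([...result].filter((u) => urls.has(u)));
  }
  return [...result].sort();
}

// Hypothetical crawled content, keyed by URL.
const pages = {
  "/index.html": "Welcome to the spider and crawler FAQ",
  "/about.html": "About this site and its crawler",
};
const index = buildIndex(pages);
console.log(search(index, "crawler"));        // both pages match
console.log(search(index, "spider crawler")); // only /index.html matches
```

Building the index once and querying it repeatedly avoids re-reading every page for each search, which matters given how slow loading pages into a frame is likely to be.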

James Tikalsky
Jan 6 '08 #7

bd*******@gmail.com wrote:
> Thank you for all your answers. OK, then, which language would be the
> easiest to write such an application?

Look at Java because the standard class libraries contain most of the
code needed for following URLs as well as accessing and parsing HTML.

I had no trouble writing a URL checker for my set of private reference
pages using these classes. It's a sort of primitive crawler that parses a
page, extracts the URLs from anchor tags and checks whether the target
object exists.
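
The link-extraction step of such a checker can be sketched as follows. This is not the Java program described above; it is a JavaScript illustration of the same idea, kept in the thread's language. It assumes the standard `URL` class, uses a regular expression as a simplification rather than a real HTML parser, and leaves out the network request that would test whether each target exists. The name `extractLinks` and the sample markup are hypothetical:

```javascript
// Extract href values from <a> tags and resolve them against the page URL.
// The regular expression handles quoted href values only; it is a
// simplification, not a full HTML parser.
function extractLinks(html, baseUrl) {
  const links = [];
  const anchor = /<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match;
  while ((match = anchor.exec(html)) !== null) {
    const href = match[1] !== undefined ? match[1] : match[2];
    try {
      // Resolve relative links against the URL of the page they came from.
      links.push(new URL(href, baseUrl).href);
    } catch (e) {
      // Skip values that cannot be parsed as a URL.
    }
  }
  return links;
}

const html =
  '<p><a href="/docs/a.html">A</a> and ' +
  "<a class=\"x\" href='http://example.org/b'>B</a></p>";
console.log(extractLinks(html, "http://example.com/index.html"));
```

Each resolved URL could then be requested to see whether the target responds, which is the checking step the description above mentions.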
--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |
Jan 7 '08 #8

This discussion thread is closed
