a problem help me

Hello everyone,
Does anyone have information on how to build a tool in .NET that
translates web page contents to HTML format?

Please reply as soon as possible.

Thanks in advance

Dhananjay

Nov 28 '06 #1
Given that most web pages are *in* HTML (or a variant), it wouldn't
have a lot to do...

Can you clarify what you mean?

Marc

Nov 28 '06 #2

Hello Marc,
My problem is this. First I have to import a client to a website (a
specified website, and there may be more than one). Then I have to build
a tool that converts the web page contents to HTML format and saves that
HTML to a database (SQL Server).
How can I achieve this? Could you please help me?

Please reply as soon as possible.

Thanks
Dhananjay

Nov 28 '06 #3
You don't need to "convert" anything here. The website is probably in
HTML already. If it isn't you won't be able to do it. You may be able
to use WebClient here to simply download the text (see MSDN2) - but
even then it won't be a usable static copy, as all images, scripts,
cookies, links etc will probably be dead if you *just* use the HTML in
isolation of the other stuff.
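
For the WebClient route, a minimal sketch follows. It assumes .NET 2.0
or later; the class name and the example.com address are placeholders
for illustration only. WebClient lives in the System.Net namespace, so
the using directive at the top is needed for it to compile.

```csharp
using System;
using System.Net;   // WebClient is defined in this namespace

class DownloadPage
{
    static void Main(string[] args)
    {
        // Placeholder address; pass a real one as the first argument.
        string url = args.Length > 0 ? args[0] : "http://example.com/";

        using (WebClient client = new WebClient())
        {
            // DownloadString returns the response body as text.
            // For an ordinary web page, that text is the page's HTML.
            string html = client.DownloadString(url);
            Console.WriteLine("Downloaded {0} characters from {1}", html.Length, url);
        }
    }
}
```

This fetches only the HTML text of one page. Images, scripts, and
stylesheets that the page refers to are not downloaded.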

You could try and use WebBrowser to export as an mht; never tried it -
might work. Alternatively if it is for later reference you might try
using tools like HTMLDOC to create a standalone PDF of the page (hence
including the images but not scripts).

Alternatively you can find lots of crawlers on google to do this for
you.

It really depends on what *exactly* you need. And unless this is
somehow a C# issue you may find other groups more useful.

Marc

Nov 28 '06 #4

Hello Marc,
Thanks for taking the time to help.
I tried what you suggested, but it is not working; it reports a
namespace problem. I think this feature is different. I am using VS2005
with C#.
Can you tell me one thing: should I solve this with a Windows
application or a web application? First I have to import a client to a
website, and then build a tool that converts web page contents to HTML
format and saves it to a SQL Server database.
I first did this in VB.NET and built a tool that converts web page
contents to HTML format, but the same thing is not working in C#.

Please reply as soon as possible.
Thanks,
Dhananjay

Nov 28 '06 #5
Hi Dhananjay,

Try working with the HTTP request and response objects; they are very
helpful for this. Simply send a request to the specific URL using
WebRequest or HttpWebRequest, and then save the content stream of the
response into your database. That part is simple: you get the page you
requested. But your aim is to get the whole website, so search the main
response stream for other links, form a URL from each one, and process
it the same way...
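
The steps above can be sketched roughly as follows. This is an outline
under assumptions, not a finished crawler: it assumes .NET 2.0, the
class name, method names, and start address are made up for
illustration, and the regular expression is a crude stand-in for a
proper HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

class LinkFinder
{
    // Requests one page with HttpWebRequest and returns the body as text.
    // StreamReader decodes as UTF-8 here, which is an assumption about the page.
    static string Fetch(Uri address)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(address);
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Collects href="..." values and resolves each against the page address,
    // so relative links such as "about.html" become absolute URLs.
    static List<Uri> ExtractLinks(Uri pageAddress, string html)
    {
        List<Uri> links = new List<Uri>();
        Regex href = new Regex("href\\s*=\\s*[\"']([^\"']+)[\"']", RegexOptions.IgnoreCase);
        foreach (Match m in href.Matches(html))
        {
            string value = m.Groups[1].Value;
            if (value.StartsWith("#"))
            {
                continue;   // in-page anchor, not another page
            }

            Uri absolute;
            bool isWebLink = Uri.TryCreate(pageAddress, value, out absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps);
            if (isWebLink && !links.Contains(absolute))
            {
                links.Add(absolute);   // skips mailto:, javascript:, and duplicates
            }
        }
        return links;
    }

    static void Main()
    {
        Uri start = new Uri("http://example.com/");   // placeholder start page
        string html = Fetch(start);
        foreach (Uri link in ExtractLinks(start, html))
        {
            Console.WriteLine(link);
        }
    }
}
```

To cover a whole site, each returned link would be fetched in turn,
with a record of visited URLs so that no page is processed twice.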

Nov 28 '06 #6
Hello Dhananjay,

First off, your English is vague. This leads to some misunderstanding.
More on that below.

Secondly, it is not clear what BUSINESS PROBLEM you are trying to solve.
Before you jump to "what is wrong with my solution," please help us to
understand what problem your code is trying to solve. There may be a better
way than writing code!

Thirdly, if you have written code, and it is not working, please post it.
That provides a great deal of information for us to help you.

Now, back to your request.

You said:
> first i have to import client to a website
> and then generate a tool to convert webpage contents to html format
> save it to sql server databse.

1. I do not know what this phrase means: "import client to a website". I have
no idea what you are trying to accomplish. Can you use different words to
describe what you mean?

2. I do not know what is difficult about this: "convert webpage contents to
html format" since nearly all web pages are already in HTML format. That is
the nature of the web. All browsers begin by reading HTML. Note that if
the HTML in your target web page is constructed on the fly using Javascript,
then you are going to have a TOUGH time emulating that in C# code.

3. You want to "save it to a sql server database". What is "it" that you
are saving? Each page? Each element on a page? The content of the page?
Why save it to SQL? Do you intend to look up pages using SQL queries? Why
not save it as a web site and use HTTP to get the pages?
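
If the answer turns out to be "each whole page, looked up by its URL",
the storage side is not much code. The following is a minimal sketch
only: it assumes SQL Server 2005 (for NVARCHAR(MAX)), and the
connection string, database name, table, and column names are all
hypothetical.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class PageStore
{
    // Hypothetical connection string; adjust to the real server and database.
    // Assumed table:
    //   CREATE TABLE Pages (Url NVARCHAR(400) PRIMARY KEY,
    //                       Html NVARCHAR(MAX), FetchedAt DATETIME)
    const string ConnectionString =
        "Data Source=(local);Initial Catalog=PageCache;Integrated Security=True";

    static void SavePage(string url, string html)
    {
        using (SqlConnection connection = new SqlConnection(ConnectionString))
        using (SqlCommand command = new SqlCommand(
            "INSERT INTO Pages (Url, Html, FetchedAt) VALUES (@url, @html, @fetchedAt)",
            connection))
        {
            // Parameters keep the HTML, which is full of quotes, out of the SQL text.
            command.Parameters.Add("@url", SqlDbType.NVarChar, 400).Value = url;
            command.Parameters.Add("@html", SqlDbType.NVarChar, -1).Value = html;   // -1 means MAX
            command.Parameters.Add("@fetchedAt", SqlDbType.DateTime).Value = DateTime.UtcNow;

            connection.Open();
            command.ExecuteNonQuery();
        }
    }

    static void Main()
    {
        // Placeholder values for illustration.
        SavePage("http://example.com/", "<html><body>Hello</body></html>");
    }
}
```

A lookup is then a SELECT on the Url column. Whether that is better
than plain files on disk depends on the answers to the questions above.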

I want to help. But until you answer some of these questions, I won't be
terribly helpful.

Note: Are you looking for something like WinHTTrack? This tool is useful
for visiting a web site and creating, on your hard drive, a complete copy of
the site with links intact. It's fairly friendly and easy to use.
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
Nov 28 '06 #7

Hello Nick,
Since you asked some questions, here is, in simple terms, what I am
trying to achieve:
I plan to build a cache system. It will import content from different
Dhananjay-Sites, translate the dhananjay-Code into HTML, and republish
it in a specific format on a file system.

Could you please guide me on how to proceed? Or, if you have developed
something like this before, could you send me the resources so that I
can move toward the target more easily? If you need more detail, let me
know.

Please reply as soon as possible.
Thanks
Dhananjay

Nov 28 '06 #8
Hello Dhananjay,

You are building a cache system. I assume from your statement that the goal
is for a person, using their web browser, to be able to visit a web site
while online, cache the site, and then visit it again when offline. Is this
true? (Are you aware that this is built-in functionality in the IE browser?
Simply add the site to favorites and check the "make available offline"
check box.)

I will assume, given the fact that this is trivial for an individual user,
that you intend for this cache to be visited by more than one user.
Therefore, I assume that the source sites are somehow more 'difficult' to
reach or less reliable than your cache server. In that case, you need to
provide what is called a 'proxy cache' in that the users will hit your site,
looking for the web pages that they want, and your app will get the data
from the remote system, update the local cache, and serve the pages.
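
For what it is worth, the lookup logic at the heart of such a cache is
small. The sketch below shows only that logic (serve the local copy
while it is fresh, otherwise fetch, store, and serve); it is not a
proxy server. It assumes .NET 2.0, and the cache folder, the one-hour
freshness window, and all names are hypothetical choices.

```csharp
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;
using System.Text;

class PageCache
{
    // Hypothetical cache location and freshness window.
    static readonly string CacheFolder = Path.Combine(Path.GetTempPath(), "page-cache");
    static readonly TimeSpan MaxAge = TimeSpan.FromHours(1);

    // Hashes the URL so that any address maps to a safe file name.
    static string CachePathFor(string url)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(url));
            string name = BitConverter.ToString(hash).Replace("-", "") + ".html";
            return Path.Combine(CacheFolder, name);
        }
    }

    // Returns the cached copy while it is fresh; otherwise downloads the
    // page from the remote site, updates the local file, and returns it.
    static string GetPage(string url)
    {
        string path = CachePathFor(url);
        if (File.Exists(path) && DateTime.UtcNow - File.GetLastWriteTimeUtc(path) < MaxAge)
        {
            return File.ReadAllText(path);
        }

        string html;
        using (WebClient client = new WebClient())
        {
            html = client.DownloadString(url);
        }
        Directory.CreateDirectory(CacheFolder);   // no-op if it already exists
        File.WriteAllText(path, html);
        return html;
    }

    static void Main()
    {
        string url = "http://example.com/";        // placeholder address
        Console.WriteLine(GetPage(url).Length);    // fetched from the remote site if not cached
        Console.WriteLine(GetPage(url).Length);    // served from the local file
    }
}
```

A real proxy cache would also have to honor the HTTP caching headers
sent by the source site, which this sketch ignores.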

Of course, there is no need to write code for any of this. Simply install
ISA server. http://www.microsoft.com/isaserver/default.mspx

On the off chance that you posted on a developer forum because you'd rather
develop software than install existing stuff (;-), then perhaps the code on
this link would be helpful. It is not a proxy server. It is, instead, a
web site spider. That actually sounds more like what you are saying you
want. The link provides complete, open-source C# code for downloading web
sites to a local hard drive:
http://www.codeproject.com/useritems/ZetaWebSpider.asp

For a more full-featured system that caches web sites, one that is free
but (to the best of my knowledge) not written in C#, check out HTTrack.
The Windows version is WinHTTrack (www.httrack.com).

I hope this helps,
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
Nov 28 '06 #9
