wikipedia

There is a free encyclopedia called Wikipedia
(http://wikimediafoundation.org/). Does anyone know how to use it in order
to get various articles for displaying them in my application?
Oct 11 '06 #1
"Pitaridis Aristotelis" <pi*******@hotmail.com> wrote in
news:1160551474.444714@athnrd02:

> There is a free encyclopedia called Wikipedia
> (http://wikimediafoundation.org/). Does anyone know how to use it in
> order to get various articles for displaying them in my application?

If you have .NET 2.0, you can use the web browser control to fetch web
pages.

In .NET 1.1 you'll have to use a wrapper class to access web pages.

Wikipedia also provides content in XML files ... so you could render the
content using the XML files too.
Oct 11 '06 #2
Both ways are very interesting. Where can I find information about this,
especially for the XML way???

Oct 12 '06 #3
"Pitaridis Aristotelis" <pi*******@hotmail.com> wrote in
news:1160634267.217299@athprx04:

> Both ways are very interesting. Where can I find information about this,
> especially for the XML way???

AFAIK, Wikipedia provides data in XML files which are dumped daily? They're
massive - a couple of gigabytes.

There is also a per-page export utility:

http://en.wikipedia.org/wiki/Special:Export/

I'm sure Wikipedia has other "hidden" APIs too :-)

Oct 13 '06 #4
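
The per-page export address mentioned above can be called from code. The following is only a minimal sketch, assuming the export is still served under that address and that the page title is simply appended to it. `BuildExportUrl`, `LoadExport`, and the User-Agent value are hypothetical names and placeholders chosen for illustration; they do not come from the thread.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Xml

Module ExportFetcher
    ' Base address taken from the post above.
    Private Const ExportBase As String = "http://en.wikipedia.org/wiki/Special:Export/"

    ' Hypothetical helper: builds the export URL for a page title.
    ' MediaWiki treats spaces and underscores in titles as equivalent.
    Function BuildExportUrl(ByVal title As String) As String
        Return ExportBase & Uri.EscapeDataString(title.Trim().Replace(" "c, "_"c))
    End Function

    ' Hypothetical helper: downloads the export and parses it into an XmlDocument.
    Function LoadExport(ByVal title As String) As XmlDocument
        Dim doc As New XmlDocument()
        Using client As New WebClient()
            ' Assumption: the server expects a descriptive User-Agent.
            ' The value below is a placeholder, not a real application name.
            client.Headers.Add("User-Agent", "MyWikiReader/0.1 (contact: you@example.com)")
            ' Load from raw bytes so the XML parser detects the encoding itself.
            Dim data As Byte() = client.DownloadData(BuildExportUrl(title))
            Using stream As New MemoryStream(data)
                doc.Load(stream)
            End Using
        End Using
        Return doc
    End Function
End Module
```

Calling `LoadExport("Play")` would request the export for the page titled "Play". Network errors surface as a `WebException`, which the caller should be prepared to handle.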
Spam Catcher <sp**********@rogers.com> wrote in
news:Xn**********************************@127.0.0.1:

> "Pitaridis Aristotelis" <pi*******@hotmail.com> wrote in
> news:1160634267.217299@athprx04:
>
>> Both ways are very interesting. Where can I find information about
>> this, especially for the XML way???
>
> AFAIK, Wikipedia provides data in XML files which are dumped daily?
> They're massive - a couple of gigabytes.
>
> There is also a per-page export utility:
>
> http://en.wikipedia.org/wiki/Special:Export/
>
> I'm sure Wikipedia has other "hidden" APIs too :-)

More info here:

http://en.wikipedia.org/wiki/Help:Export

http://en.wikipedia.org/wiki/Wikiped...abase_download

http://download.wikimedia.org/
Oct 13 '06 #5
I found a way to get the xml result, but I have no idea about xml. Can
someone write a function which will get only the text which contains the
article?
Thanks

Oct 13 '06 #6
"Pitaridis Aristotelis" <pi*******@hotmail.com> wrote in
news:1160724301.213283@athprx03:

> I found a way to get the xml result, but I have no idea about xml. Can
> someone write a function which will get only the text which contains the
> article?
> Thanks

Yes. Take a look at the System.Xml namespace.
Oct 13 '06 #7
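
As one possible starting point in System.Xml, here is a rough sketch that streams through the export with `XmlReader` and returns the content of the first `<text>` element. It assumes the article body sits in an element whose local name is `text`, as in the export structure posted later in this thread. `ReadFirstText` is a hypothetical helper name. Because the match is on the local name only, it does not depend on which namespace the export declares.

```vbnet
Imports System.IO
Imports System.Xml

Module ExportReader
    ' Hypothetical helper: returns the content of the first <text> element,
    ' or Nothing if the input contains no such element.
    Function ReadFirstText(ByVal input As TextReader) As String
        Using reader As XmlReader = XmlReader.Create(input)
            While reader.Read()
                ' Compare the local name so the element matches in any namespace.
                If reader.NodeType = XmlNodeType.Element AndAlso reader.LocalName = "text" Then
                    Return reader.ReadElementContentAsString()
                End If
            End While
        End Using
        Return Nothing
    End Function
End Module
```

A forward-only reader never holds the whole document in memory, which may matter more for the multi-gigabyte dumps mentioned earlier than for a single exported page.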
I have used the following code, but it does not work:

Doc.Load("http://en.wikipedia.org/wiki/Special:Export/test")
If Doc.SelectNodes("/mediawiki/page/revision/text").Count > 0 Then
    Dim output As String = Doc.SelectNodes("/mediawiki/page/revision/text").Item(0).InnerText
End If

The structure of the XML file is like this:

<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.3.xsd"
           version="0.3" xml:lang="en">
  <siteinfo>
    <sitename>Wikipedia</sitename>
    <base>http://en.wikipedia.org/wiki/Main_Page</base>
    <generator>MediaWiki 1.9alpha</generator>
    <case>first-letter</case>
    <namespaces>
      <namespace key="-2">Media</namespace>
      <namespace key="-1">Special</namespace>
      <namespace key="0" />
      <namespace key="1">Talk</namespace>
      <namespace key="2">User</namespace>
      <namespace key="3">User talk</namespace>
      <namespace key="4">Wikipedia</namespace>
      <namespace key="5">Wikipedia talk</namespace>
      <namespace key="6">Image</namespace>
      <namespace key="7">Image talk</namespace>
      <namespace key="8">MediaWiki</namespace>
      <namespace key="9">MediaWiki talk</namespace>
      <namespace key="10">Template</namespace>
      <namespace key="11">Template talk</namespace>
      <namespace key="12">Help</namespace>
      <namespace key="13">Help talk</namespace>
      <namespace key="14">Category</namespace>
      <namespace key="15">Category talk</namespace>
      <namespace key="100">Portal</namespace>
      <namespace key="101">Portal talk</namespace>
    </namespaces>
  </siteinfo>
  <page>
    <title>Play</title>
    <id>22962</id>
    <revision>
      <id>79448292</id>
      <timestamp>2006-10-04T13:00:19Z</timestamp>
      <contributor>
        <username>JonHarder</username>
        <id>629503</id>
      </contributor>
      <comment>revert: not seeing why this article should be exempt from citing sources.</comment>
      <text xml:space="preserve">XXXXXX this is the part that I want to get XXXXXX</text>
    </revision>
  </page>
</mediawiki>

Oct 14 '06 #8
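
A likely reason the XPath in post #8 finds nothing: the root element declares a default namespace (`xmlns="http://www.mediawiki.org/xml/export-0.3/"`), and in XPath 1.0 an unprefixed name such as `mediawiki` only matches elements that are in no namespace. A common way around this in System.Xml is to register a prefix with an `XmlNamespaceManager` and use it in the query. The sketch below assumes the document is already loaded and that its root carries the default namespace. `GetArticleText` and the prefix `mw` are arbitrary names chosen for illustration.

```vbnet
Imports System.Xml

Module ExportParser
    ' Hypothetical helper: returns the article text from a loaded export
    ' document, or Nothing if the expected <text> element is missing.
    Function GetArticleText(ByVal doc As XmlDocument) As String
        Dim ns As New XmlNamespaceManager(doc.NameTable)
        ' "mw" is an arbitrary prefix. The URI is read from the root element
        ' so the query is not tied to one version of the export schema.
        ns.AddNamespace("mw", doc.DocumentElement.NamespaceURI)

        Dim node As XmlNode = doc.SelectSingleNode("/mw:mediawiki/mw:page/mw:revision/mw:text", ns)
        If node Is Nothing Then
            Return Nothing
        End If
        Return node.InnerText
    End Function
End Module
```

With the document from post #8 loaded into `Doc`, `GetArticleText(Doc)` should return the content of the `<text>` element. The returned string is raw wiki markup, so it still needs rendering or stripping before it is shown to users.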
