|
by: Phong Ho |
last post by:
Hi everyone,
I am trying to write a simple web crawler. It has to do the following:
1) Open a URL and retrieve an HTML file.
2) Extract news headlines from the HTML file.
3) Put the headlines into an RSS file.
For example, I want to go to this site and extract the headlines:
www.unstrung.com/section.asp?section_id=86
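One possible way to approach the three steps, as a minimal Python sketch using only the standard library. The post does not show the page's markup, so this assumes each headline is a link inside an `<h2>` element; the names `fetch`, `HeadlineParser`, `extract_headlines` and `build_rss` are made up for illustration, and the channel description text is a placeholder.

```python
from html.parser import HTMLParser
from urllib.request import urlopen
import xml.etree.ElementTree as ET


def fetch(url):
    """Step 1: open a URL and return the page as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


class HeadlineParser(HTMLParser):
    """Step 2: collect (text, href) for each <a> inside an <h2>.

    The <h2><a> layout is an assumption; adjust it to the real page.
    """

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.href = None      # not None while inside a headline link
        self.text = []
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
        elif tag == "a" and self.in_h2:
            self.href = dict(attrs).get("href") or ""
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.headlines.append(("".join(self.text).strip(), self.href))
            self.href = None
        elif tag == "h2":
            self.in_h2 = False


def extract_headlines(html):
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


def build_rss(headlines, channel_title, channel_link):
    """Step 3: build a minimal RSS 2.0 document as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Extracted headlines"
    for title, link in headlines:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")
```

To use it, pass the result of `fetch(...)` to `extract_headlines(...)` and write the string returned by `build_rss(...)` to a file. A parser is used here instead of string searching so that entities and nested tags inside a headline are handled without extra work.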
|
by: mbbx6spp |
last post by:
Hi All,
I already searched this newsgroup and Google Groups to see if I could
find a Python equivalent to Perl's Template::Extract, but didn't find
anything leading to a Python module that had similar functionality. I
am a big fan of Python as an OO language and use it for many system
admin utilities, webdev and even MS Excel AddIn...
|
by: mark4 |
last post by:
Hello,
Are there any utilities to help me extract content from HTML?
I'd like to store this data in a database.
The HTML consists of about 10,000 files with a total size of
about 160 MB. Each file is a thread from a message forum. Each
thread has several contributions. The threads are in linear
order of date posted with filenames such...
|
by: Ori |
last post by:
Hi,
I have some HTML text that I need to parse in order to extract data from
it.
My HTML contains a table with a few rows and two columns. I want to
extract the data from the 2nd column in the most efficient way (using
a regular expression) rather than using the "indexOf" function of String.
Thanks,
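A rough sketch of the regular-expression approach, written in Python rather than the poster's language. It assumes a simple table whose cells are plain `<td>` elements with no nested tables; the function name `second_column` is hypothetical.

```python
import re

# Match each table row, then each cell inside it. These patterns assume
# simple, non-nested markup; a real HTML parser is safer for messy pages.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def second_column(html):
    """Return the text of the 2nd <td> of every row that has one."""
    values = []
    for row in ROW_RE.findall(html):
        cells = CELL_RE.findall(row)
        if len(cells) >= 2:
            # Drop any inline tags (e.g. <b>) left inside the cell.
            values.append(TAG_RE.sub("", cells[1]).strip())
    return values
```

Rows with fewer than two `<td>` cells, such as a header row built from `<th>`, are skipped.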
|
by: Brian Hanson |
last post by:
Hi,
I have an unusual problem that just showed its ugly head at a pretty
bad time. I have an asp.net (VB) app that takes data from an Excel
sheet and puts it into SQL Server. I get the data out of Excel using
OleDB, and suddenly, some of the data was not being extracted from
Excel.
I use OleDb for the extract into a DataTable and from...
|
by: mandibdc |
last post by:
I need to extract some elements from a very large XML file. Because of
the size, I'd like to work with it on my Linux machine as a text file.
Basically, I am going to have a list of specific strings I'm searching
for. For each string, I need to search through the XML file, and when
I find that string (in the tag <code>), copy the entire...
|
by: manishabh77 |
last post by:
I will be obliged if anybody can help me with this problem:
I am trying to extract data from an Excel sheet that matches IDs given in column 4 of the sheet. I have stored those query IDs in an array (@names). After I look for the match in this section of the code:
if ($value=~/^$names$/), I want to write out only those rows that satisfy the...
|
by: SteveB |
last post by:
I have posted this question in the Visual Basic 2005 and Visual
Basic .Net 2005 discussion groups, also.
Hi. I am developing an application/web page with VB.Net that will
populate a SQL database from text extracted from PDF documents.
However, I am having a difficult time finding or developing the
appropriate code to convert the PDF...
|
by: NaraN |
last post by:
I am new to Perl scripting. I am having trouble writing a program.
I have a number of files containing the same type of data with the same header.
Each file looks like this:
SN.  Cities  temperature  Humidity  rainfall
1    abc     33           66        23
2    ghi     36           83        12
3    xyz     23           78        11
......
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can work as a router. In this blog post, we’ll explore what an ONU is, what a router is, ONU & Router’s main...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré, who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two pfSense firewalls using IPsec.
I succeeded, with both firewalls in...
|
by: adsilva |
last post by:
A Windows Forms form does not have an Unload event like VB6 does. Which event acts like it?
|
by: 6302768590 |
last post by:
Hi team,
I want code to transfer data from one system to another over an IP address using C#. Our system has to check every 5 minutes for updated data, and whatever data has been updated has to be sent to the other system.
|