Aw, forget it

This is a bunch of bull cr*p. I have tried copying tables off the web,
and there are so many variations that it's not feasible to write a single
regex for every situation.

So, I give up.
Dec 31 '06 #1
Hello...

Please try to keep your related posts together in one thread ;)

And now a suggestion:

Try the HTML DOM and look at the tags there. Each node has an innerText
property, which can be used to extract the text from any HTML node or even the
whole document. Or you can walk all the tables, table rows, or table data
cells and extract the information from there. But scraping information
from websites is usually quite hard ;)
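
To make the DOM-style suggestion concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not the poster's code. The names `TableTextExtractor` and `extract_tables` are made up for illustration. It assumes tables are not nested inside each other, and it tolerates missing closing `</td>` and `</tr>` tags.

```python
from html.parser import HTMLParser


class TableTextExtractor(HTMLParser):
    """Collect cell text as tables -> rows -> cells (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = []       # list of tables; each table is a list of rows
        self._in_cell = False  # True while inside a <td> or <th>
        self._parts = []       # text fragments of the current cell

    def _flush_cell(self):
        # Store the current cell, collapsing runs of whitespace.
        if self._in_cell:
            text = " ".join("".join(self._parts).split())
            self.tables[-1][-1].append(text)
            self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "table":
            self._flush_cell()
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._flush_cell()
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables:
            self._flush_cell()
            if not self.tables[-1]:
                self.tables[-1].append([])  # cell with no explicit <tr>
            self._in_cell = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._flush_cell()

    def handle_data(self, data):
        if self._in_cell:
            self._parts.append(data)


def extract_tables(html):
    parser = TableTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


sample = ("<table><tr><th>Name</th><th>Qty</th></tr>"
          "<tr><td>Apple</td><td>3</td></tr></table>")
print(extract_tables(sample))  # [[['Name', 'Qty'], ['Apple', '3']]]
```

Because the parser deals with tag case, attributes and entities, the code only has to decide what to do at each table, row and cell boundary.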

Dec 31 '06 #2
This will get all the tables. Set the IgnoreCase and Singleline options. Use
groups.

<table\b.*?</table>
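
For anyone who wants to try the regex idea quickly, here is a rough Python equivalent (a sketch, not the poster's code). `re.IGNORECASE` and `re.DOTALL` stand in for the .NET IgnoreCase and Singleline options. The `\b` lets a bare `<table>` with no attributes match too. The name `find_tables` is made up for illustration.

```python
import re

# Lazy match from an opening <table ...> to the first closing </table>.
TABLE_RE = re.compile(r"<table\b.*?</table>", re.IGNORECASE | re.DOTALL)


def find_tables(html):
    """Return the raw markup of each matched table block."""
    return TABLE_RE.findall(html)


page = "<p>x</p><TABLE>\n<tr><td>1</td></tr>\n</Table><table class='a'><tr><td>2</td></tr></table>"
for block in find_tables(page):
    print(block)
```

One caveat: the lazy match stops at the first `</table>`, so a table nested inside another table cuts the outer match short. That is one of the variations a single regex cannot handle cleanly.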

Dec 31 '06 #3

Well shux, why don't you just read the file one char at a time and use
"if" statements and comparison operators?
It won't be a minimal task, but it won't be that tough, either.
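
A minimal sketch of that char-at-a-time approach, in Python for illustration (the function name `split_markup` is made up). It only splits the input into tag and text tokens using plain `if` tests. It assumes no `>` inside attribute values, and it does not treat comments or script blocks specially.

```python
def split_markup(html):
    """Walk the input one character at a time and return (kind, text) tokens."""
    tokens = []
    buf = []
    in_tag = False
    for ch in html:
        if ch == "<" and not in_tag:
            # A tag starts: emit any text collected so far.
            if buf:
                tokens.append(("text", "".join(buf)))
            buf = [ch]
            in_tag = True
        elif ch == ">" and in_tag:
            # The tag ends: emit it and go back to collecting text.
            buf.append(ch)
            tokens.append(("tag", "".join(buf)))
            buf = []
            in_tag = False
        else:
            buf.append(ch)
    if buf:
        # Leftover input: an unterminated tag or trailing text.
        tokens.append(("tag" if in_tag else "text", "".join(buf)))
    return tokens


print(split_markup("<td>a</td>"))
# [('tag', '<td>'), ('text', 'a'), ('tag', '</td>')]
```

From here, finding tables means watching for tag tokens that start with `<table`, `<tr` or `<td` (compared case-insensitively) and collecting the text tokens in between.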

Dec 31 '06 #4
HTML Parser

http://www.codeproject.com/dotnet/apmilhtml.asp

Jan 1 '07 #5
Just Me,

Rather than using Regex, try MSHTML; it makes it much easier to get information
out of web documents. Be aware that a page can consist of several documents (frames).

http://www.vb-tips.com/dbpages.aspx?...f-56dbb63fdf1c

Be aware that our website is under heavy reconstruction these weeks.

I hope this helps,

Cor

Jan 1 '07 #6
