Robust regex needed

Hi there

First off, I have to explain something: I need to grab some data out of
a phpBB forum in order to do some field research. The forum is run by a
user community, and I need the data to analyze the discussions.

To give an example, let us take this forum here. How can I grab all the
data out of it, store it locally, and afterwards put it into a local
database of a phpBB forum? Is this possible? In other words, am I able
to grab and harvest data from an ordinary phpBB board (e.g. phpBB.com)?
How can I do that?

What I have in mind is nothing harmful, nothing bad, nothing serious or
dangerous. The issue is simply that I have to get the data.
I need to pull the forum messages and other data (forum topics, users)
into a database. Purpose: create a copy of the forum for text analysis.
Does anyone have an approximate solution?

The data has to be fetched through HTTP for further analysis and put
into CSV, in order to get a dump that can fill the local database of a
phpBB board.
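
A minimal sketch of the HTTP step, assuming the board exposes topics through `viewtopic.php` with `t` (topic id) and `start` (post offset) query parameters. The base URL, the topic id, the posts-per-page value of 15, and the helper names `topic_page_urls` and `fetch` are all assumptions for illustration; the page size depends on the board's configuration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def topic_page_urls(base_url, topic_id, total_posts, posts_per_page=15):
    """Build one URL per page of a topic (posts_per_page is an assumption)."""
    urls = []
    for start in range(0, total_posts, posts_per_page):
        query = urlencode({"t": topic_id, "start": start})
        urls.append(f"{base_url.rstrip('/')}/viewtopic.php?{query}")
    return urls


def fetch(url, timeout=30):
    """Download one page and decode it using the charset the server reports."""
    request = Request(url, headers={"User-Agent": "forum-research-script"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical board and topic: 40 posts spread over pages of 15.
urls = topic_page_urls("http://forum.example.org/", 42, 40)
```

Calling `fetch(url)` for each entry in `urls` would return the raw HTML of that page, which can then be parsed.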

I need the data in an almost full and complete format. So I need all
the data, like:

username
forum
thread
topic
text of the posting, and so on.
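
The fields listed above could be written to CSV roughly like this. It is only a sketch: the column names in `FIELDS`, the helper `posts_to_csv`, and the sample post are made up for illustration. Quoting every field keeps commas, quotes, and line breaks inside a posting from breaking the file.

```python
import csv
import io

# Hypothetical column layout, based on the fields listed in the post.
FIELDS = ["forum", "topic", "username", "text"]


def posts_to_csv(posts, out):
    """Write an iterable of post dicts to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, quoting=csv.QUOTE_ALL)
    writer.writeheader()
    for post in posts:
        # Missing fields become empty strings instead of raising an error.
        writer.writerow({name: post.get(name, "") for name in FIELDS})


sample = [
    {
        "forum": "General",
        "topic": "Hello",
        "username": "alice",
        "text": 'She said "hi", then left.\nSecond line.',
    },
]
buffer = io.StringIO()
posts_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, open the file with `open(path, "w", newline="", encoding="utf-8")` and pass it as `out`.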

How can I do that?

I need some kind of grabbing tool. Can I do it with that kind of tool?
And how do I solve the issue of storing the data in the local MySQL
database?
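
For the storing issue, here is a rough sketch that uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server. The table layout, the column names, and the helper `store_posts` are assumptions, not phpBB's real schema. With a MySQL driver the same pattern applies, but the placeholder is typically `%s` instead of `?`.

```python
import sqlite3


def store_posts(conn, posts):
    """Create a simple posts table if needed and insert the given posts."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               id INTEGER PRIMARY KEY,
               forum TEXT, topic TEXT, username TEXT, text TEXT)"""
    )
    # Parameterized queries let the driver handle quoting of the post text.
    conn.executemany(
        "INSERT INTO posts (forum, topic, username, text) VALUES (?, ?, ?, ?)",
        [(p["forum"], p["topic"], p["username"], p["text"]) for p in posts],
    )
    conn.commit()


# In-memory database and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
store_posts(conn, [
    {"forum": "General", "topic": "Hello", "username": "alice", "text": "Hi all"},
    {"forum": "General", "topic": "Hello", "username": "O'Brien", "text": "It's me"},
])
count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```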

Well, you see, this is tricky work, and I am pretty sure that I will
get help here. For any and all help I am very thankful. Many thanks in
advance.
I also need a regex to parse the pages.
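
A sketch of what such a regex could look like. The HTML in `SAMPLE_HTML` is made up for illustration; a real phpBB template differs by version and theme, so the class names and the pattern would have to be adapted to the actual page source. The names `POST_RE`, `TAG_RE`, and `extract_posts` are likewise hypothetical.

```python
import html
import re

# Hypothetical markup, NOT taken from a real phpBB template.
SAMPLE_HTML = """
<div class="post"><span class="author">alice</span>
<div class="body">First &amp; foremost, <b>hello</b>.</div></div>
<div class="post"><span class="author">bob</span>
<div class="body">A reply
over two lines.</div></div>
"""

# Non-greedy groups; DOTALL lets the post text span several lines.
POST_RE = re.compile(
    r'<span class="author">(?P<username>.*?)</span>\s*'
    r'<div class="body">(?P<text>.*?)</div>',
    re.DOTALL,
)
TAG_RE = re.compile(r"<[^>]+>")


def extract_posts(page):
    """Return a list of {'username', 'text'} dicts found in the page."""
    posts = []
    for match in POST_RE.finditer(page):
        # Drop inline tags, then turn entities such as &amp; back into text.
        text = html.unescape(TAG_RE.sub("", match.group("text"))).strip()
        posts.append({"username": match.group("username").strip(), "text": text})
    return posts


posts = extract_posts(SAMPLE_HTML)
```

A pattern like this stops at the first closing `</div>`, so a post body that itself contains nested `<div>` elements would be cut short. An HTML parser is usually more robust than a regex for that case.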

cheers
*** Sent via Developersdex http://www.developersdex.com ***
Aug 27 '06 #1