Hello everyone,
I'm having great difficulty using regular expressions for a specific
problem. I would really appreciate any help, and I hope somebody will
read through my "long" post...
1.
<?php
// Find all blocks containing the postal code, with a minimum of 50
// characters and a maximum of 250 characters before and after.
// This should give me all blocks containing postal code and city.
$arrParsedBlocks = getDataUsingRegexp("'(.{50,250})".preg_quote($arrDaten['Plz'])."\s+".preg_quote($arrDaten['Ort'])."(.{50,250})'is", $content);
function getDataUsingRegexp($strRegexp, $string)
{
global $arrDaten;
preg_match_all($strRegexp, $string, $matches);
$arrListe = array();
for ($i=0; $i< count($matches[0]); $i++)
{
// Rebuild the block: text before + postal code + city + text after.
$strData = trim($matches[1][$i].$arrDaten['Plz']." ".$arrDaten['Ort'].$matches[2][$i]);
$arrListe[] = $strData;
}
return $arrListe;
}
?>
Question:
---------
* How can I extract 3 lines before and after postal code + city?
(instead of a specific number of characters)
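One possible way to approach this, as a minimal sketch: split the content into lines first (treating <br> and real newlines alike), then take a slice of lines around each line that contains postal code + city. The function name getLinesAround and its parameters are made up for illustration, and the sketch assumes that postal code and city sit on the same line.

```php
<?php
// Hypothetical helper: returns up to $numLines lines of context before and
// after each line that contains "<postal code> <city>".
function getLinesAround($content, $plz, $ort, $numLines = 3)
{
    // Treat <br>, <br/> and real newlines as line separators; drop empty lines.
    $lines = preg_split('#\s*(?:<br\s*/?>|\R)\s*#i', $content, -1, PREG_SPLIT_NO_EMPTY);
    $needle = '#' . preg_quote($plz, '#') . '\s+' . preg_quote($ort, '#') . '#i';

    $blocks = array();
    foreach ($lines as $i => $line) {
        if (preg_match($needle, $line)) {
            $start = max(0, $i - $numLines);
            // The slice covers the lines before, the matching line, and the lines after.
            $blocks[] = array_slice($lines, $start, $i - $start + $numLines + 1);
        }
    }
    return $blocks;
}
```

Each returned block is an array of lines, so the line holding postal code + city can be located again inside it and its neighbours addressed by index.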
2.
<?php
$string = "Kontakt
<br>
Bill Jones
Dr. Bill
Jones<br>
Internet & Webdesign<br>
Examplestreet 9<br>
87354 Munich<br>
Germany<br>
Tel. (0 8 9) 1234 <br>
Handy (0173) 111 <br>
Internet: http://www.foo.com<br>
E-Mail: in**@foo.com";
echo $string;
$output_array = getDataUsingRegexp('#Tel(.*?)<br>#m',$string);
var_dump($output_array);
$output_array = getDataUsingRegexp('#Handy(.*?)<br>#m',$string);
var_dump($output_array);
?>
Questions:
------------
* I want to extract the following data from a string into an associative
array (see the example above), e.g.
Array( [Name] => "Bill Jones Dr. Bill Jones" [Company Name] =>
"Internet & Webdesign" [Street] => "Examplestreet 9" [City] =>
"87354 Munich" [Country] => "Germany" [Tel] => "(0 8 9) 1234
<br>")
* As a basis I can use a postal code and the city name, with which I
extracted the blocks containing these in step one.
Lines with a telephone number can be identified by words such as
telefon, tel., fon or telephone.
Lines with a fax number can be identified by words such as fax or
telefax.
Lines with a mobile phone number can be identified by words such as
handy or mobile.
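A minimal sketch of how such keyword rules could be written, assuming the block has already been split into single lines and the keyword stands at the start of the line. The function name classifyContactLine and the labels it returns are hypothetical.

```php
<?php
// Hypothetical helper: classifies one line as Tel, Fax or Handy by its
// leading keyword. Returns array(label, number), or null if nothing matches.
function classifyContactLine($line)
{
    // Fax is checked first so that "Telefax" is not mistaken for "Tel".
    $patterns = array(
        'Fax'   => '#^\s*(?:tele)?fax\b[.:\s]*(.+)$#i',
        'Handy' => '#^\s*(?:handy|mobile)\b[.:\s]*(.+)$#i',
        'Tel'   => '#^\s*(?:telephone|telefon|tel|fon)\b[.:\s]*(.+)$#i',
    );
    foreach ($patterns as $label => $pattern) {
        if (preg_match($pattern, $line, $m)) {
            // $m[1] is everything after the keyword and its punctuation.
            return array($label, trim($m[1]));
        }
    }
    return null;
}
```

Because each line is tested on its own, the order in which the Tel, Fax and Handy lines appear does not matter.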
The patterns in my example above are very specific, designed for
special cases, and not general at all.
The line above the line holding postal code and city is assumed to
hold the street data.
The 2 lines above the line holding the street data are assumed to hold
the company name.
Lines between postal code+city and tel. are assumed to hold the
country name, although this is optional. Sometimes there may not be
any country information available at all.
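These positional assumptions could be sketched roughly as follows, given the lines of one block and the index of the line that holds postal code + city. The function name parseByPosition is hypothetical, and the sketch assumes that a Tel/Fax/Handy line follows the address; without one, every remaining line would end up in Country.

```php
<?php
// Hypothetical helper: assigns the lines around the postal code + city line
// to fields purely by their position.
function parseByPosition(array $lines, $cityIndex)
{
    $result = array('City' => $lines[$cityIndex]);

    // The line directly above postal code + city is taken as the street.
    if ($cityIndex >= 1) {
        $result['Street'] = $lines[$cityIndex - 1];
    }

    // Up to two lines above the street are taken as the company name.
    $companyStart = max(0, $cityIndex - 3);
    $company = array_slice($lines, $companyStart, max(0, $cityIndex - 1 - $companyStart));
    if ($company) {
        $result['Company Name'] = implode(' ', $company);
    }

    // Lines after the city, up to the first phone-like line, are taken as
    // the country. There may be none.
    $country = array();
    for ($i = $cityIndex + 1; $i < count($lines); $i++) {
        if (preg_match('#^\s*(?:tel|fon|fax|handy|mobile)#i', $lines[$i])) {
            break;
        }
        $country[] = $lines[$i];
    }
    if ($country) {
        $result['Country'] = implode(' ', $country);
    }
    return $result;
}
```

Keys are only set when a matching line exists, so a missing country simply leaves Country out of the array.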
I treat not only a newline (\n or <br>) as a line separator, but also
characters such as , or - or : or ; or |
since an address can also be written on a single line, like
Bill Jones | Internet & Webdesign | Examplestreet 9 | 87354 Munich |
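A minimal sketch of splitting on these separators with preg_split. The function name splitIntoFields is hypothetical. One deliberate deviation, which is an assumption and not part of the rules above: a hyphen or colon only counts as a separator when it has whitespace on both sides, because otherwise values such as "E-Mail" or "http://www.foo.com" would be cut apart.

```php
<?php
// Hypothetical helper: splits an address block into trimmed, non-empty fields.
function splitIntoFields($block)
{
    // <br> variants, real newlines and the characters , ; | always separate.
    // A hyphen or colon separates only when surrounded by whitespace
    // (an assumption made here to protect e-mail labels and URLs).
    $separator = '#\s*(?:<br\s*/?>|\R|[,;|])\s*|\s+[-:]\s+#i';
    $fields = preg_split($separator, $block, -1, PREG_SPLIT_NO_EMPTY);

    // Trim each field and drop anything that is empty after trimming.
    return array_values(array_filter(array_map('trim', $fields), 'strlen'));
}
```

The resulting list can then be handled the same way whether the address was written on one line or on several.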
1. Company Name
2. Company Name
3. Street Name
3. Postal Code + City name
4. Country Name (optional)
5. Tel.
6. Fax.
7. Handy
5. to 7. can of course differ in order
=> Somehow it all sounds simple, but writing a regular expression
pattern for it is another story... :(
Is there a regex expert out there who could help? I would also
appreciate detailed explanations, since I'm here to learn!
Thanks a lot!
Rania