David wrote:
Hi,
Could PHP be used to take a txt file (or set of txt files) and add a
string of characters every X number of words or characters?
$text = file_get_contents('/path/to/text.txt');
$text = chunk_split($text,5000,$string_to_add);
Say a txt file with 50,000 characters/5,000 words, how would you go
about adding a string of characters every 5,000 characters or 500
words?
For characters it's easy, see above.
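A small-scale sketch of the chunk_split() call, with made-up values (a 12-character string, chunks of 5, '|' as the added string) so the result is easy to read:

```php
<?php
// Illustrative values only, not the 5000-character case from the question.
$text = 'abcdefghijkl';
$string_to_add = '|';

// chunk_split() appends the third argument after every chunk,
// including the last (possibly shorter) one.
$result = chunk_split($text, 5, $string_to_add);

echo $result, "\n"; // abcde|fghij|kl|
```

Note the trailing '|' after the last chunk; if that is unwanted it has to be trimmed off afterwards.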
For words, it's a little bit harder. One could fiddle around with
str_word_count(), but I don't think that is the best solution.
If it does not have to be exact:
preg_match_all('/(?:(?:^|\W*)\w*){0,500}/s',$text,$matches);
$text = implode($string_to_add,$matches[0]);
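The same word-counting idea can also be sketched with preg_replace_callback(). This is only one possible variant, not the snippet above: the helper name insert_every_n_words() is hypothetical, a "word" is taken to be a run of \w characters, and the possessive \w++ is there so a leftover word at the end cannot be split in two to satisfy the count.

```php
<?php
// Hypothetical helper: append $insert after every $n words of $text.
function insert_every_n_words(string $text, string $insert, int $n): string
{
    // \W* = separators before a word, \w++ = one whole word (possessive,
    // so the engine cannot give characters back and count "fiv" + "e" as two).
    $pattern = '/(?:\W*\w++){' . $n . '}/';

    // A callback is used so that '$' or '\' inside $insert is taken literally.
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($insert) {
            return $m[0] . $insert;
        },
        $text
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo insert_every_n_words('one two three four five', '|', 2), "\n";
// one two| three four| five
```

A trailing group with fewer than $n words is left alone, so nothing is appended at the very end.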
To improve on this, if using characters as the guide, I'd want to use
a space or, better yet, a line break as the point to add the string of
characters. So 5,000 characters to the nearest line break.
********** TRY 1 *****************************************
/* settings */
$string_to_add = 'Hey, this is added!!!!!!!';
$char_to_split = "\n";
$charcount_to_split = 200;
/* match char_to_split */
$char_to_split = preg_quote($char_to_split,'/');
preg_match_all('/'.$char_to_split.'/',$text,$matches,PREG_OFFSET_CAPTURE);
/* for each line break, record the nearest multiple of the count ('occ')
   and the distance to it ('diff') */
$available_line_breaks = $matches[0];
function diffs(&$value,$key,$number){
    $occ = (int) round($value[1]/$number,0);
    $value['occ'] = $occ;
    $value['diff'] = abs($value[1] - ($occ * $number));
}
array_walk($available_line_breaks,'diffs',$charcount_to_split);
/* determine which line-break is closest */
$closest = array();
foreach($available_line_breaks as $value){
    if(!isset($closest[$value['occ']]) || $closest[$value['occ']]['diff'] > $value['diff']){
        $closest[$value['occ']] = array('diff' => $value['diff'],'offset' => $value[1]);
    }
}
/* keep only the offsets; array_map() preserves the 'occ' keys */
$closest = array_map(function($a){ return $a['offset']; },$closest);
/* if no line-break lies near a given multiple of the count, that multiple
   gets no entry in $closest. To illustrate: */
$not_set = array_diff(range(1,(int) floor(strlen($text)/$charcount_to_split)),array_keys($closest));
echo "For the following repeats of $charcount_to_split, no linebreaks were found: ".implode(',',$not_set);
/* you could search for a non-word character (\W) in that region, I've left
   that out */
/* Let's add the string, from last to first, otherwise our offsets are off... */
krsort($closest);
foreach($closest as $target){
    $text = substr_replace($text,$string_to_add,$target+strlen($char_to_split),0);
}
*******************************************************
But of course, this is bullsh*t.
********** TRY 2 *****************************************
$text = text to adapt.
$insert = string to add.
$count = preferred number of characters.
$split = string to split on.
$variance = the number of characters to search left and right.
function replace_text_several_times($text,$insert,$count,$split,$variance = 50){
    $split = preg_quote($split,'/');
    /* between $count-$variance and $count+$variance characters, then $split */
    $regex = '/(.{'.($count-$variance).','.($count+$variance).'})('.$split.')/si';
    return preg_replace($regex,'$1$2'.$insert,$text);
}
The code above will not land on the exact number of characters, but it will
nevertheless repeat the string throughout the text, provided your $split
occurs.
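A quick usage sketch, with the function repeated so the snippet runs on its own. The sample text (six 4-letter lines), the count of 10, the variance of 2 and the "[X]" marker are made-up values chosen to keep the output readable:

```php
<?php
function replace_text_several_times($text, $insert, $count, $split, $variance = 50)
{
    $split = preg_quote($split, '/');
    // between $count-$variance and $count+$variance characters, then $split
    $regex = '/(.{' . ($count - $variance) . ',' . ($count + $variance) . '})(' . $split . ')/si';
    return preg_replace($regex, '$1$2' . $insert, $text);
}

// Six lines of 5 characters each (4 letters plus the line break).
$text = "aaaa\nbbbb\ncccc\ndddd\neeee\nffff\n";

// Aim for every 10 characters, give or take 2, splitting on line breaks.
$result = replace_text_several_times($text, "[X]\n", 10, "\n", 2);

echo $result;
// aaaa
// bbbb
// [X]
// cccc
// dddd
// [X]
// eeee
// ffff
// [X]
```

Because $insert ends up inside the replacement string, a '$' or '\' in it would be read as a backreference, so plain text is the safe choice there.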
--
Rik Wasmus