HTML Diff problem

Hi,

I am working on an HTML WYSIWYG wiki and need to display a diff page like
Wikipedia does when two people edit a page at the same time, so that the second
user sees the differences: additions in red and deletions in red
strikethrough.

There seem to be several HTML diff implementations in Perl and Python, but the
many diff programs written in PHP all seem to be line based and work on plain text.

So I am after either existing PHP code to do an HTML diff or some help
forming an algorithm to do this.

Many thanks in advance,

Aaron
Oct 22 '08 #1
Obvious solution would be to use 'diff' from popen / exec etc

Another approach would be to hack the code from an existing
implementation (like DokuWiki).

But the very first hit for 'php diff' in Google provides what you
describe.
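
As a rough illustration of the first suggestion, shelling out to the system diff might look like the sketch below. It is written in Python purely for illustration (PHP's exec or popen would play the same role), it assumes a POSIX-style `diff` binary on the PATH, and `external_diff` is a made-up helper name. The result is still an ordinary line-based diff.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def external_diff(old_text: str, new_text: str) -> str:
    """Return the output of the system `diff` for two strings.

    Assumption: a POSIX-style `diff` executable is available on PATH.
    """
    if shutil.which("diff") is None:
        raise RuntimeError("no `diff` executable found on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        old_path = Path(tmp) / "old.html"
        new_path = Path(tmp) / "new.html"
        old_path.write_text(old_text, encoding="utf-8")
        new_path.write_text(new_text, encoding="utf-8")
        result = subprocess.run(
            ["diff", str(old_path), str(new_path)],
            capture_output=True,
            text=True,
        )
    # diff exits with 0 (identical), 1 (different) or 2 (trouble).
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```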

C.
Oct 23 '08 #2
"C. (http://symcbean.blogspot.com/)" <co************@gmail.com> wrote in
message
news:60**********************************@d70g2000hsc.googlegroups.com...
> Obvious solution would be to use 'diff' from popen / exec etc

This is not HTML based.

> Another approach would be to hack the code from an existing
> implementation (like DokuWiki).

This is not HTML based.

> But the very first hit for 'php diff' in Google provides what you
> describe.

This is a straight diff not an HTML diff. The problem is comparing two HTML
files, not wiki text files.

Thanks anyway,

Aaron
Oct 23 '08 #3
There is no such thing as an HTML diff, or a wiki diff, or a
thingywosit diff. They all use the same algorithm. They scan a
string and look for differences on a line-by-line basis. You can feed
HTML data into them just as easily as any other kind of data.

For example, using http://www.holomind.de/phpnet/diff2.src.php:

Input 1:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script type="text/javascript" src="/js/jquery/jquery.js" />
</head>
<body>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec sed purus ac
elit tempor lacinia. Fusce viverra! Duis elementum nisl nec metus. Sed quis arcu.
Morbi tortor! Maecenas ac nisl sit amet felis euismod elementum. Sed non lacus
sed mi vehicula congue. Vestibulum non felis egestas leo ultricies ultricies.
Donec ornare scelerisque leo! Vivamus sed nunc quis augue consectetuer eleifend.
Praesent cursus. Phasellus cursus, nunc eu placerat dictum, metus leo volutpat
arcu, ac convallis tellus nunc in elit. Maecenas dignissim tincidunt pede. Vivamus
eget metus. Integer luctus est interdum nibh. Curabitur condimentum faucibus enim.
Nunc tincidunt ipsum et odio. </p>
<p>Donec sed nunc. Fusce accumsan, felis dignissim faucibus cursus, diam neque
pharetra libero, at rhoncus enim ipsum nec nulla. Donec sed metus. Suspendisse
potenti. Nam suscipit vehicula risus. Integer vel arcu. Ut nec enim pulvinar magna
tristique laoreet. Quisque viverra tellus a sapien. Praesent fringilla. Duis mi
risus, tempus ut, venenatis quis, malesuada eget; velit. Cras nisi. Nam et ligula.
Duis feugiat lorem at urna. </p>
<p>Suspendisse potenti. Integer interdum, dolor sed ullamcorper dictum; est augue
aliquet eros, at molestie augue est et nunc. Pellentesque habitant morbi tristique
senectus et netus et malesuada fames ac turpis egestas. Aenean ac metus? Duis
vel nibh vitae odio hendrerit vulputate. Ut tortor. Ut fermentum pellentesque
nulla. Sed eros dui, volutpat nec, ultrices feugiat, semper sit amet, nunc. Pellentesque
habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.
Morbi hendrerit nibh. Pellentesque lacinia urna. Sed nisi ligula, interdum quis,
volutpat ac, placerat commodo, metus. Curabitur venenatis venenatis quam. Vivamus
cursus, dolor vitae rutrum tempus, erat ipsum volutpat pede, sed cursus nunc mauris
in ligula. In eget justo id justo pulvinar malesuada. Etiam nisi est, auctor vitae,
blandit vel, aliquet quis, metus. Cras dui magna, commodo posuere, dictum convallis,
laoreet vitae, lorem. Praesent luctus, ante at dictum vestibulum, mi urna pulvinar
velit, ut ultricies erat purus luctus sapien. Praesent molestie turpis. Pellentesque
bibendum rutrum est. </p>
<p>Mauris adipiscing ante id neque. Nulla sit amet massa. Nam consectetuer lorem
sed augue. Nullam in dui! Integer rutrum venenatis nisl. Pellentesque habitant
morbi tristique senectus et netus et malesuada fames ac turpis egestas. Suspendisse
id sem eget est rhoncus lobortis. Cras consequat ligula a mi. In hac habitasse
platea dictumst. Cras vel sem. Quisque adipiscing. Suspendisse convallis justo
ac nulla. Vivamus tincidunt. In rutrum consequat lorem. </p>
</body>
</html>

Input 2:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Lorem ipsum</title>
</head>
<body>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec sed purus ac
elit tempor lacinia. Fusce viverra! Duis elementum nisl nec metus. Sed quis arcu.
Morbi tortor! Maecenas ac nisl sit amet felis euismod elementum. Sed non lacus
sed mi vehicula congue. Vestibulum non felis egestas leo ultricies ultricies.
Donec ornare scelerisque leo. Vivamus sed nunc quis augue consectetuer eleifend.
Praesent cursus. Phasellus cursus, nunc eu placerat dictum, metus leo volutpat
arcu, ac convallis tellus nunc in elit. Maecenas dignissim tincidunt pede. Vivamus
eget metus. Integer luctus est interdum nibh. Curabitur condimentum faucibus enim.
Nunc tincidunt ipsum et odio. </p>
<p>Suspendisse potenti. Integer interdum, dolor sed ullamcorper dictum; est augue
aliquet eros, at molestie augue est et nunc. Pellentesque habitant morbi tristique
senectus et netus et malesuada fames ac turpis egestas. Aenean ac metus? Duis
vel nibh vitae odio hendrerit vulputate. Ut tortor. Ut fermentum pellentesque
nulla. Sed eros dui, volutpat nec, ultrices feugiat, semper sit amet, nunc. Pellentesque
habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.
Morbi hendrerit nibh. Pellentesque lacinia urna. Sed nisi ligula, interdum quis,
volutpat ac, placerat commodo, metus. Curabitur venenatis venenatis quam. Vivamus
cursus, dolor vitae rutrum tempus, erat ipsum volutpat pede, sed cursus nunc mauris
in ligula. In eget justo id justo pulvinar malesuada. Etiam nisi est, auctor vitae,
blandit vel, aliquet quis, metus. Cras dui magna, commodo posuere, dictum convallis,
laoreet vitae, lorem. Praesent luctus, ante at dictum vestibulum, mi urna pulvinar
velit, ut ultricies erat purus luctus sapien. Praesent molestie turpis. Pellentesque
bibendum rutrum est. </p>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vestibulum pede ipsum,
accumsan quis, porttitor ut, blandit sed, neque. Phasellus feugiat commodo erat.
Praesent pulvinar augue sollicitudin nibh. Nunc tellus. Etiam ac nunc. Praesent
eget lacus eu dolor fringilla vehicula. Nunc ac nulla nec dui imperdiet cursus?
Sed dictum nunc nec erat. Donec laoreet magna nec est. Ut vel magna vel est scelerisque
varius! In nulla leo, luctus sed, rhoncus vel, semper in, eros? Vestibulum quam
dui, ultrices vitae, consectetuer in, aliquam nec, mauris. Morbi congue pulvinar
quam! Vivamus tortor. Ut a leo et tortor accumsan pretium. Vivamus mi. Curabitur
nulla lacus, commodo ut, iaculis vel, mollis eget, felis! Pellentesque habitant
morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum
ut diam. Maecenas facilisis semper mi. </p>
<p>Mauris adipiscing ante id neque. Nulla sit amet massa. Nam consectetuer lorem
sed augue. Nullam in dui! Integer rutrum venenatis nisl. Pellentesque habitant
morbi tristique senectus et netus et malesuada fames ac turpis egestas. Suspendisse
id sem eget est rhoncus lobortis. Cras consequat ligula a mi. In hac habitasse
platea dictumst. Cras vel sem. Quisque adipiscing. Suspendisse convallis justo
ac nulla. Vivamus tincidunt. In rutrum consequat lorem. </p>
<p>Ut blandit rhoncus tellus. Integer condimentum, turpis sit amet tempor viverra;
tellus neque mollis dui, sit amet lacinia neque felis a enim. Nulla ac velit sit
amet erat elementum dignissim. Cras suscipit, felis nec aliquet auctor, augue
nisi ultrices purus, non convallis tortor odio ac nisi. Nam non mauris eget nunc
congue euismod. Nam eget sapien at magna tempor aliquam. Nullam bibendum tempor
velit. Phasellus in elit at mi vestibulum convallis. Integer sit amet nulla. Pellentesque
habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.
Mauris nec magna id nibh sagittis posuere. Nulla a metus vitae tortor tempus lobortis.
Praesent augue dolor, pulvinar vitae, accumsan vel, bibendum faucibus, eros? Sed
a ipsum id dui varius blandit! In aliquet suscipit pede. </p>
</body>
</html>

Result:

1c1
< <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
---
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
5,6c5
< <title>Untitled Document</title>
< <script type="text/javascript" src="/js/jquery/jquery.js" />
---
> <title>Lorem ipsum</title>
13c12
< Donec ornare scelerisque leo! Vivamus sed nunc quis augue consectetuer eleifend.
---
> Donec ornare scelerisque leo. Vivamus sed nunc quis augue consectetuer eleifend.
18,23d16
< <p>Donec sed nunc. Fusce accumsan, felis dignissim faucibus cursus, diam neque
< pharetra libero, at rhoncus enim ipsum nec nulla. Donec sed metus. Suspendisse
< potenti. Nam suscipit vehicula risus. Integer vel arcu. Ut nec enim pulvinar magna
< tristique laoreet. Quisque viverra tellus a sapien. Praesent fringilla. Duis mi
< risus, tempus ut, venenatis quis, malesuada eget; velit. Cras nisi. Nam et ligula.
< Duis feugiat lorem at urna. </p>
37a31,41
> <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vestibulum pede ipsum,
> accumsan quis, porttitor ut, blandit sed, neque. Phasellus feugiat commodo erat.
> Praesent pulvinar augue sollicitudin nibh. Nunc tellus. Etiam ac nunc. Praesent
> eget lacus eu dolor fringilla vehicula. Nunc ac nulla nec dui imperdiet cursus?
> Sed dictum nunc nec erat. Donec laoreet magna nec est. Ut vel magna vel est scelerisque
> varius! In nulla leo, luctus sed, rhoncus vel, semper in, eros? Vestibulum quam
> dui, ultrices vitae, consectetuer in, aliquam nec, mauris. Morbi congue pulvinar
> quam! Vivamus tortor. Ut a leo et tortor accumsan pretium. Vivamus mi. Curabitur
> nulla lacus, commodo ut, iaculis vel, mollis eget, felis! Pellentesque habitant
> morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum
> ut diam. Maecenas facilisis semper mi. </p>
43a48,57
> <p>Ut blandit rhoncus tellus. Integer condimentum, turpis sit amet tempor viverra;
> tellus neque mollis dui, sit amet lacinia neque felis a enim. Nulla ac velit sit
> amet erat elementum dignissim. Cras suscipit, felis nec aliquet auctor, augue
> nisi ultrices purus, non convallis tortor odio ac nisi. Nam non mauris eget nunc
> congue euismod. Nam eget sapien at magna tempor aliquam. Nullam bibendum tempor
> velit. Phasellus in elit at mi vestibulum convallis. Integer sit amet nulla. Pellentesque
> habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.
> Mauris nec magna id nibh sagittis posuere. Nulla a metus vitae tortor tempus lobortis.
> Praesent augue dolor, pulvinar vitae, accumsan vel, bibendum faucibus, eros? Sed
> a ipsum id dui varius blandit! In aliquet suscipit pede. </p>
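
A line-based result of this shape can be approximated in a few lines. The sketch below uses Python's difflib rather than the PHP script linked above; `normal_diff` and `_span` are hypothetical helper names, and because difflib's matching is not the same algorithm as GNU diff, the hunk boundaries may differ on some inputs.

```python
from difflib import SequenceMatcher


def _span(start: int, end: int) -> str:
    """Format a half-open 0-based range as a 1-based diff range ("5" or "5,6")."""
    if end - start <= 1:
        # An empty range names the line *before* the change, as diff does.
        return str(start + 1 if end > start else start)
    return f"{start + 1},{end}"


def normal_diff(old_lines: list[str], new_lines: list[str]) -> list[str]:
    """Return the lines of a 'normal format' diff between two lists of lines."""
    out = []
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        code = {"replace": "c", "delete": "d", "insert": "a"}[tag]
        out.append(f"{_span(i1, i2)}{code}{_span(j1, j2)}")
        out.extend(f"< {line}" for line in old_lines[i1:i2])
        if tag == "replace":
            out.append("---")
        out.extend(f"> {line}" for line in new_lines[j1:j2])
    return out
```

Calling `normal_diff(text1.splitlines(), text2.splitlines())` on two HTML strings gives hunks headed by the same kind of `5,6c5` markers, which shows that the comparison only ever sees whole lines.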
Oct 24 '08 #4
"Gordon" <go**********@ntlworld.com> wrote in message
news:c4**********************************@k30g2000hse.googlegroups.com...
> There is no such thing as a HTML diff, or a wiki diff, or a
> thingywosit diff. They all use the same algorithm. They scan a
> string and look for differences on a line by line basis. You can feed
> HTML data into them just as easily as any other kind of data.

Well, I think there needs to be one.

You cannot presume HTML is always line based, for starters.

Plain diff does not come up with the goods!

They do exist: Perl has one, Python has one.

I need to code one in PHP!
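
One way to get away from lines is to diff a stream of tokens (tags and words) and mark only changed text. The following is a minimal sketch of that idea in Python, not PHP and not any of the existing Perl or Python tools; `html_word_diff` and `_mark` are made-up names, and it assumes reasonably well-formed markup in which a literal "<" in text is escaped. Tags are never wrapped: the output keeps the tags of the new document and adds `<ins>`/`<del>` around changed words.

```python
import re
from difflib import SequenceMatcher

# A token is a whole tag, a run of whitespace, or a run of other characters.
TOKEN = re.compile(r"<[^>]+>|\s+|[^<\s]+")


def _mark(wrapper: str, tokens: list[str], is_new: bool) -> str:
    """Wrap runs of text in <wrapper>; keep tags only if they are from the new side."""
    out, run = [], []

    def flush() -> None:
        if "".join(run).strip():
            out.append(f"<{wrapper}>{''.join(run)}</{wrapper}>")
        elif is_new:
            out.extend(run)  # whitespace-only change: keep it, unmarked
        run.clear()

    for tok in tokens:
        if tok.startswith("<"):
            flush()  # never let an <ins>/<del> span cross a tag
            if is_new:
                out.append(tok)
        else:
            run.append(tok)
    flush()
    return "".join(out)


def html_word_diff(old_html: str, new_html: str) -> str:
    """Return new_html with inserted text in <ins> and deleted text in <del>."""
    old_tokens = TOKEN.findall(old_html)
    new_tokens = TOKEN.findall(new_html)
    matcher = SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_tokens[i1:i2])
            continue
        if tag in ("delete", "replace"):
            out.append(_mark("del", old_tokens[i1:i2], is_new=False))
        if tag in ("insert", "replace"):
            out.append(_mark("ins", new_tokens[j1:j2], is_new=True))
    return "".join(out)
```

The `<ins>` and `<del>` elements can then be styled with CSS (red, and red with a line through) to give the display described in the first post. Changes that consist only of tags or attributes are applied silently in this sketch.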

Regards,

Aaron
Oct 24 '08 #5
Doing diffs via DOM trees looks attractive. Maybe it could be done in XSLT?
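
As a starting point for the tree approach, a document can be flattened into (path, text) records that are then easy to compare. This is only a sketch in Python using the standard xml.etree.ElementTree parser, not XSLT; it assumes the page is well-formed XHTML without undeclared entities such as `&nbsp;`, and `walk` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def walk(element, path=""):
    """Yield (path, text) records for a tree, depth first, in document order."""
    tag = element.tag.split("}")[-1]  # drop any "{namespace}" prefix
    here = f"{path}/{tag}"
    # Text that appears before the first child (whitespace normalised).
    yield here, " ".join((element.text or "").split())
    for child in element:
        yield from walk(child, here)
        # Text that follows a child still belongs to this element.
        if child.tail and child.tail.strip():
            yield here, " ".join(child.tail.split())
```

Running `list(walk(ET.fromstring(xhtml)))` on both versions gives two flat sequences that no longer depend on where the line breaks fall in the source.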

Aaron
Oct 24 '08 #6
I have been looking into using XHTML/XML and the DOM. This gives the
tree-walking mechanism, but the actual diff still looks difficult :)
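
One possible shape for the diff step is to compare the child lists of two nodes as sequences and recurse where a child has only been edited. The sketch below is in Python with the standard library, under the assumption that both inputs are well-formed XHTML; `tree_diff` is a made-up name, the pairing of edited children is deliberately crude, and text that follows a child element (its tail) is not reported separately.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def tree_diff(old, new, path=""):
    """Return (operation, path, detail) tuples describing how old became new."""
    here = f"{path}/{new.tag}"
    changes = []
    old_text, new_text = (old.text or "").strip(), (new.text or "").strip()
    if old_text != new_text:
        changes.append(("text", here, (old_text, new_text)))
    # Compare children by their serialised form, so that an unchanged
    # subtree counts as a single matching item.
    old_kids, new_kids = list(old), list(new)
    old_keys = [ET.tostring(k) for k in old_kids]
    new_keys = [ET.tostring(k) for k in new_kids]
    matcher = SequenceMatcher(None, old_keys, new_keys, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        olds, news = old_kids[i1:i2], new_kids[j1:j2]
        # Leading children with the same tag are treated as edited in place.
        while olds and news and olds[0].tag == news[0].tag:
            changes.extend(tree_diff(olds.pop(0), news.pop(0), here))
        for kid in olds:
            changes.append(("delete", f"{here}/{kid.tag}", ET.tostring(kid, encoding="unicode")))
        for kid in news:
            changes.append(("insert", f"{here}/{kid.tag}", ET.tostring(kid, encoding="unicode")))
    return changes
```

The resulting list of text, delete and insert operations could then drive the red and strikethrough markup when the new page is rendered.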

Aaron
Oct 24 '08 #7
