a commandline tool to drop css and javascript?

Hello. I am looking for a command-line tool that takes an HTML document (or
an HTML fragment, i.e. one without the leading
"<html><head>..</head><body>"), removes all CSS style
settings and JavaScript, and outputs clean HTML/XHTML.

Optionally, it would be nice if the tool could take a list of
acceptable tags and remove all tags not in that list.

I need such a tool to process a lot of static HTML documents I am working
on. Do you happen to know of one? I am still googling around ;) I
tried tidy, but it does not seem to have an option to remove CSS.

Thanks a lot!
Jan 28 '07 #1
Gazing into my crystal ball I observed Zhang Weiwu
<zh********@realss.com> writing in
news:41lt84-nrt2.ln1@exupery.realss.com:

> Hello. I am looking for a command-line tool that takes an HTML document
> (or an HTML fragment, i.e. one without the leading
> "<html><head>..</head><body>"), removes all CSS style settings and
> JavaScript, and outputs clean HTML/XHTML.
>
> Optionally, it would be nice if the tool could take a list of
> acceptable tags and remove all tags not in that list.
>
> I need such a tool to process a lot of static HTML documents I am
> working on. Do you happen to know of one? I am still googling around ;)
> I tried tidy, but it does not seem to have an option to remove CSS.
>
> Thanks a lot!

Can you use search and replace? How about searching for style=" and
removing those attributes? It seems to me that search and replace is what
you want. Google for a good search-and-replace tool, or I am sure someone
will be around shortly to suggest another way.
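
A minimal sketch of the search-and-replace idea, written in Python with the standard `re` module. The function name `strip_inline_css` and both patterns are made up for illustration, and regular expressions are fragile on real-world HTML (they will also match text that merely looks like markup), so treat this as a quick first pass rather than a robust cleaner.

```python
import re

# Matches a style attribute with a double-quoted, single-quoted,
# or unquoted value, including the whitespace in front of it.
STYLE_ATTR = re.compile(
    r"""\s+style\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

# Matches a whole <style>...</style> or <script>...</script> element.
BLOCKS = re.compile(
    r"<(style|script)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_inline_css(html):
    """Remove style/script elements first, then style attributes."""
    html = BLOCKS.sub("", html)
    return STYLE_ATTR.sub("", html)
```

Event handler attributes such as onclick are not touched by this sketch; they would need a similar pattern of their own.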
--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share

Jan 31 '07 #2
On Jan 28, 6:22 am, Zhang Weiwu <zhangwe...@realss.com> wrote:
> Hello. I am looking for a command-line tool that takes an HTML document
> (or an HTML fragment, i.e. one without the leading
> "<html><head>..</head><body>"), removes all CSS style settings and
> JavaScript, and outputs clean HTML/XHTML.
>
> Optionally, it would be nice if the tool could take a list of
> acceptable tags and remove all tags not in that list.
>
> I need such a tool to process a lot of static HTML documents I am
> working on. Do you happen to know of one? I am still googling around ;)
> I tried tidy, but it does not seem to have an option to remove CSS.

Unless your source HTML is so tag-soupy no sane HTML parser
can grok it, XSLT is great for this kind of stuff. Of
course, you'll also need an XSLT processor that can
transform HTML documents (libxslt can do that, and probably
many others).

pavel@debian:~/dev/xslt$ cat raw.html
<!DOCTYPE HTML PUBLIC
"-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Test</title>
<style type="text/css">
body { font-family : monospace ; }
</style>
<script type="text/javascript">
function oink ( ) { alert ( 'oink!' ) ; }
</script>
</head>
<body>
<div style=" color : blue ;">
<span style=" font-style : italic ; "
onclick=" oink ( ) ; ">oink!</span>
</div>
</body>
</html>

pavel@debian:~/dev/xslt$ cat strip_jscss.xsl
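<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- Identity template: copy every node and attribute as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Empty template: drop style and script elements, plus the
       style and onclick attributes. -->
  <xsl:template match="style|script|@style|@onclick"/>
</xsl:stylesheet>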

pavel@debian:~/dev/xslt$ xsltproc -html strip_jscss.xsl raw.html
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Test</title>
</head>
<body><div>
<span>oink!</span>
</div></body>
</html>

Naturally, you'll want to tinker with xsl:output to get
valid HTML as output, and you'll need to fine-tune the
exclusion template to handle all the event handler
attributes, etc. xsltproc is a command-line utility that
comes with libxslt, but as I said, I'd expect most XSLT
processors to be capable of transforming HTML as well.
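
For the optional tag whitelist from the original question, and for catching every on* event handler rather than only onclick, here is a minimal sketch that uses a different technique from the XSLT above: Python's standard `html.parser` module. `ALLOWED_TAGS`, `Cleaner` and `clean_html` are names invented for this illustration, and the whitelist contents are an assumption you would adjust. The sketch also silently drops the doctype and comments, and does not look inside attribute values such as href="javascript:...".

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; adjust to the tags you want to keep.
ALLOWED_TAGS = {"html", "head", "title", "body", "div", "span", "p", "a"}
# Elements that are removed together with their content.
DROP_WITH_CONTENT = {"style", "script"}


class Cleaner(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # tag is dropped, but its text content is still kept
        # Drop style and anything whose name starts with "on"
        # (the parser lowercases attribute names for us).
        kept = [(k, v) for k, v in attrs
                if k != "style" and not k.startswith("on")]
        text = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        self.out.append(f"<{tag}{text}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_html(source):
    """Return source with disallowed tags, CSS and scripts removed."""
    parser = Cleaner()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Because it never builds a tree, this approach works on fragments as well as full documents. Tags outside the whitelist are unwrapped (their text survives), while style and script elements disappear with their content.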

--
Pavel Lepin

Feb 1 '07 #3
