Ruby regex for removing C/Java-style /* ... */ comments
Question posted by: beatTheDevil (Newbie) on July 31st, 2007 06:52 PM
Hey guys,
As the title says, I'm trying to write a regular expression (regex/regexp) for removing the comments from code. This particular regex is meant to match /* ... */ comments.
I'm using Ruby 1.8.6.
Here's my regex:
multiline_comments = /\/\*(.*?)\*\//
When I try
myStr.gsub(multiline_comments, "")
I see no effect, even though the string has big fat comments in it. I tried this regex in irb with a couple of test strings, and it works perfectly. This leads me to think I don't understand some subtlety of file I/O, so here's all my code (cop-out, I know). I'm trying to write a very simple JavaScript compactor, but I want to preserve readability, so I'm only getting rid of unnecessary newlines, spaces between tokens, and comments. I DON'T want the whole file on one line like some compactors produce. Anyway, here goes:
# Non-destructive JavaScript Packer
# =================================
#
# Reduces overall script filesize by removing comments
# and unnecessary whitespace. Does not affect variable naming,
# indentation, or line-by-line formatting in order to maximize
# readability.

def pack_line(file_line)

  return '' unless file_line

  #puts "The next line: " + file_line

  #kill one-line (//...) comments
  line_comments = /(\S*)\s*\/\/.*/
  intermed = file_line.gsub(line_comments, '\1')
  intermed += "\n" if intermed[intermed.length - 1] != "\n"

  #puts "\tAfter one-liner removal: " + intermed

  #kill unnecessary whitespace
  extra_whitespace = /([^(var|function|return|\s*)])[ \t]+(.*?)/
  intermed = intermed.gsub(extra_whitespace, '\1\2')

  #puts "\tAfter extra whitespace removal: " + intermed

  intermed
end

#performs the packing operation, returns a single string
#representing the packed document
def pack(file)
  lines = Array.new

  file.each_line do |line|
    lines.push pack_line(line)
  end

  intermed = lines.join

  #puts "\tBefore multi-liner removal: " + intermed

  #kill multi-line (/* ... */) comments
  multiline_comments = /\/\*(.*?)\*\//
  intermed = intermed.gsub(multiline_comments, '')

  #puts "\tAfter multi-liner removal: " + intermed

  #kill extra new lines
  extra_newlines = /(\r?\n){2,}/
  intermed = intermed.gsub(extra_newlines, "\n")

  #puts "\tFinally: " + intermed + "\n"

  intermed
end

#open file for reading and pass it to pack()
def init(in_file, out_file)
  file = File.new(in_file, "r")

  newDoc = pack(file)

  file.close

  if out_file then
    file = File.new(out_file, "w")

    file.puts(newDoc)

    file.close
  else
    puts newDoc
  end
end

#start the script with the command-line arg file name
puts init(ARGV[0], ARGV[1])
Any ideas? Thanks for all your help.
2 Answers Posted
More specifically, even though it works fine on one-line strings, I think I've found that it's unable to match this style of comments across new lines ("\n"). Is there a way to get around this? I thought the '.' matched any character whatsoever...
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/11137
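For anyone landing here later, the behaviour described above can be reproduced with a short sketch. In Ruby, '.' does not match a newline unless the regex carries the /m modifier, so the original pattern only removes comments that fit on one line. The sample string and the variable names below (source, without_m, with_m, and so on) are made up for illustration and are not taken from the script in the question.

```ruby
# Hypothetical sample input with one single-line and one multi-line comment.
source = "var a = 1; /* one-line comment */\nvar b = 2;\n/* a comment\n   spanning lines */\nvar c = 3;\n"

# Pattern from the question: '.' stops at "\n", so only the
# single-line comment can match.
without_m = /\/\*.*?\*\//

# Same pattern with the /m modifier, which lets '.' match newlines too.
with_m = /\/\*.*?\*\//m

stripped_without_m = source.gsub(without_m, "")
stripped_with_m    = source.gsub(with_m, "")

puts "Without /m:"
puts stripped_without_m
puts "With /m:"
puts stripped_with_m
```

In the script from the question, this would probably mean adding the m after the closing slash of multiline_comments inside pack(). Note that a regex like this cannot tell a real comment from a /* ... */ sequence inside a JavaScript string literal, so it would strip both.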