Gustaf wrote:
Hi all,
Just for fun, I'm working on a script to count the number of lines in
source files. Some lines are auto-generated (by the IDE) and shouldn't
be counted. The auto-generated part of a file starts with "Begin VB.Form"
and ends with "End" (first thing on the line). The "End" keyword may
appear inside the auto-generated part, but not at the beginning of the
line.
I imagine having a flag variable to tell whether you're inside the
auto-generated part, but I wasn't able to figure out exactly how. Here's
the function, without the ability to skip auto-generated code:
# Count the lines of source code in the file
def count_lines(f):
    file = open(f, 'r')
1/ The param name is not very explicit.
2/ You're shadowing the builtin file type.
3/ It might be better to pass an opened file object instead - this would
make your function more generic (ok, perhaps a bit overkill here, but
still a better practice IMHO).
    rows = 0
Shouldn't that be something like 'line_count'?
    for line in file:
        rows = rows + 1
Use augmented assignment instead:
        rows += 1
    return rows
You forgot to close the file.
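Putting the remarks above together, the function could look roughly like this. It's only a sketch: the names source and count_lines_in_path are mine, not from the original post, and the with statement is just one way of making sure the file gets closed.

```python
import io


def count_lines(source):
    """Count the lines of an already opened file (or any iterable of lines)."""
    line_count = 0
    for _ in source:
        line_count += 1
    return line_count


def count_lines_in_path(path):
    # Hypothetical convenience wrapper: the with statement closes the file
    # even if counting raises an exception.
    with open(path) as source:
        return count_lines(source)


# Any iterable of lines will do, which makes the function easy to try out.
demo_count = count_lines(io.StringIO("first\nsecond\nthird\n"))
```

Since count_lines only needs something it can iterate over, it works the same with a real file, a StringIO or a plain list of strings.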
How would you modify this to exclude lines between "Begin VB.Form" and
"End" as described above?
Here's a straightforward solution:
def count_loc(path):
    loc_count = 0
    in_form = False
    opened_file = open(path)
    try:
        for raw_line in opened_file:
            # stripping lines
            line = raw_line.strip()
            # skipping blank lines
            if not line:
                continue
            # skipping VB comments (VB uses an apostrophe)
            # XXX: comment mark should not be hardcoded
            if line.startswith("'"):
                continue
            # skipping autogenerated code
            if line.startswith("Begin VB.Form"):
                in_form = True
                continue
            elif in_form:
                # only an "End" at the very start of the raw line closes
                # the form; an indented "End" belongs to a nested block
                if raw_line.startswith("End"):
                    in_form = False
                continue
            # Still here? OK, we count this one
            loc_count += 1
    finally:
        opened_file.close()
    return loc_count
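To see which lines the flag lets through, here is the same idea as a generator run on a tiny sample. Take it as a sketch only: the sample text and the name iter_counted_lines are made up for illustration, it assumes the apostrophe as VB comment mark, and it tests the unstripped line for "End" because the question says only an "End" at the very start of a line closes the block.

```python
# Made-up sample, loosely shaped like a VB form file.
SAMPLE = """\
Begin VB.Form Form1
   Caption = "Demo"
   Begin VB.CommandButton Command1
      Caption = "Go"
   End
End
' a comment

Private Sub Command1_Click()
    MsgBox "hi"
End Sub
"""


def iter_counted_lines(lines):
    """Yield (stripped) the lines that would be counted as source code."""
    in_form = False
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue  # blank line
        if in_form:
            # an indented "End" closes a nested block, not the form itself
            if raw_line.startswith("End"):
                in_form = False
            continue
        if line.startswith("Begin VB.Form"):
            in_form = True
            continue
        if line.startswith("'"):
            continue  # comment
        yield line


counted = list(iter_counted_lines(SAMPLE.splitlines()))
```

With this sample, only the three lines of the Sub survive; the whole form block, the comment and the blank line are skipped.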
HTH
PS: If you prefer a more functional approach
(warning: the following code may permanently damage innocent minds):
def chain(*predicates):
    def _chained(arg):
        for p in predicates:
            if not p(arg):
                return False
        return True
    return _chained

def not_(predicate):
    def _not_(arg):
        return not predicate(arg)
    return _not_

class InGroupPredicate(object):
    def __init__(self, begin_group, end_group):
        self.in_group = False
        self.begin_group = begin_group
        self.end_group = end_group

    def __call__(self, line):
        if self.begin_group(line):
            self.in_group = True
            return True
        elif self.in_group and self.end_group(line):
            self.in_group = False
            return True # this one too is part of the group
        return self.in_group

def count_locs(lines, count_line):
    # rstrip only: the leading whitespace is what tells a nested "End"
    # from the one closing the form.
    # filter() is lazy in Python 3, so count its items instead of len().
    return sum(1 for _ in filter(
        chain(lambda line: bool(line), count_line),
        map(str.rstrip, lines)
    ))

def count_vb_locs(lines):
    return count_locs(lines, chain(
        not_(InGroupPredicate(
            lambda line: line.startswith('Begin VB.Form'),
            lambda line: line.startswith('End')
        )),
        # VB comments start with an apostrophe
        lambda line: not line.lstrip().startswith("'")
    ))

# and finally our count_lines function, greatly simplified !-)
def count_lines(path):
    f = open(path)
    try:
        return count_vb_locs(f)
    finally:
        f.close()
(anyone up for doing it with itertools ?-)
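For what it's worth, here is one possible take on the itertools question, as a rough sketch rather than a polished answer. The names outside_forms and count_vb_loc are made up, the apostrophe is assumed as comment mark, and "End" is tested on the unstripped line so that only an "End" at the very start of a line closes the form.

```python
from itertools import dropwhile, takewhile

_EXHAUSTED = object()


def outside_forms(lines):
    """Yield the lines that are not inside a Begin VB.Form ... End block."""
    it = iter(lines)
    while True:
        # takewhile stops at (and swallows) the "Begin VB.Form" line
        yield from takewhile(lambda l: not l.startswith("Begin VB.Form"), it)
        # dropwhile skips the form body; next() swallows the closing "End"
        closing = next(dropwhile(lambda l: not l.startswith("End"), it), _EXHAUSTED)
        if closing is _EXHAUSTED:
            return  # no more input


def count_vb_loc(lines):
    # Count the remaining lines that are neither blank nor comments.
    return sum(
        1 for line in outside_forms(lines)
        if line.strip() and not line.lstrip().startswith("'")
    )


# Made-up sample lines, as they would come out of a file object.
sample = [
    "Begin VB.Form Form1\n",
    "   Begin VB.Label Label1\n",
    "   End\n",
    "End\n",
    "' comment\n",
    "\n",
    "Sub Main()\n",
    "End Sub\n",
]
sample_count = count_vb_loc(sample)
```

The trick is that takewhile and dropwhile share the same underlying iterator, so each one picks up where the other stopped. Whether this is any clearer than the plain flag variable is debatable.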