Beautification Script - Regular Expressions

Hi all,

I am working on a VHDL code beautifier in Perl. I've reached this part of the beautification process and I'm really stuck. Take, for example, the following piece of VHDL code:

entity JK_FF is
  port( clock : in std_logic;
         J, K : in std_logic;
        reset : in std_logic;
        Q, Qbar : out std_logic);
end JK_FF;

Well, I'm trying to figure out the regular expressions to transform it into this:

entity JK_FF is
  port( clock : in  std_logic;
        J     : in  std_logic;
        K     : in  std_logic;
        reset : in  std_logic;
        Q     : out std_logic;
        Qbar  : out std_logic);
end JK_FF;

Hence, briefly,
i. Place all words between 'port(' and ');' in columns.
ii. Separate
<signal_name_1>, <signal_name_2>,...,<signal_name_n> : <direction> <type>;

to

<signal_name_1> : <direction> <type>;
<signal_name_2> : <direction> <type>;
...
<signal_name_n> : <direction> <type>;

Any help or suggestions are more than welcome.

Thanks in advance!

This can be done using a split() function to split on commas for each line in the file, then splitting the last element of the resulting array on colon to get the last two fields.
It would be good to know what you have tried so far!
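
A minimal sketch of that idea, assuming each declaration sits on one line in the form "names : direction type;". The sample line and the variable names are illustrative and not taken from your script; a type that itself contains a comma would need extra care.

```perl
use strict;
use warnings;

# Hypothetical sample line; any "name, name : direction type;" line works.
my $line = "J, K : in std_logic;";

# Split on commas first; the last piece still carries ": direction type;".
my @parts = split /\s*,\s*/, $line;

# Split that last piece on the colon to separate the final name from the rest.
my ($last_name, $rest) = split /\s*:\s*/, pop(@parts), 2;

# Emit one declaration per signal name.
my @names = (@parts, $last_name);
my @out   = map { "$_ : $rest" } @names;
print "$_\n" for @out;
```

A line without a comma goes through the same code unchanged, so no separate detection step is needed for the single-signal case.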

Mar 24 '08 #2

Yes you're right! Sorry I forgot to mention, but I've tried this lot already:

use strict;
#use warnings;

my @data;
#push @data, [split (/\s+/, $_)] for <DATA>;
push @data, [split (' ', $_)] for <DATA>;

foreach my $row(0..8) {
    foreach my $col(0..(@data-1)) {
        printf("%-15s", $data[$row][$col]);
    }
    print "\n";
}

__DATA__
clk         : in std_logic;
areset        : in std_logic;
busy : out std_logic;
writeEnable : in std_logic;
readEnable : in std_logic;
write    : in std_logic_vector(wordSize-1 downto 0);
read    : out std_logic_vector(wordSize-1 downto 0);
addr : in std_logic_vector(maxAddrBit downto minAddrBit));

Even though "wordSize-1 downto" contains a space separator, for some reason I get this:

write : in std_logic_vector(wordSize-1downto 0);

Any ideas?

Mar 24 '08 #3

I feel the problem lies in this line:

foreach my $col(0..(@data-1))

In your script, @data-1 is the number of rows minus one, not the number of fields in the current row, so it is the wrong upper limit for the column loop; the number of fields in a row is @{$data[$row]}.
The missing space itself comes from printf: "%-15s" pads a field only up to a minimum width of 15 characters, and "std_logic_vector(wordSize-1" is longer than that, so the next field ("downto") is printed directly after it.
Also, you are unconditionally splitting on spaces, though you need exactly 4 fields to be aligned and formatted. For this purpose, you can use the third argument of the split() function. This number is the maximum number of fields to produce; once it is reached, the rest of the string, embedded spaces included, becomes the last element of the array.
Use:

push @data, [split (/\s+/, $_,4)] for <DATA>;
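
As a small illustration of what the third argument changes, here is a sketch that splits one line from the __DATA__ section both ways. The variable names are only for this example.

```perl
use strict;
use warnings;

# Sample line taken from the __DATA__ section above.
my $line = "write    : in std_logic_vector(wordSize-1 downto 0);";

# No limit: the type is broken apart at its internal spaces.
my @all = split /\s+/, $line;

# Limit of 4: name, colon, direction, and the whole remaining type.
my @four = split /\s+/, $line, 4;

print scalar(@all), " fields without a limit\n";
print scalar(@four), " fields with a limit of 4\n";
print "last field: $four[3]\n";
```

With the limit in place the type keeps its original internal spacing, because it is never split in the first place.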

Mar 24 '08 #4

Bingo! That worked exactly the way I want it! Thanks a lot!

Now the tricky part (for me it is!), is how to:

Separate
<signal_name_1>, <signal_name_2>,...,<signal_name_n> : <direction> <type>;

to

<signal_name_1> : <direction> <type>;
<signal_name_2> : <direction> <type>;
...
<signal_name_n> : <direction> <type>;

Well, the above special case may or may not exist, so some sort of detection is required... I can roughly think of a way using split() (switching rows to columns etc.), appending the trailing ;, and multiple ifs, but it looks quite dodgy. I'd prefer a neater way of doing it. Could you suggest anything? Especially for the detection part!

Mar 24 '08 #5

KevinADC

And it will be fairly tricky. You could use a hash of arrays.

<direction> will be (it appears) one of two values, "in" or "out". <type> looks like it could be just about anything, but I assume it's everything after the <direction> indicator. You would use those two pieces of information as hash keys. Then you would push the <signal_name> indicator into the appropriate array. One possible drawback is the loss of the order of the data, but if the original order is not important then that is not a problem.
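
A rough sketch of the hash-of-arrays idea, assuming lines of the form "names : direction type;". It joins direction and type into a single hash key, which is one possible reading of the suggestion; the input lines and the name %signals_for are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical input lines in the "names : direction type;" form.
my @lines = (
    "clock : in std_logic;",
    "J, K : in std_logic;",
    "Q, Qbar : out std_logic;",
);

my %signals_for;   # "direction type;" => [ signal names ]
for my $line (@lines) {
    # Everything before the first colon is the name list, the rest is the key.
    my ($names, $rest) = split /\s*:\s*/, $line, 2;
    push @{ $signals_for{$rest} }, split /\s*,\s*/, $names;
}

# Keys are sorted only to make the output repeatable; the original line
# order is not recorded anywhere in the hash.
my @out;
for my $rest (sort keys %signals_for) {
    push @out, "$_ : $rest" for @{ $signals_for{$rest} };
}
print "$_\n" for @out;
```

Signals that share a direction and type end up grouped together, which is where the loss of the original order comes from.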

Mar 24 '08 #6

Could you spare me an example? A little something to start fiddling with! Hope I'm not asking too much.

Mar 24 '08 #7

Using a hash of arrays is a good approach. But from your initial description, I assume you need to retain the order. This can be done with an array of arrays, though the code is a bit lengthy.
The following code would do the job:

use strict;

my @data;
for (<DATA>) {
    ### Check for commas: a signal list such as "a , b" would otherwise be
    ### split into separate fields wherever a space surrounds a comma
    unless (/,/) {
        push @data, [split (/\s+/, $_, 4)];
    } else {
        $_ =~ s/\s*,\s*/,/g;   ## remove the spaces around every comma
        push @data, [split (/\s+/, $_, 4)];
    }
}

foreach my $row(0..8) {
    my @signals; my @other;
    my $multi;
    foreach my $col(0..(@data-1)) {
        if ($data[$row][$col] =~ /,/) {
            $multi = 1;
            @signals = split(/,/, $data[$row][$col]);   ## separate out signals
            until ($col == (@data-1)) {
                $col++;
                ## take out corresponding type and direction
                push @other, $data[$row][$col];
            }
            foreach (@signals) {
                print "\n";
                printf("%-15s", $_);
                printf("%-15s", $_) foreach (@other);
            }
            last;
        }
        else {
            printf("%-15s", $data[$row][$col]);
        }
    }
    print "\n";
}

__DATA__
clk         : in std_logic;
areset,reset        : in std_logic;
busy : out std_logic;
writeEnable : in std_logic;
readEnable, modifyEnable : in std_logic;
write, copy    : in std_logic_vector(wordSize-1 downto 0);
read  ,clock    : out std_logic_vector(wordSize-1 downto 0);
addr : in std_logic_vector(maxAddrBit downto minAddrBit));

Mar 25 '08 #8

Thanks a lot, yes, that's the idea more or less. I probably should have made this clear from the start, but I don't really want to print the formatted text; I want to collect it into a buffer in order to replace the original part with the formatted one. Any ideas how to achieve that in your existing code?

Mar 25 '08 #9

If you want to modify the file containing the data according to the format, all you need to do is read from that file, write into a temporary file, and then rename the temporary file to the data file. You can use this example:

use strict;

my @data;
open(DATA, '<', "data.txt") or die "read failed:$!";
open(TEMP, '>', "temp.txt") or die "write failed:$!";   ## open temporary file

for (<DATA>) {
    chomp;          ## drop the newline so it does not end up in the last field
    s/^\s*//;       ## trim out spaces from beginning of line
    ### Check for commas: a signal list such as "a , b" would otherwise be
    ### split into separate fields wherever a space surrounds a comma
    s/\s*,\s*/,/g if /,/;
    push @data, [split (/\s+/, $_, 4)];
}

foreach my $row(0..(@data-1)) {   ### upto last row
    my @signals; my @other;
    my $multi = 0;
    foreach my $col(0..(@{$data[$row]}-1)) {   ### upto last element in the row
        if ($data[$row][$col] =~ /,/) {
            $multi = 1;
            @signals = split(/,/, $data[$row][$col]);   ## separate out signals
            ## take out corresponding direction and type (the remaining columns)
            @other = @{$data[$row]}[$col+1 .. $#{$data[$row]}];
            foreach my $signal (@signals) {
                printf TEMP ("%-15s", $_) foreach ($signal, @other);
                print TEMP "\n";
            }
            last;
        }
        else {
            printf TEMP ("%-15s", $data[$row][$col]);
        }
    }
    print TEMP "\n" unless $multi;   ## multi-signal rows have already ended their lines
}

close(DATA);
close(TEMP);

##change temp.txt to data.txt
rename("temp.txt","data.txt") or die "rename failed:$!";

Also, note the change in range used for $row and $column. This should be the range you need to ideally use to parse through all rows and all columns in each row.
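
If an in-memory buffer is preferred over a temporary file, the same steps can be sketched as a function that returns the formatted text as one string. This is only a sketch under assumptions: format_ports is a hypothetical helper name, the input lines are examples, and each line is assumed to look like "names : direction type;". The column widths are computed from the data instead of being fixed at 15.

```perl
use strict;
use warnings;

# format_ports is a hypothetical helper: it takes raw declaration lines and
# returns the aligned text as one string instead of printing it.
sub format_ports {
    my @lines = @_;
    my @rows;                      # each row: [ name, direction, type ]
    for my $line (@lines) {
        # names before the colon, then one word of direction, then the type
        next unless $line =~ /^\s*([^:]+?)\s*:\s*(\w+)\s+(.*?)\s*$/;
        my ($names, $dir, $type) = ($1, $2, $3);
        push @rows, [ $_, $dir, $type ] for split /\s*,\s*/, $names;
    }

    # Column widths come from the longest name and the longest direction.
    my ($name_w, $dir_w) = (0, 0);
    for my $row (@rows) {
        $name_w = length($row->[0]) if length($row->[0]) > $name_w;
        $dir_w  = length($row->[1]) if length($row->[1]) > $dir_w;
    }

    # Build the buffer one aligned line per signal.
    my $buffer = '';
    $buffer .= sprintf("%-*s : %-*s %s\n", $name_w, $_->[0], $dir_w, $_->[1], $_->[2])
        for @rows;
    return $buffer;
}

my $formatted = format_ports(
    "clock : in std_logic;",
    "J, K : in std_logic;",
    "Q, Qbar : out std_logic;",
);
print $formatted;
```

The returned string can then be substituted for the original port list in whatever text the beautifier is holding, without touching any file until the end.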

Mar 26 '08 #10

KevinADC

"rharsh" on Tek-Tips has already written you a 99% working solution. "nithinpes " code is largely a duplication of that code. You seem a bit disengenuous to me by not informing either forum you are also getting help from another forum.

Mar 26 '08 #11

You probably mean 'disingenuous'...Well it never crossed my mind that querying multiple sources possesses a form of disingenuousness! Based on that, I should be reading one and only one, say Perl book, but not two or more even worse, cause this would make me disingenuous to the author of the first book! Nevertheless, I truly apologize if that insulted you in any way!

Mar 26 '08 #12
