Beautification Script - Regular Expressions

Hi all,

I am working on a VHDL code beautifier in Perl. I've reached this part of the beautification process and got really stuck. Take, for example, the following piece of VHDL code:

    entity JK_FF is
      port( clock : in std_logic;
             J, K : in std_logic;
            reset : in std_logic;
            Q, Qbar : out std_logic);
    end JK_FF;

I'm trying to work out the regular expressions that would transform it into this:

    entity JK_FF is
      port( clock : in  std_logic;
            J     : in  std_logic;
            K     : in  std_logic;
            reset : in  std_logic;
            Q     : out std_logic;
            Qbar  : out std_logic);
    end JK_FF;

In short:
i. Place all words between 'port(' and ');' in columns.
ii. Separate
<signal_name_1>, <signal_name_2>,...,<signal_name_n> : <direction> <type>;

to

<signal_name_1> : <direction> <type>;
<signal_name_2> : <direction> <type>;
...
<signal_name_n> : <direction> <type>;

Any help or suggestions would be more than welcome.

Thanks in advance!
Mar 24 '08 #1
nithinpes
This can be done with split(): split each line of the file on commas, then split the last element of the resulting array on the colon to separate the last signal name from the direction and type.
It would be good to know what you have tried so far!
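
A minimal sketch of that idea, assuming each declaration sits on its own line and the type itself contains no commas. The helper name expand_signals is made up for illustration and does not come from the thread:

```perl
use strict;
use warnings;

# Turn one port line into one "name : direction type;" line per signal.
# expand_signals is a hypothetical helper name.
sub expand_signals {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;                  # trim both ends
    my @parts = split /\s*,\s*/, $line;       # split on commas
    # The last piece still carries ": direction type;",
    # so split it once more on the first colon.
    my ($last_name, $rest) = split /\s*:\s*/, pop(@parts), 2;
    return map { "$_ : $rest" } (@parts, $last_name);
}

print "$_\n" for expand_signals('J, K : in std_logic;');
print "$_\n" for expand_signals('reset : in std_logic;');
```

A line without commas goes through the same code path and comes back as a single line, so no separate detection step is needed.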
Mar 24 '08 #2
Yes you're right! Sorry I forgot to mention, but I've tried this lot already:

    use strict;
    #use warnings;

    my @data;
    #push @data, [split (/\s+/, $_)] for <DATA>;
    push @data, [split (' ', $_)] for <DATA>;

    foreach my $row(0..8) {
        foreach my $col(0..(@data-1)) {
            printf("%-15s", $data[$row][$col]);
        }
        print "\n";
    }

    __DATA__
    clk         : in std_logic;
    areset        : in std_logic;
    busy : out std_logic;
    writeEnable : in std_logic;
    readEnable : in std_logic;
    write    : in std_logic_vector(wordSize-1 downto 0);
    read    : out std_logic_vector(wordSize-1 downto 0);
    addr : in std_logic_vector(maxAddrBit downto minAddrBit));

Even though <wordSize-1 downto> contains a space separator, for some reason I get this:

write : in std_logic_vector(wordSize-1downto 0);

Any ideas?
Mar 24 '08 #3
nithinpes
I think part of the problem lies in this line:

    foreach my $col(0..(@data-1))

In your script, @data holds one entry per input line, so @data-1 is the index of the last row, not of the last field in a row. The loop therefore prints as many columns as there are rows, rather than the number of fields in the current row.
The missing space itself comes from printf: "%-15s" pads a field to at least 15 characters but never truncates it. "std_logic_vector(wordSize-1" is longer than that, so it gets no padding and "downto" follows it directly.
Also, you are splitting unconditionally on spaces, although you need exactly 4 fields to be aligned. For this purpose, you can use the third argument of split(). This number sets the maximum number of fields to return; everything after the third delimiter stays together as the last element of the array.
Use:

    push @data, [split(/\s+/, $_, 4)] for <DATA>;

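
A small sketch of the two effects involved, using one line from the sample data: the field limit of split(), and the fact that a printf width is a minimum rather than a maximum. The variable names are only for illustration:

```perl
use strict;
use warnings;

my $line = 'write    : in std_logic_vector(wordSize-1 downto 0);';

my @all  = split ' ', $line;       # no limit: the type is broken at its inner spaces
my @four = split ' ', $line, 4;    # limit 4: the whole type stays in the last field

print scalar(@all),  " fields without a limit\n";
print scalar(@four), " fields with a limit of 4\n";
print "last field: $four[3]\n";

# %-15s pads to at least 15 characters but never truncates, so a longer
# field is followed directly by the next one with no space in between.
my $row = join '', map { sprintf '%-15s', $_ } @all[3, 4];
print "$row\n";
```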
Mar 24 '08 #4
Bingo! That worked exactly the way I want it! Thanks a lot!

Now the tricky part (for me it is!), is how to:

Separate
<signal_name_1>, <signal_name_2>,...,<signal_name_n> : <direction> <type>;

to

<signal_name_1> : <direction> <type>;
<signal_name_2> : <direction> <type>;
...
<signal_name_n> : <direction> <type>;

The special case above may or may not occur, so some sort of detection is required. I can roughly think of a way using split() (switching rows to columns, etc.), re-appending the trailing ;, and multiple ifs, but it looks quite dodgy. I'd prefer a neater way of doing it. Could you suggest anything, especially for the detection part?
Mar 24 '08 #5
KevinADC
And it will be fairly tricky. You could use a hash of arrays.

<direction> will (it appears) be one of two values, "in" or "out". <type> looks like it could be just about anything, but I assume it's everything after the <direction> indicator. You would use those two pieces of information as hash keys, then push each <signal_name> into the appropriate array. One possible drawback is the loss of the original order of the data, but if the order is not important then that is not a problem.
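
A rough sketch of the hash-of-arrays idea, assuming the "port(" prefix and the closing parenthesis have already been stripped so that every line ends in a plain semicolon. The names %signals_for and @key_order are invented for this example; @key_order is only there to remember the order in which each direction/type pair was first seen:

```perl
use strict;
use warnings;

my @lines = (
    'clock : in std_logic;',
    'J, K : in std_logic;',
    'reset : in std_logic;',
    'Q, Qbar : out std_logic;',
);

my %signals_for;    # "direction type" => [ signal names ]
my @key_order;      # first-seen order of the keys, since hashes are unordered

for my $line (@lines) {
    # names before the colon, then the direction word, then the type up to ';'
    my ($names, $direction, $type) =
        $line =~ /^\s*(.+?)\s*:\s*(\w+)\s+(.+?);\s*$/
        or next;
    my $key = "$direction $type";
    push @key_order, $key unless exists $signals_for{$key};
    push @{ $signals_for{$key} }, split /\s*,\s*/, $names;
}

for my $key (@key_order) {
    print "$_ : $key;\n" for @{ $signals_for{$key} };
}
```

Signals are grouped by direction and type, so if "in" and "out" declarations are interleaved in the source, the output order will differ from the input order.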
Mar 24 '08 #6
Could you spare me an example? A little something to start fiddling with! Hope I'm not asking too much.
Mar 24 '08 #7
nithinpes
Using a hash of arrays is a good approach. But from your initial description, I assume you need to retain the order. This can be done with an array of arrays as well, though it is a bit lengthy.
The following code would do the job:
    use strict;
    use warnings;

    my @data;
    for (<DATA>) {
        next unless /\S/;    ## skip blank lines
        chomp;
        s/^\s+//;            ## trim spaces from beginning of line
        ## remove spaces around commas; otherwise the signals would be split
        ## into separate elements if there is a space before/after a comma
        s/\s*,\s*/,/g;
        push @data, [split(/\s+/, $_, 4)];
    }

    foreach my $row (0..$#data) {    ## up to last row
        my @other = @{ $data[$row] };
        ## separate out the signals; a row without commas gives a single signal
        my @signals = split(/,/, shift @other);
        ## @other now holds the colon, the direction and the type
        foreach my $signal (@signals) {
            printf("%-15s", $signal);
            printf("%-15s", $_) foreach (@other);
            print "\n";
        }
    }

    __DATA__
    clk         : in std_logic;
    areset,reset        : in std_logic;
    busy : out std_logic;
    writeEnable : in std_logic;
    readEnable, modifyEnable : in std_logic;
    write, copy    : in std_logic_vector(wordSize-1 downto 0);
    read  ,clock    : out std_logic_vector(wordSize-1 downto 0);
    addr : in std_logic_vector(maxAddrBit downto minAddrBit));

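
The fixed width of 15 does not reproduce the layout asked for in the first post, where the name column is only as wide as the longest name. One possible way to get that, sketched here on rows that are assumed to be already split into one signal each (the @rows data is typed in by hand for the example), is to measure the columns first and pass the widths to sprintf with "*":

```perl
use strict;
use warnings;
use List::Util qw(max);

# Rows of [ name, direction, type ], one signal per row (sample data).
my @rows = (
    [ 'clock', 'in',  'std_logic' ],
    [ 'J',     'in',  'std_logic' ],
    [ 'K',     'in',  'std_logic' ],
    [ 'reset', 'in',  'std_logic' ],
    [ 'Q',     'out', 'std_logic' ],
    [ 'Qbar',  'out', 'std_logic' ],
);

# Column widths come from the data instead of being fixed.
my $name_width = max map { length $_->[0] } @rows;
my $dir_width  = max map { length $_->[1] } @rows;

# '*' takes the field width from the argument list.
my @formatted = map {
    sprintf '%-*s : %-*s %s;', $name_width, $_->[0], $dir_width, $_->[1], $_->[2]
} @rows;

print "$_\n" for @formatted;
```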
Mar 25 '08 #8
Thanks a lot, yes, that's the idea more or less. I probably should have said this from the start, but I don't really want to print the formatted text; I want to collect it into a buffer so that I can replace the original part with the formatted one. Any ideas how to achieve that with your existing code?
Mar 25 '08 #9
nithinpes
If you want to reformat the file that contains the data, all you need to do is read from that file, write into a temporary file, and then rename the temporary file to the data file. You can use this example:

    use strict;
    use warnings;

    my @data;
    open(my $in,  '<', 'data.txt') or die "read failed: $!";
    open(my $out, '>', 'temp.txt') or die "write failed: $!";   ## open temporary file
    for (<$in>) {
        next unless /\S/;    ## skip blank lines
        chomp;
        s/^\s+//;            ## trim spaces from beginning of line
        ## remove spaces around commas; otherwise the signals would be split
        ## into separate elements if there is a space before/after a comma
        s/\s*,\s*/,/g;
        push @data, [split(/\s+/, $_, 4)];
    }

    foreach my $row (0..$#data) {    ### up to last row
        my @other = @{ $data[$row] };
        ## separate out the signals; a row without commas gives a single signal
        my @signals = split(/,/, shift @other);
        ## @other now holds the colon, the direction and the type
        foreach my $signal (@signals) {
            printf {$out} "%-15s", $signal;
            printf {$out} "%-15s", $_ foreach (@other);
            print {$out} "\n";
        }
    }
    close($in);
    close($out) or die "close failed: $!";
    ## change temp.txt to data.txt
    rename('temp.txt', 'data.txt') or die "rename failed: $!";

Also note that the range for $row runs from 0 to $#data, the index of the last row, so every row is processed however many lines the file has.
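
If an in-memory buffer is preferred over a temporary file, something along these lines might work. It is only a sketch: it assumes the port list is immediately followed by "end", and the helper name format_ports and the sample $source are made up for the example. The formatted declarations are built as a string and spliced over the original ones with substr():

```perl
use strict;
use warnings;
use List::Util qw(max);

my $source = <<'VHDL';
entity JK_FF is
  port( clock : in std_logic;
         J, K : in std_logic;
        reset : in std_logic;
        Q, Qbar : out std_logic);
end JK_FF;
VHDL

# Build the replacement text for everything between "port(" and ");".
# format_ports is a hypothetical helper name.
sub format_ports {
    my ($body, $indent) = @_;
    my @pairs;                                  # [ name, "direction type" ]
    for my $decl (split /;/, $body) {
        my ($names, $rest) = split /\s*:\s*/, $decl, 2;
        next unless defined $rest;              # skip fragments without a colon
        $names =~ s/^\s+|\s+$//g;
        $rest  =~ s/\s+$//;
        push @pairs, map { [ $_, $rest ] } split /\s*,\s*/, $names;
    }
    my $width = max(map { length $_->[0] } @pairs) // 0;
    return join ";\n$indent", map { sprintf '%-*s : %s', $width, @$_ } @pairs;
}

my $result = $source;
if ($source =~ /^([ \t]*port\(\s*)(.*?)\);\s*end\b/ms) {
    my ($head, $body) = ($1, $2);
    my ($start, $length) = ($-[2], $+[2] - $-[2]);   # where the old declarations sit
    # Continuation lines are indented to line up under the first signal name.
    substr($result, $start, $length, format_ports($body, ' ' x length($head)));
}
print $result;
```

Only the declarations are replaced; the "port(" prefix, the closing ");" and the rest of the entity are left as they were.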
Mar 26 '08 #10
KevinADC
"rharsh" on Tek-Tips has already written you a 99% working solution, and nithinpes's code is largely a duplication of it. You seem a bit disengenuous to me for not telling either forum that you are also getting help from the other.
Mar 26 '08 #11
You probably mean 'disingenuous'... Well, it never crossed my mind that querying multiple sources was a form of disingenuousness! By that logic I should read one and only one Perl book, say, and never two or more, because that would make me disingenuous to the author of the first book! Nevertheless, I truly apologize if that insulted you in any way!
Mar 26 '08 #12
