Text File Parsing - List Unique Column Values

1 New Member
I am new to Perl and would really appreciate some help with using a regex. For example, this is my log file:

000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU

I want to take only the 6th field (DEFAULT) using the following Perl script:

cut -d \| -f6 <filename> | sort | uniq

so that I get only the unique values (DEFAULT, MENU, etc.) from the log file.

When I run the above command I also get duplicate values, which should not be the case. I guess there is a space at the end of every row, so I am not able to get the unique values.

I would really appreciate it if any of you could help me with this.
Aug 13 '07 #1
AndyHunt
4 New Member
A couple of things:
- that works fine here (on Solaris 8)! What OS are you on?
- I assume that you're seeing the following output:
DEFAULT
DEFAULT
MENU
is that right? If not, what output are you seeing?
(I see:
DEFAULT
MENU
)

- Also, this isn't anything to do with Perl. I'm new here too, so I'm not sure how strict people are, but you may get better help in the Linux / Unix / BSD forum.
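
One way to check the trailing-space guess from the question is to print a field with its invisible characters made visible. This is only a minimal sketch: show_invisible is a hypothetical helper name, and the two sample values are made up for illustration rather than taken from the real log.

```perl
use strict;
use warnings;

# Make carriage returns, newlines and tabs in a string visible, and wrap the
# result in brackets so trailing spaces stand out.
# show_invisible is a hypothetical helper name, not part of the thread.
sub show_invisible {
    my ($text) = @_;
    $text =~ s/\r/\\r/g;
    $text =~ s/\n/\\n/g;
    $text =~ s/\t/\\t/g;
    return "[$text]";
}

# Two made-up sample values: one clean, one with a trailing space and a
# Windows-style (CRLF) line ending.
print show_invisible("DEFAULT"), "\n";
print show_invisible("DEFAULT \r\n"), "\n";
```

If two values that look identical on screen come out different here, sort | uniq will treat them as different lines too.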

HTH

Andy
Aug 13 '07 #2
numberwhun
3,509 Recognized Expert Moderator Specialist
Hello! Just as a note, when you post to the forum, be sure to post the code that you have tried thus far. That way, we can help you work out any errors or issues you are experiencing.

As for your issue, I am in a code writing mood today and have whipped up something really quick.

You mentioned using a regular expression to pull out the 6th field. Sure, you could do that, but that's kind of like using a hammer to open the door by breaking the window when you have the key in your pocket.

Instead, since your fields are "|" (pipe) delimited, why not feed each line into the split function using a while loop, and pull out the 6th field (element [5] of the array)? To me, this was much easier. Save the regexes for when you really need them. Of course, don't let me stop you from trying that route, as it is great to learn regexes if you haven't already.

Here is the code:

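use strict;
use warnings;

# Open the log file for reading and stop with a message if that fails.
open(my $fh, '<', 'Text1.txt') or die "Cannot open Text1.txt: $!";

while (my $line = <$fh>)
{
    chomp($line);
    my @fields = split(/\|/, $line);

    # The 6th field is element [5] of the array.
    print("$fields[5]\n");
}

close($fh);
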
Also, your line of code:

cut -d \| -f6 <filename> | sort | uniq

is a line from a shell script, not Perl. Sure, you could run it from Perl inside backticks or with the system() function, but if you are going to code in Perl, then do so.

Regards,

Jeff
Aug 13 '07 #3
miller
1,089 Recognized Expert Top Contributor
As has already been stated, this is not truly a Perl issue. However, it can be solved fairly easily with Perl.

The following code prints only the unique values of the 6th column, reading from the DATA file handle:

use strict;
use warnings;

my %seen;
while (<DATA>) {
    chomp;
    my @columns = split /\|/;
    # Print a value only the first time it is seen.
    print "$columns[5]\n" if !$seen{$columns[5]}++;
}

__DATA__
000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU

This technique is documented here:

perlfaq4 Data Manipulation - How can I remove duplicate elements from a list or array?
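
If the duplicates in the original question really do come from trailing spaces or Windows-style line endings (that cause is only a guess made in the question, not something confirmed in this thread), the same %seen technique can be combined with a trim. The following is a minimal sketch under that assumption. unique_sixth_fields is a hypothetical helper name, and the trailing space and CRLF in the sample rows were added for illustration.

```perl
use strict;
use warnings;

# Collect the distinct values of the 6th field, ignoring surrounding
# whitespace (including a trailing carriage return or newline).
# unique_sixth_fields is a hypothetical helper name.
sub unique_sixth_fields {
    my (@lines) = @_;
    my %seen;
    my @unique;
    for my $line (@lines) {
        my @columns = split /\|/, $line;
        next unless defined $columns[5];    # skip short or blank lines
        (my $value = $columns[5]) =~ s/^\s+|\s+$//g;    # trim both ends
        push @unique, $value unless $seen{$value}++;
    }
    my @sorted = sort @unique;    # same ordering idea as "sort | uniq"
    return @sorted;
}

# Rows from the question, with a trailing space and a CRLF ending added to
# mimic the suspected problem.
my @rows = (
    "000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT \n",
    "000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT\r\n",
    "000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU\n",
);

print "$_\n" for unique_sixth_fields(@rows);
```

To run it against a real file, the lines read from an opened file handle can be passed in place of @rows.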

- Miller
Aug 14 '07 #4
