Text File Parsing - List Unique Column Values

1 New Member
I am new to Perl and would really appreciate some help with using a regex. For example, this is my log file:

000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU

I want to take only the 6th field (DEFAULT) using the following Perl script:

cut -d \| -f6 <filename> | sort | uniq

so that I get only the unique values (DEFAULT, MENU, etc.) from the log file.

When I run the above command I also get duplicate values, which should not be the case. I guess there is a space at the end of every row, so I am not able to get unique values.

I would really appreciate it if any of you could help me with this.
Aug 13 '07 #1
AndyHunt
4 New Member
A couple of things:
- that works fine here (on Solaris 8)! What OS are you on?
- I assume that you're seeing the following output:
DEFAULT
DEFAULT
MENU
is that right? If not, what output are you seeing?
(I see:
DEFAULT
MENU
)

- Also, this has nothing to do with Perl. I'm new here too, so I'm not sure how strict people are, but you may get better help in the Linux / Unix / BSD forum.
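
Staying with the shell pipeline from the question: if the duplicates really do come from trailing spaces or Windows-style carriage returns, as the original post suspects, the field can be trimmed before it is sorted. This is only a minimal sketch, assuming a POSIX shell; the function name unique_field6 and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the unique values of the 6th pipe-delimited field,
# ignoring trailing spaces, tabs, and carriage returns.
# unique_field6 is a hypothetical helper name, not something from the thread.
unique_field6() {
    # $1: path to the log file
    cut -d '|' -f6 "$1" |
        tr -d '\r' |                  # drop carriage returns (DOS line endings)
        sed 's/[[:space:]]*$//' |     # drop any remaining trailing whitespace
        sort -u                       # sort and keep one copy of each value
}

# Sample data: the first row ends in a space, the second in CR LF.
sample=$(mktemp)
printf '000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT \n' > "$sample"
printf '000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT\r\n' >> "$sample"
printf '000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU\n' >> "$sample"

result=$(unique_field6 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Piping the cut output through od -c first is a quick way to check whether such hidden characters are actually present in the file.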

HTH

Andy
Aug 13 '07 #2
numberwhun
3,509 Recognized Expert Moderator Specialist
Hello! Just as a note, when you post to the forum, be sure to post the code that you have tried thus far. That way, we can help you work out any errors or issues you are experiencing.

As for your issue, I am in a code writing mood today and have whipped up something really quick.

You mentioned using a regular expression to pull out the 6th field. Sure, you could do that, but that's kind of like using a hammer to open the door by breaking the window when you have the key in your pocket.

Instead, since your fields are "|" (pipe) delimited, why not feed each line into the split function using a while loop and pull out the 6th field (element [5] of the array)? To me, this was much easier. Save the regexes for when you really need them. Of course, don't let me stop you from trying that route, as it is great to learn regexes if you haven't already.

Here is the code:

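use strict;
use warnings;

# Three-argument open with a lexical filehandle; die if the file cannot be opened.
open(my $fh, '<', 'Text1.txt') or die "Cannot open Text1.txt: $!";

while (my $row = <$fh>)
{
    chomp($row);
    # Split on the literal pipe character; the 6th field is element [5].
    my @line = split(/\|/, $row);

    print("$line[5]\n");
}

close($fh);
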

Also, your line of code:

cut -d \| -f6 <filename> | sort | uniq

is a line from a shell script, not Perl. Sure, you could put that inside backticks or a system() call, but if you are going to code in Perl, then do so.

Regards,

Jeff
Aug 13 '07 #3
miller
1,089 Recognized Expert Top Contributor
As has already been stated, this is not truly a Perl issue. However, it can be solved fairly easily with Perl.

The following code prints out only the unique values for the 6th column in the DATA file handle:

use strict;
use warnings;

# %seen counts how many times each value has appeared so far.
my %seen;
while (<DATA>) {
    chomp;
    my @columns = split /\|/;
    # Print a value only the first time it is seen.
    print "$columns[5]\n" if !$seen{$columns[5]}++;
}

__DATA__
000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT
000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU

This technique is documented here:

perlfaq4 Data Manipulation - How can I remove duplicate elements from a list or array?
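
For anyone who wants to stay in the shell, the same "print it the first time you see it" idea can be sketched with awk. This is an illustration under assumptions rather than a drop-in fix: first_seen_field6 is a hypothetical name, the sample rows are reordered copies of the ones in the thread, and the trailing-whitespace trim is added only because the original post suspected stray spaces.

```shell
#!/bin/sh
# Sketch: print each distinct 6th field once, in first-seen order (no sorting).
# first_seen_field6 is a hypothetical helper name, not something from the thread.
first_seen_field6() {
    # $1: path to the pipe-delimited log file
    awk -F'|' '
        {
            value = $6
            sub(/[ \t\r]+$/, "", value)       # trim trailing spaces, tabs, CR
            if (!seen[value]++) print value   # print only the first occurrence
        }
    ' "$1"
}

# Sample data: MENU comes first so the first-seen ordering is visible.
sample=$(mktemp)
printf '000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|018|MENU\n' > "$sample"
printf '000046571|1000025|CUSTOMER|27-JUN-2007 06:27:59|005|DEFAULT\n' >> "$sample"
printf '000046572|1000026|ACTIVATE|16-JUL-2007 12:33:13|013|DEFAULT \n' >> "$sample"

result=$(first_seen_field6 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Unlike sort | uniq, this keeps the values in the order they first appear in the file, which matches what the %seen hash does in the Perl version.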

- Miller
Aug 14 '07 #4
