Bytes | Software Development & Data Engineering Community

Filtering out Duplicate IDs

idorjee
76
Hi all,
I would really appreciate it if you could let me know how to keep just the first row for each unique Accession in a tab-delimited file like the one below.

Accession TC# TCAcc Evalue %ID Score
2005490039 3.A.1.2.9 Q7BSH4 9.00E-18 24.78991597 289
2005490039 3.A.1.111.2 P33116 1.00E-15 28.94736842 289
2005490048 3.A.1.107.1 P30962 1.00E-17 35.34482759 31
2005490048 9.B.14.2.1 P29961 2.00E-16 27.97202797 31

thanks a lot.
Apr 10 '07 #1
KevinADC
4,059 Expert 2GB
what have you tried so far?
Apr 10 '07 #2
idorjee
76
This is what I tried, but it doesn't filter anything; the output is just the same as the input file.

    while (<INFILE>) {
        if ($_ =~ /(\S+)\t(.+)/) {
            my $qa = $1;
            my $rest = $2;
            my $lowest = $qa;
            $lowest = $qa if $qa ne $lowest;
            print OUTFILE "$lowest\t$rest\n";
        }
    }

thanks
Apr 10 '07 #3
KevinADC
4,059 Expert 2GB
You need to use a hash to keep track of what you have "seen" so you don't repeat it:

    my %seen = ();
    while (<INFILE>) {
        if (/^(\S+)\t/) {
            next if ++$seen{$1} > 1;
        }
        print OUTFILE;
    }
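
For anyone who wants to try the idea without the original filehandles, here is a minimal self-contained sketch of the same %seen technique. The first_rows helper and the @sample array are made up for illustration (they are not from the thread), and the sample rows are copied from the question on the assumption that the real file separates the columns with tabs.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keep only the first line
# seen for each value in the first tab-separated column.
sub first_rows {
    my @lines = @_;
    my %seen;    # first-column value => number of times seen so far
    my @kept;
    for my $line (@lines) {
        # Everything up to the first tab is the key (the Accession).
        my ($key) = $line =~ /^([^\t]*)/;
        next if $seen{$key}++;    # true from the second occurrence on
        push @kept, $line;
    }
    return @kept;
}

# Sample rows from the question, with tabs assumed as the separator.
my @sample = (
    "Accession\tTC#\tTCAcc\tEvalue\t%ID\tScore\n",
    "2005490039\t3.A.1.2.9\tQ7BSH4\t9.00E-18\t24.78991597\t289\n",
    "2005490039\t3.A.1.111.2\tP33116\t1.00E-15\t28.94736842\t289\n",
    "2005490048\t3.A.1.107.1\tP30962\t1.00E-17\t35.34482759\t31\n",
    "2005490048\t9.B.14.2.1\tP29961\t2.00E-16\t27.97202797\t31\n",
);

print first_rows(@sample);
```

With this sample it prints the header line plus the 3.A.1.2.9 and 3.A.1.107.1 rows. The header survives because "Accession" is treated as just another first-column value that has not been seen before.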
Apr 10 '07 #4
idorjee
76
Thanks a lot, Kevin,
that worked fine.
^ ^*
Apr 11 '07 #5
