hashing a sequence

kdt
50
Hi,
I have a sequence of characters that I need to analyse. I want to generate a table of the unique substrings of length n and their positions. A hash table was used for this in Perl, but I'm not sure what the Python equivalent is.

sequence = 'gtccaaagtt'

I'm trying to get the output into a table with the substring in one column and its position(s) in the other. For the above sequence this could be something like:
aa 4,5

I have the following code:

def split_len(seq, length):
    return [seq[i:i+length] for i in range(0, len(seq), length)]

but it doesn't store locations or take repeats into account.

I've been searching Google for two days now :( Please help.
Thanks
Mar 31 '07 #1
ghostdog74
511 Expert 256MB
Dictionaries are the Python equivalent of hashes in Perl. Your piece of code actually splits seq into consecutive, non-overlapping pieces of the given length, e.g.:

>>> seq = 'gtccaaagtt'
>>> length = 3
>>> [seq[i:i+length] for i in range(0, len(seq), length)]
['gtc', 'caa', 'agt', 't']

Can you describe the output you want again?
Mar 31 '07 #2
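To make the point above concrete, here is a minimal sketch contrasting the chunking done by the original code with a sliding window that moves one character at a time. The function names `chunks` and `windows` are made up for illustration and do not come from the thread.

```python
def chunks(seq, length):
    """Non-overlapping pieces: the start index jumps `length` characters each time."""
    return [seq[i:i + length] for i in range(0, len(seq), length)]


def windows(seq, length):
    """Overlapping pieces: the start index moves one character each time."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


seq = 'gtccaaagtt'
print(chunks(seq, 3))   # ['gtc', 'caa', 'agt', 't']
print(windows(seq, 3))  # ['gtc', 'tcc', 'cca', 'caa', 'aaa', 'aag', 'agt', 'gtt']
```

Only the sliding window sees every substring of length n, which is what a table of substrings and positions needs.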
kdt
50
Thanks for the quick reply. I did want the sequence to be split, but I think I'm going about it the wrong way. I want to take a sequence, break it up into substrings of length n, and record the position of each occurrence of every substring in the query string. So for example, for the sequence "attgccatgc" with n=2, I would like a dictionary with:
string position
at 1,7
tt 2
tg 3,8
gc 4,9
etc...

hope this helps

thanks
Mar 31 '07 #3
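One possible way to build the dictionary described above in a single pass is sketched below. It is an illustration, not the answer given in the thread: the function name `substring_positions` and its `start` parameter are assumptions, with `start=1` chosen so that the positions match the 1-based numbering in the example table.

```python
from collections import defaultdict


def substring_positions(seq, n, start=1):
    """Map each substring of length n to the list of positions where it begins.

    `start` is the number given to the first character (1 to match the
    table in the post above, 0 for ordinary Python indexing).
    """
    positions = defaultdict(list)
    for i in range(len(seq) - n + 1):
        positions[seq[i:i + n]].append(i + start)
    return dict(positions)


table = substring_positions('attgccatgc', 2)
for sub in sorted(table):
    print(sub, ','.join(str(p) for p in table[sub]))
# at 1,7
# ca 6
# cc 5
# gc 4,9
# tg 3,8
# tt 2
```

Calling `substring_positions('gtccaaagtt', 2, start=0)` gives `[4, 5]` for `'aa'`, which is the 0-based numbering used in the first post.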
bvdet
2,851 Expert Mod 2GB
Like this?
import re

def indexList(s, item, i=0):
    # Collect every index at which item occurs in s (overlapping matches included)
    i_list = []
    while True:
        try:
            i = s.index(item, i)
        except ValueError:
            # str.index raises ValueError once there are no more matches
            break
        i_list.append(i)
        i += 1
    return i_list

s = 'Having the base plate subassembly mark at each column will do two things - reference the large scale base plate details and provide the checker with a check of the column bases in the model without having to check each column in the model.'

sub_length = 4

# Keep only the windows of sub_length characters made up entirely of lowercase letters
pattern = re.compile(r'[a-z]{%d}' % sub_length)
subList = [s[i:i+sub_length] for i in range(len(s)) if pattern.match(s[i:i+sub_length])]

# Map each distinct substring to the list of its (0-based) positions
dd = {}
for subi in subList:
    if subi not in dd:
        dd[subi] = indexList(s, subi)

for key in dd:
    print('%s = %s' % (key, dd[key]))

Sample output (the order of the entries may differ):

ence = [80]
renc = [79]
fere = [77]
scal = [95]
thin = [66]
colu = [47, 164, 219]
ving = [2, 200]
mode = [184, 233]
ovid = [126]
..........................
Mar 31 '07 #4
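As a small check of the position search used in the reply above, the sketch below finds every occurrence of one substring, overlapping matches included. It uses `str.find` (which returns -1 when nothing is found) instead of `str.index` with exception handling; the name `index_list` is an illustrative variant and not the code from the post.

```python
def index_list(s, item):
    """Return the 0-based index of every occurrence of item in s."""
    positions = []
    i = s.find(item)
    while i != -1:
        positions.append(i)
        # Resume one character later so overlapping matches are also found
        i = s.find(item, i + 1)
    return positions


print(index_list('gtccaaagtt', 'aa'))  # [4, 5]
```

The result matches the "aa 4,5" line from the first post: the two matches overlap inside "aaa", and the positions are 0-based.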
kdt
50
Thanks bvdet, works a treat :)
Apr 1 '07 #5
