making a typing speed tester

Referred here from the tutor list.
I'm trying to write a program to test someone's typing speed and show
them their mistakes. However, I'm getting weird results when looking
for the differences in strings longer than 100 characters:

import difflib

# a tape measure string (just makes it easier to locate a given index)
a = ('1-3-5-7-9-12-15-18-21-24-27-30-33-36-39-42-45-48-51-54-57-60-63-66-69'
     '-72-75-78-81-84-87-90-93-96-99-103-107-111-115-119-123-127-131-135-139'
     '-143-147-151-155-159-163-167-171-175-179-183-187-191-195--200')

# now with a few mistakes
b = ('1-3-5-7-'
     'l-12-15-18-21-24-27-30-33-36-39o42-45-48-51-54-57-60-63-66-69-72-75-78'
     '-81-84-8k-90-93-96-9l-103-107-111-115-119-12b-1v7-131-135-139-143-147-'
     '151-m55-159-163-167-a71-175j179-183-187-191-195--200')

s = difflib.SequenceMatcher(None, a, b)
ms = s.get_matching_blocks()

print ms
[(0, 0, 8), (200, 200, 0)]

Have I made a mistake or is this function designed to give up when the
input strings get too long? If so what could I use instead to compute
the mistakes in a typed text?
---------- Forwarded message ----------
From: Evert Rol

Hi Tom,

Ok, I wasn't on the list last year, but I was a few days ago, so
persistence pays off; partly, as I don't have a full answer.

I got curious and looked at the source of difflib. There's a method
__chain_b() which sets up the b2j variable, which contains the
occurrences of characters in string b. So cutting b to 199
characters, it looks like this:
b2j= 19 {'a': [168], 'b': [122], 'm': [152], 'k': [86], 'v':
[125], '-': [1, 3, 5, 7, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 42,
45, 48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93,
96, 99, 103, 107, 111, 115, 119, 123, 127, 131, 135, 139, 143, 147,
151, 155, 159, 163, 167, 171, 179, 183, 187, 191, 195, 196], 'l': [8,
98], 'o': [39], 'j': [175], '1': [0, 10, 13, 16, 20, 50, 80, 100,
104, 108, 109, 110, 112, 113, 116, 117, 120, 124, 128, 130, 132, 136,
140, 144, 148, 150, 156, 160, 164, 170, 172, 176, 180, 184, 188, 190,
192], '0': [29, 59, 89, 101, 105, 198], '3': [2, 28, 31, 32, 34, 37,
62, 92, 102, 129, 133, 137, 142, 162, 182], '2': [11, 19, 22, 25, 41,
71, 121, 197], '5': [4, 14, 44, 49, 52, 55, 74, 114, 134, 149, 153,
154, 157, 174, 194], '4': [23, 40, 43, 46, 53, 83, 141, 145], '7':
[6, 26, 56, 70, 73, 76, 106, 126, 146, 166, 169, 173, 177, 186], '6':
[35, 58, 61, 64, 65, 67, 95, 161, 165], '9': [38, 68, 88, 91, 94, 97,
118, 138, 158, 178, 189, 193], '8': [17, 47, 77, 79, 82, 85, 181,
185]}

This little detour is because of how b2j is built. Here's a part from
the comments of __chain_b():

# Before the tricks described here, __chain_b was by far the most
# time-consuming routine in the whole module! If anyone sees
# Jim Roskind, thank him again for profile.py -- I never would
# have guessed that.

And the part of the actual code reads:
    b = self.b
    n = len(b)
    self.b2j = b2j = {}
    populardict = {}
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            if n >= 200 and len(indices) * 100 > n:  # <--- !!
                populardict[elt] = 1
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]

So you're right: it has a cutoff at the (somewhat arbitrary) limit of
200 characters. How exactly that works, I don't know (it needs more
delving into the code), though it looks like a character also needs to
have a lot of indices (len(indices) * 100 > n); I guess that's caused in
your strings by the dashes, '1's and '0's (that's why I printed the b2j
dictionary).
If you feel safe enough and are on a fast platform, you can probably
raise that limit (or even expose it as an optional variable in the
code, which I would think is generally better).
I'm not sure who the author of the module is (it isn't listed in the
file itself), but perhaps you can find out and email him/her to see what
can be altered.
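
The check quoted above can be tried in isolation. What follows is only a minimal sketch written for Python 3, not difflib's actual code path: `popular_elements` is a hypothetical helper name that mirrors the quoted loop and reports which elements it would flag as popular.

```python
def popular_elements(b):
    """Return the set of elements the quoted loop would mark as popular.

    Hypothetical helper: it copies the logic of the quoted snippet,
    but collects the flagged elements in a set instead of a dict.
    """
    n = len(b)
    b2j = {}
    popular = set()
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            # Same condition as in the quoted code: only sequences of
            # 200 or more elements are checked, and an element is
            # flagged once its index list exceeds 1% of the length.
            if n >= 200 and len(indices) * 100 > n:
                popular.add(elt)
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]
    return popular


if __name__ == "__main__":
    print(sorted(popular_elements("ab" * 100)))  # 200 elements
    print(sorted(popular_elements("ab" * 99)))   # 198 elements
```

For a 200-character string the condition first becomes true on the fourth occurrence of a character (3 * 100 > 200). Going by the b2j printout above, the dash and every digit occur at least four times, so only the mistyped letters would escape the check.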

Hope that helps.

Evert

Nov 14 '07 #1
I would use the time module to time the user. Then you should be
able to compare the original string with the user's input
using cmp.

<code>
# untested
import time

start = time.time()
print 'some complicated long string'

# you should use a GUI toolkit's textbox rather than
# using a variable
user_string = raw_input('Please type the string above as quickly and '
                        'accurately as you can:\n\n')
end = time.time()
print 'amount of time to complete: %s seconds' % (end - start)

# do the comparison here
# which I am not sure how to do right now
</code>
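
One possible way to fill in the comparison step is sketched below for Python 3. It is an illustration under stated assumptions, not a tested typing tutor: `find_mistakes` and `words_per_minute` are hypothetical helper names, the "one word = five characters" convention is an assumption, and `autojunk=False` relies on a `SequenceMatcher` keyword that exists in Python 3.2 and later (it postdates this thread).

```python
import difflib


def find_mistakes(expected, typed):
    """List the differences between the target text and the typed text.

    Returns (tag, expected_fragment, typed_fragment) tuples for every
    opcode that is not 'equal'.
    """
    # autojunk=False turns off the popularity heuristic discussed in
    # this thread, so long repetitive strings are still fully compared.
    matcher = difflib.SequenceMatcher(None, expected, typed, autojunk=False)
    mistakes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            mistakes.append((tag, expected[i1:i2], typed[j1:j2]))
    return mistakes


def words_per_minute(typed, seconds):
    """Typing speed, assuming one 'word' is five characters."""
    return (len(typed) / 5.0) / (seconds / 60.0)


if __name__ == "__main__":
    for mistake in find_mistakes('hello world', 'hellp world'):
        print(mistake)
    print(words_per_minute('a' * 50, 60.0))
```

The elapsed time from the snippet above (`end - start`) would be passed as `seconds`.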

See the following for ideas on comparing similar strings/iterables:

http://www.velocityreviews.com/forum...r-strings.html

Mike

Nov 14 '07 #2
On Wed, 14 Nov 2007 14:56:25 -0300, <ta**********@googlemail.com> wrote:
> Have I made a mistake or is this function designed to give up when the
> input strings get too long? If so what could I use instead to compute
> the mistakes in a typed text?

Yes, there are some limitations on how SequenceMatcher works.

> ---------- Forwarded message ----------
> From: Evert Rol
> [...]
> And the part of the actual code reads:
>     if n >= 200 and len(indices) * 100 > n:  # <--- !!
>         populardict[elt] = 1
>         del indices[:]
>     else:
>         indices.append(i)
>
> So you're right: it has a stop at the (somewhat arbitrarily) limit of
> 200 characters. [...] If you feel safe enough and on a fast platform, you
> can probably up that limit (or even put it somewhere as an optional
> variable in the code, which I would think is generally better).

If you try with a slightly shorter text (190 chars, for example) you get
the expected result, pretty fast:

py> s = difflib.SequenceMatcher(None, a[:190], b[:190])
py> ms = s.get_matching_blocks()
py> print ms
[(0, 0, 8), (9, 9, 30), (40, 40, 46), (87, 87, 11), (99, 99, 23),
 (123, 123, 2), (126, 126, 26), (153, 153, 15), (169, 169, 6),
 (176, 176, 14), (190, 190, 0)]

So it appears that your strings are hitting that (arbitrary) limit. From
the algorithm's point of view, your strings are a rather degenerate case:
so many '-', '0' and '1' characters to match.
Try increasing that 200 to something somewhat larger than your strings.
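
On later interpreters the module itself does not need editing: since Python 3.2, `SequenceMatcher` accepts an `autojunk` keyword, and passing `autojunk=False` switches the popularity heuristic off. The sketch below (Python 3) compares the two settings on a made-up 200-character string; `matched_chars`, `expected` and `typed` are hypothetical names chosen for the illustration.

```python
import difflib


def matched_chars(a, b, autojunk=True):
    """Total number of characters covered by the matching blocks."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=autojunk)
    return sum(block.size for block in matcher.get_matching_blocks())


# A 200-character string built from only two distinct characters...
expected = 'ab' * 100
# ...and a copy with a single typo in the middle.
typed = expected[:100] + 'X' + expected[101:]

if __name__ == "__main__":
    print(matched_chars(expected, typed))                  # default heuristic
    print(matched_chars(expected, typed, autojunk=False))  # heuristic disabled
```

With the heuristic disabled, the second call should account for every character except the typo, while the default setting stops matching at the first mistake, much like the result in the original post.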

--
Gabriel Genellina

Nov 15 '07 #3
