Lix count (readability) / Flesch-Kincaid

For my thesis I'm using Access to analyze news articles. I would like
to calculate a readability score, but I don't know how.

I have a very simple word count already, and I also have an
approximation of the number of sentences (the number of periods "."). If this
group can help me count how many long words there are in an
article, I can calculate the Danish LIX.

A long word is defined as one with 7 or more letters.

Thank you in advance

Nov 13 '05 #1
What's the definition of a long word: simply the number of letters in it?

Assuming you're using Access 2000 or newer, you can use the Split function
to break each sentence into its individual words, then calculate the length
of each word. Something like the following untested aircode (it assumes that
the entire paragraph is stored in strFile):

Dim lngCountLongWords As Long
Dim lngCountWords As Long
Dim lngSentence As Long
Dim lngWord As Long      ' loop counter (originally mistyped as lngWords)
Dim strFile As String
Dim varSentences As Variant
Dim varWords As Variant

' Break the text into sentences at each period
varSentences = Split(strFile, ".")
If IsNull(varSentences) = False Then
    For lngSentence = LBound(varSentences) To UBound(varSentences)
        ' Extra spaces and the like may result in some elements in
        ' varSentences not actually being true sentences. For example,
        ' if there are 2 blanks in a row, you'll end up with an empty element
        ' in varSentences. You can vary the limit here if you like...
        If Len(varSentences(lngSentence)) > 0 Then
            ' Break the sentence into words at each space
            varWords = Split(varSentences(lngSentence), " ")
            If IsNull(varWords) = False Then
                For lngWord = LBound(varWords) To UBound(varWords)
                    If Len(varWords(lngWord)) > 0 Then
                        lngCountWords = lngCountWords + 1
                        ' You'll likely want to vary this number too
                        If Len(varWords(lngWord)) > 9 Then
                            lngCountLongWords = lngCountLongWords + 1
                        End If
                    End If
                Next lngWord
            End If
        End If
    Next lngSentence
End If
--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

Nov 13 '05 #2
> What's the definition of a long word: simply the number of letters in it?

:-) Yeah - Danish is not that complicated.

> Something like the following untested aircode

Thank you SO much! I'm not good at VB and SQL, so I really don't
understand everything that's going on in your code. But I'll try it out
tomorrow (right now it's 2 am on a Saturday night) and let you know.

> (it assumes that the entire paragraph is stored in strFile)

It's an Access memo field.

Again: THX a bunch

Nov 13 '05 #3
A few more questions:

Is it correct that I cannot use this code in a query? So I must make a
form, and then call it from a command button's event procedure?

What exactly is the syntax for strFile? Is it simply the path to
the mdb file, or just the table and field name?

Where are lngCountLongWords, lngCountWords, lngSentence and lngWords
created? In strFile? Not in a separate query?

I really am sorry for my incompetence, but beyond drag-and-drop
Access, I have VERY little experience...

Again - thank you in advance

Nov 13 '05 #4
"achristoffersen" <ac*************@gmail.com> wrote in message
news:11*********************@g44g2000cwa.googlegro ups.com...
> A few more questions:
>
> Is it correct that I cannot use this code in a query? So I must make a
> form, and then call it from a command button's event procedure?

You could turn the snippet of code into a function that returns a
value, but since I don't understand how you calculate readability, it's
difficult to tell you what the returned value should be.

> What exactly is the syntax for strFile? Is it simply the path to
> the mdb file, or just the table and field name?

The code I wrote assumed that strFile contained the actual text of a
paragraph you're analysing. If you were to turn that code into a function,
I'd assume that you'd pass strFile into the function as a parameter:

Function AnalysisText(TextToAnalyse As String) As ?

then you'd replace

varSentences = Split(strFile, ".")

with

varSentences = Split(TextToAnalyse, ".")

> Where are lngCountLongWords, lngCountWords, lngSentence and lngWords
> created? In strFile? Not in a separate query?

Not quite sure what you mean by "created". Those four values are variables
used in the code. If you convert the code into a function, they'd be local
to the function: not used (or accessible) anywhere else.

lngSentence and lngWord (that was a typo in the original code: I use
lngWord, not lngWords everywhere else) are simply looping variables. The
first Split function takes the paragraph that's supplied, and creates an
array out of it, with each element in the array representing a different
sentence in the paragraph. lngSentence is how the code loops through each
sentence. Similarly, the second Split function takes each sentence, and
creates an array of each word in the sentence, and lngWord loops through
each word.

lngCountWords and lngCountLongWords are the gist of the function: when all
of the loops are done, lngCountWords contains how many words are in the
paragraph, while lngCountLongWords contains how many of those words are
longer than the given length (9 letters in my sample code).
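
As a rough, untested sketch of that conversion: the function name CountLongWords and the lngLimit parameter below are made up for illustration, and the word splitting is the same simple period-and-space approach as the original snippet, so punctuation stuck to a word counts towards its length.

```vba
' Hypothetical helper: returns how many words in TextToAnalyse are
' longer than lngLimit characters. Names are illustrative only.
Public Function CountLongWords(ByVal TextToAnalyse As String, _
                               ByVal lngLimit As Long) As Long
    Dim varSentences As Variant
    Dim varWords As Variant
    Dim lngSentence As Long
    Dim lngWord As Long
    Dim lngCountLongWords As Long

    ' Split returns an empty array for an empty string,
    ' so the loops below simply do not run in that case.
    varSentences = Split(TextToAnalyse, ".")
    For lngSentence = LBound(varSentences) To UBound(varSentences)
        varWords = Split(varSentences(lngSentence), " ")
        For lngWord = LBound(varWords) To UBound(varWords)
            If Len(varWords(lngWord)) > lngLimit Then
                lngCountLongWords = lngCountLongWords + 1
            End If
        Next lngWord
    Next lngSentence

    CountLongWords = lngCountLongWords
End Function
```

If this is saved as a public function in a standard module, it can be called from other VBA code or from a query expression. Because the parameter is declared As String, a Null field would have to be dealt with before the call.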

If you need more help, you're going to have to provide details such as how
your data is stored in the database, and what calculation needs to be done
once you've determined how many words and how many "long words" are in the
snippet of text being analysed.

--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

Nov 13 '05 #5
> but since I don't understand how you calculate readability, it's
> difficult to tell you what that value that's returned should be.

The Danish readability number, Lix, is calculated from:
Number of words (W)
Number of periods (P)
Number of words with at least 7 letters (L)

Lix = W/P + (L/W x 100)

I.e. Lix = the number of words per sentence (one can argue that, since
abbreviations are more difficult to read, there is no need to correct
for the extra periods they add) + the percentage of long words in the text.

A Lix number of less than 24 is children's literature, and a number of
55 or more is difficult academic writing. 35-44 is normal, e.g. a newspaper
article.

Since I already have the number of periods and the number of words
returned by a quite simple query, my idea was to get the number of long
words returned, and then simply do the math in a separate query field.

Does this make sense?

The article is stored in a memo field ("artikel tekst") in
tbl_anonymiser, along with an autoincrement primary key ("artikelID")
and other fields.

Nov 13 '05 #6
My advice would be to write a function that accepts a string as input, and
then returns the Lix value.

You say you already know the number of words and the number of periods, and
I've shown you how to calculate the number of long words. Hopefully that's
enough to get you going.
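
A minimal, untested sketch of such a function follows. It rests on assumptions that are not settled anywhere in this thread: the name LixScore is made up, a long word is taken to be 7 or more characters, a word is whatever sits between spaces or line breaks, and every period is treated as the end of a sentence.

```vba
' Hypothetical function: returns the Lix value of a text, or Null when
' the text is empty, has no words, or has no periods.
Public Function LixScore(ByVal TextToAnalyse As Variant) As Variant
    Dim varWords As Variant
    Dim strText As String
    Dim lngWord As Long
    Dim lngWords As Long        ' W: number of words
    Dim lngPeriods As Long      ' P: number of periods
    Dim lngLongWords As Long    ' L: number of long words

    LixScore = Null
    ' Appending "" turns a Null field into an empty string
    If Len(TextToAnalyse & "") = 0 Then Exit Function
    strText = TextToAnalyse

    ' Splitting on "." yields one more piece than there are periods
    lngPeriods = UBound(Split(strText, "."))

    ' Treat line breaks and periods as word separators, then split on spaces
    strText = Replace(Replace(strText, vbCrLf, " "), ".", " ")
    varWords = Split(strText, " ")
    For lngWord = LBound(varWords) To UBound(varWords)
        If Len(varWords(lngWord)) > 0 Then
            lngWords = lngWords + 1
            ' Assumption: "long" means 7 or more characters
            If Len(varWords(lngWord)) >= 7 Then
                lngLongWords = lngLongWords + 1
            End If
        End If
    Next lngWord

    ' Avoid division by zero
    If lngWords = 0 Or lngPeriods = 0 Then Exit Function
    LixScore = lngWords / lngPeriods + (lngLongWords / lngWords) * 100
End Function
```

With that in a standard module, a query on tbl_anonymiser could add a calculated field such as Lix: LixScore([artikel tekst]). Commas and other punctuation attached to a word still count towards its length, so the result is only an approximation.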

--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

Nov 13 '05 #7
> You say you already know the number of words and the number of periods, and
> I've shown you how to calculate the number of long words. Hopefully that's
> enough to get you going.

It's certainly enough to get me going :-) If I go crashing into a
concrete wall, I'll let you know. For now, I'll just say: thanks a
million!

Sincerely
Andreas

Nov 13 '05 #8
