
Lix count (readability) / Flesch-Kincaid

For my thesis I'm using Access to analyze news articles. I would like
to calculate a readability score, but I don't know how.

I have a very simple word count already, and I also have an
approximation of whole sentences (number of periods "."). If this group
can help me count how many long words there are in an
article, I can calculate the Danish LIX.

A long word is defined as having at least 7 letters.

Thank you in advance

Nov 13 '05 #1
7 Replies


What's the definition of a long word: simply the number of letters in it?

Assuming you're using Access 2000 or newer, you can use the Split function
to break each sentence into its individual words, then calculate the length
of each word. Something like the following untested aircode (it assumes that
the entire paragraph is stored in strFile):

Dim lngCountLongWords As Long
Dim lngCountWords As Long
Dim lngSentence As Long
Dim lngWord As Long
Dim strFile As String
Dim varSentences As Variant
Dim varWords As Variant

' Break the text into sentences at each period
varSentences = Split(strFile, ".")
If IsNull(varSentences) = False Then
    For lngSentence = LBound(varSentences) To UBound(varSentences)
        ' Stray punctuation and the like may result in some elements in
        ' varSentences not actually being true sentences. For example,
        ' if there are 2 periods in a row, you'll end up with an empty element
        ' in varSentences. You can vary the limit here if you like...
        If Len(varSentences(lngSentence)) > 0 Then
            ' Break the sentence into words at each space
            varWords = Split(varSentences(lngSentence), " ")
            If IsNull(varWords) = False Then
                For lngWord = LBound(varWords) To UBound(varWords)
                    ' 2 blanks in a row produce an empty element: skip it
                    If Len(varWords(lngWord)) > 0 Then
                        lngCountWords = lngCountWords + 1
                        ' You'll likely want to vary this number too
                        If Len(varWords(lngWord)) > 9 Then
                            lngCountLongWords = lngCountLongWords + 1
                        End If
                    End If
                Next lngWord
            End If
        End If
    Next lngSentence
End If
--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

Nov 13 '05 #2

> What's the definition of a long word: simply the number of letters in it?

:-) Yeah - Danish is not that complicated.

> Something like the following untested aircode

Thank you SO much! I'm not good at VB and SQL, so I really don't
understand everything that's going on in your code. But I'll try it out
tomorrow (right now it's 2 am on a Saturday night) and let you know.

> (it assumes that the entire paragraph is stored in strFile)

It's stored in an Access memo field.

Again: thanks a bunch

Nov 13 '05 #3

A few more questions:

Is it correct that I cannot use this code in a query? So I must make a
form, and then call the code from a command button's event procedure?

What exactly is the syntax for the strfile... is it simply the path to
the mdb file? Or just the table and field name?

Where are lngCountLongWords, lngCountWords, lngSentence and lngWords
created? In the strfile? Not in a separate query?

I really am sorry for my incompetence, but beyond drag-and-drop
Access, I have VERY little experience...

Again - thank you in advance

Nov 13 '05 #4

"achristoffersen" wrote:
> A few more questions:
>
> Is it correct that I cannot use this code in a query? So I must make a
> form, and then call the code from a command button's event procedure?

You could turn the snippet of code into a function that returns a
value, but since I don't understand how you calculate readability, it's
difficult to tell you what the returned value should be.

> What exactly is the syntax for the strfile... is it simply the path to
> the mdb file? Or just the table and field name?

The code I wrote assumed that strFile contained the actual text of the
paragraph you're analysing. If you were to turn that code into a function,
I'd assume that you'd pass strFile into the function as a parameter:

    Function AnalysisText(TextToAnalyse As String) As ?

then you'd replace

    varSentences = Split(strFile, ".")

with

    varSentences = Split(TextToAnalyse, ".")

> Where are lngCountLongWords, lngCountWords, lngSentence and lngWords
> created? In the strfile? Not in a separate query?


Not quite sure what you mean by "created". Those four values are variables
used in the code. If you convert the code into a function, they'd be local
to the function: not used (or accessible) anywhere else.

lngSentence and lngWord (that was a typo in the original code: I use
lngWord, not lngWords, everywhere else) are simply looping variables. The
first Split function takes the paragraph that's supplied, and creates an
array out of it, with each element in the array representing a different
sentence in the paragraph. lngSentence is how the code loops through each
sentence. Similarly, the second Split function takes each sentence, and
creates an array of the words in the sentence, and lngWord loops through
each word.

lngCountWords and lngCountLongWords are the gist of the function: when all
of the loops are done, lngCountWords contains how many words are in the
paragraph, while lngCountLongWords contains how many of those words are
longer than the given length (9 letters in my sample code).
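
As a rough illustration of turning the snippet into a function, the untested sketch below wraps the long-word count so that it could be called from a query. The name CountLongWords, the Variant parameter, and the MinLength default of 7 (taken from the asker's "at least 7 letters" rather than the 9 used in the sample code) are assumptions for illustration, not part of the original code.

```vba
' Hypothetical helper: counts the words in TextToAnalyse that have
' at least MinLength characters.
' The parameter is a Variant so that a Null memo field does not raise
' an error when the function is called from a query.
Public Function CountLongWords(TextToAnalyse As Variant, _
                               Optional MinLength As Long = 7) As Long
    Dim varWords As Variant
    Dim lngWord As Long
    Dim strText As String

    ' A Null field has no words: leave the return value at 0
    If IsNull(TextToAnalyse) Then Exit Function

    ' Treat periods and line breaks as word separators so that
    ' "word." is not counted as one character too long
    strText = Replace(TextToAnalyse, ".", " ")
    strText = Replace(strText, vbCr, " ")
    strText = Replace(strText, vbLf, " ")

    varWords = Split(strText, " ")
    For lngWord = LBound(varWords) To UBound(varWords)
        If Len(varWords(lngWord)) >= MinLength Then
            CountLongWords = CountLongWords + 1
        End If
    Next lngWord
End Function
```

If the function is saved in a standard module, a query should be able to use it in a calculated field, for example LongWords: CountLongWords([artikel tekst]). Commas and other punctuation are still counted as part of a word here, so the result is only an approximation.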

If you need more help, you're going to have to provide details such as how
your data is stored in the database, and what calculation needs to be done
once you've determined how many words and how many "long words" are in the
snippet of text being analysed.

--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

Nov 13 '05 #5

> but since I don't understand how you calculate readability, it's
> difficult to tell you what that value that's returned should be.

The Danish readability number, Lix, is calculated on the basis of:
Number of words (W)
Number of periods (P)
Number of words with at least 7 letters (L)

Lix = W/P + (L/W x 100)

I.e. Lix = the number of words per sentence (periods in abbreviations
are counted too, but one can argue that since abbreviations are more
difficult to read, there is no need to correct for this) + the
percentage of long words in the text.

A Lix number of less than 24 is children's literature, and a number of
55 or more is difficult academic writing. 35-44 is normal, e.g. a newspaper
article.
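
To make the arithmetic concrete, here is a small sketch of the formula as a VBA function. The name LixScore and its parameter names are hypothetical, and returning Null when there are no words or no periods is an assumption about how an undefined score should be handled.

```vba
' Hypothetical helper: Lix = W/P + (L/W x 100)
Public Function LixScore(lngWords As Long, lngPeriods As Long, _
                         lngLongWords As Long) As Variant
    ' The score is undefined without words or periods (division by zero)
    If lngWords = 0 Or lngPeriods = 0 Then
        LixScore = Null
    Else
        LixScore = lngWords / lngPeriods + (lngLongWords / lngWords) * 100
    End If
End Function

' Worked example: 200 words, 10 periods, 40 long words
'   200 / 10       = 20 words per sentence
'   40 / 200 * 100 = 20 percent long words
'   LixScore(200, 10, 40) = 40, inside the 35-44 "normal" band
```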

Since I already have the number of periods and the number of words
returned by a quite simple query, my idea was to get the number of long
words returned, and then simply do the math in a separate query field.

Does this make sense?

The article is stored in a memo field ("artikel tekst") in
tbl_anonymiser, along with an autoincrement primary key ("artikelID")
and other fields.

Nov 13 '05 #6

My advice would be to write a function that accepts a string as input, and
then returns the Lix value.

You say you already know the number of words and the number of periods, and
I've shown you how to calculate the number of long words. Hopefully that's
enough to get you going.
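
One possible shape for such a function is sketched below, untested. The name LixFromText is hypothetical, and it recounts words and periods itself instead of reusing the asker's existing query, which is an assumption made only to keep the example self-contained. The threshold of 7 letters and the formula come from the asker's description.

```vba
' Hypothetical helper: returns the Lix value for a piece of text,
' or Null when the text is Null or has no words or no periods.
Public Function LixFromText(TextToAnalyse As Variant) As Variant
    Dim varWords As Variant
    Dim lngWord As Long
    Dim lngCountWords As Long
    Dim lngCountLongWords As Long
    Dim lngCountPeriods As Long
    Dim strText As String

    LixFromText = Null
    If IsNull(TextToAnalyse) Then Exit Function

    strText = TextToAnalyse

    ' Number of periods = characters lost when every "." is deleted
    lngCountPeriods = Len(strText) - Len(Replace(strText, ".", ""))

    ' Treat periods and line breaks as word separators
    strText = Replace(strText, ".", " ")
    strText = Replace(strText, vbCr, " ")
    strText = Replace(strText, vbLf, " ")

    varWords = Split(strText, " ")
    For lngWord = LBound(varWords) To UBound(varWords)
        ' Skip the empty elements left by consecutive separators
        If Len(varWords(lngWord)) > 0 Then
            lngCountWords = lngCountWords + 1
            If Len(varWords(lngWord)) >= 7 Then
                lngCountLongWords = lngCountLongWords + 1
            End If
        End If
    Next lngWord

    If lngCountWords > 0 And lngCountPeriods > 0 Then
        LixFromText = lngCountWords / lngCountPeriods _
                      + (lngCountLongWords / lngCountWords) * 100
    End If
End Function
```

Saved in a standard module, this could be used as a calculated query field such as Lix: LixFromText([artikel tekst]) against tbl_anonymiser. Punctuation other than periods stays attached to the words, so the score is an approximation.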

--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

Nov 13 '05 #7

> You say you already know the number of words and the number of periods, and
> I've shown you how to calculate the number of long words. Hopefully that's
> enough to get you going.

It's certainly enough to get me going :-) If I go crashing into a
concrete wall, I'll let you know. For now, I'll just say: thanks a
million!

Sincerely
Andreas

Nov 13 '05 #8
