Well, it looks like an assignment, but I guess we can answer it now, a few months later, to help those who might find it through a search. A quick and straightforward solution could look like this:

    class TextHandler {

        constructor(txt) {
            // Only strings can be analysed.
            if (typeof txt !== 'string') {
                throw new TypeError("can't create object - param has to be a string");
            }

            this.txt = txt;
        }

        // Returns a fresh counter map with one entry per letter a-z.
        __initList() {
            return {
                a:0, b:0, c:0, d:0, e:0, f:0, g:0,
                h:0, i:0, j:0, k:0, l:0, m:0, n:0,
                o:0, p:0, q:0, r:0, s:0, t:0, u:0,
                v:0, w:0, x:0, y:0, z:0
            };
        }

        // Counts how often each letter occurs, ignoring case.
        get histogram() {
            let charHistogram = this.__initList();

            for (let i in charHistogram) {
                // Global, case-insensitive match for the current letter.
                let re = new RegExp('([' + i + '])', 'gi');
                let ma = this.txt.match(re);

                // match() returns null when the letter does not occur at all.
                charHistogram[i] = ma !== null ? ma.length : 0;
            }

            return charHistogram;
        }

    }

It should fulfill the minimum requirements that can be read from the post, and it should be adapted to whatever the real requirements turn out to be.
PS: Usage could look like:

    new TextHandler('foobar').histogram;

PPS: Since I was now curious :) it turns out that this task can be sped up considerably with a different approach that reduces the number of operations: in every loop step we shrink the text by removing the characters we have just counted, like this:

    get histogram() {
        let charHistogram = this.__initList();
        // Work on a local copy so this.txt stays untouched.
        let txt = this.txt;

        for (let i in charHistogram) {
            let le = txt.length;
            let re = new RegExp('([' + i + '])', 'gi');

            // Strip every occurrence of the current letter ...
            txt = txt.replace(re, '');

            // ... and count it by how much shorter the text became.
            charHistogram[i] = le - txt.length;
        }

        return charHistogram;
    }

That made creating the histogram much faster for longer texts: I roughly measured 5x faster execution times with this method (the text used for the measurement had 21793 characters). Note that this consumes the text as it goes, which is why I don't operate on this.txt directly but assign it to a local variable. A nice side effect is easy access to the residue at the end, i.e. everything we didn't count. In the above case that means special characters, digits and the like, since we only count what's in the initial list.
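To make the residue idea a bit more concrete, here is a minimal sketch of the same strip-and-count loop that also hands back what is left over. It is written as a standalone function so it can be run on its own; the name `histogramWithResidue` and the `{ histogram, residue }` return shape are just assumptions for illustration, not part of the class above.

```javascript
// Hypothetical helper (not part of the original class): counts a-z by
// stripping each letter from a working copy and returns the leftover text too.
function histogramWithResidue(text) {
  const histogram = {};
  let rest = text; // working copy; the string passed in is never changed

  for (const letter of 'abcdefghijklmnopqrstuvwxyz') {
    const before = rest.length;
    // Remove every upper- and lowercase occurrence of this letter.
    rest = rest.replace(new RegExp(letter, 'gi'), '');
    histogram[letter] = before - rest.length;
  }

  // Whatever survived all 26 passes was never counted.
  return { histogram, residue: rest };
}

const result = histogramWithResidue('Foo bar 42!');
console.log(result.histogram.o);             // 2
console.log(JSON.stringify(result.residue)); // "  42!"
```

The residue keeps the uncounted characters in their original order, so it could be fed into a second pass (for digits or punctuation, say) if the real requirements ask for that.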
PPPS: It could be optimized even further, probably only slightly, by ordering the counting map according to the typical frequency of letters in the language of the text. I ordered the list according to:
https://en.wikipedia.org/wiki/Letter_frequency
That squeezed out another few milliseconds per pass, though it was only noticeable once I increased the text size to 130758 characters. I got roughly 15% faster execution times just by starting with the letters e and t instead of a and b.
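For anyone who wants to try the ordering idea themselves, a rough sketch could look like the following. The letter order in `BY_FREQUENCY` is an approximate English ranking that I am assuming here, and `countLetters` and `timeIt` are hypothetical helper names. The sample text is a repeated pangram, which does not have natural letter frequencies, so swap in real text before reading anything into the timings.

```javascript
// Assumed ordering: English letters from most to least frequent (approximate).
const BY_FREQUENCY = 'etaoinshrdlcumwfgypbvkjxqz';
const ALPHABETICAL = 'abcdefghijklmnopqrstuvwxyz';

// Hypothetical helper: same strip-and-count loop, with the letter order passed in.
function countLetters(text, order) {
  const histogram = {};
  let rest = text;
  for (const letter of order) {
    const before = rest.length;
    // Frequent letters removed early leave less text for all later passes.
    rest = rest.replace(new RegExp(letter, 'gi'), '');
    histogram[letter] = before - rest.length;
  }
  return histogram;
}

// Rough timing helper; numbers vary a lot by machine and JS engine.
function timeIt(label, fn, passes = 20) {
  const start = Date.now();
  for (let i = 0; i < passes; i++) fn();
  console.log(label, (Date.now() - start) / passes, 'ms per pass');
}

// Placeholder input; replace with a real text of the language you care about.
const sample = 'The quick brown fox jumps over the lazy dog. '.repeat(3000);
timeIt('alphabetical', () => countLetters(sample, ALPHABETICAL));
timeIt('by frequency', () => countLetters(sample, BY_FREQUENCY));
```

Both orders produce the same counts; only the key order of the resulting object differs, so sort the keys afterwards if the output has to be alphabetical.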