Do you think text editors give us a correct word count? I thought so, until I accidentally discovered otherwise.
A while ago, I had to count the total number of words in Moby Dick’s first chapter. (Moby Dick is an American novel written by Herman Melville.)
To get that number, I tried Google Docs first; it gave me a word count of 1655 words. Next, out of curiosity, I tried Apple’s Pages, which gave a word count of 1679 words. Then, out of even more curiosity, I tried Grammarly, which reported 1648 words.
I was surprised. I was puzzled. And I ended up counting words by hand (for the curious, it came out to 1663 words, which matched none of the text editors’ counts).
I thought a basic problem such as counting words would be solved correctly by text editors in this age, and that I would get one answer no matter which one I tried. But no, that’s not the case.
Why is that? It’s possible that different vendors use different definitions of a word. How do you handle spaces in text? How do you handle hyphens, en dashes, or even em dashes? All of this matters.
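To make that concrete, here is a minimal Python sketch of three plausible definitions of a word. The function names and the sample sentence are made up for illustration; none of this is how Google Docs, Pages, or Grammarly actually count, which I don't know. It only shows that reasonable rules can disagree on the same text.

```python
import re

# Hypothetical sample text: a hyphenated word, two em dashes (\u2014),
# a contraction, and a free-standing hyphen surrounded by spaces.
SAMPLE = "A well-known whale\u2014or so they say\u2014can't be counted - not easily."

def count_whitespace(text):
    """A word is any run of non-whitespace characters."""
    return len(text.split())

def count_dash_split(text):
    """Whitespace, hyphens, en dashes and em dashes all separate words."""
    tokens = re.split(r"[\s\-\u2013\u2014]+", text)
    return len([t for t in tokens if t])

def count_alnum_runs(text):
    """A word is a run of letters, digits or underscores (regex \\w+)."""
    return len(re.findall(r"\w+", text))

if __name__ == "__main__":
    print(count_whitespace(SAMPLE))   # 11
    print(count_dash_split(SAMPLE))   # 13
    print(count_alnum_runs(SAMPLE))   # 14
```

One short sentence, three rules, three answers. The first rule counts the lone hyphen as a word and glues "whale" to "or"; the second splits "well-known" in two; the third also breaks "can't" at the apostrophe. Spread small disagreements like these over a whole chapter and a gap of a few dozen words is easy to imagine.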
Counting words may not seem like an exciting problem, but some people need a word count for their jobs (journalists or writers, for example). People rely on this. As an engineer working on software, you have to balance different things. Yes, you need things fast. Yes, you need good design. But you also need things to be correct.
When I told this little word-count story to some engineers, it didn’t thrill them. Many people I talked to thought this word-count problem wasn’t that “sexy.”
Boring problems matter. They should get some focus. They should get some attention. We paid some attention to this one and launched a word counter. The problem of parsing all kinds of formatted text looks simple, but it’s not. If you come across an issue, please let us know and we’ll try to solve it soon. We hope you find it useful.