A friend of mine discovered Times Haiku, and said she’d love to be able to check for haiku in her writing, challenging her friends to write a script to do it. I knew I could, having already learnt how to find the syllables in words using the built-in Mac speech synthesis while creating my robot choir. So I wrote Haiku Detector in my next free weekend, and found some haiku in Wikipedia, New Scientist, John Scalzi’s Old Man’s War, some scientific papers about the Higgs discovery, and my own blog. A few days later I made some improvements, and found more haiku in New Scientist, Edwin Abbott Abbott’s Flatland, Douglas Adams’ Last Chance to See, and some scientific works by Charles Darwin and J.G. M’Pherson. Since then I’ve found haiku in various other texts, including The Princeton Companion to Mathematics, the works of Lewis Carroll and Edgar Rice Burroughs, and several more issues of New Scientist. They can all be found in the Haiku Detector category.
You can download Haiku Detector for free; it should work in Mac OS X 10.7 and up.
Just paste or type text into the top part of the window, and any detected haiku will appear in the bottom part. Then you can copy individual haiku, copy them all (Copy All Haiku in the Edit menu) or save them to a file (Save in the File menu).
Haiku Detector looks for sentences or sequences of sentences with a total of seventeen syllables, and then goes through the individual words and checks whether the text can be split after the fifth and twelfth syllables without breaking a word in half. Then it double-checks that the last line still has five syllables, because sometimes the punctuation between words is pronounced. The Times Haiku-finding program has a database of syllable counts per word, but I didn’t need that since I can use the Mac OS X speech synthesis API to count the syllables. Haiku Detector makes no attempt to check for kigo (season words). At some point I intend to learn to use a natural language parsing library and add some kind of learning algorithm so that Haiku Detector can work out which candidates sound best as haiku, based on the word classes at the line breaks.
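To make the splitting step concrete, here is a minimal sketch in Python. It is not Haiku Detector’s actual code: the function name split_haiku and the tiny SYLLABLES table are made up for illustration, and the table is only a stand-in for the speech synthesis API, which is what really supplies the syllable counts. The sketch also leaves out the final double-check for pronounced punctuation.

```python
# Hypothetical stand-in for a real syllable counter (Haiku Detector itself
# asks the Mac speech synthesis API instead of using a lookup table).
SYLLABLES = {
    "an": 1, "old": 1, "silent": 2, "pond": 1, "a": 1, "frog": 1,
    "jumps": 1, "into": 2, "the": 1, "splash": 1, "silence": 2, "again": 2,
}


def count_syllables(word):
    """Look up a word's syllable count in the illustrative table."""
    return SYLLABLES[word.lower()]


def split_haiku(words, count=count_syllables):
    """Return three lines of 5, 7 and 5 syllables, or None if the words don't fit."""
    counts = [count(w) for w in words]
    if sum(counts) != 17:
        return None  # a haiku needs exactly seventeen syllables

    line_ends = (5, 12, 17)  # cumulative syllable totals where each line must end
    lines = [[], [], []]
    line = 0
    running = 0
    for word, n in zip(words, counts):
        if running + n > line_ends[line]:
            return None  # this word would straddle a line break
        lines[line].append(word)
        running += n
        if running == line_ends[line] and line < 2:
            line += 1  # the current line is full; move on to the next
    return [" ".join(ws) for ws in lines]


if __name__ == "__main__":
    text = "An old silent pond a frog jumps into the pond splash silence again"
    print(split_haiku(text.split()))
```

Passing the syllable counter in as a parameter means the same loop would work with any source of counts, whether a word database like the one Times Haiku uses or a speech synthesiser.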