Posts Tagged programming
I’ve made a new version of Haiku Detector. The main changes are:
- Performance improvements
- Tweaks to which haiku are identified when punctuation is pronounced differently depending on line breaks and other factors (this includes a workaround for the ‘all numbers pronounced as zero’ bug I found in the speech synthesiser.) In my test data the list of haiku identified is better now.
- Bug fixes.
To celebrate the new release, I fed in the text from the latest New Scientist ‘Collection’ issue, on medical frontiers. The funniest haiku arose when the last sentence of one article joined up with the headline and byline of the next. For example, this looks like the tagline of a movie about an underappreciated superhero, fighting to save anti-vaxxers from diseases of yore:
They will not thank you.
Dan Jones FIGHTING INFECTION
Small shot, big impact
After the opening credits, we see our hero Dan Jones in his lab, and the subtitle announcing his first challenge.
SOURCE: Deathstalker scorpion
His superpowers come, of course, from vaccines:
Some vaccines seem to
provide us with a host of
But not everybody is happy with that:
Several groups have been
trying to develop drugs
that block these signals.
These groups spread propaganda:
Half an hour or
so later, you’ll feel a lot
better. Or will you?
They work around rules:
“Because we use cells,
not field-grown plants, we don’t come
under the same rules.”
And they target humanity by zapping the very microorganisms they’re made up of. Here’s a quote from the evil mastermind:
There are more cells in
your body than there are stars
in the galaxy.
These cells can then be
killed using a laser that
penetrates the skin.
And just when Dan thought he had the solution, the problems compounded to the point of suspension of disbelief, precipitating a crisis. The mastermind had cooked up her own microbial minions:
Those microbes can be
in the environment or
a vaccine syringe.
To make matters worse,
there is a shortage of new
The sequel, which may or may not be a Doctor Who crossover, features a heroine who will live forever:
“Just endless.” Helen
Let’s get physical
Yep, it’s definitely a Doctor Who crossover. Here’s a quote from that movie:
“I’m the doctor. I’m
going to tell you what your
feelings really mean.”
She discovered that time, and specifically time travel, is the best cure for a broken heart:
If we can’t fix hearts
with stem cells there might be an
even better way
As the animal
was slowly warmed, it began
to return to life.
But however clever the TARDIS is, there’s one thing Helen Thomson isn’t sure she can do:
But can we ever
turn the clock back to a world
It turned out, weirdly enough, that the answer was in making sure there was enough shelf space for one’s awards. So she went home to Britain to save the Officers of the British Empire:
On her return home,
she applied those lessons in
So far, two patients
have had OBEs, but neither in
a room with a shelf…
While we’re making sequels, let’s revive an old favourite, which never had any sequels:
The matrix holds a
dazzling array of future
But what is the matrix?
is harvested from human
or pig cadavers.
I guess you have to see it for yourself.
I subjected Haiku Detector to some serious stress-testing with a 29MB text file (that’s 671481 sentences, containing 16810 haiku, of which some are intentional) a few days ago, and kept finding more things that needed fixing or could do with improvement. A few days in a nerdsniped daze later, I have a new version, and some interesting tidbits about the way Mac speech synthesis pronounces things. Here’s some of what I did:
- Tweaked the user interface a bit, partly to improve responsiveness after 10000 or so haiku have been found.
- Made the list of haiku stay scrolled to the bottom so you can see the new ones as they’re found.
- Added a progress bar instead of the spinner that was there before.
- Fixed a memory issue.
- Changed a setting so it should work in Mac OS X 10.6, as I said here it would, but I didn’t have a 10.6 system to test it on, and it turns out it does not run on one. I think 10.7 (Lion) is the lowest version it will run on.
- Added some example text on startup so that it’s easier to know what to do.
- Made it a Developer ID signed application, because now that I have a bit more time to do Mac development (since I don’t have a day job; would you like to hire me?), it was worth signing up to the paid Mac Developer Program again. Once I get an icon for Haiku Detector, I’ll put it on the app store.
- Fixed a few bugs and made a few other changes relating to how syllables are counted, which lines certain punctuation goes on, and which things are counted as haiku.
That last item is more difficult than you’d think, because the Mac speech synthesis engine (which I use to count syllables for Haiku Detector) is very clever, and pronounces words differently depending on context and punctuation. Going through words until the right number of syllables for a given line of the haiku are reached can produce different results depending on which punctuation you keep, and a sentence or group of sentences which is pronounced with 17 syllables as a whole might not have words in it which add up to 17 syllables, or it might, but only if you keep a given punctuation mark at the start of one line or the end of the previous. There are therefore many cases where the speech synthesis says the syllable count of each line is wrong but the sum of the words is correct, or vice versa, and I had to make some decisions on which of those to keep. I’ve made better decisions in this version than the last one, but I may well change things in the next version if it gives better results.
Here are some interesting examples of words which are pronounced differently depending on punctuation or context:
|ooohh||Pronounced with one syllable, as you would expect|
|ooohh.||Pronounced with one syllable, as you would expect|
|ooohh..||Spelled out (Oh oh oh aitch aitch)|
|ooohh…||Pronounced with one syllable, as you would expect|
|H H||Pronounced aitch aitch|
|H H H||Pronounced aitch aitch aitch|
|H H H H H H H H||Pronounced aitch aitch aitch|
|Da-da-de-de-da||Pronounced with five syllables, roughly as you would expect|
|Da-da-de-de-da-||Pronounced dee-ay-dash-di-dash-di-dash-di-dash-di-dash. The dashes are pronounced for anything with hyphens in it that also ends in a hyphen, despite the fact that when splitting Da-da-de-de-da-de-da-de-da-de-da-de-da-da-de-da-da into a haiku, it’s correct punctuation to leave the hyphen at the end of the line:
Though in a different context, where – is a minus sign, and meant to be pronounced, it might need to go at the start of the next line. Greater-than and less-than signs have the same ambiguity, as they are not pronounced when they surround a single word as in an html tag, but are if they are unmatched or surround multiple words separated by spaces. Incidentally, surrounding da-da in angle brackets causes the dash to be pronounced where it otherwise wouldn’t be.
|U.S or u.s||Pronounced you dot es (this way, domain names such as angelastic.com are pronounced correctly.)|
|U.S. or u.s.||Pronounced you es|
|US||Pronounced you es, unless in a capitalised sentence such as ‘TAKE US AWAY’, where it’s pronounced ‘us’|
I also discovered what I’m pretty sure is a bug, and I’ve reported it to Apple. If two carriage returns (not newlines) are followed by any integer, then a dot, then a space, the number is pronounced ‘zero’ no matter what it is. You can try it with this file; download the file, open it in TextEdit, select the entire text of the file, then go to the Edit menu, Speech submenu, and choose ‘Start Speaking’. Quite a few haiku were missed or spuriously found due to that bug, but I happened to find it when trimming out harmless whitespace.
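For anyone who wants to check their own text for the pattern that triggers this, here is a minimal Python sketch. It only looks for the sequence described above (two carriage returns, an integer, a dot, a space). The function names are made up for illustration, and swapping the carriage returns for newlines is an assumed workaround, not necessarily what Haiku Detector does.

```python
import re

# Two carriage returns (not newlines), then an integer, a dot, and a space.
ZERO_BUG_PATTERN = re.compile(r"\r\r(\d+)\. ")


def find_zero_bug_triggers(text):
    """Return the integers that follow two carriage returns and precede '. '."""
    return [match.group(1) for match in ZERO_BUG_PATTERN.finditer(text)]


def work_around(text):
    """Hypothetical workaround: swap the two carriage returns for newlines."""
    return ZERO_BUG_PATTERN.sub(lambda m: "\n\n" + m.group(1) + ". ", text)
```

Calling `find_zero_bug_triggers` on a document before feeding it to the synthesiser would at least show which numbers are at risk of being read as zero.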
Apart from that bug, it’s all very clever. Note how even without the correct punctuation, it pronounces the ‘dr’s and ‘st’s in this sentence correctly:
the dr who lives on rodeo dr who is better than the dr I met on the st john’s st turnpike
However, it pronounces the second ‘st’ as ‘saint’ in the following:
the dr who lives on rodeo dr who is better than the dr I met in the st john’s st john
This is not just because it knows there is a saint called John; strangely enough, it also gets this one wrong:
the dr who lives on rodeo dr who is better than the dr I met in the st john’s st park
I could play with this all day, or all night, and indeed I have for the last couple of days, but now it’s your turn. Download the new Haiku Detector and paste your favourite novels, theses, holy texts or discussion threads into it.
If you don’t have a Mac, you’ll have to make do with a few more haiku from the New Scientist special issue on the brain which I mentioned in the last post:
Being a baby
is like paying attention
with most of our brain.
But that doesn’t mean
there isn’t a sex difference
in the brain,” he says.
They may even be
a different kind of cell that
just looks similar.
It is easy to
see how the mind and the brain
We like to think of
ourselves as rational and
It didn’t seem to
matter that the content of
these dreams was obtuse.
I’d like to thank the people of the xkcd Time discussion thread for writing so much in so many strange ways, and especially Sciscitor for exporting the entire thread as text. It was the test data set that kept on giving.
Perhaps I will not post something interesting every day for the rest of the month, but I should at least try.
Today I watched this video from the Virtual Linguistics Campus:
After that, I intended to analyse some sentences myself, but I got sidetracked thinking of simple ways to make diagrams like the ones in the video. It looks like there are apps and LaTeX packages to do something like it, but just for fun, I modified the AppleScript I wrote for diagramming monduckens to turn text like this:
Clause(Adverb(Perhaps) NP(Noun(you)) VP(Auxiliary(will) Adverb(never) Verb(find) NP(Determiner(a) Noun(job)) PP(Preposition(as) NP(Determiner(a) Noun(linguist))))) Clause(Conjunction(but) Noun(you) VP(Auxiliary(should) Adverbial(at least) Verb(try)))
into a tree like this in OmniGraffle:
Note that I am not sure if this is strictly correct (I think the adverbial ‘at least’ could have been broken into words, and the conjunction perhaps shouldn’t have been included in the second clause) but it’s how it is in the video. Redone with only rectangles (which is an option when running the script) and using the exact same Tree nester script the monducken diagrams did, this can then be turned into a rather oversized and misaligned version of the sentence with rectangles around the constituents:
I didn’t have a lot of time, so it’s pretty crude as yet, but it would be fairly simple to adjust the settings of the shapes to be more like what’s in the video. I’m posting it now in order to continue with Holidailies.
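The bracketed notation above is simple enough to parse by hand. As a rough illustration (in Python rather than the AppleScript actually used, and not a port of it), a small recursive parser could turn the text into nested tuples before anything is drawn. The names `parse_forest` and `leaves` are invented for this sketch, and it assumes well-formed input.

```python
def parse_forest(text):
    """Parse 'Label(child child)' notation into (label, content) tuples.

    For a leaf, content is the word or phrase as a string;
    otherwise it is a list of child nodes.
    """
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_node():
        nonlocal pos
        start = pos
        while text[pos] != "(":
            pos += 1
        label = text[start:pos].strip()
        pos += 1  # consume "("
        # If a ")" comes before the next "(", this node holds plain words.
        close = text.index(")", pos)
        next_open = text.find("(", pos)
        if next_open == -1 or close < next_open:
            word = text[pos:close]
            pos = close + 1
            return (label, word)
        children = []
        skip_spaces()
        while text[pos] != ")":
            children.append(parse_node())
            skip_spaces()
        pos += 1  # consume ")"
        return (label, children)

    trees = []
    skip_spaces()
    while pos < len(text):
        trees.append(parse_node())
        skip_spaces()
    return trees


def leaves(node):
    """Collect the words at the bottom of a parsed tree, left to right."""
    label, content = node
    if isinstance(content, str):
        return [content]
    return [word for child in content for word in leaves(child)]
```

Because a leaf is read up to its closing bracket, a multi-word constituent such as `Adverbial(at least)` stays in one piece.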
While we’re on the subject of grammar, The Doubleclicks have just covered a Tom Lehrer song about adverbs. I get this song in my head every single time I answer a ‘how’ question with an L-Y adverb, so I am very happy about the cover.
Let’s synchronise our beating hearts and I’ll
lay open just for you my very soul,
secure that you would never take control.
So, [End Of File]
Well thank you for your frankness; I’ll compile
some poems of my own uncensored whole,
that you may take a key companion role,
and take this key to tour my domicile.
My dear, do you not see that you’ve been played?
My heart’s not big; I sent but lies to you,
and used you for your private information;
I felt inside your sockets and got laid.
I understood what hearts are meant to do
is bleed with force to drive the circulation.
For those who have been out in the real world for the last few days instead of living in an internet-enabled cave like the rest of us, there’s a serious bug in OpenSSL which allows private information to be leaked to malicious users in much the same way as illustrated in this poem. It means that you should probably change your passwords on any site that had the buggy version of OpenSSL installed, provided it has been fixed; if the site hasn’t been fixed yet, there’s no point changing your password since the new one could still be hacked. Here is one list of servers and their status with regard to this bug; there are probably others. The bug is called Heartbleed because it happens when a client sends a ‘heartbeat’ (to keep the connection alive) and pretends that it is sending more data than it actually is. The server doesn’t check this, so when it tries to respond with the same data, it sends back a random assortment of its own data, the size of what the client said it had sent.
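To make the mechanism concrete, here is a toy model in Python. It is nothing like the real OpenSSL code (which is C, and where the leaked bytes come from whatever happens to be in memory); the 'secrets' are made up and simply glued on after the payload to stand in for neighbouring memory.

```python
# Made-up data standing in for whatever sits next to the payload in memory.
SERVER_SECRETS = b"password=hunter2;key=0xDEADBEEF"


def heartbeat_reply(payload, claimed_length, check_bounds):
    """Echo back `claimed_length` bytes, as a toy heartbeat handler would."""
    # The payload is copied into a buffer that happens to sit beside secrets.
    memory = payload + SERVER_SECRETS
    if check_bounds and claimed_length > len(payload):
        # With the check in place, a heartbeat that lies gets no reply.
        return b""
    # Without the check, the server trusts the client's claimed length.
    return memory[:claimed_length]
```

An honest client sending `b"bird"` with a claimed length of 4 gets `b"bird"` back. A lying client claiming 12 bytes gets the payload plus the first eight bytes of the server's secrets, unless `check_bounds` is on.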
The ‘SSL’ in OpenSSL stands for ‘Secure Sockets Layer’, which is supposed to be what keeps secret information safe on the internet, but this bug made it more open than secure. I made sure to include the words (or derivatives thereof) ‘Open’, ‘secure’, ‘sockets’ and ‘layer’ in order (with an additional ‘lay’ for luck) in the poem, so that the lying no-good user is in fact an open, secure, sockets layer.
If you have been living in the right kind of cave, you might be interested in seeing the code change which caused the bug.
I’ve never understood what ‘bleeding heart’ was supposed to mean. Bleeding, forcefully and rhythmically, is the heart’s primary function. Maybe its only function, but you never can tell with biology. If there isn’t blood coming out of your heart, you’re in very bad shape. You should get that looked at even before changing your passwords.
Addendum: I should perhaps point out that the heartbeat has nothing to do with synchronising anything; that’s just a sappy thing lovers sometimes talk about which seemed like a good way to get heartbeats into the poem. Don’t expect anything in the first quatrain to be accurate; it’s a malicious SSL client talking. Also, here’s an article someone I know from JoCo Cruise Crazy wrote about Heartbleed, which seems like it has some useful links and information; I haven’t read it thoroughly yet, though, so for all I know it has a nice introduction and then an end of file marker.
A few weeks ago, a friend linked to Times Haiku, a website listing unintentional haiku found in The New York Times, saying ‘I’d actually pay for a script that could check for Haiku in my writings. That would make prose-production a lot more exciting! Who’s up to the script-writing-challenge?’
I knew I could do it, having written syllable-counting code for my robot choir (which I really need to create an explanation page about.) I told her I’d make it that weekend. That was last weekend, when I decided at the last moment to write an article about neutron stars and ISOLTRAP, and then chickened out of that and wrote a poem about it. So I put off the haiku program until yesterday. It was fairly quick to write, so here it is: Haiku Detector. It should work on Mac OS X 10.6 and above. Just paste or type text into the top part of the window, and any detected haiku will appear in the bottom part.
Haiku Detector looks for sentences with seventeen syllables, and then goes through the individual words and checks whether the sentence can be split after the fifth and twelfth syllables without breaking a word in half. Then it double-checks the last line still has five syllables, because sometimes the punctuation between words is pronounced. The Times Haiku-finding program has a database of syllable counts per word, but I didn’t need that since I can use the Mac OS X speech synthesis API to count the syllables. Haiku Detector makes no attempt to check for kigo (season words.)
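The splitting step can be sketched in a few lines of Python. This is only an illustration of the idea, not the app's code: the real thing asks the Mac speech synthesiser for syllable counts, whereas `count_syllables` here is a crude stand-in that counts runs of vowels and will be wrong for plenty of English words.

```python
import re


def count_syllables(word):
    """Crude stand-in: count runs of vowels. Speech synthesis does far better."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def split_into_haiku(sentence):
    """Return three lines of 5, 7 and 5 syllables, or None if no clean split."""
    targets = [5, 7, 5]
    lines, current, count = [], [], 0
    for word in sentence.split():
        if len(lines) == 3:
            return None  # words left over after seventeen syllables
        current.append(word)
        count += count_syllables(word)
        if count == targets[len(lines)]:
            lines.append(" ".join(current))
            current, count = [], 0
        elif count > targets[len(lines)]:
            return None  # a word would straddle the line break
    return lines if len(lines) == 3 else None
```

Because the lines must come out at exactly 5, 7 and 5, a sentence that passes is necessarily seventeen syllables by this counter. The sketch leaves out the final double-check against the synthesiser's pronunciation of each whole line, which is where the punctuation trouble comes in.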
The first place I looked for haiku was the Wikipedia page for Haiku in English. Due to the punctuation, it didn’t actually find any of the example haiku on the page, but it did find this:
Robert Spiess (Red Moon
Anthology, Red Moon Press,
How profound. Next, having declared myself contributing troubadour for New Scientist magazine, I fed this week’s feature articles through it, and found:
A pill that lowers
arousal doesn’t teach shy
people what to do
Meanwhile, there are signs
that the tide is turning in
favour of shyness.
So by 4000
years ago, the stage was set
for the next big step.
This heat makes the air
spin faster, so pulling the
storm towards the city.
Some will be cooler
and less humid — suitable
for outdoor sports, say.
The last ones seem almost seasonal.
I needed to stress-test the app with a large body of text, so I grabbed the first novel of which I had the full text handy: John Scalzi’s Old Man’s War, which I had on my iPad on my lap to read while my code was compiling. This book has at least one intentional haiku in it, which Haiku Detector detected. Apart from that, some of my favourites are:
I hate that her last
words were “Where the hell did I
put the vanilla.”
As I said, this is
the place where she’s never been
anything but dead.
“I barely know him,
but I know enough to know
he’s an idiot.”
She’d find me again
and drag me to the altar
like she had before.
A gaper was not
long in coming; one swallow
and Susan was in.
They were nowhere to
be found, an absence subtle
and yet substantial.
And it stares at me
like it knows something truly
strange has just happened.
I haven’t got up to that fifth one in the novel yet, but it mentions a swallow, which I understand is (when accompanied by more swallows) a harbinger of Spring or Summer depending on which language you get your idioms from, so there’s the kigo.
Next I figured I should try some scientific papers — the kinds of things with words that the Times haiku finder would not have in its syllable database. You probably can’t check this unless your workplace also provides access to Physics Letters B, but I can assure you that the full text of the ISOLTRAP paper about neutron stars does not contain any detectable haiku. However, the CMS paper announcing the discovery of the boson consistent with the Higgs does:
In the endcaps, each
muon station consists of
six detection planes.
As is usual for CMS papers, the author and institute lists are about as long as the paper itself, and that’s where most of the haiku were too. Here are a few:
LHC Higgs Cross Section
Working Group, in: S.
of California, Davis,
That’s ‘one hundred and two’ in case anyone who doesn’t say it that way was wondering.
And here are some from my own blog. I used the text from a pdf I made of it before the last JoCo Cruise Crazy, so the last few months aren’t represented:
Beds of ground cover
spread so far in front of him
they made him tired.
those who only understand
half of this poem.
I don’t remember
what colour he said it was,
but it was not green.
His eyes do not see
the gruesome manuscript scrawled
over the white wall.
• Lines 1 to 3 have
four syllables each, with stress
on the first and last.
(That’s not how you write a haiku!)
I don’t wear armour
and spikes to threaten you, but
to protect myself.
A single female
to perpetuate the genes
of a thousand men.
Kerblayvit is a
made-up placeholder name, and
a kerblatent cheat.
He wasn’t the first,
but he stepped on the moon soon
after Neil Armstrong.
He just imagined
that in front of him there was
a giant dunnock.
(there are plenty more where that one came from, at the bottom of the page)
She was frustrated
just trying to remember
what the thing was called.
Please don’t consider
this a failing; it is part
of your programming.
While writing this program, I discovered that the speech API now has an easier way to count syllables, which wasn’t available when I wrote the robot choir. The methods I used to separate the text into sentences and the view I used to display the haiku are also new. Even packaging the app for distribution was different. I don’t get to write Mac software often enough these days.
Yet again, I didn’t even bother to deal out the cards because I already had something to inspire me. In my halfhearted attempt to find a matching card, I came across one about electronics in the service of ALICE, so I ran the latest instalment of Probably Never, by Alice, into it, and got this:
Or well, I have to
put up with getting called a
fake girl all the time.
The jackhole who called
me a “he/she” recognized
that he crossed the line.
If that sounds interesting, subscribe to Probably Never, and I could probably forward you the rest of that episode if you want.
And finally, two unintentional haiku from this very post:
makes no attempt to check for
kigo (season words.)
(there are plenty more
where that one came from, at the
bottom of the page)
Wait; make that three!
And finally, two
from this very post:
Have fun playing with Haiku Detector, and post any interesting haiku you find in the comments. Also, let me know of any bugs or other foibles it has; I wrote it pretty quickly, so it’s bound to have some.
I know what I’m doing for the six of hearts; I’ve planned it for a long time but still haven’t actually started it. It’s musical, so it will probably be terrible; brace yourselves. By the way, I keep forgetting to mention, but They Might Not Be Giants will be published in Offshoots 12. Yay!
To be sung to the tune of My Favourite Things from The Sound of Music (though like in my other My Favourite Things parody, the structure is modeled more on various other parodies of that song.) Feel free to record yourself singing it so I don’t have to:
Catch all exceptions; what are they the heck for?
Just return nulls that the callers won’t check for,
or show an error box, if they insist,
brought back by loops every time it’s dismissed.
Checks and injection and joins are just theories;
just add more levels of nested subqueries,
lace all your filters with unescaped strings,
fetch from a multi-use table called THINGS.
Love the warning
all your huge source files;
they’re all just suggestions, there’s no need to test
as long as it all compiles.
Code reuse means not one code block is wasted —
ev’ry last one has been copied and pasted.
Make up for duplicates no more the same:
reclaim some space with a one-letter name.
I’ve used these same antipatterns since FORTRAN;
why should I listen to hacks I’m paid more than?
Even my students are older than you;
how dare you tell me I need code review?
Slam resource leaks
till you’re hoarse, geeks!
Rail against that kludge.
There’s no way to beat them; you’ll have to submit
to The Daily What The Fudge.
First, check out Vi Hart’s video about the Thanksgiving turduckenen-duckenen:
Okay, there are monkeys instead of turkeys, and the mathematics isn’t quite as explicit, but it’s pretty similar, don’t you think? Now, let’s imagine that Mike Phirman is actually singing the recipe for a fractal turducken, or rather, monducken. You can imagine all the monkeys are turkeys if you’d rather eat the result than present it to some pretty thing to please them. (Note: Please do not kill any actual monkeys.) Monkeys, like birds, belong in trees, so I wrote an AppleScript to draw binary trees in OmniGraffle based on the text of the song. You can try it for yourself if you like; all you need is a Mac, OmniGraffle, and a text file containing some words. See the bottom of this post for links and instructions.
If Mike’s reading the binary tree recipe layer by layer, like the first example in Vi’s video, one possible tree for the first stanza of Chicken Monkey Duck looks like this, where the orange ovals are monkeys, blue hexagons are chickens and green clouds are ducks. You can click it (or any other diagram in this post) for a scalable pdf version where you can read the words:
I added numbers so you can easily tell the chickens, monkeys and ducks apart and see which way to read the tree. It’s simple enough now, but the numbers will be useful for reading later trees which are not in such a natural reading order. This is called a breadth-first traversal of the tree, in case you’re interested. Now, what do birds and monkeys do in trees? They nest! So I wrote another script that will take any tree-like diagram in OmniGraffle and draw what it would look like if the birds, monkeys, or whatever objects they happen to be (the drawing is pretty abstract) were nested inside each other, just like the quails inside the chickens inside the ducks inside the turkey. This is what the monducken described by the first stanza of Chicken Monkey Duck, in the tree structure shown above, would look like:
The Monducken script allows using a different shape for each animal as redundant coding for colourblind people, even though it already chooses colours which most colourblind people should be able to distinguish. But that makes the nested version look a little messy, so here’s the above diagram using only ovals:
If you named this particular recipe in the other way, going down the left side of the tree and then reading each branch in turn in what is known as a pre-order traversal, it would be called a Monenmonenduckduckmon-monmonducken-enenmonduckmon-enmonduck-enduckmonducken-enmonen-duckenenmon-monenmon. It doesn’t sound nearly as nice as Turduckenailailenailail-duckenailailenailail because Mike Phirman didn’t take care to always put smaller animals inside large ones. I’m not holding that against him, because he didn’t realise he was writing a recipe, and besides, it’s his birthday. For reasons I’m not sure I can adequately explain, it’s always his birthday.
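Here is a small Python sketch of those two readings, for anyone who would rather see them as code than as poultry. It is not the AppleScript, just the same idea: the list is taken to be the recipe in breadth-first order, so the animal at position i contains the ones at positions 2i+1 and 2i+2. The bird list in the usage note is invented for the example.

```python
def children(index, size):
    """Indices of the two children of node `index` in a breadth-first list."""
    return [i for i in (2 * index + 1, 2 * index + 2) if i < size]


def pre_order(animals, index=0):
    """Read the tree root first, then the left branch, then the right."""
    if index >= len(animals):
        return []
    result = [animals[index]]
    for child in children(index, len(animals)):
        result += pre_order(animals, child)
    return result


def nest(animals, index=0):
    """Show each animal with the animals stuffed inside it in brackets."""
    inside = [nest(animals, child) for child in children(index, len(animals))]
    if not inside:
        return animals[index]
    return animals[index] + "[" + " ".join(inside) + "]"
```

Given the breadth-first list `["turkey", "duck", "duck", "hen", "hen", "hen", "hen"]`, `pre_order` reads it as turkey, duck, hen, hen, duck, hen, hen, and `nest` gives `turkey[duck[hen hen] duck[hen hen]]`.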
But what if I completely misunderstood the song, and his recipe is already describing the fractal monducken as a pre-order traversal, always singing a bird or monkey immediately before the birds and monkeys inside it? Well, don’t worry, I added a ‘pre-order’ option to the script, so you can see what that would look like. Here’s the tree:
and here’s how the actual birds/monkeys would look if you cut them in some way that showed all the animals, dyed them the correct colours, and looked through something blurry (here’s the version with different shapes):
Okay, but that’s only the first stanza. What if we use the whole song? If we pretend the recipe is breadth-first, this just means all the extra monkeys and birds will be at the bottom levels of the tree, so the outer few layers of our monducken will be the same, but they’ll have a whole lot of other things inside them:
Here’s a close-up. Isn’t it beautiful?
If the entire song were treated as a pre-order monducken recipe, we’d still have the same monkey on the outside, but the rest would be quite different:
We could also read the birds and monkeys from left to right, as Vi did in her video. That’s what’s called an in-order tree traversal. But as delicious as they are mathematically, none of these orderings make much sense from a culinary perspective. Even if the monkeys were turkeys, it’s obvious that a nice big goose should be the outer bird. Vi suggested that herself. Of course, we could put the goose on the outside simply by reversing the song so it started with goose. But it would be much more fun and practical to pretend that Mike is naming the two inner birds before the one that contains them. This is called a post-order traversal, because you name the containing bird after the two birds or monkeys it will contain. It makes sense for a recipe. First you prepare a monkey (or turkey) and a chicken, then you immediately prepare a chicken and put them into it. You don’t have your workspace taken up with a whole lot of deboned birds you’re not ready to put anything into yet. Here’s one way the recipe could be done:
Note that no matter what kind of traversal we use, there are actually several ways the recipe could be interpreted. If Mike says ‘monkey chicken chicken’ you know you should take a monkey and a chicken and put them in a chicken. But if the next words are ‘monkey chicken’, do you take that stuffed chicken and a monkey and put them inside a chicken? Do you debone the monkey and the chicken and wait for the next bird to find out what to put them into? What if there’s no next bird? What if there’s only one more bird (let’s say a duck) and you end up with a stuffed chicken, a stuffed duck, and nothing to stuff them into? You’d have to throw one of them out, because obviously your oven only has room for one monducken. Assuming you want two things in each thing, and you don’t know how long the song’s going to be, the best way to minimise this kind of problem is to always take your latest stuffed thing and the next, unstuffed thing, and put them inside the thing after that. The worst that’ll happen is you’ll have to throw out one unstuffed bird or monkey. But then you end up with a really unbalanced monducken, with a whole lot of layers in one part and lonely debonely birdies floating around in the rest.
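That 'latest stuffed thing plus the next unstuffed thing' strategy is easy to write down. The Python sketch below is an illustration of that strategy only, not what my scripts actually do. It reads the recipe as post-order and reports any bird or monkey left over at the end; `stuff_greedily` is a made-up name.

```python
def stuff_greedily(animals):
    """Interpret a post-order recipe one pair at a time.

    Returns (monducken, leftover), where leftover is an unstuffed animal
    that had nothing to go into, or None.
    """
    if not animals:
        return None, None
    current = animals[0]
    rest = animals[1:]
    # Pair each next unstuffed animal with the container named after it.
    for filling, container in zip(rest[0::2], rest[1::2]):
        current = f"{container}[{current} {filling}]"
    # An odd animal at the end has no container and gets thrown out.
    leftover = rest[-1] if len(rest) % 2 else None
    return current, leftover
```

The brackets get one level deeper with every pair, which is exactly the lopsidedness described above: everything piles up inside the first position while the fillings stay lonely.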
It helps to have a robot chef on hand to figure out how many full layers of monducken you can make without it being too asymmetric. Mine makes the trees completely balanced as deeply as possible, and then does whatever was easiest to program with the remaining birds and monkeys. In this case it was easiest for my program to stuff a whole lot of extra animals into that one monkey on the left. This is what it looks like, with the varied shapes this time. Luckily, geese are rectangular, so they fill your oven quite efficiently:
I like how you can see the explosion of duck radiating out from the inner left, engulfing all the other birds and monkeys before itself being swallowed by a goose. Such is life.
If you would like to make diagrams like this yourself, there are two AppleScripts you can use. Both of them require OmniGraffle 5 for Mac, and if you want to make trees with more than 20 nodes you’ll probably need to register OmniGraffle.
The first is Monducken diagrammer, which you can download either as a standalone application (best if you don’t know what AppleScript is) or source code (if you want to tweak and critique my algorithms, or change it to use OmniGraffle Professional 5 instead of OmniGraffle 5.) Because it’s AppleScript, it works by telling other applications what to do, rather than doing things itself. So when you run it, TextEdit will ask you to open the text file you want to turn into a tree. Once you’ve opened one, OmniGraffle will start up (you may need to create a new document if it’s just started up) and ask you two things. First it will ask what kind of tree traversal the text file represents. Then it will ask you what kinds of shapes you want to use in your tree. You can select several shapes using the shift and command keys, just as you would for selecting multiple of just about anything on your Mac. Then you can sit back and watch as it creates some shapes and turns them into a tree.
The other one is Tree nester (standalone application/source code). You should have an OmniGraffle document open with a tree-like diagram in it (I suggest a tree generated using Monducken diagrammer; it has not been tested on anything else, and will probably just duplicate most of the shapes that aren’t trees or end up in an infinite loop if there’s a loopy tree) before you run this. It won’t ask any questions; it’ll just create a new layer in the front OmniGraffle document and draw nested versions of any trees into that layer.
If you’re looking at the source code, please bear in mind that I wrote most of this while on a train to Cologne last weekend, based on some code I wrote a while ago to draw other silly diagrams, and I really only dabble in AppleScript, and I forgot about the ‘outgoing lines’ and ‘incoming lines’ properties until I’d almost finished, so it probably isn’t the best quality AppleScript code. Not the worst either though. I welcome any tips.
I’ve mentioned before that I have grapheme-colour synaesthesia. That means that I intuitively associate each letter or number with a colour. The colours have stayed the same throughout my life, as far as I remember, and they are not all the same colours that other grapheme-colour synaesthetes (such as my father and brother) associate with the same letters. I still see text written in whichever colour it’s written in, but in my mind it has other colours too. If I have to remember the number of a bus line, there’s a chance I’ll remember the number that goes with the colour it was written in rather than the correct number, or I’ll remember the correct number and look in vain for a bus with a number written in that colour.
Well, I’ve been wondering whether it could work the other way.
- Could grapheme-colour synaesthetes learn to look at a sequence of colours that correspond to letters in their synaesthesia, and read a word?
- Could this be used to send code messages that only a single synaesthete can easily read?
- Could colours be used to help grapheme-colour synaesthetes learn to read a new alphabet, either one constructed for the purposes of secret communication, or a real script they will be able to use for something?
- What would be the difference in learning time for a grapheme-colour synaesthete using their own colours for the replacement graphemes, a grapheme-colour synaesthete using random colours, and a non-synaesthete?
I know that for me, there are quite a few letters with similar colours, and a few that are black or white, so reading a novel code wouldn’t be infallible, but I suspect I would be able to learn a new alphabet a little more easily or read it more naturally if it were presented in the ‘right’ colours. I wonder whether the reason the Japanese symbol for ‘ka’ seemed so natural and right to me was that it seemed to be the same colour as the letter k.
It occurred to me that, as a programmer and a grapheme-colour synaesthete, I could test these ideas, or at least come up with some tools that scientists working in this area could use to test them. So I wrote a little Mac program called Synaesthetist. You can download it from here. In it, you choose the colours that you associate with different letters (or just make up some if you don’t have grapheme-colour synaesthesia and you want to know what it’s like) and save them to a file.
Then you can type in some text, and you’ll see the text with the letters in the right colours, like so:
But even though this sample is using the ‘right’ colours for the letters, it still looks all wrong to me. When I think of a word, usually the colour of the word is dominated by the first letter. So I added another view with a slider, where you can choose how much the first letter of a word influences the colours of the rest of the letters in the word.
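A minimal sketch of how such a slider could work, assuming colours are stored as RGB triples and that the mix is a plain linear blend. The app's actual mixing method isn't described here, and the names `blend`, `colour_word` and `palette` are made up for illustration.

```python
def blend(first, own, influence):
    """Linearly mix two RGB colours.

    influence = 0 keeps the letter's own colour,
    influence = 1 replaces it with the first letter's colour.
    """
    return tuple(round(o + (f - o) * influence) for f, o in zip(first, own))


def colour_word(word, palette, influence, default=(0, 0, 0)):
    """Return a list of (letter, colour) pairs for one word.

    `palette` maps lowercase letters to RGB triples (a hypothetical
    stand-in for a saved colour file); unknown letters get `default`.
    """
    if not word:
        return []
    first = palette.get(word[0].lower(), default)
    return [
        (ch, blend(first, palette.get(ch.lower(), default), influence))
        for ch in word
    ]


# Example palette with invented colours: c is red, a is blue, t is green.
palette = {"c": (255, 0, 0), "a": (0, 0, 255), "t": (0, 255, 0)}
for letter, colour in colour_word("cat", palette, influence=0.2):
    print(letter, colour)
```

With the slider at 0 every letter keeps its own colour, and at 1 the whole word takes on the colour of its first letter. A straight RGB blend can produce muddy in-between colours that resemble neither original.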
This shows reasonably well what words are like for me, but sometimes the mix of colours doesn’t really resemble either original colour. It occurred to me that an even better representation would be to have the letters in their own colours, but outlined in the colour of the first letter. So I added that:
Okay, so that gives you some idea of what the words look like in my head. And maybe feeding text through this could help me to memorise it. Here’s an rtf file of the lyrics to Mike Phirman’s song ‘Chicken Monkey Duck’ in ‘my’ colours, with initial letter outline. I’ll study these and let you know if it helps me to memorise them. To be scientific about it, I really should recruit another synaesthete (who would have different colours from my own, and so might be hindered by my colours) and a non-synaesthete to try it as well, and define exactly how much it should be studied and how to measure success. But I’m writing a blog, not running a study, so if you want to try it, download the file. (I’d love it if somebody did run a study to answer some of my questions, though. I’d add whatever features were necessary to the app.)
But these functions don’t go too far in answering the questions I asked earlier. How about reading a code? Well, I figured I’d be more likely to intuit letters from coloured things if they looked a little bit like letters: squiggles rather than blobs. So first I added a view that simply distorts the letters randomly by an amount that you can control with the slider. I did this fairly quickly, so there are no spaces or word-wrapping yet.
I can’t read it when it gets too distorted, but perhaps it’s easier to read at low distortion than it would be if the letters were all black. Maybe I’d be able to learn to ‘read’ the distorted squiggles based on colour alone, but I doubt it. This randomly distorts the letters every time you change the distortion amount or change the text, and it doesn’t keep the same form for each occurrence of the same letter. Maybe if it did, I’d be able to learn and read the new graphemes more easily than a non-synaesthete would. Okay, how about just switching to a font that uses a fictional alphabet? Here’s some text in a Klingon font I found:
I know that Klingon is its own language, and you can’t just write English words in Klingon symbols and call it Klingon. But the Futurama alien language fonts I found didn’t work, and Interlac is too hollow to show much colour.
Anyhow, maybe with practice I’ll be able to read that ‘Klingon’ easily. I certainly can’t read it fluently, but even having never looked at a table showing the correspondence between letters and symbols, I can figure out some words if I think about it, even when I copy some random text without looking. I intend to add a button to fetch random text from the web, and hide the plain text version, to allow testing of reading things that the synaesthete has never seen before, but I didn’t have time for that.
Another thing I’ll probably do is add a display of the Japanese kana syllabaries using the consonant colour as the outline and the vowel colour as the fill.
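Here is a rough sketch of how that outline-and-fill idea might be worked out, assuming each kana is given by its romanised spelling (such as ‘ka’) and that letter colours are RGB triples. The function name `kana_colours`, the palette, and the rule for syllables with no consonant or no vowel are all assumptions, not anything the app does yet.

```python
VOWELS = set("aeiou")


def kana_colours(romaji, palette, default=(0, 0, 0)):
    """Return (outline, fill) colours for one kana given in romaji.

    Outline comes from the first consonant letter, fill from the vowel.
    Assumed fallback: if either part is missing (as in 'a' or 'n'),
    the same colour is used for both outline and fill.
    """
    romaji = romaji.lower()
    if not romaji:
        raise ValueError("romaji must not be empty")
    consonants = [c for c in romaji if c not in VOWELS]
    vowels = [c for c in romaji if c in VOWELS]
    if vowels:
        fill = palette.get(vowels[-1], default)
    else:
        fill = palette.get(consonants[0], default)
    outline = palette.get(consonants[0], default) if consonants else fill
    return outline, fill


# Invented colours, for illustration only.
palette = {"k": (200, 30, 30), "a": (30, 30, 200), "n": (0, 120, 0)}
print(kana_colours("ka", palette))
```

Multi-letter consonants such as ‘sh’ or ‘ts’ are simply reduced to their first letter here, which may or may not match how a given synaesthete perceives them.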
Here’s a screenshot of the whole app:
As I mentioned, you can download it and try it for yourself. It works on Mac OS X 10.7, and maybe earlier versions too. To use it, either open my own colour file (which is included with the download) or create a new document and add some characters and colours in the top left. Then enter some text on the bottom left, and it will appear in all the boxes on the right side. If you change the font in the bottom left, say to a Klingon font, it will change in all the other displays except the distorted one.
This is something I’ve coded fairly hastily on the occasional train trip or weekend, usually forgetting what I was doing between stints, so there are many improvements that could be made, and several features already halfway developed. It could do with an icon and some in-app help, too. I’m still working on this, so if you have any ideas for it, I’m all ears.
In the film Spider-Man 3, escaped convict Flint Marko jumps over a fence marked:
Particle Physics Test Facility
And ends up getting caught in some kind of beam and becoming the Sandman, a being made out of sand who can change his shape at will. I watched it in the theatre with about a dozen people from CERN (all of them named Maikel), and one of them exclaimed, ‘Run to building 40, get a coffee!’
Unfortunately, you won’t turn into the Sandman by sneaking into CERN. But you might just turn into something like the Silver Surfer. Well, okay, maybe you wouldn’t travel faster than light, but you could levitate. I finally got to do so on their superconducting scooter at the Supra Show to celebrate 100 years of superconductivity a couple of weeks ago:
And you don’t even need to jump a fence! Just keep an eye on CERN’s homepage and MaNEP’s homepage, and sign up to the Globe’s mailing list to find out when there will be interesting talks and demonstrations for the general public. There are also a few other events coming up where it might make an appearance. I’ve seen the scooter at a couple of different events, and I don’t know how often they bring it out, but there are many other interesting talks and demonstrations.
There’s more information on how the superconducting scooter works in the video description. It’s essentially superdiamagnetism, as far as I know. Doesn’t quite have the same ring to it as Superman, but hey, it’s real! Welcome to the future. Here’s a nice explanation which begins with a Superman reference. Incidentally, you don’t have to be a superconductor to levitate due to diamagnetism. Even frogs can levitate, but it’s not easy.
Of course, the other way you could become a superhero is by using Generic™ brand hair gel.
By the way, the song in that video is Liquid Nitrogen, by CERN’s other LHC, Les Horribles Cernettes. My other superpower is knowing a song about almost every topic. Today, somebody brought up Malcolm Gladwell’s idea that becoming an expert at something takes 10 000 hours of practice, so I decided to find out how much time I’ve spent listening to funny music. I wrote an AppleScript to sum up the time spent listening to the selected songs in iTunes, and selected all the songs in my Silly Songs playlist. Alas, I have only listened to it for 3026 hours, at least since April 2005 when I dropped my iPod and lost all that information. So if it turns out there’s something I don’t have a song about, it’s because I’m not an expert. I am an expert on all of my music, including the ‘normal’ stuff, though, with 11 242 hours.
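The arithmetic behind a total like that is simple enough to sketch. This is not the AppleScript itself, just an illustration in Python, assuming each track contributes its duration multiplied by its play count (which ignores skipped or partly played tracks). The track data and function names below are invented.

```python
def hours_listened(tracks):
    """Sum listening time for (duration_in_seconds, play_count) pairs."""
    total_seconds = sum(duration * plays for duration, plays in tracks)
    return total_seconds / 3600


def fraction_of_target(hours, target_hours=10_000):
    """How far along the 10 000-hour mark a given total is."""
    return hours / target_hours


# Invented sample data: a 3-minute song played 10 times
# and a 4-minute song played 5 times.
sample = [(180, 10), (240, 5)]
print(hours_listened(sample))

# The two totals mentioned above, as fractions of 10 000 hours.
print(fraction_of_target(3026))
print(fraction_of_target(11242))
```

By this measure, 3026 hours is a little under a third of the way to 10 000.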
Back to superheroes: Could somebody who understands more about the relationship of electric power to superconductivity please make a joke involving Spider-Man’s ‘with great power comes great responsibility’? As far as I can tell, with great power comes the same great power, circulating forever, but that’s not very funny. Just like immortality without immunity to pain isn’t very funny after the Sun burns out, when you’re just floating through space for eons on end, occasionally getting stuck inside a star or black hole until it goes supernova or evaporates.
Addendum: I finally wrote a short story about that last sentence.
Addendum 2: Someone I know only as arthurd006_5 suggests ‘with great power comes great coercivity’ but isn’t sure whether that works electromagnetically. It does sound nice though, and outside of electromagnetism, great coercion seems to come with great power.
I’m a bit of a free music junkie. Free as in beer (or doughnuts, since I don’t like beer) is good, free as in speech is better, but this post is about the free as in doughnuts kind, which costs nothing until you get a taste for doughnuts and then end up buying out the whole Krispy Kreme, travelling around the world to have different doughnuts with different people, and getting too fat for your iPod. Download free music responsibly, kids (okay, I guess the beer metaphor would have made more sense.) Anyway, back to free music. One way I discover a lot of music is through podcasts which regularly publish individual songs. However, I use iTunes, and iTunes gives podcast tracks the name and artist given in the podcast feed (often taken from the title of a blog post) over whatever was set in the ID3 tags of the mp3 file itself. This might be a good idea for non-music podcasts, and maybe some music podcasts, where the details aren’t necessarily filled out, but for some of the music podcasts I subscribe to it doesn’t really work out. Particularly if there’s a blog post associated with each podcast episode, the title tends to include the artist name and sometimes some other information.
I can’t be bothered fixing all of the tracks manually, so a few years ago I wrote a few AppleScripts to fix up the metadata of the music podcasts I was subscribed to, and also add the tracks to my Songs playlist (which I use as the basis of most of my smart playlists) and turn off the ‘Remember position’ and ‘Skip while shuffling’ options that are turned on by default for podcast tracks. I’ve since subscribed to and made scripts to fix a few more music podcasts, and it occurred to me that other people might find the scripts useful, so I’ve just tidied up the code and added a way to choose which playlist to add the tracks to. There are links to the scripts and related podcasts below.
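The core of that kind of metadata fix can be sketched in a few lines. This is only an illustration in Python, not the AppleScripts themselves, and it assumes episode titles of the form ‘Artist - Song Title’; real feeds vary, which is why each podcast needs its own script. The function name and the sample titles are hypothetical.

```python
import re

# Assumed separator: a hyphen or en dash with whitespace on both sides.
SEPARATOR = re.compile(r"\s+[-–]\s+")


def split_episode_title(episode_title):
    """Split 'Artist - Song Title' into (artist, title).

    Returns (None, title) when no separator is found, so the caller
    can leave the existing artist tag alone.
    """
    cleaned = episode_title.strip()
    parts = SEPARATOR.split(cleaned, maxsplit=1)
    if len(parts) == 2 and all(parts):
        return parts[0], parts[1]
    return None, cleaned


print(split_episode_title("Some Artist - Some Song"))
print(split_episode_title("A Title With No Separator"))
```

Only the first separator is used, so a song title that itself contains a spaced hyphen stays in one piece.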