Kadin2048's Weblog


Fri, 07 Dec 2007

As I recently mentioned, I’m a big fan of the “Unified Logging Format” for instant messaging logs. Unfortunately, of the two IM clients I use most — Adium on Mac OS X and CenterIM on Linux — only Adium uses it. CenterIM uses a simple flat-file format, delimited with linefeeds and formfeeds.

Since I’d really like to get all my logs in one place, in the same format, I wrote a little Python script to convert CenterIM’s flat-file format into something approximating ULF as implemented by Adium. It’s not perfect, and I’d suggest that persons with weak constitutions and functional programmers not look at the code, but it does seem to work fairly well on my logs. To get an idea of what it does, this is a snippet of a CenterIM log, showing an incoming message followed by an outgoing reply:

how's your day going?

Here “^L” represents the ASCII form-feed character. In Adium format / ULF, this might appear as:

<chat account="joeblow" service="AIM" version="0.4">
  <message sender="janedoe" time="2007-12-04T14:47:35-0000">hey</message>
  <message sender="joeblow" time="2007-12-04T15:34:38-0000">how's your day going?</message>
</chat>
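
For illustration, here is a minimal sketch of that kind of conversion. It is not the actual cimconverter.py. It assumes each CenterIM record starts after a form feed and holds, one per line, the direction (IN or OUT), an event type (MSG), two Unix timestamps, and then the message text. That field layout is an assumption, not something spelled out above, and the function names are made up for the example.

```python
import time
from xml.sax.saxutils import escape, quoteattr


def parse_records(text):
    """Yield (direction, timestamp, body) for each form-feed-delimited record.

    Assumed layout per record (hypothetical): direction, event type,
    two Unix timestamps, then the message text.
    """
    for chunk in text.split("\f"):
        lines = chunk.strip("\n").split("\n")
        if len(lines) < 4 or lines[0] not in ("IN", "OUT") or lines[1] != "MSG":
            continue  # skip empty chunks and anything that is not a message
        body = "\n".join(lines[4:]).strip()
        yield lines[0], int(lines[2]), body


def to_ulf(text, yoursn, theirsn, service):
    """Build one ULF-style <chat> element from a whole CenterIM history."""
    out = ['<chat account=%s service=%s version="0.4">'
           % (quoteattr(yoursn), quoteattr(service))]
    for direction, stamp, body in parse_records(text):
        # CenterIM only records IN or OUT, so the names come from the caller.
        sender = theirsn if direction == "IN" else yoursn
        when = time.strftime("%Y-%m-%dT%H:%M:%S-0000", time.gmtime(stamp))
        out.append("  <message sender=%s time=%s>%s</message>"
                   % (quoteattr(sender), quoteattr(when), escape(body)))
    out.append("</chat>")
    return "\n".join(out)
```

Calling `to_ulf(open("history").read(), "joeblow", "janedoe", "AIM")` would return the whole history as a single `<chat>` element. Message text is XML-escaped so that a stray `<` or `&` does not break the output.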

The major limitations the converter suffers from are a consequence of the differing structure of CenterIM’s logs and Adium’s. CenterIM stores chats in a single file for each contact, with one record for each message sent or received. ULF/Adium use one file per ‘conversation,’ which is apparently all the messages sent or received in a single window (i.e. when you close the window, a new conversation begins on the next message). CenterIM has no concept of conversations, only messages. This means that when you convert a CenterIM log to Adium’s format, Adium sees it as one long conversation, and it appears this way in Adium’s log viewer.

Also, while Adium and ULF store the account names of the conversation participants in the logs, CenterIM simply marks messages as ‘IN’ or ‘OUT’, requiring you to look at the log’s enclosing directory to get the name of the participant. Currently, my script doesn’t do this: it just expects the sender’s and receiver’s account names as command-line arguments.

The syntax is:
$ python cimconverter.py filename yoursn theirsn service
Where filename is the name of the log file you want to convert (usually “history”), yoursn is your screen or account name, theirsn is the account name of the person you had the conversation with, and service is the name of the IM service (AIM, MSN, etc.).
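
As a rough sketch of how those four positional arguments could be read, here is one way to do it with Python's standard argparse module. This is only an illustration of the command-line shape described above; the real script may handle its arguments differently, and `build_parser` is a name invented for the example.

```python
import argparse


def build_parser():
    """Return a parser for: cimconverter.py filename yoursn theirsn service."""
    parser = argparse.ArgumentParser(
        prog="cimconverter.py",
        description="Convert a CenterIM history file to Adium-style ULF.",
    )
    parser.add_argument("filename", help='log file to convert, usually "history"')
    parser.add_argument("yoursn", help="your own screen or account name")
    parser.add_argument("theirsn", help="account name of the other participant")
    parser.add_argument("service", help="IM service name, e.g. AIM or MSN")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.filename, args.yoursn, args.theirsn, args.service)
```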

At some point, I will try to fix it so it can grab more of the parameters (at least theirsn and service) from the history file’s path. But for now it’s just the bare minimum. I can’t guarantee that the output actually conforms to the ULF specification, since to my knowledge no formal specification exists; however, it does produce output that Adium’s log viewer processes and displays, and that’s basically the de facto standard at the moment.
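
Pulling the contact name out of the path might look something like the sketch below. It only relies on the fact that the history file sits in a directory named for the contact. How the service is encoded in that directory name is not covered here, so the prefix table is a caller-supplied assumption, and both `guess_contact` and the example path are hypothetical.

```python
from pathlib import PurePosixPath


def guess_contact(history_path, prefixes=None):
    """Guess (theirsn, service) from the directory enclosing a history file.

    `prefixes` maps a directory-name prefix to a service name. It is a
    hypothetical table supplied by the caller, not a documented CenterIM
    convention. Without a match, the service is returned as None.
    """
    dirname = PurePosixPath(history_path).parent.name
    for prefix, service in (prefixes or {}).items():
        if prefix and dirname.startswith(prefix):
            return dirname[len(prefix):], service
    return dirname, None


# Example with a made-up path and a made-up prefix table:
print(guess_contact("/home/me/.centerim/ajanedoe/history", {"a": "AIM"}))
```

If the guess fails, the script could still fall back to taking theirsn and service on the command line, as it does now.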


Thu, 06 Dec 2007

The “Unified Logging Format” is one of those ideas that you can’t believe hasn’t been done already. Its goal is simple: define a standard format for instant messaging logs that can be used across applications and across platforms, instead of the mishmash of poorly defined, application-specific microformats that exist today. Although currently only one IM program supports it (Adium for Mac OS X), and the informal standardization process seems to have stalled, it’s such a good idea and would benefit so many users that I really hope other developers will see the light.

If you’ve ever switched from one IM client to another, you’ve probably had to abandon all your old logs, or keep a copy of the old client around in case you ever wanted to look at them again. With ULF, this wouldn’t happen: you could take the logs from one IM client and move them over happily to a different one. In fact, rather than having each IM application manage and store its logs separately, you could define one location on your system for logs, and let various applications all dump their stuff there. Rather than having to read your logs through whatever minimal reader the client program provides, you could use a purpose-built log viewer (and there would probably be more of those, because writing a log viewer is a lot easier when you only have to worry about one log format).

Plus, it opens the door to log-file synchronization across multiple systems with ease, even if your computers are running different OSes (and thus use different IM clients). And since ULF is at its core a text-based XML format, it’s far more ‘future proof’ than an application-specific format that’s going to fade into obscurity and become unreadable once the application ceases to be developed.

There’s really no downside here at all.
