Generalized Metadata in your Palm

Norman Walsh

Staff Engineer
Sun Microsystems, XML Technology Center

15 Aug 2002

Extreme Markup Languages
04 - 09 August, 2002

Montreal, Quebec, Canada

Abstract

This paper describes a system for integrating generalized RDF metadata into the standard Palm™ databases.

$Id: palmrdf.xml,v 1.6 2002/08/15 20:26:17 ndw Exp $


Table of Contents

1. Ad-Hoc Metadata
1.1. Syncing with Ad-Hoc Metadata
1.2. Is There a Better Way?
2. RDF Metadata
2.1. Using RDF on the Palm
2.1.1. Using N3
2.2. Syncing with RDF
3. Examples
4. The Palm Ontology
4.1. Other Ontologies
5. Conclusions
6. Future Work
References

Like many people, I rely on my Palm™ organizer to keep my life in order. It is the definitive source for information about my schedule, my address book, and my outstanding “todo” items. It also stores various bits of miscellany in memos. These are the four standard applications in the Palm series of PDAs. There are literally thousands of other applications available for the Palm, but I won't be considering them here.

One of the first things that I wanted from my Palm was access to the data in XML. I've never liked GUIs much and I certainly didn't want to rely on a proprietary, unprogrammable application to access my data.

I was initially unsuccessful, but after I moved my laptop to Linux, I found the [PilotManager] tools and I was eventually able to write [XML Conduits]. This was a great step forward:

[Netscape screengrab of Palm data on the Web]

Figure 1. Web View of my Palm Calendar

Moving my data to the web, especially to share it with colleagues, revealed a weakness in the Palm that I hadn't previously noticed: there is no easy way to associate records in one database with another.

Consider the simple example of travel to the Extreme conference. I have travel plans to come to Montreal for a number of days (in my schedule) where I'll be staying at a particular hotel (in my address book). When I publish this data on the web, I'd like the phone number of the hotel to be displayed along with the information about where I'll be traveling. A simple problem, but not one that the Palm is designed to solve.

Obviously a mechanism is needed for associating the phone number of my hotel with the date book entry for my trip. How might that be done?

  1. Copy the phone number into the description of the event.

    That makes display easy, but it clutters up the display on the PDA and doesn't provide any means to distinguish hotel phone numbers from other phone numbers that might be relevant. Besides, it gives the impression that the phone number is somehow for the event.

  2. Copy the phone number into the “note” field of the date book entry.

    This works, but the note field is unstructured, so there's no easy way to get the phone number back out.

  3. Use a structured note field: add XML or some structured text to the note field so that it can be parsed later.

    I suppose ideally one could use XML in the note field, but practically it's too tedious: entering XML with PDA gestures is a struggle and the data isn't parsed at input time so errors aren't identified when they're made.

    Using a semi-formal structure works reasonably well.

These three possibilities are shown below:

Palm snapshot: phone number in event description
Palm snapshot: phone number in note
Palm snapshot: phone number in structured note

1. Ad-Hoc Metadata

The first release of my XML conduits used an ad-hoc solution like the one described above. When the event record was stored in XML, the note field was examined for user fields, and they were copied into the XML as unique elements. An example record is shown in Example 1.

<appointment id="id55" event="1" category="Conference">
  <repeat weekstart="0" frequency="1" type="Daily">
    <end year="2002" month="8" day="9" untimed="1"/>
  </repeat>
  <user-field name="hotel">Hotel Wyndham Montreal</user-field>
  <user-field name="phone">514.285.1450</user-field>
  <note>User-fields:
hotel: Hotel Wyndham Montreal
phone: 514.285.1450
</note>
  <description>Extreme Markup Lang</description>
  <begin year="2002" month="8" day="4" untimed="1"/>
</appointment>

Example 1. XML Representation of an Event with User Fields
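
The extraction step described above can be sketched in a few lines of Python. This is only an illustration, not the code from the actual conduit: the function names are hypothetical, and it assumes that the block starts at a "User-fields:" line (as in Example 1) and ends at the first blank line or at the end of the note.

```python
from xml.sax.saxutils import escape, quoteattr


def parse_user_fields(note):
    """Return (name, value) pairs found after a 'User-fields:' line."""
    fields = []
    in_block = False
    for line in note.splitlines():
        stripped = line.strip()
        if not in_block:
            if stripped.lower() == "user-fields:":
                in_block = True
            continue
        if not stripped:
            # Assumption: a blank line ends the user-field block.
            break
        name, sep, value = stripped.partition(":")
        if sep:
            fields.append((name.strip(), value.strip()))
    return fields


def user_field_elements(note):
    """Serialize each user field as a <user-field> element string."""
    return [
        f"<user-field name={quoteattr(name)}>{escape(value)}</user-field>"
        for name, value in parse_user_fields(note)
    ]


note = "User-fields:\nhotel: Hotel Wyndham Montreal\nphone: 514.285.1450\n"
for element in user_field_elements(note):
    print(element)
```

Because the note itself is left untouched, the generated elements are a read-only view of the note field.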

1.1. Syncing with Ad-Hoc Metadata

This solution adds the user fields to the XML record for styling, but leaves them in the note field for synchronization. In other words, the user-field values are read-only; to change the Palm record, you have to edit the note field. The publication process for this system is summarized in Figure 2.

Palm/XML Synchronizing/publishing diagram

Figure 2. Publishing from the Palm with XML

PilotManager talks to the Palm and uses the SyncXmlDB conduit to read and write an XML representation of the date book (there are analogous conduits for the address book, memos, and todos). The resulting XML document can be transformed to produce web pages.

1.2. Is There a Better Way?

In the spring of 2002, several nearly simultaneous events occurred that made me reconsider the metadata solution for my Palm:

  1. I found an RDF application ([Circle and Arrow Diagrams]) that was compelling enough to interest me in learning RDF.

  2. Consequently, I discovered [N3], a simple text representation of RDF rules.

  3. I found that I had an ever growing collection of ad-hoc user field names (hotel, phone, uri, hoteluri, class, type, …). RDF seemed like it would provide a mechanism for me to identify the semantics of these fields with a little more rigor. (I'm not sure that was particularly necessary, but it seemed like “a good thing”.)

  4. The ability to do circle and arrow diagrams easily inspired me to revisit some projects that I'd had on the back burner for a long time. One of those was a little family genealogy. It occurred to me almost immediately that what I wanted was the ability to store these genealogical relationships in the same place where I store all the other data about people in my life, in my address book.

  5. I actually rely on two sources of information about people: my Palm and my [BBDB] database in Emacs. Keeping these sources in sync is a bit painful and I hadn't done it in a while. RDF seemed to promise the ability to deal with these two data sources more semantically, which also seemed like a good thing.

  6. Trying to sync my BBDB and my Palm always reminds me of another irritating limitation of the Palm address book: you can only have one address per individual. It would be nice to be able to have multiple Palm records and associate them together.

  7. Some of the individuals in my address book are related to each other so they have, for example, different work phones but the same home phone and the same home address. It would be nice if I could avoid duplicating the information that's the same. In particular, the RDF tool [CWM] would allow me to write inference rules: if “‘A’ is the significant other of ‘B’ then ‘A’ has the same home address as ‘B’” (not a universal truth, but true for my address book).

2. RDF Metadata

Converting my existing system to one that uses RDF required pervasive small adjustments.

2.1. Using RDF on the Palm

The first change was the move to RDF instead of my ad-hoc user fields.

If I had access to the source code for the Palm applications, or if I were writing my own applications, I'd probably create some kind of form that made it easy to construct the RDF statements. I can imagine various sorts of pull-down menus with context-sensitive default values that would make adding the metadata quite easy.

However, my goal is to integrate RDF directly into the existing Palm applications, so I really can't extend the user interfaces: the RDF will have to be written by hand in the notes field of each entry. These applications are neither RDF- nor XML-aware so some serialization syntax will have to be used.

Data entry on a PDA is somewhat tedious. There's strong motivation to minimize the amount of typing required to add metadata to an entry. Furthermore, the XML serialization(s) of RDF are fairly verbose, so some sort of abbreviated syntax that uses relatively few markup characters is required to make the authoring task practical on a Palm.

Enter N3. N3 is a simplified syntax designed for ease of authoring. It's perfect for this application, is supported by a number of tools, and can be automatically converted to an XML serialization when necessary.

2.1.1. Using N3

Most RDF statements that you'll want to make about an entry in your PDA are fairly simple. In fact, the vast majority will simply associate a property value with the current entry. In N3, entries and properties are represented by QNames. Property values can either be other entries or string literals[1].

For example, let's say we have a date book entry for the conference identified as “db:extreme2002” and we want to set the “p:phone” property of that entry to “514.285.1450”. In N3, we could express that as follows:

db:extreme2002 p:phone "514.285.1450" .

While that's short and easier to type than the equivalent XML representation, further reflection reveals that we could simplify things even further for this application. Remember that our N3 statements will have to be parsed out of the note field, so it's OK if they're not quite proper N3 as long as they can easily be turned into proper N3 by the program that parses the note field. With that in mind, the following rules can be applied:

  1. Observe that we're setting this property in the entry for the conference, so we shouldn't need to explicitly identify the conference for each property that we want to set. We can leave off the leading identifier and just start with the property.

  2. Periods are hard to see on the PDA screen, so we establish the rule that every statement must fit on a single line and then we can infer the period.

  3. Property values are either other entries (identified by a QName) or they are string literals, so we establish the rule that quotes are optional around a value unless that value contains a colon.

  4. On those uncommon occasions where you want to enter a more complex N3 expression, we adopt the final rule that if a line begins with a double quote, we discard the leading double quote and treat the rest of the line as literal N3 markup. This allows for multi-line entries and other complexities that would otherwise be impossible to express.

With these rules in place, our entry is simply:

p:phone 514.285.1450
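
The four rules can be sketched as a small Python function that turns one shorthand line into proper N3. This is a hedged illustration rather than the conduit's actual code: the name expand_line and the way the subject is passed in are assumptions. It applies rule 3 exactly as stated, so any value containing a colon is passed through unquoted; under that reading, a URI that should be a string literal would have to be quoted by hand.

```python
def expand_line(subject, line):
    """Turn one shorthand note line into a full N3 statement."""
    line = line.strip()
    if line.startswith('"'):
        # Rule 4: a leading double quote marks literal N3; drop the
        # quote and pass the rest of the line through untouched.
        return line[1:]
    parts = line.split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"expected 'property value', got {line!r}")
    prop, value = parts
    # Rule 3: quote the value unless it is already quoted or contains
    # a colon (in which case it is treated as a QName).
    if not value.startswith('"') and ":" not in value:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        value = f'"{escaped}"'
    # Rule 1: supply the subject. Rule 2: one statement per line, so
    # the terminating period can be added automatically.
    return f"{subject} {prop} {value} ."


print(expand_line("db:extreme2002", "p:phone 514.285.1450"))
print(expand_line("db:extreme2002", "p:hotel ab:wyndham-montreal"))
```

The first call reproduces the full statement shown earlier; the second leaves the QName value unquoted.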

Storing the hotel phone number is a literal translation of the old system into RDF, but it doesn't take advantage of any of the new power we have available. It's not the phone number association that's really important, it's the association with the hotel that we want to capture. So, if the hotel is identified by “ab:wyndham-montreal”, a better entry is:

p:hotel ab:wyndham-montreal

That raises the question, if it hadn't occurred before, of how to associate unique identifiers with an entry. We provide that facility with one more convention. The value associated with the property “:id” is taken to be the identifier. So, if we want to be able to refer to the Wyndham Montreal hotel as “ab:wyndham-montreal”, we'll have to have the following N3 statement in its entry:

:id wyndham-montreal

Similarly, our entry for the conference would have to include

:id extreme2002

if we want to refer to it with that identifier. If you don't need to point to an entry, you don't have to give it an identifier.
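
One way to handle the “:id” convention is sketched below in Python. The logic is inferred from the sample N3 output in the Examples section (an entry with an “:id” is emitted under that name, together with “:palmid” and “:id” statements; an entry without one falls back to its Palm record id). The function name and return shape are hypothetical, and the real conduit may well work differently.

```python
def entry_subject(palm_id, shorthand_lines):
    """Pick the N3 subject for an entry and strip its ':id' line.

    Returns (subject, remaining_lines, extra_statements).
    """
    subject = f":{palm_id}"  # default: the Palm record id, e.g. ":id55"
    extra = []
    remaining = []
    for line in shorthand_lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == ":id":
            name = parts[1].strip().strip('"')
            subject = f":{name}"
            # Keep a link back to the Palm record and record the id.
            extra = [
                f"{subject} :palmid :{palm_id} .",
                f'{subject} :id "{name}" .',
            ]
        else:
            remaining.append(line)
    return subject, remaining, extra


print(entry_subject("id502", [":id wyndham-montreal", "p:uri http://www.wyndham.com/"]))
print(entry_subject("id55", ["p:hotel ab:wyndham-montreal"]))
```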

2.1.1.1. Storing N3 in each Entry

Now that we know what N3 statements to write, two questions remain: how do we store them in the note field, and how do we deal with any “prologue” material needed in the N3 format? The preceding discussion ignored the fact that an N3 file needs a prologue to associate prefixes with URIs and make general statements.

In answer to the first question, we simply use the identifier “rdf:” in the note field. If “rdf:” appears on a line by itself in the note field, every following line up to the first blank line is taken to be an N3 statement.

Palm snapshot: phone number in note using RDF

This example identifies the location, URI, and hotel associated with the conference.
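
The “rdf:” convention is simple enough to sketch directly. The following Python function is an illustration only (the name extract_n3_lines is hypothetical): it skips everything up to a line consisting solely of “rdf:” and then collects lines until the first blank line or the end of the note.

```python
def extract_n3_lines(note):
    """Return the shorthand N3 lines that follow an 'rdf:' marker line."""
    statements = []
    collecting = False
    for line in note.splitlines():
        if not collecting:
            # Everything before the marker is ordinary note text.
            if line.strip() == "rdf:":
                collecting = True
            continue
        if not line.strip():
            # The first blank line ends the N3 block.
            break
        statements.append(line.strip())
    return statements


note = (
    "Remember the slides\n"
    "rdf:\n"
    "p:loc Montreal, Quebec, CA\n"
    "p:hotel ab:wyndham-montreal\n"
    "\n"
    "This line is ordinary note text again.\n"
)
print(extract_n3_lines(note))
```

A note without the marker yields an empty list, so entries that carry no metadata need no special handling.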

2.1.1.2. Storing N3 “Headers”

The header data associates namespaces with prefixes. It's also the place where you'll put any inference rules or other global statements.

Ideally, I'd store these values in a memo, but the current conduit syncing system doesn't give a conduit practical, reliable access to data from other databases. I therefore adopted a different convention: I store the information in the note field of an event with the description “‘database’ N3 Header”. For example, I have a date book entry with the description “DateBook N3 Header” on Jan 2, 2020. When the N3 data is extracted, the note field for that entry is always placed at the top of the file. Similar entries in the other databases provide the equivalent functionality.

2.2. Syncing with RDF

Syncing with the new RDF system is mostly the same as before. What's new is the generation of N3 and RDF versions of each of the Palm databases and the use of CWM to expand inferences. The publication process for this system is summarized in Figure 3.

Palm/XML+RDF Synchronizing/publishing diagram

Figure 3. Publishing from the Palm with XML and RDF

PilotManager talks to the Palm and uses the SyncXml{AB,DB,MB,TB} conduits to read and write an XML representation of the various databases. The result is a set of documents for each database: its XML representation, the N3 file derived from it, and an RDF representation of it. These files are fed through CWM to produce a unified RDF document that is subsequently styled.

3. Examples

To see how this system works, let's examine a particular example in a little more detail. My Palm contains the following records for the Extreme conference and the Wyndham Montreal hotel:

Extreme entry
Extreme entry note
Wyndham hotel entry
Wyndham hotel entry note

Table 1. Date Book and Address Book Entries for Extreme

After syncing, the resulting N3 files contain:

:id55 p:class p:undecided .
:id55 p:loc "Montreal, Quebec, CA" .
:id55 p:uri "http://www.extrememarkup.com/extreme/" .
:id55 p:hotel ab:wyndham-montreal .
:wyndham-montreal :palmid :id502 .
:wyndham-montreal :id "wyndham-montreal" .
:wyndham-montreal p:uri "http://www.wyndham.com/" .

More significantly, the resulting records in the combined, post-cwm file are shown in Example 2 and Example 3.

    <palm:Contact rdf:about="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal">
        <phones rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_phones"/>
        <primaryPhone rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_phone0"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_phone0"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_phone1"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_phone2"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_phone3"/>
        <email rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_email4"/>
        <addresses rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_addresses"/>
        <palmAddress rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_address0"/>
        <address rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal_address0"/>
        <lastname></lastname>
        <firstname></firstname>
        <company>Hotel Wyndham Montreal</company>
        <title></title>
        <custom1></custom1>
        <custom2></custom2>
        <custom3></custom3>
        <custom4></custom4>
        <note>rdf:
:id wyndham-montreal
p:uri http://www.wyndham.com/
</note>
        <category rdf:resource="file:/home/ndw/.xmlAddr.rdf#_cat14"/>
        <palmid rdf:resource="file:/home/ndw/.xmlAddr.rdf#id502"/>
        <id>wyndham-montreal</id>
        <p:uri>http://www.wyndham.com/</p:uri>
    </palm:Contact>

Example 2. The Address Book Entry in RDF

    <palm:Appointment rdf:about="file:/home/ndw/.xmlDate.rdf#id55">
        <db:event>1</db:event>
        <db:category rdf:resource="file:/home/ndw/.xmlDate.rdf#_cat5"/>
        <db:repeat rdf:resource="file:/home/ndw/.xmlDate.rdf#id55_repeat"/>
        <db:notes>##@@E@@@@@@@@@@@@@@
rdf:
p:loc Montreal, Quebec, CA
p:uri http://www.extrememarkup.com/extreme/
p:hotel ab:wyndham-montreal

</db:notes>
        <db:datebk4 rdf:resource="file:/home/ndw/.xmlDate.rdf#id55_datebk4"/>
        <db:description>Extreme Markup Lang</db:description>
        <db:untimed>1</db:untimed>
        <db:begin-year>2002</db:begin-year>
        <db:begin-month>8</db:begin-month>
        <db:begin-day>4</db:begin-day>
        <p:loc>Montreal, Quebec, CA</p:loc>
        <p:uri>http://www.extrememarkup.com/extreme/</p:uri>
        <p:hotel rdf:resource="file:/home/ndw/.xmlAddr.rdf#wyndham-montreal"/>
    </palm:Appointment>

Example 3. The Date Book Entry in RDF

Note how the N3 statements have been made available in the RDF, giving XML processors direct access to them. The XSLT stylesheets in particular have access to all of the information needed to produce the complete web page.

An example that utilizes RDF inference rules is even more interesting. Consider the following address book records for John and Jane and the date book record for John's Birthday.

John's address entry
Jane's address entry
John's birthday
John's birthday note

Table 2. Date Book and Address Book Entries for John and Jane

The N3 headers for the address book in my Palm include the following statements:

{ :p p:spouse :s } log:implies { :p p:sigother :s } .
{ :p p:sigother :s } log:implies { :s p:sigother :p } .
{ :p p:sigother :s } log:implies { :p p:livesWith :s } .

These rules indicate that:

  1. The relationship “spouse” is a kind of significant other.

  2. The significant other relationship is symmetric. (If Jane is John's significant other, then John is Jane's significant other.)

  3. The significant other relationship implies cohabitation. In other words, significant others live together.

Next, I have the rule:

{ :p p:livesWith :s .
  :s :phone :o .
  :o :label :t .
  :t palm:label "Home" } log:implies { :p :phone :o } .

This rule says that if John lives with Jane, Jane has a phone number, and that phone number has the Palm label “Home”, then John has that phone number too.

And finally:

{ :p p:livesWith :s .
  :s :address :o .
  :o :label "Home" } log:implies { :p :address :o } .

This rule says that if John lives with Jane and Jane has an address labelled “Home”, John has that address too. (The Palm doesn't support multiple addresses, alas, so the address label has to be inserted manually.)
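
To make the effect of these rules concrete, here is a small hand-rolled forward-chaining sketch in Python. It is not CWM and does not parse N3; it hard-codes the five rules above over a set of (subject, property, value) tuples and repeats them until nothing new is derived. The sample facts (including which record carries the spouse statement and the name ab:label_home) are assumptions made for illustration.

```python
def has_home_label(facts, phone):
    """True if the phone has a :label whose palm:label is "Home"."""
    return any(
        s == phone and p == ":label" and (o, "palm:label", "Home") in facts
        for s, p, o in facts
    )


def infer(triples):
    """Apply the rules repeatedly until no new facts appear."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "p:spouse":
                new.add((s, "p:sigother", o))  # spouse implies sigother
            if p == "p:sigother":
                new.add((o, "p:sigother", s))  # the relationship goes both ways
                new.add((s, "p:livesWith", o))  # sigothers live together
            if p == "p:livesWith":
                for s2, p2, o2 in facts:
                    if s2 != o:
                        continue
                    if p2 == ":phone" and has_home_label(facts, o2):
                        new.add((s, ":phone", o2))  # share the home phone
                    if p2 == ":address" and (o2, ":label", "Home") in facts:
                        new.add((s, ":address", o2))  # share the home address
        if new <= facts:
            return facts
        facts |= new


facts = {
    ("ab:john", "p:spouse", "ab:jane"),
    ("ab:john", ":phone", "ab:john_phone1"),
    ("ab:john_phone1", ":label", "ab:label_home"),
    ("ab:label_home", "palm:label", "Home"),
    ("ab:john", ":address", "ab:john_address0"),
    ("ab:john_address0", ":label", "Home"),
}
for triple in sorted(infer(facts) - facts):
    print(triple)
```

The derived facts include Jane's links to John's home phone and home address, which is the kind of expansion that CWM performs on the real data.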

When CWM processes these entries, it expands these implications. Here is the resulting record for Jane:

    <palm:Contact rdf:about="file:/home/ndw/.xmlAddr.rdf#jane">
        <phones rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_phones"/>
        <primaryPhone rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_phone0"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_phone0"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_phone1"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_phone2"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_phone3"/>
        <email rdf:resource="file:/home/ndw/.xmlAddr.rdf#jane_email4"/>
        <lastname></lastname>
        <firstname>Jane</firstname>
        <company></company>
        <title></title>
        <custom1></custom1>
        <custom2></custom2>
        <custom3></custom3>
        <custom4></custom4>
        <note>rdf:
:id jane
</note>
        <category rdf:resource="file:/home/ndw/.xmlAddr.rdf#_cat8"/>
        <palmid rdf:resource="file:/home/ndw/.xmlAddr.rdf#id503"/>
        <id>jane</id>
        <p:sigother rdf:resource="file:/home/ndw/.xmlAddr.rdf#john"/>
        <p:livesWith rdf:resource="file:/home/ndw/.xmlAddr.rdf#john"/>
        <address rdf:resource="file:/home/ndw/.xmlAddr.rdf#john_address0"/>
        <phone rdf:resource="file:/home/ndw/.xmlAddr.rdf#john_phone1"/>
    </palm:Contact>

Example 4. Jane's Address Record

Note that her entry has pointers to John's address and home phone number. The resulting address book entry for Jane includes the information implied from her relationship with John:

[Netscape screengrab of Palm data on the Web]

Figure 4. Web View of Jane's Address Book Entry

And the entry for John includes links not only to Jane, but also to the birthday associated with him:

[Netscape screengrab of Palm data on the Web]

Figure 5. Web View of John's Address Book Entry

4. The “Palm Ontology”

Throughout this paper, I've been using QNames of the form “p:someverb” as RDF “verbs”. These are elements from an informally developed Palm ontology.

My Palm ontology namespace contains the following useful verbs:

photo

Associates the URI of a photograph with an entry in the address book. This is specifically a photograph of the person or entity that is the subject of the address book entry.

addresslabel

Associates a label with an address in the address book. The Palm address book only stores a single address for any given entry, but it is sometimes useful to distinguish between, for example, home addresses and business addresses.

birthday

Identifies the birthday of an individual in the address book. The birthday can be either a string or a reference to an appointment.

class

Associates a class with an entry in the date book. I often enter conferences and other events into my date book that I may be uncertain that I'm going to attend (or even certain that I'm not). This property lets me classify appointments.

For example, suppose I have an appointment at the dentist during the same week as an XML conference. If I've classified the XML conference as one that I'm not going to attend, or one that I'm not sure I'm going to attend, then I don't flag the overlapping dentist appointment as a conflict. If I am going to attend, I do. In either case, I may want to know when the conference is taking place so that I don't schedule other meetings on top of it.

hotel

Associates a hotel with a date book entry.

image

Associates a picture or other image (such as a map) with a date book or address book entry.

loc

Identifies the location where an event takes place.

mforks and mstars

Identifies the number of Michelin forks and stars associated with a restaurant in the address book. Some metadata is important.

phone

Associates a phone number with an entry in the date book (for example, for a teleconference).

sigother and spouse

Associates two address book entries with a “significant other” or “spouse” relationship.

uri

Associates an arbitrary URI (usually a homepage URL) with an entry in the address book or date book.

who

Associates an individual with an entry in the date book.

year

Associates a year with an entry in the date book. This is really only used for birthdays. This is just a convenience verb; it saves me from having to enter birthdays in the correct year in the date book. For some genealogical dates, it's not even clear that I could enter the correct year anyway.

4.1. Other Ontologies

I've also used QNames from a number of other namespaces in this paper. The system described in this paper uses six namespaces in addition to my ontology namespace. These are really just URIs used as RDF identifiers:

http://nwalsh.com/pim/Palm#

The “palm:” namespace is the URI for Palm concepts. A Palm address, for example, can be identified with the URI reference http://nwalsh.com/pim/Palm#Address.

http://nwalsh.com/pim/Palm/Calendar#

The “cal:” namespace is the URI for calendar concepts. The Palm date book is a collection of appointments, but calendar calculations are notoriously tricky. To make processing this information easier, the conduits also expose a calendar divided into years, months, weeks, and days. The calendar concepts URI is used to identify these resources.

file:/home/ndw/.xmlAddr.rdf#

The “ab:” namespace is the URI for my address book.

file:/home/ndw/.xmlDate.rdf#

The “db:” namespace is the URI for my date book.

file:/home/ndw/.xmlMemo.rdf#

The “mb:” namespace is the URI for my memos.

file:/home/ndw/.xmlTodo.rdf#

The “tb:” namespace is the URI for my todos.
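
The prefix bindings listed above are what turn a QName such as “ab:wyndham-montreal” into the full URI reference seen in the RDF examples. A minimal Python sketch of that expansion follows; the function names are hypothetical, and the “p:” prefix is left out because its namespace URI is not given in this paper.

```python
NAMESPACES = {
    "palm": "http://nwalsh.com/pim/Palm#",
    "cal": "http://nwalsh.com/pim/Palm/Calendar#",
    "ab": "file:/home/ndw/.xmlAddr.rdf#",
    "db": "file:/home/ndw/.xmlDate.rdf#",
    "mb": "file:/home/ndw/.xmlMemo.rdf#",
    "tb": "file:/home/ndw/.xmlTodo.rdf#",
}


def expand_qname(qname, namespaces=NAMESPACES):
    """Expand 'prefix:local' into a full URI reference."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"no namespace bound for {qname!r}")
    return namespaces[prefix] + local


def n3_prefix_declarations(namespaces=NAMESPACES):
    """Render the bindings as N3 @prefix lines for a file prologue."""
    return [f"@prefix {prefix}: <{uri}> ." for prefix, uri in namespaces.items()]


print(expand_qname("ab:wyndham-montreal"))
print(expand_qname("palm:Address"))
print("\n".join(n3_prefix_declarations()))
```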

5. Conclusions

On the whole, converting from an ad-hoc system to the current RDF-based system was straightforward and the benefits were obvious and immediate.

The combination of N3, which makes entering RDF data much more manageable, and programs like CWM that can expand logical inferences opens up new possibilities for interesting, useful applications.

Since I wrote the first draft of this paper, I've added several new verbs (for example, “photo”) to my Palm ontology as new ideas have occurred to me.

In the long run, I hope that real, useful, processable metadata will become a common feature of PDAs, cell phones, and other “consumer devices”. In the course of my day-to-day activities, I often view the web version of my Palm. I've become accustomed to traversing the relationships between people, dates, memos, and action items simply by clicking on them. The fact that I can't perform these operations in the same simple way when I'm actually using my Palm has become quite annoying.

6. Future Work

There's still quite a lot that could be done to improve the current system. Although it works quite well, there are clearly some areas in which it could work much better.

  • Sync from the RDF directly, eliminating the parallel, non-RDF XML files. In fact, the current situation is even worse than it appears because PilotManager is also maintaining a copy of the data and the conduits are actually syncing against that copy, which is then synchronized with the actual Palm data.

  • Support a more general RDF setup so that it is possible to sync BBDB and possibly other databases from the same common data.

  • Write proper RDF Schemas for the various ontologies used by the system.

  • Write Java applications that are enhanced work-alikes for the Palm applications. The goal is to provide all of the features of the Palm tools on the desktop, with the addition of better RDF metadata handling.

References

[PilotManager] Alan Harder. PilotManager.

[Circle and Arrow Diagrams] Dan Connolly. Circles and arrows diagrams using stylesheet rules.

[CWM] Sean B. Palmer. CWM - Closed World Machine.

[XML Conduits] Norman Walsh. XML From Your Palm. XML at Sun Developer Connection. Sun Microsystems, Inc. Aug, 2000.

[XSLT] James Clark, editor. XSL Transformations (XSLT) Version 1.0. World Wide Web Consortium, 1999.



[1] This is a simplification that captures the common case rather than the full generality. The full generality can be expressed, but this paper doesn't attempt to explain the details of N3 or RDF.