A mini-tutorial on how to use the SaxObjectDecoder class from SaxObjC. The example implements a very simple RSS newsfeed parser.
Sources
- rssparse.m - the tool
- rssparse.xmap - the XML-object mapping file
- slashdot.rss - a SlashDot example feed
Introduction
SaxObjectDecoder is a SAX handler which can be used to map XML documents to
object structures. It makes use of the NSKeyValueCoding protocol to dynamically map XML tags and tag attributes to object properties of custom
classes.
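As a rough illustration of what "mapping through key/value coding" means, the sketch below sets and reads a property purely by its name, which is presumably how the decoder hands tag content to your objects. DemoEntry and TitleViaKVC are hypothetical names made up for this example and are not part of SaxObjC. The sketch assumes manual reference counting, as used throughout this tutorial, and a Foundation that provides -setValue:forKey: (older Foundation libraries spell it -takeValue:forKey:).

```objc
#import <Foundation/Foundation.h>

/* Hypothetical demo class, not part of SaxObjC or rssparse.m. */
@interface DemoEntry : NSObject
{
  NSString *title;
}
- (void)setTitle:(NSString *)_value;
- (NSString *)title;
@end

@implementation DemoEntry

- (void)dealloc {
  [self->title release];
  [super dealloc];
}

- (void)setTitle:(NSString *)_value {
  [self->title autorelease];
  self->title = [_value copy];
}
- (NSString *)title {
  return self->title;
}

@end /* DemoEntry */

/* Hypothetical helper: stores and fetches "title" by key only. */
static NSString *TitleViaKVC(NSString *value) {
  DemoEntry *entry = [[[DemoEntry alloc] init] autorelease];

  [entry setValue:value forKey:@"title"]; /* resolved to -setTitle: */
  return [entry valueForKey:@"title"];    /* resolved to -title     */
}

int main(int argc, const char **argv) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

  NSLog(@"title via KVC: %@", TitleViaKVC(@"Hello RSS"));

  [pool release];
  return 0;
}
```

The caller never names -setTitle: directly. The key string is enough, which is what allows a declarative mapping file to drive the object construction.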
Note that SaxObjectDecoder is not suitable for mapping every XML
format to objects, but it works fine for a wide range of them - and it's
super easy to use ;-)
What do we want to parse?
Take a look at rss2plist1 for a short intro to the RSS document we are going to parse.
Tasks
To parse an XML document into an object structure, you need to write classes
for the objects you are going to parse, and you need to write a model which
maps the XML to the class attributes.
You should be familiar with how Objective-C key/value coding works; for an introduction, see the
KeyValueCoding
programming topic on the Apple website. Basically, key/value coding adds the
"property" concept to NSObject (much like JavaBeans).
Writing the "Model" Classes
We are going to parse two different kinds of objects from the RSS XML stream,
RSS channels and RSS items. Everything else is ignored.
Both channels and items have a "title", a "link" and a "description" (stored in the "info" property), so we are going to write an abstract superclass "RSSObject".
  @interface RSSObject : NSObject
  {
    NSString *title;
    NSString *link;
    NSString *info;
  }
  @end

  @implementation RSSObject

  - (void)dealloc {
    [self->title release];
    [self->link  release];
    [self->info  release];
    [super dealloc];
  }

  /* accessors */

  - (void)setTitle:(NSString *)_value {
    [self->title autorelease];
    self->title = [_value copy];
  }
  - (void)setLink:(NSString *)_value {
    [self->link autorelease];
    self->link = [_value copy];
  }
  - (void)setInfo:(NSString *)_value {
    [self->info autorelease];
    self->info = [_value copy];
  }

  /* description */

  - (NSString *)description {
    NSMutableString *s = [NSMutableString stringWithCapacity:64];

    /* %p formats the object address correctly on 32 and 64 bit systems */
    [s appendFormat:@"<%p[%@]:", self, NSStringFromClass([self class])];
    if (self->title) [s appendFormat:@" title='%@'", self->title];
    if (self->link)  [s appendFormat:@" link='%@'",  self->link];
    [s appendString:@">"];
    return s;
  }

  @end /* RSSObject */
There is nothing special about the class: it contains a -dealloc method to
free the properties, a set accessor for each property and a -description
method which is used in the main program for output.
The only things missing for a "real" implementation are the get accessors.
The concrete subclasses RSSChannel and RSSItem define nothing - for the demonstration we are only interested in title, link and description which are fully covered by the superclass :-)
  @interface RSSChannel : RSSObject
  @end

  @interface RSSItem : RSSObject
  @end

  @implementation RSSChannel
  @end /* RSSChannel */

  @implementation RSSItem
  @end /* RSSItem */
That's it for the model classes. The actual XML processing logic is written down declaratively in the mapping model.
The XML->Object Mapping Model
The model is written as a property list file and maps XML tags and attributes to Objective-C classes and their properties. For sets of objects it uses to-many relationships, as provided by the KeyValueCoding extensions declared in NSClassDescription.h.
  {
    "http://purl.org/rss/1.0/" = {
      channel = {
        class      = RSSChannel;
        attributes = { title = title; link = link; description = info; };
      };
      item = {
        class      = RSSItem;
        attributes = { title = title; link = link; description = info; };
      };
      title       = { class = NSString; };
      link        = { class = NSString; };
      description = { class = NSString; key = "item"; };
    };
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#" = {
      RDF = {
        class = NSMutableDictionary;
        ToManyRelationships = {
          channels = ( channel );
          items    = ( item );
        };
      };
    };
  }
The first level of the model declares the namespaces of the tags we are going
to map; in our case we have declared mappings for the RDF and the RSS
namespaces. The second level contains the tags, e.g. "RDF" is
the mapping for the RDF tag in the RDF namespace.
We map each of the tags to a class, e.g. the "RDF" tag to an NSMutableDictionary
and the "item" tag to RSSItem. For the complex tags "item" and "channel" we
define a set of attributes which map subtags to object properties, e.g. the
"description" subtag is stored in the "info" property.
If we map a tag to NSString, like we do for "title" and "link", key value
coding is "disabled" and the tag is mapped to a simple string containing the
content of the tag.
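Because the model is an ordinary property list, its two-level structure (namespaces first, tags second) can be inspected with plain Foundation calls before handing it to the decoder. The following is a minimal sketch for sanity-checking a mapping file. TagNamesByNamespace is a hypothetical helper invented for this example, and the sketch assumes that your Foundation can read the old-style property list syntax used above.

```objc
#import <Foundation/Foundation.h>

/* Hypothetical helper: for each namespace (first level of the model),
   collect the sorted names of the tags declared below it (second level). */
static NSDictionary *TagNamesByNamespace(NSDictionary *model) {
  NSMutableDictionary *result     = [NSMutableDictionary dictionary];
  NSEnumerator        *namespaces = [model keyEnumerator];
  NSString            *ns;

  while ((ns = [namespaces nextObject]) != nil) {
    NSDictionary *tags  = [model objectForKey:ns];
    NSArray      *names = [[tags allKeys]
                            sortedArrayUsingSelector:@selector(compare:)];
    [result setObject:names forKey:ns];
  }
  return result;
}

int main(int argc, const char **argv) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
  NSDictionary *model;
  int status = 0;

  /* the mapping is a property list, so it loads as a dictionary */
  model = [NSDictionary dictionaryWithContentsOfFile:@"./rssparse.xmap"];
  if (model == nil) {
    NSLog(@"could not read ./rssparse.xmap as a property list");
    status = 1;
  }
  else
    NSLog(@"tags per namespace: %@", TagNamesByNamespace(model));

  [pool release];
  return status;
}
```

A syntax error in the mapping file shows up here as a nil dictionary, which is easier to diagnose than a decoder that silently produces no objects.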
Connecting the Model to the Parser and the Classes
We now have working model classes which are able to represent our XML
information, and we have a model which describes how to map that information to the classes. The only thing missing is the glue to make it
work.
Since SaxObjectDecoder is a "usual" SAX handler, the glue is pretty much the
same as for the other SAX handlers in rss2plist1
and rss2plist2.
First, we construct a SAX parser:
  parser = [[SaxXMLReaderFactory standardXMLReaderFactory]
             createXMLReaderForMimeType:@"text/xml"];
Next, we create the SaxObjectDecoder handler and attach it to the parser:
  sax = [[SaxObjectDecoder alloc] initWithMappingAtPath:@"./rssparse.xmap"];
  [parser setContentHandler:sax];
  [parser setErrorHandler:sax];
Note that the decoder is initialized with the model.
And finally we start parsing and output the root object:
  [parser parseFromSource:[NSURL URLWithString:@"file:///...."]];
  root = [sax rootObject];
  NSLog(@"parsed object: %@", root);
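Put together, the three steps might look like the following sketch of a complete tool. It only uses the SaxObjC calls shown above. The header paths are assumptions about a typical SaxObjC installation, taking the input file from the command line is a choice made for this example, and the "items" lookup assumes that the decoder stores the to-many relationship of the RDF tag under that key in the root dictionary. The RSSObject, RSSChannel and RSSItem classes from this tutorial need to be compiled into the same tool.

```objc
#import <Foundation/Foundation.h>
#import <SaxObjC/SaxObjC.h>          /* assumed umbrella header */
#import <SaxObjC/SaxObjectDecoder.h> /* assumed header location */

int main(int argc, const char **argv) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
  id               parser;
  SaxObjectDecoder *sax;
  id               root;
  NSString         *path;

  if (argc < 2) {
    NSLog(@"usage: %s <rss-file>", argv[0]);
    [pool release];
    return 1;
  }
  path = [NSString stringWithUTF8String:argv[1]];

  /* 1. construct a SAX parser */
  parser = [[SaxXMLReaderFactory standardXMLReaderFactory]
             createXMLReaderForMimeType:@"text/xml"];
  if (parser == nil) {
    NSLog(@"found no SAX parser for text/xml");
    [pool release];
    return 2;
  }

  /* 2. create the decoder from the mapping model and attach it */
  sax = [[SaxObjectDecoder alloc] initWithMappingAtPath:@"./rssparse.xmap"];
  [parser setContentHandler:sax];
  [parser setErrorHandler:sax];

  /* 3. parse the file and fetch the decoded root object */
  [parser parseFromSource:[NSURL fileURLWithPath:path]];
  root = [sax rootObject];
  NSLog(@"parsed object: %@", root);

  /* assumption: the RDF dictionary holds the RSSItem objects as "items" */
  NSLog(@"items: %@", [root valueForKey:@"items"]);

  [sax release];
  [pool release];
  return 0;
}
```

The output relies on the -description method of RSSObject, so each channel and item prints with its title and link.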
That's it !