[iPhone Development] How to Parse HTML

A few weeks ago, I was looking for a simple HTML parser for the iPhone because I needed to scrape a couple of web pages for their contents. I found a nice wrapper called hpple, mentioned in a blog post. Here are the steps to use the library.

Include and Link libxml2

  1. Expand Targets
  2. Double-click your project name
  3. Select All Configurations
  4. Search for Header Search Paths
  5. Add the line below, with the recursive option checked
${SDKROOT}/usr/include/libxml2
  6. Search for Other Linker Flags
  7. Add the line below
-lxml2
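
The same two settings can also be written in a build configuration file instead of being typed into the build settings UI. This is only a sketch: the file name Project.xcconfig is made up for illustration, and it assumes HEADER_SEARCH_PATHS and OTHER_LDFLAGS are the setting names behind Header Search Paths and Other Linker Flags. It leaves out the recursive option.

```
// Project.xcconfig (hypothetical file name, assign it to the target's configurations)
// Let the compiler find the libxml2 headers shipped with the SDK
HEADER_SEARCH_PATHS = $(inherited) $(SDKROOT)/usr/include/libxml2
// Link against libxml2
OTHER_LDFLAGS = $(inherited) -lxml2
```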

See the screenshots below.

[Screenshots: html_include, html_linking]

Download Source Code

git clone git://github.com/topfunky/hpple.git

Then drag and drop the following source files into your project:

TFHpple.h
TFHpple.m
TFHppleElement.h
TFHppleElement.m
XPathQuery.h
XPathQuery.m

That’s it. Let’s write some code.

 // Don't forget
 // #import "TFHpple.h"
 NSURL *url = [NSURL URLWithString:@"http://www.objectgraph.com/contact.html"];
 NSData *htmlData = [NSData dataWithContentsOfURL:url]; // autoreleased, so do not release it yourself
 TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
 NSArray *elements = [xpathParser search:@"//h3"]; // XPath notation: every <h3> element on the page
 if ([elements count] > 0) { // objectAtIndex: throws on an empty array
     TFHppleElement *element = [elements objectAtIndex:0];
     NSString *myTitle = [element content]; // text of the first <h3>
     NSLog(@"%@", myTitle); // never pass scraped text as the format string itself
 }
 [xpathParser release]; // we called alloc, so we own the parser
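
The same pattern extends to queries that return several elements. Here is a minimal sketch that collects the href of every link on a page. LinkTargetsInHTML is a hypothetical helper name, not part of hpple. It assumes TFHppleElement exposes its attributes as an NSDictionary through an attributes method, and it uses manual reference counting like the snippet above.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of hpple): returns the href value of
// every <a> element found in the given HTML data.
static NSArray *LinkTargetsInHTML(NSData *htmlData) {
    NSMutableArray *targets = [NSMutableArray array];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:htmlData];

    // XPath: only anchors that actually carry an href attribute
    NSArray *anchors = [parser search:@"//a[@href]"];
    for (TFHppleElement *anchor in anchors) {
        // Assumption: -attributes returns an NSDictionary of attribute names to values
        NSString *href = [[anchor attributes] objectForKey:@"href"];
        if (href != nil) {
            [targets addObject:href];
        }
    }

    [parser release]; // balances the alloc above
    return targets;   // autoreleased array, the caller does not release it
}
```

Calling LinkTargetsInHTML(htmlData) with the data downloaded above should give an array of NSString values, and an empty array when the page has no links, so no bounds check is needed before iterating.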

Xcode Project Download

The complete project file is available for download here.

By: kiichi on: