Pull parsing gives the application full control over how a document is parsed.
Android ships an API for pull parsing, consisting mainly of
org.xmlpull.v1.XmlPullParser;
org.xmlpull.v1.XmlPullParserFactory;
These are the two main types. XmlPullParser is the one you work with most; XmlPullParserFactory is a factory used to build XmlPullParser objects.
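The application produces events by calling XmlPullParser.next() and related methods, then handles each event. This is where it differs from the push approach: with push, the parser generates events on its own initiative and calls back into the application; with pull, an event is produced only when the application calls one of the parser's methods.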
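For contrast, the push style described above can be sketched with SAX, a push parser available in both the JDK and Android. SAX is not named in the text above; it is used here only as one well-known example of a push API, and the names PushContrast and collect are made up for illustration. Notice that the parser calls the handler: the application never asks for the next event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PushContrast {
    /** Parses xml with SAX and records the callbacks in the order the parser fired them. */
    public static List<String> collect(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("START " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one run of text in several chunks
                events.add("TEXT " + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("END " + qName);
            }
        };
        // The parser drives the whole run and pushes events into the handler
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<author country=\"United States\">James Elliott</author>"));
    }
}
```

With a pull parser the same logic would sit in a loop owned by the application, which is what the rest of this article builds up.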
Suppose the XML contains this: "<author country="United States">James Elliott</author>". Here author is the TAG, country is an ATTRIBUTE, and "James Elliott" is the TEXT.
To parse a document, first build an XmlPullParser object:
final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(true);
final XmlPullParser parser = factory.newPullParser();
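Pull parsing is a single forward pass over the document. Every call to next(), nextTag(), nextToken() or nextText() advances through the document and leaves the parser positioned on some event; it can never move backwards.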
Then hand the document to the parser:
parser.setInput(new StringReader("<author country=\"United States\">James Elliott</author>"));
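At this point the document has only just been set, so the parser sits at the very start of the document and the current event is START_DOCUMENT, which you can read with XmlPullParser.getEventType(). Calling next() then produces
START_TAG, which tells the application that a tag has started; getName() returns "author". Another next() produces
TEXT, and getText() returns "James Elliott". Another next() produces
END_TAG, which tells you the tag is finished. One more next() produces
END_DOCUMENT, which tells you the whole document has been processed.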
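The walk-through above can be sketched as a small loop that records each event. This is a minimal sketch, assuming an XmlPull implementation is available (the Android runtime, or a library such as kxml2 or xpp3 on a plain JVM); EventWalk and walk are hypothetical names, not part of the API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

public class EventWalk {
    /** Drives the parser with next() and returns a description of every event seen. */
    public static List<String> walk(String xml) throws Exception {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser parser = factory.newPullParser();
        parser.setInput(new StringReader(xml));

        List<String> events = new ArrayList<>();
        int eventType = parser.getEventType(); // START_DOCUMENT right after setInput()
        while (true) {
            switch (eventType) {
                case XmlPullParser.START_DOCUMENT:
                    events.add("START_DOCUMENT");
                    break;
                case XmlPullParser.START_TAG: {
                    // Attributes are only readable while positioned on the START_TAG
                    StringBuilder sb = new StringBuilder("START_TAG " + parser.getName());
                    for (int i = 0; i < parser.getAttributeCount(); i++) {
                        sb.append(' ').append(parser.getAttributeName(i))
                          .append('=').append(parser.getAttributeValue(i));
                    }
                    events.add(sb.toString());
                    break;
                }
                case XmlPullParser.TEXT:
                    events.add("TEXT " + parser.getText());
                    break;
                case XmlPullParser.END_TAG:
                    events.add("END_TAG " + parser.getName());
                    break;
                case XmlPullParser.END_DOCUMENT:
                    events.add("END_DOCUMENT");
                    return events; // nothing comes after END_DOCUMENT
                default:
                    break;
            }
            eventType = parser.next(); // the application pulls the next event
        }
    }
}
```

Calling EventWalk.walk() on the author example should yield the five events in the order described: START_DOCUMENT, START_TAG, TEXT, END_TAG, END_DOCUMENT.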
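Besides next(), you can also use nextToken(). It reports finer-grained events such as COMMENT, CDSECT, DOCDECL and ENTITY_REF, among others. If the program needs this lower-level information, drive the parser with nextToken() and handle those detailed events. One thing to watch for: a TEXT event may contain nothing but white space, such as line breaks or spaces.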
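The difference between next() and nextToken() can be illustrated by counting COMMENT events with each. This is a sketch under the same assumption as before (an XmlPull implementation such as the Android runtime or kxml2 is available); TokenDemo and countComments are hypothetical names.

```java
import java.io.StringReader;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

public class TokenDemo {
    /** Counts COMMENT events, driving the parser with nextToken() or with next(). */
    public static int countComments(String xml, boolean useNextToken) throws Exception {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser parser = factory.newPullParser();
        parser.setInput(new StringReader(xml));

        int comments = 0;
        int eventType = parser.getEventType();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            if (eventType == XmlPullParser.COMMENT) {
                comments++; // only nextToken() ever reports this event
            }
            eventType = useNextToken ? parser.nextToken() : parser.next();
        }
        return comments;
    }
}
```

For an input like "<a><!-- note --><b/></a>", next() silently skips the comment, while nextToken() surfaces it as its own event.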
Two more methods are very handy: nextTag() and nextText().
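nextTag()--It first skips white space. If you know that the next event is a START_TAG or an END_TAG, you can call nextTag() to jump straight to it; if the next event turns out to be anything else, it throws an XmlPullParserException. It has two common uses. At a START_TAG, if you know the tag has child tags, nextTag() produces the START_TAG of the first child. At an END_TAG, if you know this is not the end of the document, nextTag() produces the START_TAG of the next tag (or the parent's END_TAG when there is no further sibling). In both cases, next() on an indented document would first produce a TEXT event holding only line breaks or other white space.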
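nextText()--It may only be called at a START_TAG. If the next event is TEXT, the text content is returned; if the next event is END_TAG, meaning the tag is empty, an empty string is returned; if the tag contains a child tag instead, an XmlPullParserException is thrown. After the method returns, the parser is positioned on the END_TAG. For example: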
<author>James Elliott</author>
<author></author>
<author/>
Calling nextText() at each START_TAG returns, in order:
"James Elliott"
""(empty)
""(empty)
nextText() is especially useful for tags that have no child tags. For example:
<title>What Is Hibernate</title>
<author>James Elliott</author>
<category>Web</category>
These can be handled with the following code:
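// Assumes the parser is positioned on the START_TAG of the first child (here <title>).
int eventType = parser.getEventType();
String tag;
while (eventType != XmlPullParser.END_TAG) {
    switch (eventType) {
        case XmlPullParser.START_TAG:
            tag = parser.getName();
            final String content = parser.nextText(); // parser is now on this tag's END_TAG
            Log.e(TAG, tag + ": [" + content + "]");
            eventType = parser.nextTag(); // next sibling's START_TAG, or the parent's END_TAG
            break;
        default:
            // not reached from the assumed start: nextTag() only returns START_TAG or END_TAG
            break;
    }
}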
This is far more convenient than doing everything with next(), and much more readable as well.
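The behaviour of nextTag() and nextText() described above can be checked with a short sketch. It assumes an XmlPull implementation is available (the Android runtime, or kxml2/xpp3 on a plain JVM). NextHelpersDemo, afterRoot and childTexts are hypothetical names, and any wrapper root element passed in (for example <authors>) is added only so that the fragments form one well-formed document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

public class NextHelpersDemo {
    private static XmlPullParser newParser(String xml) throws Exception {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser parser = factory.newPullParser();
        parser.setInput(new StringReader(xml));
        return parser;
    }

    /** Event type reached from the root START_TAG using nextTag() or plain next(). */
    public static int afterRoot(String xml, boolean useNextTag) throws Exception {
        XmlPullParser parser = newParser(xml);
        parser.next(); // START_DOCUMENT -> START_TAG of the root
        return useNextTag ? parser.nextTag() : parser.next();
    }

    /** Calls nextText() on every child of the root and collects the results. */
    public static List<String> childTexts(String xml) throws Exception {
        XmlPullParser parser = newParser(xml);
        parser.nextTag(); // START_TAG of the root
        List<String> texts = new ArrayList<>();
        // nextTag() skips white space between children; the loop ends on the root's END_TAG
        while (parser.nextTag() == XmlPullParser.START_TAG) {
            texts.add(parser.nextText()); // leaves the parser on the child's END_TAG
        }
        return texts;
    }
}
```

On an indented document, afterRoot() lands on a white-space TEXT event when it uses next() and on the first child's START_TAG when it uses nextTag(). childTexts() only works for children without child tags of their own, since nextText() throws otherwise.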
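Finally, here is a sample Android program that parses XML, in this case an RSS feed. RssParser, RssFeed, FeedSettings and ReaderBaseException are the application's own classes and are not shown.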
import java.io.IOException;
import java.io.InputStream;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

import android.util.Log;

public class RssPullParser extends RssParser {
    private final String TAG = FeedSettings.GLOBAL_TAG;
    private InputStream mInputStream;

    public RssPullParser(InputStream is) {
        mInputStream = is;
    }

    public void parse() throws ReaderBaseException, XmlPullParserException, IOException {
        if (mInputStream == null) {
            throw new ReaderBaseException("no input source, did you initialize this class correctly?");
        }
        final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final XmlPullParser parser = factory.newPullParser();
        // setInput(InputStream, String) needs an encoding argument; null lets the parser detect it
        parser.setInput(mInputStream, null);
        int eventType = parser.getEventType();
        if (eventType != XmlPullParser.START_DOCUMENT) {
            throw new ReaderBaseException("Not starting with 'start_document'");
        }
        eventType = parseRss(parser);
        if (eventType != XmlPullParser.END_DOCUMENT) {
            throw new ReaderBaseException("not ending with 'end_document', do you finish parsing?");
        }
        if (mInputStream != null) {
            mInputStream.close();
        } else {
            Log.e(TAG, "inputstream is null, XmlPullParser closed it??");
        }
    }

    /**
     * Parses the XML document. The current event must be START_DOCUMENT.
     * After this call, the parser is positioned at END_DOCUMENT.
     * @param parser
     * @return event end_document
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     * @throws IOException
     */
    private int parseRss(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
        int eventType = parser.getEventType();
        if (eventType != XmlPullParser.START_DOCUMENT) {
            throw new ReaderBaseException("not starting with 'start_document', is this a new document?");
        }
        Log.e(TAG, "starting document, are you aware of that!");
        eventType = parser.next();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            switch (eventType) {
                case XmlPullParser.START_TAG: {
                    Log.e(TAG, "start tag: '" + parser.getName() + "'");
                    final String tagName = parser.getName();
                    if (tagName.equals(RssFeed.TAG_RSS)) {
                        Log.e(TAG, "starting an RSS feed <<");
                        final int attrSize = parser.getAttributeCount();
                        for (int i = 0; i < attrSize; i++) {
                            Log.e(TAG, "attr '" + parser.getAttributeName(i) + "=" + parser.getAttributeValue(i) + "'");
                        }
                    } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                        Log.e(TAG, "\tstarting an Channel <<");
                        parseChannel(parser);
                    }
                    break;
                }
                case XmlPullParser.END_TAG: {
                    Log.e(TAG, "end tag: '" + parser.getName() + "'");
                    final String tagName = parser.getName();
                    if (tagName.equals(RssFeed.TAG_RSS)) {
                        Log.e(TAG, ">> ending an RSS feed");
                    } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                        Log.e(TAG, "\t>> ending an Channel");
                    }
                    break;
                }
                default:
                    break;
            }
            eventType = parser.next();
        }
        Log.e(TAG, "end of document, it is over");
        return parser.getEventType();
    }

    /**
     * Parses a channel. The parser MUST be at the START_TAG of a channel, otherwise an exception is thrown.
     * After this call, the parser is positioned at the END_TAG of the channel.
     * @param parser
     * @return the END_TAG event of the channel
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     * @throws IOException
     */
    private int parseChannel(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
        int eventType = parser.getEventType();
        String tagName = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_CHANNEL.equals(tagName)) {
            throw new ReaderBaseException("not start with 'start tag', is this a start of a channel?");
        }
        Log.e(TAG, "\tstarting " + tagName);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
                case XmlPullParser.START_TAG: {
                    final String tag = parser.getName();
                    if (tag.equals(RssFeed.TAG_IMAGE)) {
                        parseImage(parser);
                    } else if (tag.equals(RssFeed.TAG_ITEM)) {
                        parseItem(parser);
                    } else {
                        final String content = parser.nextText();
                        Log.e(TAG, tag + ": [" + content + "]");
                    }
                    // now it SHOULD be at END_TAG, ensure it
                    if (parser.getEventType() != XmlPullParser.END_TAG) {
                        throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                    }
                    eventType = parser.nextTag();
                    break;
                }
                default:
                    break;
            }
        }
        Log.e(TAG, "\tending " + parser.getName());
        return parser.getEventType();
    }

    /**
     * Parse image in a channel.
     * Precondition: position must be at START_TAG and tag MUST be 'image'
     * Postcondition: position is END_TAG of '/image'
     * @throws IOException
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     */
    private int parseImage(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
        int eventType = parser.getEventType();
        String tag = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_IMAGE.equals(tag)) {
            throw new ReaderBaseException("not start with 'start tag', is this a start of an image?");
        }
        Log.e(TAG, "\t\tstarting image " + tag);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
                case XmlPullParser.START_TAG:
                    tag = parser.getName();
                    Log.e(TAG, tag + ": [" + parser.nextText() + "]");
                    // now it SHOULD be at END_TAG, ensure it
                    if (parser.getEventType() != XmlPullParser.END_TAG) {
                        throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                    }
                    eventType = parser.nextTag();
                    break;
                default:
                    break;
            }
        }
        Log.e(TAG, "\t\tending image " + parser.getName());
        return parser.getEventType();
    }

    /**
     * Parse an item in a channel.
     * Precondition: position must be at START_TAG and tag MUST be 'item'
     * Postcondition: position is END_TAG of '/item'
     * @throws IOException
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     */
    private int parseItem(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
        int eventType = parser.getEventType();
        String tag = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_ITEM.equals(tag)) {
            throw new ReaderBaseException("not start with 'start tag', is this a start of an item?");
        }
        Log.e(TAG, "\t\tstarting " + tag);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
                case XmlPullParser.START_TAG:
                    tag = parser.getName();
                    final String content = parser.nextText();
                    Log.e(TAG, tag + ": [" + content + "]");
                    // now it SHOULD be at END_TAG, ensure it
                    if (parser.getEventType() != XmlPullParser.END_TAG) {
                        throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                    }
                    eventType = parser.nextTag();
                    break;
                default:
                    break;
            }
        }
        Log.e(TAG, "\t\tending " + parser.getName());
        return parser.getEventType();
    }
}