problem: actually it is not a problem; it is more a collection of tips for managing a DOM document (using the open source Xerces and Xalan libraries from the Apache Group): parsing an XML file, serializing and deserializing, applying an XSL style sheet, and validating against a defined XSD. You don't need a degree to figure out how to do it, but with thousands and thousands of classes inside those libraries, I would have found it hard without Google.
solution: ok, where do we begin?
What we'll do first is simply read from a text file. Xerces gives you a simple way to open an XML file:
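// needs: java.io.File, java.io.IOException, javax.xml.parsers.*,
//        org.w3c.dom.Document, org.xml.sax.SAXException
private Document parseXmlFromFile(String filePath) {
    try {
        // get the factory
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // using the factory, get an instance of the document builder
        DocumentBuilder db = dbf.newDocumentBuilder();
        // parse using the builder to get a DOM representation of the XML file;
        // pass a File, because parse(String) expects a URI rather than a path
        return db.parse(new File(filePath));
    } catch (ParserConfigurationException | SAXException | IOException e) {
        e.printStackTrace();
        return null; // parsing failed
    }
}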
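Once you hold the Document, reading values out of it is plain DOM navigation. Here is a minimal sketch of the round trip; the class name ParseFileDemo, the temporary file, the library/book element names and the helper methods are all made up for illustration, and failures are rethrown as unchecked exceptions just to keep the example short:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class ParseFileDemo {
    // Writes the XML to a temporary file so the example has something to read.
    static Path writeTemp(String xml) {
        try {
            Path file = Files.createTempFile("library", ".xml");
            Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same idea as parseXmlFromFile above, but failures surface as exceptions.
    static Document parseXmlFromFile(Path file) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(file.toFile());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Cannot parse " + file, e);
        }
    }

    // Collects the text of every <book> element (hypothetical element name).
    static List<String> bookTitles(Document doc) {
        NodeList books = doc.getElementsByTagName("book");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            titles.add(books.item(i).getTextContent());
        }
        return titles;
    }

    public static void main(String[] args) {
        Path file = writeTemp("<library><book>Dune</book><book>Emma</book></library>");
        System.out.println(bookTitles(parseXmlFromFile(file)));
    }
}
```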
Sometimes, though, you only have a string containing your XML data (perhaps from an HTTP request or a web service), and you just want to deserialize it into a Document object:
// needs: java.io.StringReader, javax.xml.parsers.*,
//        org.w3c.dom.Document, org.xml.sax.InputSource
private Document deserialize(String xml) {
    try {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringComments(true);
        dbf.setIgnoringElementContentWhitespace(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        // wrap the string so the builder can read it like a stream
        StringReader reader = new StringReader(xml);
        InputSource source = new InputSource(reader);
        return db.parse(source);
    } catch (Exception e) {
        e.printStackTrace();
        return null; // parsing failed
    }
}
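Ok, I know, it's almost the same as the previous code; the only difference is that now you take your data from a StringReader object rather than from the path of a file. I have also set two properties on the factory: one ignores comments and the other ignores useless white space (for example, formatting spaces between elements). Note that the second one only takes effect when the parser is validating against a DTD, because it needs the content model to know which white space is ignorable.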
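To see the comment-ignoring property at work, here is a small sketch. The class name DeserializeDemo, the order/item sample XML and the countComments helper are invented for the example, and parse errors are rethrown unchecked only to keep it compact:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DeserializeDemo {
    // Parses an XML string, dropping comments as deserialize() does.
    static Document deserialize(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setIgnoringComments(true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    // Counts the comment nodes that are direct children of the root element.
    static int countComments(Document doc) {
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.COMMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = deserialize(
                "<order><!-- internal note --><item>book</item></order>");
        System.out.println(doc.getDocumentElement().getNodeName()); // order
        System.out.println(countComments(doc));                     // 0
    }
}
```

If you remove the setIgnoringComments(true) call, the comment shows up as a child node of the root element.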
And what if you want to obtain the string equivalent of a Document's content? Don't be scared, there is a solution:
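// needs: java.io.StringWriter, org.w3c.dom.Document,
//        org.apache.xml.serialize.OutputFormat, org.apache.xml.serialize.XMLSerializer
private String serialize(Document document) {
    try {
        OutputFormat format = new OutputFormat(document);
        StringWriter stringOut = new StringWriter();
        XMLSerializer serial = new XMLSerializer(stringOut, format);
        serial.asDOMSerializer();
        // write the tree, starting from the root element, into the writer
        serial.serialize(document.getDocumentElement());
        return stringOut.toString();
    } catch (Exception e) {
        e.printStackTrace();
        return null; // serialization failed
    }
}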
...and voilà, your string is right there!
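If you'd rather not depend on the Xerces-specific serializer classes, one of the "other ways" is an identity transform through the standard JAXP Transformer API. This is a different technique from the one above, shown only as a sketch; the class name SerializeDemo and the greeting sample document are invented for the example:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SerializeDemo {
    // Builds a tiny in-memory document: <greeting>hello</greeting>
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("greeting");
            root.setTextContent("hello");
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    // A Transformer created without a style sheet copies input to output.
    static String serialize(Document document) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            // leave out the <?xml ...?> header so only the content is returned
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Cannot serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(sampleDocument()));
    }
}
```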
And what if you need to apply an XSL style sheet to your wonderful XML? No trouble at all, just use this simple code:
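// needs: java.io.StringWriter, javax.xml.transform.*, javax.xml.transform.dom.DOMSource,
//        javax.xml.transform.stream.StreamSource, javax.xml.transform.stream.StreamResult
// urlSchema is the URL of the XSL style sheet to apply
public String transform(Document xml, String urlSchema) {
    try {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        // compile the style sheet into a transformer
        Transformer transformer =
                tFactory.newTransformer(new StreamSource(urlSchema));
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(xml), new StreamResult(writer));
        return writer.toString();
    } catch (Exception e) {
        e.printStackTrace();
        return null; // transformation failed
    }
}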
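To try the idea without a style sheet sitting at some URL, here is a self-contained sketch. It assumes both the XML and the XSL arrive as strings and reads them through StreamSource instead of the DOMSource used above; the class name TransformDemo and the order/item sample data are made up:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TransformDemo {
    // Applies the style sheet text to the XML text and returns the result.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter writer = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // A tiny style sheet that emits plain text for the <order> element.
        String xsl =
                "<xsl:stylesheet version=\"1.0\" "
              + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
              + "<xsl:output method=\"text\"/>"
              + "<xsl:template match=\"/order\">"
              + "Item: <xsl:value-of select=\"item\"/>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
        String xml = "<order><item>book</item></order>";
        System.out.println(transform(xml, xsl)); // Item: book
    }
}
```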
Finally, dear fellow butcher, if you want to validate your XML data against an XSD definition, just copy and paste this code:
public boolean validate(String xml, String schemaURL) {
    try {
        String schemaLang = "http://www.w3.org/2001/XMLSchema";
        SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
        Schema schema = factory.newSchema(new StreamSource(schemaURL));
        Validator validator = schema.newValidator();
        StringReader reader = new StringReader(xml);
        validator.validate(new StreamSource(reader));
        return true;
    } catch (Exception e) {
        e.printStackTrace();
        return false;
    }
}
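This method returns true or false depending on the result of the validation process. If you also want to know why the XML content is not valid, let the exception propagate out of the method instead of swallowing it, and catch it outside: a validation failure is reported as a SAXException (usually a SAXParseException, which also carries the line and column of the problem).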
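A variant that hands the reason back to the caller could look like the sketch below. It assumes the schema is passed as a string rather than a URL, and the class name ValidateDemo, the validationError method and the sample schema are all invented for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidateDemo {
    // Returns an empty Optional when the XML is valid, else the reason.
    static Optional<String> validationError(String xml, String xsd) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return Optional.empty();
        } catch (SAXParseException e) {
            // the parse exception knows where in the input the problem is
            return Optional.of("line " + e.getLineNumber() + ": " + e.getMessage());
        } catch (SAXException | IOException e) {
            return Optional.of(String.valueOf(e.getMessage()));
        }
    }

    public static void main(String[] args) {
        // Schema: an <order> element containing exactly one <item> string.
        String xsd =
                "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
              + "<xs:element name=\"order\"><xs:complexType><xs:sequence>"
              + "<xs:element name=\"item\" type=\"xs:string\"/>"
              + "</xs:sequence></xs:complexType></xs:element>"
              + "</xs:schema>";
        System.out.println(validationError("<order><item>book</item></order>", xsd));
        System.out.println(validationError("<order><price>3</price></order>", xsd));
    }
}
```

The first call prints an empty Optional; the second prints the parser's complaint about the unexpected price element.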
I showed how to use the DOM classes to manage the whole XML content of a specific file or string, without going into all the whys and becauses of each line. I'm pretty sure you can find several other ways to do what I coded, but this fits my usual needs.