Exercise: Validate XML with a Given DTD

Last modified by superadmin on 2018-01-12 21:34

  • Public source: http://java-eim.googlecode.com/svn/trunk/java-eim-demo-xmlsamples
  • Technologies: Java (J2SDK 1.6), Ant and Maven2 build tools, JDeveloper IDE, Xerces XML library, Sun MSV
  • Estimated time: 15 minutes

Use Ant to validate XML files and correct the validation errors. Compare the behavior of the default validator (e.g. the "validate" task in Ant, which relies on Xerces) with that of a fail-fast validator (e.g. the Sun Multi-Schema Validator). Resolve "PUBLIC" DTD identifiers to local copies rather than fetching the DTDs from the Internet, which may slow down parsing considerably.
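
Resolving a "PUBLIC" DTD locally can be sketched in Java with a SAX entity resolver. This is only an illustration under stated assumptions: the public identifier, the one-line DTD and the class name below are hypothetical, and the build scripts of this exercise may configure local DTD lookup differently.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdResolverSketch {

    // Hypothetical public identifier and DTD text. In the exercise, a local
    // file such as xhtml1-strict.dtd would play the role of LOCAL_DTD.
    static final String PUBLIC_ID = "-//Example//DTD Note 1.0//EN";
    static final String LOCAL_DTD = "<!ELEMENT note (#PCDATA)>";

    // The system identifier points at a host that does not exist, so the
    // document can only be validated if the DTD is resolved locally.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note PUBLIC \"" + PUBLIC_ID + "\" \"http://example.invalid/note.dtd\">\n"
      + "<note>hello</note>\n";

    // Returns true if the document is valid against the locally supplied DTD.
    static boolean isValid(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // enable DTD validation
        SAXParser parser = factory.newSAXParser();
        final boolean[] valid = { true };
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                if (PUBLIC_ID.equals(publicId)) {
                    // Serve the local copy instead of opening a network connection.
                    return new InputSource(new StringReader(LOCAL_DTD));
                }
                return null; // fall back to the parser's default resolution
            }

            @Override
            public void error(SAXParseException e) {
                valid[0] = false; // a validity error was reported
            }
        });
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid: " + isValid(SAMPLE));
    }
}
```

The resolver is consulted before the parser tries to fetch the system identifier, so no network access is needed when the public identifier is known.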

General Description and Scope

To ensure that the structure of XML data is correct, the data can be validated against a DTD grammar. Despite its many limitations (no support for namespaces, very few data types for attributes, etc.), DTD is still widely used because of its simplicity. In this exercise you will validate a file against the given DTD and correct any errors the parser reports, e.g. missing attributes or missing closing tags.
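
As a rough illustration of what DTD validation reports, the following Java sketch validates a small document whose internal DTD requires an "id" attribute on every item. The sample document and the class name are made up for this illustration; the exercise itself runs validation through Ant rather than through code like this.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationSketch {

    // Hypothetical sample: the second <item> lacks the required "id" attribute.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE list [\n"
      + "  <!ELEMENT list (item+)>\n"
      + "  <!ELEMENT item (#PCDATA)>\n"
      + "  <!ATTLIST item id CDATA #REQUIRED>\n"
      + "]>\n"
      + "<list>\n"
      + "  <item id=\"a\">first</item>\n"
      + "  <item>second</item>\n"
      + "</list>\n";

    // Parses the document with DTD validation enabled and collects every
    // validity error instead of stopping at the first one.
    static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        final List<String> problems = new ArrayList<String>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
        });
        return problems;
    }

    public static void main(String[] args) throws Exception {
        for (String problem : validate(SAMPLE)) {
            System.out.println(problem);
        }
    }
}
```

A missing closing tag is a different kind of problem: it makes the document not well-formed, which the parser reports as a fatal error (an exception here) rather than as a validity error.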

In some documents it can be difficult to locate a parse error, because some parsers report a syntax error only much later than where it occurs, perhaps at the very end of the document. It can therefore be useful to apply fail-fast validation, i.e. parsing that guarantees that grammatically incorrect XML (either not well-formed or invalid w.r.t. the given DTD) is detected as early as possible.
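
One way to approximate fail-fast behavior with a standard SAX parser is to let the error handler throw on the first validity error, so that parsing stops where the problem is detected. This is a minimal sketch of the idea, not the way Sun MSV implements it; the sample document and the class name are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class FailFastSketch {

    // Hypothetical sample: <entry> is not declared in the DTD.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE list [\n"
      + "  <!ELEMENT list (item+)>\n"
      + "  <!ELEMENT item (#PCDATA)>\n"
      + "]>\n"
      + "<list>\n"
      + "  <entry>oops</entry>\n"
      + "  <item>ok</item>\n"
      + "</list>\n";

    // Returns the line number of the first problem (validity error or
    // well-formedness error), or -1 if the document parses cleanly.
    static int firstProblemLine(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // abort on the first validity error
                }
                // fatalError() in DefaultHandler already throws, so
                // well-formedness errors abort parsing as well.
            });
            return -1;
        } catch (SAXParseException e) {
            return e.getLineNumber();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("first problem at line " + firstProblemLine(SAMPLE));
    }
}
```

A handler that merely logs errors would keep parsing and could report further problems later in the document; throwing immediately keeps the reported position close to the actual mistake.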

Provided resources

  • Run Ant scripts to validate
  • Use a text editor (e.g. "vi", the "Edit" function of mc, or "Kate") to edit XML files whenever necessary.

Activities

  1. Run the following from the command line:
     cd /home/student/workspace/java-eim-demo-xmlsamples
     ant -version          # check the Ant version
     ant -projecthelp      # list the available Ant targets
     ant validate
     ant -verbose validate
     ant validate-local
     ant failfast-validate
  2. Correct the syntax error in src/test/resources/validation/minimal.html and validate again.
  3. Notice that "ant validate-local" reports a different line number than "ant failfast-validate". The "failfast-validate" target actually validates a different file, minimal_no_dtd.html, in which the DTD declaration is commented out.
  4. If you have time, add another validation task. Try to validate a feed file, e.g. a Google News Atom 0.3 file, against its grammar (e.g. the Atom 0.3 grammar). To do this, add one more line to the "failfast-validate" target in build.xml:
     <target name="failfast-validate"
         description="Check some files against a DTD (Sun Multi-Schema Validator)">
       <java classpathref="msv.classpath" fork="true" dir="."
           classname="com.sun.msv.driver.textui.Driver">
         <arg value="src/test/resources/validation/xhtml1-strict.dtd" />
         <arg value="src/test/resources/validation/minimal_no_dtd.html" />
       </java>
     </target>
  5. If there are any syntax errors, correct them until none remain.
Created by Kalvis Apsītis on 2007-10-13 22:40
    
This wiki is licensed under a Creative Commons 2.0 license