Ruby XML, XSLT and XPath tutorials

What is XML?

XML stands for eXtensible Markup Language.

XML is a subset of the Standard Generalized Markup Language (SGML), a markup language used to mark up electronic documents so that they are structured.

It can be used to mark up data and define data types, and it is a meta-language that lets users define their own markup languages. It is well suited to transmission over the web and provides a unified way to describe and exchange structured data independently of any application or vendor.

For more, see our XML tutorial.

XML parser structure and API

XML parsers mainly come in two kinds: DOM and SAX.

  • The SAX parser is event-driven. It scans the XML document from start to finish, and each time it encounters a syntactic structure (such as a start tag) it calls the event handler for that structure, thereby sending an event to the application.
  • The DOM (Document Object Model) parser builds a hierarchical structure of the document and sets up a DOM tree in memory, whose nodes are represented as objects. After the document is parsed, its entire DOM tree resides in memory.
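
To make the difference concrete, here is a minimal sketch that counts elements in both styles using REXML, the Ruby library covered in the next section. The sample XML string and the TagCounter class are made up for this illustration.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample document, used only for this illustration.
xml = "<library><book>Ruby</book><book>XML</book></library>"

# DOM style: build the whole tree in memory, then query it.
doc = REXML::Document.new(xml)
dom_count = REXML::XPath.match(doc, "//book").size

# SAX style: no tree is built; a handler is called for each start tag.
class TagCounter
  include REXML::StreamListener
  attr_reader :count

  def initialize(name)
    @name = name
    @count = 0
  end

  # Called by the parser once per opening tag.
  def tag_start(name, _attrs)
    @count += 1 if name == @name
  end
end

counter = TagCounter.new("book")
REXML::Document.parse_stream(xml, counter)

puts "DOM count: #{dom_count}"
puts "SAX count: #{counter.count}"
```

Both approaches should report the same count. The DOM version keeps the whole document available for later queries, while the SAX version only keeps whatever state the handler chooses to store.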

Parsing and creating XML in Ruby

XML documents can be parsed in Ruby with the REXML library.

REXML is an XML toolkit for Ruby that is written in pure Ruby and conforms to the XML 1.0 specification.

In Ruby 1.8 and later, REXML is included in the Ruby standard library (since Ruby 3.0 it ships as a bundled gem).

The path to the REXML library is: rexml/document

All methods and classes are encapsulated in a single REXML module.
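
Because everything lives in the REXML module, classes can be reached with the REXML:: prefix instead of including the module. The following is a minimal sketch, not a complete reference, that builds a small document this way; the element and attribute names are sample data chosen for illustration.

```ruby
require 'rexml/document'

# No `include REXML` here, so every class is reached through the module name.
doc = REXML::Document.new

# Build <collection shelf="New Arrivals"> with one <movie> child.
root = doc.add_element("collection", "shelf" => "New Arrivals")
movie = root.add_element("movie", "title" => "Trigun")
movie.add_element("format").text = "DVD"

# Serialize the document into a String instead of writing to stdout.
output = String.new
doc.write(output)
puts output
```

Using the prefix keeps the top-level namespace clean; the examples later in this tutorial use `include REXML` instead so that the class names are shorter.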

The REXML parser has the following advantages over other parsers:

  • It is written 100% in Ruby.
  • It can be used for both SAX-style and DOM-style parsing.
  • It is lightweight, with under 2000 lines of code.
  • Its methods and classes are easy to understand.
  • It is based on the SAX2 API and has full XPath support.
  • It comes with the Ruby installation, so no separate install is needed.

The XML used in the examples below is saved as movies.xml:

<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <format>DVD</format>
    <year>2003</year>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Talk about a US-Japan war</description>
  </movie>
  <movie title="Transformers">
    <type>Anime, Science Fiction</type>
    <format>DVD</format>
    <year>1989</year>
    <rating>R</rating>
    <stars>8</stars>
    <description>A schientific fiction</description>
  </movie>
  <movie title="Trigun">
    <type>Anime, Action</type>
    <format>DVD</format>
    <episodes>4</episodes>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Vash the Stampede!</description>
  </movie>
  <movie title="Ishtar">
    <type>Comedy</type>
    <format>VHS</format>
    <rating>PG</rating>
    <stars>2</stars>
    <description>Viewable boredom</description>
  </movie>
</collection>

DOM parser

Let's first parse the XML data in a tree fashion. We begin by requiring the rexml/document library. For convenience, REXML is often included in the top-level namespace:


#!/usr/bin/ruby -w

require 'rexml/document'
include REXML

xmlfile = File.new("movies.xml")
xmldoc = Document.new(xmlfile)

# Get the root element
root = xmldoc.root
puts "Root element : " + root.attributes["shelf"]

# Print the title of every movie
xmldoc.elements.each("collection/movie") { |e|
  puts "Movie Title : " + e.attributes["title"]
}

# Print the type of every movie
xmldoc.elements.each("collection/movie/type") { |e|
  puts "Movie Type : " + e.text
}

# Print the description of every movie
xmldoc.elements.each("collection/movie/description") { |e|
  puts "Movie Description : " + e.text
}

The output of the above example is:

Root element : New Arrivals
Movie Title : Enemy Behind
Movie Title : Transformers
Movie Title : Trigun
Movie Title : Ishtar
Movie Type : War, Thriller
Movie Type : Anime, Science Fiction
Movie Type : Anime, Action
Movie Type : Comedy
Movie Description : Talk about a US-Japan war
Movie Description : A schientific fiction
Movie Description : Vash the Stampede!
Movie Description : Viewable boredom
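
The same tree can also be walked one movie at a time. The sketch below turns each movie into a Hash and copes with children that are missing for some movies (Trigun has no year element). It uses a shortened, inlined copy of movies.xml so that it runs on its own; the Hash layout is an assumption made for this illustration.

```ruby
require 'rexml/document'

# A shortened copy of movies.xml, inlined so the sketch is self-contained.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind">
      <format>DVD</format>
      <year>2003</year>
    </movie>
    <movie title="Trigun">
      <format>DVD</format>
      <episodes>4</episodes>
    </movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Turn each <movie> into a plain Hash; a missing <year> becomes nil.
movies = []
doc.elements.each("collection/movie") do |movie|
  year = movie.elements["year"]
  movies << {
    title: movie.attributes["title"],
    format: movie.elements["format"].text,
    year: year ? year.text.to_i : nil
  }
end

movies.each { |m| puts m.inspect }
```

Checking the result of `elements["year"]` before calling `text` avoids a NoMethodError on movies that lack that child.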

SAX parser

We now process the same data file, movies.xml, in a stream-oriented way. SAX-style parsing is not recommended for a file this small; the following is only a simple demonstration:


#!/usr/bin/ruby -w

require 'rexml/document'
require 'rexml/streamlistener'
include REXML

class MyListener
  include REXML::StreamListener

  # Called for every opening tag with its name and attribute hash
  def tag_start(*args)
    puts "tag_start: #{args.map { |x| x.inspect }.join(', ')}"
  end

  # Called for every run of character data
  def text(data)
    # Skip whitespace-only text and single-word text such as "DVD"
    return if data =~ /^\w*$/
    abbrev = data[0..40] + (data.length > 40 ? "..." : "")
    puts "  text : #{abbrev.inspect}"
  end
end

list = MyListener.new
xmlfile = File.new("movies.xml")
Document.parse_stream(xmlfile, list)

The output of the above example is:

tag_start: "collection", {"shelf"=>"New Arrivals"}
tag_start: "movie", {"title"=>"Enemy Behind"}
tag_start: "type", {}
  text : "War, Thriller"
tag_start: "format", {}
tag_start: "year", {}
tag_start: "rating", {}
tag_start: "stars", {}
tag_start: "description", {}
  text : "Talk about a US-Japan war"
tag_start: "movie", {"title"=>"Transformers"}
tag_start: "type", {}
  text : "Anime, Science Fiction"
tag_start: "format", {}
tag_start: "year", {}
tag_start: "rating", {}
tag_start: "stars", {}
tag_start: "description", {}
  text : "A schientific fiction"
tag_start: "movie", {"title"=>"Trigun"}
tag_start: "type", {}
  text : "Anime, Action"
tag_start: "format", {}
tag_start: "episodes", {}
tag_start: "rating", {}
tag_start: "stars", {}
tag_start: "description", {}
  text : "Vash the Stampede!"
tag_start: "movie", {"title"=>"Ishtar"}
tag_start: "type", {}
tag_start: "format", {}
tag_start: "rating", {}
tag_start: "stars", {}
tag_start: "description", {}
  text : "Viewable boredom"
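
A stream listener usually has to keep some state of its own, because the parser does not remember anything between events. The following minimal sketch records the format of each movie; the FormatListener class and its instance variables are hypothetical names chosen for this illustration, and the XML is a shortened, inlined copy of movies.xml.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# A shortened copy of movies.xml, inlined so the sketch is self-contained.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format></movie>
    <movie title="Ishtar"><format>VHS</format></movie>
  </collection>
XML

# Hypothetical listener: remembers the current movie and records its format.
class FormatListener
  include REXML::StreamListener
  attr_reader :formats

  def initialize
    @formats = {}
    @current_title = nil
    @in_format = false
  end

  # Track which movie we are inside and whether a <format> tag is open.
  def tag_start(name, attrs)
    @current_title = attrs["title"] if name == "movie"
    @in_format = true if name == "format"
  end

  # Only text that appears inside <format> is stored.
  def text(data)
    @formats[@current_title] = data.strip if @in_format
  end

  def tag_end(name)
    @in_format = false if name == "format"
    @current_title = nil if name == "movie"
  end
end

listener = FormatListener.new
REXML::Document.parse_stream(xml, listener)
p listener.formats
```

The listener never sees the document as a whole, so memory use stays small even for large inputs, at the cost of this manual bookkeeping.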
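
XPath and Ruby

REXML also supports XPath through its XPath class, which works on a parsed document tree. The following lines reuse the xmldoc object from the DOM parser example:

# Get the first movie element
movie = XPath.first(xmldoc, "//movie")
p movie

# Print all movie types
XPath.each(xmldoc, "//type") { |e| puts e.text }

# Get the formats of all movies as an array
names = XPath.match(xmldoc, "//format").map { |x| x.text }
p names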

The output of the above example is:

<movie title='Enemy Behind'> ... </>
War, Thriller
Anime, Science Fiction
Anime, Action
Comedy
["DVD", "DVD", "DVD", "VHS"]
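
XPath expressions can also filter nodes with predicates. The sketch below is a small illustration under the assumption that REXML evaluates standard XPath 1.0 attribute and child-element predicates; it uses a shortened, inlined copy of movies.xml so that it runs on its own.

```ruby
require 'rexml/document'

# A shortened copy of movies.xml, inlined so the sketch is self-contained.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format><stars>10</stars></movie>
    <movie title="Trigun"><format>DVD</format><stars>10</stars></movie>
    <movie title="Ishtar"><format>VHS</format><stars>2</stars></movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Attribute predicate: the <stars> child of the movie titled "Ishtar".
ishtar_stars = REXML::XPath.first(doc, "//movie[@title='Ishtar']/stars").text

# Child-element predicate: all movies whose <format> is DVD.
dvd_movies = REXML::XPath.match(doc, "//movie[format='DVD']")
dvd_titles = dvd_movies.map { |m| m.attributes["title"] }

puts ishtar_stars
p dvd_titles
```

Pushing the filter into the XPath expression keeps the Ruby side short; the same selection could also be written by iterating over every movie and testing it in Ruby.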

XSLT and Ruby

Two XSLT parsers are available for Ruby. Each is briefly described below:

Ruby-Sablotron

This parser is written and maintained by Masayoshi Takahashi. It is written primarily for the Linux operating system and requires the following libraries:

  • Sablot
  • Iconv
  • Expat

You can find these libraries at Ruby-Sablotron.


XSLT4R

XSLT4R was written by Michael Neumann. It provides a simple command-line interface and can also be used from third-party applications to transform XML documents.

XSLT4R needs XMLScan to operate. XMLScan is included in the XSLT4R archive and is also a 100% Ruby module. These modules can be installed using the standard Ruby installation method (i.e., ruby install.rb).

The XSLT4R command-line syntax is as follows:

ruby xslt.rb stylesheet.xsl document.xml [arguments]

If you want to use XSLT4R from within your application, you can require XSLT and pass in the parameters you need. An example follows:


require "xslt"

# Read both files as single strings
stylesheet = File.read("stylesheet.xsl")
xml_doc = File.read("document.xml")

arguments = { 'image_dir' => '/....' }
sheet = XSLT::Stylesheet.new(stylesheet, arguments)

# Output to stdout
sheet.apply(xml_doc)

# Output to the string 'str'
str = ""
sheet.output = [str]
sheet.apply(xml_doc)

© welookups 2018 - 2019. All rights reserved.