Parsing XML data in Ruby with the REXML library

Time:2022-1-2

REXML is an XML processor written entirely in Ruby. It offers several APIs, the two main ones being DOM-like and SAX-like. The first reads the whole file into memory and stores it in hierarchical form (that is, as a tree). The second parses as it goes, which is more suitable when the file is large and memory is limited.
REXML has the following characteristics:

  • 100% written in Ruby
  • Supports both SAX-like and DOM-like parsing
  • Lightweight, fewer than 2000 lines of code
  • Provides complete API support
  • Ships with Ruby (as a bundled gem since Ruby 3.0)
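
To make the parse-as-you-go style concrete, here is a minimal sketch of REXML's stream API. The TitleListener class and the inline XML string are made up for illustration; the parser calls tag_start for each opening tag, so no tree is ever built in memory.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects the title attribute of every <movie> element.
class TitleListener
  include REXML::StreamListener

  attr_reader :titles

  def initialize
    @titles = []
  end

  # Called by the parser for each opening tag.
  # attrs is a Hash mapping attribute names to values.
  def tag_start(name, attrs)
    @titles << attrs["title"] if name == "movie"
  end
end

# Illustrative inline data; a File object could be passed instead of a String.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format></movie>
    <movie title="Ishtar"><format>VHS</format></movie>
  </collection>
XML

listener = TitleListener.new
REXML::Document.parse_stream(xml, listener)
puts listener.titles
```

Only the callbacks you define do any work; the other StreamListener methods default to doing nothing, which keeps a listener like this short.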

Let's see how to use it. Suppose we have the following XML file, movies.xml:

<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <format>DVD</format>
    <year>2003</year>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Talk about a US-Japan war</description>
  </movie>
  <movie title="Transformers">
    <type>Anime, Science Fiction</type>
    <format>DVD</format>
    <year>1989</year>
    <rating>R</rating>
    <stars>8</stars>
    <description>A scientific fiction</description>
  </movie>
  <movie title="Trigun">
    <type>Anime, Action</type>
    <format>DVD</format>
    <episodes>4</episodes>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Vash the Stampede!</description>
  </movie>
  <movie title="Ishtar">
    <type>Comedy</type>
    <format>VHS</format>
    <rating>PG</rating>
    <stars>2</stars>
    <description>Viewable boredom</description>
  </movie>
</collection>

Parsing DOM:

require 'rexml/document'
include REXML

# Read the file and build the in-memory document tree
xmldoc = Document.new(File.read("movies.xml"))

# The root element is <collection>; print its shelf attribute
root = xmldoc.root
puts "Root element : " + root.attributes["shelf"]

# Print the title attribute of every movie
xmldoc.elements.each("collection/movie") do |e|
  puts "Movie Title : " + e.attributes["title"]
end

# Print the type of every movie
xmldoc.elements.each("collection/movie/type") do |e|
  puts "Movie Type : " + e.text
end

# Print the description of every movie
xmldoc.elements.each("collection/movie/description") do |e|
  puts "Movie Description : " + e.text
end
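
The loops above call e.text directly, which assumes every element is present and non-empty. In the sample data, for example, Trigun has no year element. As a small sketch (the inline XML and the hash keys are chosen here for illustration), each movie can be gathered into a Hash with lookups that tolerate missing children:

```ruby
require 'rexml/document'

# Illustrative inline data, trimmed from the sample file.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><type>War, Thriller</type><year>2003</year></movie>
    <movie title="Trigun"><type>Anime, Action</type><episodes>4</episodes></movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Build one Hash per movie. elements["year"] returns nil when the child is
# absent, so &. keeps the lookup from raising NoMethodError.
movies = doc.elements.to_a("collection/movie").map do |movie|
  {
    title: movie.attributes["title"],
    type:  movie.elements["type"]&.text,
    year:  movie.elements["year"]&.text
  }
end

movies.each do |m|
  puts "#{m[:title]} (#{m[:year] || 'year unknown'}): #{m[:type]}"
end
```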

Using XPath:

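require 'rexml/document'
include REXML

xmldoc = Document.new(File.read("movies.xml"))

# First <movie> element anywhere in the document
movie = XPath.first(xmldoc, "//movie")
p movie

# Print the text of every <type> element
XPath.each(xmldoc, "//type") { |e| puts e.text }

# Collect the text of every <format> element into an array
formats = XPath.match(xmldoc, "//format").map { |x| x.text }
p formats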
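
XPath expressions can also narrow a match with a predicate. The following is a sketch on trimmed inline data (the XML string and variable names are illustrative): it picks one movie by its title attribute, then filters on the numeric stars value in plain Ruby rather than relying on XPath number comparisons.

```ruby
require 'rexml/document'

# Illustrative inline data, trimmed from the sample file.
xml = <<~XML
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind"><format>DVD</format><stars>10</stars></movie>
    <movie title="Transformers"><format>DVD</format><stars>8</stars></movie>
    <movie title="Ishtar"><format>VHS</format><stars>2</stars></movie>
  </collection>
XML

doc = REXML::Document.new(xml)

# Attribute predicate: the first <movie> whose title attribute is "Ishtar".
ishtar = REXML::XPath.first(doc, "//movie[@title='Ishtar']")
puts ishtar.elements["format"].text

# Select every <movie>, then keep those with at least 8 stars.
top_rated = REXML::XPath.match(doc, "//movie")
                        .select { |m| m.elements["stars"].text.to_i >= 8 }
                        .map { |m| m.attributes["title"] }
p top_rated
```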

Keep these snippets for reference in case you need them.

PS: security issues in REXML
On August 23, 2008, the official Ruby website issued a security notice: http://www.ruby-lang.org/en/news/2008/08/23/dos-vulnerability-in-rexml/ . In the REXML library shipped with Ruby at that time, parsing an XML document that contains recursively nested entities can be used for a denial-of-service attack that exhausts server resources.
Rails applications that parse XML with REXML are affected and need to be patched. The fixes for Rails are as follows:
1. Rails 2.0.2 and earlier
Download the fix file, copy it to the RAILS_ROOT/lib directory, and add the following statement to environment.rb:

require 'rexml-expansion-fix'

2. Rails 2.1.0 and above
Download the fix file and copy it to the RAILS_ROOT/config/initializers directory.
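
Outside of those old Rails versions, REXML itself exposes a limit on entity expansion. The sketch below assumes a REXML release that provides the REXML::Security module; the value 100 and the inline XML are arbitrary choices for illustration, not recommended settings.

```ruby
require 'rexml/document'

# Assumption: this REXML version provides REXML::Security (older releases
# exposed the same setting as REXML::Document.entity_expansion_limit).
# The value 100 is an arbitrary, illustrative choice.
REXML::Security.entity_expansion_limit = 100

# Parsing then proceeds as usual; ordinary documents are unaffected.
xml = '<collection shelf="New Arrivals"><movie title="Ishtar"/></collection>'
doc = REXML::Document.new(xml)
puts doc.root.attributes["shelf"]
```

Setting the limit once at start-up, before any untrusted XML is parsed, keeps the policy in one place.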