FulDex: A Fully-Indexing-Enabled Memory Representation Model for Supporting XML Regular Expression Queries

  • 李 文淦

Student thesis: Master's Thesis


XML is widely used in the field of service computing due to its simplicity, generality, and usability. Although significant effort has been devoted to XML query evaluation, little emphasis has been placed on optimizing queries that locate elements/attributes by matching their names/values, rather than their paths, against regular expressions. This paper presents a memory representation model, referred to as FulDex, for performing regular expression queries over XML documents. The proposed model includes two key features: (1) the indexing of all characters of the names/values of the elements/attributes within an XML document; and (2) an algorithm for performing regular expression queries in conjunction with a set of proposed rules for filtering out names/values that do not need to be matched against a query. Experimental results demonstrate that, in 95% of the test cases, the average query efficiency of FulDex is superior to that of seven other state-of-the-art memory representation-based XML parsers. Specifically, when dealing with a large XML document (1.58 GB), the average execution time of FulDex for a regular expression query is 80.41% less than that required by RapidXml, which has the best query performance among the existing tools.
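The two features named in the abstract can be illustrated with a toy sketch. This is not the FulDex data structure or its filtering rules, which the abstract does not specify. It is a minimal illustration, under the assumption that a per-character index is used to discard names/values that cannot match before the regular expression is run. The function names `build_char_index` and `regex_query`, and the caller-supplied `required_chars` argument, are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_char_index(xml_text):
    """Collect every element/attribute name and value, and map each
    character to the set of those strings that contain it."""
    root = ET.fromstring(xml_text)
    strings = set()
    for elem in root.iter():
        strings.add(elem.tag)
        if elem.text and elem.text.strip():
            strings.add(elem.text.strip())
        for name, value in elem.attrib.items():
            strings.add(name)
            strings.add(value)
    index = defaultdict(set)
    for s in strings:
        for ch in set(s):
            index[ch].add(s)
    return strings, index


def regex_query(pattern, required_chars, strings, index):
    """Filter by characters that any match must contain (assumed to be
    supplied by the caller), then run the regex on the survivors only."""
    candidates = set(strings)
    for ch in required_chars:
        candidates &= index.get(ch, set())
    rx = re.compile(pattern)
    return sorted(s for s in candidates if rx.fullmatch(s))
```

For a document containing the names `library`, `book`, and `bookmark`, calling `regex_query(r"book.*", "bok", strings, index)` would run the regular expression only on `book` and `bookmark`, since `library` lacks the characters `o` and `k`.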
Date of Award: 2015 Sept 8
Original language: English
Supervisor: Jung-Hsien Chiang
