Grouping according to equality of values
The second method of a grouping in XQuery is closer to the relational method to the extent that the group definition is determined by the equality of manifestations. For example, all persons living in Berlin form a group defined via equality of values which finds no direct reflection in a hierarchical XML structure. As a consequence, a grouping via equality of values is applied whenever entities shall be grouped together via properties which are not represented a priori by a hierarchical structure or if flat XML data are used. The latter case in particular is of significant interest because of the XML-like representation of relational database contents.
This grouping and aggregation methods are illustrated by means of the previous example showing the calculation of the average age of the medical personnel per occupational group based on the following XML fragment:
<MedicalPersonnel>
<Person><Occupation>Doctor</Occupation>
<Name>Naumann</Name>
<Age>32</Age> </Person>
<Person><Occupation>Doctor</Occupation>
<Name>Shore</Name>
<Age>27</Age> </Person>
<Person><Occupation>Doctor</Occupation>
<Name>Meier</Name>
<Age>25</Age> </Person>
<Person><Occupation>Nurse</Occupation>
<Name>Guldenstern</Name>
<Age>41</Age> </Person>
<Person><Occupation>Nurse</Occupation>
<Name>Murawitz</Name>
<Age>65</Age> </Person>
</MedicalPersonnel>
In this case, a representative is generated by duplicate elimination under usage of the fn:distinct-values() function which was already applied in the introduction of the FLWOR construct. For the above scenario this means that the iteration takes place via all explicitly defined different occupational groups or different occupations (one time each):
<MedicalPersonnel>
{
for $o in fn:distinct-values(fn:doc("...")//Occupation/text())
let $x := fn:doc("...")//Age[../Occupation/text() = $o]
return
element {$o}
{<Age>{ fn:avg($x) }</Age>}
}
</MedicalPersonnel>
In this example, the for clause realises an iteration for the various manifestations of the Occupation element, i.e. via Doctor and Nurse. In the second step all ages are bound to the $x variable which can be found in a personnel entry with the appropriate occupation. Then, the result is created by a computed element constructor whereby the average age is determined, as in the previous example, by calling the fn:avg() function.
In the above XML fragment the values used to determine the belonging to a group were represented as the textual form of an element. As a variant of this presentation exists, for example, also a group formation via an attribute value of an element or via the element identifier. Both possibilities are shown by means of an example:
Group formation via attribute manifestations
<MedicalPersonnel>
<Person Occupation=”Doctor”><Name>Naumann</Name>
<Age>32</Age></Person>
...
The occupation is indicated in the form of an attribute value ("Doctor") in the Person element. The for and let clauses change accordingly as follows:
...
for $o in fn:distinct-values(fn:doc("...")//Person/@Occupation)
let $x := fn:doc("...")//Age[../@Occupation = $o]
...
Group formation via element identifier
<MedicalPersonnel><Person><Doctor/><Name>Naumann</Name>
<Age>32</Age></Person>
...
In this case, the elements <Doctor/> and <Nurse/> indicate the belonging to the appropriate occupational group. Here, the query changes clearly when determining the different manifestations of the occupational groups since the duplicate elimination must occur, not on the elements, but on the textual identifiers which require for their determination a nested FLWOR expression:
...
for $o in fn:distinct-values(
for $i in fn:doc("...")//Person/(Doctor|Nurse)
return
fn:node-name($i))
let $x := fn:doc("...")//Age[../fn:node-name(.) = $o]
...
In conclusion, a grouping and an aggregation is realisable with the normal FLWOR mechanisms and no special syntax extension, as is the case, for example, in SQL, must exist. The group formation is implicitly predetermined by the hierarchical structure or explicitly performed – in analogy to the relational approach – based on values.
Source: "XQuery – Grundlagen und fortgeschrittene Methoden", dpunkt-Verlag, Heidelberg (2004)
<< back | next >> |