186 Informatica e diritto /Proceedings of the Workshop LOAIT 2010
other sources of law, using a parser based on patterns for references¹. This
structure and reference information is stored in CEN/MetaLex XML².
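As a minimal sketch of what pattern-based reference detection can look like, the following Python fragment matches one simple reference shape. The pattern, the English wording ("article", "paragraph") and the function name `find_references` are assumptions made for illustration; the actual parser works on Dutch text with a richer pattern set, and the storage of the results in CEN/MetaLex XML is not shown.

```python
import re

# Hypothetical pattern: matches "article 12" or "article 7a, paragraph 2".
# The real parser handles Dutch text and many more reference forms.
REFERENCE = re.compile(
    r"\barticle\s+(?P<article>\d+[a-z]?)"
    r"(?:,\s*paragraph\s+(?P<paragraph>\d+))?",
    re.IGNORECASE,
)

def find_references(text):
    """Return one dict per reference found, in order of appearance."""
    return [
        {
            "article": match.group("article"),
            "paragraph": match.group("paragraph"),  # None if no paragraph is given
            "span": match.span(),  # character offsets of the reference in the text
        }
        for match in REFERENCE.finditer(text)
    ]
```

Keeping the character span alongside the parsed parts makes it possible to annotate the original text in place at a later stage.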
Fig. 1 – Steps in automatic modelling of legal texts
The next step is to create a model for each individual statement in the
text. In most cases, each sentence in Dutch law forms a complete statement
(though possibly as part of a larger construct), so we are, in effect, creating a
model for each sentence in the text. In the last step, these individual models
are integrated with each other to form a complete model. To create
the models, we start by classifying each sentence in the text as a specific
type of provision, such as a definition, a duty, or a modification of an earlier law. In
total, we recognise ten different main categories. As with the references, this
is done by automatic recognition of certain patterns in the text³.
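The classification step described above can be sketched as a small pattern-based classifier. This is an illustration under stated assumptions, not the classifier used in this work: the English patterns and the names `PATTERNS`, `classify_sentence` and `classify_text` are hypothetical, and only three of the ten main categories (definition, duty, modification) are shown.

```python
import re

# Hypothetical English patterns for three of the ten main categories.
# The actual classifier uses patterns for Dutch legal language.
PATTERNS = [
    ("definition", re.compile(r"\bis understood to mean\b|\bmeans\b", re.IGNORECASE)),
    ("modification", re.compile(r"\bis replaced by\b|\bis repealed\b", re.IGNORECASE)),
    ("duty", re.compile(r"\bshall\b|\bmust\b|\bis obliged to\b", re.IGNORECASE)),
]

def classify_sentence(sentence):
    """Return the first category whose pattern matches, or "unknown"."""
    for category, pattern in PATTERNS:
        if pattern.search(sentence):
            return category
    return "unknown"

def classify_text(sentences):
    """Pair every sentence with its category, preserving the input order."""
    return [(sentence, classify_sentence(sentence)) for sentence in sentences]
```

Because the first matching pattern wins, the order of `PATTERNS` acts as a priority list; a sentence that fits several categories is assigned to the one listed first.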
For several types of sentences, these patterns, together with some added
features, are sufficient to extract all information needed to create a model of
the sentence. This is usually the case with sentences that are about the law
itself rather than the subject matter of the law. These sentences are discussed
in Section 2. Other sentences, such as obligations, do focus on the subject
matter and can vary widely. Simple patterns will not suffice to deal with
these sentences, and to extract information from these types of sentences, we
¹ E. DE MAAT, R. WINKELS, T. VAN ENGERS, Automated Detection of Reference Structures
in Law, in Engers T.M. van (ed.), “Legal Knowledge and Information Systems, Proceedings
of the Jurix 2006 Nineteenth Annual Conference”, Amsterdam, IOS Press, 2006, pp.
³ E. DE MAAT, R. WINKELS, Automatic Classification of Sentences in Dutch Laws, in
Francesconi E., Sartor G., Tiscornia D. (eds.), “Legal Knowledge and Information Systems,
Proceedings of the Jurix 2008 Twenty-First Annual Conference”, Amsterdam, IOS Press,
2008, pp. 207-216.