All About the Responsa Retrieval Project You Always Wanted to Know but Were Afraid to Ask

Author: Aviezri S. Fraenkel
Position: Professor at the Weizmann Institute of Science, Department of Applied Mathematics, Rehovot, Israel

Expanded Summary. References.

    Invited address delivered at the Third Symposium on Legal Data Processing in Europe, Oslo, July 1975. Work supported, in part, by a grant from the National Endowment for the Humanities to Bar-Ilan University. The work with U.S. Patents was supported, in part, by a grant from the National Bureau of Standards.

Aviezri S. Fraenkel¹

Expanded Summary

The Responsa literature² - written mainly in Hebrew and Aramaic, but containing vernaculars like Arabic, Ladino, Persian, German and Yiddish - is a collection of over half a million queries and answers spanning 17 centuries, generated by several thousand authors and comprising about 4 × 10^9 characters in some 3,000 volumes. The queries were, and still are today, posed by individuals from all over the globe to the outstanding Jewish authorities in each generation. The decisions rendered became valuable precedents for Jewish law, and many of them were subsequently incorporated into the codes.

The bulk of the problems dealt with are of a practical nature, reflecting real-life situations. Thus the historical-sociological milieu is depicted in concrete episodes, in a true-life fashion not normally available from any other source. Therefore the Responsa constitute an enormous store of material of interest to scholars in many areas, such as law, history, economics, philosophy, religion, sociology, linguistics, musicology, folklore, etc. Personages, realia, geographic sites, saints and scholars, wars and kings, together with those minutiae which bring the life of a community into sharp focus - birth, marriage and death customs, recipes, taxes, medical practices - all these provide the scholar with a unique opportunity to recapture Jewish life with an accuracy often impossible from other sources.

The legal materials contained in the Responsa reflect in revealing detail the operations of the oldest applied legal tradition in the western world. This system and philosophy of law have been applied under the most varied conditions and in diverse countries, ranging from predominantly agricultural and rural societies to urban commercial and industrial states. They reflect the problems engendered over many centuries by changing economic, social and other conditions.

Part of this material is in manuscript, and the printed texts are scattered throughout the world, many unavailable even to the professional scholar. Moreover, there is no global index or information system encompassing any significant part of this literature. The Responsa Retrieval Project was established to make a very fundamental contribution towards solving this problem.

The unique problems of the database - mixture of languages, lack of vowels and punctuation, extreme language inflection properties, abundance of homographs, existence of thousands of grammatical variants of any given term - dictated the development of new methods. Among them we list «grammatical synthesis», which synthesizes all grammatical variants of a given keyword; «grammatical analysis», which analyses a word and associates with it its list of «standard dictionary entries»; «Compact KWIC», which gives the user a glimpse of the nature of the search before performing it; «Conditional KWIC», which prints out only those KWIC lines which contain specified combinations of keywords in relevant documents; an effective citation index embedded in full-text searches, which makes it possible to exploit the extensive reference and citation system interwoven in the Responsa literature; and local dynamic clustering and feedback methods, which adapt to the shape of the individual search and improve its results. A short discussion of some of these methods is given in the sequel. Fuller details are contained in [1], [2], [3], [4], [5].
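To make the idea of a conditional KWIC (keyword-in-context) listing more concrete, the following is a minimal sketch in Python. It is not the project's implementation: the function names, the toy English documents, the fixed context width, and the rule that a line qualifies when all required words fall inside its context window are assumptions made purely for illustration. Real Responsa text would also need the grammatical analysis described above, which this sketch omits.

```python
import re


def tokenize(text):
    # Lowercased word tokens; a stand-in for real Hebrew-aware processing.
    return re.findall(r"\w+", text.lower())


def kwic(docs, keyword, width=3):
    """Return (doc_id, window) pairs, one per occurrence of `keyword`.

    `window` is the keyword plus up to `width` words on each side.
    """
    keyword = keyword.lower()
    lines = []
    for doc_id, text in docs.items():
        words = tokenize(text)
        for i, word in enumerate(words):
            if word == keyword:
                window = words[max(0, i - width): i + width + 1]
                lines.append((doc_id, window))
    return lines


def conditional_kwic(docs, keyword, required, width=3):
    """Keep only KWIC lines whose window contains every word in `required`.

    Assumption: the "specified combination" is a plain AND of words.
    """
    required = {r.lower() for r in required}
    return [(doc_id, window)
            for doc_id, window in kwic(docs, keyword, width)
            if required <= set(window)]


# Hypothetical toy documents, not taken from the Responsa database.
docs = {
    "r1": "The widow claimed the house as payment of her marriage contract",
    "r2": "The partners sold the house and divided the profit",
    "r3": "Taxes on the house were paid by the widow",
}

for doc_id, window in conditional_kwic(docs, "house", ["widow"]):
    print(doc_id, " ".join(window))
```

With a context width of 3, only the line from `r1` is printed, because in `r3` the word "widow" lies outside the window; widening the context to 5 words admits `r3` as well. The plain `kwic` function returns all three lines, which shows how the condition trims the output.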

Although the problems posed by a Hebrew database are much more severe than those posed by databases in European languages, the basic problems themselves are by and large present also in the latter case. It is therefore not surprising that most of the solutions arrived at for the former case are applicable also to the latter case.

As of the summer of 1975, the database contained 70 volumes of books, comprising some 18,000 Responsa, 13,500,000 words (about 75,000,000 characters) made up of 309,000 distinct terms, written in Spain, Germany, Algiers, France, Austria, Poland, Israel, Italy, Russia, Galicia, Switzerland, Egypt, Turkey, Hungary, Lithuania, Syria, Czechoslovakia, Holland, Ireland and Greece. In contrast to other projects, common words are not removed,...
