Focus on...

The DSL (Dictionary of the Scots Language) upgrade project

Ann Ferguson

This is the first in an occasional series of articles focusing on a particular area of SLD's work, to give readers a glimpse of some of the things we do on a day-to-day basis. This one features some aspects of the ongoing project to upgrade the online Dictionary of the Scots Language (DSL) at www.dsl.ac.uk.

We have already implemented many enhancements on our test system which will make the redesigned dictionary more efficient and easier to use once it goes live. The search facility now includes a 'predictive' search, an option for wildcard characters, and the ability to search for parts of words, which can be useful for finding compounds. As for cross-references, the vast majority (some 87,000) now link automatically to the correct entry. The only cross-references still to be implemented are those from the two SND (Scottish National Dictionary) supplements.
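For readers curious about what wildcard and part-word searching involve, here is a minimal sketch in Python. It is an illustration only, not the DSL's own search code: the headword list is a small made-up sample, the function names are hypothetical, and the use of `*` as the wildcard character is an assumption, since the wildcard syntax is not specified here.

```python
from fnmatch import fnmatchcase

# Hypothetical sample of headwords, for illustration only.
HEADWORDS = ["hame", "hamely", "kirk", "kirkyard", "kirkman"]


def wildcard_search(pattern):
    """Return headwords matching a pattern in which '*' stands for any run of letters."""
    return [word for word in HEADWORDS if fnmatchcase(word, pattern)]


def part_search(fragment):
    """Return headwords containing the fragment anywhere, e.g. as part of a compound."""
    return [word for word in HEADWORDS if fragment in word]


if __name__ == "__main__":
    print(wildcard_search("kirk*"))
    print(part_search("yard"))
```

In this sketch a pattern such as `kirk*` picks up the simple word together with its compounds, while a part-word search for `yard` finds compounds in which that element comes second.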

Quotation matching

In DOST (A Dictionary of the Older Scottish Tongue) alone, there are nearly 583,000 quotations. These illustrate the definitions by showing the relevant word in context. Each quotation has an abbreviated reference to the text from which it is quoted. However, in the current version of DSL the only way the online user can access the full details of a text is to carry out a separate search for it by choosing the 'Search Bibliographies' option and typing in all or part of the abbreviated reference. In our test version we have implemented an automatic link from the abbreviated text reference for each quotation to its entry in the DOST Revised Register of Titles (the DOST bibliography), so that all the user has to do is click on the link in the entry, and they will see full details of the relevant text.

Because of the way the information was presented in the printed dictionaries and the inevitable inconsistencies that arise in a multi-volume work which took decades to complete, there wasn't always a one-to-one relationship between the abbreviated form in the entry and the listing in the bibliography. To take one relatively simple example: there is a bibliography entry for a text titled Household Books of James Sharp, Archbishop of St Andrews, 1663-6. The abbreviation given for this text in the bibliography is Household Bks. Archb. Sharp. Of the 156 quotations from it that appear in DOST, 138 have the same abbreviated form as appears in the bibliography, but the remaining 18 quotations are variously referenced:

Household Bks. Abp. Sharp
Household Bks. Arch. Sharp
Household Bks. Archbp. Sharp
Household Bk. Archb. Sharp
Archb. Sharps Househ. Bk.
Housekold Bks. Arckb. Skarp [sic]

To the human eye it is fairly obvious that these refer to the same thing, but such differences create problems for an automated match. These and thousands of others therefore had to be individually checked and manually matched up to the correct item in the bibliography.
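To show why exact matching fails on such variants, and how approximate matching might at least suggest candidates for a human to confirm, here is a minimal sketch using Python's standard `difflib` module. It is not the method used in the upgrade project: the `normalise` and `suggest` helpers and the 0.8 similarity cutoff are assumptions made for illustration.

```python
import difflib
import re

# The bibliography abbreviation for the Sharp household books.
BIBLIOGRAPHY = ["Household Bks. Archb. Sharp"]


def normalise(ref):
    """Lower-case a reference and strip punctuation so trivial differences vanish."""
    return re.sub(r"[^a-z0-9 ]", "", ref.lower()).strip()


def suggest(ref, cutoff=0.8):
    """Return the closest bibliography abbreviation, or None if nothing is close enough.

    Hypothetical helper; the cutoff of 0.8 is an arbitrary assumption.
    """
    by_normalised = {normalise(item): item for item in BIBLIOGRAPHY}
    hits = difflib.get_close_matches(
        normalise(ref), list(by_normalised), n=1, cutoff=cutoff
    )
    return by_normalised[hits[0]] if hits else None


if __name__ == "__main__":
    for variant in ["Household Bks. Abp. Sharp", "Archb. Sharps Househ. Bk."]:
        exact = variant in BIBLIOGRAPHY
        print(variant, "| exact match:", exact, "| suggestion:", suggest(variant))
```

Neither variant is an exact match. In this sketch the first is close enough to be suggested, but the second, where the word order is reversed, falls below the cutoff and returns nothing, so it would still need a human check.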

There are other cases where the reference given for a quotation is so different from the listed bibliography item that merely reading it is not enough – texts have to be identified, books checked, perhaps different editions compared. One example of such a reference is 'Battle of Balrinnes', which appears in three dictionary entries. There is no mention of this title, or any version of it, in the bibliography. Some digging reveals that it is the name of a poem in the collection Scottish Poems of the Sixteenth Century, edited by J.G. Dalyell, so that is the bibliography item to which the quotations have to be linked. Furthermore, there are inevitably quotations from texts which are not listed in the bibliography at all. Sometimes these can be identified, but sometimes it is more difficult to determine exactly which text the quotation has been taken from.

We have now matched up the vast majority of the DOST quotations to their bibliography items: a task which involved many, many hours of painstaking, often tedious, but always satisfying work. Inevitably such a task generates various unforeseen subsidiary tasks and exercises, all of which have to be prioritised and fitted into the overall upgrade project.

So, if there's one thing we don't worry about, it's running out of things to do. One of the next major undertakings is to tackle the equivalent quotation matching exercise for the Scottish National Dictionary (SND). As if the DOST exercise wasn't complex enough, the structure of the SND bibliography lends itself even less easily to automatic matching. A very large checking exercise looms on the horizon.

Want to be a test user?

If you would like to try our test versions of DOST and SND, please let us know and we can send you access details. The interface is minimal at the moment as the design of the new website is in its early stages, but we can provide you with some basic guidelines that explain what you can do. All we ask in return is that you tell us what you think of it!