Editing principles

Under 'Editing principles' we describe how the data are extracted and which normalisation principles we apply to particular data types.

Data extraction

General remarks on the data

Spatial references

The Norse World data includes two types of spatial references: place names and non-names. These data are collected from vernacular East Norse (i.e. Old Swedish and Old Danish) contexts; see 'Material'.

Place names are names of topographical, physical, and cultural features and they constitute the largest group of the project's spatial referents. Place names in the East Norse corpus are identified contextually by means of close reading.

For example, France and Lake Ladoga are place names that are relatively easy to identify as such. The place name Mæret mentioned in Själens tröst (Consolation of the Soul) is homonymous with the common noun mær 'sea' in the definite form. In this case, it is the context of the source that justifies the identification of the spatial reference as a name.

The same goes for multiple place names that represent phrases containing spatial references of the type “the place where…”, e.g. Thæn øthken, i hvilik Kristus mættethe fæm thusend mæn (The desert where Jesus fed five thousand men). These types of phrases are frequent in Vejleder for Pilgrimme (Guide for Pilgrims), and also occur in e.g. Mandevilles Rejser (The Travels of Sir John Mandeville), Sydrak (The Book of Sidrach), and various sermons. These phrases do not constitute prototypical place names. However, we consider them to be a part of the project material, because they denote places that we assume were meaningful for the medieval audience in terms of Christian and encyclopaedic geographical knowledge. They have been excerpted when they explicitly refer to a significant location (real or fictional). Thus, phrases of the type “[He tethered the horses] in a stable” are not included, whereas “[They went] to the stable where Jesus was born” are.

Another problematic case is distinguishing between the choronym (country name) Israel and the homonymous personal name denoting a biblical patriarch. When the name does not directly indicate the person (formerly known as Jacob), we consider the name to refer to the nation or territory and include the relevant references in the database, cf. e.g. the Old Swedish attestations badh han innirlika israels gudh och æn israels folk hafdhe ther aff enga nødh.

In some cases the denotation of the place name is ambiguous. For example, the Old Swedish Babilonia is attested in a variety of East Norse sources denoting either the city of Babylon, the Neo-Babylonian Empire, the Babylon Fortress, or, in the case of Didrik of Bern, a fictional locality (perhaps a castle) by the Rhine. There are thus four standard forms and corresponding localities to account for these attestations in the database: Babylon (city), Babylonia (Neo-Babylonian Empire) (country), Babylon Fortress (castle), and Babylonia (castle). In the same way, the Old Swedish and Old Danish lemma form Rom can denote both the city of Rome and the Roman Empire. In attestations containing a monarchical title such as queen, king, emperor, and the like, or a church title such as bishop, the context in our opinion implies that the spatial reference is a choronym and denotes the area over which the person has jurisdiction (e.g. kingdom, empire, bishopric). In this way, the attestation keysaren aff rom is linked to the standard form Roman Empire (country), while the attestation room is linked to the standard form Rome (city). For more information and examples, see 'Data and related metadata: spatial references and spatial data'.
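The title-based rule for Rom can be illustrated with a minimal sketch. It is not the project's procedure, which rests on close reading of the context; the function name and the list of title stems are assumptions made for illustration only.

```python
# Hypothetical title stems (emperor, king, bishop); the project decides
# each case by close reading, not by string matching.
TITLE_STEMS = ("keysar", "konung", "biskop")


def standard_form_for_rom(attestation: str) -> str:
    """Pick a standard form for an attestation of the lemma form Rom."""
    words = attestation.lower().split()
    # A monarchical or church title suggests the area of jurisdiction.
    has_title = any(word.startswith(TITLE_STEMS) for word in words)
    return "Roman Empire (country)" if has_title else "Rome (city)"


print(standard_form_for_rom("keysaren aff rom"))
print(standard_form_for_rom("room"))
```

A real implementation would need a much fuller inventory of titles and spellings, and would still require editorial review of each attestation.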

Verona is referred to with different appellatives. It is called fæste and slot ’castle, stronghold’; borgh ’castle, fortified city’, and staþer ’place; city’. This affects the standard form. We have chosen to use the standard form Verona (city), except when it is explicitly called fæste or slot in its immediate textual vicinity, where the standard form instead is Verona (castle). This is, of course, not a perfect solution. These terms are not mutually exclusive in the way that the database demands. This is clearly shown in the quote from Didrik av Bern (Didrik of Bern): ok misthe myna godha borgh bærn / ok manga andræ sloth ok stædher ’and I lost my good borgh Verona, and many other slot and stæþer (pl. of staþer)’.

Some churches occur in the material only with the name of their patron saint(s). These instances are excerpted as attestations of spatial reference. This is particularly the case with the Cathedral of Santiago de Compostela in the material often referred to as Sancte Jacob: Han wille til sancte iepss fare.

Non-name is a collective term for spatial references that are not covered by the category place names. The category is heterogeneous and comprises adjectives, adverbs, coin designations, inhabitant designations, language designations, noun bynames, and origin designations. Examples of non-names include florin, Tavastian, Greek language, and Sunamitis (of Shunem). For more information and examples, see 'Data and related metadata: spatial references and spatial data'.

Inhabitant designations: problematisation

Inhabitant designations are nouns used to designate the inhabitants of a city, region, or country, e.g. Judean (Jew), Roman, Tavastian, and Wend. However, there are at least two potential inhabitant designations that the suggested definition does not always cover: Judean (Jew), cf. Old Swedish iuþe, iudhinna, or iudha folk 'Jew', and blaman (Ethiopian), cf. Old Swedish blaman 'Ethiopian' (in a few contexts only).

In the Norse World project, only references to Jews from the Old Testament are included, as Jews are usually regarded as a national group (Judeans) in those contexts. Jews in the New Testament are usually depicted as a religious “other” (in comparison to Christians). Therefore, we do not include such references in the Norse World database. Any other religious terms that include references to Jews, e.g. the Old Swedish compound iuþa biskoper, are not included either. We also do not excerpt references to Jews when the reference is used in a contemporaneous/medieval context. The fictional Red Jews, however, are excerpted since they represent a spatial reference: according to tradition, they inhabit a country on the other side of the fictional river Sambation.

In many East Norse contexts, the Old Swedish word blaman indicates a religious “other” (in comparison to Christians). Thus, we only include references to blaman in the database when the word is used to denote inhabitants of a region or a country, e.g. Ethiopians.


Spatial references (place names and non-names) are sorted into two groups regarding contextual identification: real or fictional. Place names and non-names are sorted under real by default if they do not match the requirements for fictional. Place names referring to biblical places and other places associated with Christianity such as churches or other places of Christian worship are classified as real, since we assume that the medieval (Christian) audience perceived the places as being real.

Place names and non-names are sorted under fictional when they are impossible to identify or if the actual context of the attestation contradicts the assumed identification of the spatial reference. For example, the place name Mundin sorted under fictional is impossible to identify. Another place name classified as fictional is Apolisborgh/Aplesburgh (Apolis Castle) from Floris and Blancheflour. There is a European tradition of identifying corresponding place names in other vernacular versions of the text as the fictional Spanish city Nople (also written Noples) (Grieve 1997:46–47). However, the context of the Old Swedish and the Old Danish attestations contradicts the assumed identification of the place name.
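The default rule described above (real unless one of the two conditions for fictional holds) can be written as a small decision function. This is only a sketch: the function name and the two boolean inputs are hypothetical, and both judgements are made by an editor, not computed.

```python
def contextual_identification(identifiable: bool, context_contradicts: bool) -> str:
    """Return 'fictional' if the reference cannot be identified or the
    context contradicts the assumed identification; otherwise 'real'."""
    if not identifiable or context_contradicts:
        return "fictional"
    return "real"  # default group


# Mundin: impossible to identify.
print(contextual_identification(identifiable=False, context_contradicts=False))
# Apolisborgh/Aplesburgh: the context contradicts the assumed identification.
print(contextual_identification(identifiable=True, context_contradicts=True))
```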


Each occurrence of a spatial reference (place name or non-name) is entered into the database with the reference indexed on four levels:

i) original form
ii) variant form
iii) lemma form (Old Swedish and/or Old Danish)
iv) standard form

Each spatial reference first occurs as it appears in the text of a source (original form) followed by two types of normalisation (variant form and lemma form) and a standard form that links Old Swedish and Old Danish lemma forms to a specific geographical location and its spatial data. In other words, these forms provide the spatial reference in source-near and source-abstract forms that will make different types of search possible. For example, two Old Danish original forms, til egyptoland and innen egypte landh, give two variant forms, Egyptoland and Egyptelandh respectively, but only one Old Danish lemma form, Egipteland. Furthermore, the lemma form Egipteland is linked to the standard form Egypt. Other lemma forms linked to the same standard form include Old Swedish Egyptaland and Egyptus. For more information, see 'Data and related metadata: attestations'.
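As an illustration of the four levels, the Egypt example can be modelled as a small record type. The class, field, and function names below are assumptions for this sketch and do not reflect the actual database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """One occurrence of a spatial reference, indexed on four levels."""
    original_form: str   # as it appears in the source
    variant_form: str    # slightly normalised, nominative
    lemma_form: str      # normalised Old Swedish or Old Danish form
    language: str
    standard_form: str   # links to a location and its spatial data


ATTESTATIONS = [
    Attestation("til egyptoland", "Egyptoland", "Egipteland", "Old Danish", "Egypt"),
    Attestation("innen egypte landh", "Egyptelandh", "Egipteland", "Old Danish", "Egypt"),
]


def variants_by_lemma(attestations):
    """Group variant forms under their lemma form."""
    index = {}
    for att in attestations:
        index.setdefault(att.lemma_form, set()).add(att.variant_form)
    return index


print(variants_by_lemma(ATTESTATIONS))
```

Grouping in this direction shows why a search on the lemma form Egipteland can return both spellings, while a search on a variant form returns only the matching one.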

Data extraction

The original form is an attestation of a spatial reference (place name or non-name) as it appears in the text of the source and, when relevant, transcribed at the diplomatic level with abbreviations expanded in italics. The spatial reference in the original form is marked in bold, e.g. til babiloniam 'to Babylon'. If the editor omits a part of the source text in the original form, the omission is marked with (...). Editorial emendations in original forms appear in square brackets, []. If an attestation is damaged and part of the text is missing in some way, these missing letters are supplied in the lemma form, but not in the variant form (in order to show the variation in spelling). In this way, an attestation of a spatial reference compromised by water damage, for example the reading gariam for the assumed [vn]gariam in Stockholm, National Library of Sweden, D 4, fol. 260v is entered into the database as the original form [vn]gariam following the editor’s practice. The variant form of the spatial reference is Gariam, while the lemma form is normalised as Ungaria. Line breaks in verse and, when relevant/available, in prose are marked with |. Page breaks in the source are, when available, rendered as || (columns are not marked).
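The markup conventions for original forms can be read mechanically. The sketch below assumes the conventions exactly as listed above ((...), [], | and ||); the function names are hypothetical. Note that removing bracketed text recovers the legible reading only when the brackets supply letters lost to damage, as in [vn]gariam, and not for other kinds of emendation.

```python
import re


def describe_markup(original_form: str) -> dict:
    """Summarise the editorial markup found in an original form."""
    return {
        "emendations": re.findall(r"\[([^\]]*)\]", original_form),
        "has_omission": "(...)" in original_form,
        "page_breaks": original_form.count("||"),
        # Count single bars only, after removing the double bars.
        "line_breaks": original_form.replace("||", "").count("|"),
    }


def surviving_text(original_form: str) -> str:
    """Drop bracketed letters, leaving what is legible in the source."""
    return re.sub(r"\[[^\]]*\]", "", original_form)


print(describe_markup("[vn]gariam"))
print(surviving_text("[vn]gariam"))
```

For [vn]gariam, the surviving text gariam is the basis for the variant form Gariam, while the supplied letters are taken into account only in the lemma form Ungaria.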

The original forms are excerpted from editions and manuscripts, see 'Theoretical considerations'. If the detailed view of an original form includes information on the edition used, the original form is taken from that edition. If proper names are normalised in the edition of a text, e.g. by capitalising the initial letter, we follow the edition and provide a note regarding this editorial choice in the relevant work entry.

The original form can include textual contexts of varying length. The following textual contexts are represented in the collected material:

  • none, since the spatial reference is attested in the nominative, e.g. [parad]iis.
  • prepositional phrases that the spatial reference is a part of, e.g. til anthiochiam.
  • nominal phrases (including personal name phrases) that the spatial reference is a part of, e.g. konungen aff babilon, amicus aff bericano.
  • verb phrases that the spatial reference is a part of, in those cases where the context is needed to establish the case declension of the spatial reference, e.g. babiloniam sagh iak alregh.
  • explicit references to the denotatum/referent of the spatial reference, e.g. vnder ena øø ther kallas rodum.
  • in rhymed sources, the line the spatial reference is a part of, e.g. swa at marger rytz saa sik rødhan swet.
  • any other context needed to understand the spatial reference, e.g. Oc førde them saa tijl sleswig opp | i sanecti pædhers kircke the thet lade.


The main principle of normalisation

The main principle of normalisation for Old Swedish is to follow the conventions used in Söderwalls Ordbok öfver svenska medeltids-språket (1884–1973) with one minor adjustment, i.e. we replace ä and ö with æ and ø respectively. For Old Danish, the general normalisation principle is to follow the method for normalisation devised by Kaj Bom in the Old Danish lemma list in 1954 and used ever since by Gammeldansk Ordbog.

Normalisation of variant forms

The variant form is a slightly normalised form of the spatial reference (place name or non-name) based on one or more similar original forms that occur in one or more sources of one or more works. The variant form contains none of the textual context provided for the original form. For more information, see 'Data and related metadata: attestations'.

The normalisation of the variant form varies slightly depending on the type of spatial reference: place name or non-name. For both place names and non-names, variant forms retain their original spelling and declension according to definiteness and (grammatical) number. However, variant forms are always provided in the nominative irrespective of their case declension in the original form.

For place names, the initial letter of the place-name variant is capitalised. Compound place names are written as one word even if the original form suggests other spellings. For example, the Old Swedish variant forms for the standard form Egypt (Ancient Egypt) include:

  • Egiptiland
  • Egiptilandh
  • Egiptoland
  • Egiptolandh
  • Egiptus
  • Egyptolan
  • Egyptoland
  • Egyptus.

There are six Old Danish variant forms for the standard form Judea in the database:  

  • Iødelandh
  • Iødhælandh
  • Iudhaland
  • Iudhalandit
  • Judhaland
  • Jvdhaland.

For non-names, variant forms are written in lowercase letters and the spelling of compounds varies in accordance with authoritative dictionaries, see 'Normalisation of lemma forms' below. For example, the Old Swedish variant forms for the inhabitant designation Russian include:

  • ridza;
  • ridzo;
  • rydzane;
  • rydzenæ;
  • rydzer;
  • rydzernæ;
  • rydzerna;
  • rytz;
  • rytza;
  • rytza færd;
  • rytza konungen;
  • rytza konunger;
  • rytza konunghe;
  • rytzane;
  • rytzanne;
  • rytze;
  • ryza;
  • ryzsa;
  • ryzsor.

There are four Old Danish variant forms for the standard form Gum arabic in the database:

  • arabisqwadhe;
  • arabsquade;
  • arabsquadhæ;
  • arabsquadhe.

To sum up, different spellings and declension according to definiteness and (grammatical) number give new variant forms.
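The purely orthographic part of variant-form normalisation can be sketched as follows. The function names are hypothetical, and the input is assumed to be the spatial reference alone, already converted to the nominative by an editor; case conversion and the dictionary-based spelling of non-name compounds are not automated here.

```python
def place_name_variant(nominative_form: str) -> str:
    """Write a compound place name as one word and capitalise it."""
    joined = "".join(nominative_form.split())
    # Capitalise only the first letter; the original spelling is retained.
    return joined[:1].upper() + joined[1:]


def non_name_variant(nominative_form: str) -> str:
    """Non-names are written in lowercase letters."""
    return nominative_form.lower()


print(place_name_variant("egypte landh"))
print(place_name_variant("egyptoland"))
print(non_name_variant("Rytza"))
```

Because spelling is otherwise left untouched, two original forms that differ by a single letter still yield two distinct variant forms.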

Normalisation of lemma forms

The lemma form is a normalised form of the spatial reference (place name or non-name) that is constructed on the basis of the collected variant forms and original forms, as well as attestations of the word found in other sources not covered by the project.

The normalisation of the lemma form varies depending on the type of spatial reference: place name or non-name. For place names, we follow the principle that a new place-name formation gives a new lemma form. Thus, we retain declension according to definiteness and/or (grammatical) number because such declension implies a new name formation. Lemma forms are always provided in the nominative case and their spelling is normalised according to the principles outlined below.

Orthographic and other variation that we interpret as reflecting a re-interpretation of a place name and thus new name formation results in a new lemma form. Isolated forms that we consider simple scribal errors are not given separate lemma forms. Thus, we use the Old Swedish lemma form Holtbeke for Cölbigk reflecting the original form holt beke, but we retain the Old Danish lemma form Israel even for the original form yrael. Furthermore, orthographic forms used in several manuscripts (or for instance in a foreign source text) are treated as lemma forms. The form Sela (and Cela) in the Consolation of the Soul refers to one of the five cities mentioned in Genesis 14: Bela (Zoar). The change of the initial letter is easily explained as a scribal error; however, since the Old Swedish (Cela) and Old Danish (Sela) versions agree, and since the Middle Low German version of the text has Sela, the form Sela is considered a lemma form. Another example is provided by the spelling variant Oxlo denoting the city of Oslo. The form Oxlo constitutes a separate lemma because it likely originates in the form Opslo, a frequent spelling of the name Oslo during the Middle Ages. The ⟨x⟩ in Oxlo is assumed to go back to ⟨p⟩ in Opslo, cf. the spelling Oxslo and the like in medieval charters.

These principles apply to both real and fictional place names; in the latter cases, however, the boundary between scribal errors and name formation is often blurred, cf. Old Swedish lemma forms Mortarie, Montarie, Mactoire, Mintarie to denote the fictional castle of Montarie in the Old Swedish version of Floris and Blancheflour. In the French version of the work, the spatial reference might be a reference to the city Montoro in Andalusia, Spain (see Grieve 1997:48), but as Spain is in no way referenced in the Swedish and Danish versions, this identification of Montarie has not been used.

Verona is a complex referent and is attested in several different forms in Old Swedish. Bern is the most common and is thus used here as a lemma form. Also Berne and Berna are frequent. They are brought together under the lemma Berna, as unstressed a and e often alternate in (later) Old Swedish. The one exception is when Berne occurs in a plausible dative position, where instead the lemma Bern is chosen and the final -e is interpreted as a dative ending. Bernen and Berner are two other attested word formations – the former with the definite suffix -en, and the latter with the derivational suffix -er – and are, consequently, two additional lemma forms. Also the root vowels of ’Verona’ vary, more particularly between e, æ, and a. The e-forms are the most common and are thus used in all of the lemma forms; the attested a-form, Barna, is merely sporadic.
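The lemma choices for Verona described above can be summarised in a short sketch. The function name is hypothetical, the spelling Bærn is used as an example of an æ-form, and the decision whether Berne stands in a plausible dative position is assumed to be supplied by the editor as a boolean.

```python
def verona_lemma(variant: str, dative_position: bool = False) -> str:
    """Map an Old Swedish variant form of 'Verona' to a lemma form."""
    form = variant.lower()
    # The root vowels æ and a are normalised to the more common e.
    if form[:2] in ("bæ", "ba"):
        form = "be" + form[2:]
    # Berne joins Berna, unless the final -e is read as a dative ending.
    if form == "berne":
        form = "bern" if dative_position else "berna"
    # Bernen and Berner are separate formations and are kept as they are.
    return form.capitalize()


for variant in ("Bern", "Berne", "Berna", "Bernen", "Berner", "Barna", "Bærn"):
    print(variant, "->", verona_lemma(variant))
print("Berne (dative) ->", verona_lemma("Berne", dative_position=True))
```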

It is important to create normalised lemma forms in Old Swedish and Old Danish for each place name, so that all the occurrences of the name with different orthographic or morphological forms can be linked to appropriate lemma forms in either language. For Old Danish we follow the principles for normalisation devised by Kaj Bom in the Old Danish lemma list in 1954 and used ever since by Gammeldansk Ordbog. This list includes numerous place names and spellings that are particularly difficult to decipher. For Old Swedish we follow the method of normalisation used in Söderwalls Ordbok öfver svenska medeltids-språket (1884–1973). However, as neither of these dictionaries includes propria, we are required to coin normalised lemmata for place names. Söderwall’s normalisation of Old Swedish words, for instance, biærgh, staþer, garþer etc., is preserved when the words occur in place names, for instance, Garþrike, Stiærnabiærgh; the same practice is applied to the Old Danish material normalised in accordance with Gammeldansk Ordbog, for instance, Lybækestath. Furthermore, Gammeldansk Ordbog does have a small citation-slip collection of some place names (including foreign places), which has been of great help.

For example, the Old Swedish lemma forms for the standard form Egypt (Ancient Egypt) include:

  • Egyptaland;
  • Egyptus.

There are two Old Danish lemma forms for the standard form Judea in the database:

  • Jutheland;
  • Juthelandet.

For non-names, we take lemma forms from authoritative dictionaries, Söderwalls Ordbok öfver svenska medeltids-språket (1884–1973) for Old Swedish and Gammeldansk Ordbog for Old Danish. Non-names are thus provided in the singular, nominative, indefinite form.

For example, the Old Swedish lemma forms for the inhabitant designation Russian include:

  • ryz;
  • ryza fiærdh;
  • ryza konunger.

There is one Old Danish lemma form for the standard form Gum arabic in the database:

  • arabskvathe.

Other principles of normalisation

  • The most common abbreviations in Old Swedish are normalised in accordance with the Svenskt Diplomatarium's template as presented in Svenskt Diplomatarium 11:1 (2006: VI), e.g. thz is normalised as thet and mz as medh.
  • The Old Swedish suffixed definite article is normalised as -et.
  • In Old Swedish genitive compound lemma forms, the genitive plural ending of the first element is normalised according to authoritative grammars of Old Swedish, usually as -a.
  • We follow Söderwall when normalising the assumed voiced labiodental fricative between two vowels in Old Swedish lemmata as v, cf. hava. Consonantal w is normalised as v, e.g. the original form Østerhaffwet is normalised as Østerhavet, and vocalic v and w are normalised as u, e.g. the original forms Vnghern, wthenlandz are normalised as Ungern and utlændis respectively.
  • In Old Danish the semivowel j is rendered as j in accordance with Gammeldansk Ordbog: juthe, Jerusalem, and Jordan. In Old Swedish the semivowel is rendered as i in accordance with Söderwall: iuþe, Iordanes, or as ih in the case of Iherusalem and Iherico.
  • In some cases, y in both Old Swedish and Old Danish is normalised as i, e.g. Libanus, Tiberis.
  • When there is code-switching with Latin, we use the Medieval Latin form of the name (if there is such an established form). Classical Latin spellings (e.g. with ae) are not used, cf. Egyptus, rather than Aegyptus; Cesarea, rather than Caesarea. If the foreign form has been mediated through French (or German), then the form in the mediating language is taken into consideration; for example, Lombardi (< OFr. Lombardie < MedLat. Lombardia).
  • When lemmatising church names containing the word Saint, we use the vernacular Sankte when the saint's name is declined according to the vernacular rules, e.g. Sankte Iakobs land. If the Latin declension of the saint's name is used, we take the Latin form Sancti or Sancte, e.g. Sancti Petri.
  • When normalising names of Latin provenance, the use of c and ch has been retained. Thus, Luca, rather than Luka; Antiochia, rather than Antiokia; Macedonia, rather than Makedonia. Similarly, qu is retained in names such as Aquitania, rather than Akvitania. If the name has been adapted to the vernacular in any way, then the vernacular spelling is retained; for example: Old Danish Secananfloth, rather than Sequananfloth.
  • We keep þ, x and z in Old Swedish lemma forms in accordance with Söderwall, e.g. iuþe, Saxen, ryz and Franz.
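A few of the rules in the list above are regular enough to sketch in code. The function names are hypothetical and each function covers a single rule in simplified form: the test for vocalic v and w (word-initial and followed by a consonant) is a heuristic assumed for this sketch, and the output still needs the remaining normalisation steps (e.g. Unghern is not yet the lemma form Ungern).

```python
# Abbreviations named in the list above; the full template is not reproduced.
ABBREVIATIONS = {"thz": "thet", "mz": "medh"}
VOWELS = set("aeiouyæøåäö")


def expand_abbreviations(text: str) -> str:
    """Expand the common Old Swedish abbreviations thz and mz."""
    return " ".join(ABBREVIATIONS.get(token, token) for token in text.split())


def normalise_initial_u(word: str) -> str:
    """Render a word-initial vocalic v or w as u (heuristic)."""
    if len(word) > 1 and word[0].lower() in "vw" and word[1].lower() not in VOWELS:
        return ("U" if word[0].isupper() else "u") + word[1:]
    return word


def medieval_latin_spelling(name: str) -> str:
    """Replace the Classical Latin digraph ae with Medieval Latin e."""
    return name.replace("Ae", "E").replace("ae", "e")


print(expand_abbreviations("mz thz"))
print(normalise_initial_u("Vnghern"), normalise_initial_u("wthenlandz"))
print(medieval_latin_spelling("Aegyptus"), medieval_latin_spelling("Caesarea"))
```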
Last modified: 2022-01-21