Corpus

Text Types in the Cor­pus sub­sys­tem

The Cor­pus sub­sys­tem con­tains texts of the fol­low­ing types.

  • Type [T]. Poetic texts which are trans­la­tions into Russ­ian of (usu­ally poetic) orig­i­nals writ­ten in French, Ital­ian, Span­ish or Por­tuguese (other Romance lan­guages are beyond the scope of the pro­ject at the first stage). By trans­la­tions we mean, among other things, works that some­how repro­duce a non-Russ­ian-lan­guage orig­i­nal, yet do not nec­es­sar­ily meet todayʼs cri­te­ria of fidelity to the source (free trans­la­tions, imi­ta­tions, etc.).

We will be con­cerned with texts that have served as mod­els for Type T (Russ­ian-lan­guage) texts. Three types of such mod­els are dis­tin­guished:

  • Type [O]. Non-Russ­ian-lan­guage orig­i­nals of Russ­ian-lan­guage texts (Type T).

  • Type [I]. Inter­me­di­ary trans­la­tions, either poetic or pro­saic (if the text was not trans­lated directly from the orig­i­nal; cf. trans­la­tions from Ital­ian, etc. through French; a Russ­ian trans­la­tion may be an inter­me­di­ary for another Russ­ian trans­la­tion).

    Between [T] and [O] there can be more than one [I]. Intermediaries can be included in the [O]–[I1 ... In]–[T] chain both “concurrently” (the author of the translation used several intermediaries at once) and “serially” (an intermediary translation is itself based on an earlier intermediary translation).

  • Type [S]. “Proto-sources”, i.e. non-Romance sources of Romance originals (if the originals are, in their turn, imitations or translations). These could be, for example, Ancient Greek or Roman texts (many of La Fontaineʼs fables are adaptations of Aesopʼs fables).
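
The text types and the [O]–[I1 ... In]–[T] chain described above can be sketched as a small data model. This is only an illustration under stated assumptions: the names `TextType`, `Work`, `models` and `chains` are hypothetical and are not part of the Corpus subsystem.

```python
from dataclasses import dataclass, field
from enum import Enum


class TextType(Enum):
    T = "translation into Russian"
    O = "original"
    I = "intermediary translation"
    S = "proto-source"


@dataclass
class Work:
    # Hypothetical record: a title, a text type, and the direct models.
    title: str
    text_type: TextType
    # Several entries here mean the models were used "concurrently";
    # a model that has models of its own forms a "serial" chain.
    models: list["Work"] = field(default_factory=list)


def chains(work: Work) -> list[list[Work]]:
    """Return every dependency chain from a root model down to `work`."""
    if not work.models:
        return [[work]]
    return [chain + [work] for model in work.models for chain in chains(model)]


# Placeholder titles, not real corpus entries.
original = Work("original", TextType.O)
first = Work("intermediary 1", TextType.I, [original])
second = Work("intermediary 2", TextType.I, [first])    # serial
third = Work("intermediary 3", TextType.I, [original])  # concurrent with 2
poem = Work("Russian translation", TextType.T, [second, third])

for chain in chains(poem):
    print(" - ".join(w.text_type.name for w in chain))
```

Storing only the direct models of each work keeps both inclusion modes in one structure: a concurrent use is a list with several entries, and a serial use is a model that has models of its own.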

At the sub­se­quent stages of the pro­ject, in addi­tion to trans­la­tions, i.e. works that repro­duce the con­tent of a non-Russ­ian model and (in many cases, but not always) its form, we will be inter­ested in works repro­duc­ing only the form of a non-Russ­ian model:

  • Type [X]. “Non-trans­lated” poetic works in Russ­ian, hav­ing a stan­zaic form of Romance ori­gin:

    • dis­tinc­tively Romance stan­zas (ter­cet, ottava rima, Gilbertʼs stanza, Ron­sardʼs stanza...);
    • dis­tinc­tively Romance “fixed forms” (son­net, ron­deau, tri­o­let ...).

While the mod­els for the “trans­lated” work cre­ated under Romance influ­ence are con­sti­tuted by the orig­i­nal, the inter­me­di­ary trans­la­tion(s) and the proto-source, the model for the “non-trans­lated” work cre­ated under Romance influ­ence is an entire class of met­ri­cal and stan­zaic mod­els, rep­re­sented by a par­tic­u­lar stan­zaic scheme and the scheme of a par­tic­u­lar poetic meter.

Lan­guages

The lan­guage of a [T] type text may only be Russ­ian.

The language of an [O] type text may be French, Italian, Spanish or Portuguese.

The lan­guage of an [I] type text may be any lan­guage (French, Eng­lish, Ger­man, Pol­ish, Russ­ian...).

The lan­guage of an [S] type text may be any lan­guage (Ancient Greek, Latin, Hebrew, Ara­bic...).
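
The four language rules above amount to a simple validity check. The following is a minimal sketch; the function name `language_allowed` and the use of plain strings for types and languages are assumptions made for illustration.

```python
# Languages admitted for originals at the first stage of the project.
ORIGINAL_LANGUAGES = {"French", "Italian", "Spanish", "Portuguese"}


def language_allowed(text_type: str, language: str) -> bool:
    """Check a (type, language) pair against the rules stated above."""
    if text_type == "T":
        return language == "Russian"
    if text_type == "O":
        return language in ORIGINAL_LANGUAGES
    if text_type in ("I", "S"):
        # Intermediaries and proto-sources may be in any language.
        return True
    # Other types are not covered by the rules in this section.
    raise ValueError(f"unknown text type: {text_type!r}")


print(language_allowed("O", "Italian"))
print(language_allowed("T", "French"))
```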

Poetry and prose

Our study is con­cerned with lin­guo-poetic forms, viz. Russ­ian poems and their sources. Why do we need pro­saic trans­la­tions? With­out them, we would not be able to estab­lish depen­dency chains. Often­times, inter­me­di­ary trans­la­tions are pro­saic (this is the norm for nine­teenth- and twen­ti­eth-cen­tury French trans­la­tions); more­over, some­times pro­saic trans­la­tions into Russ­ian can serve as inter­me­di­ary trans­la­tions them­selves. In addi­tion, there are his­tor­i­cally and cul­tur­ally impor­tant instances of prose trans­lated into verse, e.g. Tre­di­akovskyʼs “Tele­makhida” (a trans­la­tion of Fénelonʼs prose novel “Les Aven­tures de Télé­maque”), Batiushkovʼs “Is­tochnik” (a trans­la­tion of Parnyʼs poème en prose titled “Le Tor­rent”). Thus, some orig­i­nals are in fact pro­saic. Finally, some cul­tural epochs fea­ture mixed speech forms (e.g., the eigh­teenth cen­tury in French lit­er­a­ture): a pro­saic nar­ra­tive with poetic inser­tions (the con­verse is pos­si­ble as well: a poetic nar­ra­tive with pro­saic inser­tions).

Met­ri­cal and Stan­zaic Forms

Met­rics describe the prop­er­ties of the poetic verse (verse meters) and are con­sti­tuted by two para­me­ters: “Me­ter” and “Line Length”. Stanza refers to the prop­er­ties of verse sequences (i.e. prop­er­ties of stan­zas and fixed forms) and is con­sti­tuted by three para­me­ters: “Stanza or fixed form”, “Clausula” (the sequence of line end­ings in a stanza or a stanza-like pat­tern) and “Rhyme” (the rhyme sequence in a stanza or a stanza-like pat­tern). Other aspects of catalexis and rhyming do not apply to the stanza.

The metrical and stanzaic structure of the work is described in the Corpus subsystem using distinctive features of poetic meters (an approach similar to that implemented in the Poetic Subcorpus of the National Corpus of the Russian Language, see http://ruscorpora.ru/mycorpora-poetic.html). The following features are taken into account:

  1. Meter. The nomenclature:
    (i) Syllabic-accentual verse:
      • trochee
      • iamb
      • dactyl
      • amphibrach
      • anapaest
      • ternary meter with variable anacrusis
    (ii) Accentual verse and verse with variable inter-ictus intervals:
      • dolnik
      • taktovik
      • accentual verse
    (iii) Logaoedic verse:
      • logaoed
      • hexameter
    (iv) Other versification systems:
      • vers libre
      • syllabic verse
      • quantitative verse

    Expla­na­tions on para­graph 1 (on the nomen­cla­ture of meters)

    Both logaoedic lines and logaoedic stan­zas are referred to as “lo­gaoeds”.

    The meter of all types of tonic verse is defined as “ac­cen­tual verse”.

    The meter of all types of syl­labic verse is defined as “syl­labic verse”.

    Unrhymed unmetri­cal verse is defined as “vers libre”.

    The meter of quan­ti­ta­tive verse (e.g., of ancient proto-sources) is defined as “quan­ti­ta­tive verse” and will not be spec­i­fied any fur­ther at the first stage of the pro­ject.

  2. Line length in the “counting units” for the particular versification system (feet, ictuses, syllables). We distinguish monometric texts (with constant line length), regular heterometric texts (with regular alternation of lines of different lengths: formulas “4+3+4+3”, “4+4+4+2”, etc.), as well as free heterometric texts (with irregular alternation of lines of different lengths: formulas “4/5/6”, “1/2/3/4/5/6”, etc.; the frequency of lines of various lengths is not taken into account).

    A note to para­graph 2 (on mono- and het­ero­metric­ity)

    Both mono­met­ric and het­ero­met­ric texts are non-poly­met­ric. For poly­met­ric texts see below, para­graph 6.

    A note to para­graph 2 (on line length)

    The length of syllabic-accentual verse lines is measured in feet (the clausula does not count; in the case of a ternary meter with variable anacrusis, neither the clausula nor the anacrusis counts). Line length for dolnik, taktovik and accentual verse is measured in metrically stressed syllables, i.e. ictuses (stress in the clausula or anacrusis does not count, nor does extra-pattern stress; unstressed ictuses do count).

    When computing the length of a syllabic verse line, the syllables of the clausula and of the expanded caesura do not count. Therefore, some traditional national designations of syllabic meters require a “syllable recount”. Thus, the “length” of French decasyllabic verse with the formula “10+(0/1)” is considered equal to 10 syllables, and the length of Italian hendecasyllabic verse with the formula “10+1” (in low genres: “10+(0/1/2)”) is likewise considered equal to 10 syllables. Similarly, the length of the French alexandrine (dodecasyllabic verse) with the formula “6+6+(0/1)” is considered equal to 12 syllables, and so is the length of the Spanish alexandrine with the formula “6+(0/1)+6+1”, although its traditional representation, for a line with a feminine caesura and an (obligatory) feminine clausula, is “7+7” (tetradecasyllable, fourteener). In other words, to determine the length of syllabic verses, only “metric” syllables are counted (as is customary in the French tradition, but not in the Italian, Spanish or Portuguese traditions).

  3. Clausula (mas­cu­line, fem­i­nine, dactylic, hyper­dactylic), given by a for­mula [mfmf, ffmffm, dmdm, etc.].
  4. Rhyme, given by a for­mula [abab, abba, etc.].
  5. Some combinations of parameters 1 to 4 that produce traditional stanzas and fixed forms are provided with appropriate meta-information (ottava rima, Ronsardʼs stanza, tercets, sonnet, triolet, etc.). For other stanzaic texts the stanza length (in lines) is indicated. This parameter may have a zero value (for astrophic verse).
  6. Polymetric compositions (PMC). This metrical structure is characteristic of a number of Romance verse forms, in particular of the cantata. The meter and stanza vary across the fragments of a PMC, which nevertheless remains a single work.

    A note to para­graph 6 (on poly­metric­ity)

    For PMC, the field must be filled with mul­ti­ple val­ues: each mono­met­ric frag­ment is described by its own para­me­ters; the whole work is find­able by search­ing for para­me­ters cor­re­spond­ing to one of the mono­met­ric frag­ments. In addi­tion, a sep­a­rate “Poly­metric­ity” para­me­ter is intro­duced, mak­ing it pos­si­ble to find, in the Cor­pus sub­sys­tem, PMC of any struc­ture (with­out spec­i­fy­ing the struc­ture). At the first stage of the pro­ject, PMCs are defined only by the lat­ter para­me­ter.
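
The line-length conventions of paragraph 2 can be illustrated with a short sketch. It assumes that the formulas are written exactly as quoted above; the function names and the `obligatory_clausula` flag are hypothetical and serve only to make the “syllable recount” explicit.

```python
def classify_line_length(formula: str) -> str:
    """Classify a line-length value such as "5", "4+3+4+3" or "4/5/6"."""
    formula = formula.strip()
    if formula.isdigit():
        return "monometric"
    if all(part.isdigit() for part in formula.split("+")):
        return "regular heterometric"
    if all(part.isdigit() for part in formula.split("/")):
        return "free heterometric"
    raise ValueError(f"unrecognised line-length formula: {formula!r}")


def metric_syllables(formula: str, obligatory_clausula: bool = False) -> int:
    """Count only the "metric" syllables of a syllabic-verse formula.

    Parenthesised terms such as "(0/1)" are optional syllables of the
    clausula or expanded caesura and are never counted. If the formula
    ends in an obligatory clausula written as a bare number (the final
    "1" in "10+1"), the caller says so, and that term is dropped too.
    """
    terms = formula.replace(" ", "").split("+")
    counted = [int(term) for term in terms if not term.startswith("(")]
    if obligatory_clausula:
        counted = counted[:-1]
    return sum(counted)


print(classify_line_length("4+3+4+3"))
print(metric_syllables("10+(0/1)"))                               # French decasyllable
print(metric_syllables("10+1", obligatory_clausula=True))         # Italian hendecasyllable
print(metric_syllables("6+(0/1)+6+1", obligatory_clausula=True))  # Spanish alexandrine
```

The flag is needed because the formula alone does not tell whether a bare final number is a hemistich (as in “6+6”) or an obligatory clausula (as in “10+1”).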

Meta­data (attrib­utes)

Texts found in the Cor­pus sub­sys­tem are sup­plied with the fol­low­ing attrib­utes:

  • Short title of work (Author + Title)
  • Bib­li­o­graphic descrip­tion of work (accord­ing to the edi­tion from which the text is extracted for the Cor­pus)
  • Author of work (author of the orig­i­nal if a trans­lated work)
  • Trans­la­tor (if a trans­lated work)
  • Pub­li­ca­tion date (accord­ing to bib­li­o­graphic descrip­tion)
  • Date of com­po­si­tion (if the pre­cise date is unknown, a range is given)
  • Date of the first pub­li­ca­tion (if known)
  • Lan­guage of work
  • Type of work in the Corpus (O, T, I or S)
  • Speech form (verse; prose; mixed: verse + prose)
  • Met­ri­cal and stan­zaic para­me­ters of work:
    • meter
    • line length
    • clausula
    • rhyme
    • stanza or fixed form
    • poly­metric­ity
  • Notes (this field con­tains data which is kept infor­mal at this stage)

This meta­data set accom­pa­nies works of all types. This approach lets us obtain the accom­pa­ny­ing infor­ma­tion (meta­data) when work­ing with any text, makes it eas­ier and more intu­itive to estab­lish links between texts, and makes attribute search more flex­i­ble.

The meta­data are invoked by click­ing the (i) but­ton on the tool­bar.

Exam­ple:

author:
Petrarca F.
translator:
Kuzmin M. A.
title:
«Столь сладкой негой, что сказать не в силах...»
title:
Сонет 116
pub­li­ca­tion date:
1996
date of com­po­si­tion:
1927
first pub­lished:
1996
lan­guage:
Russ­ian
type:
trans­la­tion
speech form:
verse
meter:
iamb
verse length:
5
clausula:
f
rhyme:
abba abba cgd dcg
stanza or fixed form:
son­net
poly­metric­ity:
no
note:
Trochaic stress shifts in lines 7 and 10.
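
A listing of this kind can be read into a record and queried by attribute. The sketch below assumes the layout shown in the example (a label line ending in a colon, followed by one value line) and uses only some of its fields. The names `parse_listing` and `matches` are hypothetical and do not describe the actual implementation of the Corpus subsystem.

```python
EXAMPLE = """\
author:
Petrarca F.
title:
«Столь сладкой негой, что сказать не в силах...»
title:
Сонет 116
language:
Russian
meter:
iamb
verse length:
5
stanza or fixed form:
sonnet
"""


def parse_listing(text: str) -> dict[str, list[str]]:
    """Turn alternating "label:" / value lines into a multi-valued record."""
    record: dict[str, list[str]] = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if label is None and line.endswith(":"):
            label = line[:-1]
        elif label is not None:
            # A repeated label (two "title" fields here) keeps all values.
            record.setdefault(label, []).append(line)
            label = None
        # A value line without a preceding label is silently skipped.
    return record


def matches(record: dict[str, list[str]], criteria: dict[str, str]) -> bool:
    """True if every requested attribute has the requested value."""
    return all(value in record.get(name, []) for name, value in criteria.items())


record = parse_listing(EXAMPLE)
print(record["title"])
print(matches(record, {"meter": "iamb", "stanza or fixed form": "sonnet"}))
```

Keeping a list of values per attribute accommodates repeated fields and would also suit attributes that take several values at once.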

Titles and texts

Titles found in the Corpus subsystem are represented by full texts and metadata. However, the full text may be temporarily unavailable (to be added later). Metadata can be incomplete (e.g., when the author of an intermediary translation is known, but no other data on the title is at hand, whether temporarily or permanently).

The Cor­pus sub­sys­tem may con­tain indi­vid­ual works that are not rep­re­sented by full texts and simul­ta­ne­ously have unspec­i­fied attribute val­ues (e.g. “un­known orig­i­nal”, “uniden­ti­fied inter­me­di­ary trans­la­tion”). Such poten­tial texts are in fact desider­ata, as they stim­u­late research.

Exam­ple 1. Alexan­der Tokarev, a mem­ber of the “Green Lamp”, left behind only two poems (1819). One, as Boris Toma­shevsky estab­lished, is a trans­la­tion of Paul Scar­ronʼs comic son­net “Su­perbes mon­u­ments de lʼorgueil des humains...” The sec­ond one is a comic eulogy to tobacco, also, most likely, a trans­la­tion, but its source is still unknown.

Exam­ple 2. Most trans­la­tions from Ital­ian, Span­ish and Por­tuguese, made in the eigh­teenth cen­tury, when these lan­guages were poorly known in Rus­sia, are based on French and Ger­man inter­me­di­aries, not all of which have been iden­ti­fied.

A work of any type [O, T, I or S] is represented by one text. Russian-language works of the type [T] are represented by one of the authoritative texts from a publication selected on the basis of expert opinion. Originals [O] and intermediary translations [I] are represented by the edition that was presumably the source of the translation, or that reproduces the text of this source. If the actual text is not accessible, the work is represented by metadata only.

Mod­ern author­i­ta­tive edi­tions of orig­i­nals and trans­lated works are con­sid­ered meta­texts and hence are found in the Library sub­sys­tem along with other meta­texts (research texts). Such pub­li­ca­tions are repro­duced in their entirety, that is, together with the accom­pa­ny­ing arti­cles, com­ments, etc., as in the edi­tion.

On the struc­ture of the Library sub­sys­tem see the appro­pri­ate sec­tion of this descrip­tion. The issue of the rela­tion­ship between objects in the two sub­sys­tems and switch­ing between them is dis­cussed below.

Orig­i­nals and inter­me­di­ary trans­la­tions

In rare cases, a translated work may have more than one original. Firstly, there are (semi)translated works such as the “excerpts” from Baratynskyʼs poem “Воспоминания” (Recollections), where some of the text is translated from the abbé Delille, some from Gabriel Legouvé, while other parts are translations of a still unknown original (or may be original themselves). Secondly, there are works translated from several editions of the original, e.g. several Russian translations of Millevoyeʼs elegy “La Chute des feuilles” (available in five redactions, quite divergent from each other). In the second case, the problem of versions arises again, a problem that we have temporarily set aside for the case of several versions of a translation.

One translated work may be based on more than one intermediary translation. An intermediary translation may be hypothetical, i.e. not known for certain (for example, if the translator did not know the original language).

The issues of vari­abil­ity and frag­men­ta­tion of texts

Variability. The example of the Russian translations of Millevoyeʼs elegy “La Chute des feuilles” (see above) brings us to the extremely important philological issue of instability and variation in the text of a literary work. This applies in equal measure to original and translated works. Thus, of the four sonnets by Petrarch translated by Osip Mandelshtam (1933–1934), the first translation is known in three redactions, the second in three, the third in two, and the fourth in five.

At the first stage of the pro­ject, we refrain from an over­all solu­tion of the com­plex prob­lem of text vari­ants and the rela­tion­ship between them. In a sit­u­a­tion where a trans­la­tor has cre­ated sev­eral sig­nif­i­cantly dif­fer­ent ver­sions (“redac­tions”) of a trans­la­tion, they can be rep­re­sented in the sys­tem as sep­a­rate trans­lated works of the same orig­i­nal and by the same author.

Another prob­lem is frag­men­ta­tion. A trans­lated work can be the trans­la­tion not of the entire orig­i­nal, but rather of some of its frag­ments. Even a small poem can be short­ened in trans­la­tion, some­times con­sid­er­ably. At times, such frag­men­ta­tion is caused by crit­i­cal arti­cles, antholo­gies, text­books and other texts, repro­duc­ing only one, the most sig­nif­i­cant or pop­u­lar, frag­ment of the work. Thus, the imi­ta­tion of Petrarch by Ivan Dmitriev (1797) is, in fact, only the fourth stanza from Petrar­chʼs can­zone “Di pen­sier in pen­sier...”, read by Dmitriev in Ital­ian with a par­al­lel French trans­la­tion by Pierre-Charles Levesque, who printed this stanza as a sep­a­rate work. Kon­stan­tin Batiushkovʼs poem “Рыдайте амуры и нежные грации...” (“Weep, Amours, ten­der Graces, weep...”, 1810–1811) is a trans­la­tion of three ini­tial and three final lines from Paolo Rol­liʼs poem: only these lines were repro­duced in Anto­nio Scop­paʼs poetic trea­tise, in which Batiushkov had found Rol­liʼs poem.

Quite fre­quent are incom­plete trans­la­tions of long poems, e.g. epics. Sev­eral can­tos and even frag­ments of some can­tos from Dan­teʼs Divine Com­edy, Arios­toʼs Orlando Furioso, Tas­soʼs Jerusalem Deliv­ered have been trans­lated into Russ­ian a num­ber of times. The first full verse trans­la­tions of these works appeared quite late (Orlando Furioso appeared in Russ­ian as late as 1993, in Mikhail Gas­parovʼs free verse trans­la­tion; a com­plete trans­la­tion repro­duc­ing the rhymed ottavas is still lack­ing). Also com­mon are trans­la­tions of poetic insets from large pro­saic works (see below for an exam­ple of trans­la­tions of a romance from Don Quixote by Cer­vantes).

At the first stage, the issue of frag­men­ta­tion will not be addressed.

The issue of meta­data hier­ar­chy

The prob­lem of meta­data hier­ar­chy is directly related to that of frag­men­ta­tion. Pri­or­ity should be given to struc­turally sim­ple works, so at the first stage we con­fine our­selves mainly to small lyri­cal gen­res and mono­met­ric texts.

However, we have already included a few large texts consisting of structural units of lower levels (for example, Divine Comedy > Hell, Purgatory, Paradise > individual cantos from Hell, Purgatory and Paradise). At the first stage, such texts are identified as separate titles (works), and the translation of any of their fragments is identified “simply” as a translation of the whole work. The next step should be to work out ways of linking fragmentary translations to the corresponding fragments of the original. In other words, we will need to introduce a two-level hierarchy into the metadata (the whole work and its parts).
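
One possible shape of such a two-level hierarchy is sketched below. It is not a description of the planned implementation: the class names, the field names and the translator labels are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class WholeWork:
    title: str
    parts: list[str] = field(default_factory=list)  # labels of lower-level units


@dataclass
class Translation:
    translator: str
    work: WholeWork
    # An empty list reproduces the first-stage behaviour: the translation
    # is linked to the whole work only, not to particular parts.
    part_labels: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        unknown = [p for p in self.part_labels if p not in self.work.parts]
        if unknown:
            raise ValueError(f"not parts of {self.work.title!r}: {unknown}")


def translations_of_part(
    translations: list[Translation], part_label: str
) -> list[Translation]:
    """Find translations explicitly linked to one part of a work."""
    return [t for t in translations if part_label in t.part_labels]


comedy = WholeWork("Divine Comedy", ["Hell, Canto 1", "Hell, Canto 5"])
linked = Translation("Translator A (placeholder)", comedy, ["Hell, Canto 5"])
unlinked = Translation("Translator B (placeholder)", comedy)

found = translations_of_part([linked, unlinked], "Hell, Canto 5")
print([t.translator for t in found])
```

Keeping the part link optional lets records created at the first stage remain valid once the hierarchy is introduced.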