What kind of edits do and do not happen to the LSJ in Logeion?

The LSJ was painstakingly keyboarded by the Perseus Project and processed to become machine-actionable in many ways. Scripts were used to discover and mark up sense distinctions, citations, etc., and to split one entry from the next. That last seems obvious, but in the original typesetting the start of a new entry may just look like so:

-ια, (sc. ἱερά), τά, festival of Dionysus..

So the task of the scripts was to recognize the start of a new entry, find the most recent hyphen, and supplement the word properly (a source of many mistakes, because humans feel a less pressing need than computers for hyphens at every turn). [Easter egg: see what I now need to go edit in this entry? Blegh..]
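To make those three steps concrete, here is a minimal sketch in Python. It is not the Perseus code: the function name `restore_headwords` and the sample headwords are made up for illustration, and it assumes that a printed headword starting with a hyphen should be attached to whatever stood before the hyphen in the most recent hyphenated headword. It ignores accents and vowel-length marks entirely.

```python
def restore_headwords(printed):
    """Supplement hyphen-initial headwords from the most recent hyphenated one.

    `printed` is a list of headwords as they appear on the page, in order.
    Hypothetical helper for illustration only.
    """
    restored = []
    stem = None  # text before the last hyphen of the most recent hyphenated headword
    for form in printed:
        if form.startswith("-"):
            # Abbreviated headword: needs a stem from an earlier entry.
            if stem is None:
                raise ValueError(f"no earlier hyphen to supplement {form!r}")
            restored.append(stem + form[1:])
        else:
            head, hyphen, _ = form.rpartition("-")
            if hyphen:
                # Remember the stem for later abbreviated headwords.
                stem = head.replace("-", "")
            restored.append(form.replace("-", ""))
    return restored


# Illustrative input only; not a transcription of an actual LSJ page.
print(restore_headwords(["Διονυσ-ιάζω", "-ια"]))
```

Note that a headword without any hyphen leaves the remembered stem untouched, so a later `-ια` is still attached to the older stem. When the printed page expected the reader to infer a different stem, this is exactly where a wrong supplementation creeps in.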

Many entries did not in fact get split up, or got the wrong supplementation. Take a recent problem report:

The correspondent is completely correct in saying that these words (ἔτραγον, ἔττακαν,..) ought to be separate entries in principle. However, they are not lemmata, but dialect forms that belong with other lemmata.

Perhaps you think at this point, why don’t we just add a hard return and have a new entry already? Well.. Here’s a look at this snippet in xml:

<div2 id="crosse(to/s" orig_id="n43294" key="e(to/s" type="main" opt="n"><head extent="full" lang="greek" opt="n" orth_orig="ἑτός">ἑτός</head>, <itype lang="greek" opt="n">ή</itype>, <itype lang="greek" opt="n">όν</itype>, verb. <pos opt="n">Adj.</pos> of <foreign lang="greek">ἵημι</foreign>, <sense id="n43294.0" n="" level="1" opt="n"><i>sent</i>, only in compds., as <foreign lang="greek">ἀν-ετός, ἀφ-ετός</foreign>. <orth extent="full" lang="greek" opt="n">ἔτρᾰγον</orth>, <tns opt="n">aor. 2</tns> of <foreign lang="greek">τρώγω</foreign>. <orth extent="full" lang="greek" opt="n">ἔττακαν·</orth> <foreign lang="greek">ἔστησαν</foreign>, <author>Hsch.</author> <orth extent="full" lang="greek" opt="n">ἔττε</orth>, <abbr>v.</abbr> sub <foreign lang="greek">ἔστε</foreign>.</sense></div2>

A separate div2 with all its attributes, plus a ‘head’ with its attributes, would have to be produced for each of ἔτραγον, ἔττακαν, and ἔττε, and the <sense..> tag would have to be limited to only the relevant entry. In other words, just adding some white space doesn’t suffice.
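For the curious, this is roughly what such a split involves. The following is a sketch using Python's standard `xml.etree.ElementTree`, not the workflow actually used for Logeion. It takes the snippet above (typed with plain straight quotes so that it parses) and turns every `orth` inside the `sense` into its own div2. The function name `split_entry`, the `-splitN` id scheme, and the use of the Unicode headword as `key` are all assumptions made for the example; the real `key` values are Beta Code, and the real ids follow their own numbering.

```python
import copy
import xml.etree.ElementTree as ET

SNIPPET = (
    '<div2 id="crosse(to/s" orig_id="n43294" key="e(to/s" type="main" opt="n">'
    '<head extent="full" lang="greek" opt="n" orth_orig="ἑτός">ἑτός</head>, '
    '<itype lang="greek" opt="n">ή</itype>, <itype lang="greek" opt="n">όν</itype>, '
    'verb. <pos opt="n">Adj.</pos> of <foreign lang="greek">ἵημι</foreign>, '
    '<sense id="n43294.0" n="" level="1" opt="n"><i>sent</i>, only in compds., as '
    '<foreign lang="greek">ἀν-ετός, ἀφ-ετός</foreign>. '
    '<orth extent="full" lang="greek" opt="n">ἔτρᾰγον</orth>, <tns opt="n">aor. 2</tns> of '
    '<foreign lang="greek">τρώγω</foreign>. '
    '<orth extent="full" lang="greek" opt="n">ἔττακαν·</orth> '
    '<foreign lang="greek">ἔστησαν</foreign>, <author>Hsch.</author> '
    '<orth extent="full" lang="greek" opt="n">ἔττε</orth>, <abbr>v.</abbr> sub '
    '<foreign lang="greek">ἔστε</foreign>.</sense></div2>'
)

# Raised dot that may be glued to a gloss headword (either code point).
TRAILING_PUNCT = "\u00b7\u0387"


def split_entry(div2):
    """Return [original div2, new div2, ...], one new div2 per <orth> in <sense>."""
    sense = div2.find("sense")
    children = list(sense)
    starts = [i for i, child in enumerate(children) if child.tag == "orth"]
    if not starts:
        return [div2]

    entries = [div2]
    for n, start in enumerate(starts, 1):
        end = starts[n] if n < len(starts) else len(children)
        orth = children[start]
        word = orth.text.rstrip(TRAILING_PUNCT)
        new_id = f"{div2.get('orig_id')}-split{n}"  # hypothetical id scheme
        entry = ET.Element("div2", {"id": new_id, "orig_id": new_id,
                                    "key": word,  # placeholder, not Beta Code
                                    "type": "main", "opt": "n"})
        head = ET.SubElement(entry, "head", {"extent": "full", "lang": "greek",
                                             "opt": "n", "orth_orig": word})
        head.text = word
        head.tail = orth.tail  # punctuation that followed the headword
        new_sense = ET.SubElement(entry, "sense", {"id": f"{new_id}.0", "n": "",
                                                   "level": "1", "opt": "n"})
        # Everything up to the next <orth> belongs to this new entry.
        for child in children[start + 1:end]:
            new_sense.append(copy.deepcopy(child))
        entries.append(entry)

    # Limit the original <sense> to the material of the original entry.
    for child in children[starts[0]:]:
        sense.remove(child)
    return entries


for entry in split_entry(ET.fromstring(SNIPPET)):
    print(ET.tostring(entry, encoding="unicode"))
```

Even this toy version has to invent ids, rebuild the head, and decide which stretch of text belongs to which word, which is why a hard return doesn't do the job.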

As a result, I choose my battles – if I see that an entry in Logeion is really hiding additional lemmata (and they are lemmata that LSJ doesn’t explicitly keep together with a ‘hence’ or the like), I’ll split the entry and give it a new id, etc. If it’s a dialect form or an only-in-Hesychius word, I direct my efforts elsewhere. [If you feel strongly about an entry for every stray Hesychius word, be my guest and edit the files on GitHub..]

The edit that did make it: DR discovered κνικνίκος v. κνῆκος in the κνίκιον entry. That is going to get a proper correction. As is that inscription citation for the Dionysia that this entry started with.

2 responses

    • True! I should either split the entries after all or add ἔττε to my morphological database, so that no one who reads the half dozen inscriptions from Boeotia in which it occurs gets left out in the cold.. I looked at the word wheel and I do see that it’s hard to figure out where to look in LSJ to find this entry, because there are a lot of intervening entries from other dictionaries. [PS: Done.]
