Letters from Descartes in digital format

An exercise in conversion

Dirk Roorda

@ eHumanities 2012-01-26

overview
• the task
• the method
• the lessons
• the result
• demo

The Task: converting from

JapAM

Descartes Correspondence

ca. 700 letters

69,237 lines

600 formulas

4.2 MB (without the 311 pictures)

to

CKCC corpus Descartes

XML: Text Encoding Initiative (TEI)

~ 35,000 elements, of which

7,700 paragraphs

6,200 formulas

6,000 text-formattings

4,200 structural elements

2,900 page-breaks

538 images

The (re)Sources

The method

observation

non-algorithmic changes

consolidation

proofs

Observation

use digital equipment:

observation: italic scopes

replace

=(.*?)\$

by

<italic>match1</italic>

???

Aargh!#@\€]
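
The replacement on this slide can be tried out in a few lines. The sketch below is in Python rather than the Perl of convert.pl, and it assumes, based only on the pattern shown, that `=` opens an italic scope and `$` closes it. The sample lines are invented for illustration; they are not taken from the corpus. The second sample shows one way a naive replacement can go wrong: a `=` that is not an italic opener captures the wrong scope.

```python
import re

# Assumed convention (inferred from the slide's pattern): '=' opens an
# italic scope and '$' closes it. Sample lines are invented, not corpus text.
ITALIC = re.compile(r"=(.*?)\$")


def mark_italics(line: str) -> str:
    """Replace each =...$ scope with <italic>...</italic> (non-greedy match)."""
    return ITALIC.sub(r"<italic>\1</italic>", line)


def unmatched_openers(line: str) -> int:
    """Count '=' characters that survive the replacement (a warning sign)."""
    return mark_italics(line).count("=")


clean = "Je vous envoie =la Dioptrique$ ce matin."
tricky = "Si a = b, alors =donc$ c."

print(mark_italics(clean))        # the intended case
print(mark_italics(tricky))       # the scope starts at the wrong '='
print(unmatched_openers(tricky))  # leftover '=' flags the problem
```

Counting leftover delimiters after each replacement is a cheap way to find the lines that need a closer look.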

conversion process

The anatomy of conversion

convert.pl

100 KB of program code text

=

25 densely typed pages

=

3427 lines

of which

2175 real code lines

Code/Input = 1/32
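
The ratio can be reproduced from figures given in the slides, assuming "code" means the 2175 real code lines and "input" means the 69,237 lines of the JapAM source:

```latex
\[
\frac{\text{code lines}}{\text{input lines}}
  = \frac{2175}{69\,237}
  \approx \frac{1}{31.8}
  \approx \frac{1}{32}
\]
```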

Statistics

1/3 of the tasks need 2/3 of the code

task group (number of tasks): share of code / share of run time

formulas (2): 37 % / 29 %

headers, openers, closers (3): 16 % / 6 %

meta and images (3): 11 % / 10 %

total run time (25): 40 sec
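
The headline can be checked against the figures on this slide. This reading assumes that the bracketed numbers are task counts and that the first percentage listed for each group is its share of the code; the slide does not state either explicitly.

```latex
\[
\frac{2 + 3 + 3}{25} = \frac{8}{25} = 0.32 \approx \frac{1}{3},
\qquad
37\,\% + 16\,\% + 11\,\% = 64\,\% \approx \frac{2}{3}
\]
```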

The tricks of conversion
• task = configuration + workflow
• Count and check
• Performance matters
• Do not give up automation

(2a) that can be run separately

(2b) that can be reordered easily
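
A minimal sketch of how "task = configuration + workflow", separately runnable and reorderable subtasks, and "count and check" might fit together. It is written in Python, not the Perl of convert.pl, and every name in it (`REGISTRY`, `WORKFLOW`, `run`, `check_counts`, the `[page]` marker) is hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, List, Optional, Set

Subtask = Callable[[str], str]


def strip_trailing_space(text: str) -> str:
    """Remove trailing whitespace from every line."""
    return "\n".join(line.rstrip() for line in text.split("\n"))


def mark_page_breaks(text: str) -> str:
    """Hypothetical input convention: '[page]' marks a page break."""
    return text.replace("[page]", "<pb/>")


# Configuration: the subtasks by name, and the order to run them in.
REGISTRY: Dict[str, Subtask] = {
    "strip_trailing_space": strip_trailing_space,
    "mark_page_breaks": mark_page_breaks,
}
WORKFLOW: List[str] = ["strip_trailing_space", "mark_page_breaks"]


def run(text: str, workflow: List[str] = WORKFLOW,
        only: Optional[Set[str]] = None) -> str:
    """Apply the subtasks in workflow order; 'only' runs a subset."""
    for name in workflow:
        if only is None or name in only:
            text = REGISTRY[name](text)
    return text


def check_counts(before: str, after: str) -> bool:
    """Count and check: every input marker must become one output element."""
    return before.count("[page]") == after.count("<pb/>")


sample = "Monsieur,  \n[page]\nVotre serviteur."
result = run(sample)
print(result)
print(check_counts(sample, result))
```

Because the order lives in a list rather than in the control flow, reordering is a one-line change, and `run(sample, only={"mark_page_breaks"})` exercises a single subtask on its own.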

5. Performance matters!

was 30+ seconds

is now 2.07 seconds

many new subtasks based on same template

(gain = 15 * 30 sec = 7.5 min per run)

many, many runs before everything is OK

(gain = 100 * 7.5 min = 12.5 hours CPU-time)
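
Written out with units, and assuming the timings are per subtask, 15 is the number of subtasks built on the template, and 100 is a round estimate of the number of runs:

```latex
\[
\frac{30\ \text{s}}{2.07\ \text{s}} \approx 14.5
\quad \text{(speed-up per subtask)}
\]
\[
15 \times 30\ \text{s} = 450\ \text{s} = 7.5\ \text{min per run},
\qquad
100 \times 7.5\ \text{min} = 750\ \text{min} = 12.5\ \text{h}
\]
```

The slide rounds the saving per subtask up to the old 30 seconds; using 30 − 2.07 ≈ 28 seconds instead gives roughly 7 minutes per run.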

6. Do not give up automation

we used a lot of expert knowledge

which has all been transferred to

• the source
• consolidated extra inputs

so the conversion is still repeatable and modifiable
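
One way to keep hand-made decisions repeatable is to record them as data that the conversion re-applies on every run. The sketch below is an assumption about what "consolidated extra inputs" could look like, not a description of the actual files: the tab-separated layout, the letter identifiers and the function names are all hypothetical.

```python
import csv
import io
from typing import Dict, List, Tuple

Correction = Tuple[str, int, str, str]  # letter id, line number, old, new

# Hypothetical corrections file: one manual fix per row.
CORRECTIONS_TSV = (
    "letter\tline\told\tnew\n"
    "L001\t1\tMonsieru\tMonsieur\n"
    "L999\t1\tx\ty\n"
)


def load_corrections(tsv_text: str) -> List[Correction]:
    """Parse the tab-separated corrections into tuples."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [(r["letter"], int(r["line"]), r["old"], r["new"]) for r in reader]


def apply_corrections(letters: Dict[str, List[str]],
                      corrections: List[Correction]):
    """Return a patched copy plus the corrections that matched nothing."""
    patched = {key: list(lines) for key, lines in letters.items()}
    unused: List[Correction] = []
    for letter, line_no, old, new in corrections:
        lines = patched.get(letter)
        if lines and 1 <= line_no <= len(lines) and old in lines[line_no - 1]:
            lines[line_no - 1] = lines[line_no - 1].replace(old, new, 1)
        else:
            unused.append((letter, line_no, old, new))
    return patched, unused


letters = {"L001": ["Monsieru,", "je vous ecris"]}
patched, unused = apply_corrections(letters, load_corrections(CORRECTIONS_TSV))
print(patched["L001"][0])
print(unused)
```

Reporting the corrections that no longer match anything shows when a change in the source has made a recorded fix obsolete.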

Thank You

conversion program