[ e-Science and Ancient Documents ]

I am at the Oxford e-Research Centre attending a workshop called Understanding Image-based Evidence.

Introduction

Prof. Sir Michael Brady (Oxford)

Prof. Brady gave an introduction describing the historical background of image interpretation in the Humanities. He also gave a brief overview of the sessions planned for the day.

Quality of Digitisation and Capture of Evidence

eSAD Research Question: Dr. Melissa Terras (UCL)

Dr. Terras introduced the session by describing how resources on the web have transformed the way that we do research. She ponders the consequences of the inherent nature of digital images as illusions.

Invited Speaker: Prof. Lindsay MacDonald (LCC, UCL)

Prof. MacDonald ponders what resolution we need in order to capture every man-made detail in a specific object. He describes different traditional techniques to acquire high-resolution images, charting their performance by resolution. He also shows some alternative methods, like laser scanning. He uses a real artefact as an example to determine what spatial resolution is needed to represent that specific artefact accurately in digital form.

Other factors may also determine the technology that is required. The researcher might also be interested in the composition of the paper, or the quality of the colour. Cameras will always see colours differently from a human observer. He describes a common problem where an observer matches two colours under one light, but the colours might not match under a different light, or when assessed by a different observer.

He concludes by giving some recommendations on how to digitise better:

  • Analyse the original documents.

  • Choose the best scanner or camera to minimise distortion.

  • Select a resolution of at least double the highest spatial frequency of the original.

  • Generate and use ICC profiles.

  • Store RAW image files.
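
To make the resolution rule concrete, here is a minimal sketch of my own (it was not shown in the talk). It assumes the finest detail in the original can be described by its period, that is, the width of one full cycle of the finest repeating pattern in millimetres. The function name and the 0.1 mm example value are hypothetical.

```python
MM_PER_INCH = 25.4


def min_sampling_ppi(smallest_period_mm: float) -> float:
    """Minimum pixels per inch for a pattern whose finest detail
    has the given period (one full cycle, in millimetres)."""
    if smallest_period_mm <= 0:
        raise ValueError("period must be positive")
    # Highest spatial frequency present in the original, in cycles per mm.
    highest_freq_cycles_per_mm = 1.0 / smallest_period_mm
    # The rule from the list above: sample at no less than double that frequency.
    samples_per_mm = 2.0 * highest_freq_cycles_per_mm
    return samples_per_mm * MM_PER_INCH


# Hypothetical artefact: the finest strokes repeat every 0.1 mm.
print(round(min_sampling_ppi(0.1)))  # 508
```

This is only a lower bound. Lens blur, sensor noise and colour filter arrays all argue for some margin above it.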

QP

The decisions taken at the time of digitisation for a specific project might affect what can be done with those images in future projects. Ideally you want to capture as much information as you can.

Respondent: Dr. Julia Craig-McFeely (Oxford, Royal Holloway)

Dr. Craig-McFeely argues that we need to take into account the way that each person’s perception of colour is different.

Evaluating evidence

eSAD Research Question: Dr. Ségolène Tarte (Oxford)

Ségolène is interested in how the cognitive processes of interpretation work.

Invited Speaker: Dr. Floris Bex (Dundee)

Dr. Bex studies argumentation using evidence. He handed us a concise two-page summary of the issues in dialogue and argumentation.

It is important to visualise the structure of an argument to be able to identify the points where counter-arguments can be found. It is important when building an argument to be able to identify evidence. It is also important to have a grasp of what the assumptions are, the points from which the argument is built.

He gives an example of argument visualisations where he shows how experts make inferences about a document, based on a piece of evidence, philological knowledge and assumptions. Counter-arguments can be made by questioning any of the steps of the argument, for example denying the assumptions, or questioning the ability of the expert.
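
As a rough illustration of that structure (my own sketch, not the notation or software Dr. Bex showed), an argument can be stored as a conclusion plus the evidence, knowledge and assumptions it rests on. Every one of those parts, and the expert making the inference, is a point a counter-argument could target. The class, field names and example claims below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Argument:
    conclusion: str
    evidence: List[str] = field(default_factory=list)
    knowledge: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)
    expert: str = ""

    def attack_points(self) -> List[Tuple[str, str]]:
        """List every part of the argument that could be questioned."""
        points = [("evidence", e) for e in self.evidence]
        points += [("knowledge", k) for k in self.knowledge]
        points += [("assumption", a) for a in self.assumptions]
        if self.expert:
            points.append(("expert", self.expert))
        return points


# Invented example reading, for illustration only.
reading = Argument(
    conclusion="The third character is an 'a'",
    evidence=["A diagonal stroke is visible in the image"],
    knowledge=["This script forms 'a' with a diagonal stroke"],
    assumptions=["The mark is ink, not damage to the surface"],
    expert="Reader A",
)

for kind, claim in reading.attack_points():
    print(f"{kind}: {claim}")
```

Listing the parts explicitly is what makes it possible to say exactly where someone disagrees.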

Structuring reasoning in the form of an argument allows us to understand each other easily and share knowledge. We can also easily support our ideas and provide the framework for others to make the reasoning better, or even come up with a better theory.

Dr. Bex gives a demo of a new project in which parts of an argument are explicitly declared. Others can then question, disagree, and take part in the argument in a structured way.

Dr. Bex recommends reading Edward Tufte.

QP

This system ignores the confidence levels in an argument. It does not cover cases where there is some doubt, but a reasonable argument can be made with some level of certainty, though not complete certainty.

Respondent: Dr. Sanjay Modgil (KCL)

Dr. Modgil argues that argumentation theory is common sense, and that it builds on established work in logic and AI. This is important because it is an accessible resource for research communities in all areas. Argumentation schemes can map out the strength of an argument with respect to a specific topic. They support users in fulfilling their dialectical obligations. The theory also offers a standard way to assess arguments and counter-arguments, and a clear path through a dialogue to the points where additional arguments or counter-arguments should be made.

The intention is not at all to replace the role of experts in interpreting the texts, but to make it more efficient. That implies that we first need to understand what that role is.

Restoration, Palaeographical Knowledge Bases and Classification of Letter-Forms

eSAD Research Question: Henriette Roued-Cunliffe (Oxford)

Henriette’s work is on character recognition. She is working on APELLO, a web-based search engine that offers suggestions for pattern recognition using EpiDoc-based corpora.

Update: I misunderstood Henriette’s work, and she corrected me. My apologies, and thanks for the update.

I just wanted to rectify this. I do not do any work whatsoever on character recognition. It is something that we work on as a part of the project, but I am not directly involved with it. My work is instead on building a Decision Support System for reading ancient documents and as a part of this I could use different knowledge bases such as a character recognition system. I am working on one such word search knowledge base.

Invited Speaker: Dr. Peter Stokes (Cambridge)

Dr. Stokes is interested in recognising authors, styles and dates in text. There is a standard criticism of palaeography in which experts are accused of hiding any doubt in their arguments and relying on their authority as the ultimate defence.

Computers can help the process but they will never replace the work of an analyst. Computers cannot become an authority, or even a witness. We can also create databases for palaeography, where we keep information about different types of writing. This information will help experts identify authors and build evidence-based arguments to support their conclusions.

He is working on a web-based service for a palaeographical database. The database includes images and descriptions of the letter-forms. It allows a researcher to search through the database using a variety of parameters. Ideally, the library needs to be a standard resource. If arguments are built with the help of this service, the service needs to remain permanently attached to them. Copyright issues might be an obstacle.
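
To picture what searching by a variety of parameters might look like, here is a small sketch of my own. It is not Dr. Stokes’s schema or service: the record fields, the `search` function and the three records are all invented placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LetterForm:
    letter: str        # the letter this form represents
    description: str   # free-text description of the form
    image_file: str    # image of the letter-form
    manuscript: str    # where the form was found


def search(
    records: Iterable[LetterForm],
    letter: Optional[str] = None,
    manuscript: Optional[str] = None,
    description_contains: Optional[str] = None,
) -> List[LetterForm]:
    """Return records matching every parameter that was supplied."""
    results = []
    for rec in records:
        if letter is not None and rec.letter != letter:
            continue
        if manuscript is not None and rec.manuscript != manuscript:
            continue
        if (description_contains is not None
                and description_contains.lower() not in rec.description.lower()):
            continue
        results.append(rec)
    return results


# Invented placeholder records, for illustration only.
RECORDS = [
    LetterForm("a", "round bowl, short ascender", "ms1_a.png", "MS 1"),
    LetterForm("a", "flat-topped, open bowl", "ms2_a.png", "MS 2"),
    LetterForm("g", "closed lower loop", "ms1_g.png", "MS 1"),
]

for match in search(RECORDS, letter="a", description_contains="flat"):
    print(match.manuscript, match.image_file)
```

Keeping the image reference next to each description is what would let an argument point back to the exact evidence it used.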

Respondent: Dr. Gabriel Bodard (KCL)

Historians are concerned about the computer’s ability to replace an expert analyst. All computers can do is help less experienced people carry out a valid analysis.

Another issue is dating. Epigraphic dating is constantly being questioned. Work in epigraphy always seems to be in progress. Firm conclusions are hard to come by.

How do we make evidence available to non-experts? Seeing evidence might convince non-experts, or people in other areas, of your conclusions.

QP

Gathering data for the database is an extremely long and detailed process. Is it scalable? The value in compiling this kind of data for an individual researcher is great, but it does require a great amount of resources.

How far are we from having a database large enough to use machine learning algorithms? How do you know the computer is right?

How do you recognise an author when they deliberately write in different manners according to the context?

Wrap-up

Prof. Alan Bowman (Oxford)

Prof. Bowman concluded the workshop and thanked the speakers.

Technical advances are exciting, and provide new opportunities for research. Trial and error is important. How do we make the technology useful to more people?

Sometimes the question is not to get a good reproduction, but to obtain something that is better than the original.

Extracting meaning from documents sometimes requires checking and manipulating multiple documents. It is easy to make a mistake and quite hard to recognise it. How do we argue the meaning of text? How do we include interpretations of images in this dialogue?