3.2 Computer science research
The authors have found very few articles that specifically examine the enhancement of papyri legibility from visible-light images. Sparavigna [175] applies an edge detector based on the magnitude of dipole moments to increase the contrast of shape outlines, thereby further enhancing the effect of a similar neural edge-detection mechanism already present in the human visual system, whose perceptual effect is known as Mach bands [26, pp. 362–363]. Anisotropic filtering of the image in the frequency domain was tested as a preprocessing step to suppress papyri fibers and gaps; unfortunately, this also removed character segments collinear with the fibers.
The best papyri enhancements are obtained using multispectral imaging, as noted above in the papyrological practice overview. Moreover, in conjunction with multispectral images, color processing has also been used for papyri legibility enhancement specifically and for document processing more generally, as will be shown in the next section. We will here widen the survey to encompass other methods, topics, and applications of potential interest to this article, within the field of document processing and beyond, and conclude by discussing the applicability of machine learning.
Because noise—whether biological, mechanical, electrical, or digital—is a fundamental issue in communication, substantial research has aimed at improving document legibility by suppressing a great variety of what are usually considered manifestations of noise, including ink bleed-through [46], see-through [48], foxing [163], shadows [95], termite bites [151], cross-outs [30], photocopied low-contrast carbon copies [36], low-resolution raster images [141], and background texture interference [133] (for a history of image denoising, see [111]).
The type of visual media can also dictate the typology of enhancement methods (e.g., methods for pictures and for movies differ in whether the time dimension is available as a source of contextual information for optimizing image processing) [24]. In applied science contexts, such as the photography and video equipment industry, there is interest in developing enhancement methods predicated on an understanding of the nature of noise, e.g., optical, mechanical, and electronic noise sources in cameras [139]. Advances in image quality measurement [87, 192] have benefited from research into visual perception and neuroscience [19, 188], as well as models of scene statistics [189]. The role of tasks in image enhancement is of particular interest in cultural heritage applications [180].
The systemic and critical approaches to document analysis advocated in this article have been the subject of exemplary research over more than half a century in two domains, each with specific aims, constraints, and solutions. First, the legibility of flight documentation and instrumentation plays a critical role in aviation performance and safety; here, optimization has been approached mainly through psychophysical experimentation [55]. Second, research on the enhancement of medical radiographic images stands out for the extent to which the impact of technologies and perception on clinical diagnosis has been investigated, including the role of visual illusions [23, 158].
In addition to methods, prominent areas of focus in this field are datasets, benchmarks, and ground truthing [117]. Visual confirmation of the attenuation of conspicuous artifacts is a typical means of comparing methods, supplemented with numerical characterization where feasible (i.e., given the availability of reference images or the appropriateness of reference-free image quality measurements). User evaluations are rare and vary from a few participants for historical documents [12, 28] to thousands in online campaigns for industrial applications [63, 144]. Legibility has been systematically studied since the early 20th century in experimental psychology (notably for the design of traffic signage and car plates, flight documentation and instrumentation, and typography for the visually impaired [14, 114]), in works on teaching handwriting [159], as well as in relation to optical character recognition [68] and document image quality [4]. Document and image quality differ from writing legibility (e.g., a noise-free but unfamiliar writing style may be difficult to read), although they conjointly affect the processing of visual information.
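For concreteness, when reference images are available, the numerical characterization mentioned above typically reduces to full-reference measures such as PSNR and SSIM. The following sketch uses scikit-image; the synthetic arrays are placeholders for a ground-truth/enhanced image pair, not data from any cited study.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Synthetic stand-ins: in practice, `reference` would be a ground-truth
# image and `enhanced` the output of the method under evaluation.
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
enhanced = np.clip(reference + 0.05 * rng.standard_normal((256, 256)), 0.0, 1.0)

# Full-reference measures; without a reference image, one must fall back
# on reference-free (no-reference) quality estimators instead.
psnr = peak_signal_noise_ratio(reference, enhanced, data_range=1.0)
ssim = structural_similarity(reference, enhanced, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.3f}")
```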
While the use of machine learning for image enhancement is well established [178], its application to document image enhancement, in particular to historical documents, is still rare [72, 74, 115, 130, 131]. Notwithstanding results comparable to the state of the art, the challenges it faces are considerable. Historical data is scarce and heterogeneous, and ground truthing is time-consuming, which hinders the generalizability of the learned models [54]. With respect to the task of critical vision, no implementable machine-learning methodology has yet been devised that can deal with images that are ambiguous and whose interpretations are multiple, variable, or unknown. For scholarly applications, the resulting enhancements must, furthermore, be trustworthy.
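To make the ground-truthing cost concrete, the following minimal sketch shows the supervised setup shared by most learning-based enhancers: a small convolutional network trained on paired degraded/ground-truth patches. PyTorch, the architecture, and the random tensors standing in for data are our illustrative assumptions, not any of the cited systems.

```python
import torch
import torch.nn as nn

# A deliberately small convolutional enhancer: degraded patch in,
# cleaned patch out. Architecture and sizes are illustrative only.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Supervised training requires paired data: every degraded patch needs a
# ground-truth counterpart -- precisely the labelling cost discussed above.
degraded = torch.rand(8, 1, 64, 64)  # stand-in for degraded document patches
clean = torch.rand(8, 1, 64, 64)     # stand-in for ground-truth patches

optimizer.zero_grad()
loss = loss_fn(model(degraded), clean)
loss.backward()
optimizer.step()
```

Note that the pixel-wise loss presupposes a single correct restoration per patch; it has no notion of images whose interpretations are multiple, variable, or unknown, which is precisely the obstacle that critical vision poses for such methods.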