1 Introduction
- Do the types of temporal expressions vary diachronically in scientific writing, and if so how is this manifested linguistically?
- What are typical temporal expressions of specific time periods and do these change over time?
- Are different types of temporal expressions, e.g., duration expressions and date expressions referring to points in time, equally affected by a potential change over time?
2 Related Work
3 Data
Period | Coverage | Tokens | Documents |
---|---|---|---|
1650 | 1665–1699 | 2,589,536 | 1,326 |
1700 | 1700–1749 | 3,433,838 | 1,702 |
1750 | 1750–1799 | 6,759,764 | 1,831 |
1800 | 1800–1849 | 10,699,270 | 2,778 |
1850 | 1850–1869 | 11,676,281 | 2,176 |
1950 | 1966–1989 | 18,998,645 | 3,028 |
2000 | 2000–2007 | 20,201,053 | 2,111 |
4 Processing Temporal Information
4.1 Temporal Expressions
- Date expressions refer to a point in time of a granularity equal to or coarser than ‘day’ (e.g., March 11, 2017, March 2017 or 2017).
- Time expressions refer to a point in time of any granularity finer than ‘day’ (e.g., Saturday morning or 10:30 am).
- Duration expressions refer to the length of a time interval and can be of different granularity (e.g., two hours, three weeks, four years).
- Set expressions refer to the periodical aspect of an event, describing a set of times/dates (e.g., every Saturday) or a frequency within a time interval (e.g., twice a day).
4.2 Temporal Tagging
P (or PT in case of time-level durations) followed by a number and an abbreviation of the granularity (e.g., years: Y, months: M, weeks: W, days: D; hours: H, minutes: M). In addition, fuzzy expressions are referred to by X instead of precise numbers, e.g., several weeks is normalized to PXW, monthly is normalized to XXXX-XX and annually to XXXX.
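The duration scheme described in this section (P or PT, then an amount or X, then a granularity letter) can be made concrete with a small parser. The following is a minimal sketch, not the tagger's own code: the names `Duration` and `parse_duration` are hypothetical, only single-component values are handled, and the normalized forms of the example expressions (e.g., P3W for three weeks) are assumed from the scheme rather than quoted from the tagger output.

```python
import re
from typing import NamedTuple, Optional


class Duration(NamedTuple):
    """Hypothetical container for one normalized duration value."""
    amount: Optional[int]  # None encodes a fuzzy amount (written as X)
    unit: str              # spelled-out granularity, e.g. "week"


# Granularity letters as listed in the text. "M" is ambiguous on its own:
# it means month after P and minute after PT.
DATE_UNITS = {"Y": "year", "M": "month", "W": "week", "D": "day"}
TIME_UNITS = {"H": "hour", "M": "minute"}

# P, optional T, then a number or X, then exactly one granularity letter.
_PATTERN = re.compile(r"^P(T?)(\d+|X)([YMWDH])$")


def parse_duration(value: str) -> Duration:
    """Parse a single-component duration value such as P3W, PT2H or PXW."""
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a single-component duration value: {value!r}")
    time_level, amount, letter = match.groups()
    units = TIME_UNITS if time_level else DATE_UNITS
    if letter not in units:
        raise ValueError(f"granularity {letter!r} is not valid in {value!r}")
    return Duration(None if amount == "X" else int(amount), units[letter])


if __name__ == "__main__":
    for value in ["PT2H", "P3W", "P4Y", "PXW"]:
        print(value, parse_duration(value))
```

Keeping the fuzzy amount as `None` instead of a placeholder number makes it easy to separate precise from vague durations in later counts.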
4.3 Extraction Quality
Period | RIGHT | OTHER | WRONG | Precision |
---|---|---|---|---|
1650 | 219 | 13 | 18 | 0.928 |
1700 | 210 | 20 | 20 | 0.920 |
1750 | 218 | 21 | 11 | 0.956 |
1800 | 186 | 37 | 27 | 0.892 |
1850 | 181 | 48 | 22 | 0.912 |
1950 | 116 | 114 | 20 | 0.920 |
2000 | 145 | 96 | 9 | 0.964 |
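The precision column can be reproduced from the three count columns if both RIGHT and OTHER are treated as correct extractions. This reading is inferred from the figures in the table, not stated in it, so the sketch below should be taken as a consistency check under that assumption; the function name `precision` is illustrative.

```python
# Counts per period, copied from the table: (RIGHT, OTHER, WRONG).
counts = {
    "1650": (219, 13, 18),
    "1700": (210, 20, 20),
    "1750": (218, 21, 11),
    "1800": (186, 37, 27),
    "1850": (181, 48, 22),
    "1950": (116, 114, 20),
    "2000": (145, 96, 9),
}

# Precision values as reported in the table.
reported = {
    "1650": 0.928, "1700": 0.920, "1750": 0.956, "1800": 0.892,
    "1850": 0.912, "1950": 0.920, "2000": 0.964,
}


def precision(right: int, other: int, wrong: int) -> float:
    """Share of evaluated expressions not judged WRONG (assumed definition)."""
    return (right + other) / (right + other + wrong)


if __name__ == "__main__":
    for period, triple in counts.items():
        print(period, f"{precision(*triple):.3f}", reported[period])
```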
5 Typicality of Temporal Expressions
6 Analysis
6.1 Frequency-Based Diachronic Tendencies
6.2 Diachronic Tendencies of ‘Typical’ Temporal Expressions
Period | POS sequence | Example | Comp. |
---|---|---|---|
1650 | DT NN | in the Spring | 5 |
1650 | NP | in Winter | 5 |
1650 | RB | now | 4 |
1700 | NP | in Summer | 5 |
1700 | NP CD | March 8 | 5 |
1700 | DT JJ IN NP | the 6th of March | 4 |
1750 | NP CD, CD | June 3, 1769 | 6 |
1750 | NP CD | April 19 | 5 |
1750 | CD NP | 2 June | 4 |
1750 | DT NN | the Spring | 4 |
1800 | NP CD, CD | June 18, 1784 | 5 |
1850 | CD | in 1858 | 4 |
1950 | JJ | current work | 5 |
1950 | JJ JJ NN | mid seventeenth century | 4 |
2000 | DT JJ NNS | the last decades | 5 |
2000 | DT NNS | the 1990s | 5 |
2000 | JJ JJ NN | late seventeenth century | 5 |
- In Winter it will need longer infusion, than in the Spring or Autumn. (1650)
- The difference between these two plants is this; the papaver corniculatum dies to the root in the winter, and sprouts again from its root in the spring; (1750)
- March 4, 1783. With a 7-feet reflector, I viewed the nebula near the 5th Serpentis, discovered by Mr. MESSIER, in 1764. (1750)
- In the 1970s, Rabin [38] and Solovay and Strassen [44] developed fast probabilistic algorithms for testing primality and other problems. (2000)
- There is a significant confusion in the current literature on “cellular” or “tessellation arrays” concerning the concept of a “Garden-of-Eden configuration”. (1950)
Period | POS sequence | Example | Comp. |
---|---|---|---|
1750 | NP NN | Sunday morning | 5 |
1750 | JJ NN | next morning | 5 |
1800 | CD NN | 10 A.M. | 5 |
1850 | CD NN | 7 A.M. | 5 |
1850 | DT NN IN DT JJ IN NP | the evening of the 28th of August | 4 |
1850 | IN CD NN | about 8 A.M. | 4 |
- Monday morning she appeared well, her pulse was calm, and she had no particular pain. (1750)
- There being usually but one assistant, it was impossible to observe during the whole twenty-four hours; the hours of observation selected were therefore from 3 A.M. to 9 P.M. inclusive. (1850)
- After the eleven Months, the Owner having a mind to try, how the Animal would do upon Italian Earth, it died three days after it had changed the Earth. (1650)
- [...] the Opium, being cut into very thin slices, [...] is to be put into, and well mixed with, the liquor, (first made luke-warm) and fermented with a moderate Heat for eight or ten Days, [...]. (1650)
- June 4, the weather continued much the same, and about 9h 30 in the evening, we had a shock of an earthquake, which lasted about four seconds, and alarmed all the inhabitants of the island. (1750)
- [...] the glass produced by this fusion was in about twelve hours dissolved, by boiling it in a proper quantity of muriatic acid. (1800)
- In a few hours a mass of fawn-coloured crystals was deposited; (1850)
- The patient is then switched to the re-breathing system containing 133 Xenon at 5 mCi/1 for a period of one minute, and then returned to room air for a period of ten minutes. (1950)
- For each speaker, performance was observed across numerous repetitions of the vocabulary set within a single session, as well as across a 2-week time period. (1950)
- It constitutes the usual drift-diffusion transport equation that has been successfully used in device modeling for the last two decades. (1950)
- Provably correct and efficient algorithms for learning DNF from random examples would be a powerful tool for the design of learning systems, and over the past two decades many researchers have sought such algorithms. (2000)
- Besides this, you may there see, that every day the Sun sensibly passes one degree from West to East, [...]. (1650)
- In order to determine the annual variations of the barometer, I have taken the mean of the observations in each month, [...]. (1800)
- The mean was then taken in every month of every lunar hour (attending to the signs), and the monthly means were collected into yearly means. (1850)
- A disk resident file of all current recipient numbers is created monthly from the eligibility tape file supplied by Medical Services Administration. (1950)