1 Introduction
1.1 The Intelligence Field
1.2 Research Questions and Outline
- Is PhraseBrowser, a specific instantiation of a text analysis tool, perceived as useful for solving typical analytical tasks?
- Does the use of the text analysis tool improve the quality of a typical analytical deliverable?
2 Background
2.1 Intelligence Analysis
As noted, the management of information and its related uncertainty plays a central role, and the means to measure precision, quality, and utility (so-called information awareness [5]) are crucial. Intelligence analysis has the potential to become an applied science. Its purpose would be managing the uncertainty in assessments of threats and possibilities based on incomplete, unreliable, or uncertain data in a context in which demand requires those assessments irrespective of the limitations. Defined in these terms, intelligence analysis stands out as a genuine cross-disciplinary science in-being, with a theoretical basis and a set of methods not limited to any single subject matter or field of analysis but rather adapted to every specific application.
2.2 Psychological Operations
2.3 Related Work
3 PhraseBrowser
3.1 Overview
3.2 Phrases
Theme/Type | Phrase | # |
---|---|---|
General Phrases | All | 2,806,084 |
| lab says | 9297 |
| used in Skripal poisoning | 8842 |
| produced in Russia | 5284 |
| disinformation campaign in Britain | 4326 |
| was in US | 4205 |
| Porton Down research laboratory | 3401 |
| research laboratory has told Sky News | 3373 |
| to Sergei Skripal | 3122 |
| that Christopher Steele | 2803 |
| of Yulia Skripal | 2563 |
Counted Things/Persons | All | 105,931 |
| 60 Russian diplomats | 1406 |
| 30 questions | 927 |
| two weeks | 861 |
| 14 simple questions | 823 |
| hundred narratives | 383 |
| two BBC colleagues | 294 |
| 20 European countries | 280 |
| 2800 Russian bots | 232 |
| 23 British diplomats | 212 |
| two people poisoned | 203 |
Entities | All | 5,601,461 |
| Skripal | 373,735 |
| Russia | 113,772 |
| UK | 107,982 |
| Yulia | 97,560 |
| Salisbury | 32,903 |
| Porton | 25,029 |
| Novichok | 22,236 |
| Putin | 22,017 |
| OPCW | 19,292 |
| Theresa May | 14,136 |
Explicit Untruths | All | 62,579 |
| propaganda | 4532 |
| Boris Johnson lied about Skripal | 1641 |
| Moscow’s lies | 427 |
| Kremlin propaganda | 417 |
| UK lies | 204 |
| Theresa May is lying | 155 |
| “Russia bot” narrative | 153 |
| the Skripal narrative | 142 |
| lying about the source of Novichok | 111 |
| Downing Street spin-master | 82 |
3.3 Predefined Phrase Types and Filtering
- “General Phrases” tries to capture any kind of content based on part-of-speech tags. This is an example of a phrase type that results in many phrases, perhaps too many. It would probably be useful to replace or complement this phrase type with a machine learning method. For now, this is the most general phrase type, primarily used to explore content without looking for any of the specific content that most of the other phrase types try to capture.
- “Counted Things/Persons” is defined using other phrase types that capture counts together with “things” and/or “persons”. One possible use of this phrase type is to look for differing numbers being given in some context. Sources may exaggerate the number of protesters at an event, for instance.
- “Entities” such as “Person”, “Location”, and “Organization” are found by an entity detector (see Sect. 3.5). These entities are reused by several of the other phrase types, e.g., the “Counted Persons” phrase type mentioned above.
- “Explicit Untruths” captures phrases that use any word in a long list of words explicitly related to deception, propaganda, misinformation, fake news, etc. The idea is that it is potentially interesting whenever someone writes that something is a lie, fake, misinformation, etc.: either the statement is true, or the person writing it has an agenda… Hence, these phrases often contain suspicious or interesting statements that can potentially be used as a starting point for further analysis. In Sect. 3.7, some simplified rule examples for “Explicit Untruths” are presented.
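The use of “Counted Things/Persons” to spot differing numbers can be illustrated with a minimal Python sketch. It is not part of PhraseBrowser: the function name `differing_counts` is hypothetical, and grouping phrases by their last word is a simplifying assumption made only for this illustration. The input phrases are taken from Table 1.

```python
from collections import defaultdict


def differing_counts(counted_phrases):
    """Group counted phrases by their last word (assumed to be the counted
    thing) and keep only the groups that appear with more than one count."""
    counts_by_head = defaultdict(set)
    for phrase in counted_phrases:
        tokens = phrase.split()
        if len(tokens) < 2:
            continue  # a count needs something to count
        count, head = tokens[0], tokens[-1].lower()
        counts_by_head[head].add(count)
    return {
        head: sorted(counts)
        for head, counts in counts_by_head.items()
        if len(counts) > 1
    }


# Phrases as they appear under "Counted Things/Persons" in Table 1.
phrases = [
    "60 Russian diplomats",
    "23 British diplomats",
    "30 questions",
    "14 simple questions",
    "two weeks",
]
conflicts = differing_counts(phrases)
print(conflicts)
```

Here “diplomats” and “questions” are reported because each occurs with two different counts, whereas “weeks” is not. Whether such a difference reflects exaggeration or simply different referents (Russian versus British diplomats, for example) is left to the analyst.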
3.4 PhraseBrowser System Overview
3.5 Text Processing
3.6 PhraseBrowser Rule Language
3.7 PhraseBrowser Rule Language Examples
# | Phrase type and rules | Example text |
---|---|---|
| name:violence | |
1 | violence | London has seen a lot of violence. |
2 | fist fight | There is a fist fight in London. |
| name:violence_in_location | |
3 | !a(LOCATION) !any[*] !a(violence) | London has seen a lot of violence. |
4 | !a(violence) !any[*] !a(LOCATION) | There is a fist fight in London. |
The first rule (row 3 in Table 2) matches a location entity, followed by zero or more appearances of any token, followed by either “violence” or “fist fight”. The second rule (row 4 in Table 2) can be interpreted analogously. In practice, the “violence_in_location” rules would detect too many uninteresting phrases, as sentences may be long and the mention of a location does not necessarily relate to where the “violence” is taking place. To overcome this problem, the rule language has features for stopping phrases that contain certain tokens (or phrase types) between parts of the rules, and it also allows for requiring the presence or absence of certain tokens (or phrase types) within the current sentence.
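As a rough illustration of how a rule such as `!a(LOCATION) !any[*] !a(violence)` and the stopping feature might behave, the following Python sketch matches over a single, already tagged sentence. It is an assumption-laden approximation rather than the actual rule engine: the `(text, tag)` token representation, the function names, the `stop_tokens` parameter, and the choice to stop at the nearest match are all hypothetical.

```python
# The "violence" phrase type from Table 2, as lower-cased token sequences.
VIOLENCE = [("violence",), ("fist", "fight")]


def match_violence_at(tokens, i):
    """Return the end index if a violence phrase starts at position i, else None."""
    for pattern in VIOLENCE:
        window = tuple(text.lower() for text, _ in tokens[i:i + len(pattern)])
        if window == pattern:
            return i + len(pattern)
    return None


def violence_in_location(tokens, stop_tokens=frozenset()):
    """Find phrases of the form: LOCATION, zero or more tokens, violence phrase.

    A token listed in stop_tokens between the two parts aborts the match,
    mimicking the stopping feature described in the text.
    """
    matches = []
    for start, (_, tag) in enumerate(tokens):
        if tag != "LOCATION":
            continue
        for j in range(start + 1, len(tokens)):
            end = match_violence_at(tokens, j)
            if end is not None:
                matches.append(" ".join(text for text, _ in tokens[start:end]))
                break
            if tokens[j][0].lower() in stop_tokens:
                break
    return matches


# Example sentence from Table 2, with a hand-assigned entity tag.
sentence = [
    ("London", "LOCATION"), ("has", "O"), ("seen", "O"), ("a", "O"),
    ("lot", "O"), ("of", "O"), ("violence", "O"), (".", "O"),
]
print(violence_in_location(sentence))
print(violence_in_location(sentence, stop_tokens={"of"}))
```

The first call finds the phrase spanning from “London” to “violence”; the second finds nothing because the stop token “of” occurs between the two parts of the rule. The stop token here is chosen only to demonstrate the mechanism.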
# | Phrase type and rules | Example text |
---|---|---|
| name:u_basic | |
1 | lie | Boris Johnson lied about Skripal |
2 | propaganda | It’s all Kremlin propaganda |
3 | spread rumor | John spreads rumors about Paul |
4 | misinformation | The fight against misinformation |
| name:u_obj | |
5 | !a(PERSON) | Boris Johnson lied about Skripal |
6 | !a(LOCATION) | It’s all Kremlin propaganda |
| name:untruth | |
7 | !a(u_basic) | That’s just lies |
8 | !a(u_obj) !a(u_basic) | It’s all Kremlin propaganda |
| | Many UK lies today |
9 | !a(u_obj) ’s !a(u_basic) | Have you heard Moscow’s lies? |
| | We are used to John’s misinformation |
10 | !a(u_obj) !a(u_basic) about !a(u_obj) | Boris Johnson lied about Skripal |
| | John spreads rumors about Paul |
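The way the “untruth” phrase type is composed from “u_basic” and “u_obj” can be sketched in Python as sequences of predicates over pre-processed chunks. This is only an illustrative approximation of the rules in the table above, not the PhraseBrowser implementation: the `(text, lemma, tag)` chunk format, the assumption that multi-word entities and phrases have already been merged into single chunks, and all names in the code are hypothetical.

```python
# Lemmas of the "u_basic" phrase type and entity tags of "u_obj".
U_BASIC = {"lie", "propaganda", "spread rumor", "misinformation"}
U_OBJ_TAGS = {"PERSON", "LOCATION"}


def is_u_basic(chunk):
    return chunk[1] in U_BASIC


def is_u_obj(chunk):
    return chunk[2] in U_OBJ_TAGS


def word(lemma):
    """Build a predicate that matches a chunk with exactly this lemma."""
    return lambda chunk: chunk[1] == lemma


# Rules 7-10 of the "untruth" phrase type, each a sequence of adjacent parts.
UNTRUTH_RULES = {
    7: [is_u_basic],
    8: [is_u_obj, is_u_basic],
    9: [is_u_obj, word("'s"), is_u_basic],
    10: [is_u_obj, is_u_basic, word("about"), is_u_obj],
}


def find_untruths(chunks):
    """Return (rule number, matched text) for every rule match in the chunks."""
    hits = []
    for rule_id, rule in UNTRUTH_RULES.items():
        for i in range(len(chunks) - len(rule) + 1):
            window = chunks[i:i + len(rule)]
            if all(pred(chunk) for pred, chunk in zip(rule, window)):
                hits.append((rule_id, " ".join(chunk[0] for chunk in window)))
    return hits


# "Boris Johnson lied about Skripal" as (text, lemma, tag) chunks.
chunks = [
    ("Boris Johnson", "boris johnson", "PERSON"),
    ("lied", "lie", "O"),
    ("about", "about", "O"),
    ("Skripal", "skripal", "PERSON"),
]
hits = find_untruths(chunks)
print(hits)
```

For this sentence, rules 7, 8, and 10 all match, yielding phrases of increasing length. A real system would additionally need to decide which of the overlapping matches to keep or display.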
3.8 New and Improved Rules
3.9 PhraseBrowser Interface
4 Method
4.1 Research Design
Age | Gender | Military category | Work role/title | Months on job | Educational level |
---|---|---|---|---|---|
34 | Male | Civilian | Section chief | 30 | B.Sc. |
35 | Male | Civilian | Analyst | 24 | B.Sc. |
34 | Female | Civilian | Analyst | 12 | M.Sc. |
27 | Female | Civilian | Analyst | 15 | M.Sc. |
Age | Gender | Military category | Work role/title | Months on job | Educational level |
---|---|---|---|---|---|
26 | Female | Civilian | Analyst | 18 | B.Sc. |
29 | Male | Civilian | Analyst | 3 | M.Sc. |
37 | Female | Reserve | Analyst | 24 | University |
23 | Male | Soldier | Analyst | 48 | High school |
4.2 Intelligence Assessment Task
4.3 Study Setup
4.4 Observational Study Execution
5 Results
5.1 Perceived Usefulness
5.2 Subject Matter Expert Evaluation
5.3 Information Fragment Value
5.4 General Observations of Work
5.5 Observations of Work Related to the Use of PhraseBrowser
6 Discussion
6.1 Theory
6.2 Validity of the Study
6.3 Open Source Intelligence as a Data Source for Text Mining
6.4 Potential Improvements of PhraseBrowser
7 Conclusions
- Is PhraseBrowser, a specific instantiation of a text analysis tool, perceived as useful for solving typical analytical tasks?
- Does the use of the text analysis tool improve the quality of a typical analytical deliverable?