Introduction
Understanding a user utterance such as Play Paradise by Coldplay involves a) identifying the user intent (IN:PLAY_MUSIC), and then b) recognizing the task-specific named entities Paradise and Coldplay, tagging these elements (slots) as SL:MUSIC_TRACK_TITLE and SL:MUSIC_ARTIST_NAME, respectively. Intent detection has traditionally been approached as text classification, where the entire utterance serves as input, while slot recognition has been formulated as a sequence tagging task [1-3]. However, single-intent representations cannot handle compositional queries: for instance, a request for how long it will take to get home requires first resolving the location of the user's home (IN:GET_LOCATION_HOME) before estimating the duration to the destination (IN:GET_ESTIMATED_DURATION). Hence, a semantic representation is needed that can manage multiple intents per utterance, where slots may encapsulate nested intents.

Our main contributions are as follows:

- We develop novel shift-reduce semantic parsers for task-oriented dialogue, combining Stack-Transformers with deep contextualized word embeddings derived from RoBERTa.
- We adapt several transition systems from the constituency parsing literature to handle TOP annotations and conduct a comprehensive comparison against the original top-down approach, demonstrating the superiority of the in-order algorithm across all scenarios.
- We evaluate our approach on both low-resource and high-resource settings of the Facebook TOP datasets, advancing the state of the art in task-oriented parsing and narrowing the gap with sequence-to-sequence models.
- Upon acceptance, we will make our system's source code freely available for public use.
Related Work
Methodology
Transition Systems for Task-Oriented Semantic Parsing
In the TOP formalism, a constituent pairs a non-terminal label with the set of tokens it spans: for example, (SL:SOURCE, {my, apartment}) and (SL:DESTINATION, {San, Diego}) are constituents extracted from the TOP tree depicted in Fig. 1b. Additionally, in our specific scenario, two distinct types of constituents emerge: intents and slots, whose non-terminal labels are prefixed with IN: and SL:, respectively. Finally, tree structures must adhere to the following constraints to be deemed a valid TOP representation (a validity check is sketched below):

- The root constituent, which encompasses the entire utterance, must be an intent node.
- Only tokens and/or slot constituents can serve as child nodes of an intent node.
- A slot node may have either one or more words or a single intent constituent as child nodes.
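To make these constraints concrete, the following is a minimal sketch of a validity check (our own illustration, not the paper's released code), assuming a hypothetical Node type whose children are either Node objects or plain token strings:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A constituent: an intent ("IN:...") or slot ("SL:...") non-terminal."""
    label: str
    children: list = field(default_factory=list)  # Node objects or str tokens

def is_valid_top(root) -> bool:
    # Constraint 1: the root covering the whole utterance must be an intent.
    if not (isinstance(root, Node) and root.label.startswith("IN:")):
        return False

    def check(node: Node) -> bool:
        kids = node.children
        if node.label.startswith("IN:"):
            # Constraint 2: intents contain only tokens and/or slot nodes.
            ok = all(isinstance(k, str) or k.label.startswith("SL:")
                     for k in kids)
        else:
            # Constraint 3: slots contain either words or one intent node.
            only_words = len(kids) >= 1 and all(isinstance(k, str) for k in kids)
            one_intent = (len(kids) == 1 and isinstance(kids[0], Node)
                          and kids[0].label.startswith("IN:"))
            ok = only_words or one_intent
        return ok and all(check(k) for k in kids if isinstance(k, Node))

    return check(root)
```

For the running example, Node("IN:PLAY_MUSIC", ["Play", Node("SL:MUSIC_TRACK_TITLE", ["Paradise"]), "by", Node("SL:MUSIC_ARTIST_NAME", ["Coldplay"])]) passes this check.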
Following the standard formulation in transition-based parsing, a transition system can be defined as a quadruple \((C, c_0, C_f, T)\), where:

- C denotes the set of possible state configurations, defining the data structures necessary for the parser.
- \(c_0\) represents the initial configuration of the parsing process.
- \(C_f\) is the set of final configurations that can be reached at the end of the parsing process.
- T is the set of available transitions (or actions) that can be applied to move the parser from one state configuration to another.

Parsing then reduces to repeatedly choosing a legal transition that advances the configuration until a final one is reached, as sketched below.
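Abstractly, this loop can be written as follows; the sketch assumes hypothetical transition_system and model objects exposing the named methods (they are not part of the paper's code):

```python
def parse(tokens, transition_system, model):
    config = transition_system.initial(tokens)        # build c_0 from the input
    while not transition_system.is_final(config):     # stop once config is in C_f
        action = model.best_legal_action(config)      # score transitions in T
        config = transition_system.apply(action, config)
    return config.stack[0]                            # root intent constituent
```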
The following table illustrates the top-down transition sequence for the utterance Play Paradise by Coldplay:

Transition | Stack | Buffer |
---|---|---|
 | [ ] | [Play, Paradise, by, Coldplay] |
NT-IN:PLAY_MUSIC | [IN:PLAY_MUSIC] | [Play, Paradise, by, Coldplay] |
Shift | [IN:PLAY_MUSIC, Play] | [Paradise, by, Coldplay] |
NT-SL:TITLE | [IN:PLAY_MUSIC, Play, SL:TITLE] | [Paradise, by, Coldplay] |
Shift | [IN:PLAY_MUSIC, Play, SL:TITLE, Paradise] | [by, Coldplay] |
Reduce | [IN:PLAY_MUSIC, Play, SL:TITLE\(_{Paradise}\)] | [by, Coldplay] |
Shift | [IN:PLAY_MUSIC, Play, SL:TITLE\(_{Paradise}\), by] | [Coldplay] |
NT-SL:ARTIST | [IN:PLAY_MUSIC, Play, SL:TITLE\(_{Paradise}\), by, SL:ARTIST] | [Coldplay] |
Shift | [IN:PLAY_MUSIC, Play, SL:TITLE\(_{Paradise}\), by, SL:ARTIST, Coldplay] | [ ] |
Reduce | [IN:PLAY_MUSIC, Play, SL:TITLE\(_{Paradise}\), by, SL:ARTIST\(_{Coldplay}\)] | [ ] |
Reduce | [IN:PLAY_MUSIC\(_{Play\ \texttt{SL:TITLE}\ by\ \texttt{SL:ARTIST}}\)] | [ ] |
- State configurations in C have the form \(c=\langle {\Sigma }, {B} \rangle \), where \(\Sigma \) is a stack (storing non-terminal symbols, constituents, and partially processed tokens) and B is a buffer (containing the unprocessed tokens still to be read from the input).
- In the initial configuration \(c_0\), the buffer B contains all tokens of the input utterance and the stack \(\Sigma \) is empty.
- Final configurations in \(C_f\) have the form \(c=\langle [ I ], \emptyset \rangle \), where the buffer is empty (all words have been processed) and the stack contains a single item I: an intent constituent spanning the entire utterance, since the root node of a valid TOP tree must be an intent.
- The set of available transitions T consists of the three actions described below. Note that the original work by [4] did not provide transition constraints tailored to generating valid TOP representations; we therefore completely redesigned the original top-down algorithm [7] for task-oriented semantic parsing to incorporate these task-specific constraints.
- The Non-Terminal-L transition pushes a non-terminal node labeled L onto the stack, taking the system from state configurations of the form \(\langle {\Sigma }, B \rangle \) to \(\langle {\Sigma | L }, B \rangle \) (where \(\Sigma | L\) denotes a stack with item L on top and \(\Sigma \) as the tail). Unlike in constituency parsing, this transition can generate both intent and slot non-terminals (with labels L prefixed with IN: or SL:, respectively). Therefore, it must adhere to specific constraints to produce a well-formed TOP tree:
  - Since the root node must be an intent constituent, the first Non-Terminal-L transition must push an intent non-terminal onto the stack.
  - A Non-Terminal-L transition that pushes an intent non-terminal onto the stack is permissible only if the last pushed non-terminal was a slot, introduced in the immediately preceding state configuration. This ensures that the resulting intent constituent becomes the sole child node of that slot, as required by the TOP formalism.
  - A Non-Terminal-L transition that adds a slot non-terminal to the stack is allowed only if the last inserted non-terminal was an intent.
- A Shift action retrieves tokens from the input by moving words from the buffer to the stack, taking the parser from state configurations \(\langle {\Sigma }, {w_i | B} \rangle \) to \(\langle {\Sigma | w_i}, {B} \rangle \) (where \({w_i | B}\) denotes a buffer with token \(w_i\) on top and B as the tail, and \(\Sigma | w_i\) a stack with \(w_i\) on top). This transition is permissible only if the buffer is not empty. Specifically for task-oriented semantic parsing, this action is also unavailable in state configurations where the last non-terminal added to the stack was a slot and an intent constituent was already created as its first child node; this ensures that such slot constituents have a single intent as their only child.
- Finally, a Reduce transition builds a new constituent by popping all items (tokens and constituents) from the stack until a non-terminal symbol is found, then grouping them as child nodes of that non-terminal. This places the new constituent on top of the stack, taking the parser from configurations \(\langle {\Sigma } | L| e_k| \dots | e_0, B \rangle \) to \(\langle {\Sigma } | L_{e_k \dots e_0}, B \rangle \) (where \(L_{e_k \dots e_0}\) denotes a constituent with non-terminal label L and child nodes \(e_k \dots e_0\)). This transition can be applied only if the stack contains at least one non-terminal symbol and one item (token or constituent) above it.

A sketch of this transition system's mechanics and constraints is given below.
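The following sketch (our own, reusing the Node type from the earlier sketch) illustrates the mechanics of these three actions and the Non-Terminal-L constraints; open non-terminals are represented as (OPEN, label) pairs:

```python
OPEN = "open"  # marker for a pushed, not-yet-reduced non-terminal

def is_open_nt(item) -> bool:
    return isinstance(item, tuple) and item[0] == OPEN

def apply_topdown(action, stack, buffer):
    """Apply one top-down transition in place (legality checked separately)."""
    if action == "SHIFT":                       # move next token to the stack
        stack.append(buffer.pop(0))
    elif action.startswith("NT-"):              # push an open non-terminal
        stack.append((OPEN, action[3:]))
    elif action == "REDUCE":                    # close the nearest open NT
        children = []
        while not is_open_nt(stack[-1]):
            children.append(stack.pop())
        label = stack.pop()[1]
        stack.append(Node(label, children[::-1]))

def nt_is_legal(label, stack) -> bool:
    """Task-specific constraints on Non-Terminal-L."""
    opens = [x for x in stack if is_open_nt(x)]
    if not opens:                               # first NT must be an intent
        return label.startswith("IN:")
    last = opens[-1][1]
    if label.startswith("IN:"):                 # intent NT only right after
        return last.startswith("SL:") and is_open_nt(stack[-1])  # a slot NT
    return last.startswith("IN:")               # slot NT only under an intent
```

Applying the gold sequence from the earlier table (NT-IN:PLAY_MUSIC, Shift, NT-SL:TITLE, Shift, Reduce, Shift, NT-SL:ARTIST, Shift, Reduce, Reduce) to the buffer [Play, Paradise, by, Coldplay] leaves a single IN:PLAY_MUSIC constituent on the stack.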
In contrast, a bottom-up transition system builds each constituent only after all of its children are already on the stack. The following table illustrates the bottom-up transition sequence for the same utterance:

Transition | Stack | Buffer | f |
---|---|---|---|
 | [ ] | [Play, Paradise, by, Coldplay] | false |
Shift | [Play] | [Paradise, by, Coldplay] | false |
Shift | [Play, Paradise] | [by, Coldplay] | false |
Re#1-SL:TITLE | [Play, SL:TITLE\(_{Paradise}\)] | [by, Coldplay] | false |
Shift | [Play, SL:TITLE\(_{Paradise}\), by] | [Coldplay] | false |
Shift | [Play, SL:TITLE\(_{Paradise}\), by, Coldplay] | [ ] | false |
Re#1-SL:ARTIST | [Play, SL:TITLE\(_{Paradise}\), by, SL:ARTIST\(_{Coldplay}\)] | [ ] | false |
Re#4-IN:PLAY_MUSIC | [IN:PLAY_MUSIC\(_{Play\ \texttt{SL:TITLE}\ by\ \texttt{SL:ARTIST}}\)] | [ ] | false |
Finish | [IN:PLAY_MUSIC\(_{Play\ \texttt{SL:TITLE}\ by\ \texttt{SL:ARTIST}}\)] | [ ] | true |
- State configurations have the form \(c=\langle {\Sigma }, {B}, f \rangle \), where \(\Sigma \) is a stack and B is a buffer, as described for the top-down algorithm, and f is a boolean variable indicating whether a state configuration is terminal.
- In the initial configuration \(c_0\), the buffer contains the entire user utterance, the stack is empty, and f is false.
- Final configurations in \(C_f\) have the form \(c\!=\!\langle [I], \emptyset , \textit{true} \rangle \), where the stack holds a single intent constituent, the buffer is empty, and f is true. Under a bottom-up algorithm, the parser could keep building constituents on top of a single intent node in the stack even when it spans the whole input utterance; the flag f is therefore included in state configurations to mark the end of the parsing process.
- The actions provided by this bottom-up algorithm are as follows:
- Similar to the top-down approach, a Shift action moves tokens from the buffer to the stack, taking the parser from state configurations \(\langle {\Sigma }, {w_i | B}, \textit{false} \rangle \) to \(\langle {\Sigma | w_i}, {B}, \textit{false} \rangle \). This operation is not permissible in the following cases:
  - When the buffer is empty and there are no more words to read.
  - When the top item on the stack is an intent node: since slots must have only one intent child node, the parser must build a slot constituent on top of it before shifting more input tokens.
- A Reduce#k-L transition (parameterized by the non-terminal label L and an integer k) creates a new constituent by popping k items from the stack and combining them into a new constituent on top of the stack. This takes the parser from state configurations \(\langle {\Sigma } | e_{k-1}| \dots | e_0, B, \textit{false} \rangle \) to \(\langle {\Sigma } | L_{e_{k-1} \dots e_0}, B, \textit{false} \rangle \). To ensure a valid TOP representation, this transition can only be applied under the following conditions:
  - When the Reduce#k-L action creates an intent constituent (i.e., L is prefixed with IN:), it is permissible only if there are no intent nodes among the k items popped from the stack (as an intent constituent cannot have other intents as child nodes).
  - When the Reduce#k-L transition builds a slot node (i.e., L is prefixed with SL:), it is allowed only if there are no slot constituents among the k elements affected by this operation (as slots cannot have other slots as child nodes). Additionally, if the item on top of the stack is an intent node, only the Reduce action with k equal to 1 is permissible (since slots can only contain a single intent constituent as a child node).
- Lastly, a Finish action signals the end of the parsing process by changing the value of f, taking the system from configurations \(\langle {\Sigma }, B, \textit{false} \rangle \) to final configurations \(\langle {\Sigma }, B, \textit{true} \rangle \). This operation is only allowed if the stack contains a single intent constituent and the buffer is empty.

The legality conditions for Reduce#k-L are sketched below.
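These conditions can be sketched as follows (again our own illustration, with Node as before):

```python
def reduce_k_is_legal(k: int, label: str, stack) -> bool:
    """Task-specific constraints on Reduce#k-L in the bottom-up system."""
    if len(stack) < k:
        return False
    popped = stack[-k:]
    def has(prefix: str) -> bool:
        return any(isinstance(e, Node) and e.label.startswith(prefix)
                   for e in popped)
    if label.startswith("IN:"):
        return not has("IN:")          # intents cannot contain other intents
    if has("SL:"):
        return False                   # slots cannot contain other slots
    top = stack[-1]
    if isinstance(top, Node) and top.label.startswith("IN:"):
        return k == 1                  # a slot wraps exactly one intent
    return True

def apply_reduce_k(k: int, label: str, stack):
    children = stack[-k:]              # the k topmost items become children
    del stack[-k:]
    stack.append(Node(label, children))
```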
Finally, the in-order transition system is defined as follows:
- Configurations maintain the same format as in the bottom-up algorithm (i.e., \(c=\langle {\Sigma }, {B}, f \rangle \)).
- In the initial configuration \(c_0\), the buffer contains the entire user utterance, the stack is empty, and f is false.
- Final configurations take the form \(c=\langle [I], \emptyset , \textit{true} \rangle \). As in the bottom-up approach, the in-order algorithm could keep creating additional constituents above the intent node left on the stack indefinitely; hence, the flag f is necessary to indicate the completion of the parsing process.
- The available transitions are adopted from the top-down and bottom-up algorithms, but some exhibit different behaviors or are applied in a different order:
- A Non-Terminal-L transition pushes a non-terminal symbol L onto the stack, taking the system from state configurations \(\langle {\Sigma }, B, \textit{false} \rangle \) to \(\langle {\Sigma | L }, B, \textit{false} \rangle \). However, unlike in the top-down algorithm, this transition can only be applied once the first child node of the upcoming constituent has been fully constructed on top of the stack. Furthermore, it must meet other task-specific constraints to generate valid TOP representations:
  - A Non-Terminal-L transition that introduces an intent non-terminal (i.e., L prefixed with IN:) is valid only if its first child node atop the stack is not an intent constituent.
  - A Non-Terminal-L transition that places a slot non-terminal on the stack (i.e., L prefixed with SL:) is permissible only if the fully-created item atop the stack is not a slot node.
- As in the other transition systems, a Shift operation retrieves tokens from the buffer. However, this action is forbidden when the upcoming constituent has already been labeled as a slot (by a non-terminal previously added to the stack) and its first child node is an intent constituent already present in the stack. This prevents slot constituents whose first child is an intent from receiving further child nodes.
- A Reduce transition generates intent or slot constituents. Specifically, it pops elements from the stack until a non-terminal symbol is found; the non-terminal, together with the item immediately below it (its first child), is then replaced by the new constituent at the top of the stack. Consequently, it takes the parser from state configurations \(\langle {\Sigma } | e_k| L| e_{k-1}| \dots | e_0, B, \textit{false} \rangle \) to \(\langle {\Sigma } | L_{e_k \dots e_0}, B, \textit{false} \rangle \). This transition is only applicable if there is a non-terminal in the stack (preceded by its first child constituent, as required by the in-order algorithm). Additionally, it must comply with specific constraints for task-oriented semantic parsing:
  - When the Reduce operation results in an intent constituent (as determined by the last non-terminal label added to the stack), it is permissible only if there are no intent nodes among the preceding \(k-1\) items (the first child \(e_k\) already adheres to the TOP formalism, as verified when the Non-Terminal-L transition was applied).
  - When the Reduce transition produces a slot constituent, it is allowed only if there are no other slot nodes among the \(k-1\) elements removed by this operation. This also covers the case where the first child \(e_k\) of the upcoming slot constituent is an intent: since the Shift transition is not permitted in that situation, only the Reduce action can build a slot with a single intent child.
- Lastly, as in the bottom-up approach, a Finish action finalizes the parsing process. It is only permissible if the stack contains a single intent constituent and the buffer is empty.

A sketch of the in-order Reduce mechanics is given below.
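In the in-order case, the open non-terminal is preceded on the stack by its first child \(e_k\); a sketch with the same conventions as the previous snippets:

```python
def apply_inorder_reduce(stack):
    """Close the nearest open non-terminal, also absorbing the item just
    below it (the first child e_k, built before the non-terminal was pushed)."""
    rest = []
    while not is_open_nt(stack[-1]):
        rest.append(stack.pop())           # pop e_0 ... e_{k-1}
    label = stack.pop()[1]                 # pop the open non-terminal L
    first_child = stack.pop()              # pop e_k, its first child
    stack.append(Node(label, [first_child] + rest[::-1]))
```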
The following table illustrates the in-order transition sequence for the same utterance:

Transition | Stack | Buffer | f |
---|---|---|---|
 | [ ] | [Play, Paradise, by, Coldplay] | false |
Shift | [Play] | [Paradise, by, Coldplay] | false |
NT-IN:PLAY_MUSIC | [Play, IN:PLAY_MUSIC] | [Paradise, by, Coldplay] | false |
Shift | [Play, IN:PLAY_MUSIC, Paradise] | [by, Coldplay] | false |
NT-SL:TITLE | [Play, IN:PLAY_MUSIC, Paradise, SL:TITLE] | [by, Coldplay] | false |
Reduce | [Play, IN:PLAY_MUSIC, SL:TITLE\(_{Paradise}\)] | [by, Coldplay] | false |
Shift | [Play, IN:PLAY_MUSIC, SL:TITLE\(_{Paradise}\), by] | [Coldplay] | false |
Shift | [Play, IN:PLAY_MUSIC, SL:TITLE\(_{Paradise}\), by, Coldplay] | [ ] | false |
NT-SL:ARTIST | [Play, IN:PLAY_MUSIC, ..., Coldplay, SL:ARTIST] | [ ] | false |
Reduce | [Play, IN:PLAY_MUSIC, ..., by, SL:ARTIST\(_{Coldplay}\)] | [ ] | false |
Reduce | [IN:PLAY_MUSIC\(_{Play\ \texttt{SL:TITLE}\ by\ \texttt{SL:ARTIST}}\)] | [ ] | false |
Finish | [IN:PLAY_MUSIC\(_{Play\ \texttt{SL:TITLE}\ by\ \texttt{SL:ARTIST}}\)] | [ ] | true |
Neural Parsing Model
Following the Stack-Transformer architecture, the stack and buffer of each state configuration \(c_t\) are tracked by dedicated attention heads through token masks \(m^{\textit{stack}}\) and \(m^{\textit{buffer}}\), which are updated after each action \(a_{t-1}\) as follows (see the sketch after this list):

- If the action \(a_{t-1}\) is a Shift transition, the first token in \(m^{\textit{buffer}}\) is masked out and added to \(m^{\textit{stack}}\). This applies to all proposed transition systems, as the Shift transition behaves consistently across them.
- When a Non-Terminal-L transition is applied, it affects the stack structure in \(c_t\) but has no effect on \(m^{\textit{stack}}\): the attention heads only attend to input tokens, and non-terminals are artificial symbols not present in the user utterance.
- For a Reduce transition (including the Reduce#k-L action from the non-binary bottom-up transition system), all tokens in \(m^{\textit{stack}}\) that form the new constituent are masked out, except for the first word, which represents the resulting constituent (since artificial non-terminals cannot be attended to by the attention heads).
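The sketch below shows how these mask updates could be implemented over boolean token masks (a simplification of the Stack-Transformer behavior described above; variable names are ours):

```python
def update_masks(action, m_stack, m_buffer, reduced_token_positions=None):
    """m_stack[i] / m_buffer[i] are True when input token i is visible to the
    stack / buffer attention head after action a_{t-1} (plain bool lists)."""
    if action == "SHIFT":
        i = next(j for j, visible in enumerate(m_buffer) if visible)
        m_buffer[i] = False            # token leaves the buffer head ...
        m_stack[i] = True              # ... and becomes visible to the stack head
    elif action.startswith("REDUCE"):  # plain Reduce or Reduce#k-L
        first, *rest = reduced_token_positions
        for i in rest:                 # keep only the first token, which now
            m_stack[i] = False         # represents the whole new constituent
    # NT-L and Finish leave both masks unchanged: non-terminals are artificial
    # symbols that never correspond to input tokens.
    return m_stack, m_buffer
```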
Experiments
Setup
The Task-Oriented Parsing dataset (TOP)2 was introduced by [4], who annotated utterances with multiple nested intents across two domains: event and navigation. It was further extended by [19] in a second version (TOPv2),3 which added six additional domains: alarm, messaging, music, reminder, timer, and weather. While the first version presents user queries with a high degree of compositionality, TOPv2 introduced some domains (such as music and weather) where all utterances can be parsed with flat trees. Table 4 provides statistics of the TOP and TOPv2 datasets.

TOPv2 offers specific splits designed to evaluate task-oriented semantic parsers in a low-resource domain adaptation scenario. The conventional approach uses samples from the reminder and weather domains as target domains, treating the remaining six full domains (including event and navigation from TOP) as source domains if necessary. Moreover, instead of selecting a fixed number of training samples per target domain, TOPv2 adopts a SPIS (samples per intent and slot) strategy: for example, 25 SPIS entails randomly selecting as many samples as needed to ensure at least 25 training instances for each intent and slot of the target domain. To facilitate a fair comparison, we evaluate our approach on the training, validation, and test splits at both 25 SPIS and 500 SPIS for the target domains reminder and weather, as provided in TOPv2. Additionally, following the methodology proposed by [19], we employ a joint training strategy in the 25 SPIS setting, wherein the training data from the source domains is combined with the training split of the target domain.

Finally, we also experiment with a variant of the TOPv2 dataset (referred to as TOPv2\(^*\)) that comprises the domains with a high percentage of hierarchical structures: alarm, messaging, and reminder. Our aim is to rigorously test the three proposed transition systems on complex compositional queries, excluding domains that can be fully parsed with flat trees and are thus more easily handled by traditional slot-filling methods.

Table 4. Statistics of the TOP and TOPv2 datasets (%Compos denotes the percentage of compositional queries).

Dataset | Domain | Training | Valid | Test | Intents | Slots | %Compos |
---|---|---|---|---|---|---|---|
TOP | Event | 9170 | 1336 | 2654 | 11 | 17 | 20% |
 | Navigation | 20,998 | 2971 | 6075 | 17 | 33 | 43% |
TOPv2 | Alarm | 20,430 | 2935 | 7123 | 8 | 9 | 16% |
 | Messaging | 10,018 | 1536 | 3048 | 12 | 27 | 16% |
 | Music | 11,563 | 1573 | 4184 | 15 | 9 | 0% |
 | Reminder | 17,840 | 2526 | 5767 | 19 | 32 | 21% |
 | Timer | 11,524 | 1616 | 4252 | 11 | 5 | 4% |
 | Weather | 23,054 | 2667 | 5682 | 7 | 11 | 0% |
We evaluate all models with three metrics:

- Exact match accuracy (EM), which measures the percentage of full trees correctly built.
- Labeled bracketing F\(_1\) score (F\(_1\)), which compares the non-terminal label and span of each predicted constituent against the gold standard. This is similar to the scoring method of the EVALB script4 for constituency parsing [44], but also includes pre-terminal nodes in the evaluation (see the sketch after this list).
- Tree-labeled F\(_1\) score (TF\(_1\)), which evaluates the subtree structure of each predicted constituent against the gold tree.
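As an illustration, labeled bracketing F\(_1\) can be computed from multisets of (label, start, end) constituent spans; this is our own minimal sketch, not the official evaluation script:

```python
from collections import Counter

def bracketing_f1(pred_spans, gold_spans):
    """F1 over labeled constituent spans, e.g. ("SL:MUSIC_ARTIST_NAME", 3, 4),
    with pre-terminal nodes included among the spans."""
    pred, gold = Counter(pred_spans), Counter(gold_spans)
    correct = sum((pred & gold).values())          # multiset intersection
    precision = correct / max(sum(pred.values()), 1)
    recall = correct / max(sum(gold.values()), 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```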
For optimization, we follow the inverse-sqrt learning rate scheduling scheme, with a minimum learning rate of \(1e^{-9}\) [10]. Additionally, we applied a label smoothing rate of 0.01 and a dropout rate of 0.3, and trained for 90 epochs. Furthermore, we averaged the weights of the three best checkpoints (selected on the validation split with greedy decoding) and employed a beam size of 10 for evaluation on the test set. All models were trained and tested on a single Nvidia Tesla P40 GPU with 24 GB of memory.
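For reference, the inverse-sqrt schedule is commonly implemented as linear warmup followed by decay proportional to the inverse square root of the step count; a sketch under assumed hyperparameter names (peak_lr and warmup_steps are not restated in this section):

```python
def inverse_sqrt_lr(step: int, peak_lr: float, warmup_steps: int,
                    min_lr: float = 1e-9) -> float:
    """Learning rate at a given optimizer step, floored at min_lr."""
    if step < warmup_steps:
        lr = peak_lr * step / max(1, warmup_steps)      # linear warmup
    else:
        lr = peak_lr * (warmup_steps / step) ** 0.5     # ~ 1/sqrt(step) decay
    return max(lr, min_lr)
```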
For the low-resource setting, we compare against baselines evaluated on the same TOPv2 splits. Lastly, we incorporate the recent state-of-the-art approach by [33], which employs a language model enhanced with semantic structured information, into both the high-resource and low-resource comparisons.
Table 5. Comparison of the three transition systems on the TOP and TOPv2\(^{*}\) test splits.

 | TOP | | | TOPv2\(^{*}\) | | |
Transition system | EM | F\(_1\) | TF\(_1\) | EM | F\(_1\) | TF\(_1\) |
---|---|---|---|---|---|---|
Top-down | 86.66±0.06 | 95.33±0.07 | 90.80±0.01 | 87.98±0.09 | 94.51±0.03 | 90.99±0.09 |
Bottom-up | 85.89±0.10 | 94.86±0.06 | 90.33±0.05 | 86.27±0.03 | 93.56±0.06 | 89.57±0.02 |
In-order | 87.15±0.01 | 95.57±0.15 | 91.18±0.13 | 88.11±0.07 | 94.60±0.04 | 91.11±0.07 |
Table 6. Comparison with existing models on the TOP test set.

Parser | EM |
---|---|
(Sequence-to-sequence models) | |
Rongali et al. [11] + RoBERTa\(_\textsc {fine-tuned}\) | 86.67 |
Aghajanyan et al. [12] + RoBERTa\(_\textsc {fine-tuned}\) | 84.52 |
Aghajanyan et al. [12] + BART\(_\textsc {fine-tuned}\) | 87.10 |
Zhu et al. [27] + RoBERTa\(_\textsc {fine-tuned}\) | 86.74 |
Shrivastava et al. [29] + RoBERTa\(_\textsc {fine-tuned}\) | 85.07 |
Oh et al. [30] + BERT\(_\textsc {fine-tuned}\) | 86.00 |
Shrivastava et al. [31] + RoBERTa\(_\textsc {fine-tuned}\) | 86.14 |
(Shift-reduce models) | |
Einolghozati et al. [22] + ELMo | 83.93 |
Top-down shift-reduce parser + RoBERTa | 86.66 |
Bottom-up shift-reduce parser + RoBERTa | 85.89 |
In-order shift-reduce parser + RoBERTa | 87.15 |
Einolghozati et al. [22] + ELMo + ensemble | 86.26 |
Einolghozati et al. [22] + ELMo + ensemble + SVM-Rank | 87.25 |
Do et al. [33] + RoBERTa\(_\textsc {fine-tuned}^\textsc {+ hierarchical information}\) | 88.18 |
Results
We report the performance of the three transition systems on the TOP and TOPv2\(^*\) datasets in Table 5. Regardless of the metric, the in-order algorithm consistently outperforms the other two alternatives on both datasets. Although the TOP dataset contains a higher percentage of compositional queries than TOPv2\(^*\), the in-order parser shows a larger accuracy advantage over the top-down parser on TOP (0.49 EM accuracy points) than on TOPv2\(^*\) (0.13 EM accuracy points). The bottom-up approach notably underperforms the other transition systems on both datasets.

Table 6 compares our shift-reduce parsers against existing models on the TOP dataset. Using frozen RoBERTa-based word embeddings, the in-order shift-reduce parser outperforms all existing methods under similar conditions, including sequence-to-sequence models that fine-tune language models for task-oriented parsing. Specifically, it surpasses the single-model and ensemble variants of the shift-reduce parser by [22] by 3.22 and 0.89 EM accuracy points, respectively. Additionally, our best transition system achieves improvements of 0.41 and 0.05 EM accuracy points over the top-performing sequence-to-sequence baselines initialized with RoBERTa [27] and BART [12], respectively. The only exceptions are the enhanced variant of [22] (which uses an ensemble of seven parsers and an SVM language-model ranker) and the two-staged system by [33] (which augments a language model with hierarchical information), the latter achieving the best accuracy to date on the TOP dataset.

Table 7. Exact match accuracy on the reminder and weather target domains of TOPv2 at 25 and 500 SPIS.

 | Reminder | | Weather | |
Parser | 25 SPIS | 500 SPIS | 25 SPIS | 500 SPIS |
---|---|---|---|---|
(Sequence-to-sequence models) | ||||
Chen et al. [19] + RoBERTa\(_\textsc {fine-tuned}\) | − | 71.9 | − | 83.5 |
Chen et al. [19] + BART\(_\textsc {fine-tuned}\) | 57.1 | 71.9 | 71.0 | 84.9 |
(Shift-reduce models) | ||||
Top-down S-R parser + RoBERTa | 57.39±0.27 | 79.79±0.19 | 71.22±1.03 | 83.19±0.20 |
Bottom-up S-R parser + RoBERTa | 40.45±0.88 | 68.65±0.39 | 68.58±0.99 | 74.13±0.35 |
In-order S-R parser + RoBERTa | 60.56±0.12 | 79.79±0.27 | 73.36±0.08 | 85.44±0.21 |
Do et al. [33]+RoBERTa\(_\textsc {fine-tuned}^\textsc {+hierar. inform.}\) | 72.12 | 82.28 | 77.96 | 88.08 |
It is also worth noting that, unlike our shift-reduce parsers, whose transition constraints guarantee well-formed outputs, sequence-to-sequence models may generate invalid trees: a fraction of the trees they predict for the TOP test split were not well-formed. Although [19] did not document this information, we anticipate a significant increase in invalid trees in the low-resource setting. Finally, it is worth mentioning that techniques such as ensembling, re-ranking, or fine-tuning pre-trained language models are orthogonal to our approach and, while they may consume more resources, can be directly implemented to further enhance performance.

Analysis
We also break down performance by constituent label on the TOP dataset. The in-order parser obtains the highest accuracy on most labels, including slots such as SL:DATE_TIME_DEPARTURE and SL:DATE_TIME_ARRIVAL. The only exceptions are the slot constituents SL:SOURCE and SL:CATEGORY_EVENT, where the bottom-up and top-down algorithms, respectively, achieve higher accuracy.