
Computation and Language

New submissions

[ total of 22 entries: 1-22 ]

New submissions for Fri, 20 Apr 18

[1]  arXiv:1804.06868 [pdf, other]
Title: Learning to Map Context-Dependent Sentences to Executable Formal Queries
Comments: NAACL-HLT 2018
Subjects: Computation and Language (cs.CL)

We propose a context-dependent model to map utterances within an interaction to executable formal queries. To incorporate interaction history, the model maintains an interaction-level encoder that updates after each turn, and can copy sub-sequences of previously predicted queries during generation. Our approach combines implicit and explicit modeling of references between utterances. We evaluate our model on the ATIS flight planning interactions, and demonstrate the benefits of modeling context and explicit references.

[2]  arXiv:1804.06870 [pdf, other]
Title: Object Ordering with Bidirectional Matchings for Visual Reasoning
Authors: Hao Tan, Mohit Bansal
Comments: NAACL 2018 (8 pages)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Visual reasoning with compositional natural language instructions, e.g., based on the newly-released Cornell Natural Language Visual Reasoning (NLVR) dataset, is a challenging task, where the model needs to have the ability to create an accurate mapping between the diverse phrases and the several objects placed in complex arrangements in the image. Further, this mapping needs to be processed to answer the question in the statement given the ordering and relationship of the objects across three similar images. In this paper, we propose a novel end-to-end neural model for the NLVR task, where we first use joint bidirectional attention to build a two-way conditioning between the visual information and the language phrases. Next, we use an RL-based pointer network to sort and process the varying number of unordered objects (so as to match the order of the statement phrases) in each of the three images and then pool over the three decisions. Our model achieves strong improvements (of 4-6% absolute) over the state-of-the-art on both the structured representation and raw image versions of the dataset.
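The pointer-based ordering step described above can be illustrated with a small greedy decode (a minimal sketch with made-up dot-product scoring; the paper's actual pointer network is a trained neural component and uses RL rather than this simplification):

```python
import numpy as np

def pointer_order(object_feats, phrase_queries):
    """Greedy pointer decoding: at each step, point to the not-yet-selected
    object whose features best match the current phrase query, producing an
    ordering of objects aligned with the statement phrases."""
    n = len(object_feats)
    selected, order = np.zeros(n, dtype=bool), []
    for q in phrase_queries:
        scores = object_feats @ q          # similarity of each object to the phrase
        scores[selected] = -np.inf         # mask already-chosen objects
        i = int(np.argmax(scores))
        selected[i] = True
        order.append(i)
    return order
```

With one-hot features and queries, each phrase simply picks out its matching object in turn.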

[3]  arXiv:1804.06876 [pdf, other]
Title: Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods
Comments: NAACL '18 Camera Ready
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

We introduce a new benchmark, WinoBias, for coreference resolution focused on gender bias. Our corpus contains Winograd-schema style sentences with entities corresponding to people referred to by their occupation (e.g. the nurse, the doctor, the carpenter). We demonstrate that a rule-based, a feature-rich, and a neural coreference system all link gendered pronouns to pro-stereotypical entities with higher accuracy than to anti-stereotypical entities, by an average difference of 21.1 in F1 score. Finally, we demonstrate a data-augmentation approach that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by these systems in WinoBias without significantly affecting their performance on existing coreference benchmark datasets. Our dataset and code are available at this http URL
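The gender-swap data augmentation can be sketched roughly as follows (the word list and pipeline are illustrative assumptions, not the authors' released code, which must also handle ambiguous forms such as "her" mapping to either "him" or "his"):

```python
# Minimal rule-based gender-swap augmentation sketch. The mapping below is a
# toy word list; "her" is ambiguous (object vs. possessive) and a real system
# would need context to choose between "him" and "his".
SWAP = {"he": "she", "she": "he", "him": "her", "her": "him",
        "his": "her", "man": "woman", "woman": "man"}

def gender_swap(sentence):
    """Return a copy of the sentence with gendered words swapped."""
    return " ".join(SWAP.get(t.lower(), t) for t in sentence.split())

def augment(corpus):
    """Train on the union of original and gender-swapped sentences."""
    return corpus + [gender_swap(s) for s in corpus]
```

Training on the augmented corpus balances pro- and anti-stereotypical examples, which is the intuition behind this style of debiasing.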

[4]  arXiv:1804.06898 [pdf, other]
Title: Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input
Comments: 9
Journal-ref: NAACL 2018
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

We demonstrate that current state-of-the-art approaches to Automated Essay Scoring (AES) are not well-suited to capturing adversarially crafted input of grammatical but incoherent sequences of sentences. We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. We evaluate our approach against a number of baselines and experimentally demonstrate its effectiveness on both the AES task and the task of flagging adversarial input, further contributing to the development of an approach that strengthens the validity of neural essay scoring models.

[5]  arXiv:1804.06922 [pdf, other]
Title: Sentences with Gapping: Parsing and Reconstructing Elided Predicates
Comments: To be presented at NAACL 2018
Journal-ref: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018)
Subjects: Computation and Language (cs.CL)

Sentences with gapping, such as Paul likes coffee and Mary tea, lack an overt predicate to indicate the relation between two or more arguments. Surface syntax representations of such sentences are often produced poorly by parsers, and even when correct, are not well suited to downstream natural language understanding tasks such as relation extraction that are typically designed to extract information from sentences with canonical clause structure. In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. We find that both methods can reconstruct elided material from dependency trees with high accuracy when the parser correctly predicts the existence of a gap. We further demonstrate that one of our methods can be applied to other languages based on a case study on Swedish.

[6]  arXiv:1804.06987 [pdf, other]
Title: Improving Distantly Supervised Relation Extraction using Word and Entity Based Attention
Subjects: Computation and Language (cs.CL)

Relation extraction is the problem of classifying the relationship between two entities in a given sentence. Distant Supervision (DS) is a popular technique for developing relation extractors starting with limited supervision. We note that most of the sentences in the distant supervision relation extraction setting are very long and may benefit from word attention for better sentence representation. Our contributions in this paper are threefold. Firstly, we propose two novel word attention models for distantly-supervised relation extraction, (1) a Bi-directional Gated Recurrent Unit (Bi-GRU) based word attention model (BGWA) and (2) an entity-centric attention model (EA), as well as a combination model that combines multiple complementary models using a weighted voting method for improved relation extraction. Secondly, we introduce GDS, a new distant supervision dataset for relation extraction. GDS removes the test data noise present in all previous distant-supervision benchmark datasets, making credible automatic evaluation possible. Thirdly, through extensive experiments on multiple real-world datasets, we demonstrate the effectiveness of the proposed methods.
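Entity-centric word attention can be illustrated with a minimal numpy sketch (the learned scoring function in the paper is replaced here by a plain dot product against an assumed entity embedding):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def entity_centric_attention(word_vecs, entity_vec):
    """Score each word by similarity to the entity embedding, then return
    the attention-weighted sum as the sentence representation."""
    scores = word_vecs @ entity_vec        # (n_words,) relevance scores
    weights = softmax(scores)              # attention distribution over words
    return weights @ word_vecs             # weighted sum -> sentence vector
```

Words similar to the entity dominate the representation, which is the point of entity-centric attention in long sentences.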

[7]  arXiv:1804.07000 [pdf, other]
Title: Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. 13 pages, 3 figures, 7 tables
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss their problems. This paper addresses the early detection of depression using machine learning models based on messages on a social platform. In particular, a convolutional neural network based on different word embeddings is evaluated and compared to a classification based on user-level linguistic metadata. An ensemble of both approaches is shown to achieve state-of-the-art results in a current early detection task. Furthermore, the currently popular ERDE score as a metric for early detection systems is examined in detail and its drawbacks in the context of shared tasks are illustrated. A slightly modified metric is proposed and compared to the original score. Finally, a new word embedding was trained on a large corpus of the same domain as the described task and is evaluated as well.

[8]  arXiv:1804.07007 [pdf, ps, other]
Title: Incorporating Pseudo-Parallel Data for Quantifiable Sequence Editing
Comments: 11 pages, 1 figure, 7 tables
Subjects: Computation and Language (cs.CL)

In the task of quantifiable sequence editing (QuaSE), a model needs to edit an input sentence to generate an output that satisfies a given outcome, which is a numerical value measuring a certain property of the output. For example, for review sentences, the outcome could be the review rating; for advertisements, the outcome could be the click-through rate. We propose a framework which performs QuaSE by incorporating pseudo-parallel data. Our framework can capture the content similarity and the outcome differences by exploiting pseudo-parallel sentence pairs, which enables a better disentanglement of the latent factors that are relevant to the outcome and thus provides a solid basis for generating output satisfying the desired outcome. The dual reconstruction structure further enhances the capability of generating expected output by exploiting the coupling of latent factors of pseudo-parallel sentences. We prepare a dataset of Yelp review sentences with the ratings as outcome. Experimental results show that our framework can outperform state-of-the-art methods in terms of both sentiment polarity accuracy and target value error.

[9]  arXiv:1804.07036 [pdf, other]
Title: Learning to Extract Coherent Summary via Deep Reinforcement Learning
Comments: 8 pages, 1 figure, presented at AAAI-2018
Subjects: Computation and Language (cs.CL)

Coherence plays a critical role in producing a high-quality summary from a document. In recent years, neural extractive summarization has become increasingly attractive. However, most existing approaches ignore the coherence of summaries when extracting sentences. As an effort towards extracting coherent summaries, we propose a neural coherence model to capture the cross-sentence semantic and syntactic coherence patterns. The proposed neural coherence model obviates the need for feature engineering and can be trained in an end-to-end fashion using unlabeled data. Empirical results show that the proposed neural coherence model can efficiently capture the cross-sentence coherence patterns. Using the combined output of the neural coherence model and the ROUGE package as the reward, we design a reinforcement learning method to train a proposed neural extractive summarizer, named the Reinforced Neural Extractive Summarization (RNES) model. The RNES model learns to optimize the coherence and informativeness of the summary simultaneously. Experimental results show that the proposed RNES outperforms existing baselines and achieves state-of-the-art performance in terms of ROUGE on the CNN/Daily Mail dataset. The qualitative evaluation indicates that summaries produced by RNES are more coherent and readable.

[10]  arXiv:1804.07068 [pdf, ps, other]
Title: Consistent CCG Parsing over Multiple Sentences for Improved Logical Reasoning
Comments: 6 pages. short paper accepted to NAACL2018
Subjects: Computation and Language (cs.CL)

In formal logic-based approaches to Recognizing Textual Entailment (RTE), a Combinatory Categorial Grammar (CCG) parser is used to parse input premises and hypotheses to obtain their logical formulas. Here, it is important that the parser processes the sentences consistently; failing to recognize a similar syntactic structure results in inconsistent predicate argument structures among them, in which case the succeeding theorem proving is doomed to failure. In this work, we present a simple method to extend an existing CCG parser to parse a set of sentences consistently, which is achieved with inter-sentence modeling using a Markov Random Field (MRF). When combined with existing logic-based systems, our method consistently shows improvement in RTE experiments in English and Japanese.
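The underlying objective, per-sentence parser scores plus pairwise agreement between sentences, can be illustrated by brute-force joint selection over small candidate parse sets (a sketch only; the paper's MRF inference procedure and agreement features are not reproduced here):

```python
from itertools import product

def consistent_parses(candidates, scores, agreement, lam=1.0):
    """Pick one parse per sentence, maximizing the sum of per-sentence parser
    scores (unary potentials) plus pairwise agreement bonuses (binary
    potentials). Brute force over the small candidate sets stands in for
    proper MRF inference."""
    best, best_score = None, float("-inf")
    for choice in product(*[range(len(c)) for c in candidates]):
        total = sum(scores[i][k] for i, k in enumerate(choice))
        for i in range(len(choice)):
            for j in range(i + 1, len(choice)):
                total += lam * agreement(candidates[i][choice[i]],
                                         candidates[j][choice[j]])
        if total > best_score:
            best, best_score = choice, total
    return [candidates[i][k] for i, k in enumerate(best)]
```

With a large enough agreement bonus, two sentences sharing a structure are parsed the same way even when the parser slightly prefers divergent analyses for each in isolation.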

[11]  arXiv:1804.07097 [pdf, other]
Title: Putting Question-Answering Systems into Practice: Transfer Learning for Efficient Domain Customization
Comments: Submitted to ACM TMIS
Subjects: Computation and Language (cs.CL)

Traditional information retrieval (such as that offered by web search engines) impedes users with information overload from extensive result pages and the need to manually locate the desired information therein. Conversely, question-answering systems change how humans interact with information systems: users can now ask specific questions and obtain a tailored answer - both conveniently in natural language. Despite obvious benefits, their use is often limited to an academic context, largely because of expensive domain customizations, which means that the performance in domain-specific applications often fails to meet expectations. This paper presents cost-efficient remedies: a selection mechanism increases the precision of document retrieval, and a fused approach to transfer learning is proposed in order to improve the performance of answer extraction. Here, knowledge is inductively transferred from a related, yet different, task to the domain-specific application, while accounting for potential differences in the sample sizes across both tasks. The resulting performance is demonstrated with an actual use case from a finance company, where fewer than 400 question-answer pairs had to be annotated in order to yield significant performance gains. As a direct implication for management, this presents a promising path to better leveraging of knowledge stored in information systems.

[12]  arXiv:1804.07212 [pdf, other]
Title: Learning Disentangled Representations of Texts with Application to Biomedical Abstracts
Subjects: Computation and Language (cs.CL)

We propose a method for learning disentangled sets of vector representations of texts that capture distinct aspects. We argue that such representations afford model transfer and interpretability. To induce disentangled embeddings, we propose an adversarial objective based on the (dis)similarity between triplets of documents w.r.t. specific aspects. Our motivating application concerns embedding abstracts describing clinical trials in a manner that disentangles the populations, interventions, and outcomes in a given trial. We show that the induced representations indeed encode these targeted clinically salient aspects and that they can be effectively used to perform aspect-specific retrieval. We demonstrate that the approach generalizes beyond this motivating example via experiments on two multi-aspect review corpora.

[13]  arXiv:1804.07253 [pdf, other]
Title: Helping or Hurting? Predicting Changes in Users' Risk of Self-Harm Through Online Community Interactions
Comments: 10 pages, 4 figures, 5 tables, accepted for publication at the CLPsych workshop at NAACL-HLT 2018
Subjects: Computation and Language (cs.CL)

In recent years, online communities have formed around suicide and self-harm prevention. While these communities offer support in moments of crisis, they can also normalize harmful behavior, discourage professional treatment, and instigate suicidal ideation. In this work, we focus on how interaction with others in such a community affects the mental state of users who are seeking support. We first build a dataset of conversation threads between users in a distressed state and community members offering support. We then show how to construct a classifier to predict whether distressed users are helped or harmed by the interactions in the thread, and we achieve a macro-F1 score of up to 0.69.
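Macro-F1, the metric reported above, averages per-class F1 scores with equal weight, which matters when the helped/harmed classes are imbalanced. A self-contained implementation for reference:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with equal
    weight, so minority classes count as much as majority ones."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1s.append(f1)
    return sum(f1s) / len(f1s)
```

This matches the standard definition (e.g. scikit-learn's `f1_score` with `average="macro"`).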

Cross-lists for Fri, 20 Apr 18

[14]  arXiv:1804.07247 (cross-list from cs.SI) [pdf, other]
Title: Identifying Compromised Accounts on Social Media Using Statistical Text Analysis
Comments: 10 pages
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Cryptography and Security (cs.CR)

Compromised social media accounts are legitimate user accounts that have been hijacked by a third (malicious) party and can cause various kinds of damage. Early detection of such compromised accounts is very important in order to control the damage. In this work we propose a novel general framework for discovering compromised accounts by utilizing statistical text analysis. The framework is built on the observation that, once an account is compromised, the language it produces is measurably different from the language of the legitimate user, since a hacker (or spammer) writes differently. We use the framework to develop specific algorithms based on language modeling and use the similarity of language models of users and spammers as features in a supervised learning setup to identify compromised accounts. Evaluation results on a large Twitter corpus of over 129 million tweets show promising results for the proposed approach.
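The language-model comparison can be illustrated with smoothed unigram models: score a new message under a model of the account's history and under a model of known spam text, and use the gap as a feature (a minimal sketch; the paper's exact models and feature set are assumptions here):

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, alpha=1.0):
    """Add-alpha-smoothed unigram language model as a dict of probabilities
    over a fixed vocabulary."""
    counts = Counter(tokens)
    denom = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / denom for w in vocab}

def cross_entropy(lm, tokens):
    """Average negative log-probability of the tokens under the model;
    lower means the text fits the model better."""
    return -sum(math.log(lm[w]) for w in tokens) / len(tokens)
```

A message whose cross-entropy is lower under the spammer model than under the account-history model is a signal that the account may be compromised; the gap between the two scores can serve as one feature for the supervised classifier.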

Replacements for Fri, 20 Apr 18

[15]  arXiv:1603.08865 (replaced) [html]
Title: Compilation as a Typed EDSL-to-EDSL Transformation
Authors: Emil Axelsson
Subjects: Computation and Language (cs.CL)
[16]  arXiv:1704.05958 (replaced) [pdf, ps, other]
Title: Global Relation Embedding for Relation Extraction
Comments: Accepted to NAACL HLT 2018
Subjects: Computation and Language (cs.CL)
[17]  arXiv:1710.06371 (replaced) [pdf, other]
Title: Specialising Word Vectors for Lexical Entailment
Comments: NAACL-HLT 2018 (long paper)
Subjects: Computation and Language (cs.CL)
[18]  arXiv:1711.00768 (replaced) [pdf, other]
Title: SRL4ORL: Improving Opinion Role Labeling using Multi-task Learning with Semantic Role Labeling
Comments: Published in NAACL 2018
Journal-ref: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)
Subjects: Computation and Language (cs.CL)
[19]  arXiv:1803.02400 (replaced) [pdf, other]
Title: Natural Language to Structured Query Generation via Meta-Learning
Comments: in NAACL HLT 2018
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
[20]  arXiv:1804.05388 (replaced) [pdf, other]
Title: Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness
Comments: The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2018)
Subjects: Computation and Language (cs.CL)
[21]  arXiv:1804.06610 (replaced) [pdf, other]
Title: End-to-end Graph-based TAG Parsing with Neural Networks
Comments: NAACL 2018
Subjects: Computation and Language (cs.CL)
[22]  arXiv:1803.01271 (replaced) [pdf, other]
Title: An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
