# Computation and Language

## New submissions

[ total of 48 entries: 1-48 ]

### New submissions for Tue, 25 Apr 17

[1]
Title: Improving Semantic Composition with Offset Inference
Comments: to appear at ACL 2017 (short papers)
Subjects: Computation and Language (cs.CL)

Count-based distributional semantic models suffer from sparsity due to unobserved but plausible co-occurrences in any text collection. This problem is amplified for models such as Anchored Packed Trees (APTs), which take the grammatical type of a co-occurrence into account. We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition.

[2]
Title: Lexical Features in Coreference Resolution: To be Used With Caution
Subjects: Computation and Language (cs.CL)

Lexical features are a major source of information in state-of-the-art coreference resolvers. They implicitly model some linguistic phenomena at a fine level of granularity and are especially useful for representing the context of mentions. In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. We show that resolvers which rely mainly on lexical features can hardly generalize to unseen domains. Furthermore, we show that the current coreference resolution evaluation is clearly flawed: it evaluates only on a specific split of a specific dataset in which there is notable overlap between the training, development and test sets.

[3]
Title: Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation
Subjects: Computation and Language (cs.CL)

Sarcasm is a form of speech in which speakers say the opposite of what they truly mean in order to convey a strong sentiment. In other words, "Sarcasm is the giant chasm between what I say, and the person who doesn't get it." In this paper we present the novel task of sarcasm interpretation, defined as the generation of a non-sarcastic utterance conveying the same message as the original sarcastic one. We introduce a novel dataset of 3000 sarcastic tweets, each interpreted by five human judges. Addressing the task as monolingual machine translation (MT), we experiment with MT algorithms and evaluation measures. We then present SIGN: an MT based sarcasm interpretation algorithm that targets sentiment words, a defining element of textual sarcasm. We show that while the scores of n-gram based automatic measures are similar for all interpretation models, SIGN's interpretations are scored higher by humans for adequacy and sentiment polarity. We conclude with a discussion on future research directions for our new task.

[4]
Title: Medical Text Classification using Convolutional Neural Networks
Subjects: Computation and Language (cs.CL)

We present an approach to automatically classify clinical text at the sentence level. We use deep convolutional neural networks to represent complex features. We train the network on a dataset providing a broad categorization of health information. Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%.

[5]
Title: Affect-LM: A Neural Language Model for Customizable Affective Text Generation
Subjects: Computation and Language (cs.CL)

Human verbal communication includes affective messages which are conveyed through the use of emotionally colored words. There has been much research in this direction, but the problem of integrating state-of-the-art neural language models with affective information remains an area ripe for exploration. In this paper, we propose an extension to an LSTM (Long Short-Term Memory) language model for generating conversational text conditioned on affect categories. Our proposed model, Affect-LM, enables us to customize the degree of emotional content in generated sentences through an additional design parameter. Perception studies conducted using Amazon Mechanical Turk show that Affect-LM generates natural-looking emotional sentences without sacrificing grammatical correctness. Affect-LM also learns affect-discriminative word representations, and perplexity experiments show that additional affective information in conversational text can improve language model prediction.

[6]
Title: Deep Multitask Learning for Semantic Dependency Parsing
Subjects: Computation and Language (cs.CL)

We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. By using efficient, nearly arc-factored inference and a bidirectional-LSTM composed with a multi-layer perceptron, our base system is able to significantly improve the state of the art for semantic dependency parsing, without using hand-engineered features or syntax. We then explore two multitask learning approaches---one that shares parameters across formalisms, and one that uses higher-order structures to predict the graphs jointly. We find that both approaches improve performance across formalisms on average, achieving a new state of the art. Our code is open-source and available at https://github.com/Noahs-ARK/NeurboParser.

[7]
Title: Argument Mining with Structured SVMs and RNNs
Comments: Accepted for publication at ACL 2017. 11 pages, 5 figures. Code at this https URL and data at this http URL
Subjects: Computation and Language (cs.CL)

We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure. (This is the case in over 20% of the web comments dataset we release.) Our model jointly learns elementary unit type classification and argumentative relation prediction. Moreover, our model supports SVM and RNN parametrizations, can enforce structure constraints (e.g., transitivity), and can express dependencies between adjacent relations and propositions. Our approaches outperform unstructured baselines in both web comments and argumentative essay datasets.

[8]
Title: Learning to Skim Text
Subjects: Computation and Language (cs.CL); Learning (cs.LG)

Recurrent Neural Networks are showing much promise in many sub-areas of natural language processing, ranging from document classification to machine translation to automatic question answering. Despite their promise, many recurrent models have to read the whole text word by word, making them slow on long documents. For example, it is difficult to use a recurrent network to read a book and answer questions about it. In this paper, we present an approach that reads text while skipping irrelevant information when possible. The underlying model is a recurrent network that learns how far to jump after reading a few words of the input text. We employ a standard policy gradient method to train the model to make discrete jumping decisions. In our benchmarks on four different tasks, including number prediction, sentiment analysis, news article classification and automatic Q&A, our proposed model, a modified LSTM with jumping, is up to 6 times faster than the standard sequential LSTM, while maintaining the same or even better accuracy.
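The read-then-jump control flow can be sketched as follows. This is a minimal illustration, not the paper's model: the real jump distances come from an LSTM policy trained with policy gradients, whereas the `jump_policy` stand-in here simply skips a fixed number of tokens.

```python
# Sketch of the skim-reading loop: consume a small window of tokens,
# then jump ahead by a policy-chosen distance, repeat until the end.

def skim_read(tokens, read_size=2, jump_policy=None):
    """Read `read_size` tokens at a time, skipping `jump_policy(window)`
    tokens between reads. The default policy is a placeholder that
    always jumps one token; the paper learns this decision."""
    if jump_policy is None:
        jump_policy = lambda window: 1
    consumed = []
    i = 0
    while i < len(tokens):
        window = tokens[i:i + read_size]   # tokens actually fed to the RNN
        consumed.extend(window)
        i += read_size + jump_policy(window)  # skip the next few tokens
    return consumed

print(skim_read(list(range(10))))  # → [0, 1, 3, 4, 6, 7, 9]
```

Because whole spans of input are never fed to the recurrent cell, the number of RNN steps, and hence runtime, drops roughly in proportion to the average jump length.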

[9]
Title: Deep Keyphrase Generation
Comments: 11 pages. Accepted by ACL2017
Subjects: Computation and Language (cs.CL)

Keyphrases provide highly summative information that can be effectively used for understanding, organizing and retrieving text content. Though previous studies have provided many workable solutions for automated keyphrase extraction, they commonly divide the to-be-summarized content into multiple text chunks, then rank and select the most meaningful ones. These approaches can neither identify keyphrases that do not appear in the text nor capture the real semantic meaning behind the text. We propose a generative model for keyphrase prediction with an encoder-decoder framework, which can effectively overcome the above drawbacks. We name it deep keyphrase generation, since it attempts to capture the deep semantic meaning of the content with a deep learning method. Empirical analysis on six datasets demonstrates that our proposed model not only achieves a significant performance boost on extracting keyphrases that appear in the source text, but can also generate absent keyphrases based on the semantic meaning of the text. Code and dataset are available at https://github.com/memray/seq2seq-keyphrase.

[10]
Title: Learning weakly supervised multimodal phoneme embeddings
Subjects: Computation and Language (cs.CL); Learning (cs.LG)

Recent works have explored deep architectures for learning multimodal speech representation (e.g. audio and images, articulation and audio) in a supervised way. Here we investigate the role of combining different speech modalities, i.e. audio and visual information representing the lips movements, in a weakly supervised way using Siamese networks and lexical same-different side information. In particular, we ask whether one modality can benefit from the other to provide a richer representation for phone recognition in a weakly supervised setting. We introduce mono-task and multi-task methods for merging speech and visual modalities for phone recognition. The mono-task learning consists in applying a Siamese network on the concatenation of the two modalities, while the multi-task learning receives several different combinations of modalities at train time. We show that multi-task learning enhances discriminability for visual and multimodal inputs while minimally impacting auditory inputs. Furthermore, we present a qualitative analysis of the obtained phone embeddings, and show that cross-modal visual input can improve the discriminability of phonological features which are visually discernable (rounding, open/close, labial place of articulation), resulting in representations that are closer to abstract linguistic features than those based on audio only.

[11]
Title: Neural Machine Translation via Binary Code Prediction
Comments: Accepted as a long paper at ACL2017
Subjects: Computation and Language (cs.CL)

In this paper, we propose a new method for calculating the output layer in neural machine translation systems. The method is based on predicting a binary code for each word and can reduce the computation time and memory requirements of the output layer to be logarithmic in vocabulary size in the best case. In addition, we introduce two advanced approaches to improve the robustness of the proposed model: using error-correcting codes and combining softmax and binary codes. Experiments on two English-Japanese bidirectional translation tasks show that the proposed models achieve BLEU scores approaching those of the softmax, while reducing memory usage to less than one tenth and improving decoding speed on CPUs by a factor of 5 to 10.
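A toy sketch of the core idea, making no assumptions about the paper's actual architecture: each word index is assigned a fixed ceil(log2 V)-bit code, and an output layer of only B sigmoid units (instead of V softmax units) predicts the bits, which are hard-decoded back to a word index. The weight matrix and hidden vector below are random placeholders.

```python
import numpy as np

# Binary-code output layer in miniature: for a vocabulary of size V we
# only need B = ceil(log2 V) output units, since each word is identified
# by its B-bit code rather than by a one-hot position in a V-way softmax.

V = 1000                      # vocabulary size
B = int(np.ceil(np.log2(V)))  # bits needed: 10 for V = 1000

def word_to_bits(w):
    """B-bit code of word index w, least-significant bit first."""
    return np.array([(w >> i) & 1 for i in range(B)])

def bits_to_word(bits):
    """Inverse of word_to_bits: decode a bit vector to a word index."""
    return int(sum(int(b) << i for i, b in enumerate(bits)))

rng = np.random.default_rng(0)
W = rng.normal(size=(B, 64))  # only B rows, vs. V rows for a softmax

def predict_word(hidden):
    logits = W @ hidden                # (B,) bit logits
    probs = 1 / (1 + np.exp(-logits))  # independent per-bit sigmoids
    return bits_to_word(probs > 0.5)   # hard-decode the predicted bits

w = predict_word(rng.normal(size=64))
assert 0 <= w < 2 ** B
assert bits_to_word(word_to_bits(42)) == 42  # codes round-trip exactly
```

Note that 2^B may exceed V, so some codes decode to no real word; robustness fixes like the error-correcting codes mentioned in the abstract address the fragility of this plain bit-wise decoding.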

[12]
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

In this paper, we study a new learning paradigm for Neural Machine Translation (NMT). Instead of maximizing the likelihood of the human translation as in previous works, we minimize the distinction between human translation and the translation given by an NMT model. To achieve this goal, inspired by the recent success of generative adversarial networks (GANs), we employ an adversarial training architecture and name it Adversarial-NMT. In Adversarial-NMT, the training of the NMT model is assisted by an adversary, an elaborately designed Convolutional Neural Network (CNN). The goal of the adversary is to differentiate translations generated by the NMT model from those produced by humans; the goal of the NMT model is to produce high-quality translations so as to fool the adversary. A policy gradient method is leveraged to co-train the NMT model and the adversary. Experimental results on English$\rightarrow$French and German$\rightarrow$English translation tasks show that Adversarial-NMT achieves significantly better translation quality than several strong baselines.

[13]
Title: A* CCG Parsing with a Supertag and Dependency Factored Model
Comments: long paper (11 pages) accepted to ACL 2017
Subjects: Computation and Language (cs.CL)

We propose a new A* CCG parsing model in which the probability of a tree is decomposed into factors over CCG categories and their syntactic dependencies, both defined on bi-directional LSTMs. Our factored model allows the precomputation of all probabilities and runs very efficiently, while modeling sentence structures explicitly via dependencies. Our model achieves state-of-the-art results on English and Japanese CCG parsing.

[14]
Title: Naturalizing a Programming Language via Interactive Learning
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Learning (cs.LG)

Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. However, existing natural language interfaces for such tasks are quite primitive compared to the power one wields with a programming language. To bridge this gap, we start with a core programming language and allow users to "naturalize" the core language incrementally by defining alternative, more natural syntax and increasingly complex concepts in terms of compositions of simpler ones. In a voxel world, we show that a community of users can simultaneously teach a common system a diverse language and use it to build hundreds of complex voxel structures. Over the course of three days, these users went from using only the core language to using the naturalized language in 85.9% of the last 10K utterances.

[15]
Title: Translating Neuralese
Comments: To appear in ACL 2017
Subjects: Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)

Several approaches have recently been proposed for learning decentralized deep multiagent policies that coordinate via a differentiable communication channel. While these policies are effective for many tasks, interpretation of their induced communication strategies has remained a challenge. Here we propose to interpret agents' messages by translating them. Unlike in typical machine translation problems, we have no parallel data to learn from. Instead we develop a translation model based on the insight that agent messages and natural language strings mean the same thing if they induce the same belief about the world in a listener. We present theoretical guarantees and empirical evidence that our approach preserves both the semantics and pragmatics of messages by ensuring that players communicating through a translation layer do not suffer a substantial loss in reward relative to players with a common language.

[16]
Title: Differentiable Scheduled Sampling for Credit Assignment
Comments: Accepted at ACL2017 (this http URL)
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models. By incorporating this approximation into the scheduled sampling training procedure (Bengio et al., 2015)--a well-known technique for correcting exposure bias--we introduce a new training objective that is continuous and differentiable everywhere and that can provide informative gradients near points where previous decoding decisions change their value. In addition, by using a related approximation, we demonstrate a similar approach to sample-based training. Finally, we show that our approach outperforms cross-entropy training and scheduled sampling procedures in two sequence prediction tasks: named entity recognition and machine translation.
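The relaxation can be illustrated with a soft argmax over word embeddings. This is a generic sketch of a temperature-controlled softmax mixture, with random placeholder logits and embeddings, not necessarily the paper's exact formulation: instead of feeding the embedding of the argmax word back into the decoder (a non-differentiable step), one feeds a softmax-weighted mixture of all embeddings, which approaches the hard argmax embedding as the temperature goes to zero.

```python
import numpy as np

# Continuous relaxation of argmax: a differentiable "next input" for the
# decoder, computed as a temperature-controlled mixture of embeddings.

def soft_argmax_embedding(logits, embeddings, temperature=1.0):
    z = logits / temperature
    z = z - z.max()                       # stabilize the softmax
    probs = np.exp(z) / np.exp(z).sum()
    return probs @ embeddings             # weighted mixture: differentiable

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 8))               # 5 words, 8-dim embeddings
logits = np.array([0.1, 2.0, 0.3, -1.0, 0.5])

soft = soft_argmax_embedding(logits, E, temperature=0.01)
hard = E[np.argmax(logits)]               # the non-differentiable choice
assert np.allclose(soft, hard, atol=1e-4)  # low temperature ≈ hard argmax
```

At higher temperatures the mixture spreads mass over competing words, which is what provides gradients near points where the greedy decision would flip.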

[17]
Title: Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling
Subjects: Computation and Language (cs.CL)

Fixed-vocabulary language models fail to account for one of the most characteristic statistical facts of natural language: the frequent creation and reuse of new word types. Although character-level language models offer a partial solution in that they can create word types not attested in the training corpus, they do not capture the "bursty" distribution of such words. In this paper, we augment a hierarchical LSTM language model that generates sequences of word tokens character by character with a caching mechanism that learns to reuse previously generated words. To validate our model we construct a new open-vocabulary language modeling corpus (the Multilingual Wikipedia Corpus, MWC) from comparable Wikipedia articles in 7 typologically diverse languages and demonstrate the effectiveness of our model across this range of languages.
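The reuse-vs-create decision in such a cached model can be sketched in miniature. The mixture probability and the "new word" generator below are hypothetical placeholders (the real model learns the mixture weight and spells out new words with a character-level LSTM); the point is only the caching control flow that produces bursty reuse.

```python
import random

# Minimal sketch of a word cache for open-vocabulary generation: at each
# step either reuse a previously generated word from the cache or create
# a fresh word type (here a dummy string standing in for the
# character-by-character generator).

def generate(n_tokens, p_reuse=0.5, seed=0):
    rng = random.Random(seed)
    cache, out = [], []
    for _ in range(n_tokens):
        if cache and rng.random() < p_reuse:
            word = rng.choice(cache)   # reuse: copy a cached word
        else:
            word = "w%d" % len(cache)  # create: stand-in for the
            # character-level generator producing a new word type
        cache.append(word)             # recently used words can recur,
        out.append(word)               # yielding "bursty" repetition
    return out

tokens = generate(20)
assert len(tokens) == 20
assert len(set(tokens)) < 20  # some word types were reused
```

The cache gives new word types a second path back into the text, which is exactly the bursty behavior a plain character-level model fails to capture.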

[18]
Title: Fast and Accurate Neural Word Segmentation for Chinese
Subjects: Computation and Language (cs.CL)

Neural models with minimal feature engineering have achieved competitive performance against traditional methods for the task of Chinese word segmentation. However, both the training and inference procedures of current neural models are computationally inefficient. This paper presents a greedy neural word segmenter with balanced word and character embedding inputs to alleviate these drawbacks. Our segmenter is truly end-to-end, and performs segmentation much faster than state-of-the-art neural models while being even more accurate on Chinese benchmark datasets.

[19]
Title: Using Global Constraints and Reranking to Improve Cognates Detection
Comments: 10 pages, 6 figures, 6 tables; to appear in the Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

Global constraints and reranking have not been used in cognates detection research to date. We propose methods for applying global constraints by rescoring the score matrices produced by state-of-the-art cognates detection systems. Rescoring with global constraints is complementary to existing detection methods and yields significant improvements beyond the current state of the art on publicly available datasets spanning different language pairs and a variety of conditions, such as different levels of baseline performance and different data sizes, including more realistic large-data conditions than have been evaluated in the past.

[20]
Title: Selective Encoding for Abstractive Sentence Summarization
Comments: 10 pages; To appear in ACL 2017
Subjects: Computation and Language (cs.CL)

We propose a selective encoding model to extend the sequence-to-sequence framework for abstractive sentence summarization. It consists of a sentence encoder, a selective gate network, and an attention-equipped decoder. The sentence encoder and decoder are built with recurrent neural networks. The selective gate network constructs a second-level sentence representation by controlling the information flow from encoder to decoder. This second-level representation is tailored for the sentence summarization task, which leads to better performance. We evaluate our model on the English Gigaword, DUC 2004 and MSR abstractive sentence summarization datasets. The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
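A minimal sketch of what such a gate might look like, with hypothetical weight names (`Ws`, `Us`) and random placeholder values: each encoder state is filtered element-wise by a sigmoid gate computed from that state together with a whole-sentence vector, so the decoder only sees information the gate lets through.

```python
import numpy as np

# Sketch of a selective gate: h'_i = h_i * sigmoid(Ws h_i + Us s + b),
# where H holds per-word encoder states and s is a sentence-level vector.

rng = np.random.default_rng(0)
d = 16                          # hidden size
H = rng.normal(size=(7, d))     # encoder states for a 7-word sentence
s = H[-1]                       # sentence vector (e.g., the final state)

Ws = rng.normal(size=(d, d))    # hypothetical gate parameters
Us = rng.normal(size=(d, d))
b = np.zeros(d)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

gate = sigmoid(H @ Ws.T + s @ Us.T + b)  # one gate vector per word
H2 = H * gate                            # second-level representation
assert H2.shape == H.shape
assert np.all((gate > 0) & (gate < 1))   # gates attenuate, never amplify
```

Because each gate value lies in (0, 1), the second-level states can only suppress dimensions of the originals, which matches the abstract's description of controlling information flow from encoder to decoder.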

[21]
Title: Robust Incremental Neural Semantic Graph Parsing
Comments: 12 pages; Accepted to ACL 2017
Subjects: Computation and Language (cs.CL)

Parsing sentences to linguistically-expressive semantic representations is a key goal of Natural Language Processing. Yet statistical parsing has focused almost exclusively on bilexical dependencies or domain-specific logical forms. We propose a neural encoder-decoder transition-based parser which is the first full-coverage semantic graph parser for Minimal Recursion Semantics (MRS). The model architecture uses stack-based embedding features, predicting graphs jointly with unlexicalized predicates and their token alignments. Our parser is more accurate than attention-based baselines on MRS, and on an additional Abstract Meaning Representation (AMR) benchmark, and GPU batch processing makes it an order of magnitude faster than a high-precision grammar-based parser. Further, the 86.69% Smatch score of our MRS parser is higher than the upper-bound on AMR parsing, making MRS an attractive choice as a semantic representation.

[22]
Title: Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Visual question answering (QA) has attracted a lot of attention lately, seen essentially as a form of (visual) Turing test that artificial intelligence should strive to achieve. In this paper, we study a crucial component of this task: how can we design good datasets for the task? We focus on the design of multiple-choice based datasets where the learner has to select the right answer from a set of candidate ones including the target (i.e. the correct one) and the decoys (i.e. the incorrect ones). Through careful analysis of the results attained by state-of-the-art learning models and human annotators on existing datasets, we show that the design of the decoy answers has a significant impact on how and what the learning models learn from the datasets. In particular, the resulting learner can ignore the visual information, the question, or both, while still doing well on the task. Inspired by this, we propose automatic procedures to remedy such design deficiencies. We apply the procedures to reconstruct decoy answers for two popular visual QA datasets as well as to create a new visual QA dataset from the Visual Genome project, resulting in the largest dataset for this task. Extensive empirical studies show that the design deficiencies have been alleviated in the remedied datasets and the performance on them is likely a more faithful indicator of the difference among learning models. The datasets are released and publicly available via this http URL

[23]
Title: An Analysis of Action Recognition Datasets for Language and Vision Tasks
Comments: To appear in Proceedings of ACL 2017, 8 pages
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

A large amount of recent research has focused on tasks that combine language and vision, resulting in a proliferation of datasets and methods. One such task is action recognition, whose applications include image annotation, scene understanding and image retrieval. In this survey, we categorize the existing approaches based on how they conceptualize this problem and provide a detailed review of existing datasets, highlighting their diversity as well as advantages and disadvantages. We focus on recently developed datasets which link visual information with linguistic resources and provide a fine-grained syntactic and semantic analysis of actions in images.

[24]
Title: Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings
Subjects: Computation and Language (cs.CL)

We study a symmetric collaborative dialogue setting in which two agents, each with private knowledge, must strategically communicate to achieve a common goal. The open-ended dialogue state in this setting poses new challenges for existing dialogue systems. We collected a dataset of 11K human-human dialogues, which exhibits interesting lexical, semantic, and strategic elements. To model both structured knowledge and unstructured language, we propose a neural model with dynamic knowledge graph embeddings that evolve as the dialogue progresses. Automatic and human evaluations show that our model is both more effective at achieving the goal and more human-like than baseline neural and rule-based models.

[25]
Title: Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
Authors: Chris Hokamp, Qun Liu
Comments: Accepted as a long paper at ACL 2017
Subjects: Computation and Language (cs.CL)

We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. The algorithm can be used with any model that generates a sequence $\mathbf{\hat{y}} = \{y_{0}\ldots y_{T}\}$, by maximizing $p(\mathbf{y} | \mathbf{x}) = \prod\limits_{t}p(y_{t} | \mathbf{x}; \{y_{0} \ldots y_{t-1}\})$. Lexical constraints take the form of phrases or words that must be present in the output sequence. This is a very general way to incorporate additional knowledge into a model's output without requiring any modification of the model parameters or training data. We demonstrate the feasibility and flexibility of Lexically Constrained Decoding by conducting experiments on Neural Interactive-Predictive Translation, as well as Domain Adaptation for Neural Machine Translation. Experiments show that GBS can provide large improvements in translation quality in interactive scenarios, and that, even without any user input, GBS can be used to achieve significant gains in performance in domain adaptation scenarios.
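A toy sketch of the grid organization, under simplifying assumptions (single-token constraints, a made-up bigram-style scorer, no real sequence model): hypotheses are grouped by the number of constraint tokens they have covered, and at each step a hypothesis either generates freely from the model or emits the next unmet constraint token.

```python
# Toy sketch of the grid in GBS. grid[c] holds hypotheses that have
# covered c constraint tokens. Expanding a hypothesis either keeps c
# (free generation from the model) or moves it to c + 1 (forced
# generation of the next constraint token). Only hypotheses that end
# with all constraints covered are valid outputs.

def grid_beam_search(score_fn, vocab, constraints, length, beam_size=2):
    C = len(constraints)
    grid = {c: [] for c in range(C + 1)}  # c -> list of (score, sequence)
    grid[0] = [(0.0, [])]
    for _ in range(length):
        new_grid = {c: [] for c in range(C + 1)}
        for c, beam in grid.items():
            for score, seq in beam:
                for w in vocab:  # free generation: c unchanged
                    new_grid[c].append((score + score_fn(seq, w), seq + [w]))
                if c < C:        # forced generation of the next constraint
                    w = constraints[c]
                    new_grid[c + 1].append((score + score_fn(seq, w), seq + [w]))
        grid = {c: sorted(b, reverse=True)[:beam_size]  # prune each cell
                for c, b in new_grid.items()}
    return max(grid[C], default=None)  # best fully-constrained hypothesis

def score_fn(seq, w):  # hypothetical scorer: mildly prefers rising tokens
    prev = seq[-1] if seq else -1
    return 0.0 if w > prev else -1.0

best = grid_beam_search(score_fn, vocab=[1, 2, 3], constraints=[2], length=3)
assert best is not None and 2 in best[1]  # the constraint token appears
```

Because the model is never modified, the same mechanism works with any left-to-right scorer; the real algorithm additionally tracks partially-generated multi-word phrases within each grid cell.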

[26]
Title: Found in Translation: Reconstructing Phylogenetic Language Trees from Translations
Subjects: Computation and Language (cs.CL)

Translation has played an important role in trade, law, commerce, politics, and literature for thousands of years. Translators have always tried to be invisible; ideal translations should look as if they were written originally in the target language. We show that traces of the source language remain in the translation product to the extent that it is possible to uncover the history of the source language by looking only at the translation. Specifically, we automatically reconstruct phylogenetic language trees from monolingual texts (translated from several source languages). The signal of the source language is so powerful that it is retained even after two phases of translation. This strongly indicates that source language interference is the most dominant characteristic of translated texts, overshadowing the more subtle signals of universal properties of translation.

[27]
Title: Semi-supervised Multitask Learning for Sequence Labeling
Authors: Marek Rei
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data.

[28]
Title: Watset: Automatic Induction of Synsets from a Graph of Synonyms
Comments: 12 pages, 3 figures, 6 tables, accepted to ACL 2017
Subjects: Computation and Language (cs.CL)

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clustering approach lets us use an efficient hard clustering algorithm to perform a fuzzy clustering of the graph. Despite its simplicity, our approach shows excellent results, outperforming five competitive state-of-the-art methods in terms of F-score on three gold standard datasets for English and Russian derived from large-scale manually constructed lexical resources.

[29]
Title: What is the Essence of a Claim? Cross-Domain Claim Identification
Subjects: Computation and Language (cs.CL)

Argument mining has become a popular research area in NLP. It typically includes the identification of argumentative components, e.g. claims, as the central component of an argument. We perform a qualitative analysis across six different datasets and show that these appear to conceptualize claims quite differently. To learn about the consequences of such different conceptualizations of claim for practical applications, we carried out extensive experiments using state-of-the-art feature-rich and deep learning systems, to identify claims in a cross-domain fashion. While the divergent perception of claims in different datasets is indeed harmful to cross-domain classification, we show that there are shared properties on the lexical level as well as system configurations that can help to overcome these gaps.

[30]
Title: Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM
Comments: SemEval 2017 RumourEval: Determining rumour veracity and support for rumours (SemEval 2017 Task 8, Subtask A)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

This paper describes team Turing's submission to SemEval 2017 RumourEval: Determining rumour veracity and support for rumours (SemEval 2017 Task 8, Subtask A). Subtask A addresses the challenge of rumour stance classification, which involves identifying the attitude of Twitter users towards the truthfulness of the rumour they are discussing. Stance classification is considered an important step towards rumour verification; performing well on this task is therefore expected to be useful in debunking false rumours. In this work we classify a set of Twitter posts discussing rumours as supporting, denying, questioning or commenting on the underlying rumours. We propose an LSTM-based sequential model that, by modelling the conversational structure of tweets, achieves an accuracy of 0.784 on the RumourEval test set, outperforming all other systems in Subtask A.

[31]
Title: Joint Modeling of Text and Acoustic-Prosodic Cues for Neural Parsing
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Sound (cs.SD)

In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing a spoken utterance, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and word-based prosodic features. We find that different types of acoustic-prosodic features are individually helpful, and together improve parse F1 scores significantly over a strong text-only baseline. For this study with known sentence boundaries, error analysis shows that the main benefit of acoustic-prosodic features is in sentences with disfluencies and that attachment errors are most improved.

[32]
Title: A Trie-Structured Bayesian Model for Unsupervised Morphological Segmentation
Comments: 12 pages, accepted and presented at the CICLING 2017 - 18th International Conference on Intelligent Text Processing and Computational Linguistics
Subjects: Computation and Language (cs.CL)

In this paper, we introduce a trie-structured Bayesian model for unsupervised morphological segmentation. We adopt prior information from different sources in the model. We use neural word embeddings to discover words that are morphologically derived from each other and are therefore semantically similar, and we use letter successor variety counts obtained from tries that are built with the help of neural word embeddings. Our results show that using different information sources, such as neural word embeddings and letter successor variety, as prior information improves morphological segmentation in a Bayesian model. Our model outperforms other unsupervised morphological segmentation models on Turkish and gives promising results on English and German in resource-scarce settings.

### Replacements for Tue, 25 Apr 17

[33]  arXiv:1601.03896 (replaced) [pdf, ps, other]
Title: Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
Comments: Journal of Artificial Intelligence Research 55, 409-442, 2016
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[34]  arXiv:1604.04562 (replaced) [pdf, other]
Title: A Network-based End-to-End Trainable Task-oriented Dialogue System
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[35]  arXiv:1606.01549 (replaced) [pdf, other]
Title: Gated-Attention Readers for Text Comprehension
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
[36]  arXiv:1606.03632 (replaced) [pdf, other]
Title: Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data
Subjects: Computation and Language (cs.CL)
[37]  arXiv:1608.05604 (replaced) [pdf, ps, other]
Title: Modeling Human Reading with Neural Attention
Comments: EMNLP 2016, pp. 85-95, Austin, TX
Subjects: Computation and Language (cs.CL)
[38]  arXiv:1611.00020 (replaced) [pdf, other]
Title: Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
[39]  arXiv:1611.08034 (replaced) [pdf, other]
Title: Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
[40]  arXiv:1702.01569 (replaced) [pdf, other]
Title: Neural Semantic Parsing over Multiple Knowledge-bases
Subjects: Computation and Language (cs.CL)
[41]  arXiv:1702.02206 (replaced) [pdf, other]
Title: Semi-Supervised QA with Generative Domain-Adaptive Nets
Comments: Accepted as a long paper at ACL2017
Subjects: Computation and Language (cs.CL); Learning (cs.LG)
[42]  arXiv:1702.03525 (replaced) [pdf, other]
Title: Learning to Parse and Translate Improves Neural Machine Translation
Comments: Accepted as a short paper at the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017)
Subjects: Computation and Language (cs.CL)
[43]  arXiv:1704.04100 (replaced) [pdf, ps, other]
Title: Cross-lingual and cross-domain discourse segmentation of entire documents
Comments: To appear in Proceedings of ACL 2017
Subjects: Computation and Language (cs.CL)
[44]  arXiv:1704.06104 (replaced) [pdf, ps, other]
Title: Neural End-to-End Learning for Computational Argumentation Mining
Comments: To be published at ACL 2017
Subjects: Computation and Language (cs.CL)
[45]  arXiv:1408.4245 (replaced) [pdf, other]
Title: Towards crowdsourcing and cooperation in linguistic resources
Authors: Dmitry Ustalov
Comments: 11 pages, 2 figures, accepted to RuSSIR 2014, the final publication is available at link.springer.com
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL)
[46]  arXiv:1611.04642 (replaced) [pdf, other]
Title: Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
[47]  arXiv:1611.08669 (replaced) [pdf, other]
Title: Visual Dialog
Comments: 23 pages, 18 figures, CVPR 2017 camera-ready, results on VisDial v0.9 dataset, Webpage: this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)
[48]  arXiv:1702.03274 (replaced) [pdf, other]
Title: Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning
Comments: Accepted as a long paper for the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017)
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
