
Computation and Language

New submissions

[ total of 12 entries: 1-12 ]

New submissions for Fri, 22 Sep 17

[1]  arXiv:1709.07104 [pdf, ps, other]
Title: On the Use of Machine Translation-Based Approaches for Vietnamese Diacritic Restoration
Comments: 4 pages, 2 figures, 4 tables, submitted to IALP 2017
Subjects: Computation and Language (cs.CL)

This paper presents an empirical study of two machine translation-based approaches to the Vietnamese diacritic restoration problem: phrase-based and neural machine translation models. This is the first work to apply a neural machine translation method to this problem and to give a thorough comparison with the phrase-based machine translation method, the current state of the art for this task. On a large dataset, the phrase-based approach achieves an accuracy of 97.32%, while the neural-based approach achieves 96.15%. Although the neural-based method has slightly lower accuracy, it is about twice as fast as the phrase-based method at inference time. Moreover, the neural-based method has considerable room for future improvement, such as incorporating pre-trained word embeddings and collecting more training data.
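MT-based restoration treats undiacritized text as the source language and diacritized text as the target, so a parallel corpus can be built from any diacritized Vietnamese text by stripping the marks. A minimal sketch of that preprocessing step (illustrative, not from the paper; note that "đ" carries a stroke rather than a combining mark, so it needs special handling):

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Remove Vietnamese tone and vowel marks, keeping base letters."""
    # "đ"/"Đ" do not decompose into base + combining mark, so map them by hand.
    text = text.replace("\u0111", "d").replace("\u0110", "D")
    # Decompose, drop combining marks, recompose.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

# Each (source, target) pair is one training example for the MT model:
# the "translation" restores the diacritics.
diacritized = "tiếng Việt"
pair = (strip_diacritics(diacritized), diacritized)
```

The same stripping function is also what maps user input into the model's source vocabulary at inference time.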

[2]  arXiv:1709.07109 [pdf, other]
Title: Deconvolutional Latent-Variable Model for Text Sequence Matching
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization. Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long Short-Term Memory (LSTM), with fewer parameters and considerably faster training. Further, we apply it to text sequence-matching problems. The proposed model significantly outperforms several strong sentence-encoding baselines, especially in the semi-supervised setting.
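A deconvolutional (transposed-convolution) decoder upsamples a fixed-length latent code back into a sequence in parallel, rather than generating token by token as an LSTM does. A pure-Python sketch of a single transposed 1-D convolution layer, with toy values (the paper's decoder stacks learned layers; nothing here is its actual architecture):

```python
def conv_transpose_1d(latent, kernel, stride=2):
    """One transposed 1-D convolution: each latent value 'paints' the kernel
    into the output at stride-spaced offsets, with overlaps summed."""
    out_len = (len(latent) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, z in enumerate(latent):
        for j, w in enumerate(kernel):
            out[i * stride + j] += z * w
    return out

# A length-2 latent code expands into a length-5 feature sequence; stacking
# such layers grows a short code toward full sentence length.
expanded = conv_transpose_1d([1.0, 2.0], [1.0, 1.0, 1.0], stride=2)
```

Because every output position is computed independently of the previous one, this decoder avoids the sequential bottleneck (and some of the optimization pathologies) of recurrent generators.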

[3]  arXiv:1709.07276 [pdf, other]
Title: Speech Recognition Challenge in the Wild: Arabic MGB-3
Subjects: Computation and Language (cs.CL)

This paper describes the Arabic MGB-3 Challenge - Arabic Speech Recognition in the Wild. Unlike last year's Arabic MGB-2 Challenge, whose recognition task was based on more than 1,200 hours of broadcast TV news recordings from Aljazeera Arabic TV programs, MGB-3 emphasises dialectal Arabic using a multi-genre collection of Egyptian YouTube videos. Seven genres were used for the data collection: comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx). A total of 16 hours of videos, split evenly across the different genres, were divided into adaptation, development and evaluation data sets. The Arabic MGB-3 Challenge comprised two tasks: A) speech transcription, evaluated on the MGB-3 test set, along with the 10-hour MGB-2 test set to report progress on the MGB-2 evaluation; B) Arabic dialect identification, introduced this year to distinguish between four major Arabic dialects (Egyptian, Levantine, North African, and Gulf) as well as Modern Standard Arabic. Two hours of audio per dialect were released for development, and a further two hours were used for evaluation. For dialect identification, both lexical features and i-vector bottleneck features were shared with participants in addition to the raw audio recordings. Overall, thirteen teams submitted ten systems to the challenge. We outline the approaches adopted in each system and summarise the evaluation results.

[4]  arXiv:1709.07357 [pdf]
Title: Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness
Comments: To appear in: Proceedings of the 16th World Congress on Medical and Health Informatics 21st-25th August Hangzhou, China (2017). Please visit and cite the canonical version once available
Subjects: Computation and Language (cs.CL)

Estimation of semantic similarity and relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two categories: methods based on distributional statistics drawn from text corpora, and methods using the structure of existing knowledge resources. Methods in the former category disregard taxonomic structure, while those in the latter fail to consider semantically relevant empirical information. In this paper, we present a method that retrofits distributional context vector representations of biomedical concepts using structural information from the UMLS Metathesaurus, such that the similarity between vector representations of linked concepts is augmented. We evaluated it on the UMNSRS benchmark. Our results demonstrate that retrofitting concept vector representations leads to better correlation with human raters for both similarity and relatedness, surpassing the best results reported to date on this reference standard, and that the retrofitted representations clearly outperform the same representations without retrofitting.
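Retrofitting in this sense iteratively pulls each concept's vector toward the mean of its neighbours in the knowledge graph while keeping it anchored to its original distributional vector. A minimal sketch of the standard retrofitting update (in the style of Faruqui et al.'s algorithm; the paper's exact variant and weighting may differ, and the concept names below are toy examples):

```python
def retrofit(dist_vecs, links, iterations=10, alpha=1.0):
    """Move each vector toward the mean of its linked neighbours while
    staying close to its original distributional vector (weight alpha)."""
    vecs = {w: list(v) for w, v in dist_vecs.items()}  # work on copies
    for _ in range(iterations):
        for word, nbrs in links.items():
            nbrs = [n for n in nbrs if n in vecs]
            if not nbrs:
                continue
            for d in range(len(vecs[word])):
                neighbour_sum = sum(vecs[n][d] for n in nbrs)
                vecs[word][d] = (alpha * dist_vecs[word][d] + neighbour_sum) / (alpha + len(nbrs))
    return vecs

# Two synonymous concepts linked in the thesaurus end up closer together.
dist = {"aspirin": [1.0, 0.0], "acetylsalicylic_acid": [0.0, 1.0]}
links = {"aspirin": ["acetylsalicylic_acid"], "acetylsalicylic_acid": ["aspirin"]}
retro = retrofit(dist, links)
```

The closed-form update is the minimizer of a convex objective, so a handful of iterations suffices in practice.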

[5]  arXiv:1709.07403 [pdf, other]
Title: Inducing Distant Supervision in Suggestion Mining through Part-of-Speech Embeddings
Subjects: Computation and Language (cs.CL)

Mining suggestion-expressing sentences from a given text is a little-investigated sentence classification task and therefore lacks hand-labeled benchmark datasets. In this work, we propose and evaluate two approaches for distant supervision in suggestion mining. The distant supervision is obtained through a large silver-standard dataset constructed from wikiHow and Wikipedia text. Both approaches use an LSTM-based neural network architecture to learn a classification model for suggestion mining, but differ in how they use the silver-standard dataset. The first approach trains the classifier directly on this dataset, while the second only learns word embeddings from it. In the second approach, we also learn POS embeddings, which, interestingly, gives the best classification accuracy.
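In the second approach, each token's LSTM input is the concatenation of its word embedding and a learned embedding of its part-of-speech tag. A toy sketch of that feature construction (dimensions, values, and the lookup-table API are illustrative assumptions, not the paper's; "MD" is the Penn Treebank tag for modals, which often signal suggestions):

```python
def token_vector(word, pos, word_emb, pos_emb, w_dim=4, p_dim=2):
    """Concatenate a word embedding with a POS-tag embedding; unknown
    words fall back to a zero vector (a common, simple choice)."""
    wv = word_emb.get(word, [0.0] * w_dim)
    pv = pos_emb.get(pos, [0.0] * p_dim)
    return wv + pv

word_emb = {"should": [0.1, 0.2, 0.3, 0.4]}  # toy values, not pre-trained
pos_emb = {"MD": [1.0, 0.0]}                 # toy POS embedding table
vec = token_vector("should", "MD", word_emb, pos_emb)  # 6-dim LSTM input
```

The appeal of the POS channel is that suggestion cues (modals, imperatives) generalize across vocabulary, so the tag embedding carries signal even for unseen words.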

[6]  arXiv:1709.07434 [pdf, other]
Title: Analyzing users' sentiment towards popular consumer industries and brands on Twitter
Comments: 8 pages, 11 figures, 1 table, 2017 IEEE International Conference on Data Mining Workshops (ICDMW 2017), ICDM Sentiment Elicitation from Natural Text for Information Retrieval and Extraction (ICDM SENTIRE) 2017 workshop
Journal-ref: 2017 IEEE International Conference on Data Mining Workshops (ICDMW 2017)
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

Social media serves as a unified platform for users to express their thoughts on subjects ranging from their daily lives to their opinions on consumer brands and products. These users wield enormous influence in shaping the opinions of other consumers and influence brand perception, brand loyalty and brand advocacy. In this paper, we analyze the opinion of 19M Twitter users towards 62 popular industries, encompassing 12,898 enterprise and consumer brands, as well as associated subject matter topics, via sentiment analysis of 330M tweets over a one-month period. We find that users tend to be most positive towards manufacturing and most negative towards service industries. In addition, they tend to be more positive or negative when interacting with brands than they are on Twitter in general. We also find that sentiment towards brands within an industry varies greatly, and we demonstrate this using two industries as use cases. In addition, we discover that there is no strong correlation between topic sentiments of different industries, demonstrating that topic sentiments are highly dependent on the context of the industry in which they are mentioned. We demonstrate the value of such an analysis for assessing the impact of brands on social media. We hope that this initial study will prove valuable for both researchers and companies in understanding users' perception of industries, brands and associated topics, and encourage more research in this field.

Cross-lists for Fri, 22 Sep 17

[7]  arXiv:1709.07432 (cross-list from cs.NE) [pdf, other]
Title: Dynamic Evaluation of Neural Sequence Models
Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL)

We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, allowing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation is demonstrated to compare favourably with existing adaptation approaches for language modelling. We apply dynamic evaluation to improve the state of the art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state of the art character-level cross-entropy on the Hutter prize dataset to 1.17 bits/character.
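Dynamic evaluation scores each piece of the test sequence and then takes a gradient step on that piece's loss before moving on, so re-occurring patterns become cheaper to predict. A toy sketch on a bag-of-symbols softmax "model" (the paper adapts full neural sequence-model weights; the function and setup here are illustrative only):

```python
import math

def dynamic_eval_nll(seq, vocab, lr=0.5):
    """Score each symbol under a softmax over per-symbol logits, then take
    one SGD step on that symbol's cross-entropy before the next symbol."""
    logits = {v: 0.0 for v in vocab}
    total = 0.0
    for sym in seq:
        z = sum(math.exp(l) for l in logits.values())
        probs = {v: math.exp(l) / z for v, l in logits.items()}
        total += -math.log(probs[sym])
        # d(loss)/d(logit_v) = probs[v] - 1[v == sym]
        for v in vocab:
            logits[v] -= lr * (probs[v] - (1.0 if v == sym else 0.0))
    return total / len(seq)

# A static uniform model scores "aaaa" at log(2) nats per symbol; adapting
# online does strictly better after the first symbol.
```

The same idea scales up to RNN language models by applying the gradient step to the network weights over sliding segments of the evaluation text.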

Replacements for Fri, 22 Sep 17

[8]  arXiv:1705.08947 (replaced) [pdf, other]
Title: Deep Voice 2: Multi-Speaker Neural Text-to-Speech
Comments: Accepted in NIPS 2017
Subjects: Computation and Language (cs.CL)
[9]  arXiv:1708.07241 (replaced) [pdf, other]
Title: NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit
Comments: 4 pages, 5 figures, 5 tables, accepted to IJCNLP 2017, updated experiment results
Subjects: Computation and Language (cs.CL)
[10]  arXiv:1708.09803 (replaced) [pdf, other]
Title: Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation
Subjects: Computation and Language (cs.CL)
[11]  arXiv:1709.00103 (replaced) [pdf, other]
Title: Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
Comments: 12 pages, 5 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[12]  arXiv:1709.04558 (replaced) [pdf]
Title: Using NLU in Context for Question Answering: Improving on Facebook's bAbI Tasks
Authors: John S. Ball
Comments: 38 Pages, 10 Tables
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
