
Computation and Language

New submissions

[ total of 12 entries: 1-12 ]

New submissions for Fri, 26 May 17

[1]  arXiv:1705.08942 [pdf, other]
Title: Joint PoS Tagging and Stemming for Agglutinative Languages
Comments: 12 pages with 3 figures, accepted and presented at the CICLING 2017 - 18th International Conference on Intelligent Text Processing and Computational Linguistics
Journal-ref: CICLING 2017
Subjects: Computation and Language (cs.CL)

The number of word forms in agglutinative languages is theoretically infinite, and this variety introduces sparsity into many natural language processing tasks. Part-of-speech (PoS) tagging is one task that often suffers from sparsity. In this paper, we present an unsupervised Bayesian model based on Hidden Markov Models (HMMs) for joint PoS tagging and stemming in agglutinative languages. We use stemming to reduce sparsity in PoS tagging, and the two tasks are performed jointly so that each benefits the other. Our results show that joint PoS tagging and stemming improves PoS tagging scores. We present results for Turkish and Finnish as agglutinative languages and for English as a morphologically poor language.
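A minimal sketch of why stemming helps with sparsity: mapping inflected surface forms to stems shrinks the vocabulary the tagger must model. The suffix list and naive stripping rule below are made-up stand-ins for illustration, not the paper's Bayesian model.

```python
# Toy illustration: stemming shrinks the emission vocabulary of a tagger
# for an agglutinative language. The suffixes and the "stemmer" are
# hypothetical Turkish-like stand-ins, not the paper's model.

TOY_SUFFIXES = ["lerin", "ler", "lar", "in", "i"]  # assumed endings

def toy_stem(word):
    """Strip the longest matching suffix, if any (naive stand-in stemmer)."""
    for suf in sorted(TOY_SUFFIXES, key=len, reverse=True):
        if word.endswith(suf) and len(word) > len(suf) + 1:
            return word[: -len(suf)]
    return word

corpus = ["evler", "evlerin", "evi", "ev", "kitaplar", "kitap", "kitabi"]
surface_types = set(corpus)
stem_types = {toy_stem(w) for w in corpus}
print(len(surface_types), len(stem_types))  # fewer stem types than surface types
```

Seven distinct surface forms collapse to three stems here, which is the sparsity reduction the abstract appeals to.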

[2]  arXiv:1705.08947 [pdf, other]
Title: Deep Voice 2: Multi-Speaker Neural Text-to-Speech
Comments: Submitted to NIPS 2017
Subjects: Computation and Language (cs.CL)

We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional trainable speaker embeddings to generate different voices from a single model. As a starting point, we show improvements over the two state-of-the-art approaches for single-speaker neural TTS: Deep Voice 1 and Tacotron. We introduce Deep Voice 2, which is based on a pipeline similar to that of Deep Voice 1 but is constructed from higher-performance building blocks, and demonstrate a significant audio quality improvement over Deep Voice 1. We also improve Tacotron by introducing a post-processing neural vocoder, again demonstrating a significant audio quality improvement. We then demonstrate our technique for multi-speaker speech synthesis with both Deep Voice 2 and Tacotron on two multi-speaker TTS datasets. We show that a single neural TTS system can learn hundreds of unique voices from less than half an hour of data per speaker, while achieving high-quality audio synthesis and preserving the speaker identities almost perfectly.
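The core idea can be sketched very simply: one shared synthesis function serves many voices, with a small trainable embedding selecting the speaker. The two-dimensional embeddings and the toy `synthesize` function below are illustrative stand-ins, not Deep Voice 2's actual architecture.

```python
# Toy sketch: a single shared model conditioned on a low-dimensional,
# per-speaker embedding. Values and the conditioning rule are invented.

speaker_embeddings = {            # one small trainable vector per speaker
    "speaker_a": [0.9, -0.1],
    "speaker_b": [-0.4, 0.7],
}

def synthesize(text_features, speaker_id):
    """Condition shared computation on the speaker embedding (toy version)."""
    emb = speaker_embeddings[speaker_id]
    bias = sum(emb) / len(emb)    # e.g. a projection of the embedding
    return [f + bias for f in text_features]

feats = [0.0, 1.0, 2.0]
out_a = synthesize(feats, "speaker_a")
out_b = synthesize(feats, "speaker_b")
print(out_a, out_b)               # same input text, different voices
```

The point is that only the embedding table grows with the number of speakers; the shared weights are trained once for all voices.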

[3]  arXiv:1705.09054 [pdf, ps, other]
Title: Max-Cosine Matching Based Neural Models for Recognizing Textual Entailment
Journal-ref: DASFAA (1) 2017: 295-308
Subjects: Computation and Language (cs.CL)

Recognizing textual entailment (RTE) is a fundamental task in a variety of text mining and natural language processing applications. This paper proposes a simple neural model for the RTE problem. It first matches each word in the hypothesis with its most-similar word in the premise, producing an augmented representation of the hypothesis conditioned on the premise as a sequence of word pairs. An LSTM then models this augmented sequence, and its final output is fed into a softmax layer to make the prediction. Beyond the base model, we also propose three techniques to enhance its performance: the integration of multiple word-embedding libraries, bi-way integration, and an ensemble based on model averaging. Experimental results on the SNLI dataset show that the three techniques are effective in boosting predictive accuracy and that our method outperforms several state-of-the-art ones.
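The max-cosine matching step is concrete enough to sketch directly: each hypothesis word is paired with its most-similar premise word by cosine similarity. The three-dimensional toy embeddings are invented for illustration, and the LSTM that consumes the resulting word-pair sequence is omitted.

```python
import math

# Sketch of max-cosine matching: pair every hypothesis word with its
# best-matching premise word. Embeddings below are made-up toy values.
emb = {
    "a":      [1.0, 0.0, 0.0],
    "man":    [0.9, 0.3, 0.1],
    "person": [0.8, 0.4, 0.2],
    "sleeps": [0.0, 1.0, 0.0],
    "rests":  [0.1, 0.9, 0.3],
}

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def max_cosine_match(premise, hypothesis):
    """Augment the hypothesis as a sequence of (hypothesis, premise) pairs."""
    pairs = []
    for h in hypothesis:
        best = max(premise, key=lambda p: cosine(emb[h], emb[p]))
        pairs.append((h, best))
    return pairs

pairs = max_cosine_match(["a", "man", "sleeps"], ["a", "person", "rests"])
print(pairs)
```

Each pair conditions a hypothesis word on the premise word that best supports it, which is the augmented representation the abstract describes.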

[4]  arXiv:1705.09189 [pdf, other]
Title: Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs
Subjects: Computation and Language (cs.CL)

We introduce a neural network that represents sentences by composing their words according to induced binary parse trees. We use Tree-LSTM as our composition function, applied along a tree structure found by a fully differentiable natural language chart parser. Our model simultaneously optimises both the composition function and the parser, thus eliminating the need for the externally provided parse trees that Tree-LSTMs normally require. It can therefore be seen as a tree-based RNN that is unsupervised with respect to the parse trees. Because it is fully differentiable, our model is easily trained with an off-the-shelf gradient descent method and backpropagation. We demonstrate that it achieves better performance than various supervised Tree-LSTM architectures on a textual entailment task and a reverse dictionary task.
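A rough sketch of the chart-parsing idea: a CKY-style dynamic program considers every binary bracketing of the sentence and keeps the best-scoring tree. The paper's parser is fully differentiable (spans are soft-weighted rather than argmaxed), and the span scorer below is an invented stand-in for the learned model.

```python
# CKY-style induction of a binary tree under a toy span scorer.
# Hard argmax here; the paper uses a soft, differentiable relaxation.

def span_score(words, i, k, j):
    """Toy scorer for splitting span [i, j) at k (stand-in for the model)."""
    return -abs((k - i) - (j - k))  # mildly prefer balanced splits

def cky_best_tree(words):
    n = len(words)
    best = {}  # (i, j) -> (score, tree)
    for i in range(n):
        best[(i, i + 1)] = (0.0, words[i])
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cands = []
            for k in range(i + 1, j):
                s = best[(i, k)][0] + best[(k, j)][0] + span_score(words, i, k, j)
                cands.append((s, (best[(i, k)][1], best[(k, j)][1])))
            best[(i, j)] = max(cands, key=lambda c: c[0])
    return best[(0, n)][1]

tree = cky_best_tree(["the", "cat", "sat", "down"])
print(tree)
```

In the actual model, a Tree-LSTM composition would run over the (soft) chart instead of the tuple nesting used here.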

[5]  arXiv:1705.09207 [pdf, other]
Title: Learning Structured Text Representations
Subjects: Computation and Language (cs.CL)

In this paper, we focus on learning structure-aware document representations from data without recourse to a discourse parser or additional annotations. Drawing inspiration from recent efforts to empower neural networks with a structural bias, we propose a model that can encode a document while automatically inducing rich structural dependencies. Specifically, we embed a differentiable non-projective parsing algorithm into a neural model and use attention mechanisms to incorporate the structural biases. Experimental evaluation across different tasks and datasets shows that the proposed model achieves state-of-the-art results on document modeling tasks while inducing intermediate structures which are both interpretable and meaningful.
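The structural-bias idea can be sketched as attention over document units: score every pair of sentence-level units, normalise per unit, and use the result as soft dependency weights when composing the document representation. The paper embeds a differentiable non-projective parser (matrix-tree style); the plain softmax version below is a simplified stand-in, and the scores and vectors are toy values.

```python
import math

# Toy sketch of structure-as-attention over three sentence units.
def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

# scores[i][j] = affinity of unit i for unit j (invented values)
scores = [
    [0.0, 2.0, 1.0],
    [2.0, 0.0, 0.5],
    [1.0, 0.5, 0.0],
]
sentence_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attn = [softmax(row) for row in scores]
# structured context per unit: attention-weighted sum of unit vectors
contexts = [
    [sum(attn[i][j] * sentence_vecs[j][d] for j in range(3)) for d in range(2)]
    for i in range(3)
]
print(contexts)
```

The attention matrix plays the role of induced (soft) dependencies between document units; the paper replaces the softmax with a parser that constrains these weights to form valid tree structures.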

Cross-lists for Fri, 26 May 17

[6]  arXiv:1705.08992 (cross-list from cs.DM) [pdf, other]
Title: Matroids Hitting Sets and Unsupervised Dependency Grammar Induction
Subjects: Discrete Mathematics (cs.DM); Computation and Language (cs.CL); Data Structures and Algorithms (cs.DS)

This paper formulates a novel problem on graphs: find the minimal subset of edges in a fully connected graph such that the resulting graph contains all spanning trees for a set of specified sub-graphs. This formulation is motivated by an unsupervised grammar induction problem from computational linguistics. We present a reduction to known problems and algorithms from graph theory, provide computational complexity results, and describe an approximation algorithm.
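The problem statement can be illustrated with a simple feasible (but not minimal) construction: take one spanning tree per specified sub-graph, built with union-find, and keep the union of their edges. The toy instance below is invented; the paper's contribution is finding small such edge sets, which this greedy union does not attempt.

```python
# Feasible-solution sketch: the union of one spanning tree per specified
# sub-graph contains a spanning tree of each, trivially. Toy instance.

def spanning_tree(nodes, edges):
    """Kruskal-style spanning tree via union-find (unweighted)."""
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    tree = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree

# Two specified sub-graphs over a 4-node complete graph.
sub1 = ({1, 2, 3}, [(1, 2), (2, 3), (1, 3)])
sub2 = ({2, 3, 4}, [(2, 3), (3, 4), (2, 4)])

kept = set()
for nodes, edges in (sub1, sub2):
    kept.update(spanning_tree(nodes, edges))
print(sorted(kept))
```

Here the union happens to be small because the trees share the edge (2, 3); minimising the kept edge set in general is the hard problem the paper studies.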

[7]  arXiv:1705.09037 (cross-list from cs.NE) [pdf, other]
Title: Deriving Neural Architectures from Sequence and Graph Kernels
Comments: to appear at ICML 2017; includes additional discussions
Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL)

The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process. In this work, we appeal to kernels over combinatorial structures, such as sequences and graphs, to derive appropriate neural operations. We introduce a class of deep recurrent neural operations and formally characterize their associated kernel spaces. Our recurrent modules compare the input to virtual reference objects (cf. filters in CNN) via the kernels. Similar to traditional neural operations, these reference objects are parameterized and directly optimized in end-to-end training. We empirically evaluate the proposed class of neural architectures on standard applications such as language modeling and molecular graph regression, achieving state-of-the-art or competitive results across these applications. We also draw connections to existing architectures such as LSTMs.
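The "compare the input to reference objects via a kernel" idea can be sketched with a simple sequence kernel: each learned reference sequence acts like a filter, and the kernel value between it and the input becomes one feature. The exponential-decay matching kernel and the reference sequences below are invented for illustration and are not the paper's derived operations.

```python
# Toy kernel features: compare an input sequence against "reference
# objects" (stand-ins for trainable filters) with a decay-matching kernel.

def seq_kernel(xs, ys, decay=0.5):
    """Sum of token matches, discounted by how far apart their positions are."""
    score = 0.0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            if x == y:
                score += decay ** abs(i - j)
    return score

references = [["the", "cat"], ["a", "dog"]]  # stand-ins for learned filters
sentence = ["the", "cat", "sat"]
features = [seq_kernel(sentence, r) for r in references]
print(features)
```

In the paper's framing the references are parameterized and trained end to end, so these kernel evaluations behave like the outputs of a recurrent neural layer.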

[8]  arXiv:1705.09222 (cross-list from cs.HC) [pdf, other]
Title: Towards a Knowledge Graph based Speech Interface
Comments: Under Review in International Workshop on Grounding Language Understanding, Satellite of Interspeech 2017
Subjects: Human-Computer Interaction (cs.HC); Computation and Language (cs.CL)

Applications that use human speech as input require a speech interface with high recognition accuracy. The words or phrases in the recognised text are annotated with a machine-understandable meaning and linked to knowledge graphs for further processing by the target application. These semantic annotations of recognised words can be represented as subject-predicate-object triples, which collectively form a graph often referred to as a knowledge graph. This type of knowledge representation makes it possible to use speech interfaces with any spoken-input application: since the information is represented in a logical, semantic form, it can be retrieved and stored using standard web query languages. In this work, we develop a methodology for linking speech input to knowledge graphs and study the impact of recognition errors on the overall process. We show that for a corpus with a lower WER, the annotation and linking of entities to the DBpedia knowledge graph is considerably better. DBpedia Spotlight, a tool for interlinking text documents with linked open data, is used to link the speech recognition output to the DBpedia knowledge graph. Such a knowledge-based speech recognition interface is useful for applications such as question answering or spoken dialog systems.
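The triple representation can be sketched with a tiny in-memory store: recognised, entity-linked spans become subject-predicate-object triples that any application can then query. The entities, predicate names, and the hard-coded `annotate` rule are invented for illustration; the paper uses DBpedia Spotlight for the actual linking step.

```python
# Toy triple store: recognised speech becomes queryable s-p-o triples.
# The linking rule and identifiers below are hypothetical examples.

knowledge_graph = set()

def annotate(recognised_text):
    """Hypothetical linker: map a recognised phrase to knowledge-graph triples."""
    if "berlin" in recognised_text.lower():
        knowledge_graph.add(("dbpedia:Berlin", "rdf:type", "dbpedia:City"))
        knowledge_graph.add(("dbpedia:Berlin", "dbo:country", "dbpedia:Germany"))

def query(subject, predicate):
    """Minimal triple-pattern lookup (a SPARQL query would do this for real)."""
    return [o for s, p, o in knowledge_graph if s == subject and p == predicate]

annotate("what country is berlin in")
print(query("dbpedia:Berlin", "dbo:country"))
```

A recognition error ("burly" for "berlin", say) would simply fail to produce any triple, which is how WER propagates into the linking quality the abstract measures.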

Replacements for Fri, 26 May 17

[9]  arXiv:1701.07481 (replaced) [pdf, other]
Title: Learning Word-Like Units from Joint Audio-Visual Analysis
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[10]  arXiv:1705.07368 (replaced) [pdf, other]
Title: Mixed Membership Word Embeddings for Computational Social Science
Authors: James Foulds
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
[11]  arXiv:1608.07187 (replaced) [pdf, other]
Title: Semantics derived automatically from language corpora contain human-like biases
Comments: 14 pages, 3 figures
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Learning (cs.LG)
[12]  arXiv:1705.07136 (replaced) [pdf, ps, other]
Title: Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML
Comments: Under Review of NIPS 2017
Subjects: Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)
