Computation and Language

New submissions

[ total of 9 entries: 1-9 ]

New submissions for Thu, 23 Mar 17

[1]  arXiv:1703.07438 [pdf, other]
Title: The NLTK FrameNet API: Designing for Discoverability with a Rich Linguistic Resource
Subjects: Computation and Language (cs.CL)

A new Python API, integrated within the NLTK suite, offers access to the FrameNet 1.7 lexical database. The lexicon (structured in terms of frames) as well as annotated sentences can be processed programmatically, or browsed with human-readable displays via the interactive Python prompt.

[2]  arXiv:1703.07476 [pdf, other]
Title: Topic Identification for Speech without ASR
Comments: 5 pages, 2 figures, submitted to Interspeech 2017
Subjects: Computation and Language (cs.CL)

Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs. However, under resource-limited conditions, the manually transcribed speech required to develop standard ASR systems can be severely limited or unavailable. In this paper, we investigate alternative unsupervised solutions to obtaining tokenizations of speech in terms of a vocabulary of automatically discovered word-like or phoneme-like units, without depending on the supervised training of ASR systems. Moreover, using automatic phoneme-like tokenizations, we demonstrate that a convolutional neural network based framework for learning spoken document representations provides competitive performance compared to a standard bag-of-words representation, as evidenced by comprehensive topic ID evaluations on both single-label and multi-label classification tasks.

[3]  arXiv:1703.07713 [pdf, other]
Title: Hierarchical RNN with Static Sentence-Level Attention for Text-Based Speaker Change Detection
Subjects: Computation and Language (cs.CL)

Traditional speaker change detection in dialogues is typically based on audio input. In some scenarios, however, researchers can only obtain text, and do not have access to raw audio signals. Moreover, with the increasing need for deep semantic processing, text-based dialogue understanding is attracting more attention in the community. These factors raise the problem of text-based speaker change detection. In this paper, we formulate the task as a matching problem of utterances before and after a certain decision point; we propose a hierarchical recurrent neural network (RNN) with static sentence-level attention. Our model comprises three main components: a sentence encoder with a long short-term memory (LSTM)-based RNN, a context encoder with another LSTM-RNN, and a static sentence-level attention mechanism, which allows rich information interaction. Experimental results show that neural networks consistently achieve better performance than feature-based approaches, and that our attention-based model significantly outperforms non-attention neural networks.

[4]  arXiv:1703.07754 [pdf, other]
Title: Direct Acoustics-to-Word Models for English Conversational Speech Recognition
Comments: Submitted to Interspeech-2017
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Recent work on end-to-end automatic speech recognition (ASR) has shown that the connectionist temporal classification (CTC) loss can be used to convert acoustics to phone or character sequences. Such systems are used with a dictionary and a separately trained language model (LM) to produce word sequences. However, they are not truly end-to-end in the sense of mapping acoustics directly to words without an intermediate phone representation. In this paper, we present the first results employing direct acoustics-to-word CTC models on two well-known public benchmark tasks: Switchboard and CallHome. These models do not require an LM or even a decoder at run-time and hence recognize speech with minimal complexity. However, due to the large number of word output units, CTC word models require orders of magnitude more data to train reliably compared to traditional systems. We present some techniques to mitigate this issue. Our CTC word model achieves a word error rate of 13.0%/18.8% on the Hub5-2000 Switchboard/CallHome test sets without any LM or decoder, compared with 9.6%/16.0% for phone-based CTC with a 4-gram LM. We also present rescoring results on CTC word model lattices to quantify the performance benefits of an LM, and contrast the performance of word and phone CTC models.
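
For readers unfamiliar with the metric quoted in the abstract above, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring pipeline used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of two reference words.
    print(word_error_rate("hello world", "hello word"))
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.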

Cross-lists for Thu, 23 Mar 17

[5]  arXiv:1703.07588 (cross-list from cs.SD) [pdf, other]
Title: Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries
Comments: 5 pages
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Learning (cs.LG)

In this paper we analyze the gate activation signals inside gated recurrent neural networks, and find that the temporal structure of such signals is highly correlated with phoneme boundaries. This correlation is further verified by a set of phoneme segmentation experiments, in which better results were obtained than with standard approaches.

Replacements for Thu, 23 Mar 17

[6]  arXiv:1607.01963 (replaced) [pdf, other]
Title: Sequence Training and Adaptation of Highway Deep Neural Networks
Authors: Liang Lu
Comments: 6 pages, 3 figures, published at IEEE SLT 2016. arXiv admin note: text overlap with arXiv:1610.05812
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[7]  arXiv:1612.01556 (replaced) [pdf]
Title: The Evolution of Sentiment Analysis - A Review of Research Topics, Venues, and Top Cited Papers
Comments: 28 pages, 14 figures
Subjects: Computation and Language (cs.CL); Digital Libraries (cs.DL); Social and Information Networks (cs.SI)
[8]  arXiv:1703.03906 (replaced) [pdf, other]
Title: Massive Exploration of Neural Machine Translation Architectures
Comments: 9 pages, 2 figures, 8 tables, submitted to ACL 2017, open source code at this https URL
Subjects: Computation and Language (cs.CL)
[9]  arXiv:1606.07006 (replaced) [pdf, other]
Title: Using Word Embeddings in Twitter Election Classification
Comments: NeuIR Workshop 2016
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)