We gratefully acknowledge support from
the Simons Foundation
and member institutions

Computation and Language

New submissions

[ total of 5 entries: 1-5 ]
[ showing up to 2000 entries per page: fewer | more ]

New submissions for Fri, 19 Jan 18

[1]  arXiv:1801.06024 [pdf, other]
Title: Natural Language Multitasking: Analyzing and Improving Syntactic Saliency of Hidden Representations
Comments: The 31st Annual Conference on Neural Information Processing (NIPS) - Workshop on Learning Disentangled Features: from Perception to Control, Long Beach, CA, December 2017
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

We train multi-task autoencoders on linguistic tasks and analyze the learned hidden sentence representations. The representations change significantly when translation and part-of-speech decoders are added. The more decoders a model employs, the better it clusters sentences according to their syntactic similarity, as the representation space becomes less entangled. We explore the structure of the representation space by interpolating between sentences, which yields interesting pseudo-English sentences, many of which have recognizable syntactic structure. Lastly, we point out an interesting property of our models: The difference-vector between two sentences can be added to change a third sentence with similar features in a meaningful way.

[2]  arXiv:1801.06146 [pdf, ps, other]
Title: Fine-tuned Language Models for Text Classification
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)

Transfer learning has revolutionized computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Fine-tuned Language Models (FitLaM), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a state-of-the-art language model. Our method significantly outperforms the state-of-the-art on five text classification tasks, reducing the error by 18-24% on the majority of datasets. We open-source our pretrained models and code to enable adoption by the community.

[3]  arXiv:1801.06172 [pdf, ps, other]
Title: Contextual and Position-Aware Factorization Machines for Sentiment Classification
Subjects: Computation and Language (cs.CL)

While existing machine learning models have achieved great success for sentiment classification, they typically do not explicitly capture sentiment-oriented word interaction, which can lead to poor results for fine-grained analysis at the snippet level (a phrase or sentence). Factorization Machine provides a possible approach to learning element-wise interaction for recommender systems, but they are not directly applicable to our task due to the inability to model contexts and word sequences. In this work, we develop two Position-aware Factorization Machines which consider word interaction, context and position information. Such information is jointly encoded in a set of sentiment-oriented word interaction vectors. Compared to traditional word embeddings, SWI vectors explicitly capture sentiment-oriented word interaction and simplify the parameter learning. Experimental results show that while they have comparable performance with state-of-the-art methods for document-level classification, they benefit the snippet/sentence-level sentiment analysis.

[4]  arXiv:1801.06176 [pdf, other]
Title: Integrating planning for task-completion dialogue policy learning
Comments: 11 pages, 6 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Training a task-completion dialogue agent with real users via reinforcement learning (RL) could be prohibitively expensive, because it requires many interactions with users. One alternative is to resort to a user simulator, while the discrepancy of between simulated and real users makes the learned policy unreliable in practice. This paper addresses these challenges by integrating planning into the dialogue policy learning based on Dyna-Q framework, and provides a more sample-efficient approach to learn the dialogue polices. The proposed agent consists of a planner trained on-line with limited real user experience that can generate large amounts of simulated experience to supplement with limited real user experience, and a policy model trained on these hybrid experiences. The effectiveness of our approach is validated on a movie-booking task in both a simulation setting and a human-in-the-loop setting.

Replacements for Fri, 19 Jan 18

[5]  arXiv:1712.01769 (replaced) [pdf, other]
Title: State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[ total of 5 entries: 1-5 ]
[ showing up to 2000 entries per page: fewer | more ]

Disable MathJax (What is MathJax?)