We gratefully acknowledge support from
the Simons Foundation
and member institutions

Computer Science

New submissions

[ total of 215 entries: 1-215 ]
[ showing up to 2000 entries per page: fewer | more ]

New submissions for Fri, 18 May 18

[1]  arXiv:1805.06454 [pdf, other]
Title: Unsupervised Machine Learning Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing
Comments: 27 pages
Subjects: Computational Engineering, Finance, and Science (cs.CE)

Analysis of reactive-diffusion simulations requires a large number of independent model runs. For each high-fidelity simulation, inputs are varied and the predicted mixing behavior is represented by changes in species concentration. It is then required to discern how the model inputs impact the mixing process. This task is challenging and typically involves interpretation of large model outputs. However, the task can be automated and substantially simplified by applying Machine Learning (ML) methods. In this paper, we present an application of an unsupervised ML method (called NTFk) using Non-negative Tensor Factorization (NTF) coupled with a custom clustering procedure based on k-means to reveal hidden features in product concentration. An attractive aspect of the proposed ML method is that it ensures the extracted features are non-negative, which are important to obtain a meaningful deconstruction of the mixing processes. The ML method is applied to a large set of high-resolution FEM simulations representing reaction-diffusion processes in perturbed vortex-based velocity fields. The applied FEM ensures that species concentration are always non-negative. The simulated reaction is a fast irreversible bimolecular reaction. The reactive-diffusion model input parameters that control mixing include properties of velocity field, anisotropic dispersion, and molecular diffusion. We demonstrate the applicability of the ML method to produce a meaningful deconstruction of model outputs to discriminate between different physical processes impacting the reactants, their mixing, and the spatial distribution of the product. The presented ML analysis allowed us to identify additive features that characterize mixing behavior.

[2]  arXiv:1805.06474 [pdf, other]
Title: Understanding Federation: An Analytical Framework for the Interoperability of Social Networking Sites
Subjects: Social and Information Networks (cs.SI)

Although social networking has become a remarkable feature in the Web, full interoperability has not arrived. This work explores the main 5 paradigms of interoperability across social networking sites, corresponding to the layers in which we an find interoperability. Building on those, a novel analytical framework for SNS interoperability is introduced. Seven representative interoperability SNS technologies are compared using the proposed framework. The analysis exposes an overwhelming disparity and fragmentation in the solutions for tackling the same problems. Although there are a few solutions where consensus is reached and are widely adopted (e.g. in object IDs), there are multiple central issues that are still far from being widely standarized (e.g. in profile representation). In addition, several areas have been identified where there is clear room for improvement, such as privacy controls or data synchronization.

[3]  arXiv:1805.06485 [pdf, other]
Title: QuaterNet: A Quaternion-based Recurrent Model for Human Motion
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deep learning for predicting or generating 3D human pose sequences is an active research area. Previous work regresses either joint rotations or joint positions. The former strategy is prone to error accumulation along the kinematic chain, as well as discontinuities when using Euler angle or exponential map parameterizations. The latter requires re-projection onto skeleton constraints to avoid bone stretching and invalid configurations. This work addresses both limitations. Our recurrent network, QuaterNet, represents rotations with quaternions and our loss function performs forward kinematics on a skeleton to penalize absolute position errors instead of angle errors. On short-term predictions, QuaterNet improves the state-of-the-art quantitatively. For long-term generation, our approach is qualitatively judged as realistic as recent neural strategies from the graphics literature.

[4]  arXiv:1805.06499 [pdf, ps, other]
Title: QoE-Aware Beamforming Design for Massive MIMO Heterogeneous Networks
Comments: Submitted to IEEE Transactions on Vehicular Technology
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

One of the main goals of the future wireless networks is improving the users quality of experience (QoE). In this paper, we consider the problem of QoE-based resource allocation in the downlink of a massive multiple-input multiple-output (MIMO) heterogeneous network (HetNet). The network consists of a macro cell with a number of small cells embedded in it. The small cells base stations (BSs) are equipped with a few antennas, while the macro BS is equipped with a massive number of antennas. We consider the two services Video and Web Browsing and design the beamforming vectors at the BSs. The objective is to maximize the aggregated Mean Opinion Score (MOS) of the users under constraints on the BSs powers and the required quality of service (QoS) of the users. We also consider extra constraints on the QoE of users to more strongly enforce the QoE in the beamforming design. To reduce the complexity of the optimization problem, we suggest suboptimal and computationally efficient solutions. Our results illustrate that increasing the number of antennas at the BSs and also increasing the number of small cells antennas in the network leads to a higher user satisfaction.

[5]  arXiv:1805.06502 [pdf, other]
Title: First Experiments with Neural Translation of Informal to Formal Mathematics
Comments: Submission to CICM'2018
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG); Logic in Computer Science (cs.LO)

We report on our first experiments to train deep neural networks that automatically translate informalized $\LaTeX{}$-written Mizar texts into the formal Mizar language. Using Luong et al.'s neural machine translation model (NMT), we tested our aligned informal-formal corpora against various hyperparameters and evaluated their results. Our experiments show that NMT is able to generate correct Mizar statements on more than 60 percent of the inference data, indicating that formalization through artificial neural network is a promising approach for automated formalization of mathematics. We present several case studies to illustrate our results.

[6]  arXiv:1805.06503 [pdf, other]
Title: Weight Initialization in Neural Language Models
Comments: 17 pages, 20 figures and/or tables
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Semantic Similarity is an important application which finds its use in many downstream NLP applications. Though the task is mathematically defined, semantic similarity's essence is to capture the notions of similarity impregnated in humans. Machines use some heuristics to calculate the similarity between words, but these are typically corpus dependent or are useful for specific domains. The difference between Semantic Similarity and Semantic Relatedness motivates the development of new algorithms. For a human, the word car and road are probably as related as car and bus. But this may not be the case for computational methods. Ontological methods are good at encoding Semantic Similarity and Vector Space models are better at encoding Semantic Relatedness. There is a dearth of methods which leverage ontologies to create better vector representations. The aim of this proposal is to explore in the direction of a hybrid method which combines statistical/vector space methods like Word2Vec and Ontological methods like WordNet to leverage the advantages provided by both.

[7]  arXiv:1805.06504 [pdf, other]
Title: Analogical Reasoning on Chinese Morphological and Semantic Relations
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Analogical reasoning is effective in capturing linguistic regularities. This paper proposes an analogical reasoning task on Chinese. After delving into Chinese lexical knowledge, we sketch 68 implicit morphological relations and 28 explicit semantic relations. A big and balanced dataset CA8 is then built for this task, including 17813 questions. Furthermore, we systematically explore the influences of vector representations, context features, and corpora on analogical reasoning. With the experiments, CA8 is proved to be a reliable benchmark for evaluating Chinese word embeddings.

[8]  arXiv:1805.06510 [pdf, other]
Title: Facebook Reaction-Based Emotion Classifier as Cue for Sarcasm Detection
Comments: 10 pages ACM format
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)

Online social media users react to content in them based on context. Emotions or mood play a significant part of these reactions, which has filled these platforms with opinionated content. Different approaches and applications to make better use of this data are continuously being developed. However, due to the nature of the data, the variety of platforms, and dynamic online user behavior, there are still many issues to be dealt with. It remains a challenge to properly obtain a reliable emotional status from a user prior to posting a comment. This work introduces a methodology that explores semi-supervised multilingual emotion detection based on the overlap of Facebook reactions and textual data. With the resulting emotion detection system we evaluate the possibility of using emotions and user behavior features for the task of sarcasm detection. More than 1 million English and Chinese comments from over 62,000 public Facebook pages posts have been collected and processed, conducted experiments show acceptable performance metrics.

[9]  arXiv:1805.06511 [pdf, ps, other]
Title: Improving End-of-turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task
Comments: ICASSP 2018
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

This work focuses on the use of acoustic cues for modeling turn-taking in dyadic spoken dialogues. Previous work has shown that speaker intentions (e.g., asking a question, uttering a backchannel, etc.) can influence turn-taking behavior and are good predictors of turn-transitions in spoken dialogues. However, speaker intentions are not readily available for use by automated systems at run-time; making it difficult to use this information to anticipate a turn-transition. To this end, we propose a multi-task neural approach for predicting turn- transitions and speaker intentions simultaneously. Our results show that adding the auxiliary task of speaker intention prediction improves the performance of turn-transition prediction in spoken dialogues, without relying on additional input features during run-time.

[10]  arXiv:1805.06515 [pdf, ps, other]
Title: Remote Source Coding under Gaussian Noise : Dueling Roles of Power and Entropy Power
Subjects: Information Theory (cs.IT)

The distributed remote source coding (so-called CEO) problem is studied in the case where the underlying source has finite differential entropy and the observation noise is Gaussian. The main result is a new lower bound for the sum-rate-distortion function under arbitrary distortion measures. When specialized to the case of mean-squared error, it is shown that the bound exactly mirrors a corresponding upper bound, except that the upper bound has the source power (variance) whereas the lower bound has the source entropy power. Bounds exhibiting this pleasing duality of power and entropy power have been well known for direct and centralized source coding since Shannon's work.

[11]  arXiv:1805.06521 [pdf, ps, other]
Title: Composite Semantic Relation Classification
Comments: 12 pages, 3 figures, conference
Subjects: Computation and Language (cs.CL)

Different semantic interpretation tasks such as text entailment and question answering require the classification of semantic relations between terms or entities within text. However, in most cases it is not possible to assign a direct semantic relation between entities/terms. This paper proposes an approach for composite semantic relation classification, extending the traditional semantic relation classification task. Different from existing approaches, which use machine learning models built over lexical and distributional word vector features, the proposed model uses the combination of a large commonsense knowledge base of binary relations, a distributional navigational algorithm and sequence classification to provide a solution for the composite semantic relation classification problem.

[12]  arXiv:1805.06522 [pdf, other]
Title: Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness Using Machine Translation
Comments: 10 pages, 1 figure, conference
Subjects: Computation and Language (cs.CL)

This paper provides a comparative analysis of the performance of four state-of-the-art distributional semantic models (DSMs) over 11 languages, contrasting the native language-specific models with the use of machine translation over English-based DSMs. The experimental results show that there is a significant improvement (average of 16.7% for the Spearman correlation) by using state-of-the-art machine translation approaches. The results also show that the benefit of using the most informative corpus outweighs the possible errors introduced by the machine translation. For all languages, the combination of machine translation over the Word2Vec English distributional model provided the best results consistently (average Spearman correlation of 0.68).

[13]  arXiv:1805.06523 [pdf, other]
Title: End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition
Comments: 29 pages, 12 figures
Subjects: Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)

In this paper we study the problem of learning the weights of a deep convolutional neural network. We consider a network where convolutions are carried out over non-overlapping patches with a single kernel in each layer. We develop an algorithm for simultaneously learning all the kernels from the training data. Our approach dubbed Deep Tensor Decomposition (DeepTD) is based on a rank-1 tensor decomposition. We theoretically investigate DeepTD under a realizable model for the training data where the inputs are chosen i.i.d. from a Gaussian distribution and the labels are generated according to planted convolutional kernels. We show that DeepTD is data-efficient and provably works as soon as the sample size exceeds the total number of convolutional weights in the network. We carry out a variety of numerical experiments to investigate the effectiveness of DeepTD and verify our theoretical findings.

[14]  arXiv:1805.06524 [pdf]
Title: Hybrid Adaptive Fuzzy Extreme Learning Machine for text classification
Comments: 2 pages
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)

In traditional ELM and its improved versions suffer from the problems of outliers or noises due to overfitting and imbalance due to distribution. We propose a novel hybrid adaptive fuzzy ELM(HA-FELM), which introduces a fuzzy membership function to the traditional ELM method to deal with the above problems. We define the fuzzy membership function not only basing on the distance between each sample and the center of the class but also the density among samples which based on the quantum harmonic oscillator model. The proposed fuzzy membership function overcomes the shortcoming of the traditional fuzzy membership function and could make itself adjusted according to the specific distribution of different samples adaptively. Experiments show the proposed HA-FELM can produce better performance than SVM, ELM, and RELM in text classification.

[15]  arXiv:1805.06525 [pdf]
Title: Text classification based on ensemble extreme learning machine
Comments: 10 pages, 9 figures
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG)

In this paper, we propose a novel approach based on cost-sensitive ensemble weighted extreme learning machine; we call this approach AE1-WELM. We apply this approach to text classification. AE1-WELM is an algorithm including balanced and imbalanced multiclassification for text classification. Weighted ELM assigning the different weights to the different samples improves the classification accuracy to a certain extent, but weighted ELM considers the differences between samples in the different categories only and ignores the differences between samples within the same categories. We measure the importance of the documents by the sample information entropy, and generate cost-sensitive matrix and factor based on the document importance, then embed the cost-sensitive weighted ELM into the AdaBoost.M1 framework seamlessly. Vector space model(VSM) text representation produces the high dimensions and sparse features which increase the burden of ELM. To overcome this problem, we develop a text classification framework combining the word vector and AE1-WELM. The experimental results show that our method provides an accurate, reliable and effective solution for text classification.

[16]  arXiv:1805.06529 [pdf]
Title: Slicing Virtualized EPC-based 5G Core Network for Content Delivery
Subjects: Networking and Internet Architecture (cs.NI)

Traditional Content Delivery Networks (CDNs) built with traditional Internet technology are less and less able to cope with today's tremendous growth of content. Information Centric Networks (ICN), a proposed future Internet technology, may aid in remedying the situation. Unlike the current Internet, it decouples information from its sources and provides in-network storage. We expect traditional CDN and ICN-based CDN to co-exist in the foreseeable future, especially as it is now known that it might be possible to evolve traditional CDNs to gain the benefits promised by ICN. 5G providers must therefore aim to offer core network slices on which both ICN-based CDNs and traditional CDNs can be built. These slices could of course also be offered to providers of other applications with requirements similar to those of content delivery. This paper tackles the problem of slicing 5G for content delivery over ICN-based CDNs and traditional CDNs. Only virtualized Evolved Packet Core (EPC)-based 5G is considered. The problem is defined as a resource allocation problem which aims at minimizing the cost of slice assignment, while meeting QoS requirements. An Integer linear programming (ILP) formulation is provided and evaluated in a small-scale scenario.

[17]  arXiv:1805.06530 [pdf, other]
Title: Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising
Comments: To appear at the 35th International Conference on Machine Learning (ICML), 2018
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

The Gaussian mechanism is an essential building block used in multitude of differentially private data analysis algorithms. In this paper we revisit the Gaussian mechanism and show that the original analysis has several important limitations. Our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime ($\varepsilon \to 0$) and it cannot be extended to the low privacy regime ($\varepsilon \to \infty$). We address this limitations by developing an analytic Gaussian mechanism whose variance is calibrated directly using the Gaussian cumulative density function instead of a tail bound approximation. We also propose to equip the Gaussian mechanism with a post-processing step based on adaptive denoising estimators by leveraging that the variance of the perturbation is known. Our experiments show that analytical calibration removes at least a third of the variance of the noise compared to the classical Gaussian mechanism, and that denoising dramatically improves the accuracy of the Gaussian mechanism in the high-dimensional regime.

[18]  arXiv:1805.06532 [pdf, other]
Title: Beyond 5G with UAVs: Foundations of a 3D Wireless Cellular Network
Subjects: Information Theory (cs.IT)

In this paper, a novel concept of three-dimensional (3D) cellular networks, that integrate drone base stations (drone-BS) and cellular-connected drone users (drone-UEs), is introduced. For this new 3D cellular architecture, a novel framework for network planning for drone-BSs as well as latency-minimal cell association for drone-UEs is proposed. For network planning, a tractable method for drone-BSs' deployment based on the notion of truncated octahedron shapes is proposed that ensures full coverage for a given space with minimum number of drone-BSs. In addition, to characterize frequency planning in such 3D wireless networks, an analytical expression for the feasible integer frequency reuse factors is derived. Subsequently, an optimal 3D cell association scheme is developed for which the drone-UEs' latency, considering transmission, computation, and backhaul delays, is minimized. To this end, first, the spatial distribution of the drone-UEs is estimated using a kernel density estimation method, and the parameters of the estimator are obtained using a cross-validation method. Then, according to the spatial distribution of drone-UEs and the locations of drone-BSs, the latency-minimal 3D cell association for drone-UEs is derived by exploiting tools from optimal transport theory. Simulation results show that the proposed approach reduces the latency of drone-UEs compared to the classical cell association approach that uses a signal-to-interference-plus-noise ratio (SINR) criterion. In particular, the proposed approach yields a reduction of up to 46% in the average latency compared to the SINR-based association. The results also show that the proposed latency-optimal cell association improves the spectral efficiency of a 3D wireless cellular network of drones.

[19]  arXiv:1805.06533 [pdf, other]
Title: Modeling Naive Psychology of Characters in Simple Commonsense Stories
Comments: Accepted to ACL 2018 (long paper)
Subjects: Computation and Language (cs.CL)

Understanding a narrative requires reading between the lines and reasoning about the unspoken but obvious implications about events and people's mental states - a capability that is trivial for humans but remarkably hard for machines. To facilitate research addressing this challenge, we introduce a new annotation framework to explain naive psychology of story characters as fully-specified chains of mental states with respect to motivations and emotional reactions. Our work presents a new large-scale dataset with rich low-level annotations and establishes baseline performance on several new tasks, suggesting avenues for future research.

[20]  arXiv:1805.06534 [pdf, other]
Title: Career Transitions and Trajectories: A Case Study in Computing
Comments: To appear in KDD 2018
Subjects: Social and Information Networks (cs.SI)

From artificial intelligence to network security to hardware design, it is well-known that computing research drives many important technological and societal advancements. However, less is known about the long-term career paths of the people behind these innovations. What do their careers reveal about the evolution of computing research? Which institutions were and are the most important in this field, and for what reasons? Can insights into computing career trajectories help predict employer retention?
In this paper we analyze several decades of post-PhD computing careers using a large new dataset rich with professional information, and propose a versatile career network model, R^3, that captures temporal career dynamics. With R^3 we track important organizations in computing research history, analyze career movement between industry, academia, and government, and build a powerful predictive model for individual career transitions. Our study, the first of its kind, is a starting point for understanding computing research careers, and may inform employer recruitment and retention mechanisms at a time when the demand for specialized computational expertise far exceeds supply.

[21]  arXiv:1805.06536 [pdf, other]
Title: Are BLEU and Meaning Representation in Opposition?
Comments: ACL 2018; 10 pages + 2 page supplementary
Subjects: Computation and Language (cs.CL)

One of possible ways of obtaining continuous-space sentence representations is by training neural machine translation (NMT) systems. The recent attention mechanism however removes the single point in the neural network from which the source sentence representation can be extracted. We propose several variations of the attentive NMT architecture bringing this meeting point back. Empirical evaluation suggests that the better the translation quality, the worse the learned sentence representations serve in a wide range of classification and similarity tasks.

[22]  arXiv:1805.06539 [pdf, ps, other]
Title: Generalized Strucutral Causal Models
Subjects: Artificial Intelligence (cs.AI); Methodology (stat.ME); Machine Learning (stat.ML)

Structural causal models are a popular tool to describe causal relations in systems in many fields such as economy, the social sciences, and biology. In this work, we show that these models are not flexible enough in general to give a complete causal representation of equilibrium states in dynamical systems that do not have a unique stable equilibrium independent of initial conditions. We prove that our proposed generalized structural causal models do capture the essential causal semantics that characterize these systems. We illustrate the power and flexibility of this extension on a dynamical system corresponding to a basic enzymatic reaction. We motivate our approach further by showing that it also efficiently describes the effects of interventions on functional laws such as the ideal gas law.

[23]  arXiv:1805.06541 [pdf, other]
Title: Towards Malware Detection via CPU Power Consumption: Data Collection Design and Analytics (Extended Version)
Comments: Published version appearing in IEEE TrustCom-18. This version contains more details on mathematics and data collection
Subjects: Cryptography and Security (cs.CR)

This paper presents an experimental design and data analytics approach aimed at power-based malware detection on general-purpose computers. Leveraging the fact that malware executions must consume power, we explore the postulate that malware can be accurately detected via power data analytics. Our experimental design and implementation allow for programmatic collection of CPU power profiles for fixed tasks during uninfected and infected states using five different rootkits. To characterize the power consumption profiles, we use both simple statistical and novel, sophisticated features. We test a one-class anomaly detection ensemble (that baselines non-infected power profiles) and several kernel-based SVM classifiers (that train on both uninfected and infected profiles) in detecting previously unseen malware and clean profiles. The anomaly detection system exhibits perfect detection when using all features and tasks, with smaller false detection rate than the supervised classifiers. The primary contribution is the proof of concept that baselining power of fixed tasks can provide accurate detection of rootkits. Moreover, our treatment presents engineering hurdles needed for experimentation and allows analysis of each statistical feature individually. This work appears to be the first step towards a viable power-based detection capability for general-purpose computers, and presents next steps toward this goal.

[24]  arXiv:1805.06542 [pdf, other]
Title: Dancing Pigs or Externalities? Measuring the Rationality of Security Decisions
Journal-ref: 2018 ACM Conference on Economics and Computation
Subjects: Computer Science and Game Theory (cs.GT); Cryptography and Security (cs.CR); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)

Accurately modeling human decision-making in security is critical to thinking about when, why, and how to recommend that users adopt certain secure behaviors. In this work, we conduct behavioral economics experiments to model the rationality of end-user security decision-making in a realistic online experimental system simulating a bank account. We ask participants to make a financially impactful security choice, in the face of transparent risks of account compromise and benefits offered by an optional security behavior (two-factor authentication). We measure the cost and utility of adopting the security behavior via measurements of time spent executing the behavior and estimates of the participant's wage. We find that more than 50% of our participants made rational (e.g., utility optimal) decisions, and we find that participants are more likely to behave rationally in the face of higher risk. Additionally, we find that users' decisions can be modeled well as a function of past behavior (anchoring effects), knowledge of costs, and to a lesser extent, users' awareness of risks and context (R2=0.61). We also find evidence of endowment effects, as seen in other areas of economic and psychological decision-science literature, in our digital-security setting. Finally, using our data, we show theoretically that a "one-size-fits"-all emphasis on security can lead to market losses, but that adoption by a subset of users with higher risks or lower costs can lead to market gains.

[25]  arXiv:1805.06546 [pdf, other]
Title: Joint Classification and Prediction CNN Framework for Automatic Sleep Stage Classification
Comments: Under journal submission
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

Sleep staging plays an important role in assessment and treatment of sleep issues. This work proposes a joint classification-and-prediction framework based on convolutional neural networks (CNNs) for automatic sleep staging, and, subsequently, introduces a simple yet efficient CNN architecture to power the framework. Given a single input epoch, the framework jointly determines its label (classification) and those of its neighboring epochs in the contextual output (prediction). While the proposed framework is orthogonal to the widely adopted classification schemes, which takes one or multiple epochs as a contextual input and produces a single classification decision on the target epoch, we demonstrate its advantages in different ways. First, it leverages the dependency among consecutive sleep epochs while surpassing the problems experienced with the common classification schemes. Second, even with a single model, the framework effortlessly produces multiple decisions as in ensemble-of-models methods which are essential in obtaining a good performance. Probabilistic aggregation techniques are then proposed to leverage the availability of multiple decisions. We demonstrate good performance on the Montreal Archive of Sleep Studies (MASS) dataset consisting 200 subjects with an average accuracy of 83.6%. We also show that the proposed framework not only is superior to the baselines based on the common classification schemes but also outperforms existing deep-learning approaches. To our knowledge, this is the first work going beyond the standard single-output classification to consider neural networks with multiple outputs for automatic sleep staging. This framework could provide avenues for further studies of different neural-network architectures for this problem.

[26]  arXiv:1805.06549 [pdf, other]
Title: Defoiling Foiled Image Captions
Comments: In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

We address the task of detecting foiled image captions, i.e. identifying whether a caption contains a word that has been deliberately replaced by a semantically similar word, thus rendering it inaccurate with respect to the image being described. Solving this problem should in principle require a fine-grained understanding of images to detect linguistically valid perturbations in captions. In such contexts, encoding sufficiently descriptive image information becomes a key challenge. In this paper, we demonstrate that it is possible to solve this task using simple, interpretable yet powerful representations based on explicit object information. Our models achieve state-of-the-art performance on a standard dataset, with scores exceeding those achieved by humans on the task. We also measure the upper-bound performance of our models using gold standard annotations. Our analysis reveals that the simpler model performs well even without image information, suggesting that the dataset contains strong linguistic bias.

[27]  arXiv:1805.06553 [pdf, other]
Title: A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation
Comments: Accepted to NAACL 2018
Subjects: Computation and Language (cs.CL)

Natural language generation lies at the core of generative dialogue systems and conversational agents. We describe an ensemble neural language generator, and present several novel methods for data representation and augmentation that yield improved results in our model. We test the model on three datasets in the restaurant, TV and laptop domains, and report both objective and subjective evaluations of our best model. Using a range of automatic metrics, as well as human evaluators, we show that our approach achieves better results than state-of-the-art models on the same datasets.

[28]  arXiv:1805.06556 [pdf, other]
Title: Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples
Comments: ACL 2018
Subjects: Computation and Language (cs.CL)

We revisit domain adaptation for parsers in the neural era. First we show that recent advances in word representations greatly diminish the need for domain adaptation when the target domain is syntactically similar to the source domain. As evidence, we train a parser on the Wall Street Jour- nal alone that achieves over 90% F1 on the Brown corpus. For more syntactically dis- tant domains, we provide a simple way to adapt a parser using only dozens of partial annotations. For instance, we increase the percentage of error-free geometry-domain parses in a held-out set from 45% to 73% using approximately five dozen training examples. In the process, we demon- strate a new state-of-the-art single model result on the Wall Street Journal test set of 94.3%. This is an absolute increase of 1.7% over the previous state-of-the-art of 92.6%.

[29]  arXiv:1805.06557 [pdf, other]
Title: Exponential Integrators with Parallel-in-Time Rational Approximations for Climate and Weather Simulations
Subjects: Numerical Analysis (cs.NA); Computational Engineering, Finance, and Science (cs.CE)

High-performance computing trends towards many-core systems are expected to continue over the next decade. As a result, parallel-in-time methods, mathematical formulations which exploit additional degrees of parallelism in the time dimension, have gained increasing interest in recent years. In this work we study a massively parallel rational approximation of exponential integrators (REXI). This method replaces a time integration of stiff linear oscillatory and diffusive systems by the sum of the solutions of many decoupled systems, which can be solved in parallel. Previous numerical studies showed that this reformulation allows taking arbitrarily long time steps for the linear oscillatory parts.
The present work studies the non-linear shallow-water equations on the rotating sphere, a simplified system of equations used to study properties of space and time discretization methods in the context of atmospheric simulations. After introducing time integrators, we first compare the time step sizes to the errors in the simulation, discussing pros and cons of different formulations of REXI. Here, REXI already shows superior properties compared to explicit and implicit time stepping methods. Additionally, we present wallclock-time-to-error results revealing the sweet spots of REXI obtaining either a 3x higher accuracy or a reduced time-to-solution. Finally, a detailed performance analysis of REXI reveals that the communication in time takes only 4% of the overall wallclock time. Our results motivate further explorations of REXI for operational weather/climate systems.

[30]  arXiv:1805.06558 [pdf, other]
Title: Recurrent Neural Network for Learning DenseDepth and Ego-Motion from Video
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Learning-based, single-view depth estimation often generalizes poorly to unseen datasets. While learning-based, two-frame depth estimation solves this problem to some extent by learning to match features across frames, it performs poorly at large depth where the uncertainty is high. There exists few learning-based, multi-view depth estimation methods. In this paper, we present a learning-based, multi-view dense depth map and ego-motion estimation method that uses Recurrent Neural Networks (RNN). Our model is designed for 3D reconstruction from video where the input frames are temporally correlated. It is generalizable to single- or two-view dense depth estimation. Compared to recent single- or two-view CNN-based depth estimation methods, our model leverages more views and achieves more accurate results, especially at large distances. Our method produces superior results to the state-of-the-art learning-based, single- or two-view depth estimation methods on both indoor and outdoor benchmark datasets. We also demonstrate that our method can even work on extremely difficult sequences, such as endoscopic video, where none of the assumptions (static scene, constant lighting, Lambertian reflection, etc.) from traditional 3D reconstruction methods hold.

[31]  arXiv:1805.06561 [pdf, other]
Title: DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images
Comments: Dataset description for DeepGlobe 2018 Challenge at CVPR 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present the DeepGlobe 2018 Satellite Image Understanding Challenge, which includes three public competitions for segmentation, detection, and classification tasks on satellite images. Similar to other challenges in computer vision domain such as DAVIS and COCO, DeepGlobe proposes three datasets and corresponding evaluation methodologies, coherently bundled in three competitions with a dedicated workshop co-located with CVPR 2018.
We observed that satellite imagery is a rich and structured source of information, yet it is less investigated than everyday images by computer vision researchers. However, bridging modern computer vision with remote sensing data analysis could have critical impact to the way we understand our environment and lead to major breakthroughs in global urban planning or climate change research. Keeping such bridging objective in mind, DeepGlobe aims to bring together researchers from different domains to raise awareness of remote sensing in the computer vision community and vice-versa. We aim to improve and evaluate state-of-the-art satellite image understanding approaches, which can hopefully serve as reference benchmarks for future research in the same topic. In this paper, we analyze characteristics of each dataset, define the evaluation criteria of the competitions, and provide baselines for each task.

[32]  arXiv:1805.06562 [pdf, other]
Title: Efficient compilation of array probabilistic programs
Subjects: Programming Languages (cs.PL); Probability (math.PR)

Probabilistic programming languages are valuable because they allow us to build, run, and change concise probabilistic models that elide irrelevant details. However, current systems are either inexpressive, failing to support basic features needed to write realistic models, or inefficient, taking orders of magnitude more time to run than hand-coded inference. Without resolving this dilemma, model developers are still required to manually rewrite their high-level models into low-level code to get the needed performance.
We tackle this dilemma by presenting an approach for efficient probabilistic programming with arrays. Arrays are a key element of almost any realistic model. Our system extends previous compilation techniques from scalars to arrays. These extensions allow the transformation of high-level programs into known efficient algorithms. We then optimize the resulting code by taking advantage of the domain-specificity of our language. We further JIT-compile the final product using LLVM on a per-execution basis. These steps combined lead to significant new opportunities for specialization. The resulting performance is competitive with manual implementations of the desired algorithms, even though the original program is as concise and expressive as the initial model.

[33]  arXiv:1805.06563 [pdf, other]
Title: NPE: Neural Personalized Embedding for Collaborative Filtering
Comments: IJCAI-ECAI 2018 (to appear)
Subjects: Information Retrieval (cs.IR)

Matrix factorization is one of the most efficient approaches in recommender systems. However, such algorithms, which rely on the interactions between users and items, perform poorly for "cold-users" (users with little history of such interactions) and at capturing the relationships between closely related items. To address these problems, we propose a neural personalized embedding (NPE) model, which improves the recommendation performance for cold-users and can learn effective representations of items. It models a user's click to an item in two terms: the personal preference of the user for the item, and the relationships between this item and other items clicked by the user. We show that NPE outperforms competing methods for top-N recommendations, specially for cold-user recommendations. We also performed a qualitative analysis that shows the effectiveness of the representations learned by the model.

[34]  arXiv:1805.06566 [pdf, other]
Title: Content-based Popularity Prediction of Online Petitions Using a Deep Regression Model
Journal-ref: ACL 2018 (camera ready pre-print)
Subjects: Computation and Language (cs.CL)

Online petitions are a cost-effective way for citizens to collectively engage with policy-makers in a democracy. Predicting the popularity of a petition --- commonly measured by its signature count --- based on its textual content has utility for policy-makers as well as those posting the petition. In this work, we model this task using CNN regression with an auxiliary ordinal regression objective. We demonstrate the effectiveness of our proposed approach using UK and US government petition datasets.

[35]  arXiv:1805.06571 [pdf, ps, other]
Title: Caching With Time-Varying Popularity Profiles: A Learning-Theoretic Perspective
Comments: Article published in IEEE Transactions on Communications, 2018
Subjects: Information Theory (cs.IT)

Content caching at the small-cell base stations (sBSs) in a heterogeneous wireless network is considered. A cost function is proposed that captures the backhaul link load called the `offloading loss', which measures the fraction of the requested files that are not available in the sBS caches. As opposed to the previous approaches that consider time-invariant and perfectly known popularity profile, caching with non-stationary and statistically dependent popularity profiles (assumed unknown, and hence, estimated) is studied from a learning-theoretic perspective. A probably approximately correct result is derived, which presents a high probability bound on the offloading loss difference, i.e., the error between the estimated and the optimal offloading loss. The difference is a function of the Rademacher complexity, the $\beta-$mixing coefficient, the number of time slots, and a measure of discrepancy between the estimated and true popularity profiles. A cache update algorithm is proposed, and simulation results are presented to show its superiority over periodic updates. The performance analyses for Bernoulli and Poisson request models are also presented.

[36]  arXiv:1805.06572 [pdf, ps, other]
Title: FastFCA: A Joint Diagonalization Based Fast Algorithm for Audio Source Separation Using A Full-Rank Spatial Covariance Model
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

A source separation method using a full-rank spatial covariance model has been proposed by Duong et al. ["Under-determined Reverberant Audio Source Separation Using a Full-rank Spatial Covariance Model," IEEE Trans. ASLP, vol. 18, no. 7, pp. 1830-1840, Sep. 2010], which is referred to as full-rank spatial covariance analysis (FCA) in this paper. Here we propose a fast algorithm for estimating the model parameters of the FCA, which is named Fast-FCA, and applicable to the two-source case. Though quite effective in source separation, the conventional FCA has a major drawback of expensive computation. Indeed, the conventional algorithm for estimating the model parameters of the FCA requires frame-wise matrix inversion and matrix multiplication. Therefore, the conventional FCA may be infeasible in applications with restricted computational resources. In contrast, the proposed FastFCA bypasses matrix inversion and matrix multiplication owing to joint diagonalization based on the generalized eigenvalue problem. Furthermore, the FastFCA is strictly equivalent to the conventional algorithm. An experiment has shown that the FastFCA was over 250 times faster than the conventional algorithm with virtually the same source separation performance.

[37]  arXiv:1805.06583 [pdf, other]
Title: Cooperative Limited Feedback Design for Massive Machine-Type Communications
Comments: 11 Pages, 4 figures
Subjects: Information Theory (cs.IT)

Multiuser multiple-input multiple-output (MIMO) systems have been in the spotlight since it is expected to support high connection density in internet of things (IoT) networks. Considering the massive connectivity in IoT networks, the challenge for the multiuser MIMO systems is to obtain accurate channel state information (CSI) at the transmitter in order that the sum-rate throughput can be maximized. However, current communication mechanisms relying upon frequency division duplexing (FDD) might not fully support massive number of machine-type devices due to the rate-constrained limited feedback and complicated time-consuming scheduling. In this paper, we develop a cooperative feedback strategy to maximize the benefits of massive connectivity under limited resource constraint for the feedback link. In the proposed algorithm, two neighboring users form a single cooperation unit to improve the channel quantization performance by sharing some level of channel information. To satisfy the low-latency requirement in IoT networks, the cooperation process is conducted without any transmitter intervention. In addition, we analyze the sum-rate throughput of the multiuser MIMO systems relying upon the proposed feedback strategy to study a cooperation decision-making framework. Based on the analytical studies, we develop a network-adapted cooperation algorithm to turn the user cooperation mode on and off according to network conditions.

[38]  arXiv:1805.06589 [pdf, other]
Title: Supersingular Isogeny Oblivious Transfer
Comments: 26 pages, 4 figures, Submitted
Subjects: Cryptography and Security (cs.CR)

We present an oblivious transfer (OT) protocol that combines the OT scheme of Chou and Orlandi together with thesupersingular isogeny Diffie-Hellman (SIDH) primitive of De Feo, Jao, and Pl\^ut. Our construction is a candidate for post-quantum secure OT and demonstrates that SIDH naturally supports OT functionality. We consider the protocol in the simplest configuration of $\binom{2}{1}$-OT and analyze the protocol to verify its security.

[39]  arXiv:1805.06591 [pdf, ps, other]
Title: Deep Reinforcement Learning for Network Slicing
Comments: 6 figures
Subjects: Networking and Internet Architecture (cs.NI); Learning (cs.LG)

Network slicing means an emerging business to operators and allows them to sell the customized slices to various tenants at different prices. In order to provide better-performing and costefficient services, network slicing involves challenging technical issues and urgently looks forward to intelligent innovations to make the resource management consistent with users' activities per slice. In that regard, deep reinforcement learning (DRL), which focuses on how to interact with the environment by trying alternative actions and reinforces the tendency actions producing more rewarding consequences, is emerging as a promising solution. In this paper, after briefly reviewing the fundamental concepts and evolution-driving factors of DRL, we investigate the application of DRL in some typical resource management scenarios of network slicing, which include radio resource slicing and priority-based core network slicing, and demonstrate the performance advantage of DRL over several competing schemes through extensive simulations. Finally, we also discuss the possible challenges to apply DRL in network slicing from a general perspective.

[40]  arXiv:1805.06593 [pdf, other]
Title: Cross-Target Stance Classification with Self-Attention Networks
Comments: In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL2018)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

In stance classification, the target on which the stance is made defines the boundary of the task, and a classifier is usually trained for prediction on the same target. In this work, we explore the potential for generalizing classifiers between different targets, and propose a neural model that can apply what has been learned from a source target to a destination target. We show that our model can find useful information shared between relevant targets which improves generalization in certain scenarios.

[41]  arXiv:1805.06594 [pdf, ps, other]
Title: Leveraging Social Signal to Improve Item Recommendation for Matrix Factorization
Authors: Ze Wang, Hong Li
Subjects: Information Retrieval (cs.IR); Social and Information Networks (cs.SI)

Although Recommender Systems have been comprehensively studied in the past decade both in industry and academia, most of current recommender systems suffer from the fol- lowing issues: 1) The data sparsity of the user-item matrix seriously affect the recommender system quality. As a result, most of traditional recommender system approaches are not able to deal with the users who have rated few items, which is known as cold start problem in recommender system. 2) Traditional recommender systems assume that users are in- dependently and identically distributed and ignore the social relation between users. However, in real life scenario, due to the exponential growth of social networking service, such as facebook and Twitter, social connections between different users play an significant role for recommender system task. In this work, aiming at providing a better recommender sys- tems by incorporating user social network information, we propose a matrix factorization framework with user social connection constraints. Experimental results on the real-life dataset shows that the proposed method performs signifi- cantly better than the state-of-the-art approaches in terms of MAE and RMSE, especially for the cold start users.

[42]  arXiv:1805.06597 [pdf, other]
Title: ARUM: Polar Coded HARQ Scheme based on Incremental Channel Polarization
Subjects: Information Theory (cs.IT)

A hybrid ARQ (HARQ) scheme for polar code, which is called active-bit relocation under masks (ARUM), is proposed. In each transmission, the data bits are encoded and bit-wisely XOR-masked using a binary vector before being transmitted through the channel. The masking process combines multiple transmissions together which forms another step of inter-transmission channel transform. The reliabilities are updated after every transmission, and the less reliable bits in earlier ones are relocated to the more reliable positions at the latest transmitted block. ARUM is a very flexible HARQ scheme which allows each transmission to have a different mother code length and to adopt independent rate-matching scheme with sufficient channel state feedback in HARQ process. Simulation shows that ARUM can obtain near-optimal coding gain.

[43]  arXiv:1805.06603 [pdf, other]
Title: Machine learning based context-predictive car-to-cloud communication using multi-layer connectivity maps for upcoming 5G networks
Subjects: Networking and Internet Architecture (cs.NI)

While cars were only considered as means of personal transportation for a long time, they are currently transcending to mobile sensor nodes that gather highly up-to-date information for crowdsensing-enabled big data services in a smart city context. Consequently, upcoming 5G communication networks will be confronted with massive increases in Machine-type Communication (MTC) and require resource-efficient transmission methods in order to optimize the overall system performance and provide interference-free coexistence with human data traffic that is using the same public cellular network. In this paper, we bring together mobility prediction and machine learning based channel quality estimation in order to improve the resource-efficiency of car-to-cloud data transfer by scheduling the transmission time of the sensor data with respect to the anticipated behavior of the communication context. In a comprehensive field evaluation campaign, we evaluate the proposed context-predictive approach in a public cellular network scenario where it is able to increase the average data rate by up to 194% while simultaneously reducing the mean uplink power consumption by up to 54%.

[44]  arXiv:1805.06604 [pdf, other]
Title: Accelerating Nonnegative Matrix Factorization Algorithms using Extrapolation
Comments: 19 pages, 6 figures, 6 tables
Subjects: Numerical Analysis (cs.NA); Optimization and Control (math.OC); Machine Learning (stat.ML)

In this paper, we propose a general framework to accelerate significantly the algorithms for nonnegative matrix factorization (NMF). This framework is inspired from the extrapolation scheme used to accelerate gradient methods in convex optimization and from the method of parallel tangents. However, the use of extrapolation in the context of the two-block coordinate descent algorithms tackling the non-convex NMF problems is novel. We illustrate the performance of this approach on two state-of-the-art NMF algorithms, namely, accelerated hierarchical alternating least squares (A-HALS) and alternating nonnegative least squares (ANLS), using synthetic, image and document data sets.

[45]  arXiv:1805.06605 [pdf, other]
Title: Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
Comments: Published as a conference paper at the Sixth International Conference on Learning Representations (ICLR 2018)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

In recent years, deep neural network approaches have been widely adopted for machine learning tasks, including classification. However, they were shown to be vulnerable to adversarial perturbations: carefully crafted small perturbations can cause misclassification of legitimate images. We propose Defense-GAN, a new framework leveraging the expressive capability of generative models to defend deep neural networks against such attacks. Defense-GAN is trained to model the distribution of unperturbed images. At inference time, it finds a close output to a given image which does not contain the adversarial changes. This output is then fed to the classifier. Our proposed method can be used with any classification model and does not modify the classifier structure or training procedure. It can also be used as a defense against any attack as it does not assume knowledge of the process for generating the adversarial examples. We empirically show that Defense-GAN is consistently effective against different attack methods and improves on existing defense strategies. Our code has been made publicly available at https://github.com/kabkabm/defensegan.

[46]  arXiv:1805.06606 [pdf]
Title: Convolutional Attention Networks for Multimodal Emotion Recognition from Speech and Text Data
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)

Emotion recognition has become a popular topic of interest, especially in the field of human computer interaction. Previous works involve unimodal analysis of emotion, while recent efforts focus on multi-modal emotion recognition from vision and speech. In this paper, we propose a new method of learning about the hidden representations between just speech and text data using convolutional attention networks. Compared to the shallow model which employs simple concatenation of feature vectors, the proposed attention model performs much better in classifying emotion from speech and text data contained in the CMU-MOSEI dataset.

[47]  arXiv:1805.06610 [pdf, other]
Title: A Formulation of Recursive Self-Improvement and Its Possible Efficiency
Authors: Wenyi Wang
Subjects: Artificial Intelligence (cs.AI)

Recursive self-improving (RSI) systems have been dreamed of since the early days of computer science and artificial intelligence. However, many existing studies on RSI systems remain philosophical, and lacks clear formulation and results. In this paper, we provide a formal definition for one class of RSI systems, and then demonstrate the existence of computable and efficient RSI systems on a restricted version. We use simulation to empirically show that we achieve logarithmic runtime complexity with respect to the size of the search space, and these results suggest it is possible to achieve an efficient recursive self-improvement.

[48]  arXiv:1805.06618 [pdf]
Title: Optimization of Transfer Learning for Sign Language Recognition Targeting Mobile Platform
Authors: Dhruv Rathi
Comments: 6 Pages, Journal
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The target of this research is to experiment, iterate and recommend a system that is successful in recognition of American Sign Language (ASL). It is a challenging as well as an interesting problem that if solved will bring a leap in social and technological aspects alike. In this paper, we propose a real-time recognizer of ASL based on a mobile platform, so that it will have more accessibility and provides an ease of use. The technique implemented is Transfer Learning of new data of Hand gestures for alphabets in ASL to be modelled on various pre-trained high- end models and optimize the best model to run on a mobile platform considering the various limitations of the same during optimization. The data used consists of 27,455 images of 24 alphabets of ASL. The optimized model when ran over a memory-efficient mobile application, provides an accuracy of 95.03% of accurate recognition with an average recognition time of 2.42 seconds. This method ensures considerable discrimination in accuracy and recognition time than the previous research.

[49]  arXiv:1805.06619 [pdf, ps, other]
Title: Taxi demand forecasting: A HEDGE based tessellation strategy for improved accuracy
Comments: Under revision in Special Issue on Knowledge Discovery from Mobility Data for Intelligent Transportation Systems (Transactions on ITS)
Subjects: Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)

A key problem in location-based modeling and forecasting lies in identifying suitable spatial and temporal resolutions. In particular, judicious spatial partitioning can play a significant role in enhancing the performance of location-based forecasting models. In this work, we investigate two widely used tessellation strategies for partitioning city space, in the context of real-time taxi demand forecasting. Our study compares (i) Geohash tessellation, and (ii) Voronoi tessellation, using two distinct taxi demand datasets, over multiple time scales. For the purpose of comparison, we employ classical time-series tools to model the spatio-temporal demand. Our study finds that the performance of each tessellation strategy is highly dependent on the city geography, spatial distribution of the data, and the time of the day, and that neither strategy is found to perform optimally across the forecast horizon. We propose a hybrid tessellation algorithm that picks the best tessellation strategy at each instant, based on their performance in the recent past. Our hybrid algorithm is a non-stationary variant of the well-known HEDGE algorithm for choosing the best advice from multiple experts. We show that the hybrid tessellation strategy performs consistently better than either of the two strategies across the data sets considered, at multiple time scales, and with different performance metrics. We achieve an average accuracy of above 80% per km^2 for both data sets considered at 60 minute aggregation levels.

[50]  arXiv:1805.06620 [pdf]
Title: DroidMark: A Tool for Android Malware Detection using Taint Analysis and Bayesian Network
Comments: 5 Pages, Journal
Subjects: Cryptography and Security (cs.CR)

With the increasing user base of Android devices and advent of technologies such as Internet Banking, delicate user data is prone to be misused by malware and spyware applications. As the app developer community increases, the quality reassurance could not be justified for every application and a possibility of data leakage arises. In this research, with the aim to ensure the application authenticity, Deep Learning methods and Taint Analysis are deployed on the applications. The detection system named DroidMark looks for possible sinks and sources of data leakage in the application by modelling Android lifecycle and callbacks, which is done by Reverse Engineering the APK, further monitoring the suspected processes and collecting data in different states of the application. DroidMark is thus designed to extract features from the applications which are fed to a trained Bayesian Network for classification of Malicious and Regular applications. The results indicate a high accuracy of 96.87% and an error rate of 3.13% in the detection of Malware in Android devices.

[51]  arXiv:1805.06621 [pdf, other]
Title: Generative networks as inverse problems with Scattering transforms
Comments: International Conference on Learning Representations, 2018
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Generative Adversarial Nets (GANs) and Variational Auto-Encoders (VAEs) provide impressive image generations from Gaussian white noise, but the underlying mathematics are not well understood. We compute deep convolutional network generators by inverting a fixed embedding operator. Therefore, they do not require to be optimized with a discriminator or an encoder. The embedding is Lipschitz continuous to deformations so that generators transform linear interpolations between input white noise vectors into deformations between output images. This embedding is computed with a wavelet Scattering transform. Numerical experiments demonstrate that the resulting Scattering generators have similar properties as GANs or VAEs, without learning a discriminative network or an encoder.

[52]  arXiv:1805.06625 [pdf, other]
Title: Structure-preserving Guided Retinal Image Filtering and Its Application for Optic Disc Analysis
Comments: Accepted for publication on IEEE Trans. on Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Retinal fundus photographs have been used in the diagnosis of many ocular diseases such as glaucoma, pathological myopia, age-related macular degeneration and diabetic retinopathy. With the development of computer science, computer aided diagnosis has been developed to process and analyse the retinal images automatically. One of the challenges in the analysis is that the quality of the retinal image is often degraded. For example, a cataract in human lens will attenuate the retinal image, just as a cloudy camera lens which reduces the quality of a photograph. It often obscures the details in the retinal images and posts challenges in retinal image processing and analysing tasks. In this paper, we approximate the degradation of the retinal images as a combination of human-lens attenuation and scattering. A novel structure-preserving guided retinal image filtering (SGRIF) is then proposed to restore images based on the attenuation and scattering model. The proposed SGRIF consists of a step of global structure transferring and a step of global edge-preserving smoothing. Our results show that the proposed SGRIF method is able to improve the contrast of retinal images, measured by histogram flatness measure, histogram spread and variability of local luminosity. In addition, we further explored the benefits of SGRIF for subsequent retinal image processing and analysing tasks. In the two applications of deep learning based optic cup segmentation and sparse learning based cup-to-disc ratio (CDR) computation, our results show that we are able to achieve more accurate optic cup segmentation and CDR measurements from images processed by SGRIF.

[53]  arXiv:1805.06634 [pdf]
Title: Novel Force Estimation-based Bilateral Teleoperation applying Type-2 Fuzzy logic and Moving Horizon Estimation
Comments: 12 pages, 13 figures
Subjects: Robotics (cs.RO)

This paper develops a novel force observer for bilateral teleoperation systems. Type-2 fuzzy logic is used to describe the overall dynamic system, and Moving Horizon Estimation (MHE) is employed to assess clean states as well as the values of dynamic uncertainties, and simultaneously filter out the measurement noises, which guarantee the high degree of accuracy for the observed forces. Compared with the existing methods, the proposed force observer can run without knowing exact mathematical dynamic functions and is robust to different kinds of noises. A force-reflection four-channel teleoperation control laws is also proposed that involving the observed environmental and human force to provide the highly accurate force tracking between the master and the slave in the presence of time delays. Finally, experiments based on two haptic devices demonstrate the superiority of the proposed method through the comparisons with multiple state-to-the-art force observers.

[54]  arXiv:1805.06637 [pdf, ps, other]
Title: How to Dimension 5G Network When Users Are Distributed on Roads Modeled by Poisson Line Process
Authors: Jalal Rachad (LTCI), Ridha Nasri, Laurent Decreusefond (LTCI)
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

-The fifth generation (5G) New Radio (NR) interface inherits many concepts and techniques from 4G systems such as the Orthogonal Frequency Division Multiplex (OFDM) based waveform and multiple access. Dimensioning 5G NR interface will likely follow the same principles as in 4G networks. It aims at finding the number of radio resources required to carry a forecast data traffic at a target users Quality of Services (QoS). The present paper attempts to provide a new approach of dimension-ing 5G NR radio resource (number of Physical Resource Blocks) considering its congestion probability, qualified as a relevant metric for QoS evaluation. Moreover, 5G users are assumed to be distributed in roads modeled by Poisson Line Process (PLP) instead of the widely-used 2D-Poisson Point Process. We derive the analytical expression of the congestion probability for analyzing its behavior as a function of network parameters. Then, we set its value, often targeted by the operator, in order to find the relation between the necessary resources and the forecast data traffic expressed in terms of cell throughput. Different numerical results are presented to justify this dimensioning approach.

[55]  arXiv:1805.06638 [pdf, other]
Title: Interference Analysis in Dynamic TDD System Combined or not With Cell Clustering Scheme
Authors: Jalal Rachad (LTCI), Ridha Nasri, Laurent Decreusefond (LTCI)
Journal-ref: Vehicular Technology Conference, Jun 2018, Porto, Portugal
Subjects: Networking and Internet Architecture (cs.NI); Performance (cs.PF)

Dynamic Time Division Duplex (TDD) has been introduced as a solution to deal with the uplink and downlink traffic asymmetry, mainly observed for dense heterogeneous network deployments. However, the use of this feature requires new interference mitigation schemes capable to handle two additional types of interferences between cells in opposite transmission cycle: downlink to uplink and uplink to downlink interferences. Among them, Cell clustering has been proposed as an efficient solution to minimize inter-cell interferences in opposite transmission directions and somehow responds to the requirements of enhanced Interference Mitigation and Traffic Adaptation (eIMTA) problem. This work is devoted to provide a new analytical approach to model inter-cell interferences and quantify performances of Dynamic TDD system in terms of SINR (Signal to Interferences plus Noise Ratio) distribution. Analytical system performance investigation concerns two scenarios: i) basic Dynamic TDD without any other feature and ii) Dynamic TDD with interference mitigation schemes.

[56]  arXiv:1805.06641 [pdf, other]
Title: Joint direct estimation of 3D geometry and 3D motion using spatio temporal gradients
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Conventional image motion based structure from motion methods first compute optical flow, then solve for the 3D motion parameters based on the epipolar constraint, and finally recover the 3D geometry of the scene. However, errors in optical flow due to regularization can lead to large errors in 3D motion and structure. This paper investigates whether performance and consistency can be improved by avoiding optical flow estimation in the early stages of the structure from motion pipeline, and it proposes a new direct method based on image gradients (normal flow) only. The main idea lies in a reformulation of the positive-depth constraint, which allows the use of well-known minimization techniques to solve for 3D motion. The 3D motion estimate is then refined and structure estimated adding a regularization based on depth. Experimental comparisons on standard synthetic datasets and the real-world driving benchmark dataset KITTI using three different optic flow algorithms show that the method achieves better accuracy in all but one case. Furthermore, it outperforms existing normal flow based 3D motion estimation techniques. Finally, the recovered 3D geometry is shown to be also very accurate.

[57]  arXiv:1805.06645 [pdf, ps, other]
Title: Performance Analysis and Optimization of Cooperative Full-Duplex D2D Communication Underlaying Cellular Networks
Comments: 31 pages, 10 figures, submitted to IEEE journal for possible publication
Subjects: Information Theory (cs.IT)

This paper investigates the cooperative full-duplex device-to-device (D2D) communication underlaying cellular network, where the cellular user (CU) acts as a full-duplex relay to assist the D2D communication. To simultaneously support D2D relaying and uplink transmission, superposition coding and successive interference cancellation are adopted at the CU and the D2D receiver, respectively. The achievable rate region and joint outage probability are derived to characterize the performance of the considered system. An optimal power allocation scheme is proposed to maximize the minimum achievable rate. Besides, by analyzing the upper bound of the joint outage probability, we study a suboptimal power allocation to improve the outage performance. The simulation results confirm the theoretical analysis and the advantages of the proposed power allocation schemes.

[58]  arXiv:1805.06648 [pdf, ps, other]
Title: Extrapolation in NLP
Subjects: Computation and Language (cs.CL)

We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.

[59]  arXiv:1805.06652 [pdf, other]
Title: Affective computing using speech and eye gaze: a review and bimodal system proposal for continuous affect prediction
Comments: Submitted to International Journal of Human-Computer Studies
Subjects: Human-Computer Interaction (cs.HC)

Speech has been a widely used modality in the field of affective computing. Recently however, there has been a growing interest in the use of multi-modal affective computing systems. These multi-modal systems incorporate both verbal and non-verbal features for affective computing tasks. Such multi-modal affective computing systems are advantageous for emotion assessment of individuals in audio-video communication environments such as teleconferencing, healthcare, and education. From a review of the literature, the use of eye gaze features extracted from video is a modality that has remained largely unexploited for continuous affect prediction. This work presents a review of the literature within the emotion classification and continuous affect prediction sub-fields of affective computing for both speech and eye gaze modalities. Additionally, continuous affect prediction experiments using speech and eye gaze modalities are presented. A baseline system is proposed using open source software, the performance of which is assessed on a publicly available audio-visual corpus. Further system performance is assessed in a cross-corpus and cross-lingual experiment. The experimental results suggest that eye gaze is an effective supportive modality for speech when used in a bimodal continuous affect prediction system. The addition of eye gaze to speech in a simple feature fusion framework yields a prediction improvement of 6.13% for valence and 1.62% for arousal.

[60]  arXiv:1805.06655 [pdf, other]
Title: Payload-size and Deadline-aware Scheduling for Upcoming 5G Networks: Experimental Validation in High-load Scenarios
Subjects: Networking and Internet Architecture (cs.NI)

High data rates, low latencies, and a widespread availability are the key elements why current cellular network technologies are used for many different applications. However, the coexistence of different data traffic types in the same 4G/5G-based public mobile network results in a significant growth of interfering data traffic competing for transmission. Particularly in the context of time-critical and highly-dynamic Cyber Physical Systems (CPS) and Vehicle-to-Everything (V2X) applications, the compliance with deadlines and therefore the efficient allocation of scarce mobile radio resources is of high importance. Hence, scheduling solutions are required offering a good trade-off between the compliance with deadlines and a resource-efficient allocation of rare resources in mobile networks. In this paper, we present the results of an experimental validation of the Payload-Size and Deadline-Aware (PayDA) scheduling algorithm using a Software-Defined Radio (SDR)-based eNodeB. The results of the experimental validation prove the high efficiency of the proposed PayDA scheduling algorithm for time-critical applications in both miscellaneous and homogeneous data traffic.

[61]  arXiv:1805.06657 [pdf]
Title: Deep-learning Based Modeling of Fault Detachment Stability for Power Grid
Comments: in Chinese
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

The project intends to model the stability of power system with a deep learning algorithm to the problem, aiming to delay the removal of the fault. The so-called "fail-delay cut-off" refers to the occurrence of N-1 backup protection action on the backbone network of the system, resulting in longer time for the removal of the fault. In practice, through the analysis and calculation of a large number of online data, we have found that the N-1 failure system of the main protection action will not be unstable, which is also a guarantee of the operation mode arrangement. In the case of the N-1 backup protection action, there is an approximately 2.5% probability that the system will be destabilized. Therefore, research is needed to improve the operating arrangement.

[62]  arXiv:1805.06661 [pdf, other]
Title: Constructing Customized Multi-Hop Topologies in Dense Wireless Network Testbeds
Comments: 12 pages, 11 figures
Subjects: Networking and Internet Architecture (cs.NI)

Testbeds are a key element in the evaluation of wireless multi-hop networks. In order to relieve researchers from the hassle of deploying their own testbeds, remotely controllable testbeds, such as the FIT/IoT-LAB, are built. However, while the IoT-LAB has a high number of nodes, they are deployed in constraint areas. This, together with the complex nature of radio propagation, makes an ad-hoc construction of multi-hop topologies with a high number of hops difficult. This work presents a strategic approach to solve this problem and proposes algorithms to generate topologies with desired properties. The implementation is evaluated for the IoT-LAB testbeds and is provided as open-source software. The results show that preset topologies of various types can be built even in dense wireless testbeds.

[63]  arXiv:1805.06664 [pdf, ps, other]
Title: Evolutionary RL for Container Loading
Comments: 6 Pages, 2 figures, accepted in ESANN 2018
Subjects: Artificial Intelligence (cs.AI)

Loading the containers on the ship from a yard, is an impor- tant part of port operations. Finding the optimal sequence for the loading of containers, is known to be computationally hard and is an example of combinatorial optimization, which leads to the application of simple heuristics in practice. In this paper, we propose an approach which uses a mix of Evolutionary Strategies and Reinforcement Learning (RL) tech- niques to find an approximation of the optimal solution. The RL based agent uses the Policy Gradient method, an evolutionary reward strategy and a Pool of good (not-optimal) solutions to find the approximation. We find that the RL agent learns near-optimal solutions that outperforms the heuristic solutions. We also observe that the RL agent assisted with a pool generalizes better for unseen problems than an RL agent without a pool. We present our results on synthetic data as well as on subsets of real-world problems taken from container terminal. The results validate that our approach does comparatively better than the heuristics solutions available, and adapts to unseen problems better.

[64]  arXiv:1805.06665 [pdf, other]
Title: Classifying medical relations in clinical text via convolutional neural networks
Authors: Bin He, Yi Guan, Rui Dai
Comments: Accepted by Artificial Intelligence In Medicine
Subjects: Computation and Language (cs.CL)

Deep learning research on relation classification has achieved solid performance in the general domain. This study proposes a convolutional neural network (CNN) architecture with a multi-pooling operation for medical relation classification on clinical records and explores a loss function with a category-level constraint matrix. Experiments using the 2010 i2b2/VA relation corpus demonstrate these models, which do not depend on any external features, outperform previous single-model methods and our best model is competitive with the existing ensemble-based method.

[65]  arXiv:1805.06670 [pdf, ps, other]
Title: Show me the Cache: Optimizing Cache-Friendly Recommendations for Sequential Content Access
Subjects: Networking and Internet Architecture (cs.NI)

Caching has been successfully applied in wired networks, in the context of Content Distribution Networks (CDNs), and is quickly gaining ground for wireless systems. Storing popular content at the edge of the network (e.g. at small cells) is seen as a `win-win' for both the user (reduced access latency) and the operator (reduced load on the transport network and core servers). Nevertheless, the much smaller size of such edge caches, and the volatility of user preferences suggest that standard caching methods do not suffice in this context. What is more, simple popularity-based models commonly used (e.g. IRM) are becoming outdated, as users often consume multiple contents in sequence (e.g. YouTube, Spotify), and this consumption is driven by recommendation systems. The latter presents a great opportunity to bias the recommender to minimize content access cost (e.g. maximizing cache hit rates). To this end, in this paper we first propose a Markovian model for recommendation-driven user requests. We then formulate the problem of biasing the recommendation algorithm to minimize access cost, while maintaining acceptable recommendation quality. We show that the problem is non-convex, and propose an iterative ADMM-based algorithm that outperforms existing schemes, and shows significant potential for performance improvement on real content datasets.

[66]  arXiv:1805.06677 [pdf, other]
Title: Realizing Wireless Communication through Software-defined HyperSurface Environments
Comments: This paper appears at the 19TH IEEE WOWMOM 2018, JUNE 12-15, 2018. (Technical program: this http URL) This work was funded by the European Union via the Horizon 2020: Future Emerging Topics call (FETOPEN-RIA), grant EU736876, project VISORSURF (this http URL) : HyperSurfaces-A Hardware Platform for Software-driven Functional Metasurfaces
Subjects: Emerging Technologies (cs.ET); Networking and Internet Architecture (cs.NI); Software Engineering (cs.SE); Systems and Control (cs.SY)

Wireless communication environments are unaware of the ongoing data exchange efforts within them. Moreover, their effect on the communication quality is intractable in all but the simplest cases. The present work proposes a new paradigm, where indoor scattering becomes software-defined and, subsequently, optimizable across wide frequency ranges. Moreover, the controlled scattering can surpass natural behavior, exemplary overriding Snell's law, reflecting waves towards any custom angle (including negative ones). Thus, path loss and multi-path fading effects can be controlled and mitigated. The core technology of this new paradigm are metasurfaces, planar artificial structures whose effect on impinging electromagnetic waves is fully defined by their macro-structure. The present study contributes the software-programmable wireless environment model, consisting of several HyperSurface tiles controlled by a central, environment configuration server. HyperSurfaces are a novel class of metasurfaces whose structure and, hence, electromagnetic behavior can be altered and controlled via a software interface. Multiple networked tiles coat indoor objects, allowing fine-grained, customizable reflection, absorption or polarization overall. A central server calculates and deploys the optimal electromagnetic interaction per tile, to the benefit of communicating devices. Realistic simulations using full 3D ray-tracing demonstrate the groundbreaking potential of the proposed approach in 2.4 GHz and 60 GHz frequencies.

[67]  arXiv:1805.06680 [pdf]
Title: Detecting cyber threats through social network analysis: short survey
Journal-ref: Lyudmyla Kirichenko, Tamara Radivilova, Anders Carlsson. Detecting cyber threats through social network analysis: short survey. SocioEconomic Challenges, Volume 1, Issue 1, 2017. pp.20-34
Subjects: Social and Information Networks (cs.SI); Cryptography and Security (cs.CR); Computers and Society (cs.CY)

This article considers a short survey of basic methods of social networks analysis, which are used for detecting cyber threats. The main types of social network threats are presented. Basic methods of graph theory and data mining, that deals with social networks analysis are described. Typical security tasks of social network analysis, such as community detection in network, detection of leaders in communities, detection experts in networks, clustering text information and others are considered.

[68]  arXiv:1805.06685 [pdf, other]
Title: The Synchronizing Probability Function for Primitive Sets of Matrices
Comments: 18 pages, 4 figures. Submitted to DLT 2018
Subjects: Formal Languages and Automata Theory (cs.FL); Discrete Mathematics (cs.DM)

Motivated by recent results relating synchronizing automata and primitive sets, we tackle the synchronization process and the related longstanding \v{C}ern\'y conjecture by studying the primitivity phenomenon for sets of nonnegative matrices having neither zero-rows nor zero-columns. We formulate the primitivity process in the setting of a two-player probabilistic game and we make use of convex optimization techniques to describe its behavior. We report numerical results and supported by them we state a conjecture that, if true, would imply an upper bound of n(n-1) on the reset threshold of a certain (broad) class of automata.

[69]  arXiv:1805.06691 [pdf]
Title: Test for penetration in Wi-Fi network: attacks on WPA2-PSK and WPA2-Enterprise
Comments: 4 pages
Journal-ref: T. Radivilova and H. A. Hassan, "Test for penetration in Wi-Fi network: Attacks on WPA2-PSK and WPA2-enterprise," (UkrMiCo), Odessa, 2017, pp. 1-4
Subjects: Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)

In this work the wireless networks security algorithms were analyzed. The fundamentals of the WPA and WPA2 safety algorithms, their weaknesses and ways of attacking WPA and WPA2 Enterprise Wireless Networks are described. Successful attack on the WPA2-PSK and WPA2-Enterprise was carried out during the performance of work. The progress of this attack and its results were described.

[70]  arXiv:1805.06692 [pdf, ps, other]
Title: A Note on Polynomial Identity Testing for Depth-3 Circuits
Subjects: Computational Complexity (cs.CC)

Let $C$ be a depth-3 arithmetic circuit of size at most $s$, computing a polynomial $ f \in \mathbb{F}[x_1,\ldots, x_n] $ (where $\mathbb{F}$ = $\mathbb{Q}$ or $\mathbb{C}$) and the fan-in of the product gates of $C$ is bounded by $d$. We give a deterministic polynomial identity testing algorithm to check whether $f\equiv 0$ or not in time $ 2^d \text{ poly}(n,s) $.

[71]  arXiv:1805.06693 [pdf, other]
Title: Hierarchical Beamforming: Resource Allocation, Fairness and Flow Level Performance
Comments: 34 pages
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT); Performance (cs.PF); Systems and Control (cs.SY)

We consider hierarchical beamforming in wireless networks. For a given population of flows, we propose computationally efficient algorithms for fair rate allocation including proportional fairness and max-min fairness. We next propose closed-form formulas for flow level performance, for both elastic (with either proportional fairness and max-min fairness) and streaming traffic. We further assess the performance of hierarchical beamforming using numerical experiments. Since the proposed solutions have low complexity compared to conventional beamforming, our work suggests that hierarchical beamforming is a promising candidate for the implementation of beamforming in future cellular networks.

[72]  arXiv:1805.06695 [pdf, other]
Title: Survey on Multi-Access Edge Computing for Internet of Things Realization
Comments: Submitted to IEEE Communications Surveys & Tutorials
Subjects: Networking and Internet Architecture (cs.NI)

The Internet of Things (IoT) has recently advanced from an experimental technology to what will become the backbone of future customer value for both product and service sector businesses. This underscores the cardinal role of IoT on the journey towards the fifth generation (5G) of wireless communication systems. IoT technologies augmented with intelligent and big data analytics are expected to rapidly change the landscape of myriads of application domains ranging from health care to smart cities and industrial automations. The emergence of Multi-Access Edge Computing (MEC) technology aims at extending cloud computing capabilities to the edge of the radio access network, hence providing real-time, high-bandwidth, low-latency access to radio network resources. IoT is identified as a key use case of MEC, given MEC's ability to provide cloud platform and gateway services at the network edge. MEC will inspire the development of myriads of applications and services with demand for ultra low latency and high Quality of Service (QoS) due to its dense geographical distribution and wide support for mobility. MEC is therefore an important enabler of IoT applications and services which require real-time operations. In this survey, we provide a holistic overview on the exploitation of MEC technology for the realization of IoT applications and their synergies. We further discuss the technical aspects of enabling MEC in IoT and provide some insight into various other integration technologies therein.

[73]  arXiv:1805.06698 [pdf, other]
Title: Fuzzy Membership Function Implementation with Memristor
Subjects: Emerging Technologies (cs.ET)

The neuro-fuzzy system is network which resemble human-like operation of the naturally imprecise data and decision-making. This paper proposes implementation of the fundamental module of the neuro-fuzzy system - membership function (MF), realized as a analog electronic hardware. The memristive crossbar arrays are used as the architecture for proposed MF analog circuit. The main advantages of the memristive crossbar circuit are size, energy efficiency and fault tolerance compared to another analog alternatives. The conducted crossbar SPICE simulation with MS model of the memristor results confirm the performance and highlighted benefits of the proposed circuit.

[74]  arXiv:1805.06699 [pdf, other]
Title: Dual parameterization of Weighted Coloring
Comments: 13 pages
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC)

Given a graph $G$, a proper $k$-coloring of $G$ is a partition $c = (S_i)_{i\in [1,k]}$ of $V(G)$ into $k$ stable sets $S_1,\ldots, S_{k}$. Given a weight function $w: V(G) \to \mathbb{R}^+$, the weight of a color $S_i$ is defined as $w(i) = \max_{v \in S_i} w(v)$ and the weight of a coloring $c$ as $w(c) = \sum_{i=1}^{k}w(i)$. Guan and Zhu [Inf. Process. Lett., 1997] defined the weighted chromatic number of a pair $(G,w)$, denoted by $\sigma(G,w)$, as the minimum weight of a proper coloring of $G$. The problem of determining $\sigma(G,w)$ has received considerable attention during the last years, and has been proved to be notoriously hard: for instance, it is NP-hard on split graphs, unsolvable on $n$-vertex trees in time $n^{o(\log n)}$ unless the ETH fails, and W[1]-hard on forests parameterized by the size of a largest tree. In this article we provide some positive results for the problem, by considering its so-called dual parameterization: given a vertex-weighted graph $(G,w)$ and an integer $k$, the question is whether $\sigma(G,w) \leq \sum_{v \in V(G)} w(v) - k$. We prove that this problem is FPT by providing an algorithm running in time $9^k \cdot n^{O(1)}$, and it is easy to see that no algorithm in time $2^{o(k)} \cdot n^{O(1)}$ exists under the ETH. On the other hand, we present a kernel with at most $(2^{k-1}+1) (k-1)$ vertices, and we rule out the existence of polynomial kernels unless ${\sf NP} \subseteq {\sf coNP} / {\sf poly}$, even on split graphs with only two different weights. Finally, we identify some classes of graphs on which the problem admits a polynomial kernel, in particular interval graphs and subclasses of split graphs, and in the latter case we present lower bounds on the degrees of the polynomials.

[75]  arXiv:1805.06701 [pdf, ps, other]
Title: Quadratic Word Equations with Length Constraints, Counter Systems, and Presburger Arithmetic with Divisibility
Comments: 18 pages
Subjects: Logic in Computer Science (cs.LO)

Word equations are a crucial element in the theoretical foundation of constraint solving over strings, which have received a lot of attention in recent years. A word equation relates two words over string variables and constants. Its solution amounts to a function mapping variables to constant strings that equate the left and right hand sides of the equation. While the problem of solving word equations is decidable, the decidability of the problem of solving a word equation with a length constraint (i.e., a constraint relating the lengths of words in the word equation) has remained a long-standing open problem. In this paper, we focus on the subclass of quadratic word equations, i.e., in which each variable occurs at most twice. We first show that the length abstractions of solutions to quadratic word equations are in general not Presburger-definable. We then describe a class of counter systems with Presburger transition relations which capture the length abstraction of a quadratic word equation with regular constraints. We provide an encoding of the effect of a simple loop of the counter systems in the theory of existential Presburger Arithmetic with divisibility (PAD). Since PAD is decidable, we get a decision procedure for quadratic words equations with length constraints for which the associated counter system is \emph{flat} (i.e., all nodes belong to at most one cycle). We show a decidability result (in fact, also an NP algorithm with a PAD oracle) for a recently proposed NP-complete fragment of word equations called regular-oriented word equations, together with length constraints. Decidability holds when the constraints are additionally extended with regular constraints with a 1-weak control structure.

[76]  arXiv:1805.06702 [pdf, other]
Title: Data-Driven Nonlinear Identification of Li-Ion Battery Based on a Frequency Domain Nonparametric Analysis
Journal-ref: IEEE Transactions on Control Systems Technology, Volume: 25, Issue: 5, Sept. 2017
Subjects: Systems and Control (cs.SY)

Lithium ion batteries are attracting significant and growing interest, because their high energy and high power density render them an excellent option for energy storage, particularly in hybrid and electric vehicles. In this brief, a data-driven polynomial nonlinear state-space model is proposed for the operating points at the cusp of linear and nonlinear regimes of the battery's electrical operation, based on the thorough nonparametric frequency domain characterization and quantification of the battery's behavior in terms of its linear and nonlinear behavior at different levels of the state of charge.

[77]  arXiv:1805.06706 [pdf, ps, other]
Title: Systematic encoders for generalized Gabidulin codes and the $q$-analogue of Cauchy matrices
Authors: Alessandro Neri
Comments: 26 pages
Subjects: Information Theory (cs.IT)

We characterize the generator matrix in standard form of generalized Gabidulin codes. The parametrization we get for the non-systematic part of this matrix coincides with the $q$-analogue of generalized Cauchy matrices, leading to the definition of generalized rank Cauchy matrices. These matrices can be represented very conveniently and their representation allows to define new interesting subfamilies of generalized Gabidulin codes whose generator matrix is a structured matrix. In particular, as an application, we construct Gabidulin codes whose generator matrix is the concatenation of an identity block and a Toeplitz/Hankel matrix. In addition, our results allow to give a new efficient criterion to verify whether a rank metric code of dimension $k$ and length $n$ is a generalized Gabidulin code. This criterion is only based on the computation of the rank of one matrix and on the verification of the linear independence of two sets of elements and it requires $\mathcal O(k^2n)$ field operations.

[78]  arXiv:1805.06723 [pdf, other]
Title: On randomized generation of slowly synchronizing automata
Comments: 19 pages, 6 figures. Submitted to MFCS 2018
Subjects: Formal Languages and Automata Theory (cs.FL); Discrete Mathematics (cs.DM)

Motivated by the randomized generation of slowly synchronizing automata, we study automata made of permutation letters and a merging letter of rank $ n- 1 $. We present a constructive randomized procedure to generate minimal synchronizing automata of that kind with (potentially) large alphabet size based on recent results on \textit{primitive} sets of matrices. We report numerical results showing that our algorithm finds automata with much larger reset threshold than a mere uniform random generation and we present new families of automata with reset threshold of $ \Omega(n^2/4) $. We finally report theoretical results on randomized generation of primitive sets of matrices: a set of permutation matrices with a $ 0 $ entry changed into a $ 1 $ is primitive and has exponent of $ O(n\log n) $ with high probability in case of uniform random distribution and the same holds for a random set of binary matrices where each entry is set, independently, equal to $ 1 $ with probability $ p $ and equal to $ 0 $ with probability $ 1-p $, when $ np-\log n\rightarrow\infty $ as $ n\rightarrow\infty $.

[79]  arXiv:1805.06724 [pdf, other]
Title: Exploiting the Superposition Property of Wireless Communication for Max-Consensus Problems in Multi-Agent Systems
Comments: Submitted for IFAC Workshop on Distributed Estimation and Control in Networked Systems
Subjects: Systems and Control (cs.SY)

This paper presents a consensus protocol that achieves max-consensus in multi-agent systems over wireless channels. Interference, a feature of the wireless channel, is exploited: each agent receives a superposition of broadcast data, rather than individual values. With this information, the system endowed with the proposed consensus protocol reaches max-consensus in a finite number of steps. A comparison with traditional approaches shows that the proposed consensus protocol achieves a faster convergence.

[80]  arXiv:1805.06725 [pdf, other]
Title: GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training
Comments: 10 pages, 4 figures. Conference Submission
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Anomaly detection is a classical problem in computer vision, namely the determination of the normal from the abnormal when datasets are highly biased towards one class (normal) due to the insufficient sample size of the other class (abnormal). While this can be addressed as a supervised learning problem, a significantly more challenging problem is that of detecting the unknown/unseen anomaly case that takes us instead into the space of a one-class, semi-supervised learning paradigm. We introduce such a novel anomaly detection model, by using a conditional generative adversarial network that jointly learns the generation of high-dimensional image space and the inference of latent space. Employing encoder-decoder-encoder sub-networks in the generator network enables the model to map the input image to a lower dimension vector, which is then used to reconstruct the generated output image. The use of the additional encoder network maps this generated image to its latent representation. Minimizing the distance between these images and the latent vectors during training aids in learning the data distribution for the normal samples. As a result, a larger distance metric from this learned data distribution at inference time is indicative of an outlier from that distribution - an anomaly. Experimentation over several benchmark datasets, from varying domains, shows the model efficacy and superiority over previous state-of-the-art approaches.

[81]  arXiv:1805.06728 [pdf, other]
Title: A Distributed Algorithm for Finding Hamiltonian Cycles in Random Graphs in O(log n) Time
Authors: Volker Turau
Comments: 17 pages; 4 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

It is known for some time that a random graph $G(n,p)$ contains w.h.p. a Hamiltonian cycle if $p$ is larger than the critical value $p_{crit}= (\log n + \log \log n + \omega_n)/n$. The determination of a concrete Hamiltonian cycle is even for values much larger than $p_{crit}$ a nontrivial task. In this paper we consider random graphs $G(n,p)$ with $p$ in $\tilde{\Omega}(1/\sqrt{n})$, where $\tilde{\Omega}$ hides poly-logarithmic factors in $n$. For this range of $p$ we present a distributed algorithm ${\cal A}_{HC}$ that finds w.h.p. a Hamiltonian cycle in $O(\log n)$ rounds. The algorithm works in the synchronous model and uses messages of size $O(\log n)$ and $O(\log n)$ memory per node.

[82]  arXiv:1805.06736 [pdf, ps, other]
Title: Strict Ideal Completions of the Lambda Calculus
Authors: Patrick Bahr
Subjects: Logic in Computer Science (cs.LO)

The infinitary lambda calculi pioneered by Kennaway et al. extend the basic lambda calculus by metric completion to infinite terms and reductions. Depending on the chosen metric, the resulting infinitary calculi exhibit different notions of strictness. To obtain infinitary normalisation and infinitary confluence properties for these calculi, Kennaway et al. extend $\beta$-reduction with infinitely many `$\bot$-rules', which contract meaningless terms directly to $\bot$. Three of the resulting B\"ohm reduction calculi have unique infinitary normal forms corresponding to B\"ohm-like trees.
In this paper we develop a corresponding theory of infinitary lambda calculi based on ideal completion instead of metric completion. We show that each of our calculi conservatively extends the corresponding metric-based calculus. Three of our calculi are infinitarily normalising and confluent; their unique infinitary normal forms are exactly the B\"ohm-like trees of the corresponding metric-based calculi. Our calculi dispense with the infinitely many $\bot$-rules of the metric-based calculi. The fully non-strict calculus (called $111$) consists of only $\beta$-reduction, while the other two calculi (called $001$ and $101$) require two additional rules that precisely state their strictness properties: $\lambda x.\bot \to \bot$ (for $001$) and $\bot\,M \to \bot$ (for $001$ and $101$).

[83]  arXiv:1805.06737 [pdf, other]
Title: A Robust Background Initialization Algorithm with Superpixel Motion Detection
Comments: submitted to Elsevier Signal Processing: Image Communication
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Scene background initialization allows the recovery of a clear image without foreground objects from a video sequence, which is generally the first step in many computer vision and video processing applications. The process may be strongly affected by some challenges such as illumination changes, foreground cluttering, intermittent movement, etc. In this paper, a robust background initialization approach based on superpixel motion detection is proposed. Both spatial and temporal characteristics of frames are adopted to effectively eliminate foreground objects. A subsequence with stable illumination condition is first selected for background estimation. Images are segmented into superpixels to preserve spatial texture information and foreground objects are eliminated by superpixel motion filtering process. A low-complexity density-based clustering is then performed to generate reliable background candidates for final background determination. The approach has been evaluated on SBMnet dataset and it achieves a performance superior or comparable to other state-of-the-art works with faster processing speed. Moreover, in those complex and dynamic categories, the algorithm produces the best results showing the robustness against very challenging scenarios.

[84]  arXiv:1805.06741 [pdf, ps, other]
Title: Minimum Margin Loss for Deep Face Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Face recognition has achieved great progress owing to the fast development of the deep neural network in the past a few years. As the baton in a deep neural network, a number of the loss functions have been proposed which significantly improve the state-of-the-art methods. In this paper, we proposed a new loss function called Minimum Margin Loss (MML) which aims at enlarging the margin of those over-close class centre pairs so as to enhance the discriminative ability of the deep features. MML supervises the training process together with the Softmax loss and the Centre loss, and also makes up the defect of Softmax + Centre loss. The experimental results on LFW and YTF datasets show that the proposed method achieves the state-of-the-art performance, which demonstrates the effectiveness of the proposed MML.

[85]  arXiv:1805.06745 [pdf]
Title: Web Resource for Storing Collective Experience
Authors: Olegs Verhodubs
Subjects: Information Retrieval (cs.IR)

Experience is what makes our life more effective that is why it is necessary to share experience among people. The use of information technologies is the most technological way to work with experience, and the use of the Web is the best way for sharing it. This paper describes a web resource designed for storing, sharing and using experience that is obtained from different people in the Web. The main purpose of this paper is to present this web resource in order to evaluate the interest in such a web resource.

[86]  arXiv:1805.06747 [pdf, other]
Title: ReCA: an Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization
Journal-ref: IEEE TPDS 2018
Subjects: Performance (cs.PF); Hardware Architecture (cs.AR)

In recent years, SSDs have gained tremendous attention in computing and storage systems due to significant performance improvement over HDDs. The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer to store performance critical data blocks to reduce the number of accesses to disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates to take such characteristics into consideration. Previous caching techniques are optimized towards only one type of application, which affects both generality and applicability. In addition, they are not adaptive when the workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems using a comprehensive workload characterization to find an optimal cache configuration for I/O intensive applications. For this purpose, we first investigate various types of I/O workloads and classify them into five major classes. Based on this characterization, an optimal cache configuration is presented for each class of workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime and the cache organization is reconfigured if the application changes from one class to another class of workloads. The cache reconfiguration is done online and workload classes can be extended to emerging I/O workloads in order to maintain its efficiency with the characteristics of I/O requests. Experimental results obtained by implementing ReCA in a server running Linux show that the proposed architecture improves performance and lifetime up to 24\% and 33\%, respectively.

[87]  arXiv:1805.06749 [pdf, ps, other]
Title: Action Completion: A Temporal Model for Moment Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We introduce completion moment detection for actions - the problem of locating the moment of completion, when the action's goal is confidently considered achieved. The paper proposes a joint classification-regression recurrent model that predicts completion from a given frame, and then integrates frame-level contributions to detect sequence-level completion moment. We introduce a recurrent voting node that predicts the frame's relative position of the completion moment by either classification or regression. The method is also capable of detecting incompletion. For example, the method is capable of detecting a missed ball-catch, as well as the moment at which the ball is safely caught. We test the method on 16 actions from three public datasets, covering sports as well as daily actions. Results show that when combining contributions from frames prior to the completion moment as well as frames post completion, the completion moment is detected within one second in 89% of all tested sequences.

[88]  arXiv:1805.06752 [pdf, other]
Title: Scheduling Policies for Age Minimization in Wireless Networks with Unknown Channel State
Comments: arXiv admin note: substantial text overlap with arXiv:1803.06471
Subjects: Networking and Internet Architecture (cs.NI)

Age of information (AoI) is a recently proposed metric that measures the time elapsed since the generation of the last received information update. We consider the problem of AoI minimization for a network under general interference constraints, and time varying channel. We study the case where the channel statistics are known, but the current channel state is unknown. We propose two scheduling policies, namely, the virtual queue based policy and age-based policy. In the virtual queue based policy, the scheduler schedules links with maximum weighted sum of the virtual queue lengths, while in the age-based policy, the scheduler schedules links with maximum weighted sum of a function of link AoI. We prove that the virtual queue based policy is peak age optimal, up to an additive constant, while the age-based policy is at most factor 4 away from the optimal age. Numerical results suggest that both the proposed policies are, in fact, very close to the optimal.

[89]  arXiv:1805.06757 [pdf, ps, other]
Title: Matching Consecutive Subpatterns Over Streaming Time Series
Comments: 15 pages, 8 figures
Subjects: Databases (cs.DB)

Pattern matching of streaming time series with lower latency under limited computing resource comes to a critical problem, especially as the growth of Industry 4.0 and Industry Internet of Things. However, against traditional single pattern matching model, a pattern may contain multiple subpatterns representing different physical meanings in the real world. Hence, we formulate a new problem, called "consecutive subpatterns matching", which allows users to specify a pattern containing several consecutive subpatterns with various specified thresholds. We propose a novel representation Equal-Length Block (ELB) together with two efficient implementations, which work very well under all Lp-Norms without false dismissals. Extensive experiments are performed on synthetic and real-world datasets to illustrate that our approach outperforms the brute-force method and MSM, a multi-step filter mechanism over the multi-scaled representation by orders of magnitude.

[90]  arXiv:1805.06771 [pdf, other]
Title: Convolutional Social Pooling for Vehicle Trajectory Prediction
Comments: Accepted for publication at CVPR TrajNet Workshop, 2018. arXiv admin note: text overlap with arXiv:1805.05499
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Forecasting the motion of surrounding vehicles is a critical ability for an autonomous vehicle deployed in complex traffic. Motion of all vehicles in a scene is governed by the traffic context, i.e., the motion and relative spatial configuration of neighboring vehicles. In this paper we propose an LSTM encoder-decoder model that uses convolutional social pooling as an improvement to social pooling layers for robustly learning interdependencies in vehicle motion. Additionally, our model outputs a multi-modal predictive distribution over future trajectories based on maneuver classes. We evaluate our model using the publicly available NGSIM US-101 and I-80 datasets. Our results show improvement over the state of the art in terms of RMS values of prediction error and negative log-likelihoods of true future trajectories under the model's predictive distribution. We also present a qualitative analysis of the model's predicted distributions for various traffic scenarios.

[91]  arXiv:1805.06775 [pdf, other]
Title: Circularly Pulse-Shaped Precoding for OFDM: A New Waveform and Its Optimization Design for 5G New Radio
Comments: 15 pages, 21 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Information Theory (cs.IT)

A new circularly pulse-shaped (CPS) precoding orthogonal frequency division multiplexing (OFDM) waveform, or CPS-OFDM for short, is proposed in this paper. CPS-OFDM, characterized by user-specific precoder flexibility, possesses the advantages of both low out-of-subband emission (OSBE) and low peak-to-average power ratio (PAPR), which are two major desired physical layer signal properties for various scenarios in 5G New Radio (NR), including fragmented spectrum access, new types of user equipments (UEs), and communications at high carrier frequencies. As opposed to most of existing waveform candidates using windowing or filtering techniques, CPS-OFDM prevents block extension that causes extra inter-block interference (IBI) and envelope fluctuation unfriendly to signal reception and power amplifier (PA) efficiency, respectively. An optimization problem of the prototype shaping vector built in the CPS precoder is formulated to minimize the variance of instantaneous power (VIP) with controllable OSBE power (OSBEP) and noise enhancement penalty (NEP). In order to solve the optimization problem involving a quartic objective function, the majorization-minimization (MM) algorithmic framework is exploited. By proving the convexity of the proposed problem, the globally optimal solution invariant of coming data is guaranteed to be attained via numbers of iterations. Simulation results demonstrate the advantages of the proposed scheme in terms of detection reliability and spectral efficiency for practical 5G cases such as asynchronous transmissions and mixed numerologies.

[92]  arXiv:1805.06776 [pdf, other]
Title: Situation Assessment for Planning Lane Changes: Combining Recurrent Models and Prediction
Comments: ICRA 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV)

One of the greatest challenges towards fully autonomous cars is the understanding of complex and dynamic scenes. Such understanding is needed for planning of maneuvers, especially those that are particularly frequent such as lane changes. While in recent years advanced driver-assistance systems have made driving safer and more comfortable, these have mostly focused on car following scenarios, and less on maneuvers involving lane changes. In this work we propose a situation assessment algorithm for classifying driving situations with respect to their suitability for lane changing. For this, we propose a deep learning architecture based on a Bidirectional Recurrent Neural Network, which uses Long Short-Term Memory units, and integrates a prediction component in the form of the Intelligent Driver Model. We prove the feasibility of our algorithm on the publicly available NGSIM datasets, where we outperform existing methods.

[93]  arXiv:1805.06780 [pdf, other]
Title: The Crossing Number of Single-Pair-Seq-Shellable Drawings of Complete Graphs
Comments: arXiv admin note: substantial text overlap with arXiv:1803.07515
Subjects: Computational Geometry (cs.CG); Combinatorics (math.CO)

The Harary-Hill Conjecture states that for $n\geq 3$ every drawing of $K_n$ has at least \begin{align*} H(n) := \frac{1}{4}\Big\lfloor\frac{n}{2}\Big\rfloor\Big\lfloor\frac{n-1}{2}\Big\rfloor\Big\lfloor\frac{n-2}{2}\Big\rfloor\Big\lfloor\frac{n-3}{2}\Big\rfloor \end{align*} crossings. In general the problem remains unsolved, however there has been some success in proving the conjecture for restricted classes of drawings. The most recent and most general of these classes is seq-shellability. In this work, we improve these results and introduce the new class of single-pair-seq-shellable drawings. We prove the Harary-Hill Conjecture for this new class using novel results on triple cumulated $k$-edges. So far, all approaches for proving the Harary-Hill Conjecture for specific classes rely on a globally fixed reference face. We successfully apply new techniques in order to loosen this restriction, which enables us to select different reference faces when considering subdrawings. Furthermore, we introduce the notion of $k$-deviations as the difference between an optimal and the actual number of $k$-edges. Using $k$-deviations, we gain interesting insights into the essence of $k$-edges, and we further relax the necessity of a fixed reference face.

[94]  arXiv:1805.06781 [pdf, other]
Title: Property and structure in constructive analysis
Authors: Auke B. Booij
Comments: 19 pages, submitted
Subjects: Logic (math.LO); Logic in Computer Science (cs.LO)

Real numbers such as Dedekind reals or (quotiented) Cauchy reals (as opposed to Bishop-style Cauchy reals) do not admit a procedure for observing information such as the first digit of its decimal expansion, because, for example, there are no non-constant functions into observable types such as the booleans or the integers. We overcome this by considering real numbers equipped with additional structure, which we call a locator. With this structure, it is possible, for instance, to construct a signed-digit representation or a Cauchy sequence. Such constructions are reminiscent of computable analysis. However, instead of working with a notion of computability, we simply work constructively to extract observational information, by changing one axiom of Dedekind cuts from property into structure.

[95]  arXiv:1805.06782 [pdf, other]
Title: Extraction and Analysis of Dynamic Conversational Networks from TV Series
Authors: Xavier Bost (LIA), Vincent Labatut (LIA), Serigne Gueye (LIA), Georges Linarès (LIA)
Comments: arXiv admin note: substantial text overlap with arXiv:1602.07811
Journal-ref: Springer. Extraction and analysis of dynamic conversational networks from TV series, 2018, \&\#x3008;10.1007/978-3-319-78196-9\_3\&\#x3009
Subjects: Multimedia (cs.MM)

Identifying and characterizing the dynamics of modern tv series subplots is an open problem. One way is to study the underlying social network of interactions between the characters. Standard dynamic network extraction methods rely on temporal integration, either over the whole considered period, or as a sequence of several time-slices. However, they turn out to be inappropriate in the case of tv series, because the scenes shown onscreen alternatively focus on parallel storylines, and do not necessarily respect a traditional chronology. In this article, we introduce Narrative Smoothing, a novel network extraction method taking advantage of the plot properties to solve some of their limitations. We apply our method to a corpus of 3 popular series, and compare it to both standard approaches. Narrative smoothing leads to more relevant observations when it comes to the characterization of the protagonists and their relationships, confirming its appropriateness to model the intertwined storylines constituting the plots.

[96]  arXiv:1805.06786 [pdf, other]
Title: Betting on Blockchain Consensus with Fantomette
Comments: arXiv admin note: text overlap with arXiv:1801.07965
Subjects: Cryptography and Security (cs.CR)

Blockchain-based consensus protocols present the opportunity to develop new protocols, due to their novel requirements of open participation and explicit incentivization of participants. To address the first requirement, it is necessary to consider the leader election inherent in consensus protocols, which can be difficult to scale to a large and untrusted set of participants. To address the second, it is important to consider ways to provide incentivization without relying on the resource-intensive proofs-of-work used in Bitcoin. In this paper, we propose a new leader election protocol, Caucus. We both prove it secure and demonstrate via a prototype implementation its cost-effectiveness. We next fit this leader election protocol into a broader blockchain-based consensus protocol, Fantomette, that treats incentivization as a first-class concern. Again, we argue for the security of Fantomette despite the fact that it does not rely on expensive processes such as proof-of-work.

[97]  arXiv:1805.06792 [pdf, ps, other]
Title: Faster Rates for Convex-Concave Games
Comments: COLT 2018
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

We consider the use of no-regret algorithms to compute equilibria for particular classes of convex-concave games. While standard regret bounds would lead to convergence rates on the order of $O(T^{-1/2})$, recent work \citep{RS13,SALS15} has established $O(1/T)$ rates by taking advantage of a particular class of optimistic prediction algorithms. In this work we go further, showing that for a particular class of games one achieves a $O(1/T^2)$ rate, and we show how this applies to the Frank-Wolfe method and recovers a similar bound \citep{D15}. We also show that such no-regret techniques can even achieve a linear rate, $O(\exp(-T))$, for equilibrium computation under additional curvature assumptions.

[98]  arXiv:1805.06798 [pdf, other]
Title: Generic Deriving of Generic Traversals
Comments: 28 pages, ICFP
Subjects: Programming Languages (cs.PL)

Functional programmers have an established tradition of using traversals as a design pattern to work with recursive data structures. The technique is so prolific that a whole host of libraries have been designed to help in the task of automatically providing traversals by analysing the generic structure of data types. More recently, lenses have entered the functional scene and have proved themselves to be a simple and versatile mechanism for working with product types. They make it easy to focus on the salient parts of a data structure in a composable and reusable manner.
In this paper, we use the combination of lenses and traversals to give rise to an expressive and flexible library for querying and modifying complex data structures. Furthermore, since our lenses and traversals are based on the generic shape of data, we are able to use this information to produce code that is as efficient as hand-written versions. The technique leverages the structure of data to produce generic abstractions that are then eliminated by the standard workhorses of modern functional compilers: inlining and specialisation.

[99]  arXiv:1805.06801 [pdf, other]
Title: Dependability in a Multi-tenant Multi-framework Deep Learning as-a-Service Platform
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Deep learning (DL), a form of machine learning, is becoming increasingly popular in several application domains. As a result, cloud-based Deep Learning as a Service (DLaaS) platforms have become an essential infrastructure in many organizations. These systems accept, schedule, manage and execute DL training jobs at scale.
This paper explores dependability in the context of a DLaaS platform used in IBM. We begin by explaining how DL training workloads are different, and what features ensure dependability in this context. We then describe the architecture, design and implementation of a cloud-based orchestration system for DL training. We show how this system has been architected with dependability in mind while also being horizontally scalable, elastic, flexible and efficient. We also present an initial empirical evaluation of the overheads introduced by our platform, and discuss tradeoffs between efficiency and dependability.

[100]  arXiv:1805.06814 [pdf, other]
Title: Cellular Network Multi-Access Measurements on the Roads of Värmland, Sweden
Comments: Technical Report extending the results presented at the IFIP Network Traffic Measurement and Analysis Conference (TMA'18), Vienna, 2018
Subjects: Networking and Internet Architecture (cs.NI)

Cooperative Intelligent Transport Systems (C-ITS) make road traffic safer and more efficient, but require the mobile networks to handle time-critical applications. While some applications may need new dedicated communications technologies such as IEEE 802.11p or 5G, other applications can use current cellular networks. This study evaluates the performance that connected vehicles can expect from existing networks, and estimates the potential gain of multi-access by simultaneously transmitting over several operators. We upload time-critical warning messages from buses in Sweden, and characterise transaction times and network availability. We conduct the experiments with different protocols: UDP, TCP, and HTTPS. Our results show that when using UDP, the median transaction time for sending a typical warning message is 135 ms. We also show that multi-access can bring this value down to 73 ms. For time-critical applications requiring transaction times under 200 ms, multi-access can increase availability of the network from to 57.4% to 92.0%.

[101]  arXiv:1805.06816 [pdf]
Title: Annotating Electronic Medical Records for Question Answering
Comments: 10 pages, 2016
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)

Our research is in the relatively unexplored area of question answering technologies for patient-specific questions over their electronic health records. A large dataset of human expert curated question and answer pairs is an important pre-requisite for developing, training and evaluating any question answering system that is powered by machine learning. In this paper, we describe a process for creating such a dataset of questions and answers. Our methodology is replicable, can be conducted by medical students as annotators, and results in high inter-annotator agreement (0.71 Cohen's kappa). Over the course of 11 months, 11 medical students followed our annotation methodology, resulting in a question answering dataset of 5696 questions over 71 patient records, of which 1747 questions have corresponding answers generated by the medical students.

[102]  arXiv:1805.06821 [pdf, other]
Title: External memory BWT and LCP computation for sequence collections with applications
Subjects: Data Structures and Algorithms (cs.DS)

We propose an external memory algorithm for the computation of the BWT and LCP array for a collection of sequences. Our algorithm takes the amount of available memory as an input parameter, and tries to make the best use of it by splitting the input collection into subcollections sufficiently small that it can compute their BWT in RAM using an optimal linear time algorithm. Next, it merges the partial BWTs in external memory and in the process it also computes the LCP values. We prove that our algorithm performs O(n AveLcp) sequential I/Os, where n is the total length of the collection, and AveLcp is the average Longest Common Prefix of the collection. This bound is an improvement over the known algorithms for the same task. The experimental results show that our algorithm outperforms the current best algorithm for collections of sequences with different lengths and for collections with relatively small average Longest Common Prefix.
In the second part of the paper, we show that our algorithm can be modified to output two additional arrays that, used with the BWT and LCP arrays, provide simple, scan based, external memory algorithms for three well known problems in bioinformatics: the computation of maximal repeats, the all pairs suffix-prefix overlaps, and the construction of succinct de Bruijn graphs. To our knowledge, there are no other known external memory algorithms for these problems.

[103]  arXiv:1805.06822 [pdf, other]
Title: DNN or $k$-NN: That is the Generalize vs. Memorize Question
Comments: Paper is currently under review for NIPS 2018 conference
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

This paper studies the relationship between the classification performed by deep neural networks and the $k$-NN decision at the embedding space of these networks. This simple important connection shown here provides a better understanding of the relationship between the ability of neural networks to generalize and their tendency to memorize the training data, which are traditionally considered to be contradicting to each other and here shown to be compatible and complementary. Our results support the conjecture that deep neural networks approach Bayes optimal error rates.

[104]  arXiv:1805.06824 [pdf, other]
Title: Learning Time-Sensitive Strategies in Space Fortress
Comments: 8 pages, 3 figures
Subjects: Artificial Intelligence (cs.AI)

Although there has been remarkable progress and impressive performance on reinforcement learning (RL) on Atari games, there are many problems with challenging characteristics that have not yet been explored in Deep Learning for RL. These include reward sparsity, abrupt context-dependent reversals of strategy and time-sensitive game play. In this paper, we present Space Fortress, a game that incorporates all these characteristics and experimentally show that the presence of any of these renders state of the art Deep RL algorithms incapable of learning. Then, we present our enhancements to an existing algorithm and show big performance increases through each enhancement through an ablation study. We discuss how each of these enhancements was able to help and also argue that appropriate transfer learning boosts performance.

[105]  arXiv:1805.06828 [pdf, ps, other]
Title: On two consequences of Berge-Fulkerson conjecture
Comments: 3 pages
Subjects: Discrete Mathematics (cs.DM); Combinatorics (math.CO)

The classical Berge-Fulkerson conjecture states that any bridgeless cubic graph $G$ admits a list of six perfect matchings such that each edge of $G$ belongs to two of the perfect matchings from the list. In this short note, we discuss two statements that are consequences of this conjecture. We show that the first statement is equivalent to Fan-Raspaud conjecture. We also show that the smallest counter-example to the second one is a cyclically $4$-edge-connected cubic graph.

[106]  arXiv:1805.06830 [pdf, other]
Title: Disparity Sliding Window: Object Proposals From Disparity Images
Comments: Submitted to IROS 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Sliding window approaches have been widely used for object recognition tasks in recent years. They guarantee an investigation of the entire input image for the object to be detected and allow a localization of that object. Despite the current trend towards deep neural networks, sliding window methods are still used in combination with convolutional neural networks. The risk of overlooking an object is clearly reduced compared to alternative detection approaches which detect objects based on shape, edges or color. Nevertheless, the sliding window technique strongly increases the computational effort as the classifier has to verify a large number of object candidates. This paper proposes a sliding window approach which also uses depth information from a stereo camera. This leads to a greatly decreased number of object candidates without significantly reducing the detection accuracy. A theoretical investigation of the conventional sliding window approach is presented first. Other publications to date only mentioned rough estimations of the computational cost. A mathematical derivation clarifies the number of object candidates with respect to parameters such as image and object size. Subsequently, the proposed disparity sliding window approach is presented in detail. The approach is evaluated on pedestrian detection with annotations and images from the KITTI object detection benchmark. Furthermore, a comparison with two state-of-the-art methods is made. Code is available in C++ and Python https://github.com/julimueller/ disparity-sliding-window.

[107]  arXiv:1805.06834 [pdf, other]
Title: Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis
Comments: 13 pages, 6 figures
Subjects: Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Information Theory (cs.IT); Machine Learning (stat.ML)

We present a high-dimensional analysis of three popular algorithms, namely, Oja's method, GROUSE and PETRELS, for subspace estimation from streaming and highly incomplete observations. We show that, with proper time scaling, the time-varying principal angles between the true subspace and its estimates given by the algorithms converge weakly to deterministic processes when the ambient dimension $n$ tends to infinity. Moreover, the limiting processes can be exactly characterized as the unique solutions of certain ordinary differential equations (ODEs). A finite sample bound is also given, showing that the rate of convergence towards such limits is $\mathcal{O}(1/\sqrt{n})$. In addition to providing asymptotically exact predictions of the dynamic performance of the algorithms, our high-dimensional analysis yields several insights, including an asymptotic equivalence between Oja's method and GROUSE, and a precise scaling relationship linking the amount of missing data to the signal-to-noise ratio. By analyzing the solutions of the limiting ODEs, we also establish phase transition phenomena associated with the steady-state performance of these techniques.

[108]  arXiv:1805.06836 [pdf, other]
Title: Deleting edges to restrict the size of an epidemic in temporal networks
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC)

A variety of potentially disease-spreading contact networks can be naturally modeled with graphs whose structure is subject to discrete changes over time, i.e. with temporal graphs. In such a temporal graph, vertices represent meaningful entities (such as animals or farms) and edges represent potentially infectious contacts between those entities. Furthermore the `availability' of an edge $e$ at time $t$ means that, at time $t$, the entities at the endpoints of $e$ are in contact. In this paper, motivated by network epidemiology applications in the dynamics of disease spreading on a data-derived network, we study the problem of deleting edges and/or edge availabilities from a given temporal graph in order to reduce its (temporal) connectivity. In particular, our aim is to find a temporal subgraph, in which the potential disease of any vertex $u$ can be transferred to only a limited number of other vertices $v$ using a temporal path (i.e. a path from $u$ to $v$, along which the times of the edge availabilities increase). We introduce two natural deletion problems for temporal graphs (for deletion of edges and of edge availabilities, respectively) and we provide positive and negative results on their computational complexity, both in the traditional and the parameterized sense, subject to various natural parameters.

[109]  arXiv:1805.06846 [pdf, other]
Title: RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

Explicit encoding of group actions in deep features makes it possible for convolutional neural networks (CNNs) to handle global deformations of images, which is critical to success in many vision tasks. This paper proposes to decompose the convolutional filters over joint steerable bases across the space and the group geometry simultaneously, namely a rotation-equivariant CNN with decomposed convolutional filters (RotDCF). This decomposition facilitates computing the joint convolution, which is proved to be necessary for the group equivariance. It significantly reduces the model size and computational complexity while preserving performance, and truncation of the bases expansion serves implicitly to regularize the filters. On datasets involving in-plane and out-of-plane object rotations, RotDCF deep features demonstrate greater robustness and interpretability than regular CNNs. The stability of the equivariant representation to input variations is also proved theoretically under generic assumptions on the filters in the decomposed form. The RotDCF framework can be extended to groups other than rotations, providing a general approach which achieves both group equivariance and representation stability at a reduced model size.

[110]  arXiv:1805.06861 [pdf, other]
Title: Answer Set Programming Modulo `Space-Time'
Subjects: Artificial Intelligence (cs.AI)

We present ASP Modulo `Space-Time', a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities combine and synergise for applications in a range of AI application areas where the processing and interpretation of spatio-temporal data is crucial. The framework and resulting system is the only general KR-based method for declaratively reasoning about the dynamics of `space-time' regions as first-class objects. We present an empirical evaluation (with scalability and robustness results), and include diverse application examples involving interpretation and control tasks.

[111]  arXiv:1805.06862 [pdf, other]
Title: Design Identification of Curve Patterns on Cultural Heritage Objects: Combining Template Matching and CNN-based Re-Ranking
Comments: 11 pages, 12 figures
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

The surfaces of many cultural heritage objects were embellished with various patterns, especially curve patterns. In practice, most of the unearthed cultural heritage objects are highly fragmented, e.g., sherds of potteries or vessels, and each of them only shows a very small portion of the underlying full design, with noise and deformations. The goal of this paper is to address the challenging problem of automatically identifying the underlying full design of curve patterns from such a sherd. Specifically, we formulate this problem as template matching: curve structure segmented from the sherd is matched to each location with each possible orientation of each known full design. In this paper, we propose a new two-stage matching algorithm, with a different matching cost in each stage. In Stage 1, we use a traditional template matching, which is highly computationally efficient, over the whole search space and identify a small set of candidate matchings. In Stage 2, we derive a new matching cost by training a dual-source Convolutional Neural Network (CNN) and apply it to re-rank the candidate matchings identified in Stage 1. We collect 600 pottery sherds with 98 full designs from the Woodland Period in Southeastern North America for experiments and the performance of the proposed algorithm is very competitive.

[112]  arXiv:1805.06864 [pdf, ps, other]
Title: Resource allocation under uncertainty: an algebraic and qualitative treatment
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT)

We use an algebraic viewpoint, namely a matrix framework to deal with the problem of resource allocation under uncertainty in the context of a qualitative approach. Our basic qualitative data are a plausibility relation over the resources, a hierarchical relation over the agents and of course the preference that the agents have over the resources. With this data we propose a qualitative binary relation $\unrhd$ between allocations such that $\mathcal{F}\unrhd \mathcal{G}$ has the following intended meaning: the allocation $\mathcal{F}$ produces more or equal social welfare than the allocation $\mathcal{G}$. We prove that there is a family of allocations which are maximal with respect to $\unrhd$. We prove also that there is a notion of simple deal such that optimal allocations can be reached by sequences of simple deals. Finally, we introduce some mechanism for discriminating {optimal} allocations.

[113]  arXiv:1805.06865 [pdf, other]
Title: Optimal Scheduling and Exact Response Time Analysis for Multistage Jobs
Subjects: Performance (cs.PF); Optimization and Control (math.OC)

Scheduling to minimize mean response time in an M/G/1 queue is a classic problem. The problem is usually addressed in one of two scenarios. In the perfect-information scenario, the scheduler knows each job's exact size, or service requirement. In the zero-information scenario, the scheduler knows only each job's size distribution. The well-known shortest remaining processing time (SRPT) policy is optimal in the perfect-information scenario, and the more complex Gittins index policy is optimal in the zero-information scenario.
In real systems the scheduler often has partial but incomplete information about each job's size. We introduce a new job model, that of multistage jobs, to capture the partial-information scenario. A multistage job consists of a sequence of stages, where both the sequence of stages and stage sizes are unknown, but the scheduler always knows which stage of a job is in progress.
We give an optimal algorithm for scheduling multistage jobs and an exact response time analysis of our algorithm. As a special case of our analysis, we obtain the first closed-form expression for mean response time under the Gittins index policy in the M/G/1 queue.

[114]  arXiv:1805.06869 [pdf, ps, other]
Title: Revisiting the tree edit distance and its backtracing: A tutorial
Authors: Benjamin Paaßen
Comments: Supplementary material for the ICML 2018 paper: Tree Edit Distance Learning via Adaptive Symbol Embeddings
Subjects: Data Structures and Algorithms (cs.DS)

Almost 30 years ago, Zhang and Shasha published a seminal paper describing an efficient dynamic programming algorithm computing the tree edit distance, that is, the minimum number of node deletions, insertions, and replacements that are necessary to transform one tree into another. Since then, the tree edit distance has had widespread applications, for example in bioinformatics and intelligent tutoring systems. However, the original paper of Zhang and Shasha can be challenging to read for newcomers and it does not describe how to efficiently infer the optimal edit script.
In this contribution, we provide a comprehensive tutorial to the tree edit distance algorithm of Zhang and Shasha. We further prove metric properties of the tree edit distance, and describe efficient algorithms to infer the cheapest edit script, as well as a summary of all cheapest edit scripts between two trees.

[115]  arXiv:1805.06872 [pdf, other]
Title: Coding for Interactive Communication with Small Memory and Applications to Robust Circuits
Comments: 40 pages
Subjects: Data Structures and Algorithms (cs.DS)

Classically, coding theory has been concerned with the problem of transmitting a single message in a format which is robust to noise. Recently, researchers have turned their attention to designing coding schemes to make two-way conversations robust to noise. That is, given an interactive communication protocol $\Pi$, an \emph{interactive coding scheme} converts $\Pi$ into another communication protocol $\Pi'$ such that, even if errors are introduced during the execution of $\Pi'$, the parties are able to determine what the outcome of running $\Pi$ would be in a noise-free setting.
We consider the problem of designing interactive coding schemes which allow the parties to simulate the original protocol using little memory. Specifically, given any communication protocol $\Pi$ we construct robust simulating protocols which tolerate a constant noise rate and require the parties to use only $O(\log d \log s)$ memory, where $d$ is the depth of $\Pi$ and $s$ is a measure of the size of $\Pi$. Prior to this work, all known coding schemes required the parties to use at least $\Omega(d)$ memory, as the parties were required to remember the transcript of the conversation thus far. Moreover, our coding scheme achieves a communication rate of $1-O(\sqrt{\varepsilon})$ over oblivious channels and $1-O(\sqrt{\varepsilon\log\log\tfrac{1}{\varepsilon}})$ over adaptive adversarial channels, matching the conjecturally optimal rates. Lastly, we point to connections between fault-tolerant circuits and coding for interactive communication with small memory.

[116]  arXiv:1805.06875 [pdf, other]
Title: NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning
Comments: CVPR 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Video learning is an important task in computer vision and has experienced increasing interest over the recent years. Since even a small amount of videos easily comprises several million frames, methods that do not rely on a frame-level annotation are of special importance. In this work, we propose a novel learning algorithm with a Viterbi-based loss that allows for online and incremental learning of weakly annotated video data. We moreover show that explicit context and length modeling leads to huge improvements in video segmentation and labeling tasks andinclude these models into our framework. On several action segmentation benchmarks, we obtain an improvement of up to 10% compared to current state-of-the-art methods.

[117]  arXiv:1805.06880 [pdf, other]
Title: It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
Comments: Project page available at this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data. Despite showing considerable success for 2D pose estimation, the application of supervised machine learning to 3D pose estimation in real world images is currently hampered by the lack of varied training images with associated 3D poses. Existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or has been generated synthetically. Instead, we take a different approach, and propose a 3D human pose estimation algorithm that only requires relative estimates of depth at training time. Such training signal, although noisy, can be easily collected from crowd annotators, and is of sufficient quality for enabling successful training and evaluation of 3D pose. Our results are competitive with fully supervised regression based approaches on the Human3.6M dataset, despite using significantly weaker training data. Our proposed approach opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.

[118]  arXiv:1805.06881 [pdf, ps, other]
Title: Changing Observations in Epistemic Temporal Logic
Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

We study dynamic changes of agents' observational power in logics of knowledge and time. We consider CTL*K, the extension of CTL* with knowledge operators, and enrich it with a new operator that models a change in an agent's way of observing the system. We extend the classic semantics of knowledge for perfect-recall agents to account for changes of observation, and we show that this new operator strictly increases the expressivity of CTL*K. We reduce the model-checking problem for our logic to that for CTL*K, which is known to be decidable. This provides a solution to the model-checking problem for our logic, but its complexity is not optimal. Indeed we provide a direct decision procedure with better complexity.

Cross-lists for Fri, 18 May 18

[119]  arXiv:1705.06510 (cross-list from physics.soc-ph) [pdf, other]
Title: Entropic selection of concepts unveils hidden topics in documents corpora
Comments: Main + SI. Major overhaul after first round of review
Subjects: Physics and Society (physics.soc-ph); Computation and Language (cs.CL); Digital Libraries (cs.DL); Social and Information Networks (cs.SI)

The organization and evolution of science has recently become itself an object of scientific quantitative investigation, thanks to the wealth of information that can be extracted from scientific documents, such as citations between papers and co-authorship between researchers. However, only few studies have focused on the concepts that characterize full documents and that can be extracted and analyzed, revealing the deeper organization of scientific knowledge. Unfortunately, several concepts can be so common across documents that they hinder the emergence of the underlying topical structure of the document corpus, because they give rise to a large amount of spurious and trivial relations among documents. To identify and remove common concepts, we introduce a method to gauge their relevance according to an objective information-theoretic measure related to the statistics of their occurrence across the document corpus. After progressively removing concepts that, according to this metric, can be considered as generic, we find that the topic organization displays a correspondingly more refined structure.

[120]  arXiv:1805.06126 (cross-list from q-fin.CP) [pdf, other]
Title: Market Self-Learning of Signals, Impact and Optimal Trading: Invisible Hand Inference with Free Energy
Comments: 56 pages, 3 figures
Subjects: Computational Finance (q-fin.CP); Artificial Intelligence (cs.AI); Learning (cs.LG); Adaptation and Self-Organizing Systems (nlin.AO); Portfolio Management (q-fin.PM)

We present a simple model of a non-equilibrium self-organizing market where asset prices are partially driven by investment decisions of a bounded-rational agent. The agent acts in a stochastic market environment driven by various exogenous "alpha" signals, agent's own actions (via market impact), and noise. Unlike traditional agent-based models, our agent aggregates all traders in the market, rather than being a representative agent. Therefore, it can be identified with a bounded-rational component of the market itself, providing a particular implementation of an Invisible Hand market mechanism. In such setting, market dynamics are modeled as a fictitious self-play of such bounded-rational market-agent in its adversarial stochastic environment. As rewards obtained by such self-playing market agent are not observed from market data, we formulate and solve a simple model of such market dynamics based on a neuroscience-inspired Bounded Rational Information Theoretic Inverse Reinforcement Learning (BRIT-IRL). This results in effective asset price dynamics with a non-linear mean reversion - which in our model is generated dynamically, rather than being postulated. We argue that our model can be used in a similar way to the Black-Litterman model. In particular, it represents, in a simple modeling framework, market views of common predictive signals, market impacts and implied optimal dynamic portfolio allocations, and can be used to assess values of private signals. Moreover, it allows one to quantify a "market-implied" optimal investment strategy, along with a measure of market rationality. Our approach is numerically light, and can be implemented using standard off-the-shelf software such as TensorFlow.

[121]  arXiv:1805.06453 (cross-list from physics.geo-ph) [pdf]
Title: A nonlinear and time-dependent visco-elasto-plastic rheology model for studying shock-physics phenomena
Comments: 20 pages 6 figures
Subjects: Geophysics (physics.geo-ph); Earth and Planetary Astrophysics (astro-ph.EP); Computational Engineering, Finance, and Science (cs.CE)

We present a simple and efficient implementation of a viscous creep rheology based on diffusion creep, dislocation creep and the Peierls mechanism in conjunction with an elasto-plastic rheology model into a shock-physics code, the iSALE open-source impact code. Our approach is based on the calculation of an effective viscosity which is then used as a reference viscosity for any underlying viscoelastic (or even visco-elasto-plastic) model. Here we use a Maxwell-model which best describes stress relaxation and is therefore likely most important for the formation of large meteorite impact basins. While common viscoelastic behavior during mantle convection or other slow geodynamic or geological processes is mostly controlled by diffusion and dislocation creep, we showed that the Peierls mechanism dominates at the large strain rates that typically occur during meteorite impacts. Thus, the resulting visco-elasto-plastic rheology allows implementation of a more realistic mantle behavior in computer simulations, especially for those dealing with large meteorite impacts. The approach shown here opens the way to more faithful simulations of large impact basin formation, especially in elucidating the physics behind the formation of the external fault rings characteristic of large lunar basins.

[122]  arXiv:1805.06576 (cross-list from stat.ML) [pdf, other]
Title: A Spline Theory of Deep Networks (Extended Version)
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input. This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization. Going further, we propose a simple penalty term that can be added to the cost function of any DN learning algorithm to force the templates to be orthogonal with each other; this leads to significantly improved classifi- cation performance and reduced overfitting with no change to the DN architecture. The spline partition of the input signal space that is implicitly induced by a MASO directly links DNs to the theory of vector quantization (VQ) and K-means clustering, which opens up new geometric avenue to study how DNs organize signals in a hierarchical fashion. To validate the utility of the VQ interpretation, we develop and validate a new distance metric for signals and images that quantifies the difference between their VQ encodings. (This paper is a significantly expanded version of a paper with the same title that will appear at ICML 2018.)

[123]  arXiv:1805.06595 (cross-list from stat.ML) [pdf, ps, other]
Title: Covariance-Insured Screening
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

Modern bio-technologies have produced a vast amount of high-throughput data with the number of predictors far greater than the sample size. In order to identify more novel biomarkers and understand biological mechanisms, it is vital to detect signals weakly associated with outcomes among ultrahigh-dimensional predictors. However, existing screening methods, which typically ignore correlation information, are likely to miss these weak signals. By incorporating the inter-feature dependence, we propose a covariance-insured screening methodology to identify predictors that are jointly informative but only marginally weakly associated with outcomes. The validity of the method is examined via extensive simulations and real data studies for selecting potential genetic factors related to the onset of cancer.

[124]  arXiv:1805.06611 (cross-list from eess.SP) [pdf, other]
Title: Antenna Switching Sequence Design for Channel Sounding in a Fast Time-varying Channel
Comments: 6 pages, accepted to IEEE International Conference on Communications (ICC)
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

This paper investigates the impact of array switching patterns on the accuracy of parameter estimation of multipath components for a time division multiplexed (TDM) channel sounder. To measure fast time-varying channels, the conventional uniform array switching pattern poses a fundamental limit of the number of antennas that a TDM channel sounder can utilize. We propose a method, which is based on the simulated annealing algorithm, to find non-uniform array switching patterns for realistic antenna arrays, so that we can extend the Doppler estimation range of the channel sounder by suppressing the high sidelobes in the spatio-temporal ambiguity function. Monte Carlo simulations demonstrate that the optimal switching sequence leads to significantly smaller root mean square errors of both direction of departure and Doppler. Results can be applied in both vehicle-to-vehicle and mobile millimeter wave MIMO channel measurements.

[125]  arXiv:1805.06622 (cross-list from eess.SP) [pdf, other]
Title: Implementation of True Random Number Generator based on Double-Scroll Attractor circuit with GST memristor emulator
Subjects: Signal Processing (eess.SP); Emerging Technologies (cs.ET)

The cryptographic security provided by various techniques of random number generator (RNG) construction is one of the developing researches areas today. Among various types of RNG, the true random bit generator (TRBG) can be considered as the most unpredictable and most secured because its randomness seed is generated from chaotic sources. This paper proposes a design of TRBG model based on double-scroll attractors circuits with GST memristor. After implementation and simulation of the chaotic circuit with GST memristor emulator, the chaotic behavior of the output voltage and inductor current were received. Moreover, their dependence on the input voltage revealed the close to double-scroll form. The randomness generated from the proposed circuit was tested by receiving Fast Fourier Transform (FFT) and Lyapunov exponents of the output voltage.

[126]  arXiv:1805.06626 (cross-list from eess.SP) [pdf, other]
Title: Analysis of Noise in Current Mirrors with memristive Device
Subjects: Signal Processing (eess.SP); Emerging Technologies (cs.ET)

This work presents an analysis of noise in a cascode current mirror with CMOS-memristive device done by comparison with the basic current mirror. The analysis is completed based on THD for different frequency and channel length values by means of computer-aided design. AC and DC analyses are presented for both balanced and unbalanced current mirrors. While the change in the channel length has similar effect in both circuits, memristor in a circuit decreases noise significantly at high frequencies.

[127]  arXiv:1805.06627 (cross-list from stat.ML) [pdf, other]
Title: Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures
Comments: ACL 2018 camera-ready version, 14 pages including appendices
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

Embedding methods which enforce a partial order or lattice structure over the concept space, such as Order Embeddings (OE) (Vendrov et al., 2016), are a natural way to model transitive relational data (e.g. entailment graphs). However, OE learns a deterministic knowledge base, limiting expressiveness of queries and the ability to use uncertainty for both prediction and learning (e.g. learning from expectations). Probabilistic extensions of OE (Lai and Hockenmaier, 2017) have provided the ability to somewhat calibrate these denotational probabilities while retaining the consistency and inductive bias of ordered models, but lack the ability to model the negative correlations found in real-world knowledge. In this work we show that a broad class of models that assign probability measures to OE can never capture negative correlation, which motivates our construction of a novel box lattice and accompanying probability measure to capture anticorrelation and even disjoint concepts, while still providing the benefits of probabilistic modeling, such as the ability to perform rich joint and conditional queries over arbitrary sets of concepts, and both learning from and predicting calibrated uncertainty. We show improvements over previous approaches in modeling the Flickr and WordNet entailment graphs, and investigate the power of the model.

[128]  arXiv:1805.06631 (cross-list from eess.SP) [pdf, other]
Title: Widlar Current Mirror Design Using BJT-Memristor Circuits
Subjects: Signal Processing (eess.SP); Emerging Technologies (cs.ET)

This paper presents a description of basic current mirror (CM), Widlar current mirror, fourth circuit element (memristor) and an analysis of Widlar Configuration with integrated memristor. The analysis has been performed by comparing a modified configuration with a simple circuit of Widlar CM. The focus of analysis were a power dissipation, a Total Harmonic Distortion and a chip-surface. The results has shown that a presence of memristor in the Widlar CM decreases the chip-surface area and the deviation of the signal in the circuit from a fundamental frequency. Although the analysis of power dissipation has also been conducted, there is no definite conclusion about the power losses in the circuit because of the memristor model.

[129]  arXiv:1805.06660 (cross-list from stat.ML) [pdf, other]
Title: Single Shot Active Learning using Pseudo Annotators
Comments: 12 pages, 8 figure, submitted to Pattern Recognition
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

Standard myopic active learning assumes that human annotations are always obtainable whenever new samples are selected. This, however, is unrealistic in many real-world applications where human experts are not readily available at all times. In this paper, we consider the single shot setting: all the required samples should be chosen in a single shot and no human annotation can be exploited during the selection process. We propose a new method, Active Learning through Random Labeling (ALRL), which substitutes single human annotator for multiple, what we will refer to as, pseudo annotators. These pseudo annotators always provide uniform and random labels whenever new unlabeled samples are queried. This random labeling enables standard active learning algorithms to also exhibit the exploratory behavior needed for single shot active learning. The exploratory behavior is further enhanced by selecting the most representative sample via minimizing nearest neighbor distance between unlabeled samples and queried samples. Experiments on real-world datasets demonstrate that the proposed method outperforms several state-of-the-art approaches.

[130]  arXiv:1805.06713 (cross-list from math.CO) [pdf, other]
Title: Bounds for the smallest $k$-chromatic graphs of given girth
Comments: 17 pages; submitted for publication
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

Let $n_g(k)$ denote the smallest order of a $k$-chromatic graph of girth at least $g$. We consider the problem of determining $n_g(k)$ for small values of $k$ and $g$. After giving an overview of what is known about $n_g(k)$, we provide some new lower bounds based on exhaustive searches, and then obtain several new upper bounds using computer algorithms for the construction of witnesses, and for the verification of their correctness. We also present the first examples of reasonably small order for $k = 4$ and $g > 5$. In particular, the new bounds include: $n_4(7) \leq 77$, $26 \leq n_6(4) \leq 66$, $30 \leq n_7(4) \leq 171$.

[131]  arXiv:1805.06729 (cross-list from math.OC) [pdf, ps, other]
Title: Data-Driven Chance Constrained Optimization under Wasserstein Ambiguity Sets
Subjects: Optimization and Control (math.OC); Systems and Control (cs.SY)

In this note, we present a data-driven approach for ambiguous or distributionally robust chance constrained optimization problems. We consider the case where the decision-maker has access to a finite number of samples or realizations of the uncertainty. The chance constraint is then required to hold for all distributions that are close to the empirical distribution constructed from the samples (where the distance between two distributions are defined via the Wasserstein distance). When the feasibility set of the chance constraint program is replaced by its convex inner approximation (following the framework of [Nemirovski and Shapiro 2006]), we present a convex reformulation of the ambiguous chance constraint program under the Wasserstein ambiguity set. We then show that the feasibility set of the original problem and that of its convex inner approximation are identical when the constraint function is concave in the uncertainty parameter. Finally, we present a tractable convex reformulation of the ambiguous chance constraint program when the constraint function is affine in uncertainty.

[132]  arXiv:1805.06753 (cross-list from stat.ML) [pdf, other]
Title: Interpolatron: Interpolation or Extrapolation Schemes to Accelerate Optimization for Deep Neural Networks
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Learning (cs.LG); Optimization and Control (math.OC)

In this paper we explore acceleration techniques for large scale nonconvex optimization problems with special focuses on deep neural networks. The extrapolation scheme is a classical approach for accelerating stochastic gradient descent for convex optimization, but it does not work well for nonconvex optimization typically. Alternatively, we propose an interpolation scheme to accelerate nonconvex optimization and call the method Interpolatron. We explain motivation behind Interpolatron and conduct a thorough empirical analysis. Empirical results on DNNs of great depths (e.g., 98-layer ResNet and 200-layer ResNet) on CIFAR-10 and ImageNet show that Interpolatron can converge much faster than the state-of-the-art methods such as the SGD with momentum and Adam. Furthermore, Anderson's acceleration, in which mixing coefficients are computed by least-squares estimation, can also be used to improve the performance. Both Interpolatron and Anderson's acceleration are easy to implement and tune. We also show that Interpolatron has linear convergence rate under certain regularity assumptions.

[133]  arXiv:1805.06768 (cross-list from quant-ph) [pdf, ps, other]
Title: Quantum-enhanced Logic-based Blochchain I: Quantum Honest-success Byzantine Agreement and Qulogicoin
Comments: 15 pages
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR)

We proposed a framework of quantum-enhanced logic-based blockchain, which improves the efficiency and power of quantum-secured blockchain. The efficiency is improved by using a new quantum honest-success Byzantine agreement protocol to replace the classical Byzantine agreement protocol, while the power is improved by incorporating quantum protection and quantum certificate into the syntax of transactions. Our quantum-secured logic-based blockchain can already be implemented by the current technology. The cryptocurrency created and transferred in our blockchain is called qulogicoin. Incorporating quantum protection and quantum certificates into blockchain makes it possible to use blockchain to overcome the limitations of some quantum cryptographic protocols. As an illustration, we show that a significant shortcoming of cheat-sensitive quantum bit commitment protocols can be overcome with the help of our blockchain and qulogicoin.

[134]  arXiv:1805.06826 (cross-list from stat.ML) [pdf, other]
Title: The Blessings of Multiple Causes
Comments: 54 pages
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Methodology (stat.ME)

Causal inference from observation data often assumes "strong ignorability," that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for when the deconfounder leads to unbiased causal estimates, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genomewide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating close-to-truth causal effects.

[135]  arXiv:1805.06837 (cross-list from stat.ML) [pdf, other]
Title: A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

We propose a new method of estimation in topic models, that is not a variation on the existing simplex finding algorithms, and that estimates the number of topics K from the observed data. We derive new finite sample minimax lower bounds for the estimation of A, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any number of documents (n), individual document length (N_i), dictionary size (p) and number of topics (K), and both p and K are allowed to increase with n, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, although we start out with a computational and theoretical disadvantage of not knowing the correct number of topics K, while we provide the competing methods with the correct value in our simulations.

[136]  arXiv:1805.06879 (cross-list from cs.CL) [pdf, other]
Title: Neural language representations predict outcomes of scientific research
Comments: 8 pages, 3 figures, plus supporting material
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Learning (cs.LG); Machine Learning (stat.ML)

Many research fields codify their findings in standard formats, often by reporting correlations between quantities of interest. But the space of all testable correlates is far larger than scientific resources can currently address, so the ability to accurately predict correlations would be useful to plan research and allocate resources. Using a dataset of approximately 170,000 correlational findings extracted from leading social science journals, we show that a trained neural network can accurately predict the reported correlations using only the text descriptions of the correlates. Accurate predictive models such as these can guide scientists towards promising untested correlates, better quantify the information gained from new findings, and has implications for moving artificial intelligence systems from predicting structures to predicting relationships in the real world.

Replacements for Fri, 18 May 18

[137]  arXiv:1211.4892 (replaced) [pdf, ps, other]
Title: Confusion of Tagged Perturbations in Forward Automatic Differentiation of Higher-Order Functions
Subjects: Symbolic Computation (cs.SC); Mathematical Software (cs.MS); Differential Geometry (math.DG)
[138]  arXiv:1504.06043 (replaced) [pdf, ps, other]
Title: Stability of Stochastic Approximations with `Controlled Markov' Noise and Temporal Difference Learning
Comments: 18 pages
Subjects: Systems and Control (cs.SY); Machine Learning (stat.ML)
[139]  arXiv:1603.01793 (replaced) [pdf, ps, other]
Title: A New Numerical Method for Solving the Acoustic Radiation Problem
Journal-ref: Poblet-Puig J., Shanin A.V., A new numerical method for solving the acoustic radiation problem. Acoustical Physics. Vol. 64, no. 2. PP. 252-259 (2018)
Subjects: Numerical Analysis (cs.NA); Mathematical Software (cs.MS)
[140]  arXiv:1604.01091 (replaced) [pdf, other]
Title: Efficient Reallocation under Additive and Responsive Preferences
Subjects: Computer Science and Game Theory (cs.GT)
[141]  arXiv:1609.08646 (replaced) [pdf, ps, other]
Title: Squared chromatic number without claws or large cliques
Comments: 13 pages; v2 corrects for a subtlety in the original derivation of Thm 1.2; v3 accepted to Canadian Mathematical Bulletin
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)
[142]  arXiv:1610.02962 (replaced) [pdf, other]
Title: Low-Rank Dynamic Mode Decomposition: Optimal Solution in Polynomial-Time
Subjects: Machine Learning (stat.ML); Numerical Analysis (cs.NA)
[143]  arXiv:1610.03883 (replaced) [pdf, ps, other]
Title: A method for obtaining Fibonacci identities
Journal-ref: Integers, Vol. 18 (2018), #A42
Subjects: Number Theory (math.NT); Information Theory (cs.IT); Numerical Analysis (math.NA); Rings and Algebras (math.RA)
[144]  arXiv:1611.04419 (replaced) [pdf, other]
Title: Buffering Time Strategies for Wireless Full-duplex Systems under Poisson Traffic
Comments: 30 pages, 21 figures
Subjects: Networking and Internet Architecture (cs.NI)
[145]  arXiv:1612.09089 (replaced) [pdf, other]
Title: What Makes Audio Event Detection Harder than Classification?
Comments: Published version available at this https URL
Journal-ref: Published in Proceedings of the 25th European Signal Processing Conference (EUSIPCO), pp. 2739-2743, 2017
Subjects: Sound (cs.SD)
[146]  arXiv:1612.09514 (replaced) [pdf, other]
Title: A Ghost at $ω_1$
Authors: Paul Blain Levy
Comments: 27 pages
Subjects: Logic in Computer Science (cs.LO)
[147]  arXiv:1704.02942 (replaced) [pdf, ps, other]
Title: The Boolean SATisfiability Problem in Clifford algebra
Authors: Marco Budinich
Comments: 16 pages, better formalization and simpler proofs
Subjects: Mathematical Physics (math-ph); Computational Complexity (cs.CC)
[148]  arXiv:1704.06962 (replaced) [pdf, ps, other]
Title: Coherent multiple-antenna block-fading channels at finite blocklength
Subjects: Information Theory (cs.IT)
[149]  arXiv:1704.07978 (replaced) [pdf, ps, other]
Title: On Improving Deep Reinforcement Learning for POMDPs
Comments: 7 pages, 6 figures, 3 tables
Subjects: Learning (cs.LG)
[150]  arXiv:1704.07998 (replaced) [pdf, ps, other]
Title: A Strategy for Dynamic Programs: Start over and Muddle through
Subjects: Logic in Computer Science (cs.LO)
[151]  arXiv:1706.00699 (replaced) [pdf, other]
Title: Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints
Comments: CVPR 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152]  arXiv:1706.05147 (replaced) [pdf, other]
Title: Spectral Domain Sampling of Graph Signals
Authors: Yuichi Tanaka
Comments: accepted to IEEE Transactions on Signal Processing
Subjects: Information Theory (cs.IT)
[153]  arXiv:1706.07853 (replaced) [pdf, ps, other]
Title: Loom: Exploiting Weight and Activation Precisions to Accelerate Convolutional Neural Networks
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Learning (cs.LG)
[154]  arXiv:1708.01012 (replaced) [pdf, other]
Title: On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization
Subjects: Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (stat.ML)
[155]  arXiv:1708.01856 (replaced) [pdf]
Title: Exploring the context of visual information seeking
Comments: 18 pages, 0 figures
Subjects: Information Retrieval (cs.IR)
[156]  arXiv:1708.02675 (replaced) [pdf, ps, other]
Title: Inheritance of Convexity for the $\mathcal{P}_{\min}$-Restricted Game
Authors: Alexandre Skoda
Subjects: Discrete Mathematics (cs.DM)
[157]  arXiv:1708.04632 (replaced) [pdf, other]
Title: On Almost Well-Covered Graphs of Girth at Least 6
Subjects: Discrete Mathematics (cs.DM); Combinatorics (math.CO)
[158]  arXiv:1708.07120 (replaced) [pdf, other]
Title: Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
Comments: This paper was significantly revised to show super-convergence as a general fast training methodology
Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[159]  arXiv:1709.07359 (replaced) [pdf, other]
Title: Class-Splitting Generative Adversarial Networks
Comments: Under consideration at Pattern Recognition Letters
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
[160]  arXiv:1710.00458 (replaced) [pdf, other]
Title: ObliDB: Oblivious Query Processing using Hardware Enclaves
Subjects: Cryptography and Security (cs.CR)
[161]  arXiv:1710.02823 (replaced) [pdf, other]
Title: Structural Feature Selection for Event Logs
Comments: Extended version of a paper published in the proceedings of the BPM 2017 workshops
Subjects: Learning (cs.LG); Databases (cs.DB); Software Engineering (cs.SE); Machine Learning (stat.ML)
[162]  arXiv:1711.02078 (replaced) [pdf, other]
Title: Fully-Dynamic Bin Packing with Limited Repacking
Comments: To appear in ICALP 2018. Improved worst-case recourse for unit costs added (Theorem 2.7)
Subjects: Data Structures and Algorithms (cs.DS)
[163]  arXiv:1711.03896 (replaced) [pdf, other]
Title: Observability in Inertial Parameter Identification
Comments: Journal draft
Subjects: Robotics (cs.RO)
[164]  arXiv:1711.04121 (replaced) [pdf, other]
Title: Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[165]  arXiv:1711.07063 (replaced) [pdf, other]
Title: Trajectory-Optimized Sensing for Active Search of Tissue Abnormalities in Robotic Surgery
Comments: 8 pages, ICRA 2018
Subjects: Robotics (cs.RO)
[166]  arXiv:1711.11469 (replaced) [pdf, ps, other]
Title: Sum of squares lower bounds from symmetry and a good story
Authors: Aaron Potechin
Subjects: Computational Complexity (cs.CC)
[167]  arXiv:1712.00938 (replaced) [pdf, ps, other]
Title: Algebraic Soft Decoding of Reed-Solomon Codes Using Module Minimization
Comments: 30 pages, 4 figures
Subjects: Information Theory (cs.IT)
[168]  arXiv:1712.00960 (replaced) [pdf, other]
Title: FSSD: Feature Fusion Single Shot Multibox Detector
Comments: add project code
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:1712.02555 (replaced) [pdf, other]
Title: Hungarian Layer: Logics Empowered Neural Architecture
Comments: This is the draft submitting to ICML 2018. You could expect the final version, which is more perfect
Subjects: Computation and Language (cs.CL)
[170]  arXiv:1712.09665 (replaced) [pdf, other]
Title: Adversarial Patch
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171]  arXiv:1801.02531 (replaced) [pdf, other]
Title: A Scale-out Blockchain for Value Transfer with Spontaneous Sharding
Comments: Accepted by Crypto Valley Conference for Blockchain Technology 2018
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[172]  arXiv:1801.05127 (replaced) [pdf, other]
Title: Round- and Message-Optimal Distributed Graph Algorithms
Comments: To appear in PODC 2018
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
[173]  arXiv:1801.06146 (replaced) [pdf, other]
Title: Universal Language Model Fine-tuning for Text Classification
Comments: ACL 2018, fixed error in related work where order of 'general' and 'task-specific' was reversed
Subjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)
[174]  arXiv:1801.07875 (replaced) [pdf, other]
Title: Support Vector Machine Active Learning Algorithms with Query-by-Committee versus Closest-to-Hyperplane Selection
Comments: 8 pages, 7 figures, 3 tables; published in Proceedings of the IEEE 12th International Conference on Semantic Computing (ICSC 2018), Laguna Hills, CA, USA, pages 148-155, January 2018
Journal-ref: In Proceedings of the 2018 IEEE 12th International Conference on Semantic Computing (ICSC), pages 148-155, Laguna Hills, CA, USA, January 2018. IEEE
Subjects: Learning (cs.LG); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (stat.ML)
[175]  arXiv:1801.07887 (replaced) [pdf, other]
Title: Impact of Batch Size on Stopping Active Learning for Text Classification
Comments: 2 pages, 1 table; published in Proceedings of the IEEE 12th International Conference on Semantic Computing (ICSC 2018), Laguna Hills, CA, USA, pages 306-307, January 2018
Journal-ref: In Proceedings of the 2018 IEEE 12th International Conference on Semantic Computing (ICSC), pages 306-307, Laguna Hills, CA, USA, January 2018. IEEE
Subjects: Learning (cs.LG); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (stat.ML)
[176]  arXiv:1801.08099 (replaced) [pdf, other]
Title: Logically-Constrained Reinforcement Learning
Subjects: Learning (cs.LG); Logic in Computer Science (cs.LO)
[177]  arXiv:1801.08590 (replaced) [pdf, other]
Title: Individual testing is optimal for nonadaptive group testing in the linear regime
Authors: Matthew Aldridge
Comments: 4 pages, 1 figure
Subjects: Information Theory (cs.IT); Combinatorics (math.CO); Probability (math.PR)
[178]  arXiv:1801.08694 (replaced) [pdf, other]
Title: PDNet: Semantic Segmentation integrated with a Primal-Dual Network for Document binarization
Comments: Under consideration for Pattern Recognition Letters Special Issue on Graphonomics for e-citizens: e-health, e-society, e-education 11 pages, 10 figures, 2 tables
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
[179]  arXiv:1801.09240 (replaced) [pdf, other]
Title: Time Constrained Continuous Subgraph Search over Streaming Graphs
Subjects: Databases (cs.DB)
[180]  arXiv:1802.03248 (replaced) [pdf, other]
Title: Piecewise Flat Embedding for Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181]  arXiv:1802.03752 (replaced) [pdf, other]
Title: Supervised classification of Dermatological diseases by Deep learning
Comments: Submitted to CBMI 2018, 4 pages and 1 reference page
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
[182]  arXiv:1802.04085 (replaced) [pdf, ps, other]
Title: Empirical Risk Minimization in Non-interactive Local Differential Privacy: Efficiency and High Dimensional Case
Comments: Add a new section on high dimensional case
Subjects: Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
[183]  arXiv:1802.06627 (replaced) [pdf, other]
Title: Robustness of Rotation-Equivariant Networks to Adversarial Perturbations
Comments: 4 pages + references; public implementation of Spatially Transformed Adversarial Examples can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Learning (cs.LG)
[184]  arXiv:1802.07344 (replaced) [pdf, other]
Title: Coconut: Threshold Issuance Selective Disclosure Credentials with Applications to Distributed Ledgers
Subjects: Cryptography and Security (cs.CR)
[185]  arXiv:1802.08288 (replaced) [pdf, other]
Title: Privacy-Preserving Boosting with Random Linear Classifiers for Learning from User-Generated Data
Subjects: Cryptography and Security (cs.CR)
[186]  arXiv:1802.09808 (replaced) [pdf, other]
Title: #DebateNight: The Role and Influence of Socialbots on Twitter During the 1st 2016 U.S. Presidential Debate
Journal-ref: 12th International AAAI Conference on Web and Social Media (ICWSM 2018)
Subjects: Social and Information Networks (cs.SI)
[187]  arXiv:1802.10089 (replaced) [pdf, other]
Title: Friction Variability in Planar Pushing Data: Anisotropic Friction and Data-collection Bias
Comments: 8 pages, 13 figures
Subjects: Robotics (cs.RO); Data Analysis, Statistics and Probability (physics.data-an)
[188]  arXiv:1802.10199 (replaced) [pdf, ps, other]
Title: Local Distributed Algorithms in Highly Dynamic Networks
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
[189]  arXiv:1803.01364 (replaced) [pdf, ps, other]
Title: SAFE: Spectral Evolution Analysis Feature Extraction for Non-Stationary Time Series Prediction
Comments: submitted to IEEE Trans Cybernetics
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI)
[190]  arXiv:1803.01365 (replaced) [pdf, ps, other]
Title: New Results on Multi-Step Traffic Flow Prediction
Comments: submitted to IEEE Trans on ITS
Subjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
[191]  arXiv:1803.05948 (replaced) [pdf, other]
Title: Average Cost of QuickXsort with Pivot Sampling
Authors: Sebastian Wild
Comments: updated to final version accepted for AofA 2018
Subjects: Data Structures and Algorithms (cs.DS)
[192]  arXiv:1803.10150 (replaced) [pdf, other]
Title: Learning to Branch
Subjects: Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS)
[193]  arXiv:1804.01413 (replaced) [pdf, other]
Title: On the Calculation of Fundamental Groups in Homotopy Type Theory by Means of Computational Paths
Comments: 30 pages, 9 figures, 2 appendix. arXiv admin note: substantial text overlap with arXiv:1803.01709, arXiv:1609.05079
Subjects: Logic in Computer Science (cs.LO)
[194]  arXiv:1804.01661 (replaced) [pdf, other]
Title: Learning Strict Identity Mappings in Deep Residual Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195]  arXiv:1804.03195 (replaced) [pdf, ps, other]
Title: Contextual Search via Intrinsic Volumes
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Metric Geometry (math.MG)
[196]  arXiv:1804.03997 (replaced) [pdf, other]
Title: Security Properties of Gait for Mobile Device Pairing
Subjects: Cryptography and Security (cs.CR); Human-Computer Interaction (cs.HC)
[197]  arXiv:1804.04918 (replaced) [pdf, other]
Title: Distributed Collaborative Hashing and Its Applications in Ant Financial
Comments: 10 pages
Subjects: Learning (cs.LG); Information Retrieval (cs.IR); Machine Learning (stat.ML)
[198]  arXiv:1804.06334 (replaced) [pdf, other]
Title: On $f$-Divergences: Integral Representations, Local Behavior, and Inequalities
Authors: Igal Sason
Comments: Final edits before publication. To appear in the Entropy journal, special issue on Entropy and Information Inequalities, May 2018
Subjects: Information Theory (cs.IT); Probability (math.PR)
[199]  arXiv:1804.06732 (replaced) [pdf, other]
Title: DPRed: Making Typical Activation Values Matter In Deep Learning Computing
Comments: This paper was originally submitted to HPCA-24 in August 2017 and subsequently revised to include the motivation section. It was also submitted to ISCA-18 in November 2017. The current version has been updated to include an external bandwidth analysis. arXiv admin note: text overlap with arXiv:1707.09068
Subjects: Neural and Evolutionary Computing (cs.NE)
[200]  arXiv:1804.07221 (replaced) [pdf, other]
Title: A Decomposition-based Approach towards the Control of Boolean Networks (Technical Report)
Comments: 18 pages
Subjects: Systems and Control (cs.SY)
[201]  arXiv:1804.07459 (replaced) [pdf, other]
Title: A Complementary Tracking Model with Multiple Features
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[202]  arXiv:1804.10332 (replaced) [pdf, other]
Title: Sim-to-Real: Learning Agile Locomotion For Quadruped Robots
Comments: Accompanying video: this https URL
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
[203]  arXiv:1805.02405 (replaced) [pdf, ps, other]
Title: A Combinatorial Game and an Efficiently Computable Shift Rule for the Prefer Max De Bruijn Sequence
Subjects: Discrete Mathematics (cs.DM); Combinatorics (math.CO)
[204]  arXiv:1805.02587 (replaced) [pdf, other]
Title: Complete Analysis of a Random Forest Model
Subjects: Machine Learning (stat.ML); Learning (cs.LG)
[205]  arXiv:1805.02785 (replaced) [pdf, other]
Title: Fast Online Exact Solutions for Deterministic MDPs with Sparse Rewards
Comments: Submitted to NIPS 2018; preprint version posted here. 8 pages content, appendices include pseudocode and proof for algorithm
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[206]  arXiv:1805.05010 (replaced) [pdf, other]
Title: Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing
Comments: Sumitted to NIPS 2018
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
[207]  arXiv:1805.05395 (replaced) [pdf, other]
Title: Distributed Circumnavigation Control with Dynamic Spacings for a Heterogeneous Multi-robot System
Comments: Conditional accepted by 2018 RoboCup Symposium
Subjects: Robotics (cs.RO)
[208]  arXiv:1805.05631 (replaced) [pdf, other]
Title: Complexity Reduction in the Negotiation of New Lexical Conventions
Comments: Published at CogSci 2018 conference
Subjects: Multiagent Systems (cs.MA); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
[209]  arXiv:1805.05999 (replaced) [pdf, other]
Title: Agent Based Rumor Spreading in a scale-free network
Subjects: Multiagent Systems (cs.MA); Social and Information Networks (cs.SI)
[210]  arXiv:1805.06070 (replaced) [pdf, ps, other]
Title: A Survey of Intrusion Detection Systems Leveraging Host Data
Subjects: Cryptography and Security (cs.CR)
[211]  arXiv:1805.06157 (replaced) [pdf, other]
Title: Zero-Shot Object Detection by Hybrid Region Embedding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212]  arXiv:1805.06297 (replaced) [pdf, other]
Title: A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings
Comments: ACL 2018
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Learning (cs.LG)
[213]  arXiv:1805.06315 (replaced) [pdf, other]
Title: Short Schedules for Fast Flow Rerouting
Comments: arXiv admin note: text overlap with arXiv:1611.09296
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Networking and Internet Architecture (cs.NI)
[214]  arXiv:1805.06334 (replaced) [pdf, other]
Title: Auxiliary Tasks in Multi-task Learning
Comments: fixed minor typesetting issue
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:1805.06368 (replaced) [pdf, other]
Title: Strict Very Fast Decision Tree: a memory conservative algorithm for data stream mining
Comments: 7 pages, 26 figures, Under R1 revision in Pattern Recognition Letters
Subjects: Artificial Intelligence (cs.AI)
[ total of 215 entries: 1-215 ]
[ showing up to 2000 entries per page: fewer | more ]

Disable MathJax (What is MathJax?)