Computer Science
New submissions
[ showing up to 2000 entries per page: fewer  more ]
New submissions for Fri, 22 Sep 17
 [1] arXiv:1709.06994 [pdf, other]

Title: Structured Probabilistic Pruning for Deep Convolutional Neural Network AccelerationComments: 7 pages, CNN model compression and accelerationSubjects: Learning (cs.LG); Machine Learning (stat.ML)
Although deep Convolutional Neural Network (CNN) has shown better performance in various machine learning tasks, its application is accompanied by a significant increase in storage and computation. Among CNN simplification techniques, parameter pruning is a promising approach which aims at reducing the number of weights of various layers without intensively reducing the original accuracy. In this paper, we propose a novel progressive parameter pruning method, named Structured Probabilistic Pruning (SPP), which efficiently prunes weights of convolutional layers in a probabilistic manner. Unlike existing deterministic pruning approaches, in which the pruned weights of a welltrained model are permanently eliminated, SPP utilizes the relative importance of weights during training iterations, which makes the pruning procedure more accurate by leveraging the accumulated weight importance. Specifically, we introduce an effective weight competition mechanism to emphasize the important weights and gradually undermine the unimportant ones. Experiments indicate that our proposed method has obtained superior performance on ConvNet and AlexNet compared with existing pruning methods. Our pruned AlexNet achieves 4.0 $\sim$ 8.9x (averagely 5.8x) layerwise speedup in convolutional layers with only 1.3\% top5 error increase on the ImageNet2012 validation dataset. We also prove the effectiveness of our method on transfer learning scenarios using AlexNet.
 [2] arXiv:1709.06995 [pdf, other]

Title: Symmetric Randomized Dependent RoundingComments: arXiv admin note: substantial text overlap with arXiv:1704.06528Subjects: Data Structures and Algorithms (cs.DS)
Dependent rounding is a popular technique in designing approximation algorithms. It allows us to randomly round a fractional solution $x$ to an integral vector $X$ such that $E[X] = x$, all $X_i$'s are ("cylindrically") negatively correlated, and the cardinality constraint $\sum_i X_i = \sum_i x_i$ holds with probability one. One can also essentially replace the cardinality constraint by a general linear constraint $\sum_i a_i X_i = \sum_i a_i x_i$ if the $a_i$'s are nonnegative. However, in certain problems, the coefficients $a_i$ may have mixed signs; furthermore, some applications require correlation inequalities that are much more general than negative correlation.
In this paper, we propose a generalized dependent rounding technique, called symmetric randomized dependent rounding, which can round $x$ to an almostinteger vector so that any given linear constraint is satisfied with probability one and such that all the variables are nearly independent. We give two illustrative examples of this technique: a randomized algorithm for the knapsack center problem with new average approximation guarantees and an improved bicriteria approximation ratio for the knapsack median problem.  [3] arXiv:1709.07020 [pdf, other]

Title: The arXiv of the future will not look like the arXivComments: The authors of this document welcome public comments and ideas from its readers, at the online version of this article (this https URL)Subjects: Digital Libraries (cs.DL); Solar and Stellar Astrophysics (astroph.SR)
The arXiv is the most popular preprint repository in the world. Since its inception in 1991, the arXiv has allowed researchers to freely share publicationready articles prior to formal peer review. The growth and the popularity of the arXiv emerged as a result of new technologies that made document creation and dissemination easy, and cultural practices where collaboration and data sharing were dominant. The arXiv represents a unique place in the history of research communication and the Web itself, however it has arguably changed very little since its creation. Here we look at the strengths and weaknesses of arXiv in an effort to identify what possible improvements can be made based on new technologies not previously available. Based on this, we argue that a modern arXiv might in fact not look at all like the arXiv of today.
 [4] arXiv:1709.07032 [pdf, other]

Title: DataDriven Model Predictive Control of Autonomous MobilityonDemand SystemsComments: Submitted to the International Conference on Robotics and Automation 2018Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA); Systems and Control (cs.SY); Applications (stat.AP)
The goal of this paper is to present an endtoend, datadriven framework to control Autonomous MobilityonDemand systems (AMoD, i.e. fleets of selfdriving vehicles). We first model the AMoD system using a timeexpanded network, and present a formulation that computes the optimal rebalancing strategy (i.e., preemptive repositioning) and the minimum feasible fleet size for a given travel demand. Then, we adapt this formulation to devise a Model Predictive Control (MPC) algorithm that leverages shortterm demand forecasts based on historical data to compute rebalancing strategies. We test the endtoend performance of this controller with a stateoftheart LSTM neural network to predict customer demand and real customer data from DiDi Chuxing: we show that this approach scales very well for large systems (indeed, the computational complexity of the MPC algorithm does not depend on the number of customers and of vehicles in the system) and outperforms stateoftheart rebalancing strategies by reducing the mean customer wait time by up to to 89.6%.
 [5] arXiv:1709.07033 [pdf, other]

Title: Learning Human Behaviors for RobotAssisted DressingComments: 8 pages, 9 figuresSubjects: Robotics (cs.RO)
We investigate robotic assistants for dressing that can anticipate the motion of the person who is being helped. To this end, we use reinforcement learning to create models of human behavior during assistance with dressing. To explore this kind of interaction, we assume that the robot presents an open sleeve of a hospital gown to a person, and that the person moves their arm into the sleeve. The controller that models the person's behavior is given the position of the end of the sleeve and information about contact between the person's hand and the fabric of the gown. We simulate this system with a human torso model that has realistic joint ranges, a simple robot gripper, and a physicsbased cloth model for the gown. Through reinforcement learning (specifically the TRPO algorithm) the system creates a model of human behavior that is capable of placing the arm into the sleeve. We aim to model what humans are capable of doing, rather than what they typically do. We demonstrate successfully trained human behaviors for three robotassisted dressing strategies: 1) the robot gripper holds the sleeve motionless, 2) the gripper moves the sleeve linearly towards the person from the front, and 3) the gripper moves the sleeve linearly from the side.
 [6] arXiv:1709.07035 [pdf, other]

Title: Radio Irregularity Model in OMNeT++Comments: Published in: A. Foerster, A. Udugama, A. Koensgen, A. Virdis, M. Kirsche (Eds.), Proc. of the 4th OMNeT++ Community Summit, University of Bremen  Germany  September 78, 2017Subjects: Networking and Internet Architecture (cs.NI); Performance (cs.PF)
Radio irregularity is a nonnegligible phenomenon that has an impact on protocol performances. For instance, irregularity in radio range leads to asymmetric links that cause the loss of packets in different directions. In order to investigate its effect, the Radio Irregularity Model (RIM) is proposed that takes into account the irregularity of a radio range and estimates path losses in an anisotropic environment. The purpose of this paper is to provide details of the RIM model developed in the INET Framework of the OMNeT++ simulator that can be used to investigate the impact of radio irregularity on protocol performance.
 [7] arXiv:1709.07037 [pdf, other]

Title: Enabling Community Health Care with MicroservicesComments: 7 pages, The 16th IEEE International Conference on Ubiquitous Computing and Communications (IUCC 2017) Guangzhou, China, December 1215, 2017Subjects: Software Engineering (cs.SE)
Microservice architectures (MA) are composed of loosely coupled, coursegrained services that emphasise resilience and autonomy, enabling more scalable applications to be developed. Such architectures are more tolerant of changing demands from users and enterprises, in response to emerging technologies and their associated influences upon human interaction and behaviour. This article looks at microservices in the Internet of Things (IoT) through the lens of agency, and using an example in the community health care domain explores how a complex application scenario (both in terms of software and hardware interactions) might be modelled.
 [8] arXiv:1709.07051 [pdf, other]

Title: Distributed Camouflage for Swarm Robotics and Smart MaterialsSubjects: Robotics (cs.RO)
We present a distributed algorithm for a swarm of active particles to camouflage in an environment. Each particle is equipped with sensing, computation and communication, allowing the system to take color and gradient information from the environment and selforganize into an appropriate pattern. Current artificial camouflage systems are either limited to static patterns, which are adapted for specific environments, or rely on backprojection, which depend on the viewer's point of view. Inspired by the camouflage abilities of the cuttlefish, we propose a distributed estimation and pattern formation algorithm that allows to quickly adapt to different environments. We present convergence results both in simulation as well as on a swarm of miniature robots "Droplets" for a variety of patterns.
 [9] arXiv:1709.07063 [pdf, other]

Title: An Algebraic Glimpse at Bunched Implications and Separation LogicSubjects: Logic in Computer Science (cs.LO)
We overview the logic of Bunched Implications (BI) and Separation Logic (SL) from a perspective inspired by Hiroakira Ono's algebraic approach to substructural logics. We propose generalized BI algebras (GBIalgebras) as a common framework for algebras arising via "declarative resource reading", intuitionistic generalizations of relation algebras and arrow logics and the distributive Lambek calculus with intuitionistic implication. Apart from existing models of BI (in particular, heap models and effect algebras), we also cover models arising from weakening relations, formal languages or more finegrained treatment of labelled trees and semistructured data. After briefly discussing the lattice of subvarieties of GBI, we present a suitable duality for GBI along the lines of Esakia and Priestley and an algebraic proof of cut elimination in the setting of residuated frames of Galatos and Jipsen. We also show how the algebraic approach allows generic results on decidability, both positive and negative ones. In the final part of the paper, we gently introduce the substructural audience to some theory behind stateofart tools, culminating with an algebraic and prooftheoretic presentation of (bi)abduction.
 [10] arXiv:1709.07065 [pdf, other]

Title: Multicamera MultiObject TrackingSubjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we propose a pipeline for multitarget visual tracking under multicamera system. For multicamera system tracking problem, efficient data association across cameras, and at the same time, across frames becomes more important than singlecamera system tracking. However, most of the multicamera tracking algorithms emphasis on single camera across frame data association. Thus in our work, we model our tracking problem as a global graph, and adopt Generalized Maximum Multi Clique optimization problem as our core algorithm to take both across frame and across camera data correlation into account all together. Furthermore, in order to compute good similarity scores as the input of our graph model, we extract both appearance and dynamic motion similarities. For appearance feature, Local Maximal Occurrence Representation(LOMO) feature extraction algorithm for ReID is conducted. When it comes to capturing the dynamic information, we build Hankel matrix for each tracklet of target and apply rank estimation with Iterative Hankel Total Least Squares(IHTLS) algorithm to it. We evaluate our tracker on the challenging Terrace Sequences from EPFL CVLAB as well as recently published Duke MTMC dataset.
 [11] arXiv:1709.07068 [pdf, other]

Title: Survey on SemiExplicit Time Integration of Eddy Current ProblemsComments: accepted by the SCEE 2016 proceedings to be published in the Springer Series "Mathematics in Industry"Subjects: Computational Engineering, Finance, and Science (cs.CE); Numerical Analysis (math.NA)
The spatial discretization of the magnetic vector potential formulation of magnetoquasistatic field problems results in an infinitely stiff differentialalgebraic equation system. It is transformed into a finitely stiff ordinary differential equation system by applying a generalized Schur complement. Applying the explicit Euler time integration scheme to this system results in a small maximum stable time step size. Fast computations are required in every time step to yield an acceptable overall simulation time. Several acceleration methods are presented.
 [12] arXiv:1709.07077 [pdf, other]

Title: Estimated Depth Map Helps Image ClassificationAuthors: Yihui HeSubjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
We consider image classification with estimated depth. This problem falls into the domain of transfer learning, since we are using a model trained on a set of depth images to generate depth maps (additional features) for use in another classification problem using another disjoint set of images. It's challenging as no direct depth information is provided. Though depth estimation has been well studied, none have attempted to aid image classification with estimated depth. Therefore, we present a way of transferring domain knowledge on depth estimation to a separate image classification task over a disjoint set of train, and test data. We build a RGBD dataset based on RGB dataset and do image classification on it. Then evaluation the performance of neural networks on the RGBD dataset compared to the RGB dataset. From our experiments, the benefit is significant with shallow and deep networks. It improves ResNet20 by 0.55% and ResNet56 by 0.53%. Our code and dataset are available publicly.
 [13] arXiv:1709.07078 [pdf, other]

Title: MaxMin Fair MillimetreWave BackhaulingSubjects: Networking and Internet Architecture (cs.NI)
5G mobile networks are expected to provide pervasive high speed wireless connectivity, to support increasingly resource intensive user applications. Network hyperdensification therefore becomes necessary, though connecting to the Internet tens of thousands of base stations is nontrivial, especially in urban scenarios where optical fibre is difficult and costly to deploy. The millimetre wave (mmwave) spectrum is a promising candidate for inexpensive multiGbps wireless backhauling, but exploiting this band for effective multihop data communications is challenging. In particular, resource allocation and scheduling of very narrow transmission/ reception beams requires to overcome terminal deafness and link blockage problems, while managing fairness issues that arise when flows encounter dissimilar competition and traverse different numbers of links with heterogeneous quality. In this paper, we propose WiHaul, an airtime allocation and scheduling mechanism that overcomes these challenges specific to multihop mmwave networks, guarantees maxmin fairness among traffic flows, and ensures the overall available backhaul resources are fully utilised. We evaluate the proposed WiHaul scheme over a broad range of practical network conditions, and demonstrate up to 5 times individual throughput gains and a fivefold improvement in terms of measurable fairness, over recent mmwave scheduling solutions.
 [14] arXiv:1709.07080 [pdf, other]

Title: A DeepReinforcement Learning Approach for SoftwareDefined Networking Routing OptimizationSubjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI)
In this paper we design and evaluate a DeepReinforcement Learning agent that optimizes routing. Our agent adapts automatically to current traffic conditions and proposes tailored configurations that attempt to minimize the network delay. Experiments show very promising performance. Moreover, this approach provides important operational advantages with respect to traditional optimization algorithms.
 [15] arXiv:1709.07089 [pdf, other]

Title: On the Design of LQR Kernels for Efficient Controller LearningComments: 8 pages, 5 figures, to appear in 56th IEEE Conference on Decision and Control (CDC 2017)Subjects: Systems and Control (cs.SY); Learning (cs.LG); Machine Learning (stat.ML)
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a firstorder system, we construct two kernels that specifically leverage the structure of the wellknown Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.
 [16] arXiv:1709.07092 [pdf, ps, other]

Title: On Compiling DNNFs without DeterminismSubjects: Artificial Intelligence (cs.AI)
Stateoftheart knowledge compilers generate deterministic subsets of DNNF, which have been recently shown to be exponentially less succinct than DNNF. In this paper, we propose a new method to compile DNNFs without enforcing determinism necessarily. Our approach is based on compiling deterministic DNNFs with the addition of auxiliary variables to the input formula. These variables are then existentially quantified from the deterministic structure in linear time, which would lead to a DNNF that is equivalent to the input formula and not necessarily deterministic. On the theoretical side, we show that the new method could generate exponentially smaller DNNFs than deterministic ones, even by adding a single auxiliary variable. Further, we show that various existing techniques that introduce auxiliary variables to the input formulas can be employed in our framework. On the practical side, we empirically demonstrate that our new method can significantly advance DNNF compilation on certain benchmarks.
 [17] arXiv:1709.07093 [pdf, other]

Title: Near Optimal Sketching of LowRank Tensor RegressionComments: 36 pages, 2 figures, 2 tables, Accepted at NIPS 2017Subjects: Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
We study the least squares regression problem \begin{align*} \min_{\Theta \in \mathcal{S}_{\odot D,R}} \A\Thetab\_2, \end{align*} where $\mathcal{S}_{\odot D,R}$ is the set of $\Theta$ for which $\Theta = \sum_{r=1}^{R} \theta_1^{(r)} \circ \cdots \circ \theta_D^{(r)}$ for vectors $\theta_d^{(r)} \in \mathbb{R}^{p_d}$ for all $r \in [R]$ and $d \in [D]$, and $\circ$ denotes the outer product of vectors. That is, $\Theta$ is a lowdimensional, lowrank tensor. This is motivated by the fact that the number of parameters in $\Theta$ is only $R \cdot \sum_{d=1}^D p_d$, which is significantly smaller than the $\prod_{d=1}^{D} p_d$ number of parameters in ordinary least squares regression. We consider the above CP decomposition model of tensors $\Theta$, as well as the Tucker decomposition. For both models we show how to apply data dimensionality reduction techniques based on {\it sparse} random projections $\Phi \in \mathbb{R}^{m \times n}$, with $m \ll n$, to reduce the problem to a much smaller problem $\min_{\Theta} \\Phi A \Theta  \Phi b\_2$, for which if $\Theta'$ is a nearoptimum to the smaller problem, then it is also a near optimum to the original problem. We obtain significantly smaller dimension and sparsity in $\Phi$ than is possible for ordinary least squares regression, and we also provide a number of numerical simulations supporting our theory.
 [18] arXiv:1709.07094 [pdf, other]

Title: Decomposing GR(1) Games with Singleton Liveness Guarantees for Efficient SynthesisSubjects: Logic in Computer Science (cs.LO)
Temporal logic based synthesis approaches are often used to find trajectories that are correctbyconstruction for tasks in systems with complex behavior. Some examples of such tasks include synchronization for multiagent hybrid systems, reactive motion planning for robots. However, the scalability of such approaches is of concern and at times a bottleneck when transitioning from theory to practice. In this paper, we identify a class of problems in the GR(1) fragment of lineartime temporal logic (LTL) where the synthesis problem allows for a decomposition that enables easy parallelization. This decomposition also reduces the alternation depth, resulting in more efficient synthesis. A multiagent robot gridworld example with coordination tasks is presented to demonstrate the application of the developed ideas and also to perform empirical analysis for benchmarking the decompositionbased synthesis approach.
 [19] arXiv:1709.07095 [pdf, other]

Title: Practical Machine Learning for Cloud Intrusion Detection: Challenges and the Way ForwardComments: 10 pages, 9 figuresSubjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Operationalizing machine learning based security detections is extremely challenging, especially in a continuously evolving cloud environment. Conventional anomaly detection does not produce satisfactory results for analysts that are investigating security incidents in the cloud. Model evaluation alone presents its own set of problems due to a lack of benchmark datasets. When deploying these detections, we must deal with model compliance, localization, and data silo issues, among many others. We pose the problem of "attack disruption" as a way forward in the security data science space. In this paper, we describe the framework, challenges, and open questions surrounding the successful operationalization of machine learning based security detections in a cloud environment and provide some insights on how we have addressed them.
 [20] arXiv:1709.07096 [pdf, other]

Title: Covert Wireless Communication with Artificial Noise GenerationSubjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Covert communication conceals the transmission of the message from an attentive adversary. Recent work on the limits of covert communication in additive white Gaussian noise (AWGN) channels has demonstrated that a covert transmitter (Alice) can reliably transmit a maximum of $\mathcal{O}\left(\sqrt{n}\right)$ bits to a covert receiver (Bob) without being detected by an adversary (Warden Willie) in $n$ channel uses. This paper focuses on the scenario where other friendly nodes distributed according to a twodimensional Poisson point process with density $m$ are present in the environment. We propose a strategy where the friendly node closest to the adversary, without close coordination with Alice, produces artificial noise. We show that this method allows Alice to reliably and covertly send $\mathcal{O}(\min\{{n,m^{\gamma/2}\sqrt{n}}\})$ bits to Bob in $n$ channel uses, where $\gamma$ is the pathloss exponent. Moreover, we also consider a setting where there are $N_{\mathrm{w}}$ collaborating adversaries uniformly and randomly located in the environment and show that in $n$ channel uses, Alice can reliably and covertly send $\mathcal{O}\left(\min\left\{n,\frac{m^{\gamma/2} \sqrt{n}}{N_{\mathrm{w}}^{\gamma}}\right\}\right)$ bits to Bob when $\gamma >2$, and $\mathcal{O}\left(\min\left\{n,\frac{m \sqrt{n}}{N_{\mathrm{w}}^{2}\log^2 {N_{\mathrm{w}}}}\right\}\right)$ when $\gamma = 2$. Conversely, under mild restrictions on the communication strategy, we demonstrate that no higher covert throughput is possible for $\gamma>2$.
 [21] arXiv:1709.07101 [pdf, other]

Title: FairFuzz: Targeting Rare Branches to Rapidly Increase Greybox Fuzz Testing CoverageSubjects: Software Engineering (cs.SE); Cryptography and Security (cs.CR)
In recent years, fuzz testing has proven itself to be one of the most effective techniques for finding correctness bugs and security vulnerabilities in practice. One particular fuzz testing tool, American Fuzzy Lop or AFL, has become popular thanks to its easeofuse and bugfinding power. However, AFL remains limited in the depth of program coverage it achieves, in particular because it does not consider which parts of program inputs should not be mutated in order to maintain deep program coverage. We propose an approach, FairFuzz, that helps alleviate this limitation in two key steps. First, FairFuzz automatically prioritizes inputs exercising rare parts of the program under test. Second, it automatically adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts of the program. We conduct evaluation on realworld programs against stateoftheart versions of AFL, thoroughly repeating experiments to get good measures of variability. We find that on certain benchmarks FairFuzz shows significant coverage increases after 24 hours compared to stateoftheart versions of AFL, while on others it achieves high program coverage at a significantly faster rate.
 [22] arXiv:1709.07102 [pdf, other]

Title: Automatic Detection of MalwareGenerated Domains with Recurrent Neural ModelsComments: Submitted to NISK 2017Subjects: Cryptography and Security (cs.CR)
Modern malware families often rely on domaingeneration algorithms (DGAs) to determine rendezvous points to their commandandcontrol server. Traditional defence strategies (such as blacklisting domains or IP addresses) are inadequate against such techniques due to the large and continuously changing list of domains produced by these algorithms. This paper demonstrates that a machine learning approach based on recurrent neural networks is able to detect domain names generated by DGAs with high precision. The neural models are estimated on a large training set of domains generated by various malwares. Experimental results show that this datadriven approach can detect malwaregenerated domain names with a F_1 score of 0.971. To put it differently, the model can automatically detect 93 % of malwaregenerated domain names for a false positive rate of 1:100.
 [23] arXiv:1709.07104 [pdf, ps, other]

Title: On the Use of Machine TranslationBased Approaches for Vietnamese Diacritic RestorationComments: 4 pages, 2 figures, 4 tables, submitted to IALP 2017Subjects: Computation and Language (cs.CL)
This paper presents an empirical study of two machine translationbased approaches for Vietnamese diacritic restoration problem, including phrasebased and neuralbased machine translation models. This is the first work that applies neuralbased machine translation method to this problem and gives a thorough comparison to the phrasebased machine translation method which is the current stateoftheart method for this problem. On a large dataset, the phrasebased approach has an accuracy of 97.32% while that of the neuralbased approach is 96.15%. While the neuralbased method has a slightly lower accuracy, it is about twice faster than the phrasebased method in terms of inference speed. Moreover, neuralbased machine translation method has much room for future improvement such as incorporating pretrained word embeddings and collecting more training data.
 [24] arXiv:1709.07109 [pdf, other]

Title: Deconvolutional LatentVariable Model for Text Sequence MatchingSubjects: Computation and Language (cs.CL); Learning (cs.LG); Machine Learning (stat.ML)
A latentvariable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latentvariable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization. Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long ShortTerm Memory (LSTM), with less parameters and considerably faster training. Further, we apply it to text sequencematching problems. The proposed model significantly outperforms several strong sentenceencoding baselines, especially in the semisupervised setting.
 [25] arXiv:1709.07112 [pdf, other]

Title: Cooperative Adaptive Control for CloudBased RoboticsComments: ICRA 2018 submissionSubjects: Robotics (cs.RO)
This paper studies manipulator adaptive control when knowledge is shared by multiple robots through the cloud. We first consider the case of multiple robots manipulating a common object through synchronous centralized update laws to identify unknown inertial parameters. Through this development, we introduce a notion of Ensemble Sufficient Richness, wherein parameter converge can be enabled through teamwork in the group. The introduction of this property and the analysis of stable adaptive controllers that benefit from it constitute the main new contributions of this work. Building on this original example, we then consider decentralized update laws and the influence of communication delays on this process. Perhaps surprisingly, these nonidealized networked conditions inherit the same benefits of convergence being determined through ensemble effects for the group. Simple simulations of a planar manipulator identifying an unknown load are provided to illustrate the central idea and benefits of Ensemble Sufficient Richness.
 [26] arXiv:1709.07114 [pdf, other]

Title: Cost Adaptation for Robust Decentralized Swarm BehaviourSubjects: Artificial Intelligence (cs.AI)
The multiagent swarm system is a robust paradigm which can drive efficient completion of complex tasks even under energy limitations and time constraints. However, coordination of a swarm from a centralized command center can be difficult, particularly as the swarm becomes large and spans wide ranges. Here, we leverage propagation of messages based on meshnetworking protocols for global communication in the swarm and online costoptimization through decentralized receding horizon control to drive decentralized decisionmaking. Our costbased formulation allows for a wide range of tasks to be encoded. To ensure this, we implement a method for adaptation of costs and constraints which ensures effectiveness with novel tasks, network delays, heterogeneous flight capabilities, and increasingly large swarms. We use the Unity3D game engine to build a simulator capable of introducing artificial networking failures and delays in the swarm. Using the simulator we validate our method using an example coordinated exploration task. We release our simulator and code to the community for future work.
 [27] arXiv:1709.07116 [pdf, other]

Title: Variational Memory Addressing in Generative ModelsSubjects: Learning (cs.LG)
Aiming to augment generative models with external memory, we interpret the output of a memory module with stochastic addressing as a conditional mixture distribution, where a read operation corresponds to sampling a discrete memory address and retrieving the corresponding content from memory. This perspective allows us to apply variational inference to memory addressing, which enables effective training of the memory module by using the target information to guide memory lookups. Stochastic addressing is particularly wellsuited for generative models as it naturally encourages multimodality which is a prominent aspect of most highdimensional datasets. Treating the chosen address as a latent variable also allows us to quantify the amount of information gained with a memory lookup and measure the contribution of the memory module to the generative process. To illustrate the advantages of this approach we incorporate it into a variational autoencoder and apply the resulting model to the task of generative fewshot learning. The intuition behind this architecture is that the memory module can pick a relevant template from memory and the continuous part of the model can concentrate on modeling remaining variations. We demonstrate empirically that our model is able to identify and access the relevant memory contents even with hundreds of unseen Omniglot characters in memory
 [28] arXiv:1709.07121 [pdf, other]

Title: DiscreteTime Polar Opinion Dynamics with SusceptibilityComments: Extended version, with complete proofs, of a submission to the American Control Conference 2018Subjects: Social and Information Networks (cs.SI); Multiagent Systems (cs.MA); Systems and Control (cs.SY); Dynamical Systems (math.DS)
This paper considers a discretetime opinion dynamics model in which each individual's susceptibility to being influenced by others is dependent on her current opinion. We assume that the social network has timevarying topology and that the opinions are scalars on a continuous interval. We first propose a general opinion dynamics model based on the DeGroot model, with a general function to describe the functional dependence of each individual's susceptibility on her own opinion, and show that this general model is analogous to the FriedkinJohnsen model, which assumes a constant susceptibility for each individual. We then consider two specific functions in which the individual's susceptibility depends on the \emph{polarity} of her opinion, and provide motivating social examples. First, we consider stubborn positives, who have reduced susceptibility if their opinions are at one end of the interval and increased susceptibility if their opinions are at the opposite end. A court jury is used as a motivating example. Second, we consider stubborn neutrals, who have reduced susceptibility when their opinions are in the middle of the spectrum, and our motivating examples are social networks discussing established social norms or institutionalized behavior. For each specific susceptibility model, we establish the initial and graph topology conditions in which consensus is reached, and develop necessary and sufficient conditions on the initial conditions for the final consensus value to be at either extreme of the opinion interval. Simulations are provided to show the effects of the susceptibility function when compared to the DeGroot model.
 [29] arXiv:1709.07122 [pdf, other]

Title: Accelerating PageRank using PartitionCentric ProcessingComments: 12 pages, 15 figures, 7 tablesSubjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Performance (cs.PF)
PageRank is a fundamental link analysis algorithm and a key representative of the performance of other graph algorithms and Sparse Matrix Vector (SpMV) multiplication. Calculating PageRank on sparse graphs generates large amount of random memory accesses resulting in low cache line utilization and poor use of memory bandwidth. In this paper, we present a novel PartitionCentric Processing Methodology (PCPM) that drastically reduces the amount of communication with DRAM and achieves high memory bandwidth. Similar to the state of the art Binning with Vertexcentric GatherApplyScatter (BVGAS) method, PCPM performs partition wise scatter and gather of updates with both phases enjoying full cache line utilization. However, BVGAS suffers from random memory accesses and redundant read/write of update values from nodes to their neighbors. In contrast, PCPM propagates single update from source node to all destinations in a partition, thus decreasing redundancy effectively. We make use of this characteristic to develop a novel bipartite PartitionNode Graph (PNG) data layout for PCPM, that enables streaming memory accesses, with very little generation overhead. We perform detailed analysis of PCPM and provide theoretical bounds on the amount of communication and random DRAM accesses. We experimentally evaluate our approach using 6 large graph datasets and demonstrate an average 2.7x speedup in execution time and 1.7x reduction in communication, compared to the state of the art. We also show that unlike the BVGAS implementation, PCPM is able to take advantage of intelligent node labeling that enhances locality in graphs, by further reducing the amount of communication with DRAM. Although we use PageRank as the target application in this paper, our approach can be applied to generic SpMV computation.
 [30] arXiv:1709.07124 [pdf, other]

Title: Deep Recurrent NMF for Speech Separation by Unfolding Iterative ThresholdingComments: To be presented at WASPAA 2017Subjects: Sound (cs.SD); Learning (cs.LG); Machine Learning (stat.ML)
In this paper, we propose a novel recurrent neural network architecture for speech separation. This architecture is constructed by unfolding the iterations of a sequential iterative softthresholding algorithm (ISTA) that solves the optimization problem for sparse nonnegative matrix factorization (NMF) of spectrograms. We name this network architecture deep recurrent NMF (DRNMF). The proposed DRNMF network has three distinct advantages. First, DRNMF provides better interpretability than other deep architectures, since the weights correspond to NMF model parameters, even after training. This interpretability also provides principled initializations that enable faster training and convergence to better solutions compared to conventional random initialization. Second, like many deep networks, DRNMF is an order of magnitude faster at test time than NMF, since computation of the network output only requires evaluating a few layers at each time step. Third, when a limited amount of training data is available, DRNMF exhibits stronger generalization and separation performance compared to sparse NMF and stateoftheart longshort term memory (LSTM) networks. When a large amount of training data is available, DRNMF achieves lower yet competitive separation performance compared to LSTM networks.
 [31] arXiv:1709.07129 [pdf, ps, other]

Title: Hypergraph Theory: Applications in 5G Heterogeneous UltraDense NetworksSubjects: Information Theory (cs.IT)
Heterogeneous ultradense network (HUDN) can significantly increase the spectral efficiency of cellular networks and cater for the explosive growth of data traffic in the fifthgeneration (5G) communications. Due to the dense deployment of small cells (SCs), interference among neighboring cells becomes severe. As a result, the effective resource allocation and user association algorithms are essential to minimize intercell interference and optimize network performance. However, optimizing network resources in HUDN is extremely complicated as resource allocation and user association are coupled. Therefore, HUDN requires lowcomplexity but effective resource allocation schemes to address these issues. Hypergraph theory has been recognized as a useful mathematical tool to model the complex relations among multiple entities. In this article, we show how the hypergraph models can be used to effectively tackle resource allocation problems in HUDN. We also discuss several potential research issues in this field.
 [32] arXiv:1709.07130 [pdf, ps, other]

Title: Modeling and Quantifying the Forces Driving Online Video Popularity EvolutionComments: 6 pages, 3 figuresSubjects: Social and Information Networks (cs.SI); Networking and Internet Architecture (cs.NI)
Video popularity is an essential reference for optimizing resource allocation and video recommendation in online video services. However, there is still no convincing model that can accurately depict a video's popularity evolution. In this paper, we propose a dynamic popularity model by modeling the video information diffusion process driven by various forms of recommendation. Through fitting the model with real traces collected from a practical system, we can quantify the strengths of the recommendation forces. Such quantification can lead to characterizing video popularity patterns, user behaviors and recommendation strategies, which is illustrated by a case study of TV episodes.
 [33] arXiv:1709.07139 [pdf, ps, other]

Title: Learning to Prove Safety over Parameterised Concurrent Systems (Full Version)Comments: Full version of FMCAD'17 paperSubjects: Logic in Computer Science (cs.LO); Formal Languages and Automata Theory (cs.FL); Programming Languages (cs.PL)
We revisit the classic problem of proving safety over parameterised concurrent systems, i.e., an infinite family of finitestate concurrent systems that are represented by some finite (symbolic) means. An example of such an infinite family is a dining philosopher protocol with any number n of processes (n being the parameter that defines the infinite family). Regular model checking is a wellknown generic framework for modelling parameterised concurrent systems, where an infinite set of configurations (resp. transitions) is represented by a regular set (resp. regular transducer). Although verifying safety properties in the regular model checking framework is undecidable in general, many sophisticated semialgorithms have been developed in the past fifteen years that can successfully prove safety in many practical instances. In this paper, we propose a simple solution to synthesise regular inductive invariants that makes use of Angluin's classic L* algorithm (and its variants). We provide a termination guarantee when the set of configurations reachable from a given set of initial configurations is regular. We have tested L* algorithm on standard (as well as new) examples in regular model checking including the dining philosopher protocol, the dining cryptographer protocol, and several mutual exclusion protocols (e.g. Bakery, Burns, Szymanski, and German). Our experiments show that, despite the simplicity of our solution, it can perform at least as well as existing semialgorithms.
 [34] arXiv:1709.07149 [pdf, ps, other]

Title: Learning RBM with a DC programming ApproachComments: Accepted in ACML2017Subjects: Learning (cs.LG); Machine Learning (stat.ML)
By exploiting the property that the RBM loglikelihood function is the difference of convex functions, we formulate a stochastic variant of the difference of convex functions (DC) programming to minimize the negative loglikelihood. Interestingly, the traditional contrastive divergence algorithm is a special case of the above formulation and the hyperparameters of the two algorithms can be chosen such that the amount of computation per minibatch is identical. We show that for a given computational budget the proposed algorithm almost always reaches a higher loglikelihood more rapidly, compared to the standard contrastive divergence algorithm. Further, we modify this algorithm to use the centered gradients and show that it is more efficient and effective compared to the standard centered gradient algorithm on benchmark datasets.
 [35] arXiv:1709.07150 [pdf, other]

Title: Feature Engineering for Predictive Modeling using Reinforcement LearningSubjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
Feature engineering is a crucial step in the process of predictive modeling. It involves the transformation of given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target. However, there is no welldefined basis for performing effective feature engineering. It involves domain knowledge, intuition, and most of all, a lengthy process of trial and error. The human attention involved in overseeing this process significantly influences the cost of model generation. We present a new framework to automate feature engineering. It is based on performance driven exploration of a transformation graph, which systematically and compactly enumerates the space of given options. A highly efficient exploration strategy is derived through reinforcement learning on past examples.
 [36] arXiv:1709.07153 [pdf, other]

Title: Large Vocabulary Automatic Chord Estimation Using Deep Neural Nets: Design Framework, System Variations and LimitationsSubjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
In this paper, we propose a new system design framework for large vocabulary automatic chord estimation. Our approach is based on an integration of traditional sequence segmentation processes and deep learning chord classification techniques. We systematically explore the design space of the proposed framework for a range of parameters, namely deep neural nets, network configurations, input feature representations, segment tiling schemes, and training data sizes. Experimental results show that among the three proposed deep neural nets and a baseline model, the recurrent neural network based system has the best average chord quality accuracy that significantly outperforms the other considered models. Furthermore, our biasvariance analysis has identified a glass ceiling as a potential hindrance to future improvements of large vocabulary automatic chord estimation systems.
 [37] arXiv:1709.07158 [pdf, other]

Title: SceneCut: Joint Geometric and Object Segmentation for Indoor ScenesSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
This paper presents SceneCut, a novel approach to jointly discover previously unseen objects and nonobject surfaces using a single RGBD image. SceneCut's joint reasoning over scene semantics and geometry allows a robot to detect and segment object instances in complex scenes where modern deep learningbased methods either fail to separate object instances, or fail to detect objects that were not seen during training. SceneCut automatically decomposes a scene into meaningful regions which either represent objects or scene surfaces. The decomposition is qualified by an unified energy function over objectness and geometric fitting. We show how this energy function can be optimized efficiently by utilizing hierarchical segmentation trees. Moreover, we leverage a pretrained convolutional oriented boundary network to predict accurate boundaries from images, which are used to construct highquality region hierarchies. We evaluate SceneCut on several different indoor environments, and the results show that SceneCut significantly outperforms all the existing methods.
 [38] arXiv:1709.07166 [pdf, other]

Title: SemiAutomated Nasal PAP Mask Sizing using Facial PhotographsComments: 4 pages, 3 figures, 4 tables, IEEE Engineering Medicine and Biology Conference 2017Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a semiautomated system for sizing nasal Positive Airway Pressure (PAP) masks based upon a neural network model that was trained with facial photographs of both PAP mask users and nonusers. It demonstrated an accuracy of 72% in correctly sizing a mask and 96% accuracy sizing to within 1 mask size group. The semiautomated system performed comparably to sizing from manual measurements taken from the same images which produced 89% and 100% accuracy respectively.
 [39] arXiv:1709.07168 [pdf, ps, other]

Title: Indepth comparison of the Berlekamp  Massey  Sakata and the ScalarFGLM algorithms: the non adaptive variantsSubjects: Symbolic Computation (cs.SC)
We compare thoroughly the Berlekamp  Massey  Sakata algorithm and the ScalarFGLM algorithm, which compute both the ideal of relations of a multidimensional linear recurrent sequence. Suprisingly, their behaviors differ. We detail in which way they do and prove that it is not possible to tweak one of the algorithms in order to mimic exactly the behavior of the other.
 [40] arXiv:1709.07171 [pdf, ps, other]

Title: Finding Maximum Expected Termination Time of Probabilistic Timed Automata Models with Cyclic behaviorSubjects: Formal Languages and Automata Theory (cs.FL)
The paper addresses the problem of computing maximal expected termination time (or expected worst case execution time) of probabilistic timed automata (PTA) models with cyclic behavior. The problem can be studied under different settings and different assumptions. In this work, we restrict our attention to a class of probabilistic timed systems with no nondeterminism, namely purely probabilistic timed systems. The problem can exhibit extremely high computational complexity, in particular when the automaton contains nested cycles or successive cycles. However, we show that acceleration techniques can be applied to accelerate onthefly the execution of probabilistic timed cycles by collapsing their iterations. The advantages of the proposed acceleration are twofolds. First, it helps to reduce significantly the computational complexity of the problem without adversely affecting the outcome of the verification. Second, it can bring the WCET problem within the bound of feasibility for model checking. To our knowledge, this is the first work that addresses the problem of accelerating execution of probabilistic timed cycles.
 [41] arXiv:1709.07172 [pdf, ps, other]

Title: SpectralFPL: Online Spectral Learning for Single Topic ModelsComments: 8 pages, 5 figuresSubjects: Learning (cs.LG); Machine Learning (stat.ML)
This paper studies how to efficiently learn an optimal latent variable model online from large streaming data. Latent variable models can explain the observed data in terms of unobserved concepts. They are traditionally studied in the unsupervised learning setting, and learned by iterative methods such as the EM. Very few online learning algorithms for latent variable models have been developed, and the most popular one is online EM. Though online EM is computationally efficient, it typically converges to a local optimum. In this work, we motivate and develop SpectralFPL, a novel algorithm to learn latent variable models online from streaming data. SpectralFPL is computationally efficient, and we prove that it quickly learns the global optimum under a bagofwords model by deriving an $O(\sqrt n)$ regret bound. Experiment results also demonstrate a consistent performance improvement of SpectralFPL over online EM: in both synthetic and realworld experiments, SpectralFPL's performance is similar with or even better than online EM with optimally tuned parameters.
 [42] arXiv:1709.07174 [pdf, other]

Title: Agile OffRoad Autonomous Driving Using EndtoEnd Deep Imitation LearningAuthors: Yunpeng Pan, ChingAn Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos Theodorou, Byron BootsComments: 8 pagesSubjects: Robotics (cs.RO)
We present an endtoend imitation learning system for agile, offroad autonomous driving using only lowcost onboard sensors.By imitating an optimal controller, we train a deep neural network control policy to map raw, highdimensional observations to continuous steering and throttle commands, the latter of which is essential to successfully drive on varied terrain at high speed. Compared with recent approaches to similar tasks, our method requires neither state estimation nor online planning to navigate the vehicle. Realworld experimental results demonstrate successful autonomous offroad driving, matching the stateoftheart performance.
 [43] arXiv:1709.07182 [pdf, ps, other]

Title: Analysis of WirelessPowered DevicetoDevice Communications with Ambient BackscatteringComments: 6 pages, 5 figures. In Proc. IEEE VTC2017Fall, Toronto, CanadaSubjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
Selfsustainable communications based on advanced energy harvesting technologies have been under rapid development, which facilitate autonomous operation and energyefficient transmission. Recently, ambient backscattering that leverages existing RF signal resources in the air has been invented to empower data communication among lowpower devices. In this paper, we introduce hybrid devicetodevice (D2D) communications by integrating ambient backscattering and wirelesspowered communications. The hybrid D2D communications are selfsustainable, as no dedicated external power supply is required. However, since the radio signals for energy harvesting and backscattering come from external RF sources, the performance of the hybrid D2D communications needs to be optimized efficiently. As such, we design two mode selection protocols for the hybrid D2D transmitter, allowing a more flexible adaptation to the environment. We then introduce analytical models to characterize the impacts of the considered environment factors, e.g., distribution, spatial density, and transmission load of the ambient transmitters, on the hybrid D2D communications performance. Extensive simulations show that the repulsion factor among the ambient transmitters has a nontrivial impact on the communication performance. Additionally, we reveal how different mode selection protocols affect the performance metrics.
 [44] arXiv:1709.07183 [pdf]

Title: A spatial scientometric analysis of the publication output of cities worldwideAuthors: Gyorgy CsomosJournalref: Journal of Informetrics, 11 (4), 976988 (2017)Subjects: Digital Libraries (cs.DL)
In tandem with the rapid globalisation of science, spatial scientometrics has become an important research subfield in scientometric studies. Recently, numerous spatial scientometric contributions have focused on the examination of cities' scientific output by using various scientometric indicators. In this paper, I analyse cities' scientific output worldwide in terms of the number of journal articles indexed by the Scopus database, in the period from 1986 to 2015. Furthermore, I examine which countries are the most important collaborators of cities. Finally, I identify the most productive disciplines in each city. I use GPS Visualizer to illustrate the scientometric data of nearly 2,200 cities on maps. Results show that cities with the highest scientific output are mostly located in developed countries and China. Between 1986 and 2015, the greatest number of scientific articles were created in Beijing. The international hegemony of the United States in science has been described by many studies, and is also reinforced by the fact that the United States is the most important collaborator to more than 75 percent of all cities. Medicine is the most productive discipline in twothirds of cities. Furthermore, cities having the highest scientific output in specific disciplines show welldefined geographical patterns.
 [45] arXiv:1709.07184 [pdf, ps, other]

Title: Convergence characteristics of the generalized residual cutting methodComments: 15 pages, 13 figuresSubjects: Numerical Analysis (cs.NA)
The residual cutting (RC) method has been proposed for efficiently solving linear equations obtained from elliptic partial differential equations. Based on the RC, we have introduced the generalized residual cutting (GRC) method, which can be applied to general sparse matrix problems. In this paper, we study the mathematics of the GRC algorithm and and prove it is a Krylov subspace method. Moreover, we show that it is deeply related to the conjugate residual (CR) method and that GRC becomes equivalent to CR for symmetric matrices. Also, in numerical experiments, GRC shows more robust convergence and needs less memory compared to GMRES, for significantly larger matrix sizes.
 [46] arXiv:1709.07192 [pdf, other]

Title: Visual Question Generation as Dual Task of Visual Question AnsweringComments: 9 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
Recently visual question answering (VQA) and visual question generation (VQG) are two trending topics in the computer vision, which have been explored separately. In this work, we propose an endtoend unified framework, the Invertible Question Answering Network (iQAN), to leverage the complementary relations between questions and answers in images by jointly training the model on VQA and VQG tasks. Corresponding parameter sharing scheme and regular terms are proposed as constraints to explicitly leverage Q,A's dependencies to guide the training process. After training, iQAN can take either question or answer as input, then output the counterpart. Evaluated on the largescale visual question answering datasets CLEVR and VQA2, our iQAN improves the VQA accuracy over the baselines. We also show the dual learning framework of iQAN can be generalized to other VQA architectures and consistently improve the results over both the VQA and VQG tasks.
 [47] arXiv:1709.07198 [pdf, ps, other]

Title: Cyber Insurance for Heterogeneous Wireless NetworksComments: IEEE Communications Magazine (Heterogeneous Ultra Dense Networks)Subjects: Networking and Internet Architecture (cs.NI)
Heterogeneous wireless networks (HWNs) composed of densely deployed base stations of different types with various radio access technologies have become a prevailing trend to accommodate everincreasing traffic demand in enormous volume. Nowadays, users rely heavily on HWNs for ubiquitous network access that contains valuable and critical information such as financial transactions, ehealth, and public safety. Cyber risks, representing one of the most significant threats to network security and reliability, are increasing in severity. To address this problem, this article introduces the concept of cyber insurance to transfer the cyber risk (i.e., service outage, as a consequence of cyber risks in HWNs) to a third party insurer. Firstly, a review of the enabling technologies for HWNs and their vulnerabilities to cyber risks is presented. Then, the fundamentals of cyber insurance are introduced, and subsequently, a cyber insurance framework for HWNs is presented. Finally, open issues are discussed and the challenges are highlighted for integrating cyber insurance as a service of next generation HWNs.
 [48] arXiv:1709.07200 [pdf, other]

Title: Temporal Multimodal Fusion for Video Emotion Classification in the WildJournalref: ACM  ICMI 2017, Nov 2017, Glasgow, United KingdomSubjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Multimedia (cs.MM)
This paper addresses the question of emotion classification. The task consists in predicting emotion labels (taken among a set of possible labels) best describing the emotions contained in short video clips. Building on a standard framework  lying in describing videos by audio and visual features used by a supervised classifier to infer the labels  this paper investigates several novel directions. First of all, improved face descriptors based on 2D and 3D Convolutional Neural Networks are proposed. Second, the paper explores several fusion methods, temporal and multimodal, including a novel hierarchical method combining features and scores. In addition, we carefully reviewed the different stages of the pipeline and designed a CNN architecture adapted to the task; this is important as the size of the training set is small compared to the difficulty of the problem, making generalization difficult. The soobtained model ranked 4th at the 2017 Emotion in the Wild challenge with the accuracy of 58.8 %.
 [49] arXiv:1709.07212 [pdf, other]

Title: A First Derivative Potts Model for Segmentation and Denoising Using MILPComments: 6 pages, 2 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
Unsupervised image segmentation and denoising are two fundamental tasks in image processing. Usually, graph based models such as multicut are used for segmentation and variational models are employed for denoising. Our approach addresses both problems at the same time. We propose a novel MILP formulation of a first derivative Potts model, where binary variables are introduced to directly deal with the $\ell_0$ norm. As a byproduct the image is denoised. To the best of our knowledge, it is the first global mathematical programming model for simultaneous segmentation and denoising. Numerical experiments on realworld images are compared with multicut approaches.
 [50] arXiv:1709.07218 [pdf, other]

Title: 3D Deformable Object Manipulation using Fast Online Gaussian Process RegressionSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
In this paper, we present a general approach to automatically visualservo control the position and shape of a deformable object whose deformation parameters are unknown. The servocontrol is achieved by online learning a model mapping between the robotic endeffector's movement and the object's deformation measurement. The model is learned using the Gaussian Process Regression (GPR) to deal with its highly nonlinear property, and once learned, the model is used for predicting the required control at each time step. To overcome GPR's high computational cost while dealing with long manipulation sequences, we implement a fast online GPR by selectively removing uninformative observation data from the regression process. We validate the performance of our controller on a set of deformable object manipulation tasks and demonstrate that our method can achieve effective and accurate servocontrol for general deformable objects with a wide variety of goal settings. Videos are available at https://sites.google.com/view/msofogpr.
 [51] arXiv:1709.07220 [pdf, other]

Title: Human Pose Estimation using Global and Local NormalizationComments: ICCV2017Subjects: Computer Vision and Pattern Recognition (cs.CV)
In this paper, we address the problem of estimating the positions of human joints, i.e., articulated pose estimation. Recent stateoftheart solutions model two key issues, joint detection and spatial configuration refinement, together using convolutional neural networks. Our work mainly focuses on spatial configuration refinement by reducing variations of human poses statistically, which is motivated by the observation that the scattered distribution of the relative locations of joints e.g., the left wrist is distributed nearly uniformly in a circular area around the left shoulder) makes the learning of convolutional spatial models hard. We present a twostage normalization scheme, human body normalization and limb normalization, to make the distribution of the relative joint locations compact, resulting in easier learning of convolutional spatial models and more accurate pose estimation. In addition, our empirical results show that incorporating multiscale supervision and multiscale fusion into the joint detection network is beneficial. Experiment results demonstrate that our method consistently outperforms stateoftheart methods on the benchmarks.
 [52] arXiv:1709.07221 [pdf, ps, other]

Title: SelfDual Codes better than the GilbertVarshamov boundSubjects: Information Theory (cs.IT); Combinatorics (math.CO)
We show that every selforthogonal code over $\mathbb F_q$ of length $n$ can be extended to a selfdual code, if there exists selfdual codes of length $n$. Using a family of Galois towers of algebraic function fields we show that over any nonprime field $\mathbb F_q$, with $q\geq 64$, except possibly $q=125$, there are selfdual codes better than the asymptotic GilbertVarshamov bound.
 [53] arXiv:1709.07223 [pdf, other]

Title: Convolutional neural networks that teach microscopes how to imageComments: 13 pages, 6 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG); Optics (physics.optics)
Deep learning algorithms offer a powerful means to automatically analyze the content of medical images. However, many biological samples of interest are primarily transparent to visible light and contain features that are difficult to resolve with a standard optical microscope. Here, we use a convolutional neural network (CNN) not only to classify images, but also to optimize the physical layout of the imaging device itself. We increase the classification accuracy of a microscope's recorded images by merging an optical model of image formation into the pipeline of a CNN. The resulting network simultaneously determines an ideal illumination arrangement to highlight important sample features during image acquisition, along with a set of convolutional weights to classify the detected images postcapture. We demonstrate our joint optimization technique with an experimental microscope configuration that automatically identifies malariainfected cells with 510% higher accuracy than standard and alternative microscope lighting designs.
 [54] arXiv:1709.07224 [pdf, other]

Title: Learning Complex Swarm Behaviors by Exploiting Local Communication Protocols with Deep Reinforcement LearningComments: 8 pages, 6 figures, version 1Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Learning (cs.LG); Systems and Control (cs.SY); Machine Learning (stat.ML)
Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. Although there have been recent advances of deep RL algorithms applied to multiagent systems, learning communication protocols while simultaneously learning the behavior of the agents is still beyond the reach of deep RL algorithms. However, while it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multirobot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit taskspecific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building, building a communication link, and pushing an intruder. We evaluate our findings in a simulated 2Dphysics environment, and compare the implications of different communication protocols.
 [55] arXiv:1709.07229 [pdf, other]

Title: HighPerformance Derivative Computations using CoDiPackComments: 21 pages, 11 figures, 6 tables, CoDiPack: this https URLSubjects: Mathematical Software (cs.MS)
There are several AD tools available, which all implement different strategies for the reverse mode of AD. The major strategies are primal value taping (implemented e.g. by ADOLc) and Jacobi taping (implemented e.g. by adept and dco/c++). Especially for Jacobi taping, recent advances by using expression templates make this approach very attractive for large scale software. The current implementations are either closed source or miss essential features and flexibility. Therefore, we present the new AD tool CoDiPack (Code Differentiation Package) in this paper. It is specifically designed for a minimal memory consumption and optimal runtime, such that it can be used for the differentiation of large scale software. An essential part of the design of CoDiPack is the modular layout and the recursive data structures, which do not only allow the efficient implementation of the Jacobi taping approach, but will also enable other approaches like the primal value taping or new research ideas. We will also present the performance value of CoDiPack on a generic PDE example and on the SU2 code.
 [56] arXiv:1709.07241 [pdf]

Title: Satisfiability Modulo Theory based Methodology for Floorplanning in VLSI CircuitsComments: 8 pages,5 figuresSubjects: Hardware Architecture (cs.AR)
This paper proposes a Satisfiability Modulo Theory based formulation for floorplanning in VLSI circuits. The proposed approach allows a number of fixed blocks to be placed within a layout region without overlapping and at the same time minimizing the area of the layout region. The proposed approach is extended to allow a number of fixed blocks with ability to rotate and flexible blocks (with variable width and height) to be placed within a layout without overlap. Our target in all cases is reduction in area occupied on a chip which is of vital importance in obtaining a good circuit design. Satisfiability Modulo Theory combines the problem of Boolean satisfiability with domains such as convex optimization. Satisfiability Modulo Theory provides a richer modeling language than is possible with pure Boolean SAT formulas. We have conducted our experiments on MCNC and GSRC benchmark circuits to calculate the total area occupied, amount of deadspace and the total CPU time consumed while placing the blocks without overlapping. The results obtained shows clearly that the amount of dead space or wasted space is reduced if rotation is applied to the blocks.
 [57] arXiv:1709.07244 [pdf, other]

Title: Neural network identification of people hidden from view with a singlepixel, singlephoton detectorAuthors: Piergiorgio Caramazza, Alessandro Boccolini, Daniel Buschek, Matthias Hullin, Catherine Higham, Robert Henderson, Roderick MurraySmith, Daniele FaccioSubjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
Light scattered from multiple surfaces can be used to retrieve information of hidden environments. However, full threedimensional retrieval of an object hidden from view by a wall has only been achieved with scanning systems and requires intensive computational processing of the retrieved data. Here we use a nonscanning, singlephoton singlepixel detector in combination with an artificial neural network: this allows us to locate the position and to also simultaneously provide the actual identity of a hidden person, chosen from a database of people (N=3). Artificial neural networks applied to specific computational imaging problems can therefore enable novel imaging capabilities with hugely simplified hardware and processing times
 [58] arXiv:1709.07249 [pdf, other]

Title: Sorting with Recurrent Comparison ErrorsComments: 15 pages, ISAAC 2017Subjects: Data Structures and Algorithms (cs.DS)
We present a sorting algorithm for the case of recurrent random comparison errors. The algorithm essentially achieves simultaneously good properties of previous algorithms for sorting $n$ distinct elements in this model. In particular, it runs in $O(n^2)$ time, the maximum dislocation of the elements in the output is $O(\log n)$, while the total dislocation is $O(n)$. These guarantees are the best possible since we prove that even randomized algorithms cannot achieve $o(\log n)$ maximum dislocation with high probability, or $o(n)$ total dislocation in expectation, regardless of their running time.
 [59] arXiv:1709.07250 [pdf, other]

Title: Realtime predictive maintenance for wind turbines using Big Data frameworksJournalref: 2017 IEEE International Conference on Prognostics and Health Management (ICPHM)Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
This work presents the evolution of a solution for predictive maintenance to a Big Data environment. The proposed adaptation aims for predicting failures on wind turbines using a datadriven solution deployed in the cloud and which is composed by three main modules. (i) A predictive model generator which generates predictive models for each monitored wind turbine by means of Random Forest algorithm. (ii) A monitoring agent that makes predictions every 10 minutes about failures in wind turbines during the next hour. Finally, (iii) a dashboard where given predictions can be visualized. To implement the solution Apache Spark, Apache Kafka, Apache Mesos and HDFS have been used. Therefore, we have improved the previous work in terms of data process speed, scalability and automation. In addition, we have provided faulttolerant functionality with a centralized access point from where the status of all the wind turbines of a company localized all over the world can be monitored, reducing O&M costs.
 [60] arXiv:1709.07253 [pdf, other]

Title: Revisiting Resolution and InterLayer Coupling Factors in Modularity for Multilayer NetworksComments: Accepted at the IEEE/ACM Conf. on Advances in Social Network Analysis and Mining (ASONAM 2017)Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.socph)
Modularity for multilayer networks, also called multislice modularity, is parametric to a resolution factor and an interlayer coupling factor. The former is useful to express layerspecific relevance and the latter quantifies the strength of node linkage across the layers of a network. However, such parameters can be set arbitrarily, thus discarding any structure information at graph or community level. Other issues are related to the inability of properly modeling order relations over the layers, which is required for dynamic networks.
In this paper we propose a new definition of modularity for multilayer networks that aims to overcome major issues of existing multislice modularity. We revise the role and semantics of the layerspecific resolution and interlayer coupling terms, and define parameterfree unsupervised approaches for their computation, by using information from the withinlayer and interlayer structures of the communities. Moreover, our formulation of multilayer modularity is general enough to account for an available ordering of the layers and relating constraints on layer coupling. Experimental evaluation was conducted using three stateoftheart methods for multilayer community detection and nine realworld multilayer networks. Results have shown the significance of our modularity, disclosing the effects of different combinations of the resolution and interlayer coupling functions. This work can pave the way for the development of new optimization methods for discovering community structures in multilayer networks.  [61] arXiv:1709.07255 [pdf, ps, other]

Title: AssumptionBased Approaches to Reasoning with PrioritiesComments: Forthcoming in the proceedings of AI^3Subjects: Artificial Intelligence (cs.AI)
This paper maps out the relation between different approaches for handling preferences in argumentation with strict rules and defeasible assumptions by offering translations between them. The systems we compare are: nonprioritized defeats i.e. attacks, preferencebased defeats, and preferencebased defeats extended with reverse defeat.
 [62] arXiv:1709.07259 [pdf, ps, other]

Title: A CommunicationEfficient Distributed Data Structure for Topk and kSelect QueriesSubjects: Data Structures and Algorithms (cs.DS)
We consider the scenario of $n$ sensor nodes observing streams of data. The nodes are connected to a central server whose task it is to compute some function over all data items observed by the nodes. In our case, there exists a total order on the data items observed by the nodes. Our goal is to compute the $k$ currently lowest observed values or a value with rank in $[(1\varepsilon)k,(1+\varepsilon)k]$ with probability $(1\delta)$. We propose solutions for these problems in an extension of the distributed monitoring model where the server can send broadcast messages to all nodes for unit cost. We want to minimize communication over multiple time steps where there are $m$ updates to a node's value in between queries. The result is composed of two main parts, which each may be of independent interest: (1) Protocols which answer Topk and kSelect queries. These protocols are memoryless in the sense that they gather all information at the time of the request. (2) A dynamic data structure which tracks for every $k$ an element close to $k$.
We describe how to combine the two parts to receive a protocol answering the stated queries over multiple time steps. Overall, for Top$k$ queries we use $O(k + \log m + \log \log n)$ and for $k$Select queries $O(\frac{1}{\varepsilon^2} \log \frac{1}{\delta} + \log m + \log^2 \log n)$ messages in expectation. These results are shown to be asymptotically tight if $m$ is not too small.  [63] arXiv:1709.07273 [pdf, other]

Title: Hybrid Beamforming Based on Implicit Channel State Information for Millimeter Wave LinksComments: 26 pagesSubjects: Information Theory (cs.IT)
Hybrid beamforming provides a promising solution to achieve high data rate transmission at millimeter waves. To implement the hybrid beamforming at the transceiver, a common solution is to decouple the transmitter and receiver functions and then optimize them separately based on given channel state information. However, many references ignore the complexity of the highdimensional channel estimation problem or the effort for the subsequent step of computing the singular value decomposition of a large channel matrix. To this end, we present a lowcomplexity scheme that exploits implicit channel knowledge to facilitate the design of hybrid beamforming for frequencyselective fading channels. The implicit channel knowledge can be interpreted as a coupling of the channel and all pairs of possible analog beamforming vectors at the transmitter and receiver, and it is obtained by receiving the transmitted pilots. Based on the received pilots, different effective channel matrices are constructed in the sense that the matrix elements are essentially the coupling coefficients. Instead of calculating mutual information, we use the Frobenius norm of the effective channel as a key parameter of the hybrid beamforming gain. As a result, the complicated hybrid beamforming problem becomes a much simpler one: it amounts to trying different sets of the largepower received pilots to construct multiple alternatives for the effective channel matrix. Then, the set yielding the largest Frobenius norm provides the solution to the hybrid beamforming problem.
 [64] arXiv:1709.07276 [pdf, other]

Title: Speech Recognition Challenge in the Wild: Arabic MGB3Subjects: Computation and Language (cs.CL)
This paper describes the Arabic MGB3 Challenge  Arabic Speech Recognition in the Wild. Unlike last year's Arabic MGB2 Challenge, for which the recognition task was based on more than 1,200 hours broadcast TV news recordings from Aljazeera Arabic TV programs, MGB3 emphasises dialectal Arabic using a multigenre collection of Egyptian YouTube videos. Seven genres were used for the data collection: comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx). A total of 16 hours of videos, split evenly across the different genres, were divided into adaptation, development and evaluation data sets. The Arabic MGBChallenge comprised two tasks: A) Speech transcription, evaluated on the MGB3 test set, along with the 10 hour MGB2 test set to report progress on the MGB2 evaluation; B) Arabic dialect identification, introduced this year in order to distinguish between four major Arabic dialects  Egyptian, Levantine, North African, Gulf, as well as Modern Standard Arabic. Two hours of audio per dialect were released for development and a further two hours were used for evaluation. For dialect identification, both lexical features and ivector bottleneck features were shared with participants in addition to the raw audio recordings. Overall, thirteen teams submitted ten systems to the challenge. We outline the approaches adopted in each system, and summarise the evaluation results.
 [65] arXiv:1709.07278 [pdf, ps, other]

Title: Secure Energy Efficiency Optimization for MISO Cognitive Radio Network with Energy HarvestingSubjects: Information Theory (cs.IT)
This paper investigates a secure energy efficiency (SEE) optimization problem in a multipleinput singleoutput (MISO) underlay cognitive radio (CR) network. In particular, a multiantenna secondary transmitter (SUTx) simultaneously sends secured information and energy to a secondary receiver (SURx) and an energy receiver (ER), respectively, in the presence of a primary receiver (PURx). It is assumed that the SURx, ER and PURx are each equipped with a single antenna. In addition, the SUTx should satisfy constraints on maximum interference leakage to the PURx and minimum harvested energy at the ER. In this CR network, we consider the transmit covariance matrix design with the assumption of perfect channel state information (CSI) at the SUTx. In addition, it is assumed that the ER is a potential passive eavesdropper due to broadcast nature of wireless transmission. On the other hand, we consider the worstcase scenario that ER's energy harvesting requirement is only satisfied when it performs only energy harvesting without intercepting or eavesdropping information intended for the SURx. We formulate this transmit covariance matrix design as a SEE maximization problem which is a nonconvex problem due the nonlinear fractional objective function. To realize the solution for this nonconvex problem, we utilize the nonlinear fractional programming and difference of concave (DC) functions approaches to reformulate into a tractable form. Based on these techniques and the Dinkelbach's method, we propose iterative algorithms to determine the solution for the original SEE maximization problem. Numerical simulation results are provided to demonstrate the performance of the proposed transmit covariance matrix design and convergence of the proposed algorithms.
 [66] arXiv:1709.07290 [pdf, ps, other]

Title: Comparing the Switch and Curveball Markov Chains for Sampling Binary Matrices with Fixed MarginalsSubjects: Discrete Mathematics (cs.DM)
The Curveball algorithm is a variation on wellknown switchbased Markov chain approaches for uniformly sampling binary matrices with fixed row and column sums. Instead of a switch, the Curveball algorithm performs a socalled binomial trade in every iteration of the algorithm. Intuitively, this could lead to a better convergence rate for reaching the stationary (uniform) distribution in certain cases. Some experimental evidence for this has been given in the literature. In this note we give a spectral gap comparison between two switchbased chains and the Curveball chain. In particular, this comparison allows us to conclude that the Curveball Markov chain is rapidly mixing whenever one of the two switch chains is rapidly mixing. Our analysis directly extends to the case of sampling binary matrices with forbidden entries (under the assumption of irreducibility). This in particular captures the case of sampling simple directed graphs with given degrees. As a byproduct of our analysis, we show that the switch Markov chain of the KannanTetaliVempala conjecture only has nonnegative eigenvalues if the sampled binary matrices have at least three columns. This shows that the Markov chain does not have to be made lazy, which is of independent interest. We also obtain an improved bound on the smallest eigenvalue for the switch Markov chain studied by Greenhill for uniformly sampling simple directed regular graphs.
 [67] arXiv:1709.07308 [pdf, other]

Title: Predicting Positive and Negative Links with Noisy Queries: Theory & PracticeAuthors: Charalampos E. Tsourakakis, Michael Mitzenmacher, Jarosław Błasiok, Ben Lawson, Preetum Nakkiran, Vasileios NakosComments: 17 pages, 12 figures. arXiv admin note: text overlap with arXiv:1609.00750Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM); Learning (cs.LG); Social and Information Networks (cs.SI); Combinatorics (math.CO)
Social networks and interactions in social media involve both positive and negative relationships. Signed graphs capture both types of relationships: positive edges correspond to pairs of "friends", and negative edges to pairs of "foes". The edge sign prediction problem, that aims to predict whether an interaction between a pair of nodes will be positive or negative, is an important graph mining task for which many heuristics have recently been proposed [Leskovec 2010].
We model the edge sign prediction problem as follows: we are allowed to query any pair of nodes whether they belong to the same cluster or not, but the answer to the query is corrupted with some probability $0<q<\frac{1}{2}$. Let $\delta=12q$ be the bias. We provide an algorithm that recovers all signs correctly with high probability in the presence of noise for any constant gap $\delta$ with $O(\frac{n\log n}{\delta^4})$ queries. Our algorithm uses breadth first search as its main algorithmic primitive. A byproduct of our proposed learning algorithm is the use of $st$ paths as an informative feature to predict the sign of the edge $(s,t)$. As a heuristic, we use edge disjoint $st$ paths of short length as a feature for predicting edge signs in realworld signed networks. Our findings suggest that the use of paths improves the classification accuracy, especially for pairs of nodes with no common neighbors.  [68] arXiv:1709.07310 [pdf, other]

Title: Constant Bearing Pursuit on Branching GraphsSubjects: Systems and Control (cs.SY); Robotics (cs.RO)
Cyclic pursuit frameworks provide an efficient way to create useful global behaviors out of pairwise interactions in a collective of autonomous robots. Earlier work studied cyclic pursuit with a constant bearing (CB) pursuit law, and has demonstrated the existence of a variety of interesting behaviors for the corresponding dynamics. In this work, by attaching multiple branches to a single cycle, we introduce a modified version of this framework which allows us to consider any weakly connected pursuit graph where each node has an outdegree of 1. This provides a further generalization of the cyclic pursuit setting. Then, after showing existence of relative equilibria (rectilinear or circling motion), pure shape equilibria (spiraling motion) and periodic orbits, we also derive necessary conditions for stability of a 3agent collective. By paving a way for individual agents to join or leave a collective without perturbing the motion of others, our approach leads to improved reliability of the overall system.
 [69] arXiv:1709.07314 [pdf, ps, other]

Title: Exact Learning of Lightweight Description Logic OntologiesSubjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
We study the problem of learning description logic (DL) ontologies in Angluin et al.'s framework of exact learning via queries. We admit membership queries ("is a given subsumption entailed by the target ontology?") and equivalence queries ("is a given ontology equivalent to the target ontology?"). We present three main results: (1) ontologies formulated in (two relevant versions of) the description logic DLLite can be learned with polynomially many queries of polynomial size; (2) this is not the case for ontologies formulated in the description logic EL, even when only acyclic ontologies are admitted; and (3) ontologies formulated in a fragment of EL related to the web ontology language OWL 2 RL can be learned in polynomial time. We also show that neither membership nor equivalence queries alone are sufficient in cases (1) and (3).
 [70] arXiv:1709.07322 [pdf, other]

Title: Playing for BenchmarksComments: Published at the International Conference on Computer Vision (ICCV 2017)Subjects: Computer Vision and Pattern Recognition (cs.CV)
We present a benchmark suite for visual perception. The benchmark is based on more than 250K highresolution video frames, all annotated with groundtruth data for both lowlevel and highlevel vision tasks, including optical flow, semantic instance segmentation, object detection and tracking, objectlevel 3D scene layout, and visual odometry. Groundtruth data for all tasks is available for every frame. The data was collected while driving, riding, and walking a total of 184 kilometers in diverse ambient conditions in a realistic virtual world. To create the benchmark, we have developed a new approach to collecting groundtruth data from simulated worlds without access to their source code or content. We conduct statistical analyses that show that the composition of the scenes in the benchmark closely matches the composition of corresponding physical environments. The realism of the collected data is further validated via perceptual experiments. We analyze the performance of stateoftheart methods for multiple tasks, providing reference baselines and highlighting challenges for future research. The supplementary video can be viewed at https://youtu.be/T9OybWv923Y
 [71] arXiv:1709.07326 [pdf, other]

Title: AffordanceNet: An EndtoEnd Deep Learning Approach for Object Affordance DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
We propose AffordanceNet, a new deep learning approach to simultaneously detect multiple objects and their affordances from RGB images. Our AffordanceNet has two branches: an object detection branch to localize and classify the object, and an affordance detection branch to assign each pixel in the object to its most probable affordance label. The proposed framework employs three key components for effectively handling the multiclass problem in the affordance mask: a sequence of deconvolutional layers, a robust resizing strategy, and a multitask loss function. The experimental results on the public datasets show that our AffordanceNet outperforms recent stateoftheart methods by a fair margin, while its endtoend architecture allows the inference at the speed of 150ms per image. This makes our AffordanceNet is well suitable for realtime robotic applications. Furthermore, we demonstrate the effectiveness of AffordanceNet in different testing environments and in real robotic applications. The source code and trained models will be made available.
 [72] arXiv:1709.07330 [pdf, other]

Title: HDenseUNet: Hybrid Densely Connected UNet for Liver and Liver Tumor Segmentation from CT VolumesSubjects: Computer Vision and Pattern Recognition (cs.CV)
Liver and liver tumor segmentation plays an important role in hepatocellular carcinoma diagnosis and treatment planning. Recently, fully convolutional neural networks (FCNs) serve as the backbone in many volumetric medical image segmentation tasks, including 2D and 3D FCNs. However, 2D convolutions can not fully leverage the spatial information along the $z$axis direction while 3D convolutions suffer from high computational cost and GPU memory consumption. To address these issues, we propose a novel hybrid densely connected UNet (HDenseUNet), which consists of a 2D DenseUNet for efficiently extracting intraslice features and a 3D counterpart for hierarchically aggregating volumetric contexts under the spirit of the autocontext algorithm. In this way, the HDenseUNet can harness the hybrid deep features effectively for volumetric segmentation. We extensively evaluated our method on the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge. Our method outperformed other stateoftheart methods on the overall score, with dice per case on liver and tumor as 0.961 and 0.686, as well as global dice score on liver and tumor as 0.965 and 0.829, respectively.
 [73] arXiv:1709.07337 [pdf, other]

Title: Efficient Column Generation for Cell Detection and SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
We study the problem of instance segmentation in biological images with crowded and compact cells. We formulate this task as an integer program where variables correspond to cells and constraints enforce that cells do not overlap. To solve this integer program, we propose a column generation formulation where the pricing program is solved via exact optimization of very small scale integer programs. Column generation is tightened using odd set inequalities which fit elegantly into pricing problem optimization. Our column generation approach achieves fast stable anytime inference for our instance segmentation problems. We demonstrate on three distinct light microscopy datasets, with several hundred cells each, that our proposed algorithm rapidly achieves or exceeds state of the art accuracy.
 [74] arXiv:1709.07346 [pdf, other]

Title: ExtendedAlphabet FiniteContext ModelsAuthors: João M. Carvalho, Susana Brás, Diogo Pratas, Jacqueline Ferreira, Sandra C. Soares, Armando J. PinhoSubjects: Information Theory (cs.IT)
The Normalized Relative Compression is a recent dissimilarity metric, related to the Kolmogorov Complexity. It has been successfully used in different applications, like DNA sequences, images or even ECG (electrocardiographic) signal. It uses a reference based compressor to represent a string and compresses a target string using that reference. One possible approach is to use finitecontext models (\textrm{FCM}s) to represent the strings.
A finitecontext model (\textrm{FCM}) calculates the probability distribution of the next symbol, given the previous $k$ symbols. In this paper, we introduce a generalization of the \textrm{FCM}s, called extended finitecontext models (\textrm{xaFCM}), that calculate the probability of occurrence of the next $d$ symbols, given the previous $k$ symbols.
As a testbed application, we show the results for performing ECG biometric identification, using an ECG database previously used in other studies for ECG biometric identification. The results show that often \textrm{xaFCM}s outperform \textrm{FCM}s in accuracy ratios, using less memory, to represent the models, and less computational time.  [75] arXiv:1709.07356 [pdf, ps, other]

Title: User Association and Bandwidth Allocation for Terrestrial and Aerial Base Stations with Backhaul ConsiderationsSubjects: Networking and Internet Architecture (cs.NI); Optimization and Control (math.OC)
Drone base stations (DBSs) can enhance network coverage and area capacity by moving supply towards demand when required. This degree of freedom could be especially useful for future applications with extreme demands, such as ultra reliable and low latency communications (uRLLC). However, deployment of DBSs can face several challenges. One issue is finding the 3D placement of such BSs to satisfy dynamic requirements of the system. Second, the availability of reliable wireless backhaul links and the related resource allocation are principal issues that should be considered. Finally, association of the users with BSs becomes an involved problem due to mobility of DBSs. In this paper, we consider a macroBS (MBS) and several DBSs that rely on the wireless links to the MBS for backhauling. Considering regular and uRLLC users, we propose an algorithm to find efficient 3D locations of DBSs in addition to the userBS associations and wireless backhaul bandwidth allocations to maximize the sum logarithmic rate of the users. To this end, a decomposition method is employed to first find the userBS association and bandwidth allocations. Then DBS locations are updated using a heuristic particle swarm optimization algorithm. Simulation results show the effectiveness of the proposed method and provide useful insights on the effects of traffic distributions and antenna beamwidth.
 [76] arXiv:1709.07357 [pdf]

Title: Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and RelatednessComments: To appear in: Proceedings of the 16th World Congress on Medical and Health Informatics 21st25th August Hangzhou, China (2017). Please visit and cite the canonical version once availableSubjects: Computation and Language (cs.CL)
Estimation of semantic similarity and relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two categories: methods based on distributional statistics drawn from text corpora, and methods using the structure of existing knowledge resources. Methods in the former category disregard taxonomic structure, while those in the latter fail to consider semantically relevant empirical information. In this paper, we present a method that retrofits distributional context vector representations of biomedical concepts using structural information from the UMLS Metathesaurus, such that the similarity between vector representations of linked concepts is augmented. We evaluated it on the UMNSRS benchmark. Our results demonstrate that retrofitting of concept vector representations leads to better correlation with human raters for both similarity and relatedness, surpassing the best results reported to date. They also demonstrate a clear improvement in performance on this reference standard for retrofitted vector representations, as compared to those without retrofitting.
 [77] arXiv:1709.07358 [pdf, ps, other]

Title: NonDepthFirst Search against Independent Distributions on an ANDOR TreeAuthors: Toshio SuzukiComments: 12 pages, 1 figureSubjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI)
Suzuki and Niida (Ann. Pure. Appl. Logic, 2015) showed the following results on independent distributions (IDs) on an ANDOR tree, where they took only depthfirst algorithms into consideration. (1) Among IDs such that probability of the root having value 0 is fixed as a given r such that 0 < r < 1, if d is a maximizer of cost of the best algorithm then d is an independent and identical distribution (IID). (2) Among all IDs, if d is a maximizer of cost of the best algorithm then d is an IID. In the case where nondepthfirst algorithms are taken into consideration, the counter parts of (1) and (2) are left open in the above work. Peng et al. (Inform. Process. Lett., 2017) extended (1) and (2) to multibranching trees, where in (2) they put an additional hypothesis on IDs that probability of the root having value 0 is neither 0 nor 1. We give positive answers for the two questions of SuzukiNiida. A key to the proof is that if ID d achieves the equilibrium among IDs then we can chose an algorithm of the best cost against d from depthfirst algorithms. In addition, we extend the result of Peng et al. to the case where nondepthfirst algorithms are taken into consideration.
 [78] arXiv:1709.07368 [pdf, other]

Title: Multilabel Pixelwise Classification for Reconstruction of Largescale Urban AreasSubjects: Computer Vision and Pattern Recognition (cs.CV)
Object classification is one of the many holy grails in computer vision and as such has resulted in a very large number of algorithms being proposed already. Specifically in recent years there has been considerable progress in this area primarily due to the increased efficiency and accessibility of deep learning techniques. In fact, for singlelabel object classification [i.e. only one object present in the image] the stateoftheart techniques employ deep neural networks and are reporting very close to humanlike performance. There are specialized applications in which singlelabel objectlevel classification will not suffice; for example in cases where the image contains multiple intertwined objects of different labels.
In this paper, we address the complex problem of multilabel pixelwise classification. We present our distinct solution based on a convolutional neural network (CNN) for performing multilabel pixelwise classification and its application to largescale urban reconstruction. A supervised learning approach is followed for training a 13layer CNN using both LiDAR and satellite images. An empirical study has been conducted to determine the hyperparameters which result in the optimal performance of the CNN. Scale invariance is introduced by training the network on five different scales of the input and labeled data. This results in six pixelwise classifications for each different scale. An SVM is then trained to map the six pixelwise classifications into a singlelabel. Lastly, we refine boundary pixel labels using graphcuts for maximum aposteriori (MAP) estimation with Markov Random Field (MRF) priors. The resulting pixelwise classification is then used to accurately extract and reconstruct the buildings in largescale urban areas. The proposed approach has been extensively tested and the results are reported.  [79] arXiv:1709.07377 [pdf, other]

Title: Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTEComments: 22 pages, 15 figuresSubjects: Learning (cs.LG)
Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. SMOTE algorithm and its variations generate synthetic samples along a line segment that joins minority class instances. In this paper we propose Geometric SMOTE (GSMOTE) as a generalization of the SMOTE data generation mechanism. GSMOTE generates synthetic samples in a geometric region of the input space, around each selected minority instance. While in the basic configuration this region is a hypersphere, GSMOTE allows its deformation to a hyperspheroid and finally to a line segment, emulating, in the last case, the SMOTE mechanism. The performance of GSMOTE is compared against multiple standard oversampling algorithms. We present empirical results that show a significant improvement in the quality of the generated data when GSMOTE is used as an oversampling algorithm.
 [80] arXiv:1709.07383 [pdf, other]

Title: Urban Land Cover Classification with Missing Data Using Deep Convolutional Neural NetworksSubjects: Computer Vision and Pattern Recognition (cs.CV)
Automatic urban land cover classification is a classical problem in remote sensing and good urban land cover maps build the foundation for many tasks, such as e.g. environmental monitoring. It is a particularly challenging problem, as classes generally have high interclass and low intraclass variance. A common technique to improve urban land cover classification performance in remote sensing is the fusing of data from different sensors with different data modalities. However, all modalities are rarely available for all test data, and this missing data problem poses severe challenges for multimodal learning. Inspired by recent successes in deep learning, we propose as a remedy a convolutional neural network (CNN) architecture for urban remote sensing image segmentation trained on data modalities which are not all available at test time. We train our architecture with a cost function particularly suited for imbalanced classes, as this is a frequent problem in remote sensing, especially in urban areas. We demonstrate the method using two benchmark datasets, both consisting of optical and digital surface model (DSM) images. We simulate missing data, by assuming that the DSM images are missing during testing and show that our method outperforms both CNNs trained on optical images as well as an ensemble of two CNNs trained only on optical images. We further evaluate the potential of our method to handle situations where only some DSM images are missing during training and show that we can clearly exploit training time information of the missing modality during testing.
 [81] arXiv:1709.07387 [pdf]

Title: Analysis of Big Data Maturity Stage in Hospitality IndustrySubjects: Computers and Society (cs.CY)
Big data analytics has an extremely significant impact on many areas in all businesses and industries including hospitality. This study aims to guide information technology (IT) professionals in hospitality on their big data expedition. In particular, the purpose of this study is to identify the maturity stage of the big data in hospitality industry in an objective way so that hotels be able to understand their progress, and realize what it will take to get to the next stage of big data maturity through the scores they will receive based on the survey.
 [82] arXiv:1709.07390 [pdf]

Title: On the economics of knowledge creation and sharingAuthors: Omar MetwallySubjects: Computers and Society (cs.CY)
This work bridges the technical concepts underlying distributed computing and blockchain technologies with their profound socioeconomic and sociopolitical implications, particularly on academic research and the healthcare industry. Several examples from academia, industry, and healthcare are explored throughout this paper. The limiting factor in contemporary life sciences research is often funding: for example, to purchase expensive laboratory equipment and materials, to hire skilled researchers and technicians, and to acquire and disseminate data through established academic channels. In the case of the U.S. healthcare system, hospitals generate massive amounts of data, only a small minority of which is utilized to inform current and future medical practice. Similarly, corporations too expend large amounts of money to collect, secure and transmit data from one centralized source to another. In all three scenarios, data moves under the traditional paradigm of centralization, in which data is hosted and curated by individuals and organizations and of benefit to only a small subset of people.
 [83] arXiv:1709.07399 [pdf, other]

Title: EXPOSE the Line Failures following a CyberPhysical Attack on the Power GridSubjects: Systems and Control (cs.SY)
Recent attacks on power grids demonstrated the vulnerability of the grids to cyber and physical attacks. To analyze this vulnerability, we study cyberphysical attacks that affect both the power grid physical infrastructure and its underlying Supervisory Control And Data Acquisition (SCADA) system. We assume that an adversary attacks an area by: (i) disconnecting some lines within that area, and (ii) obstructing the information (e.g., status of the lines and voltage measurements) from within the area to reach the control center. We leverage the algebraic properties of the AC power flows to introduce the efficient EXPOSE Algorithm for detecting line failures and recovering voltages inside that attacked area after such an attack. The EXPOSE Algorithm outperforms the stateoftheart algorithm for detecting line failures using partial information under the AC power flow model in terms of scalability and accuracy. The main advantages of the EXPOSE Algorithm are that its running time is independent of the size of the grid and number of line failures, and that it provides accurate information recovery under some conditions on the attacked area. Moreover, it approximately recovers the information and provides the confidence of the solution when these conditions do not hold.
 [84] arXiv:1709.07401 [pdf, other]

Title: Influence of Personal Preferences on Link Dynamics in Social NetworksComments: 12 pagesJournalref: Complexity Volume 2017 (2017),Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.socph)
We study a unique network dataset including periodic surveys and electronic logs of dyadic contacts via smartphones. The participants were a sample of freshmen entering university in the Fall 2011. Their opinions on a variety of political and social issues and lists of activities on campus were regularly recorded at the beginning and end of each semester for the first three years of study. We identify a behavioral network defined by call and text data, and a cognitive network based on friendship nominations in egonetwork surveys. Both networks are limited to study participants. Since a wide range of attributes on each node were collected in selfreports, we refer to these networks as attributerich networks. We study whether student preferences for certain attributes of friends can predict formation and dissolution of edges in both networks. We introduce a method for computing student preferences for different attributes which we use to predict link formation and dissolution. We then rank these attributes according to their importance for making predictions. We find that personal preferences, in particular political views, and preferences for common activities help predict link formation and dissolution in both the behavioral and cognitive networks.
 [85] arXiv:1709.07403 [pdf, other]

Title: Inducing Distant Supervision in Suggestion Mining through PartofSpeech EmbeddingsSubjects: Computation and Language (cs.CL)
Mining suggestion expressing sentences from a given text is a less investigated sentence classification task, and therefore lacks hand labeled benchmark datasets. In this work, we propose and evaluate two approaches for distant supervision in suggestion mining. The distant supervision is obtained through a large silver standard dataset, constructed using the text from wikiHow and Wikipedia. Both the approaches use a LSTM based neural network architecture to learn a classification model for suggestion mining, but vary in their method to use the silver standard dataset. The first approach directly trains the classifier using this dataset, while the second approach only learns word embeddings from this dataset. In the second approach, we also learn POS embeddings, which interestingly gives the best classification accuracy.
 [86] arXiv:1709.07407 [pdf, other]

Title: ComSens: Exploiting Pilot Diversity for Pervasive Integration of Communication and Sensing in MIMOTDDFrameworksComments: To be published on IEEE Milcom 2017Subjects: Information Theory (cs.IT)
In this paper, we propose a fullyintegrated radar and communication system  named ComSens. We utilize two different pilot sequences (one for uplink and one for downlink) with the condition that they must be uncorrelated to each other. Within such a framework, the signal received from enduser and the backscattered signal from the desired objects have uncorrelated pilots. Thus, the basestation is able to distinguish data signal from user and backscattered signal from object. We assume a time division duplex (TDD) framework. The pilot sequences are designed for MIMO channels. We evaluate channel MSE as a figure of merit for communication system. We also show that the designed pilots are uncorrelated for a range of time lags. Moreover, designed uplink pilot has negligible autocorrelation for a range of time lags leading to an impulselike autocorrelation for radar sensing.
 [87] arXiv:1709.07417 [pdf, other]

Title: Neural Optimizer Search with Reinforcement LearningSubjects: Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)
We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google's neural machine translation system.
 [88] arXiv:1709.07429 [pdf]

Title: Learned Features are better for Ethnicity ClassificationComments: 15 pages, 8 figures, 2 tables, code and framework available on requestJournalref: Cybernetics and Information Technologies, vol. 17, no. 3, pp. 152164, Sep., 2017Subjects: Computer Vision and Pattern Recognition (cs.CV)
Ethnicity is a key demographic attribute of human beings and it plays a vital role in automatic facial recognition and have extensive real world applications such as Human Computer Interaction (HCI); demographic based classification; biometric based recognition; security and defense to name a few. In this paper we present a novel approach for extracting ethnicity from the facial images. The proposed method makes use of a pre trained Convolutional Neural Network (CNN) to extract the features and then Support Vector Machine (SVM) with linear kernel is used as a classifier. This technique uses translational invariant hierarchical features learned by the network, in contrast to previous works, which use hand crafted features such as Local Binary Pattern (LBP); Gabor etc. Thorough experiments are presented on ten different facial databases which strongly suggest that our approach is robust to different expressions and illuminations conditions. Here we consider ethnicity classification as a three class problem including Asian, AfricanAmerican and Caucasian. Average classification accuracy over all databases is 98.28%, 99.66% and 99.05% for Asian, AfricanAmerican and Caucasian respectively.
 [89] arXiv:1709.07432 [pdf, other]

Title: Dynamic Evaluation of Neural Sequence ModelsSubjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL)
We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, allowing them to assign higher probabilities to reoccurring sequential patterns. Dynamic evaluation is demonstrated to compare favourably with existing adaptation approaches for language modelling. We apply dynamic evaluation to improve the state of the art wordlevel perplexities on the Penn Treebank and WikiText2 datasets to 51.1 and 44.3 respectively, and the state of the art characterlevel crossentropy on the Hutter prize dataset to 1.17 bits/character.
 [90] arXiv:1709.07434 [pdf, other]

Title: Analyzing users' sentiment towards popular consumer industries and brands on TwitterComments: 8 pages, 11 figures, 1 table, 2017 IEEE International Conference on Data Mining Workshops (ICDMW 2017), ICDM Sentiment Elicitation from Natural Text for Information Retrieval and Extraction (ICDM SENTIRE) 2017 workshopJournalref: 2017 IEEE International Conference on Data Mining Workshops (ICDMW 2017)Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
Social media serves as a unified platform for users to express their thoughts on subjects ranging from their daily lives to their opinion on consumer brands and products. These users wield an enormous influence in shaping the opinions of other consumers and influence brand perception, brand loyalty and brand advocacy. In this paper, we analyze the opinion of 19M Twitter users towards 62 popular industries, encompassing 12,898 enterprise and consumer brands, as well as associated subject matter topics, via sentiment analysis of 330M tweets over a period spanning a month. We find that users tend to be most positive towards manufacturing and most negative towards service industries. In addition, they tend to be more positive or negative when interacting with brands than generally on Twitter. We also find that sentiment towards brands within an industry varies greatly and we demonstrate this using two industries as use cases. In addition, we discover that there is no strong correlation between topic sentiments of different industries, demonstrating that topic sentiments are highly dependent on the context of the industry that they are mentioned in. We demonstrate the value of such an analysis in order to assess the impact of brands on social media. We hope that this initial study will prove valuable for both researchers and companies in understanding users' perception of industries, brands and associated topics and encourage more research in this field.
 [91] arXiv:1709.07439 [pdf]

Title: Promises and Challenges in Continuous Tracking Utilizing Amino Acids in Skin Secretions for Active MultiFactor Biometric Authentication for CybersecurityComments: Keywords: authentication; forensics; biosensing;amino acids; enzymesJournalref: ChemPhysChem 18 (13), 17141720 (2017)Subjects: Cryptography and Security (cs.CR); Quantitative Methods (qbio.QM)
We consider a new concept of biometricbased cybersecurity systems for active authentication by continuous tracking, which utilizes biochemical processing of metabolites present in skin secretions. Skin secretions contain a large number of metabolites and small molecules that can be targeted for analysis. Here we argue that amino acids found in sweat can be exploited for the establishment of an amino acid profile capable of identifying an individual user of a mobile or wearable device. Individual and combinations of amino acids processed by biocatalytic cascades yield physical (optical or electronic) signals, providing a timeseries of several outputs that, in their entirety, should suffice to authenticate a specific user based on standard statistical criteria. Initial results, motivated by biometrics, indicate that single amino acid levels can provide analog signals that vary according to the individual donor, albeit with limited resolution versus noise. However, some such assays offer digital separation (into welldefined ranges of values) according to groups such as age, biological sex, race, and physiological state of the individual. Multiinput biocatalytic cascades that handle several amino acid signals to yield a single digitaltype output, as well as continuoustracking timeseries data rather than a singleinstance sample, should enable active authentication at the level of an individual.
Crosslists for Fri, 22 Sep 17
 [92] arXiv:1704.06919 (crosslist from math.OC) [pdf, ps, other]

Title: Partially separable convexlyconstrained optimization with nonLipschitzian singularities and its complexitySubjects: Optimization and Control (math.OC); Computational Complexity (cs.CC)
An adaptive regularization algorithm using highorder models is proposed for partially separable convexly constrained nonlinear optimization problems whose objective function contains nonLipschitzian $\ell_q$norm regularization terms for $q\in (0,1)$. It is shown that the algorithm using an $p$th order Taylor model for $p$ odd needs in general at most $O(\epsilon^{(p+1)/p})$ evaluations of the objective function and its derivatives (at points where they are defined) to produce an $\epsilon$approximate firstorder critical point. This result is obtained either with Taylor models at the price of requiring the feasible set to be 'kernelcentered' (which includes bound constraints and many other cases of interest), or for nonLipschitz models, at the price of passing the difficulty to the computation of the step. Since this complexity bound is identical in order to that already known for purely Lipschitzian minimization subject to convex constraints [CartGoulToin2016], the new result shows that introducing nonLipschitzian singularities in the objective function may not affect the worstcase evaluation complexity order. The result also shows that using the problem's partially separable structure (if present) does not affect complexity order either. A final (worse) complexity bound is derived for the case where Taylor models are used with a general convex feasible set.
 [93] arXiv:1705.04895 (crosslist from math.OC) [pdf, other]

Title: Evaluation complexity bounds for smooth constrained nonlinear optimisation using scaled KKT conditions, highorder models and the criticality measure $χ$Subjects: Optimization and Control (math.OC); Computational Complexity (cs.CC)
Evaluation complexity for convexly constrained optimization is considered and it is shown first that the complexity bound of $O(\epsilon^{3/2})$ proved by Cartis, Gould and Toint (IMAJNA 32(4) 2012, pp.16621695) for computing an $\epsilon$approximate firstorder critical point can be obtained under significantly weaker assumptions. Moreover, the result is generalized to the case where highorder derivatives are used, resulting in a bound of $O(\epsilon^{(p+1)/p})$ evaluations whenever derivatives of order $p$ are available. It is also shown that the bound of $O(\epsilon_P^{1/2}\epsilon_D^{3/2})$ evaluations ($\epsilon_P$ and $\epsilon_D$ being primal and dual accuracy thresholds) suggested by Cartis, Gould and Toint (SINUM, 2015) for the general nonconvex case involving both equality and inequality constraints can be generalized to a bound of $O(\epsilon_P^{1/p}\epsilon_D^{(p+1)/p})$ evaluations under similarly weakened assumptions. This paper is variant of a companion report (NTR112015, University of Namur, Belgium) which uses a different firstorder criticality measure to obtain the same complexity bounds.
 [94] arXiv:1705.07285 (crosslist from math.OC) [pdf, ps, other]

Title: Optimality of orders one to three and beyond: characterization and evaluation complexity in constrained nonconvex optimizationSubjects: Optimization and Control (math.OC); Computational Complexity (cs.CC); Numerical Analysis (cs.NA)
Necessary conditions for highorder optimality in smooth nonlinear constrained optimization are explored and their inherent intricacy discussed. A twophase minimization algorithm is proposed which can achieve approximate first, second and thirdorder criticality and its evaluation complexity is analyzed as a function of the choice (among existing methods) of an inner algorithm for solving subproblems in each of the two phases. The relation between highorder criticality and penalization techniques is finally considered, showing that standard algorithmic approaches will fail if approximate constrained highorder critical points are sought.
 [95] arXiv:1709.07155 (crosslist from math.ST) [pdf, other]

Title: Local Private Hypothesis Testing: ChiSquare TestsSubjects: Statistics Theory (math.ST); Cryptography and Security (cs.CR)
The local model for differential privacy is emerging as the reference model for practical applications collecting and sharing sensitive information while satisfying strong privacy guarantees. In the local model, there is no trusted entity which is allowed to have each individual's raw data as is assumed in the traditional curator model for differential privacy. So, individuals' data are usually perturbed before sharing them.
We explore the design of private hypothesis tests in the local model, where each data entry is perturbed to ensure the privacy of each participant. Specifically, we analyze locally private chisquare tests for goodness of fit and independence testing, which have been studied in the traditional, curator model for differential privacy.  [96] arXiv:1709.07267 (crosslist from stat.ML) [pdf, other]

Title: Yet Another ADNI Machine Learning Paper? Paving The Way Towards Fullyreproducible Research on Classification of Alzheimer's DiseaseAuthors: Jorge SamperGonzález, Ninon Burgos, Sabrina Fontanella, Hugo Bertin, MarieOdile Habert, Stanley Durrleman, Theodoros Evgeniou, Olivier ColliotJournalref: Proc. Machine Learning in Medical Imaging MLMI 2017, MICCAI Worskhop, Lecture Notes in Computer Science, volume 10541, pp 5360, SpringerSubjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (qbio.NC); Quantitative Methods (qbio.QM)
In recent years, the number of papers on Alzheimer's disease classification has increased dramatically, generating interesting methodological ideas on the use machine learning and feature extraction methods. However, practical impact is much more limited and, eventually, one could not tell which of these approaches are the most efficient. While over 90\% of these works make use of ADNI an objective comparison between approaches is impossible due to variations in the subjects included, image preprocessing, performance metrics and crossvalidation procedures. In this paper, we propose a framework for reproducible classification experiments using multimodal MRI and PET data from ADNI. The core components are: 1) code to automatically convert the full ADNI database into BIDS format; 2) a modular architecture based on Nipype in order to easily plugin different classification and feature extraction tools; 3) feature extraction pipelines for MRI and PET data; 4) baseline classification approaches for unimodal and multimodal features. This provides a flexible framework for benchmarking different feature extraction and classification tools in a reproducible manner. We demonstrate its use on all (1519) baseline T1 MR images and all (1102) baseline FDG PET images from ADNI 1, GO and 2 with SPMbased feature extraction pipelines and three different classification techniques (linear SVM, anatomically regularized SVM and multiple kernel learning SVM). The highest accuracies achieved were: 91% for AD vs CN, 83% for MCIc vs CN, 75% for MCIc vs MCInc, 94% for ADA$\beta$+ vs CNA$\beta$ and 72% for MCIcA$\beta$+ vs MCIncA$\beta$+. The code is publicly available at https://gitlab.icminstitute.org/aramislab/ADML (depends on the Clinica software platform, publicly available at this http URL).
 [97] arXiv:1709.07268 (crosslist from quantph) [pdf, ps, other]

Title: On Composite Quantum Hypothesis TestingComments: 14 pagesSubjects: Quantum Physics (quantph); Information Theory (cs.IT); Mathematical Physics (mathph)
We extend quantum Stein's lemma in asymmetric quantum hypothesis testing to composite null and alternative hypotheses. As our main result, we show that the asymptotic error exponent for testing convex combinations of quantum states $\rho^{\otimes n}$ against convex combinations of quantum states $\sigma^{\otimes n}$ is given by a regularized quantum relative entropy distance formula. We prove that in general such a regularization is needed but also discuss various settings where our formula as well as extensions thereof become singleletter. This includes a novel operational interpretation of the relative entropy of coherence in terms of hypothesis testing. For our proof, we start from the composite Stein's lemma for classical probability distributions and lift the result to the noncommutative setting by only using elementary properties of quantum entropy. Finally, our findings also imply an improved Markov type lower bound on the quantum conditional mutual information in terms of the regularized quantum relative entropy  featuring an explicit and universal recovery map.
 [98] arXiv:1709.07270 (crosslist from math.NA) [pdf, other]

Title: A New Framework for $\mathcal{H}_2$Optimal Model ReductionSubjects: Numerical Analysis (math.NA); Numerical Analysis (cs.NA)
In this contribution, a new framework for H2optimal reduction of multipleinput, multiple output linear dynamical systems by tangential interpolation is presented. The framework is motivated by the local nature of both tangential interpolation and H2optimal approxi mations. The main advantage is given by a decoupling of the cost of optimization from the cost of reduction, resulting in a significant speedup in H2optimal reduction. In addition, a middlesized surrogate model is produced at no additional cost and can be used e.g. for error estimation. Numerical examples illustrate the new framework, showing its effectiveness in producing H2optimal reduced models at a far lower cost than conventional algorithms. The paper ends with a brief discussion on how the idea behind the framework can be extended to approximate further system classes, thus showing that this truly is a general framework for interpolatory H2 reduction rather than just an additional reduction algorithm.
 [99] arXiv:1709.07333 (crosslist from math.OC) [pdf, ps, other]

Title: Symbolic Optimal ControlSubjects: Optimization and Control (math.OC); Systems and Control (cs.SY)
We present novel results on the solution of a class of leavable, undiscounted optimal control problems in the minimax sense for nonlinear, continuousstate, discretetime plants. The problem class includes entry(exit)time problems as well as minimum time, pursuitevasion and reachavoid games as special cases. We utilize auxiliary optimal control problems ("abstractions") to compute both upper bounds of the value function, i.e., of the achievable closedloop performance, and symbolic feedback controllers realizing those bounds. The abstractions are obtained from discretizing the problem data, and we prove that the computed bounds and the performance of the symbolic controllers converge to the value function as the discretization parameters approach zero. In particular, if the optimal control problem is solvable on some compact subset of the state space, and if the discretization parameters are sufficiently small, then we obtain a symbolic feedback controller solving the problem on that subset. These results do not assume the continuity of the value function or any problem data, and they fully apply in the presence of hard state and control constraints.
 [100] arXiv:1709.07359 (crosslist from stat.ML) [pdf, other]

Title: ClassSplitting Generative Adversarial NetworksSubjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)
Generative Adversarial Networks (GANs) produce systematically better quality samples when class label information is provided., i.e. in the conditional GAN setup. This is still observed for the recently proposed Wasserstein GAN formulation which stabilized adversarial training and allows considering high capacity network architectures such as ResNet. In this work we show how to boost conditional GAN by augmenting available class labels. The new classes come from clustering in the representation space learned by the same GAN model. The proposed strategy is also feasible when no class information is available, i.e. in the unsupervised setup. Our generated samples reach stateoftheart Inception scores for CIFAR10 and STL10 datasets in both supervised and unsupervised setup.
 [101] arXiv:1709.07409 (crosslist from quantph) [pdf, other]

Title: Quantum Autoencoders via Quantum Adders with Genetic AlgorithmsSubjects: Quantum Physics (quantph); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
The quantum autoencoder is a recent paradigm in the field of quantum machine learning, which may enable an enhanced use of resources in quantum technologies. To this end, quantum neural networks with less nodes in the inner than in the outer layers were considered. Here, we propose a useful connection between approximate quantum adders and quantum autoencoders. Specifically, this link allows us to employ optimized approximate quantum adders, obtained with genetic algorithms, for the implementation of quantum autoencoders for a variety of initial states. Furthermore, we can also directly optimize the quantum autoencoders via genetic algorithms. Our approach opens a different path for the design of quantum autoencoders in controllable quantum platforms.
 [102] arXiv:1709.07433 (crosslist from stat.ML) [pdf, other]

Title: Perturbative Black Box Variational InferenceComments: Neural Information Processing Systems (NIPS 2017), to appear. Joint first authorship of first two mentioned authorsSubjects: Machine Learning (stat.ML); Learning (cs.LG)
Black box variational inference (BBVI) with reparameterization gradients triggered the exploration of divergence measures other than the KullbackLeibler (KL) divergence, such as alpha divergences. These divergences can be tuned to be more masscovering (preventing overfitting in complex models), but are also often harder to optimize using MonteCarlo gradients. In this paper, we view BBVI with generalized divergences as a form of biased importance sampling. The choice of divergence determines a biasvariance tradeoff between the tightness of the bound (low bias) and the variance of its gradient estimators. Drawing on variational perturbation theory of statistical physics, we use these insights to construct a new variational bound which is tighter than the KL bound and more mass covering. Compared to alphadivergences, its reparameterization gradients have a lower variance. We show in several experiments on Gaussian Processes and Variational Autoencoders that the resulting posterior covariances are closer to the true posterior and lead to higher likelihoods on heldout data.
Replacements for Fri, 22 Sep 17
 [103] arXiv:1509.02574 (replaced) [pdf]

Title: Barriers to Integration: Physical Boundaries and the Spatial Structure of Residential SegregationSubjects: Physics and Society (physics.socph); Information Theory (cs.IT); Methodology (stat.ME)
 [104] arXiv:1602.00602 (replaced) [pdf, other]

Title: Virtual Machine Warmup Blows Hot and ColdComments: 40 pages, 11 figures, 9 tables, 3 listingsSubjects: Programming Languages (cs.PL)
 [105] arXiv:1603.06470 (replaced) [pdf, other]

Title: Frankenstein: Learning Deep Face Representations using Small DataComments: IEEE TIPSubjects: Computer Vision and Pattern Recognition (cs.CV)
 [106] arXiv:1607.03738 (replaced) [pdf, other]

Title: Do semantic parts emerge in Convolutional Neural Networks?Subjects: Computer Vision and Pattern Recognition (cs.CV)
 [107] arXiv:1607.05602 (replaced) [pdf, ps, other]

Title: Wireless Information and Power Transfer: Nonlinearity, Waveform Design and RateEnergy TradeoffAuthors: Bruno ClerckxComments: submitted for possible journal publicationSubjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
 [108] arXiv:1611.10076 (replaced) [pdf, other]

Title: Android InterApp Communication Threats and Detection TechniquesAuthors: Shweta Bhandari, Wafa Ben Jaballah, Vineeta Jain, Vijay Laxmi, Akka Zemmari, Manoj Singh Gaur, Mohamed Mosbah, Mauro ContiComments: 83 pages, 4 figures, This is a survey paperJournalref: computers & security 70 (2017) 392421Subjects: Cryptography and Security (cs.CR)
 [109] arXiv:1612.02545 (replaced) [pdf, ps, other]
 [110] arXiv:1612.03653 (replaced) [pdf, other]

Title: Learning to Drive using Inverse Reinforcement Learning and Deep QNetworksComments: NIPS workshop on Deep Learning for Action and Interaction, 2016Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)
 [111] arXiv:1612.04099 (replaced) [pdf, other]

Title: An attack against the Helios election system that exploits revotingSubjects: Cryptography and Security (cs.CR)
 [112] arXiv:1701.02195 (replaced) [pdf, ps, other]

Title: Distributed Load Shedding for Microgrid with Compensation Support via Wireless NetworkComments: 28 pages, 10 figuresSubjects: Systems and Control (cs.SY)
 [113] arXiv:1701.09177 (replaced) [pdf, other]

Title: A Dirichlet Mixture Model of Hawkes Processes for Event Sequence ClusteringJournalref: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USASubjects: Learning (cs.LG); Machine Learning (stat.ML)
 [114] arXiv:1703.01313 (replaced) [pdf, other]

Title: Secure Virtual Network Embedding in a MultiCloud EnvironmentSubjects: Networking and Internet Architecture (cs.NI)
 [115] arXiv:1703.03391 (replaced) [pdf, ps, other]

Title: Firstorder logic with incomplete informationAuthors: Antti KuusistoSubjects: Logic (math.LO); Logic in Computer Science (cs.LO)
 [116] arXiv:1703.06452 (replaced) [pdf, other]

Title: Algorithms for Semantic Segmentation of Multispectral Remote Sensing Imagery using Deep LearningSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
 [117] arXiv:1703.08692 (replaced) [pdf, other]

Title: Robust Decentralized Abstractions for Multiple Mobile ManipulatorsComments: Accepted for publication in the IEEE Conference on Decision and Control, Melbourne, Australia, 2017Subjects: Systems and Control (cs.SY)
 [118] arXiv:1704.05537 (replaced) [pdf, other]

Title: Signaling on the Continuous Spectrum of Nonlinear Optical fiberJournalref: Optics Express, vol. 25, issue 16, pages 1868518702, July 2017Subjects: Information Theory (cs.IT); Numerical Analysis (cs.NA); Optics (physics.optics)
 [119] arXiv:1704.08464 (replaced) [pdf, other]

Title: Consensus measure of rankingsSubjects: Artificial Intelligence (cs.AI)
 [120] arXiv:1705.02940 (replaced) [pdf, other]

Title: Optimized Data Representation for Interactive Multiview NavigationSubjects: Multimedia (cs.MM)
 [121] arXiv:1705.03284 (replaced) [pdf, other]

Title: Towards a complexity theory for the congested cliqueSubjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computational Complexity (cs.CC)
 [122] arXiv:1705.03811 (replaced) [pdf, other]

Title: From 3D Models to 3D Prints: an Overview of the Processing PipelineComments: European Union (EU); Horizon 2020; H2020FoF2015; RIA  Research and Innovation action; Grant agreement N. 680448Subjects: Graphics (cs.GR)
 [123] arXiv:1705.06628 (replaced) [pdf]

Title: Robust tracking of respiratory rate in highdynamic range scenes using mobile thermal imagingComments: Vol. 8, No. 10, 1 Oct 2017, Biomedical Optics Express 4480  Full abstract can be found in this journal article (due to limited word counts of 'arXiv abstract')Journalref: Biomedical Optics Express, 2017Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.medph)
 [124] arXiv:1705.07666 (replaced) [pdf, other]

Title: Goal Clustering: VNS based heuristicsAuthors: Pedro MartinsSubjects: Discrete Mathematics (cs.DM)
 [125] arXiv:1705.08426 (replaced) [pdf, other]

Title: Symbolic LTLf SynthesisSubjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI)
 [126] arXiv:1705.08947 (replaced) [pdf, other]

Title: Deep Voice 2: MultiSpeaker Neural TexttoSpeechAuthors: Sercan Arik, Gregory Diamos, Andrew Gibiansky, John Miller, Kainan Peng, Wei Ping, Jonathan Raiman, Yanqi ZhouComments: Accepted in NIPS 2017Subjects: Computation and Language (cs.CL)
 [127] arXiv:1705.10195 (replaced) [pdf, other]

Title: Deterministic subgraph detection in broadcast CONGESTSubjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
 [128] arXiv:1706.00136 (replaced) [pdf, other]

Title: Scalable Generalized Linear Bandits: Online Computation and HashingComments: accepted to NIPS'17Subjects: Machine Learning (stat.ML); Learning (cs.LG)
 [129] arXiv:1706.06366 (replaced) [pdf, other]

Title: A Thorough Formalization of Conceptual SpacesComments: accepted at KI 2017 (this http URL), final publication is available at Springer via this http URLSubjects: Artificial Intelligence (cs.AI)
 [130] arXiv:1706.08050 (replaced) [pdf, ps, other]

Title: Minimum Connected Transversals in Graphs: New Hardness Results and Tractable Cases Using the Price of ConnectivitySubjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Discrete Mathematics (cs.DM); Combinatorics (math.CO)
 [131] arXiv:1706.08065 (replaced) [pdf, other]

Title: SURF: A new codebased signature schemeSubjects: Cryptography and Security (cs.CR)
 [132] arXiv:1706.08642 (replaced) [pdf, other]

Title: Error Characterization, Mitigation, and Recovery in Flash Memory Based SolidState DrivesJournalref: Proceedings of the IEEE 105 (2017) 1666  1704Subjects: Hardware Architecture (cs.AR)
 [133] arXiv:1707.01134 (replaced) [pdf, other]

Title: Achievable Rates for Probabilistic ShapingAuthors: Georg BöchererComments: Example 1 addedSubjects: Information Theory (cs.IT)
 [134] arXiv:1707.01245 (replaced) [pdf]

Title: Maximum Induced Matching Algorithms via Vertex Ordering CharacterizationsSubjects: Data Structures and Algorithms (cs.DS)
 [135] arXiv:1707.03076 (replaced) [pdf]

Title: The Closer the Better: Similarity of Publication Pairs at Different CoCitation LevelsSubjects: Digital Libraries (cs.DL)
 [136] arXiv:1707.05165 (replaced) [pdf, other]

Title: A Comprehensive Implementation of Conceptual SpacesComments: Accepted at AIC 2017 (this http URL), currently under review. arXiv admin note: substantial text overlap with arXiv:1707.02292 and arXiv:1706.06366Subjects: Artificial Intelligence (cs.AI)
 [137] arXiv:1708.02151 (replaced) [pdf, other]

Title: Reverse Engineering Human Mobility in Largescale Natural DisastersComments: To appear in Proceedings of MSWiM '17. 8 Pages, 9 Figures. Source code and data available at this https URLSubjects: Networking and Internet Architecture (cs.NI); Computers and Society (cs.CY)
 [138] arXiv:1708.06018 (replaced) [pdf, ps, other]

Title: Conversion of Mersenne Twister to doubleprecision floatingpoint numbersAuthors: Shin HaraseComments: 15 pagesSubjects: Numerical Analysis (math.NA); Numerical Analysis (cs.NA); Computation (stat.CO)
 [139] arXiv:1708.07241 (replaced) [pdf, other]

Title: NNVLP: A Neural NetworkBased Vietnamese Language Processing ToolkitComments: 4 pages, 5 figures, 5 tables, accepted to IJCNLP 2017, updated experiment resultsSubjects: Computation and Language (cs.CL)
 [140] arXiv:1708.08475 (replaced) [pdf, other]

Title: How Unique is Your .onion? An Analysis of the Fingerprintability of Tor Onion ServicesComments: Accepted by ACM CCS 2017Subjects: Cryptography and Security (cs.CR)
 [141] arXiv:1708.09803 (replaced) [pdf, other]

Title: Transfer Learning across LowResource, Related Languages for Neural Machine TranslationSubjects: Computation and Language (cs.CL)
 [142] arXiv:1709.00084 (replaced) [pdf, other]

Title: Behavior Trees in Robotics and AI: An IntroductionSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
 [143] arXiv:1709.00103 (replaced) [pdf, other]

Title: Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement LearningComments: 12 pages, 5 figuresSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
 [144] arXiv:1709.02721 (replaced) [pdf]

Title: On the correlation between a level of structure order and properties of composites. In Memory of Yu.L. KlimontovichAuthors: Alexander HeregaComments: in RussianSubjects: Graphics (cs.GR)
 [145] arXiv:1709.03456 (replaced) [pdf, other]

Title: CLAD: A Complex and Long Activities Dataset with Rich Crowdsourced AnnotationsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
 [146] arXiv:1709.04090 (replaced) [pdf, other]

Title: A Constrained, WeightedL1 Minimization Approach for Joint Discovery of Heterogeneous Neural Connectivity GraphsComments: 8 pagesSubjects: Neurons and Cognition (qbio.NC); Learning (cs.LG)
 [147] arXiv:1709.04514 (replaced) [pdf, other]

Title: Differentially Private Mixture of Generative Neural NetworksComments: This is the full version of the paper with the same title to appear at 17th IEEE International Conference on Data Mining series (ICDM 2017)Subjects: Learning (cs.LG); Cryptography and Security (cs.CR)
 [148] arXiv:1709.04558 (replaced) [pdf]

Title: Using NLU in Context for Question Answering: Improving on Facebook's bAbI TasksAuthors: John S. BallComments: 38 Pages, 10 TablesSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
 [149] arXiv:1709.05289 (replaced) [pdf, ps, other]

Title: Optimal approximation of piecewise smooth functions using deep ReLU neural networksSubjects: Functional Analysis (math.FA); Learning (cs.LG); Machine Learning (stat.ML)
 [150] arXiv:1709.05476 (replaced) [pdf, ps, other]

Title: Cooperative Network Synchronization: Asymptotic AnalysisSubjects: Information Theory (cs.IT)
 [151] arXiv:1709.05596 (replaced) [pdf, other]

Title: An energy and dynamical systems perspective to the dampinginduced self recovery phenomenonSubjects: Systems and Control (cs.SY)
 [152] arXiv:1709.05633 (replaced) [pdf, other]

Title: An ultralow leakage synaptic scaling homeostatic plasticity circuit with configurable time scales up to 100 kilosecondsSubjects: Emerging Technologies (cs.ET)
 [153] arXiv:1709.05715 (replaced) [pdf, other]

Title: Optimal Battery Control Under Cycle Aging MechanismsComments: Submitted to IEEE TAC. arXiv admin note: text overlap with arXiv:1703.07824Subjects: Optimization and Control (math.OC); Systems and Control (cs.SY)
 [154] arXiv:1709.06070 (replaced) [pdf, ps, other]

Title: MacWilliams' extension theorem for infinite ringsComments: 13 pagesSubjects: Rings and Algebras (math.RA); Information Theory (cs.IT)
 [155] arXiv:1709.06155 (replaced) [pdf, other]

Title: A dissipativity theorem for pdominant systemsComments: 6 pages, 2 figures, in 56th IEEE Conference on Decision and Control (CDC 2017)Subjects: Systems and Control (cs.SY); Dynamical Systems (math.DS); Optimization and Control (math.OC)
 [156] arXiv:1709.06293 (replaced) [pdf, other]

Title: Sparse Markov Decision Processes with Causal Sparse Tsallis Entropy Regularization for Reinforcement LearningComments: 15 pages, 9 figuresSubjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
 [157] arXiv:1709.06456 (replaced) [pdf, other]

Title: Balance control using both ZMP and COM height variations: A convex boundedness approachSubjects: Robotics (cs.RO)
 [158] arXiv:1709.06909 (replaced) [pdf, ps, other]

Title: Opposition based Ensemble Micro Differential EvolutionComments: This paper is accepted for presentation at IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2017), Hawaii, USA, 2017Subjects: Neural and Evolutionary Computing (cs.NE)
[ showing up to 2000 entries per page: fewer  more ]
Disable MathJax (What is MathJax?)