KI 2016: Advances in Artificial Intelligence Papers


In the previous article, I described KI 2015. In this article, I describe the 39th German Conference on Artificial Intelligence, KI 2016, held September 26-30, 2016.

This annual conference, which began in 1975 as the German Workshop on AI (GWAI), traditionally brings together academic and industrial researchers from all areas of AI and serves as an international forum for research on the fundamentals and applications of intelligent systems and algorithms.

This year’s conference was held jointly with the Austrian Society for Artificial Intelligence (ÖGAI) in Klagenfurt, Austria. The first two days of the conference were devoted to five workshops on specialized topics in artificial intelligence and a workshop on current AI research in Austria (CAIRA), followed by three days of the main technical program of the conference.

The conference received 44 submissions from 18 countries; of these 44 papers, 8 (18%) were accepted as full papers and another 12 (27%) as technical communications for the proceedings. Technical communications are short papers that can report on ongoing research, important implementation techniques or experimental results, new and interesting benchmarking issues, or other issues of interest to the AI community.

The program of KI 2016 included four keynote talks by distinguished scientists: Michael Wooldridge (“From Model Checking to Equilibrium Checking”), Thomas Eiter (“Artificial Intelligence at the Gates of Dawn?”), Michael May (“Towards Industrial Machine Intelligence”), and Ulrich Furbach (“Automated Reasoning and Cognitive Computing”).

The details are described below.

Full Papers

D-FLAT is a framework for developing algorithms that solve computational problems by dynamic programming on a tree decomposition of the problem instance. The dynamic programming algorithm is specified by means of Answer-Set Programming (ASP), allowing for declarative and succinct specifications. D-FLAT traverses the tree decomposition and calls an ASP system with the provided specification at each tree decomposition node. It is thus crucial that the evaluation of the ASP program is done in an efficient way. As experiments have shown, problems that include weights or more involved arithmetics slow down this step significantly due to the grounding step in ASP, which yields large ground programs in these cases. To overcome this problem, we equip D-FLAT with built-in counters in order to shift certain computations from the ASP side to the internal part of D-FLAT. In this paper, we highlight this new feature and provide empirical benchmarks on weighted versions of the Dominating Set problem, showing that our new version increases D-FLAT's robustness and efficiency.
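The counter idea is essentially about keeping weight arithmetic out of the grounded ASP program. As a rough, much-simplified analogy (not D-FLAT itself, and on a plain path rather than a general tree decomposition), the following Python sketch shows a dynamic programming table whose rows carry an integer weight accumulator. The problem here is weighted vertex cover on a path, a simpler relative of the weighted Dominating Set benchmark used in the paper, and the weights are made up for illustration.

```python
# A rough, simplified analogy (not D-FLAT itself): dynamic programming
# over a path-shaped instance where each table row carries an integer
# weight accumulator, instead of recomputing weights via grounding.
# The instance and weights below are made up for illustration.

def min_weight_vertex_cover_path(weights):
    """Minimum-weight vertex cover of a path graph v0 - v1 - ... - v(n-1)."""
    # take/skip = best accumulated weight with the current vertex in/out of the cover
    take, skip = weights[0], 0
    for w in weights[1:]:
        # if the new vertex is skipped, its predecessor must be in the cover
        new_skip = take
        # if the new vertex is taken, the predecessor may be taken or skipped
        new_take = min(take, skip) + w
        take, skip = new_take, new_skip
    return min(take, skip)

print(min_weight_vertex_cover_path([3, 1, 4, 1, 5]))  # -> 2 (pick v1 and v3)
```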

Recent advances of deep learning technology enable one to train complex input-output mappings, provided that a high quality training set is available. In this paper, we show how to extend an existing dataset of depth maps of hands annotated with the corresponding 3D hand poses by fitting a 3D hand model to smart glove-based annotations and generating new hand views. We make available our code and the generated data. Based on the presented procedure and our previous results, we suggest a pipeline for creating high quality data.

We look at probabilistic first-order formalisms where the domain objects are known. In these formalisms, the standard approach for inference is lifted variable elimination. To benefit from the advantages of the junction tree algorithm for inference in the first-order setting, we transfer the idea of lifting to the junction tree algorithm.

Our lifted junction tree algorithm aims at reducing computations by introducing first-order junction trees that compactly represent symmetries. First experiments show that we speed up the computation time compared to the propositional version. When querying for multiple marginals, the lifted junction tree algorithm performs better than using lifted VE to infer each marginal individually.

For combinatorial search in single-player games, nested Monte-Carlo search is an apparent alternative to algorithms like UCT that are applied in two-player and general games. To trade exploration with exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, the integration of policy learning yields nested rollout with policy adaptation (NRPA), while Beam-NRPA keeps a bounded number of solutions in each recursion level. In this paper we propose refinements for Beam-NRPA that improve the runtime and the solution diversity.
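For readers unfamiliar with NRPA, the following Python sketch shows the core recursion (nested rollouts plus policy adaptation) on a toy sequence problem. It does not include the beam or the Beam-NRPA refinements proposed in the paper, and all problem details (horizon, move set, scoring) are invented for illustration.

```python
import math, random

# Minimal NRPA sketch on a toy problem: each of HORIZON steps offers
# moves 0..2, and the score is simply the sum of the chosen moves.

HORIZON, MOVES, ITERATIONS, ALPHA = 8, 3, 10, 1.0

def rollout(policy):
    """Play one sequence, sampling moves softmax-weighted by the policy."""
    seq, score = [], 0
    for step in range(HORIZON):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(MOVES)]
        m = random.choices(range(MOVES), weights=weights)[0]
        seq.append(m)
        score += m
    return score, seq

def adapt(policy, seq):
    """Shift probability mass toward the moves of the best sequence found."""
    new = dict(policy)
    for step, best_move in enumerate(seq):
        z = sum(math.exp(policy.get((step, m), 0.0)) for m in range(MOVES))
        for m in range(MOVES):
            g = math.exp(policy.get((step, m), 0.0)) / z
            new[(step, m)] = new.get((step, m), 0.0) + ALPHA * ((m == best_move) - g)
    return new

def nrpa(level, policy):
    if level == 0:
        return rollout(policy)
    best_score, best_seq = float("-inf"), []
    for _ in range(ITERATIONS):
        score, seq = nrpa(level - 1, policy)
        if score >= best_score:
            best_score, best_seq = score, seq
        policy = adapt(policy, best_seq)
    return best_score, best_seq

print(nrpa(2, {}))  # tends toward the all-2 sequence (score 16)
```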

This paper aims to speed up the pruning procedure that is encountered in exact value iteration for POMDPs. The value function in POMDPs can be represented by a finite set of vectors over the state space. In each step of the exact value iteration algorithm, the number of possible vectors increases linearly with the cardinality of the action set and exponentially with the cardinality of the observation set. This set of vectors should be pruned to a minimal subset retaining the same value function over the state space. The pruning procedure is therefore, in general, the bottleneck of finding the optimal policy for POMDPs. This paper analyses two different linear programming methods, the classical Lark's algorithm and the recently proposed Skyline algorithm, for detecting these useless vectors. We claim that by using information about the support region of the vectors that have already been processed, both algorithms can be drastically improved. We present comparative experiments on both randomly generated problems and POMDP benchmarks.
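The pruning step referred to above is commonly implemented with a linear program that searches for a "witness" belief at which a candidate vector beats all others. The sketch below is a hedged reconstruction of that classic test in Python using SciPy (in the spirit of Lark's algorithm); it does not include the support-region speedups proposed in the paper, and the example vectors are made up.

```python
import numpy as np
from scipy.optimize import linprog

# Classic LP domination test for exact POMDP value iteration: vector w is
# kept only if some belief exists where it beats every vector in U.
# This is an illustrative sketch, not the paper's improved procedure.

def witness_belief(w, U):
    """Return a belief where w strictly dominates all vectors in U, or None."""
    w, U = np.asarray(w, float), [np.asarray(u, float) for u in U]
    n = len(w)
    if not U:
        return np.full(n, 1.0 / n)
    # variables: b_1..b_n, d ; objective: maximize d  ->  minimize -d
    c = np.zeros(n + 1); c[-1] = -1.0
    A_ub = np.array([np.append(-(w - u), 1.0) for u in U])   # d - b.(w-u) <= 0
    b_ub = np.zeros(len(U))
    A_eq = np.array([np.append(np.ones(n), 0.0)])            # belief sums to 1
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    if res.success and -res.fun > 1e-9:                      # strictly better somewhere
        return res.x[:n]
    return None

# toy 2-state example: the third vector is dominated everywhere
vectors = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.4, 0.4])]
kept = [v for i, v in enumerate(vectors)
        if witness_belief(v, vectors[:i] + vectors[i + 1:]) is not None]
print(len(kept))  # -> 2
```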

Cognitive robotics aims at understanding biological processes, though it also has the potential to improve future robotics systems. Here we show how a biologically inspired model of motor control with neural fields can be augmented with additional components such that it is able to solve a basic robotics task, that of obstacle avoidance. While obstacle avoidance is a well researched area, the focus here is on the extensibility of a biologically inspired framework. This work demonstrates how easily the biologically inspired system can be adapted to new tasks. This flexibility is thought to be a major hallmark of biological agents.

Answer Set Programming (ASP) under the stable model semantics supports various language constructs which can be used to express the same realities in syntactically different, but semantically equivalent ways. However, these equivalent programs may not perform equally well. This is because performance depends on the underlying solver implementations that may treat different language constructs differently. As performance is very important for the successful application of ASP in real-life domains, knowledge about the mutual interchangeability and performance of ASP language constructs is crucial for knowledge engineers. In this article, we present an investigation on how the usage of different language constructs affects the performance of state-of-the-art solvers and grounders on benchmark problems from the ASP competition. Hereby, we focus on constructs used to express disjunction or choice, classical negation, and various aggregate functions. Some interesting effects of language constructs on solving performance are revealed.

We present the robotic system IRMA (Interactive Robotic Memory Aid) that assists humans in their search for misplaced belongings within a natural home-like environment. Our stand-alone system integrates state-of-the-art approaches in a novel manner to achieve a seamless and intuitive human-robot interaction. IRMA directs its gaze toward the speaker and understands the person's verbal instructions independent of specific grammatical constructions. It determines the positions of relevant objects and navigates collision-free within the environment. In addition, IRMA produces natural language descriptions for the objects' positions by using furniture as reference points. To evaluate IRMA's usefulness, a user study with 20 participants was conducted. IRMA achieves an overall user satisfaction score of 4.05 and a perceived accuracy rating of 4.15 on a scale from 1 to 5, with 5 being the best.

Technical Communications

The number of parameters leading to a defined medical cancer therapy is growing rapidly. A clinical decision support system intended for better managing the resulting complexity must be able to reason about the respective active ingredients and their interrelationships. In this paper, we present a corresponding ontology and illustrate its use for answering queries relevant for clinical therapy decisions.

For a successful automated negotiation, a vital issue is how well the agent can learn the latent preferences of opponents. In most practical cases, however, opponents are unwilling to reveal their true preferences for exploitation reasons. Existing approaches tend to resolve this issue by learning opponents from the observations made during negotiation. While useful, this is hard because of the indirect way the target function can be observed, as well as the limited amount of experience available to learn from. This situation becomes even worse for negotiation problems with a large outcome space. In this work, a new model is proposed in which agents can not only negotiate with others, but also provide information (e.g., labels) about whether an offer is accepted or rejected by a specific agent. In particular, we consider that there is a crowd of agents that can provide labels on offers for a certain payment; moreover, the collected labels are assumed to be noisy, due to the lack of expert knowledge and/or the prevalence of spammers. To respond to these challenges, we introduce a novel negotiation approach that (1) adaptively sets the aspiration level on the basis of the estimated opponent concession; (2) assigns labeling tasks to the crowd using online primal-dual techniques, such that the overall budget is minimized while keeping labeling errors sufficiently low; and (3) decides, at every stage of the negotiation, the best possible offer to be proposed.
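To make the notion of an aspiration level concrete, here is a generic time-based concession curve in Python (the well-known Boulware/Conceder-style schedule); the paper's approach instead adapts this level from the estimated opponent concession, so the function and parameters below are purely illustrative.

```python
# A generic time-based aspiration (concession) curve, shown only to
# illustrate the notion of an aspiration level; it is not the paper's
# adaptive, opponent-concession-based scheme.

def aspiration(t, u_max=1.0, u_reserve=0.3, beta=0.5):
    """Target utility at normalized negotiation time t in [0, 1].

    beta < 1 concedes slowly at first (Boulware-style),
    beta > 1 concedes quickly (Conceder-style).
    """
    return u_max - (u_max - u_reserve) * (t ** (1.0 / beta))

for t in (0.0, 0.5, 0.9, 1.0):
    print(f"t={t:.1f}  target utility={aspiration(t):.3f}")
```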

Job-shop scheduling problems constitute a major challenge in today's industrial manufacturing environments. Because of the size of realistic problem instances, applied methods can only afford low computational costs. Furthermore, because of highly dynamic production regimes, adaptability is an absolute must. In state-of-the-art production factories the large-scale problem instances are split into subinstances, and greedy dispatching rules are applied to decide which job operation is to be loaded next on a machine. In this paper we propose a novel scheduling approach inspired by those hand-crafted scheduling routines. Our approach builds on problem decomposition for keeping computational costs low, dispatching rules for effectiveness, and declarative programming for high adaptability and maintainability. We present first results proving the concept of our novel scheduling approach, based on a new large-scale job-shop benchmark with proven optimal solutions.
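As a minimal illustration of the greedy dispatching rules mentioned above (not the paper's decomposition- and ASP-based approach), the following Python sketch dispatches the operations of a tiny, made-up job-shop instance with the shortest-processing-time (SPT) rule.

```python
# Toy greedy dispatching with the shortest-processing-time (SPT) rule.
# The instance below is invented for illustration.

# jobs[j] is an ordered list of (machine, processing_time) operations
jobs = {
    "J1": [("M1", 3), ("M2", 2)],
    "J2": [("M1", 2), ("M2", 4)],
    "J3": [("M2", 1), ("M1", 2)],
}

next_op = {j: 0 for j in jobs}          # index of the next operation per job
job_ready = {j: 0 for j in jobs}        # time the job's previous operation finished
machine_free = {"M1": 0, "M2": 0}
finished, makespan = 0, 0

while finished < sum(len(ops) for ops in jobs.values()):
    # collect all operations that could be dispatched next
    candidates = []
    for j, ops in jobs.items():
        if next_op[j] < len(ops):
            m, p = ops[next_op[j]]
            start = max(job_ready[j], machine_free[m])
            candidates.append((p, start, j, m))      # SPT: shortest duration first
    p, start, j, m = min(candidates)
    end = start + p
    job_ready[j], machine_free[m] = end, end
    next_op[j] += 1
    finished += 1
    makespan = max(makespan, end)
    print(f"{j} on {m}: {start} -> {end}")

print("makespan:", makespan)
```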

Recent sustainability efforts require machine scheduling approaches to consider energy efficiency in the optimization of schedules. In this paper, an approach to reduce power peaks while maintaining the makespan is proposed and evaluated. The central concept of the approach is to slowly equalize highs and lows in the energy input of the schedule without affecting the makespan through an iterative optimization. The approach is based on the simulated annealing algorithm to optimize machine schedules regarding the makespan and the energy input, using the goal programming method as the objective function.
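A rough sketch of the underlying idea, assuming a deliberately simplified setting: jobs with invented power demands are assigned to a fixed number of time slots, and plain simulated annealing equalizes the per-slot power so that the peak drops while the horizon (and thus the makespan) stays fixed. The paper's goal-programming objective is not reproduced here.

```python
import math, random

# Generic simulated annealing on a toy power-leveling problem:
# minimize the peak per-slot power for a fixed number of time slots.

random.seed(0)
POWERS = [5, 3, 8, 2, 7, 4, 6, 1]   # hypothetical power demand per job
SLOTS = 4                            # fixed horizon, so the makespan stays put

def peak(assignment):
    load = [0] * SLOTS
    for job, slot in enumerate(assignment):
        load[slot] += POWERS[job]
    return max(load)

current = [random.randrange(SLOTS) for _ in POWERS]
best = current[:]
temperature = 10.0
while temperature > 0.01:
    # neighbor: move one randomly chosen job to a random slot
    neighbor = current[:]
    neighbor[random.randrange(len(POWERS))] = random.randrange(SLOTS)
    delta = peak(neighbor) - peak(current)
    if delta <= 0 or random.random() < math.exp(-delta / temperature):
        current = neighbor
        if peak(current) < peak(best):
            best = current[:]
    temperature *= 0.995

print("peak power:", peak(best), "assignment:", best)
```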

In this work, several approaches to feature extraction on sets of time-based events are developed and evaluated. These sets of events are, on the one hand, extracted from video files and, on the other hand, manually annotated. Using methods of supervised machine learning, the two sets of events are mapped onto each other. After that, per time slot and requested event type, a binary classification is applied. Thus, aspects of data mining and media technology are discussed and combined with the goal of reaching a reasonable reduction of the input set by projecting it onto an output set. This saves operator time in an automated process environment for quality control of audiovisual files. It can be shown that this objective can be achieved by applying the developed methods. In addition, further results and limitations are presented.

Freespace navigation for autonomous robots is of growing industrial impact, especially in the logistics and warehousing domain. In this work, we describe a multiagent simulation solution to the physical vehicle routing problem, which extends the physical traveling salesman problem (a recent benchmark used in robot motion planning research) by considering more than one concurrent vehicle.

For the interaction of vehicles, we compute the collision of physical bodies and then apply the impact resulting from the elastic collision. A multi-threaded controller is implemented which forwards the proposed actions from each individual robot's controller to the real-time simulator of the environment. For computing an optimized assignment of the pickup and delivery tasks to the vehicles, we apply nested Monte-Carlo tree search.

In the experiments, we study the problem of robot navigation for automated pickup and delivery of shelves to and from picking stations.
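For illustration, the impulse from an elastic collision reduces in one dimension to the textbook formulas below; the actual simulator resolves collisions between full physical bodies, so this is only a minimal sketch.

```python
# Minimal 1D elastic collision: post-collision velocities of two bodies.
# Shown only as an illustration of the collision impulse mentioned above.

def elastic_collision_1d(m1, v1, m2, v2):
    """Return the velocities of both bodies after a 1D elastic collision."""
    v1_new = ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2)
    v2_new = ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2)
    return v1_new, v2_new

# equal masses simply swap velocities
print(elastic_collision_1d(1.0, 2.0, 1.0, -1.0))  # -> (-1.0, 2.0)
```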

We tackle social media analysis based on trending topics like "super bowl" and "oscars 2016" acquired from channels such as Twitter or Google. Our approach addresses the identification of semantically related topics (such as "oscars 2016" and "leonardo dicaprio") by enriching trends with textual context acquired from news search and applying clustering and tracking in term space. In quantitative experiments on manually annotated trends from Feb-Mar 2016, we demonstrate this approach to work reliably (with an F1-score of > 90%) and to outperform several baselines, including knowledge graph modelling using DBpedia as well as a direct comparison of articles or terms.
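A hedged sketch of the general idea in Python: each trend is enriched with textual context (here hard-coded toy snippets instead of news search results), and two trends are reported as related whenever their TF-IDF cosine similarity in term space exceeds a threshold. The texts and the threshold are invented, and the paper's clustering and tracking steps are not reproduced.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy contexts standing in for news-search enrichment of each trend.
contexts = {
    "oscars 2016":       "oscars academy awards ceremony best picture winners red carpet",
    "leonardo dicaprio": "actor wins academy award best actor the revenant oscars",
    "super bowl":        "nfl championship game halftime show broncos panthers",
}

topics = list(contexts)
tfidf = TfidfVectorizer().fit_transform(contexts[t] for t in topics)
sim = cosine_similarity(tfidf)

THRESHOLD = 0.1
for i in range(len(topics)):
    for j in range(i + 1, len(topics)):
        if sim[i, j] > THRESHOLD:
            print(f"related: {topics[i]!r} <-> {topics[j]!r} ({sim[i, j]:.2f})")
```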

Collection and maintenance of biodiversity data is in need of automation. We present first results of an automated and model-free approach to species identification from herbarium specimens kept in herbaria worldwide. Methodologically, our approach relies on standard methods for the detection and description of so-called interest points and their classification into species-characteristic categories using standard supervised learning tools. To keep the approach model-free on the one hand, but also offer opportunities for species identification even in very challenging cases on the other hand, we allow inducing specific knowledge about important visual cues by using concepts of active learning on demand. First encouraging results on selected fern species show recognition accuracies between 94% and 100%.

Workflow mining is the task of automatically detecting workflows from a set of event logs. We argue that network traffic can serve as a set of event logs and, thereby, as input for workflow mining. Networks produce large amounts of network traffic, and we are able to extract sequences of workflow events by applying data mining techniques. We come to this conclusion due to the following observation: network traffic consists of network packets, which are exchanged between network devices in order to share information and fulfill a common task. This common task corresponds to a workflow event and, when observed over time, we are able to record sequences of workflow events and model workflows as Hidden Markov Models (HMMs). Sequences of workflow events are caused by network dependencies, which force distributed network devices to interact. To automatically derive workflows based on network traffic, we propose a methodology based on network service dependency mining.
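As a deliberately simplified illustration of the event-sequence idea (a first-order Markov chain rather than the full Hidden Markov Model, with made-up workflow events), the following Python snippet estimates transition probabilities from observed sequences.

```python
from collections import Counter, defaultdict

# Simplified sketch: estimate a first-order Markov chain over workflow
# events. The event sequences below are invented; the paper instead
# derives events from network traffic and models them with HMMs.

sequences = [
    ["login", "query_db", "render_page", "logout"],
    ["login", "query_db", "query_db", "render_page", "logout"],
    ["login", "render_page", "logout"],
]

transitions = defaultdict(Counter)
for seq in sequences:
    for a, b in zip(seq, seq[1:]):
        transitions[a][b] += 1

for state, counts in transitions.items():
    total = sum(counts.values())
    probs = {nxt: round(c / total, 2) for nxt, c in counts.items()}
    print(f"{state:12s} -> {probs}")
```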

Despite the ample advantages of model-based diagnosis, in practice its use has been somewhat limited to proof-of-concept prototypes. Some reasons behind this observation are that the required modeling step is resource consuming, and also that this step requires additional training. In order to overcome these problems, we suggest using modeling languages like Modelica, which are already established in academia and industry for describing cyber-physical systems, as a basis for deriving logic-based models. Together with observations about the modeled system, those models can then be used by an abductive diagnosis engine for deriving the root causes of detected defects. The idea behind our approach is to introduce fault models for the components written in Modelica, and to use the available simulation environment to determine behavioral deviations from the expected outcome of a fault-free model. The introduced fault models and the gained information about the resulting deviations can be directly mapped to Horn clauses to be used for diagnosis.

One long-term goal of artificial intelligence and robotics research is the development of robot systems which have approximately the same cognitive, communicational, and handling abilities as humans. This yields several challenges for future robot systems. For instance, in the field of communicational abilities, future robot systems have to bridge between the natural communication methods of humans, primarily utilizing symbols like words or gestures, and the natural communication methods of artificial systems, primarily utilizing low-level subsymbolic control interfaces. In this work, we outline a system which utilizes physical properties and effects for the mapping between a high-level symbolic user interface and a low-level subsymbolic robot control interface.

There is a growing interest in behavior-based biometrics. Although biometric data shows considerable variation for an individual and may be faked, the combination of such 'weak experts' can be rather strong. A remotely detectable component is gaze direction estimation and thus eye movement patterns. Here, we present a novel personalization method for gaze estimation systems which does not require a precise calibration setup, can be non-obtrusive, and is fast and easy to use. We show that it improves the precision of gaze direction estimation algorithms considerably. The method is convenient; we exploit 3D face model reconstruction to artificially enrich a small number of collected data.

Sister Conference Contributions/Extended Abstracts

Recent developments in information extraction have enabled the construction of huge Knowledge Graphs (KGs), e.g., DBpedia [1] or YAGO [8]. To complete and curate modern KGs, inductive logic programming and data mining methods have been introduced to identify frequent data patterns, e.g., "Married people live in the same place", and cast them as rules like r1: livesIn(Y,Z) ← isMarriedTo(X,Y), livesIn(X,Z). These rules can be used for various purposes: first, since KGs operate under the Open World Assumption (OWA, i.e., absent facts are treated as unknown), the rules can be applied to derive new, potentially true facts. Second, rules can be used to eliminate erroneous information from the KG.

Existing learning methods are restricted to Horn rules [4] (i.e., rules with only positive body atoms), which are insufficient to capture more complex patterns such as r2: livesIn(Y,Z) ← isMarriedTo(X,Y), livesIn(X,Z), not researcher(Y), i.e., nonmonotonic rules. While r1 generally holds, the additional knowledge that Y is a researcher could explain why a few instances of isMarriedTo do not live together; this can prevent inferring the missing living place by relying only on the isMarriedTo relations.

Profound knowledge about at least the fundamentals of Artificial Intelligence (AI) will become increasingly important for careers in science and engineering. Therefore, we present an innovative educational project teaching fundamental concepts of artificial intelligence at high school level. We developed a high school AI course (called "iRobot") dealing with major topics of AI and computer science (automatons, agent systems, data structures, search algorithms, graphs, problem solving by searching, planning, machine learning) according to suggestions in the current literature. The course was divided into seven weekly teaching units of two hours each, comprising both theoretical and hands-on components. We conducted and empirically evaluated a pilot project in a representative Austrian high school. The results of the evaluation show that the participating students have become familiar with the concepts included in the course and the various topics addressed.

The ability to learn from conflicts is a key algorithmic ingredient in constraint satisfaction (e.g., [2, 6, 8, 20, 22, 24]). For state space search, like goal reachability in classical planning, which we consider here, progress in this direction has been elusive and almost entirely limited to length-bounded reachability, where reachability testing reduces to a constraint satisfaction problem, yet requires iterating over different length bounds until some termination criterion applies [5, 16, 19, 28]. But do we actually need a length bound to be able to do conflict analysis and nogood learning in state space search?

Arguably, the canonical form of a "conflict" in state space search is a dead-end state, from which no solution (of any length) exists. Such conflicts are not as ubiquitous as in constraint satisfaction (including length-bounded reachability), yet they do occur, e.g., in oversubscription planning [26], in planning with limited resources [11], in single-agent puzzles [4, 15], and in explicit-state model checking of safety properties [7], where a dead-end is any state from which the error property cannot be reached.

Knowledge representation and reasoning is a central concern of modern AI. Its importance has grown with the availability of large structured data collections, which are published, shared, and integrated in many applications today. Graph-based data representations, so-called knowledge graphs (KGs), have become popular in industry and academia, and occur in many formats. RDF [8] is most popular for exchanging such data on the Web, and examples of large KGs in this format include Bio2RDF [7], DBpedia [6], Wikidata [20], and YAGO [9]. Nevertheless, KGs are not always stored in their native graph format, and many reside in relational databases as well.

We present a method for detecting driver frustration from both video (driver's face) and audio (driver's voice) streams captured during the driver's interaction with an in-vehicle voice-based navigation system. We analyze a dataset of 20 drivers that contains 596 audio epochs (audio clips, with duration from 1 sec to 15 sec) and 615 video epochs (video clips, with duration from 1 sec to 45 sec). The dataset is balanced across 2 age groups, 2 vehicle systems, and both genders. The model was subject-independently trained and tested using 4-fold cross-validation. We achieve an accuracy of 77.4% for detecting frustration from a single audio epoch and 81.2% for detecting frustration from a single video epoch. We then treat the video and audio epochs as a sequence of interactions and use decision fusion to characterize the trade-off between decision time and classification accuracy, which improves the prediction accuracy to 88.5% after 9 epochs.
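The decision-fusion step can be illustrated with a simple scheme that accumulates per-epoch classifier outputs; the sketch below sums log-odds over successive epochs, and the epoch probabilities are invented. The paper's actual fusion procedure may differ.

```python
import math

# Illustrative decision fusion over successive interaction epochs:
# per-epoch probabilities are combined by summing log-odds, so later
# epochs can revise an early, uncertain decision. Values are made up.

epoch_probs = [0.55, 0.48, 0.62, 0.70, 0.66, 0.73, 0.68, 0.75, 0.71]

log_odds = 0.0
for i, p in enumerate(epoch_probs, start=1):
    log_odds += math.log(p / (1.0 - p))
    fused = 1.0 / (1.0 + math.exp(-log_odds))
    print(f"after epoch {i}: fused P(frustrated) = {fused:.3f}")
```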

This paper discusses the inconsistency in Gödel's ontological argument. Despite the popularity of Gödel's argument, this inconsistency remained unnoticed until 2013, when it was detected automatically by the higher-order theorem prover Leo-II. Complementing the meta-logic explanation for the inconsistency available in our IJCAI 2016 paper [6], we present here a new, purely object-logic explanation that does not rely on semantic argumentation.

Tableaux-based methods were among the first techniques proposed for Linear Temporal Logic satisfiability checking. The earliest tableau for LTL by [21] worked by constructing a graph whose paths represented possible models for the formula, and then searching for an actual model among those paths. Subsequent developments led to the tree-like tableau by [17], which works by building a structure similar to an actual search tree, which however still has back-edges and needs multiple passes to assess the existence of a model. This paper summarizes the work done on a new tool for LTL satisfiability checking based on a novel tableau method. The new tableau construction, which is very simple and easy to explain, builds an actually tree-shaped structure and only requires a single pass to decide whether to accept a given branch or not. The implementation has been compared in terms of speed and memory consumption with tools implementing both existing tableau methods and different satisfiability techniques, showing good results despite the simplicity of the underlying algorithm.

Answer Set Programming (ASP) has recently been employed to specify and run dynamic programming (DP) algorithms on tree decompositions, a central approach in the field of parameterized complexity, which aims at solving hard problems efficiently for instances of certain structure. This ASP-based method followed the standard DP approach where tables are computed in a bottom-up fashion, yielding good results for several counting or enumeration problems. However, for optimization problems this approach lacks the possibility to report solutions before the optimum is found, and for search problems it often computes a lot of unnecessary rows. In this paper, we present a novel ASP-based system allowing for "lazy" DP, which utilizes recent multi-shot ASP technology. Preliminary experimental results show that this approach not only yields better performance for search problems, but also outperforms some state-of-the-art ASP encodings for optimization problems in terms of anytime computation, i.e., measuring the quality of the best solution after a certain timeout.

In this paper, we explore how ontological knowledge expressed via existential rules can be combined with possibilistic networks (i) to represent qualitative preferences along with domain knowledge, and (ii) to realize preference-based answering of conjunctive queries (CQs). We call these combinations ontological possibilistic networks (OP-nets). We define skyline and k-rank answers to CQs under preferences and provide complexity (including data tractability) results for deciding consistency and CQ skyline membership for OP-nets. We show that our formalism has a lower complexity than a similar existing formalism.

Understanding the relation between different semantics in abstract argumentation is an important issue, not least since such semantics capture the basic ingredients of different approaches to nonmonotonic reasoning. The question we are interested in relates two semantics as follows: what are the necessary and sufficient conditions such that we can decide, for any two sets of extensions, whether there exists an argumentation framework which has exactly the first extension set under one semantics and the second extension set under the other semantics. We investigate in total nine argumentation semantics and give a nearly complete landscape of exact characterizations. As we shall argue, such results not only give an account of the independence between semantics, but might also prove useful in argumentation systems by providing guidelines for how to prune the search space.

State space search is a canonical approach to testing reachability in large transition systems, like goal reachability in classical planning, which is where this work is placed. Decomposition techniques for state space search have a long tradition, most notably in the form of Petri net unfolding [7, 14, 20], decomposing the search over concurrent transition paths, and factored planning [2, 4, 8, 19], decomposing the search into local vs. global planning over separate components of state variables.

Recent work by part of the authors [9, 10] has devised star-topology decoupling, which can be viewed as a hybrid between Petri net unfolding and factored planning, geared at star topologies. The state variables are factored into components whose cross-component interactions form a star topology. The search is akin to a Petri net unfolding whose atomic elements are component states, exploring concurrent paths of leaf components in the star independently. Relative to both Petri net unfolding and traditional factored planning, the key advantage lies in exploiting the star topology, which removes major sources of complexity: the need to reason about conflicts and reachable markings, respectively the need to resolve arbitrary cross-component interactions.

Abstraction heuristics are a popular method to guide optimal search algorithms in classical planning. Cost partitionings make it possible to sum heuristic estimates admissibly by partitioning action costs among the abstractions. We introduce state-dependent cost partitionings, which take context information of actions into account, and show that an optimal state-dependent cost partitioning dominates its state-independent counterpart. We demonstrate the potential of state-dependent cost partitionings with a state-dependent variant of the recently proposed saturated cost partitioning, and show that it can sometimes improve not only over its state-independent counterpart, but even over the optimal state-independent cost partitioning.
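For reference, the standard (state-independent) cost-partitioning condition and its state-dependent generalization can be written as follows, where c is the original cost function, c_i the cost function given to the i-th abstraction, h_i the heuristic computed under c_i, and h* the optimal solution cost. This is a sketch of the textbook definitions rather than the paper's notation.

```latex
% Cost partitioning: splitting the action costs keeps the sum of heuristics admissible.
\[
  \sum_{i=1}^{n} c_i(a) \le c(a) \ \ \text{for all actions } a
  \quad\Longrightarrow\quad
  \sum_{i=1}^{n} h_i(s) \le h^{*}(s) \ \ \text{for all states } s.
\]
% State-dependent variant: the split may differ from state to state.
\[
  \sum_{i=1}^{n} c_i(s, a) \le c(a) \ \ \text{for all states } s \text{ and actions } a.
\]
```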

Group decision making [4, 5, 8, 9, 11, 14] addresses the problem of finding a reasonable decision when multiple decision makers have different preferences. In this extended abstract, we give a high-level description of the key ideas from [20]. We explain how probabilistic belief merging can be applied to solve group decision problems when the preferences can be derived from agents' individual utilities and beliefs. Subsequently, we discuss some guarantees that our approach can give regarding the relationship between the individual preferences and the derived group preferences.

Being able to reason nonmonotonically is crucial both for many practical applications of artificial intelligence and for humans in their everyday lives. It has been shown that humans deviate systematically from classical logic, in particular with respect to revising previously drawn conclusions, and they are very successful in solving their everyday problems this way. Although many approaches to default and nonmonotonic reasoning in artificial intelligence have been developed in close correspondence to human commonsense reasoning, only few empirical studies have actually been carried out to support the claim that nonmonotonic and default logics are indeed suitable to formally represent human reasoning (e.g., [9, 11]), and they are mostly motivated from the points of view of computer science. In this paper, we focus on a core research problem that was first raised in psychology and was one of the first examples to make the nonmonotonicity phenomenon obvious for psychologists: the so-called suppression task was introduced in [1] to show that additional information may cause humans to give up conclusions which they have drawn previously via modus ponens. More precisely, three groups of participants received one of three types of problems: αδ (Group 1), αβδ (Group 2, also referred to as the β-case), and αγδ (Group 3, also referred to as the γ-case), where α, β, γ, and δ are symbols for specific sentences given in the original paper.

Model-Based Diagnosis is a principled AI approach to determine the possible explanations why a system under observation behaves unexpectedly. For complex systems the number of such explanations can be too large to be inspected manually by a user. In these cases, sequential diagnosis approaches can be applied. In order to find the true cause of the problem, these approaches iteratively take additional measurements to narrow down the set of possible explanations.

One computationally demanding challenge in such sequential diagnosis settings is to determine the "best" next measurement point. This paper summarizes the key ideas of our recently proposed sequential diagnosis approach, which uses the newly introduced concept of "partial" diagnoses to significantly speed up the process of determining the next measurement point. The resulting overall reductions of the required computation times to find the true cause of the problem are quantified using different benchmark problems and were achieved without the need for any information about the structure of the system.

In many sequential regression problems, the goal is to maximize the correlation between sequences of regression outputs and continuous-valued training targets, while minimizing the average deviation. For example, in continuous dimensional emotion recognition, sequences of acoustic features have to be mapped to emotion contours. As in other domains, recurrent neural networks achieve good performance on this task. Yet, the usual squared error objective functions for neural network training do not fully take the above-named goal into account. Hence, in this paper we introduce a technique for the discriminative training of neural networks using the concordance correlation coefficient as cost function. Results on the MediaEval 2013 and RECOLA databases show that the proposed method can significantly improve the evaluation criteria compared to standard mean squared error training, both in the music and speech domains.
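The concordance correlation coefficient (CCC) mentioned above combines correlation with agreement of means and variances; a minimal NumPy version (with toy arrays, and 1 - CCC as the corresponding loss) could look like this.

```python
import numpy as np

# Concordance correlation coefficient (CCC); 1 - CCC can serve as a loss.
# The target and output arrays below are toy values for illustration.

def concordance_cc(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

targets = [0.1, 0.4, 0.35, 0.8, 0.6]
outputs = [0.2, 0.35, 0.3, 0.7, 0.65]
ccc = concordance_cc(targets, outputs)
print(f"CCC = {ccc:.3f}, loss = {1 - ccc:.3f}")
```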

In the next article, we will discuss KI 2017.
