Clojure and Functional Programming


About Clojure and Functional Programming

Clojure is a relatively new language, created by Rich Hickey and introduced in 2007. It is a dialect of Lisp, a language first introduced in 1958, and it runs on top of the JVM, so it can use existing code written in Java.

One of Clojure's features is that it is a functional language. Functional programming is one of the more recent trends in the history of programming languages: programs are composed of building blocks called functions, as opposed to the procedural style common in languages such as Python and JavaScript, where procedures are laid out step by step.

One of the perspectives in the development of programming languages is to improve reusability. The object-oriented languages that dominated the world before functional languages were also developed from this perspective, but the idea of constructing programs in blocks of functions has further improved reusability.

In addition, the REPL (read-eval-print loop), in which functions are evaluated as they are written, helps catch bugs at the time of writing and makes coding more efficient. Furthermore, the “data = code” feature of Lisp has the potential to support artificial intelligence technologies, including automatic program generation.

This blog covers the following topics: overview, setting up the environment, details of the language and its application to web applications, machine learning and artificial intelligence techniques.

Overview

  • Overview of Code as Data and Examples of Algorithms and Implementations

“Code as Data” refers to a concept or approach that treats the code of a program itself as data, and is a method that allows programs to be manipulated, analyzed, transformed, and processed as data structures. Normally, a program receives an input, executes a specific procedure or algorithm on it, and outputs the result. In “Code as Data,” on the other hand, the program itself is treated as data and manipulated by other programs. This allows programs to be handled more flexibly, dynamically, and abstractly.
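The idea can be sketched in a few lines. The example below is only an illustration, written in Python with the standard `ast` module rather than in Clojure (where code is already made of ordinary lists and needs no separate parser). The expression and the `SwapAddForMul` transformer are hypothetical names introduced here: the program parses source text into a tree, rewrites that tree as data, and then evaluates both versions.

```python
import ast

# Hypothetical expression, used only for illustration.
source = "1 + 2 * 3"


class SwapAddForMul(ast.NodeTransformer):
    """Rewrite every addition in the tree into a multiplication."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform child nodes first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def evaluate(expr_tree):
    """Compile an expression tree and evaluate it."""
    code = compile(ast.fix_missing_locations(expr_tree), "<ast>", "eval")
    return eval(code)


# Parse the source text into a data structure (an abstract syntax tree).
original_tree = ast.parse(source, mode="eval")
original_value = evaluate(original_tree)            # evaluates 1 + 2 * 3

# Manipulate the program as data, then run the rewritten program.
rewritten_tree = SwapAddForMul().visit(ast.parse(source, mode="eval"))
rewritten_value = evaluate(rewritten_tree)          # evaluates 1 * 2 * 3
```

The same program text yields two different results depending on how the tree is transformed, which is the essence of treating code as data.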

Technical Topics

In order to program, it is necessary to create a development environment for each language. This section describes how to set up specific development environments for Python, Clojure, C, Java, R, LISP, Prolog, Javascript, and PHP, as described in this blog. Each language has its own platform to facilitate development, which makes it possible to easily set up the environment, but this section focuses on the simplest case.

This section describes how to set up a Clojure development environment using Sublime Text 4 and VS Code, respectively.

Boot is a Clojure build framework and ad-hoc Clojure script evaluator. It provides a runtime environment that includes all the tools needed to build Clojure projects from scripts written in Clojure and executed in the context of a project.

Leiningen is also an excellent development tool, but its configuration can become complicated, especially for complex applications such as web systems, and Leiningen’s project files can become very verbose. In contrast, Boot uses a file called “build.boot,” which acts like Ruby’s Gemfile or Node.js’s package.json and lists the libraries and tasks to be used.

Describes the environment setup (JVM, text editor (Spacemacs), build/library management tool (Leiningen)) needed to get started with Clojure.

Using the REPL and writing a first simple function

About various data structures and EDN

File input/output functions are the most basic and indispensable functions when programming. Since file input/output functions are procedural instructions, each language has its own way of implementing them. Concrete implementations of file input/output in various languages are described below.

The basic functionality of a programming language rests on the three control structures of structured languages: (1) sequential execution, (2) conditional branching, and (3) repetition, as described in the “History of Programming Languages” section. Here, we show implementations of repetition and branching in various languages.

From a type system perspective, Clojure is a dynamically typed Lisp that runs on the JVM, the platform of the statically typed Java language. It supports flexible coding without type declarations, in the tradition of the Lisp family, while tools such as Clojure spec add checks that increase code reliability without giving up that flexibility.

Clojure spec is one of the libraries included in Clojure. It is a tool that can define specifications for various elements such as function arguments, return values, and data structures, and can also perform data verification and conversion.

ANCIENT, a useful tool for automatically managing library versions

Full-text search-like data extraction using the map function

On condp and match, functional-language approaches to conditional branching, one of the basic constructs of a program.

Actual asynchronous processing in Javascript and Clojure

On state management in a functional language using Clojure as a base for asynchronous processing

Object-oriented approach in a functional language using Clojure

Polymorphism is a term used to describe the type system of a programming language and refers to the property of allowing each element of a programming language (constants, variables, expressions, objects, functions, methods, etc.) to belong to multiple types. Its antonym is monomorphism, which refers to the property that each element of a programming language belongs to only one type.

In this article, we will discuss polymorphism and monomorphism in functional languages using Clojure.
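As a rough illustration of the idea, the sketch below shows one function name that behaves differently depending on the type of its argument. It is written in Python with `functools.singledispatch`, not in Clojure, and the `describe` function is a hypothetical name introduced here. It is loosely comparable in spirit to dispatching on type with Clojure protocols or multimethods, but it is not taken from the linked article.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    """Fallback used when no more specific implementation is registered."""
    return "unknown"


@describe.register
def _(value: int):
    # Chosen when the argument is an int.
    return "integer"


@describe.register
def _(value: str):
    # Chosen when the argument is a str.
    return "string"


@describe.register
def _(value: list):
    # Chosen for lists; dispatches again on each element.
    return "list of " + ", ".join(describe(v) for v in value)
```

Calling `describe(1)`, `describe("a")`, and `describe([1, "a"])` each selects a different implementation, while a type with no registered implementation, such as a float, falls back to the default.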

About deftype and defrecord as object handling functions in Clojure

Handling local variables and mutable data in Clojure with destructuring

Comparison of iterations in C, Java, Javascript, Python, and Clojure

Various data sorting algorithms and Clojure

Some business applications need to react asynchronously to external stimuli such as network traffic. Examples include IoT applications that receive and process streaming data, and applications that track company stock prices and other stock market information.

If these applications are executed using the sequential/synchronous processing of a normal computer, the overhead of synchronizing asynchronous data becomes a bottleneck when the amount of input data increases, making it difficult for the application to achieve effective speed.

Here, asynchronous processing is implemented in server-side back-end languages and applied to an actual application.
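A minimal sketch of the idea, assuming Python's standard `asyncio` module rather than the JavaScript or Clojure code of the linked articles: two hypothetical event handlers wait on simulated I/O concurrently instead of one after the other. The event names and delays are made up for illustration.

```python
import asyncio


async def handle_event(name, delay):
    """Simulate waiting on external I/O (network, sensor) without blocking."""
    await asyncio.sleep(delay)
    return f"{name} processed"


async def main():
    # Both handlers run concurrently; gather returns results in argument order.
    return await asyncio.gather(
        handle_event("sensor-reading", 0.02),
        handle_event("stock-tick", 0.01),
    )


results = asyncio.run(main())
```

Because the waits overlap, the total time is close to the longest single delay rather than the sum of the delays.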

seesaw is a Clojure library for easily creating the UI of a desktop application. It is a wrapper around Swing, Java’s GUI library, for use from Clojure.

Quil is a library for the Clojure programming language that lets users create 2D graphics and animations with programs written in Clojure.

At Clojure/conj 2018, one of the Clojure conferences, I saw Tyler Hobbs give a talk on generative art called “CODE GOES IN, ART COMES OUT”.

According to Wikipedia, generative art is described as follows:

Generative art refers to works of art that are algorithmically generated, synthesized, or constructed by computer software algorithms or mathematical/mechanical/random autonomous processes. By taking advantage of the computational freedom and computational speed of computers, and by implementing theories derived from natural science, many works are made to express themselves in a unified, organic manner, somewhere between artificial and natural.

Generative art is an art form that uses natural scientific systems as its main creative method. The difference between generative art and other art forms is that generative art requires the design and creation of mechanisms that operate autonomously. Works of art with systems may implement scientific theories such as complex systems and information theory.

I believe that mathematics abstracts the laws and patterns of the world and gives them form. The fact that mathematics occupies an important position in machine learning and artificial intelligence is not only due to its function in line with the computer’s ability to calculate, but also because of its ability to abstract laws and patterns. Furthermore, mathematics is a field that requires “free thinking” and “the ability to feel” in order to find patterns and laws by devising various viewpoints and ideas. From this perspective, “music,” which uses various intuitions and ideas to create patterns that move people’s hearts, can be considered to have something in common with mathematics.

Blues is one of the roots of modern popular music and a genre that plays an important role in it. Here we discuss the history of blues and its automatic generation with Clojure.

Typically, IoT devices are small devices with sensors and actuators, and use wireless communication to collect sensor data and control actuators. Various communication protocols and technologies are used for wireless IoT control. This section describes examples of IoT implementations using this wireless technology in various languages.

Working with other languages

Clojure inherits the powerful DSL (domain-specific language) capabilities of its base, Lisp, such as the macro system, and can be used in conjunction with a variety of languages. This makes it possible to build AI solutions that integrate Python’s machine learning libraries, R’s statistical processing libraries, and Prolog, a language for inference. It can also be combined with JavaScript, the de facto standard for web applications, and with languages such as Java and PHP, which are widely used to build mission-critical systems.

The details of the linkage with each language are described below.

  • Clojure and Python Integration and Machine Learning

    In the area of machine learning, environments with rich libraries, such as Python and R, have become almost de facto standards. In the past, however, it was not possible to use those libraries freely from Clojure, and there were hurdles to making full use of the latest algorithms.

    In contrast, in recent years (since 2018), frameworks that interoperate with the Python environment, such as libpython-clj, have appeared, along with fastmath, a mathematical framework that leverages Java and C libraries, and deep learning frameworks such as Cortex and Deep Diamond. These developments have led to active discussion of approaches to machine learning, for example around scicloj.ml, a well-known machine learning community for Clojure.

  • Clojure and Javascript and web frameworks (node.js)

    In recent years, web frameworks have become an important part of system development. JavaScript (or one of its derivatives, known as AltJS) and JavaScript frameworks (React, Vue, etc.) are the de facto standard for developing web front ends.

    ClojureScript integrates JavaScript and its frameworks with Clojure, and it is discussed in this article. In recent years, Java and JavaScript have also become interoperable via GraalVM, a high-performance JVM, and some articles report that Clojure and JavaScript can be linked via GraalVM as well, but that approach is not covered here.

In this article, we will discuss the integration with R. There are several tools for accessing R libraries from Clojure, which are summarized in the following links. For each tool, we consider (a) the API and parsing it provides, (b) the type of R backend it uses (JRI+REngine / Rserve+REngine / OpenCPU / running R from a shell, etc.), and (c) whether any Clojure concepts are used as equivalents of R’s “data frame” or “matrix”, and if so, which ones.

Of these, we will discuss Clojisr, which is relatively stable and readily available.

Clojure is a Lisp that runs on top of the JVM and can be integrated with Java at various levels, allowing full use of Java assets.

  • Prolog by Clojure

    Prolog is, along with Lisp, one of the two classical languages for symbolic artificial intelligence programming. It is a declarative programming language: a program is written by declaring what is to be obtained, rather than by describing how to obtain it, that is, the process, procedure, or algorithm that general procedural languages require[1][2]. In Clojure, Prolog-style functionality is provided by core.logic, a logic programming library based on miniKanren, a logic programming language.

Web Application

Database technology refers to technology for efficiently managing, storing, retrieving, and processing data, and is intended to support data persistence and manipulation in information systems and applications, and to ensure data accuracy, consistency, availability, and security.

The following sections describe implementations in various languages for actually handling these databases.

This section describes examples of how servers described in “Server Technology” can be used in various programming languages. Server technology here refers to technology related to the design, construction, and operation of server systems that receive requests from clients over a network, execute requested processes, and return responses.

Server technologies are used in a variety of systems and services, such as web applications, API servers, database servers, and mail servers. Server technology implementation methods and best practices differ depending on the programming language and framework.

Web crawling is a technology to automatically collect information on the Web. This section describes an overview of web crawling, its applications, and concrete implementations using Python and Clojure.

About setting up a server with Ring in Clojure

Adding routing with Compojure to the Clojure server set up in the previous article.

Launching PostgreSQL from Clojure to build web applications

Integrate servers and databases with Clojure

About the format function used to automatically generate SQL, SPARQL, and other queries in Clojure

Using Redis, a fast key-value store, from Clojure

PlantUML is an open source tool that can automatically draw various data models, and it is based on Graphviz, an open source drawing tool provided by AT&T Laboratories. It is a component for quickly creating various kinds of diagrams.

There are various ways to use PlantUML: (1) using the Jar file, (2) installing it with brew on a Mac, (3) using web services, and (4) embedding it in applications.

    Microservice

    Here you will learn common patterns and practices and how to apply them using the Clojure programming language. You will learn the basic concepts of architectural design and RESTful communication, and be introduced to patterns that provide manageable code that scales during development and in production. It also provides examples of how to put these concepts and patterns into practice with Clojure.

    Pedestal is an API-first Clojure framework. It is data-driven and extensible, and it provides a set of libraries for building reliable concurrent services with dynamic properties. It is implemented using protocols to reduce coupling between components.

    In this article, we will discuss Datomic, a next-generation database for microservices, which serves as the foundation for storing and retrieving data reliably in data-oriented applications such as microservices (data-oriented applications are applications where the volume, complexity, and change of data is an issue). Datomic is a database written in Clojure and is also available as a cloud service on AWS.

    From “Microservice with Clojure”. In this article, we will discuss the use of the Elastic Stack for monitoring microservice systems. The monitoring system described here can be widely applied to systems other than microservice systems; please refer to “Search Tool Elasticsearch – Startup Procedure” for details.

    Machine Learning / Natural Language Processing

    The gradient method is one of the widely used methods in machine learning and optimization algorithms, whose main goal is to iteratively update parameters in order to find the minimum (or maximum) value of a function. In machine learning, the goal is usually to minimize the cost function (also called loss function). For example, in regression and classification problems, a cost function is defined to represent the error between predicted and actual values, and it helps to find the parameter values that minimize this cost function.

    This section describes various algorithms for this gradient method and examples of implementations in various languages.
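As a minimal sketch of the iterative update described above, the example below runs plain gradient descent on a one-dimensional quadratic cost. It is written in Python for illustration only; the cost function, learning rate, and step count are assumptions chosen for the example, not values from the linked article.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def cost(x):
    # Hypothetical cost function with its minimum at x = 3.
    return (x - 3.0) ** 2


def cost_grad(x):
    # Derivative of the cost function above.
    return 2.0 * (x - 3.0)


x_min = gradient_descent(cost_grad, x0=0.0)
```

With this learning rate each step shrinks the distance to the minimum by a constant factor, so `x_min` ends up very close to 3. A learning rate that is too large would make the iteration diverge instead.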

    LightGBM is a Gradient Boosting Machine (GBM) framework developed by Microsoft, a machine learning tool designed to build fast and accurate models for large data sets. Here we describe its implementation in Python, R, and Clojure.

    Generalized Linear Model (GLM) is one of the statistical modeling and machine learning methods used for stochastic modeling of the relationship between response variables (objective variables) and explanatory variables (features). This section provides an overview of this generalized linear model and its implementation in various languages (Python, R, and Clojure).

    Particle Swarm Optimization (PSO) is a type of evolutionary computation algorithm inspired by swarming behavior in nature, modeling the behavior of flocks of birds and fish. PSO is characterized by its ability to search a wider search space than genetic algorithms, which tend to fall into local solutions. PSO is widely used to solve machine learning and optimization problems, and numerous studies and practical examples have been reported.
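The swarm update can be sketched as follows. This is a basic global-best PSO written in Python for illustration, under assumed parameter values (inertia 0.7, cognitive and social weights 1.5) and a made-up test objective, the sphere function. It is not the implementation from the linked article.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iterations=200,
        inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimise `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    low, high = bounds
    positions = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_index = min(range(n_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[best_index][:]
    global_best_val = personal_best_val[best_index]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Blend momentum, pull toward the particle's own best,
                # and pull toward the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value
    return global_best, global_best_val


def sphere(x):
    # Hypothetical test objective with its minimum at the origin.
    return sum(v * v for v in x)


best_position, best_value = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

The sketch does not clamp positions to the bounds, which are only used for initialisation; a practical implementation would usually add that.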

      Statistical analysis in general considers how a sample is described in terms of summary statistics and how population parameters can be inferred from it. Such an analysis tells us something about the population in general and the sample in particular, but does not provide a very precise description of the individual elements. This is because much information is lost by reducing the data to two statistics, the mean and the standard deviation.

      We often want to go further and establish a relationship between two or more variables or to predict another variable from one variable. This is where correlation and regression studies come in. Correlation concerns the strength and direction of the relationship between two or more variables. Regression determines the nature of this relationship from which predictions can be made.

      Linear regression is an elementary machine learning algorithm. Given a sample of data, the model learns a linear equation and can then make predictions about new, unseen data. Here we use Incanter, a statistical library for Clojure, to show how matrices can be manipulated to determine the relationship between the height and weight of Olympic athletes.

      Although it is useful to know that two variables are correlated, it is not enough to predict the weight of an Olympic swimmer from his/her height, and vice versa, using the data described in “Statistical Analysis and Correlation Evaluation Using Clojure/Incanter”. In establishing the correlation, we measured the strength and sign of the relationship, but not the slope. In order to make predictions, it is necessary to know what the rate of change of one variable will be when the other variable changes by one unit.
      What is needed is an equation that relates the specific value of one variable, called the independent variable, to the expected value of the other variable, called the dependent variable. For example, in a linear equation that predicts weight from height, height is the independent variable and weight is the dependent variable.

      The line represented by this equation is called the regression line. The term was introduced by Sir Francis Galton, a 19th century English polymath who, along with his student Karl Pearson (who defined the correlation coefficient), developed various methods to study linear relationships, which collectively came to be called regression methods.
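A small sketch of fitting such a line by ordinary least squares is shown below. It is written in Python rather than with Incanter, and the height and weight values are made up for illustration (they are not the Olympic data from the article). The slope is the covariance of the two variables divided by the variance of the independent variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line weight = height - 105.
heights = [160.0, 170.0, 180.0, 190.0]
weights = [55.0, 65.0, 75.0, 85.0]

intercept, slope = fit_line(heights, weights)
predicted_weight = intercept + slope * 175.0   # prediction for a new height
```

The slope answers the question the correlation coefficient cannot: how much the dependent variable is expected to change when the independent variable changes by one unit.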

      In the previous article, “Regression Analysis Using Clojure (1) Single Regression Model,” we discussed how to construct a regression line with a single independent variable. However, when considering a real problem, it is often desirable to construct a model with multiple independent variables. This problem is called a multiple regression problem. Each independent variable needs its own coefficient, so rather than assigning a separate letter to each coefficient, we introduce a new variable, β (pronounced “beta”), to hold all the coefficients.

      In this article, we will discuss the application of regression and classification analysis in a way that is more suitable for large amounts of data. In this case, we will be dealing with a relatively modest data set of 100,000 records. This is not big data (at 100 MB, it fits comfortably in the memory of a single machine), but it is large enough to demonstrate common methods of large-scale data processing. In this chapter, we will focus on how to scale algorithms to very large data volumes through parallel processing, using Hadoop (a popular framework for distributed computation) as a case study.

      Hadoop features a distributed file control module called HDFS (Hadoop Distributed File System) and a distributed data processing infrastructure called MapReduce. Here, we focus on two libraries provided by Clojure that work with Hadoop, Tesser and Parkour, and describe the MapReduce mechanism based on concrete implementations.
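The core of the MapReduce idea can be sketched in a single process. The example below is a Python illustration only and does not use Hadoop, Tesser, or Parkour. Each chunk of data is mapped independently to a small summary, and the summaries are then reduced with an associative combine step, which is what makes the work easy to spread across machines. The chunks are made-up data.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: summarise one chunk as (sum, count)."""
    return (sum(chunk), len(chunk))


def combine(a, b):
    """Reduce step: merge two summaries; associative, so order is free."""
    return (a[0] + b[0], a[1] + b[1])


# Hypothetical data split into chunks, as if stored on different machines.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

total, count = reduce(combine, map(map_chunk, chunks), (0.0, 0))
mean = total / count
```

Because `combine` is associative, the partial summaries could be merged in any grouping, which is the property a distributed framework relies on.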

      The length of time it takes to process each iteration of batch gradient descent depends on the size of the data and the number of processors in the computer. Even though several chunks of data are processed in parallel, the data set is large and the processors are finite. While parallel computation provides higher speed, doubling the size of the data set doubles the execution time.

      Hadoop is one of several systems that have emerged in the past decade that aim to parallelize work beyond the capabilities of a single machine. Rather than running code on multiple processors, Hadoop aims to run computations on many servers. In fact, a Hadoop cluster can consist of thousands of servers.

      The pair-wise differencing of all items described in “Implementing a Simple Recommendation Algorithm Using Clojure (2)” is time-consuming to compute. One of the advantages of item-based recommendation techniques is that the pairwise differences between items are relatively stable over time, so the difference matrix need only be computed periodically. This means that, as we have seen, if a user has evaluated 10 items and then evaluates one more, only the differences among those 11 items need to be adjusted.
      However, the execution time of the item-based recommender varies with the number of items to be stored, and the execution time increases in proportion to the number of items.
      If the number of users is small compared to the number of items, it may be more efficient to implement a user-based recommender. For example, content aggregation sites, where the number of items may exceed the number of users by an order of magnitude, are good candidates for user-based recommender.
      The Mahout library, described in “Large-Scale Clustering with Clojure and Mahout,” includes tools for creating various types of recommenders, including user-based recommenders. In this article, we will discuss these tools.

      The Spark project is a cluster computing framework that emphasizes low-latency job execution and is a relatively new project that emerged from UC Berkeley’s AMPLab in 2009. It can work with the Hadoop Distributed File System (HDFS), and it aims to significantly reduce job execution time by keeping much of the computation in memory.

      In this article, we will discuss the key concepts required to run Spark jobs using Sparkling, a Clojure library.

      Consider the case where graphs are treated in a mathematical rather than a visual sense. A graph is simply a set of vertices connected by edges, and the simplicity of this abstraction means that graphs can be anywhere. Graphs are valid models for a variety of structures, including the hyperlink structure of the Web, the physical structure of the Internet, roads, telecommunications, and social networks.

      For these graph data algorithms, we have previously described C++ implementations of “basic algorithms for graph data (DFS, BFS, bipartite graph decision, shortest path problem, minimum spanning tree)” and “advanced graph algorithms (strongly connected component decomposition, DAG, 2-SAT, LCA)”. In this article, we will discuss the use of Clojure/Loom.

      GraphX is a distributed graph processing library designed to work in conjunction with Spark. Like the MLlib library described in “Large-scale Machine Learning with Apache Spark and MLlib,” GraphX provides a set of abstractions built on top of Spark’s RDDs. By representing graph vertices and edges as RDDs, GraphX can handle very large graphs in a scalable manner.

      By restricting the types of computations that can be represented and introducing techniques for partitioning and distributing graphs, graph parallel systems can execute sophisticated graph algorithms several orders of magnitude faster and more efficiently than typical data parallel systems.

      The Pregel API is the primary abstraction in GraphX for representing custom, iterative graph parallel computing. It is named after Google’s 2010 in-house system for performing large-scale graph processing. Pregel is also the name of the river spanned by the famous bridges of Königsberg in the Euler path problem (finding a path through all edges of a graph), described in “Network Analysis Using Clojure (1) Breadth-first/Depth-first Search, Shortest Path Search, Minimum Spanning Tree, Subgraphs and Connected Components”.

      Google’s Pregel paper popularized the “think like a vertex” approach to graph parallel computing; Pregel’s model essentially uses message passing between graph vertices organized into a series of steps, called supersteps. At the beginning of each superstep, Pregel executes a user-specified function on each vertex, passing on all messages sent to that vertex in the previous superstep. The vertex function has the opportunity to process each of these messages and send messages to other vertices in turn. A vertex may also “vote to stop” the computation, and if all vertices vote to stop, the computation is terminated.
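The superstep model can be sketched without any distributed machinery. The example below is a single-process Python illustration, not GraphX or Glittering code. Each vertex adopts the largest value it has heard of and forwards it, and a vertex that learns nothing new stays silent, which plays the role of voting to stop. The `pregel_max` function and the small graph are hypothetical.

```python
def pregel_max(values, edges, max_supersteps=50):
    """Propagate the maximum vertex value through the graph in supersteps.

    values: dict mapping vertex -> number
    edges:  dict mapping vertex -> list of neighbouring vertices
    """
    values = dict(values)

    # Superstep 0: every vertex sends its own value to its neighbours.
    inbox = {v: [] for v in values}
    for v, val in values.items():
        for neighbour in edges.get(v, []):
            inbox[neighbour].append(val)
    supersteps = 1

    # Continue while any vertex still has messages waiting.
    while any(inbox.values()) and supersteps < max_supersteps:
        outbox = {v: [] for v in values}
        for v, messages in inbox.items():
            if not messages:
                continue  # no messages: the vertex stays halted
            best = max(messages)
            if best > values[v]:
                # The vertex learned something new: update and tell neighbours.
                values[v] = best
                for neighbour in edges.get(v, []):
                    outbox[neighbour].append(best)
            # Otherwise the vertex sends nothing, i.e. it votes to stop.
        inbox = outbox
        supersteps += 1
    return values, supersteps


# Hypothetical undirected path graph a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
start = {"a": 3, "b": 6, "c": 2, "d": 1}

final_values, steps_taken = pregel_max(start, graph)
```

The computation ends when no messages are in flight, which corresponds to every vertex having voted to stop.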

      Here we consider whether it is possible to identify what connects the largest communities among those analyzed in the previous section, “Network Analysis with GraphX Pregel using Clojure Glittering”. There are various ways to do this. If we had access to the tweets themselves, we could perform textual analysis, as we do with clustering, to see if certain words or certain languages are used more frequently among these groups.

      In this section, we discuss network analysis. There, the graph structure is used to identify the most influential accounts in each community. This means, for example, that a list of the top 10 most influential accounts would be an indicator of what resonates with their followers.

      Game theory is a theory for determining the optimal strategy when there are multiple decision makers (players) who influence each other, such as in competition or cooperation, by mathematically modeling their strategies and their outcomes. It is used primarily in economics, social sciences, and political science.

      Various methods are used as algorithms for game theory, including minimax methods, Monte Carlo tree search, deep learning, and reinforcement learning. Here we describe examples of implementations in R, Python, and Clojure.

      Morphological analysis and related processing in Clojure using natural language processing tools (OpenNLP, kuromoji, Juman, KNP)

      On the classification of sentences using the Clojure wrapper of liblinear, a library of support vector machines

      On unsupervised learning with K-means using Clojure

      Implementation of tf-idf with Clojure for use in natural language processing and search

      Implementation of stopword removal in Clojure for use in natural language processing cleansing

      On one-hot-vectors and category vectors for structuring words in natural language processing
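Two of the text-processing steps listed above, stop-word removal and tf-idf weighting, can be sketched together. The example is written in Python for illustration, not taken from the linked Clojure articles. The stop-word list and the three documents are made up, and the weighting used is one common variant (relative term frequency times the logarithm of the inverse document frequency).

```python
import math
from collections import Counter

# Tiny illustrative stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "on"}


def tokenize(text):
    """Lower-case, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the cat sat on the mat", "the dog sat on the log", "a cat is a cat"]
scores = tf_idf(docs)
```

Terms that appear in few documents, such as "mat", receive a higher weight than terms shared across documents, such as "cat". The sketch assumes every document keeps at least one token after stop-word removal.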

      • Implementation of Gaussian Processes in Clojure

        A Gaussian process is like a box (a stochastic process) that randomly outputs the form of a function. For example, if we consider that the process by which a die generates the natural numbers 1, 2, 3, 4, 5, and 6 depends on the distortion of the die, we can consider that the form of the function (the function giving the probability of each face turning up) varies depending on a parameter (in this case, the distortion of the die).

        Because Gaussian process regression works with correlations between data points, it relies on kernel methods, and it is often combined with Bayesian techniques such as MCMC. Open source tools for these analyses are available in various languages such as Matlab, Python, R, and Clojure. In this article, we will discuss the approach in Clojure.

      Bayesian optimization is an applied technology that takes full advantage of the characteristics of Gaussian process regression, which can make probabilistic predictions based on a small number of samples and minimal processing.

      Specific examples include the sequential extraction of the optimal combination of experimental parameters to be used next while conducting experiments in experimental design for medicine, chemistry, materials research, etc., the sequential optimization of hyper-parameters in machine learning while rotating the learning/evaluation cycle, and the optimization of functions by matching parts in the manufacturing industry. It is a technology that can be used in a wide range of applications.

      Tools such as Stan and BUGS, described previously in connection with probabilistic generative models such as Bayesian models, are also referred to as Probabilistic Programming (PP) systems. PP is a programming paradigm in which probabilistic models are specified in some form and inference on those models is automatically performed. Its purpose is to integrate probabilistic modeling and general-purpose programming to build systems combined with various AI techniques for various uncertain information, such as stock price prediction, movie recommendation, computer diagnostics, cyber intrusion detection, and image detection.

      In this article, we describe our approach to this probabilistic programming in Clojure.

      A CRP (Chinese restaurant process) is a stochastic process that describes a particular data-generating process. Mathematically, at each step this process samples an integer from the set of possible integers: a previously seen integer is chosen with probability proportional to the number of times it has been sampled so far, and a new integer that has not been seen before is chosen with probability proportional to a fixed constant.

      In this article, we describe the implementation of this CRP using Anglican, a probabilistic programming framework for Clojure, and its combination with a Gaussian mixture model.
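
The sampling rule described above can be sketched in a few lines. This is a plain Python simulation rather than the Anglican implementation the article covers. The function name `crp_assignments` and the concentration parameter `alpha` (the constant weight given to a new integer) are assumptions made for this example.

```python
import random

def crp_assignments(n_customers, alpha, rng):
    """Seat customers one at a time; return the table index chosen by each."""
    counts = []        # counts[k] = number of customers already at table k
    assignments = []
    for _ in range(n_customers):
        # An existing table k has weight counts[k]; a new table has weight alpha.
        weights = counts + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(counts):
            counts.append(1)      # open a new table
        else:
            counts[table] += 1    # join an existing table
        assignments.append(table)
    return assignments
```

For example, `crp_assignments(20, 1.0, random.Random(0))` yields a list of table indices in which popular tables tend to attract more customers. This "rich get richer" behaviour is what makes the CRP useful as a prior over the number of clusters in a mixture model.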

      In this article, we will discuss time series data. A time series is a sequence of regularly observed values of some quantity, arranged in order of measurement time. Predicting future values of a time series requires that they depend, at least to some extent, on past values. We will discuss the implementation of AR, MA, and ARMA models using Clojure.
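
To make the idea of "future values based on past values" concrete, here is a minimal sketch of the simplest case, an AR(1) model x_t = phi * x_{t-1} + e_t. It is written in Python for brevity and is not the Clojure implementation from the article. The function names are hypothetical, and the coefficient is estimated by ordinary least squares, which is only one of several possible estimators.

```python
import random

def simulate_ar1(phi, n, rng, sigma=1.0):
    """Generate n values of x_t = phi * x_{t-1} + e_t with Gaussian noise e_t."""
    xs = [0.0]
    for _ in range(n - 1):
        xs.append(phi * xs[-1] + rng.gauss(0.0, sigma))
    return xs

def fit_ar1(xs):
    """Least-squares estimate of phi from consecutive pairs (x_{t-1}, x_t)."""
    num = sum(prev * cur for prev, cur in zip(xs, xs[1:]))
    den = sum(prev * prev for prev in xs[:-1])
    return num / den

def forecast_ar1(phi, last_value, steps):
    """Point forecasts: each step ahead multiplies the last value by phi."""
    out = []
    value = last_value
    for _ in range(steps):
        value = phi * value
        out.append(value)
    return out
```

A typical use is to call `fit_ar1` on observed data and pass the estimate to `forecast_ar1` together with the most recent observation. When |phi| < 1 the forecasts decay toward the mean, reflecting that the influence of the past fades over time.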

      In this article, we describe an implementation of the Kalman filter, one of the applications of the state-space model, in Clojure. The Kalman filter is an infinite impulse response filter used to estimate time-varying quantities (e.g., the position and velocity of an object) from discrete observations that contain errors, and it is used in a wide range of engineering fields such as radar and computer vision because of its ease of use. Specific examples include integrating error-prone information from GPS and from accelerometers built into devices to estimate the ever-changing position of a vehicle, as well as satellite and rocket control.

      The Kalman filter is a state-space model with hidden states and observation data generated from them, similar to the hidden Markov model described previously, in which the states are continuous and changes in state variables are statistically described using noise following a Gaussian distribution.
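
The predict/update cycle behind this description can be sketched for the simplest possible case: a single hidden value that drifts as a random walk and is observed with Gaussian noise. This Python sketch is an illustration under those assumptions, not the article's Clojure code. `kalman_1d` and its parameter names are introduced here for the example.

```python
def kalman_1d(observations, process_var, obs_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed with Gaussian noise.

    x is the state estimate and p is the variance (uncertainty) of that estimate.
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: the state is expected to stay put, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = p / (p + obs_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

The gain decides how much each new observation is trusted. A large `obs_var` (noisy sensor) gives a small gain and smooth estimates, while a large `process_var` (fast-changing state) gives a large gain and estimates that follow the observations closely.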

      Anomaly detection is a machine learning technique that determines whether a given set of values for selected features representing a system differs unexpectedly from the values of those features that are normally observed. Applications of anomaly detection include structural and operational defect detection in manufacturing, network intrusion detection systems, system monitoring, and medical diagnostics. As a machine learning problem, anomaly detection can be viewed as an extension of binary classification.

      One approach to anomaly detection is to use probability distribution models constructed from training data. Another is the proximity-based approach, in which the proximity, or closeness, of a set of observed values to the remaining values in the sample data is measured.

      Whether a given set of observed values is anomalous can also be determined from the density of the surrounding data. This is called the density-based anomaly detection approach. A given set of input values is classified as anomalous if the density of the data around it is low.
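
As one possible reading of the proximity-based approach, the sketch below scores each point by its mean distance to its k nearest neighbours and flags points whose score is much larger than the median score. It is a Python illustration, not the article's code. The function names, the choice of k, and the "threshold times median" rule are all assumptions made for this example.

```python
import math

def knn_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_anomalies(points, k=2, threshold=2.0):
    """Flag points whose score exceeds threshold times the median score."""
    scores = knn_scores(points, k)
    median = sorted(scores)[len(scores) // 2]
    return [s > threshold * median for s in scores]
```

A point in a sparse region is far from its neighbours and therefore receives a high score, so the same score can also serve as a crude inverse-density estimate. This brute-force version compares every pair of points and is only suitable for small data sets.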

      A recommendation system is an information filtering system that attempts to predict a user’s preferences and tastes for items in order to provide useful information to that user. It either uses the user’s own behavioral history or recommends items that other users like. These two approaches form the basis of the two types of algorithms (content-based filtering and collaborative filtering) used in recommendation systems.

      In this article, we describe a recommendation system that uses a measure of similarity between text documents, the same kind of measure used when clustering text documents with the k-means algorithm. Here, the concept of similarity is used to suggest items that users may like.

      In this section, we will first describe the basic types of recommendation systems and implement one of the simplest ones in Clojure. In the next section, we will discuss how to create different types of recommendations using Mahout.
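
A minimal content-based recommender along these lines can be sketched with a bag-of-words representation and cosine similarity. This is a Python illustration and not the Clojure or Mahout code discussed in the article. `cosine_similarity` and `recommend` are hypothetical names, and splitting on whitespace is a deliberately naive tokenizer.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(liked_text, catalog, top_n=1):
    """Return the ids of the catalog items most similar to what the user liked."""
    ranked = sorted(catalog,
                    key=lambda item: cosine_similarity(liked_text, catalog[item]),
                    reverse=True)
    return ranked[:top_n]
```

Here `catalog` is assumed to be a dictionary mapping item ids to descriptive text. Collaborative filtering would replace the text vectors with vectors of user ratings while keeping the same similarity-and-rank structure.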

      With a view to introducing small-scale deep learning into algorithms for reinforcement learning, online learning, and so on, I will describe the implementation of neural nets in Clojure (including an understanding of the principles of neural net algorithms). The implementation is based on the Qiita article “Building Neural Networks from Zero and Observing Hidden Layers in Clojure”, with some additions.

      Hierarchical Temporal Memory (HTM) is a machine learning technology that aims to capture the structural and algorithmic properties of the neocortex. HTM is a neural network-like pattern recognition algorithm based on the theory of “auto-associative memory” (from the book “Thinking Brain, Thinking Computer”, the Japanese title of “On Intelligence”) advocated by Jeff Hawkins, the inventor of the handheld computers (Palm, Treo) that were the prototypes of today’s smartphones.

      Consider the case where graphs are treated in a mathematical rather than a visual sense. A graph is simply a set of vertices connected by edges, and the simplicity of this abstraction means that graphs can be found everywhere. Graphs are valid models for a variety of structures, including the hyperlink structure of the Web, the physical structure of the Internet, roads, telecommunications, and social networks.

      For these graph data algorithms, we have previously described C++ implementations in “basic algorithms for graph data (DFS, BFS, bipartite graph detection, shortest path problem, minimum spanning tree)” and “advanced graph algorithms (strongly connected component decomposition, DAG, 2-SAT, LCA)”. In this article, we will discuss the use of Clojure/Loom.
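
As a small reminder of what one of these basic algorithms looks like, here is a sketch of breadth-first search used to find a shortest path (fewest edges) in an unweighted graph. It is written in Python with the graph given as a plain adjacency dictionary. It does not use the library discussed in the article, and `bfs_shortest_path` is a name chosen for this example.

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    parents = {start: None}   # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None
```

Because BFS explores vertices in order of their distance from the start, the first time the goal is dequeued the recorded path is guaranteed to use the minimum number of edges. Weighted graphs need a different algorithm such as Dijkstra's.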

      Analytically computing integrals over complex distributions, as required in Bayesian estimation, is difficult, so the Markov Chain Monte Carlo (MCMC) method is often used. It is a kind of random sampling algorithm. The simplest Monte Carlo method generates random numbers as candidate parameters and calculates the probability (integral) corresponding to them, but such a brute-force method is inefficient because the computational cost increases explosively as the number of parameters grows. MCMC, by contrast, is not completely random: it generates each new random number based on the previous one, gradually searching toward values with higher probability (a search along a Markov chain).
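
The idea of generating each candidate from the previous sample can be sketched with random-walk Metropolis, one of the simplest MCMC algorithms. This Python sketch is an illustration only. It is not how Stan samples (Stan uses Hamiltonian Monte Carlo), and the function name and parameters are assumptions made for this example.

```python
import math
import random

def metropolis(log_density, x0, n_samples, step, rng):
    """Random-walk Metropolis: each proposal is centred on the previous sample."""
    samples = []
    x, log_p = x0, log_density(x0)
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)
        log_p_new = log_density(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = candidate, log_p_new
        samples.append(x)   # a rejected proposal repeats the current value
    return samples
```

Passing `lambda x: -0.5 * x * x` as `log_density` draws samples from a standard normal distribution. Only ratios of densities are used, so the normalizing constant, which is exactly the integral that is hard to compute, never needs to be known.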

      We have previously introduced RStan; PyStan and PyMC3 are MCMC implementations for Python. Since RStan appears not to have been updated since 2020 and the Stan team seems to be encouraging the use of CmdStan, I would like to discuss clj-stan, a simple Clojure wrapper for CmdStan, as well as CmdStanR, the R wrapper for CmdStan.

      Other AI Technologies

      Case-based reasoning is a technique for finding appropriate solutions to similar problems by referring to past problem-solving experience and case studies. This section provides an overview of this case-based reasoning technique, its challenges, and various implementations.

      About core.logic, a logic programming library for Clojure based on the logic programming language miniKanren

      Clara Rules and rete4frames are Clojure libraries inspired by CLIPS. Simple sample programs using each are shown below.

      Basic implementation of Clara Rules, Clojure’s expert system

      On the applied implementation of Clara Rules, Clojure’s expert system

      Building a chatbot framework in Clojure and JavaScript and integrating various AI functions (natural language processing, SVM, BERT, Transformer, knowledge graph, database, expert systems)
