Wiki
We have compiled hundreds of related entries to help you understand "artificial intelligence"
Bootstrapping is a method of uniform sampling with replacement from a given training set: whenever a sample is selected, it remains equally likely to be selected again and added to the training set again. The bootstrap method was first proposed by Bradley Efron in the Annals of Statistics in 1979. […]
For a given sample, the probability of being selected in a single random draw from a training set containing m samples is 1/m, so the probability of not being selected is 1 − 1/m. The probability of never being selected in m draws is then (1 − 1/m)^m, and when m → ∞, (1 − 1/m)^m → 1/e ≈ 0.368 […]
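The 1/e limit is easy to verify empirically. Below is a quick simulation (not part of the original entry; the function name and trial count are mine) that draws a bootstrap sample of size m and measures the fraction of the training set that is never selected:

```python
import random

def bootstrap_oob_fraction(m: int, trials: int = 100) -> float:
    """Estimate the fraction of samples never drawn in a bootstrap of size m."""
    total = 0.0
    for _ in range(trials):
        drawn = {random.randrange(m) for _ in range(m)}  # m draws with replacement
        total += (m - len(drawn)) / m                    # fraction left out
    return total / trials

print(bootstrap_oob_fraction(10_000))  # ≈ 0.368, i.e. about 1/e
```

These never-selected samples are what out-of-bag error estimation uses as a built-in validation set.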
The Boltzmann machine is a type of stochastic recurrent neural network invented by Geoffrey Hinton and Terry Sejnowski in 1985. The Boltzmann machine can be viewed as a stochastic process that generates corresponding […]
Definition: Binary search is an algorithm whose input is a sorted list of elements. If the element to be found is contained in the list, binary search returns its position; otherwise it returns null. Basic idea: This method is suitable when the amount of data is large. Before using binary search, the data must be sorted. Assume that the data is in ascending order […]
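As a concrete illustration, here is a minimal Python sketch of the idea (returning None where the entry says null; the list and targets are arbitrary):

```python
def binary_search(items, target):
    """Return the index of target in the ascending list items, or None if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2              # compare against the middle element
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1                  # discard the lower half
        else:
            hi = mid - 1                  # discard the upper half
    return None

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # None
```

Each comparison halves the remaining search range, so the running time is O(log n).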
Definition: The binomial test compares the observed frequencies of the two categories of a dichotomous variable with the frequencies expected under a binomial distribution with a specified probability parameter, which is 0.5 for both groups by default. Example: A coin is tossed and the probability of heads is 1/2. Under this hypothesis, the coin is tossed 40 times […]
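Continuing the coin example, here is a sketch of the exact two-sided test (the 28-heads outcome is an invented continuation of the 40-toss example, and the helper names are mine):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_sided_binomial_test(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of outcomes no likelier than k."""
    observed = binomial_pmf(k, n, p)
    return sum(binomial_pmf(i, n, p) for i in range(n + 1)
               if binomial_pmf(i, n, p) <= observed + 1e-12)

# 28 heads in 40 tosses of a supposedly fair coin
print(two_sided_binomial_test(28, 40))  # ≈ 0.018, evidence against p = 0.5
```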
Indicates that there are only two categories in the classification task; for example, we want to identify whether a picture is a cat or not. That is, we train a classifier whose input is a picture, represented by a feature vector x, and whose output is whether it is a cat, represented by y = 0 or 1. Binary classification assumes that each sample is assigned one and only one label, 0 […]
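A minimal sketch of such a classifier, assuming logistic regression trained by gradient descent (the toy features and labels below are made up for illustration):

```python
import numpy as np

X = np.array([[0.2, 0.1], [0.4, 0.3], [0.8, 0.9], [0.9, 0.7]])  # feature vectors x
y = np.array([0, 0, 1, 1])                                      # labels: 1 = cat

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(1000):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # predicted P(y = 1 | x)
    w -= lr * X.T @ (p - y) / len(y)     # gradient of the log-loss w.r.t. w
    b -= lr * np.mean(p - y)             # gradient w.r.t. the intercept

print((1 / (1 + np.exp(-(X @ w + b))) > 0.5).astype(int))  # [0 0 1 1]
```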
Definition: Deep neural networks have shown superior results in many fields such as speech recognition, image processing, and natural language processing. LSTM, as a variant of the RNN, can learn long-term dependencies in data that a plain RNN cannot. In 2005, Graves proposed combining LSTM with […]
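The truncated sentence appears to refer to the bidirectional LSTM (Graves & Schmidhuber, 2005, combined LSTM with the bidirectional RNN). A minimal PyTorch sketch, with arbitrary sizes:

```python
import torch
import torch.nn as nn

# A bidirectional LSTM reads the sequence forwards and backwards and
# concatenates the two hidden states at every time step.
lstm = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True)

x = torch.randn(10, 4, 16)   # (seq_len, batch, features)
out, (h_n, c_n) = lstm(x)
print(out.shape)             # torch.Size([10, 4, 64]) — 2 × hidden_size
```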
The bias-variance dilemma means that it is impossible to reduce both bias and variance at the same time; one can only strike a balance between the two. To reduce a model's bias, its complexity is increased to prevent underfitting; but the model cannot be made too complex, or the variance increases and overfitting results. Therefore, a balance must be found in the complexity of the model, which can […]
Bias-variance decomposition is a tool for explaining the generalization performance of learning algorithms from the perspective of bias and variance. The specific definition is as follows: assume there are K data sets, each drawn independently from a distribution p(t, x) (t represents the variable to be predicted, x represents the feature variable). In different […]
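For reference, under squared loss the decomposition takes the standard form below (the noise term σ² is my notation for the irreducible error, not taken from the truncated entry; the outer expectations run over the K data sets):

```latex
\mathbb{E}\!\left[(t - \hat{f}(x))^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x)] - \mathbb{E}[t \mid x]\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\bigl(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```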
Definition: The difference between the expected output and the true label is called bias. [Figure: relationship between bias and variance]
The between-class (inter-class) scatter matrix is used to represent how the mean of each class is distributed around the overall mean. Mathematical definition:
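The formula itself did not survive extraction; the standard definition (as used in linear discriminant analysis, with c classes, N_i samples in class i, class means μ_i, and overall mean μ) is:

```latex
S_b = \sum_{i=1}^{c} N_i \,(\mu_i - \mu)(\mu_i - \mu)^{\mathsf{T}}
```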
Definition: The Bayesian network is one of the most effective theoretical models for expressing and reasoning with uncertain knowledge. A Bayesian network consists of nodes representing random variables and directed edges connecting these nodes; a directed edge between nodes represents a relationship between them, whose strength is expressed by a conditional probability. A node with no parent […]
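Concretely, a Bayesian network over variables x_1, …, x_n factorizes the joint distribution into the local conditionals encoded by the edges (a standard identity, stated here because the entry is truncated):

```latex
P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{parents}(x_i)\bigr)
```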
Basic concepts: Bayesian decision theory is a basic method of statistical decision making. Its basic idea: given the parametric form of the class-conditional probability densities and the prior probabilities, convert them into posterior probabilities using Bayes' formula, then classify according to the size of the posterior probabilities. Related formula: Let D1, D2, ..., Dn be samples […]
In order to minimize the overall risk, the class label that minimizes the conditional risk R(c|x) on each sample is selected, that is, h∗(x) = argmin_c R(c|x), and h∗ is the Bayes optimal classifier.
In model selection, one "best" model is usually selected from a set of candidate models, and this selected "best" model is then used for prediction. Unlike using a single best model, Bayesian model averaging assigns a weight to each model and takes a weighted average to determine the final prediction. The weight assigned to a model is […]
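The entry breaks off before the weight is defined; in the textbook formulation, each model M_k is weighted by its posterior probability given the data D:

```latex
P(y \mid D) = \sum_{k=1}^{K} P(y \mid M_k, D)\, P(M_k \mid D)
```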
For each sample x, if h can minimize the conditional risk R(h(x)|x), then the overall risk is also minimized. This gives rise to the Bayes decision rule: to minimize the overall risk, we only need to choose, on each sample, the class label that minimizes the conditional risk R(c|x […]
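Written out in the usual notation, where λ_ij is the loss incurred by deciding class c_i when the true class is c_j:

```latex
R(c_i \mid x) = \sum_{j=1}^{N} \lambda_{ij}\, P(c_j \mid x),
\qquad
h^*(x) = \arg\min_{c \in \mathcal{Y}} R(c \mid x)
```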
BN (batch normalization) is a technique that can speed up the training of large convolutional networks and improve classification accuracy after convergence, with a regularizing side effect. When BN is applied to a layer of a neural network, it standardizes the data within each mini-batch so that the layer's output has zero mean and unit variance, reducing […]
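A minimal NumPy sketch of the training-time forward pass (γ and β are BN's learned scale and shift, eps is the usual numerical-stability constant, and the data here are random):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Standardize each feature over the mini-batch, then scale and shift."""
    mu = x.mean(axis=0)                  # per-feature mean over the batch
    var = x.var(axis=0)                  # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(64, 8) * 3 + 5       # mini-batch of 64 samples, 8 features
out = batch_norm_forward(x, gamma=np.ones(8), beta=np.zeros(8))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ≈ 0 and ≈ 1
```

At inference time, implementations substitute running averages of μ and σ² collected during training.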
In ensemble learning, if the "individual learners" in an ensemble are all generated by the same algorithm, the ensemble is homogeneous. Such learners are called base learners, and the corresponding learning algorithms are called base learning algorithms.
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that was first published in 1997. Due to its unique design structure, LSTM is suitable for processing and predicting important events in time series with very long intervals and delays. […]
Information entropy is a quantity that measures the amount of information. It was proposed by Shannon in 1948, who borrowed the concept of entropy from thermodynamics, called the average amount of information remaining after redundancy is excluded "information entropy", and gave a corresponding mathematical expression. Three properties of information entropy: Monotonicity: the higher the probability of an event, the less information it carries […]
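Shannon's expression in code form (a small sketch; the base-2 logarithm gives entropy in bits):

```python
from math import log2

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)) of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit  — fair coin
print(entropy([0.9, 0.1]))   # ≈ 0.469 — a more predictable source carries less information
print(entropy([0.25] * 4))   # 2.0 bits — uniform over four outcomes
```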
Knowledge representation refers to the representation and description of knowledge; it is concerned with how agents can reasonably use relevant knowledge, studying thinking as a computational process. Strictly speaking, knowledge representation and knowledge reasoning are two closely related concepts within the same field of research, but in practice "knowledge representation" is also used to refer to the broader field that includes reasoning.
The exponential loss function is a loss function commonly used in the AdaBoost algorithm; its expression has exponential form. [Figure: schematic of common loss functions] Common loss functions: Exponential loss: mainly used in the AdaBoost ensemble learning algorithm; Hinge loss: H […]
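The expression itself, in the standard form with label y ∈ {−1, +1} and margin y f(x):

```latex
L\bigl(y, f(x)\bigr) = e^{-y f(x)}
```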
In the field of machine learning, ground truth refers to the accurate labels provided by the training set for the classification results in supervised learning; it is generally used for error estimation and performance evaluation. In supervised learning, labeled data usually appears in the form (x, t), where x represents the input data and t represents the label. The correct label is the ground truth […]
Error-ambiguity decomposition (also rendered as error-divergence decomposition) refers to decomposing the ensemble generalization error, which can be expressed as follows: E = \overline{E} − \overline{A}, where the left side E represents the ensemble generalization error, and the right side \overline{E} […]
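Spelled out in the standard Krogh–Vedelsby form, which the broken formula above presumably matched (w_i, E_i, and A_i are the weight, generalization error, and ambiguity of individual learner h_i):

```latex
E = \sum_{i=1}^{T} w_i E_i - \sum_{i=1}^{T} w_i A_i = \overline{E} - \overline{A}
```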