Wiki
We have compiled hundreds of related entries to help you understand "artificial intelligence"
Bootstrapping is a method of uniform sampling with replacement from a given training set: whenever a sample is selected, it remains equally likely to be selected again and added to the training set again. The bootstrap method was first proposed by Bradley Efron in the Annals of Statistics in 1979. […]
For a given sample, the probability of being selected in a single random draw from a training set containing m samples is 1/m, so the probability of not being selected is 1 − 1/m. The probability of never being selected in m draws is then (1 − 1/m)^m, and when m → ∞, (1 − 1/m)^m → 1/e ≈ 0.368 […]
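The 1/e limit is easy to verify empirically. Below is a quick simulation (not part of the original entry; the function name and trial count are mine) that draws a bootstrap sample of size m and measures the fraction of the training set that is never selected:

```python
import random

def bootstrap_oob_fraction(m: int, trials: int = 100) -> float:
    """Estimate the fraction of samples never drawn in a bootstrap of size m."""
    total = 0.0
    for _ in range(trials):
        drawn = {random.randrange(m) for _ in range(m)}  # m draws with replacement
        total += (m - len(drawn)) / m                    # fraction left out
    return total / trials

print(bootstrap_oob_fraction(10_000))  # ≈ 0.368, i.e. about 1/e
```

These never-selected samples are what out-of-bag error estimation uses as a built-in validation set.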
The Boltzmann machine is a type of stochastic recurrent neural network invented by Geoffrey Hinton and Terry Sejnowski in 1985. The Boltzmann machine can be viewed as a stochastic process that generates corresponding […]
Definition: Binary search is an algorithm whose input is a sorted list of elements. If the element to be found is contained in the list, binary search returns its position; otherwise it returns null. Basic idea: This method is suitable when the amount of data is large. Before using binary search, the data must be sorted. Assume that the data is in ascending order […]
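As a concrete illustration, here is a minimal Python sketch of the idea (returning None where the entry says null; the list and targets are arbitrary):

```python
def binary_search(items, target):
    """Return the index of target in the ascending list items, or None if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2              # compare against the middle element
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1                  # discard the lower half
        else:
            hi = mid - 1                  # discard the upper half
    return None

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # None
```

Each comparison halves the remaining search range, so the running time is O(log n).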
Definition: The binomial test compares the observed frequencies of the two categories of a dichotomous variable with the frequencies expected under a binomial distribution with a specified probability parameter, which is 0.5 for both groups by default. Example: A coin is tossed and the probability of heads is 1/2. Under this hypothesis, the coin is tossed 40 times […]
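Continuing the coin example, here is a sketch of the exact two-sided test (the 28-heads outcome is an invented continuation of the 40-toss example, and the helper names are mine):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_sided_binomial_test(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of outcomes no likelier than k."""
    observed = binomial_pmf(k, n, p)
    return sum(binomial_pmf(i, n, p) for i in range(n + 1)
               if binomial_pmf(i, n, p) <= observed + 1e-12)

# 28 heads in 40 tosses of a supposedly fair coin
print(two_sided_binomial_test(28, 40))  # ≈ 0.018, evidence against p = 0.5
```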
Indicates that there are only two categories in the classification task; for example, we want to identify whether a picture is a cat or not. That is, we train a classifier whose input is a picture, represented by a feature vector x, and whose output is whether it is a cat, represented by y = 0 or 1. Binary classification assumes that each sample is assigned one and only one label, 0 […]
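A minimal sketch of such a classifier, assuming logistic regression trained by gradient descent (the toy features and labels below are made up for illustration):

```python
import numpy as np

X = np.array([[0.2, 0.1], [0.4, 0.3], [0.8, 0.9], [0.9, 0.7]])  # feature vectors x
y = np.array([0, 0, 1, 1])                                      # labels: 1 = cat

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(1000):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # predicted P(y = 1 | x)
    w -= lr * X.T @ (p - y) / len(y)     # gradient of the log-loss w.r.t. w
    b -= lr * np.mean(p - y)             # gradient w.r.t. the intercept

print((1 / (1 + np.exp(-(X @ w + b))) > 0.5).astype(int))  # [0 0 1 1]
```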
Definition: Deep neural networks have shown superior results in many fields such as speech recognition, image processing, and natural language processing. LSTM, as a variant of the RNN, can learn long-term dependencies in data that a plain RNN cannot. In 2005, Graves proposed combining LSTM with […]
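The truncated sentence appears to refer to the bidirectional LSTM (Graves & Schmidhuber, 2005, combined LSTM with the bidirectional RNN). A minimal PyTorch sketch, with arbitrary sizes:

```python
import torch
import torch.nn as nn

# A bidirectional LSTM reads the sequence forwards and backwards and
# concatenates the two hidden states at every time step.
lstm = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True)

x = torch.randn(10, 4, 16)   # (seq_len, batch, features)
out, (h_n, c_n) = lstm(x)
print(out.shape)             # torch.Size([10, 4, 64]) — 2 × hidden_size
```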
The bias-variance dilemma means that it is impossible to reduce both bias and variance at the same time; one can only strike a balance between the two. To reduce a model's bias, its complexity is increased to prevent underfitting; but the model cannot be made too complex, or the variance increases and overfitting results. Therefore, a balance must be found in the complexity of the model, which can […]
Bias-variance decomposition is a tool for explaining the generalization performance of learning algorithms from the perspective of bias and variance. The specific definition is as follows: assume there are K data sets, each drawn independently from a distribution p(t, x) (t represents the variable to be predicted, x represents the feature variable). In different […]
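For reference, under squared loss the decomposition takes the standard form below (the noise term σ² is my notation for the irreducible error, not taken from the truncated entry; the outer expectations run over the K data sets):

```latex
\mathbb{E}\!\left[(t - \hat{f}(x))^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x)] - \mathbb{E}[t \mid x]\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\bigl(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```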
Definition: The difference between the expected output and the true label is called bias. [Figure: relationship between bias and variance]
The between-class (inter-class) scatter matrix is used to represent how the mean of each class is distributed around the overall mean. Mathematical definition:
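The formula itself did not survive extraction; the standard definition (as used in linear discriminant analysis, with c classes, N_i samples in class i, class means μ_i, and overall mean μ) is:

```latex
S_b = \sum_{i=1}^{c} N_i \,(\mu_i - \mu)(\mu_i - \mu)^{\mathsf{T}}
```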
Definition: The Bayesian network is one of the most effective theoretical models for expressing and reasoning with uncertain knowledge. A Bayesian network consists of nodes representing random variables and directed edges connecting these nodes; a directed edge between nodes represents a relationship between them, whose strength is expressed by a conditional probability. A node with no parent […]
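Concretely, a Bayesian network over variables x_1, …, x_n factorizes the joint distribution into the local conditionals encoded by the edges (a standard identity, stated here because the entry is truncated):

```latex
P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{parents}(x_i)\bigr)
```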
Basic concepts: Bayesian decision theory is a basic method of statistical decision making. Its basic idea: given the parametric form of the class-conditional probability densities and the prior probabilities, convert them into posterior probabilities using Bayes' formula, then classify according to the size of the posterior probabilities. Related formula: Let D1, D2, ..., Dn be samples […]
In order to minimize the overall risk, the class label that minimizes the conditional risk R(c|x) on each sample is selected, that is, h∗(x) = argmin_c R(c|x), and h∗ is the Bayes optimal classifier.
In model selection, one "best" model is usually selected from a set of candidate models, and this selected "best" model is then used for prediction. Unlike using a single best model, Bayesian model averaging assigns a weight to each model and takes a weighted average to determine the final prediction. The weight assigned to a model is […]
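The entry breaks off before the weight is defined; in the textbook formulation, each model M_k is weighted by its posterior probability given the data D:

```latex
P(y \mid D) = \sum_{k=1}^{K} P(y \mid M_k, D)\, P(M_k \mid D)
```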
For each sample x, if h can minimize the conditional risk R(h(x)|x), then the overall risk is also minimized. This gives rise to the Bayes decision rule: to minimize the overall risk, we only need to choose, on each sample, the class label that minimizes the conditional risk R(c|x […]
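Written out in the usual notation, where λ_ij is the loss incurred by deciding class c_i when the true class is c_j:

```latex
R(c_i \mid x) = \sum_{j=1}^{N} \lambda_{ij}\, P(c_j \mid x),
\qquad
h^*(x) = \arg\min_{c \in \mathcal{Y}} R(c \mid x)
```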
BN (batch normalization) is a technique that can speed up the training of large convolutional networks and improve classification accuracy after convergence, with a regularizing side effect. When BN is applied to a layer of a neural network, it standardizes the data within each mini-batch so that the layer's output has zero mean and unit variance, reducing […]
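A minimal NumPy sketch of the training-time forward pass (γ and β are BN's learned scale and shift, eps is the usual numerical-stability constant, and the data here are random):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Standardize each feature over the mini-batch, then scale and shift."""
    mu = x.mean(axis=0)                  # per-feature mean over the batch
    var = x.var(axis=0)                  # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(64, 8) * 3 + 5       # mini-batch of 64 samples, 8 features
out = batch_norm_forward(x, gamma=np.ones(8), beta=np.zeros(8))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ≈ 0 and ≈ 1
```

At inference time, implementations substitute running averages of μ and σ² collected during training.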
In ensemble learning, if the "individual learners" in an ensemble are all generated by the same algorithm, the ensemble is homogeneous. Such learners are called base learners, and the corresponding learning algorithms are called base learning algorithms.
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that was first published in 1997. Due to its unique design structure, LSTM is suitable for processing and predicting important events in time series with very long intervals and delays. […]
Information entropy is a quantity that measures the amount of information. It was proposed by Shannon in 1948, who borrowed the concept of entropy from thermodynamics, called the average amount of information remaining after redundancy is excluded "information entropy", and gave a corresponding mathematical expression. Three properties of information entropy: Monotonicity: the higher the probability of an event, the less information it carries […]
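Shannon's expression in code form (a small sketch; the base-2 logarithm gives entropy in bits):

```python
from math import log2

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)) of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit  — fair coin
print(entropy([0.9, 0.1]))   # ≈ 0.469 — a more predictable source carries less information
print(entropy([0.25] * 4))   # 2.0 bits — uniform over four outcomes
```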
Knowledge representation refers to the representation and description of knowledge; it is concerned with how agents can reasonably use relevant knowledge, studying thinking as a computational process. Strictly speaking, knowledge representation and knowledge reasoning are two closely related concepts within the same field of research, but in practice "knowledge representation" is also used to refer to the broader field that includes reasoning.
The exponential loss function is a loss function commonly used in the AdaBoost algorithm; its expression has exponential form. [Figure: schematic of common loss functions] Common loss functions: Exponential loss: mainly used in the AdaBoost ensemble learning algorithm; Hinge loss: H […]
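The expression itself, in the standard form with label y ∈ {−1, +1} and margin y f(x):

```latex
L\bigl(y, f(x)\bigr) = e^{-y f(x)}
```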
In the field of machine learning, ground truth refers to the accurate labels provided by the training set for the classification results in supervised learning; it is generally used for error estimation and performance evaluation. In supervised learning, labeled data usually appears in the form (x, t), where x represents the input data and t represents the label. The correct label is the ground truth […]
Error-ambiguity decomposition (also rendered as error-divergence decomposition) refers to decomposing the ensemble generalization error, which can be expressed as follows: E = \overline{E} − \overline{A}, where the left side E represents the ensemble generalization error, and the right side \overline{E} […]
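Spelled out in the standard Krogh–Vedelsby form, which the broken formula above presumably matched (w_i, E_i, and A_i are the weight, generalization error, and ambiguity of individual learner h_i):

```latex
E = \sum_{i=1}^{T} w_i E_i - \sum_{i=1}^{T} w_i A_i = \overline{E} - \overline{A}
```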