Social cognitive optimization

Last updated October 10, 2021

Social cognitive optimization (SCO) is a population-based metaheuristic optimization algorithm developed in 2002.^[1] The algorithm is based on social cognitive theory. Its key mechanism for ergodicity is the combination of individual learning, carried out by a set of agents that each have their own memory, with social learning from the knowledge points in a social sharing library. It has been used for solving continuous optimization,^[2]^[3] integer programming,^[4] and combinatorial optimization problems. It has been incorporated into the NLPSolver extension of Calc in Apache OpenOffice.

Algorithm

Let $f(x)$ be the objective of a global optimization problem, where $x$ is a state in the problem space $S$. In SCO, each state is called a knowledge point, and the function $f$ is the goodness function.

In SCO, a population of $N_{c}$ cognitive agents solves the problem in parallel, with the help of a social sharing library. Each agent holds a private memory containing one knowledge point, and the social sharing library contains a set of $N_{L}$ knowledge points. The algorithm runs in $T$ iterative learning cycles. Because it runs as a Markov chain process, the system behavior in the $t$th cycle depends only on the system status in the $(t-1)$th cycle. The process flow is as follows:

[1. Initialization]: Initialize the private knowledge point $x_{i}$ in the memory of each agent $i$, and all knowledge points in the social sharing library $X$, normally at random in the problem space $S$.
[2. Learning cycle]: At each cycle $t$ $(t=1,\ldots ,T)$:
- [2.1. Observational learning]: For each agent $i$ $(i=1,\ldots ,N_{c})$:
  - [2.1.1. Model selection]: Find a high-quality model point $x_{M}$ in $X(t)$, normally by tournament selection, which returns the best knowledge point among $\tau _{B}$ randomly selected points.
  - [2.1.2. Quality evaluation]: Compare the private knowledge point $x_{i}(t)$ with the model point $x_{M}$, and take the one of higher quality as the base point $x_{Base}$ and the other as the reference point $x_{Ref}$.
  - [2.1.3. Learning]: Combine $x_{Base}$ and $x_{Ref}$ to generate a new knowledge point $x_{i}(t+1)$. Normally $x_{i}(t+1)$ should lie around $x_{Base}$, at a distance from $x_{Base}$ that is related to the distance between $x_{Ref}$ and $x_{Base}$. A boundary-handling mechanism should be incorporated here to ensure that $x_{i}(t+1)\in S$.
  - [2.1.4. Knowledge sharing]: Share a knowledge point, normally $x_{i}(t+1)$, with the social sharing library $X$.
  - [2.1.5. Individual update]: Update the private knowledge of agent $i$, normally by replacing $x_{i}(t)$ with $x_{i}(t+1)$. Some Monte Carlo-type update rules might also be considered.
- [2.2. Library maintenance]: The social sharing library uses all knowledge points submitted by the agents to update $X(t)$ into $X(t+1)$. A simple way is one-by-one tournament selection: for each knowledge point submitted by an agent, replace the worst one among $\tau _{W}$ points randomly selected from $X(t)$.
[3. Termination]: Return the best knowledge point found by the agents.
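
The process flow above can be illustrated with a minimal Python sketch. It is not a reference implementation, and several choices in it are assumptions rather than part of the description: the function name `sco_minimize`, the default parameter values, treating lower $f$ as higher quality, a box-shaped problem space, a learning step that samples each coordinate uniformly around $x_{Base}$ with half-width $|x_{Ref}-x_{Base}|$, and boundary handling by clamping.

```python
import random


def sco_minimize(f, bounds, n_agents=10, lib_size=30, cycles=200,
                 tau_b=2, tau_w=4, seed=None):
    """Hypothetical SCO sketch: minimize f over a box (lower f = better)."""
    rng = random.Random(seed)

    def random_point():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        # Boundary handling by clamping to the box (one simple choice).
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # 1. Initialization: private points and library points at random in S.
    agents = [random_point() for _ in range(n_agents)]
    agent_fit = [f(x) for x in agents]
    library = [random_point() for _ in range(lib_size)]
    lib_fit = [f(x) for x in library]
    best_f, best_x = min(zip(agent_fit, agents), key=lambda p: p[0])

    # 2. Learning cycles.
    for _ in range(cycles):
        submitted = []
        for i in range(n_agents):
            # 2.1.1 Model selection: best of tau_b random library points.
            picks = rng.sample(range(lib_size), tau_b)
            m = min(picks, key=lambda k: lib_fit[k])
            # 2.1.2 Quality evaluation: better point is the base point.
            if agent_fit[i] <= lib_fit[m]:
                base, ref = agents[i], library[m]
            else:
                base, ref = library[m], agents[i]
            # 2.1.3 Learning (assumed rule): sample around the base point,
            # with a per-coordinate spread equal to |ref - base|.
            new = clip([b + rng.uniform(-1.0, 1.0) * abs(r - b)
                        for b, r in zip(base, ref)])
            new_fit = f(new)
            # 2.1.4 Knowledge sharing: submit the new point to the library.
            submitted.append((new, new_fit))
            # 2.1.5 Individual update: replace the private point.
            agents[i], agent_fit[i] = new, new_fit
            if new_fit < best_f:
                best_x, best_f = new, new_fit
        # 2.2 Library maintenance: each submitted point replaces the
        # worst of tau_w randomly selected library points.
        for x, fx in submitted:
            picks = rng.sample(range(lib_size), tau_w)
            w = max(picks, key=lambda k: lib_fit[k])
            library[w], lib_fit[w] = x, fx

    # 3. Termination: best knowledge point found by the agents.
    return best_x, best_f


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(sco_minimize(sphere, [(-5.0, 5.0)] * 2, seed=0))
```

In this sketch the library is updated only after all agents have finished a cycle, so every agent in cycle $t$ selects its model from the same $X(t)$.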

SCO has three main parameters: the number of agents $N_{c}$, the size of the social sharing library $N_{L}$, and the number of learning cycles $T$. Including the initialization process, the total number of knowledge points generated is $N_{L}+N_{c}(T+1)$, which depends only weakly on $N_{L}$ if $T$ is large.
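
As a small worked example of this count, the following Python snippet evaluates the formula and the fraction contributed by the library. The function names and the parameter values ($N_{c}=10$, $N_{L}=30$, $T=200$) are illustrative assumptions, not recommended settings.

```python
def total_knowledge_points(n_c, n_l, t):
    """Knowledge points generated in one run: N_L + N_c * (T + 1)."""
    return n_l + n_c * (t + 1)


def library_share(n_c, n_l, t):
    """Fraction of all generated points that come from library initialization."""
    return n_l / total_knowledge_points(n_c, n_l, t)


# Illustrative values only: 30 + 10 * 201 = 2040 points in total.
print(total_knowledge_points(10, 30, 200))   # 2040
print(library_share(10, 30, 200) < 0.02)     # True
```

With these values the library initialization accounts for less than 2% of all generated points, which is why the total is dominated by $N_{c}T$ when $T$ is large.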

Compared to traditional swarm algorithms, e.g. particle swarm optimization, SCO can achieve high-quality solutions when $N_{c}$ is small, even when $N_{c}=1$. Nevertheless, smaller $N_{c}$ and $N_{L}$ might lead to premature convergence. Some variants^[5] have been proposed to guarantee global convergence. SCO can also be combined with other optimizers into a hybrid optimization method. For example, SCO was hybridized with differential evolution to obtain better results than the individual algorithms on a common set of benchmark problems.^[6]

References

[1] Xie, Xiao-Feng; Zhang, Wen-Jun; Yang, Zhi-Lian (2002). Social cognitive optimization for nonlinear programming problems. International Conference on Machine Learning and Cybernetics (ICMLC), Beijing, China: 779-783.
[2] Xie, Xiao-Feng; Zhang, Wen-Jun (2004). Solving engineering design problems by social cognitive optimization. Genetic and Evolutionary Computation Conference (GECCO), Seattle, WA, USA: 261-262.
[3] Xu, Gang-Gang; Han, Luo-Cheng; Yu, Ming-Long; Zhang, Ai-Lan (2011). Reactive power optimization based on improved social cognitive optimization algorithm. International Conference on Mechatronic Science, Electric Engineering and Computer (MEC), Jilin, China: 97-100.
[4] Fan, Caixia (2010). Solving integer programming based on maximum entropy social cognitive optimization algorithm. International Conference on Information Technology and Scientific Management (ICITSM), Tianjing, China: 795-798.
[5] Sun, Jia-ze; Wang, Shu-yan; Chen, Hao (2014). A guaranteed global convergence social cognitive optimizer. Mathematical Problems in Engineering: Art. No. 534162.
[6] Xie, Xiao-Feng; Liu, J.; Wang, Zun-Jing (2014). A cooperative group optimization system. Soft Computing. 18 (3): 469–495. arXiv:1808.01342. doi:10.1007/s00500-013-1069-8. S2CID 5393223.

