IPO underpricing algorithm

IPO underpricing is the increase in stock value from the initial offering price to the first-day closing price. Many believe that underpriced IPOs leave money on the table for the issuing corporations, but some believe that underpricing is inevitable. Investors argue that underpricing signals high interest to the market, which increases demand. On the other hand, overpriced stocks tend to fall over the long term as the price stabilizes, so underpricing may also protect issuers from investor litigation.
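
As a simple illustration of this definition, underpricing is usually expressed as the percentage return from the offer price to the first-day close. The short Python sketch below computes it; the figures are hypothetical.

  # Underpricing as the first-day return from the offer price (hypothetical figures).
  def underpricing(offer_price: float, first_day_close: float) -> float:
      """Return the first-day return, e.g. 0.25 means the stock closed 25% above the offer."""
      return (first_day_close - offer_price) / offer_price

  print(underpricing(offer_price=18.00, first_day_close=22.50))  # 0.25 -> 25% underpricing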

IPO underpricing algorithms

Underwriters, investors, and the corporations going through an initial public offering (IPO), the issuers, all have an interest in the offering's market value. A tension is built into the process: underwriters want to keep the offering price low, while the issuing companies want a high IPO price.

Underpricing may also be caused by investor over-reaction, which produces price spikes in the initial days of trading. The IPO pricing process is similar to pricing a new and unique product for which there is sparse data on market demand, product acceptance, or competitive response. This makes it difficult to determine a clear price, a difficulty compounded by the different goals of issuers and investors.

The central problem in developing algorithms to predict underpricing is dealing with noisy, complex, and unordered data sets. Additionally, people and varying environmental conditions introduce irregularities into the data. To resolve these issues, researchers have applied various techniques from artificial intelligence that normalize the data.
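
The article does not prescribe a specific preprocessing method, but a common choice for putting raw variables on a comparable scale is z-score normalization. The sketch below shows it on a hypothetical IPO feature matrix.

  import numpy as np

  # Hypothetical IPO feature matrix: rows are offerings, columns are raw variables
  # such as offer price, offer size, and price-range width.
  X = np.array([[18.0, 120.0, 2.0],
                [24.0,  80.0, 4.0],
                [12.0, 300.0, 1.0]])

  # Z-score normalization: rescale each column to zero mean and unit variance so that
  # variables on very different scales do not dominate the model.
  X_norm = (X - X.mean(axis=0)) / X.std(axis=0)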

Artificial neural network

Artificial neural networks (ANNs) address these issues by scanning the data to develop internal representations of the relationships within it. By learning these relationships over time, ANNs are more responsive and adaptive to structural changes in the data. ANNs can be trained under two paradigms: supervised learning and unsupervised learning.

In supervised learning models, the network is trained against test cases with known outputs; when a mistake is encountered, i.e. the network's output does not match the expected output for a test input, the algorithm uses backpropagation to correct the error. In unsupervised learning models, the inputs are instead classified according to the structure of the problems that need to be resolved, without labeled outputs.
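
A minimal sketch of this supervised setting, using scikit-learn's MLPRegressor (a feed-forward network trained by backpropagation) on randomly generated, purely hypothetical IPO features and underpricing targets; it stands in for the papers' actual networks, which are not specified here.

  import numpy as np
  from sklearn.neural_network import MLPRegressor

  rng = np.random.default_rng(0)

  # Hypothetical training data: 200 IPOs, 7 normalized input variables each,
  # with first-day underpricing as the target to learn.
  X_train = rng.normal(size=(200, 7))
  y_train = rng.normal(loc=0.15, scale=0.1, size=200)

  # A small feed-forward network trained with backpropagation (gradient descent
  # on the prediction error), as in the supervised paradigm described above.
  model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
  model.fit(X_train, y_train)

  # Predict underpricing for a new, unseen offering.
  X_new = rng.normal(size=(1, 7))
  print(model.predict(X_new))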

Evolutionary models

Evolutionary programming is often paired with other algorithms, e.g. ANNs, to improve robustness, reliability, and adaptability. Evolutionary models reduce error rates by allowing the numerical values to change within the fixed structure of the program. Designers supply their algorithms with the relevant variables and then provide training data to help the program generate rules, defined over the input space, that make a prediction in the output-variable space.

In this approach, a candidate solution is treated as an individual and the population is made up of alternative solutions. However, outliers can cause the individuals to behave unexpectedly as they try to create rules that explain the whole set.
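
A minimal sketch of this evolutionary loop, under the assumption that each individual is simply a vector of numerical rule parameters (here, linear weights over the input variables) and fitness is prediction error on training data; this is an illustration, not Quintana's or Luque's actual system.

  import numpy as np

  rng = np.random.default_rng(1)

  # Hypothetical training data: 7 input variables per IPO, underpricing as output.
  X = rng.normal(size=(100, 7))
  y = rng.normal(loc=0.15, scale=0.1, size=100)

  def fitness(weights):
      # Lower mean squared prediction error means a fitter individual.
      return -np.mean((X @ weights - y) ** 2)

  # Population of alternative solutions; each individual is one weight vector.
  population = rng.normal(size=(50, 7))

  for generation in range(200):
      scores = np.array([fitness(w) for w in population])
      parents = population[np.argsort(scores)[-10:]]           # keep the 10 fittest
      children = np.repeat(parents, 5, axis=0)
      children += rng.normal(scale=0.05, size=children.shape)  # mutate numerical values only
      population = children

  best = population[np.argmax([fitness(w) for w in population])]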

Rule-based system

For example, Quintana [1] first abstracts a model with seven major variables. The rules are evolved using evolutionary computation systems in the styles developed at Michigan and Pittsburgh.

Quintana uses these factors as signals that investors focus on. The algorithm his team describes shows that a prediction with a high degree of confidence is possible with just a subset of the data.
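
As a hedged illustration of what an evolved rule might look like once extracted, the sketch below shows an IF-THEN condition over a subset of the variables that triggers a confident prediction. The variable names and thresholds are hypothetical placeholders, not rules reported by Quintana's team.

  # One hypothetical evolved rule over a subset of the input variables.
  # Variable names and thresholds are illustrative, not taken from the paper.
  def rule_predict(ipo: dict):
      if ipo["price_adjustment"] > 0.10 and ipo["offer_size"] < 100.0:
          return "high underpricing expected"
      return None  # rule does not fire; other rules (or a default) would apply

  print(rule_predict({"price_adjustment": 0.15, "offer_size": 80.0}))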

Two-layered evolutionary forecasting

Luque [2] approaches the problem of outliers by performing linear regressions over the set of data points (input, output). The algorithm deals with noisy data by allocating separate regions for it. This scheme has the advantage of isolating noisy patterns, which reduces the effect outliers have on the rule-generation system. The algorithm can revisit the isolated data later to determine whether it influences the general data. Finally, even the worst results from this algorithm outperformed the predictive ability of all the other algorithms tested.
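
A rough sketch of the idea of isolating noisy patterns before rule generation, where a simple residual threshold stands in for the paper's region-allocation scheme; the data and threshold are assumptions for illustration.

  import numpy as np
  from sklearn.linear_model import LinearRegression

  rng = np.random.default_rng(2)

  # Hypothetical (input, output) pairs with a few injected outliers.
  X = rng.normal(size=(100, 3))
  y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.02, size=100)
  y[:5] += 2.0  # noisy patterns

  # First layer: a linear regression over all points.
  reg = LinearRegression().fit(X, y)
  residuals = np.abs(y - reg.predict(X))

  # Allocate points with large residuals to a separate "noisy" region so they do not
  # distort the rules learned from the rest of the data; they can be revisited later.
  noisy = residuals > 3 * residuals.std()
  reg_clean = LinearRegression().fit(X[~noisy], y[~noisy])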

Agent-based modelling

Currently, many of these algorithms assume homogeneous and rational behavior among investors. An alternative approach to financial modeling is agent-based modelling (ABM). ABM uses a population of autonomous agents whose behavior evolves endogenously, leading to complicated system dynamics that are sometimes impossible to predict from the properties of individual agents. [3] ABM is starting to be applied to computational finance, though for ABM to become more accurate, better models for rule generation need to be developed.
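
A toy agent-based sketch, purely illustrative: heterogeneous investor agents submit demand based on their own evolving sentiment, and aggregate demand relative to the shares offered drives the first-day price move. None of the parameters come from the cited review.

  import random

  random.seed(3)

  class Investor:
      def __init__(self):
          self.sentiment = random.uniform(0.0, 1.0)  # heterogeneous, not assumed rational

      def demand(self, offer_price):
          # Demand rises with sentiment and falls with the offer price.
          return max(0.0, self.sentiment * 100.0 - offer_price)

      def update(self, price_move):
          # Behaviour evolves endogenously in response to the market outcome.
          self.sentiment = min(1.0, max(0.0, self.sentiment + 0.1 * price_move))

  agents = [Investor() for _ in range(500)]
  offer_price, shares_offered = 20.0, 10_000.0

  total_demand = sum(a.demand(offer_price) for a in agents)
  price_move = (total_demand - shares_offered) / shares_offered  # excess demand moves the price
  for a in agents:
      a.update(price_move)

  print(f"first-day return: {price_move:.2%}")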

References

  1. Quintana, David; Cristóbal Luque; Pedro Isasi (2005). "Evolutionary rule-based system for IPO underpricing prediction". Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation. pp. 983–989. doi:10.1145/1068009.1068176. hdl:10016/4081. ISBN 1595930108. S2CID 3035047.
  2. Luque, Cristóbal; David Quintana; J. M. Valls; Pedro Isasi (2009). "Two-layered evolutionary forecasting for IPO underpricing". 2009 IEEE Congress on Evolutionary Computation. Piscataway, NJ, USA: IEEE Press. pp. 2374–2378. doi:10.1109/cec.2009.4983237. ISBN 978-1-4244-2958-5. S2CID 1733801.
  3. Brabazon, Anthony; Jiang Dang; Ian Dempsey; Michael O'Neill; David M. Edelman (2010). "Natural Computing in Finance: A Review" (PDF). Handbook of Natural Computing.