FAISS (Facebook AI Similarity Search)[3] is an open-source library for similarity search and clustering of vectors. It contains algorithms that search in sets of vectors of any size, up to ones that may not fit in RAM. It also contains supporting code for evaluation and parameter tuning.
FAISS is written in C++ with complete wrappers for Python and C. Some of the most useful algorithms are implemented on the GPU using CUDA.[4]
Features
FAISS is organized as a toolbox that contains a variety of indexing methods that commonly involve a chain of components (preprocessing, compression, non-exhaustive search, etc.). The scope of the library is intentionally limited to focus on the implementation of approximate nearest neighbor search (ANNS) algorithms, avoiding facilities related to database functionality, distributed computing, or feature extraction.[5]
FAISS is designed with the following assumptions:[5]
The primary data type for vector representation is FP32; other floating-point formats, such as BF16 and FP16, are also supported.
Searches are preferably performed on batches of query vectors rather than on a single query at a time.
Emphasis is placed on letting users write fast prototyping code via the Python wrappers.
The code should be as open as possible, so that users can access all the implementation details of the indexes.
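The preference for batched queries can be illustrated without FAISS itself: with a whole batch, all query-to-database distances reduce to a single matrix expression (a hypothetical NumPy sketch, not FAISS code), which is far more efficient than looping over queries one by one.

```python
import numpy as np

rng = np.random.default_rng(0)
d, nb, nq, k = 16, 500, 8, 3
xb = rng.standard_normal((nb, d)).astype(np.float32)  # database (FP32)
xq = rng.standard_normal((nq, d)).astype(np.float32)  # batch of queries

# All pairwise squared L2 distances at once, using the expansion
# ||q - b||^2 = ||q||^2 - 2 q.b + ||b||^2:
d2 = (xq**2).sum(1)[:, None] - 2.0 * xq @ xb.T + (xb**2).sum(1)[None, :]
I = np.argsort(d2, axis=1)[:, :k]  # ids of the k nearest database vectors
```

The single `xq @ xb.T` matrix product amortizes memory traffic over the whole batch, which is the effect the batched-search design assumption exploits.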
The following major categories of indexing methods are supported:
FAISS has standalone Vector Codec functionality for the lossy compression of vectors, allowing users to trade representation accuracy for binary size.[17]
Sivic; Zisserman (2003). "Video Google: A text retrieval approach to object matching in videos". Proceedings Ninth IEEE International Conference on Computer Vision. pp. 1470–1477 vol. 2. doi:10.1109/ICCV.2003.1238663. ISBN 0-7695-1950-4.
Fu, Cong; Xiang, Chao; Wang, Changxu; Cai, Deng (2017). "Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph". arXiv:1707.00143 [cs.LG].
Andre, Fabien; Kermarrec, Anne-Marie; Le Scouarnec, Nicolas (1 May 2021). "Quicker ADC: Unlocking the Hidden Potential of Product Quantization With SIMD". IEEE Transactions on Pattern Analysis and Machine Intelligence. 43 (5): 1666–1677. arXiv:1812.09162. doi:10.1109/TPAMI.2019.2952606. PMID 31722477.
Babenko, Artem; Lempitsky, Victor (June 2014). "Additive Quantization for Extreme Vector Compression". 2014 IEEE Conference on Computer Vision and Pattern Recognition. pp. 931–938. doi:10.1109/CVPR.2014.124. ISBN 978-1-4799-5118-5.
Martinez, Julieta; Zakhmi, Shobhit; Hoos, Holger H.; Little, James J. (2018). "LSQ++: Lower Running Time and Higher Recall in Multi-codebook Quantization". Computer Vision – ECCV 2018. Lecture Notes in Computer Science. Vol. 11220. pp. 508–523. doi:10.1007/978-3-030-01270-0_30. ISBN 978-3-030-01269-4.
Huijben, Iris A. M.; Douze, Matthijs; Muckley, Matthew; van Sloun, Ruud J. G.; Verbeek, Jakob (2024). "Residual Quantization with Implicit Neural Codebooks". arXiv:2401.14732 [cs.LG].
Autofaiss - automatically create Faiss k-NN indices with optimal similarity search parameters