Small data is data that is 'small' enough for human comprehension. [1] It is data in a volume and format that makes it accessible, informative and actionable. [2]
The term "big data" is about machines and "small data" is about people. [3] That is, eyewitness observations or five pieces of related data could be small data. Small data is what we used to think of as data. The only way to comprehend big data is to reduce it to small, visually appealing objects that represent various aspects of large data sets (such as histograms, charts, and scatter plots). Big data is all about finding correlations, but small data is all about finding causation, the reason why. [4]
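The reduction of a large data set to a small, readable object can be illustrated with a minimal sketch. It assumes a synthetic data set of 100,000 ages; the function names `summarize` and `render`, the bin width, and the data itself are hypothetical and do not come from the cited sources.

```python
import random
from collections import Counter


def summarize(values, bin_width):
    """Reduce a long list of integers to a small table of bin counts."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def render(histogram, scale):
    """Draw each bin as one row of '#' characters, one '#' per `scale` items."""
    rows = []
    for start, count in histogram.items():
        rows.append(f"{start:>3} | {'#' * (count // scale)} {count}")
    return "\n".join(rows)


random.seed(0)
# Hypothetical data: 100,000 synthetic ages between 0 and 99.
ages = [random.randrange(100) for _ in range(100_000)]

# Ten bins stand in for 100,000 raw values.
histogram = summarize(ages, bin_width=10)
print(render(histogram, scale=500))
```

The ten printed rows are the "small data" view: a person can read them at a glance, which is not true of the 100,000 underlying values.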
A formal definition of small data has been proposed by Allen Bonde, former vice-president of Innovation at Actuate - now part of OpenText: "Small data connects people with timely, meaningful insights (derived from big data and/or “local” sources), organized and packaged – often visually – to be accessible, understandable, and actionable for everyday tasks." [5]
Another definition of small data is:
In 2016, Martin Lindstrom estimated: “If one takes the top 100 biggest innovations of our time, perhaps around 60% to 65% are really based on Small Data.” [4] Innovations based on small data range from Snapchat to simple objects such as the Post-it note. Lindstrom believes we have become so focused on Big Data that we tend to forget about more basic concepts and creativity. Lindstrom defines Small Data "as seemingly insignificant observations you identify in consumers’ homes, is everything from how you place your shoes to how you hang your paintings". He thus considers that one should fully master the basics (Small Data) before mining data and looking for correlations.
Bonde has written about the topic for Forbes, [7] Direct Marketing News, [8] CMO.com [9] and other publications.
According to Martin Lindstrom, in his book Small Data: "[In customer research, small data is] seemingly insignificant behavioural observations containing very specific attributes pointing towards an unmet customer need. Small data is the foundation for breakthrough ideas or completely new ways to turnaround brands." [10] His approach combines the observation of small samples with intuition. [11] Marketers can obtain market insights from Small Data by engaging with and observing people in their own environments. [11] In comparison to Big Data, Small Data has the power to trigger emotions and to provide insights into the reasons behind customers' behaviours. [12] It may uncover detailed information on a person's extroversion or introversion, self-confidence, whether they are having problems in a relationship, etc. [12] According to Lindstrom, relationships among people and customer segments are organized around four criteria:
Many companies underestimate the power of Small Data, using samples of millions of consumers instead of recognizing the value of closely observing small samples in their market research. [11] In his book, Lindstrom defines "7Cs", which companies should consider in the attempt to derive meaningful customer insights and market trends through small data from their customers: [12]
Some of Lindstrom's clients, such as Lowes Foods, looked at data in a different way and actually chose to live with the customer. “As you enter their store, they have now created an amazing community where every staff member acts in a character mood, based on Small Data”. [4] The supermarket did everything it could to make the customer feel at home. All employee behaviours are inspired by customer feedback gathered from interviews conducted directly in customers' homes.
Researchers at Cornell University started developing applications to monitor health problems in patients, based on small data. This is an initiative of Cornell's Small Data Lab, [13] led by Deborah Estrin, in close cooperation with Weill Cornell Medical College.
The Small Data Lab developed a series of apps that focus not only on gathering data about patients' pain but also on tracking habits in areas such as grocery shopping. In the case of patients with rheumatoid arthritis, for example, which has flares and remissions that do not follow a particular cycle, the app gathers information passively, making it possible to forecast when a flare might be coming based on small changes in behaviour. Other apps monitor online grocery shopping, so that each user's purchases can be adapted to the recommendations of nutritionists, or monitor email language to identify patterns that might indicate "fluctuations in cognitive performance, fatigue, side effects of medication or poor sleep, and other conditions and treatments that are typically self-reported and self-medicated". [14]
The United States Postal Service (USPS) used optical character recognition (OCR) to automatically read and process 98% of all hand-addressed mail and 99.5% of machine-printed mail. By combining this technology with its small data sample of US ZIP codes, the USPS can now process more than 36,000 pieces of mail per hour. [15]
In 2015, Boeing established the analytics lab for aerospace data in cooperation with Carnegie Mellon University to leverage the university's leadership in machine learning, language technologies and data analytics. [16] One of the initiative's projects aims to standardize maintenance logs using AI in order to dramatically reduce costs.
Currently, there is no standardized procedure for documenting maintenance logs, which leads to small but highly unstructured data sets. As a result, it is very difficult for maintenance workers to interpret the variations between maintenance logs within a short period of time. However, with AI and a narrow data set of common aircraft maintenance terminology, it becomes possible to translate these logs dynamically in real time. By using AI to enhance the speed and accuracy of the airline maintenance workflow, airlines stand to save billions, according to the Harvard Business Review. [17]
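The role of a narrow terminology set can be illustrated with a minimal sketch. It deliberately uses a plain dictionary lookup rather than AI, and it is not the method used by Boeing or Carnegie Mellon; the abbreviations in `TERMS`, the function name `normalize`, and the sample log entry are all hypothetical.

```python
import re

# Hypothetical table mapping shorthand found in free-text logs to standard terms.
TERMS = {
    "eng": "engine",
    "insp": "inspected",
    "repl": "replaced",
    "hyd": "hydraulic",
    "lh": "left-hand",
    "rh": "right-hand",
}


def normalize(entry: str) -> str:
    """Lowercase a log entry, split it into words, and expand known shorthand."""
    tokens = re.findall(r"[a-z0-9]+", entry.lower())
    return " ".join(TERMS.get(token, token) for token in tokens)


print(normalize("REPL LH hyd pump"))  # replaced left-hand hydraulic pump
```

Even a table this small lets differently written entries converge on one wording. A learned model would be needed for misspellings and abbreviations that the table does not list.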