Social data analysis

Social data analysis is the data-driven analysis of how people interact in social contexts, often with data obtained from social networking services. The goal may be to simply understand human behavior or even to propagate a story of interest to the target audience. Techniques may involve understanding how data flows within a network, identifying influential nodes (people, entities etc.), or discovering trending topics.

Social data analysis usually comprises two key steps: 1) gathering data generated from social networking sites (or through social applications), and 2) analyzing that data. The analysis often requires real-time (or near real-time) processing, measurements that appropriately weigh factors such as influence, reach, and relevancy, an understanding of the context of the data being analyzed, and consideration of the time horizon. In short, social data analytics involves analyzing social media in order to understand and surface the insights embedded within the data. [1]

Social data analysis can provide a new slant on business intelligence, where social exploration of data can lead to important insights that the user of analytics did not envisage or explore. The term was introduced by Martin Wattenberg in 2005 [2] and has more recently also been addressed as big social data analysis in relation to big data computing.

Systems are available to assist users in analyzing social data. They allow users to store data sets and create corresponding visual representations. The discussion mechanisms often use frameworks such as blogs and wikis to drive this social exploration and collaborative intelligence.

Obtaining social data

Social networking services have become increasingly popular with the development of Web 2.0. Many of these services provide APIs that allow easy access to their data by responding to user queries with the requested data in the form of XML or JSON formatted strings. In order to protect the privacy of their users, services such as Facebook require that the person requesting data has the necessary data access permissions. Services may also charge users for access to their data. Sources of social data include Twitter, Facebook, news websites, Wikipedia and We Feel Fine.
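As a minimal sketch of handling such a JSON response, the example below parses a payload into author and text pairs. The payload and its field names ("data", "id", "author", "text") are made up for illustration and do not correspond to any particular service's API.

```python
import json

# Hypothetical JSON payload, shaped like what a social API might return.
# The field names ("data", "id", "author", "text") are assumptions.
SAMPLE_RESPONSE = """
{
  "data": [
    {"id": "1", "author": "alice", "text": "Loving the new release! #launch"},
    {"id": "2", "author": "bob", "text": "Queue times are long today #launch #support"}
  ]
}
"""


def extract_posts(raw):
    """Parse a JSON response string into a list of (author, text) pairs."""
    payload = json.loads(raw)
    # A missing "data" key is treated as an empty result set.
    return [(post["author"], post["text"]) for post in payload.get("data", [])]


posts = extract_posts(SAMPLE_RESPONSE)
for author, text in posts:
    print(f"{author}: {text}")
```

A real client would also need to handle authentication and access permissions, which vary by service and are omitted here.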

Some APIs only allow access to data in small quantities, so indexing the data in bulk can become a challenge. Six Apart was the first social media company to provide a (free) firehose of content for all the posts in their network (provided over XMPP). Twitter later provided a firehose as well, as did companies like Spinn3r, Datasift, and GNIP.
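When an API returns data only in small batches, a collector typically has to walk through the pages one request at a time. The sketch below illustrates this with cursor-based paging against an in-memory stand-in; `fetch_page`, the page size, and the `next_cursor` field are all hypothetical and only imitate how such a service might behave.

```python
from typing import Iterator, Optional

# Stand-in for a remote service: 7 posts, served at most 3 per request.
_ALL_POSTS = [f"post-{i}" for i in range(7)]
PAGE_SIZE = 3


def fetch_page(cursor: Optional[int] = None) -> dict:
    """Hypothetical API call returning one page and the next cursor (or None)."""
    start = cursor or 0
    end = start + PAGE_SIZE
    next_cursor = end if end < len(_ALL_POSTS) else None
    return {"items": _ALL_POSTS[start:end], "next_cursor": next_cursor}


def iter_all_posts() -> Iterator[str]:
    """Follow cursors until the service reports no further pages."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


collected = list(iter_all_posts())
print(len(collected), "posts collected")
```

Real services usually add rate limits on top of paging, so a production collector would also need to throttle or back off between requests.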

Methods of analysis

In most cases, the aim is to find relationships between social data and another event, or to use the results of social data analyses to predict events. Notable articles in this field include "Twitter Mood Predicts the Stock Market" [3] and "Predicting the Present with Google Trends". [4] Such analyses usually rely on statistical methods, machine learning, or data mining.
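One simple statistical way to relate a social signal to another series is a lagged Pearson correlation: the signal at time t is compared with the target at time t + lag. The sketch below is an illustration under stated assumptions, not the method of the cited articles; the `mood` and `target` values are made up, with the target deliberately constructed to follow the signal by one step.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(signal, target, lag):
    """Correlate signal[t] with target[t + lag] for a non-negative lag."""
    if len(signal) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(signal) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    return pearson(signal[: len(signal) - lag], target[lag:])


# Made-up daily values; target[t + 1] was built as 2 * mood[t] + 1.
mood = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9, 0.2]
target = [1.0, 1.2, 1.8, 1.6, 2.6, 2.2, 2.8]

for lag in (0, 1, 2):
    print(lag, round(lagged_correlation(mood, target, lag), 3))
```

A high correlation at some lag only suggests that the signal may carry predictive information; it does not establish causation, and real studies test this far more carefully.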

Universities all over the world are opening graduate programs in social data analysis.

Key concepts

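Social data analytics involves a number of factors that are important to keep in mind, as noted earlier: real-time (or near real-time) analysis; measurements that weigh influence, reach, and relevancy; the context of the data being analyzed; and time horizon considerations. [1]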

See also

Related Research Articles

Analysis

Analysis is the process of breaking a complex topic or substance into smaller parts in order to gain a better understanding of it. The technique has been applied in the study of mathematics and logic since before Aristotle, though analysis as a formal concept is a relatively recent development.

Customer relationship management (CRM) is a process in which a business or other organization administers its interactions with customers, typically using data analysis to study large amounts of information.

Social network analysis: analysis of social structures using network and graph theory

Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes and the ties, edges, or links that connect them. Examples of social structures commonly visualized through social network analysis include social media networks, meme spread, information circulation, friendship and acquaintance networks, business networks, knowledge networks, difficult working relationships, collaboration graphs, kinship, disease transmission, and sexual relationships. These networks are often visualized through sociograms in which nodes are represented as points and ties are represented as lines. These visualizations provide a means of qualitatively assessing networks by varying the visual representation of their nodes and edges to reflect attributes of interest.
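The node-and-tie representation can be sketched in a few lines. The example below builds an undirected graph from a made-up list of friendship ties and computes normalized degree centrality, one of the simplest indicators of how connected a node is. The names and edges are illustrative assumptions only.

```python
from collections import defaultdict

# Made-up undirected friendship ties.
EDGES = [
    ("ana", "ben"),
    ("ana", "cy"),
    ("ana", "dee"),
    ("ben", "cy"),
    ("dee", "eli"),
]


def build_adjacency(edges):
    """Map each node to the set of nodes it is tied to."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


centrality = degree_centrality(build_adjacency(EDGES))
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality[most_connected])
```

Degree centrality is only one of several measures; others, such as betweenness or eigenvector centrality, capture different notions of importance within a network.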

Analytics: discovery, interpretation, and communication of meaningful patterns in data

Analytics is the systematic computational analysis of data or statistics. It is used for the discovery, interpretation, and communication of meaningful patterns in data. It also entails applying data patterns towards effective decision-making. It can be valuable in areas rich with recorded information; analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance.

Enterprise feedback management (EFM) is a system of processes and software that enables organizations to centrally manage deployment of surveys while dispersing authoring and analysis throughout an organization. EFM systems typically provide different roles and permission levels for different types of users, such as novice survey authors, professional survey authors, survey reporters and translators. EFM can help an organization establish a dialogue with employees, partners, and customers regarding key issues and concerns and potentially make customer-specific real time interventions. EFM consists of data collection, analysis and reporting.

Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events.

Path analysis is the analysis of a path, which is a portrayal of a chain of consecutive events that a given user or cohort performs during a set period of time while using a website, online game, or eCommerce platform. As a subset of behavioral analytics, path analysis is a way to understand user behavior in order to gain actionable insights into the data. Path analysis provides a visual portrayal of every event a user or cohort performs as part of a path during a set period of time.
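As a rough sketch of the idea, the example below orders a made-up event log by user and time, reconstructs each user's path, and counts how often one event directly follows another. The event names and the `(user, timestamp, event)` layout are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Made-up event log: (user, timestamp, event).
EVENTS = [
    ("u1", 1, "home"), ("u1", 2, "search"), ("u1", 3, "product"), ("u1", 4, "checkout"),
    ("u2", 1, "home"), ("u2", 2, "search"), ("u2", 3, "product"),
    ("u3", 1, "home"), ("u3", 2, "product"), ("u3", 3, "checkout"),
]


def user_paths(events):
    """Return each user's events as a tuple ordered by timestamp."""
    by_user = defaultdict(list)
    for user, _timestamp, event in sorted(events, key=lambda e: (e[0], e[1])):
        by_user[user].append(event)
    return {user: tuple(path) for user, path in by_user.items()}


def transition_counts(paths):
    """Count how often each event is immediately followed by another."""
    counts = Counter()
    for path in paths.values():
        counts.update(zip(path, path[1:]))
    return counts


paths = user_paths(EVENTS)
transitions = transition_counts(paths)
for (source, destination), count in transitions.most_common():
    print(f"{source} -> {destination}: {count}")
```

The resulting transition counts are the kind of data that a path visualization, such as a flow diagram, would be drawn from.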

Social media measurement

Social media measurement, also called social media analytics or social listening, is a way of computing the popularity of a brand or company by extracting information from social media channels such as blogs, wikis, news sites, micro-blogs such as Twitter, social networking sites, video/photo sharing websites, forums, message boards, and other user-generated content. In other words, it is a way to gauge the success of the social media marketing strategies used by a company or a brand. It is also used by companies to gauge current trends in the industry. The process first gathers data from different websites and then performs analysis based on different metrics such as time spent on the page, click-through rate, content shares, comments, and text analytics to identify positive or negative emotions about the brand.
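Two of the metrics mentioned above can be sketched briefly: click-through rate as clicks divided by impressions, and a very crude word-list approach to spotting positive or negative wording. The word lists and sample mentions are made up, and real text analytics relies on far richer models than this.

```python
import re

# Tiny illustrative word lists; real sentiment lexicons are much larger.
POSITIVE = {"love", "great", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "hate", "poor"}


def click_through_rate(clicks, impressions):
    """Clicks divided by impressions, or 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def sentiment_score(text):
    """Count of positive words minus count of negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


mentions = [
    "Love the new app, support was helpful",
    "Checkout is slow and the search is broken",
    "Delivered on Tuesday",
]
scores = [sentiment_score(mention) for mention in mentions]
print(scores)
print(click_through_rate(30, 1200))
```

A score above zero suggests positive wording and below zero negative wording, though word counting of this kind misses negation, sarcasm, and context.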

Netnography, an online research method originating in ethnography, is a way of understanding social interaction in contemporary digital communications contexts. Netnography is a specific set of research practices related to data collection, analysis, research ethics, and representation, rooted in participant observation. In netnography, a significant amount of the data originates in and manifests through the digital traces of naturally occurring public conversations recorded by contemporary communications networks. Netnography uses these conversations as data. It is an interpretive research method that adapts the traditional, in-person participant observation techniques of anthropology to the study of interactions and experiences manifesting through digital communications.

Big data: information assets characterized by high volume, velocity, and variety

Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many fields (columns) offer greater statistical power, while data with higher complexity may lead to a higher false discovery rate. Big data analysis challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy, and data source. Big data was originally associated with three key concepts: volume, variety, and velocity. The analysis of big data presents challenges in sampling, and thus previously allowing for only observations and sampling. Therefore, big data often includes data with sizes that exceed the capacity of traditional software to process within an acceptable time and value.

Learning analytics is the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs. The growth of online learning since the 1990s, particularly in higher education, has contributed to the advancement of Learning Analytics as student data can be captured and made available for analysis. When learners use an LMS, social media, or similar online tools, their clicks, navigation patterns, time on task, social networks, information flow, and concept development through discussions can be tracked. The rapid development of massive open online courses (MOOCs) offers additional data for researchers to evaluate teaching and learning in online environments.

Educational data mining (EDM) describes a research field concerned with the application of data mining, machine learning and statistics to information generated from educational settings. At a high level, the field seeks to develop and improve methods for exploring this data, which often has multiple levels of meaningful hierarchy, in order to discover new insights about how people learn in the context of such settings. In doing so, EDM has contributed to theories of learning investigated by researchers in educational psychology and the learning sciences. The field is closely tied to that of learning analytics, and the two have been compared and contrasted.

The fields of marketing and artificial intelligence converge in systems which assist in areas such as market forecasting, and automation of processes and decision making, along with increased efficiency of tasks which would usually be performed by humans. The science behind these systems can be explained through neural networks and expert systems, computer programs that process input and provide valuable output for marketers.

Social media intelligence refers to the collective tools and solutions that allow organizations to analyze conversations, respond to social signals and synthesize social data points into meaningful trends and analysis, based on the user's needs. Social media intelligence allows one to utilize intelligence gathering from social media sites, using both intrusive and non-intrusive means, from open and closed social networks. This type of intelligence gathering is one element of OSINT.

Topsy Labs

Topsy Labs was a social search and analytics company based in San Francisco, California. The company was a certified Twitter partner and maintained a comprehensive index of tweets, numbering in the hundreds of billions, dating back to Twitter's inception in 2006.

Behavioral analytics is a recent advancement in business analytics that reveals new insights into the behavior of consumers on eCommerce platforms, online games, web and mobile applications, and IoT. The rapid increase in the volume of raw event data generated by the digital world enables methods that go beyond typical analysis by demographics and other traditional metrics that tell us what kind of people took what actions in the past. Behavioral analysis focuses on understanding how consumers act and why, enabling accurate predictions about how they are likely to act in the future. It enables marketers to make the right offers to the right consumer segments at the right time.

Social media mining is the process of obtaining big data from user-generated content on social media sites and mobile apps in order to extract patterns, form conclusions about users, and act upon the information, often for the purpose of advertising to users or conducting research. The term is an analogy to the resource extraction process of mining for rare minerals. Resource extraction mining requires mining companies to sift through vast quantities of raw ore to find the precious minerals; likewise, social media mining requires human data analysts and automated software programs to sift through massive amounts of raw social media data in order to discern patterns and trends relating to social media usage, online behaviours, sharing of content, connections between individuals, online buying behaviour, and more. These patterns and trends are of interest to companies, governments and not-for-profit organizations, as these organizations can use them to design their strategies or introduce new programs, products, processes or services.

Media intelligence uses data mining and data science to analyze public, social and editorial media content. It refers to marketing systems that synthesize billions of online conversations into relevant information. This allows organizations to measure and manage content performance, understand trends, and drive communications and business strategy.

Social media analytics is the process of gathering and analyzing data from social networks such as Facebook, Instagram, LinkedIn and Twitter. It is commonly used by marketers to track online conversations about products and companies. One author defined it as "the art and science of extracting valuable hidden insights from vast amounts of semi-structured and unstructured social media data to enable informed and insightful decision making."

Laura Wattenberg is a name expert, entrepreneur, and author of The Baby Name Wizard. She is known for deriving cultural insights from scientific analysis of name usage, as well as creating innovative interactive tools to communicate these insights. Wattenberg also co-founded the name generating website Nymbler with Icosystem. Wattenberg is frequently quoted in the media on name-related topics.

References

  1. IBM Emerging Technology - jStart - On the Horizon - Social data analytics
  2. Wattenberg, Martin (2005). "Baby Names, Visualization, and Social Data Analysis". IEEE Symposium on Information Visualization.
  3. Bollen, Johan; Mao, Huinan; Zeng, Xiaojun (2011). "Twitter mood predicts the stock market". Journal of Computational Science. 2 (1): 1–8. arXiv:1010.3003. doi:10.1016/j.jocs.2010.12.007. S2CID 14727513.
  4. Choi, Hyunyoung; Varian, Hal (2012). "Predicting the present with Google Trends" (PDF). Economic Record. 88 (s1): 2–9. doi:10.1111/j.1475-4932.2012.00809.x. S2CID 155467748.