Social data revolution

The social data revolution is the shift in human communication patterns towards increased personal information sharing and its related implications, made possible by the rise of social networks in the early 2000s. This phenomenon has resulted in the accumulation of unprecedented amounts of public data. [1]

This large and frequently updated data source has been described as a new type of scientific instrument for the social sciences. [2] Several independent researchers have used social data to "nowcast" and forecast trends such as unemployment, flu outbreaks, [3] mood of whole populations, [4] travel spending and political opinions in a way that is faster, more accurate and cheaper than standard government reports or Gallup polls. [2]

Social data refers to data that individuals create and knowingly and voluntarily share. Cost and overhead previously rendered this semi-public form of communication unfeasible, but advances in social networking technology from 2004 to 2010 have made broader concepts of sharing possible. [5] The types of data users share include geolocation, medical data, [6] dating preferences, open thoughts, and interesting news articles.

The social data revolution not only enables new business models like the ones on Amazon.com but also provides large opportunities to improve decision-making for public policy and international development. [7]

The analysis of large amounts of social data leads to the field of computational social science. Classic examples include the study of media content [8] or social media content. [3] [4] [9]

Evolution of social data

Every internet activity leaves behind traces of data (a digital footprint) that can be used to learn more about the user. [10] As use of the internet becomes more widespread, the datafication of the world is progressing rapidly: as of 2017, around 16 zettabytes of data were produced per year, and 163 zettabytes are expected for the year 2025. [11] This has led to data becoming a critical commodity. [10] It ties together all societal actors: public institutions, private firms, and individuals, each relying on data in a unique way.

Governments have been collecting data for centuries to ensure the continuance of institutional systems, by limiting the risk of credit defaults, collecting income-based taxes, and providing the necessary infrastructure in light of their citizens' demographic distribution. [12] In its beginnings, this data consisted of written information for record keeping and control, including a census system. [12]

This analogue process was very time- and cost-intensive, leaving little room for interpreting larger data sets. [12] Meanwhile, corporate technological developments have moved this offline data into the digital age, allowing visualization and data analytics. [12] [10] In the public sphere, connecting survey and poll methodologies with database computing resulted in the ability to gather and store large data sets on individuals. [10]

Web 2.0 and social network sites

Over the last few decades, the internet has shifted from being used mostly as a source of information about the world to being primarily used for communication, user-generated content, data sharing, and community building. [13] Many consider this shift to be the development of "Web 2.0". Social network sites such as Facebook and YouTube are the foundation of Web 2.0 and of the shift to social data sharing. [13]

Early examples of social data websites are Craigslist and the wishlists of Amazon.com. Both enable users to communicate information to anybody who is looking for it. They differ in their approach to identity. Craigslist leverages the power of anonymity, while Amazon.com leverages the power of persistent identity, based on the history of the customer with the firm. The job market is even being shaped by the information people share about themselves on sites like LinkedIn and Facebook. [14]

Examples of more sophisticated social data sites are Twitter and Facebook. On Twitter, sending a message or tweet is as simple as sending an SMS text message. Twitter made this C2W, customer to the world: Any tweet a user sends can potentially be read by the entire world. Facebook focuses on interactions between friends, C2C in traditional language. It provides many ways for collecting data from its users: "tag" a friend in a photo, "comment" on what they posted, or just "like" it. These data are the basis for sophisticated models of the relationships between users. They can be used to significantly increase the relevance of what is shown to the user, and for advertising purposes. [15]

By 2009, the popularity of social networking sites had increased to four times what it had been in 2005. [16] As of 2013, Twitter had over 250 million users sharing almost 500 million tweets per day, and Facebook had well over one billion users around the world. [17]

Business sector and social data

Companies, advertisers, and other entities often use the data that is shared via social networking sites and other data-sharing avenues. [18] Social networking sites, for example, can sell user data to advertisers and other entities, which can then use it to influence consumer decisions. [13] Data mining is also used to gather this information. [18]

While websites and other applications were the origins of this data collection, improvements in technology mean that many devices used in daily life (e.g., smartphones, smartwatches, and music devices) can now collect data on individuals, increasing the amount of personal data that is available. [19] [20]

This growth of people's digital identity (the information available via these electronic sources) is being used by companies and organizations to improve products and services and to reduce costs by targeting what consumers want and expect. [20] The data gathered can include shopping experiences, social media preferences, demographic information and more. [18]

Using this data allows for better personalization of products, which has become an expected and vital aspect of product use and production. [19] The data that is accessible about consumers can be used to infer their behavioral patterns. [21] For example, location information is used to assess when and where consumers go, so that ads and promotions can be targeted based on the stores they visit. [21] Online retailers have also gained insight into how to better personalize the online shopping experience through data gathered during the online transaction. [22]

Businesses can even use consumer data to determine whether different shelf spacing of products affects consumer purchasing decisions, and to assess cross-item marketing potential based on items that are often purchased together. [23]
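As a rough illustration of the "items often purchased together" idea, the sketch below counts how often each pair of items appears in the same transaction. It is not the method of the cited study; the function name `co_purchase_counts` and the sample baskets are hypothetical and invented for illustration only.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicate items and gives each pair
        # a single canonical (alphabetical) ordering.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical transactions, invented for illustration only.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["butter", "jam"],
    ["bread", "milk"],
]

counts = co_purchase_counts(baskets)
top_pairs = counts.most_common(2)  # the most frequently co-purchased pairs
```

Pairs with high counts are candidates for cross-item promotions. A real analysis would typically also normalize by how popular each item is on its own, which this sketch leaves out.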

Social commerce

While businesses and advertisers often take advantage of the consumer data available, consumers also use other users' information for their purchase decisions. Social commerce sites are where consumers share product and service experiences, opinions, and other information. [24] A famous example of such a site is Pinterest, which has over 100 million users. [24] These sites and other online sources of product and brand information influence consumers' purchasing decisions. [25] It is estimated that about 67% of online customers use this information in making their purchase decisions. [24] These sites create an environment that consumers consider trustworthy, since the information comes from other consumers. [24]

Other uses of social data

With the vast amount of accessible data about individuals, the potential uses of this information are growing.

The healthcare sector has many potential uses for this data. Information gathered from social media and other social data sharing sources can be used to predict flu and other disease outbreaks, to assess how emergency responses are handled, and more. [26] With the use of Twitter and geotags, medical researchers can evaluate the health of a particular neighborhood and use that information to provide better outreach and services. [26] Medtronic has developed a digital blood glucose meter that lets health care providers and patients know about low glucose levels. [19]

Social data can also be used to assess reactions to crises. [27] After Hurricane Sandy, researchers used Twitter to evaluate the emotions and issues that those affected were facing. [27] This information can potentially be used to help better prepare and respond to future crises.

This data can be used to assist with urban planning. The city of Boston has used rider information from Uber to improve transportation planning and road maintenance. [19]

Computational social science

Using social data for research purposes has led to the development of computational social science. Computational social science combines social science, computer science, and network science. [28] This field emerged in 2009. [29] Before the rise of social data and the technological advances that supported it, researchers were limited to a narrow, individual-based view of information, since their primary form of research relied on interviews. [29] With the vast amount of social data available today, researchers can analyze wider groups and obtain a broader view of information. They can use social networks and cell phone data and perform online experiments that allow them to gather more information than before. [29]

Privacy concerns

With so much data about individuals accessible from so many sources, privacy has become a major concern. Security breaches of customer and other social information, such as the compromise of more than 56 million Home Depot customers' credit card information, [19] have heightened privacy concerns about social data. How companies use the personal information they gather, and how it might be misused, is a concern for the majority of consumers. [19] [20] Despite this, many people do not know how social networking sites and other sources are using and selling their data. [30] In a 2014 study, only 25% of online users knew that their location could be accessed, and only 14% knew that their web-surfing history could be accessed and shared. [19]

Even though privacy concerns are a critical factor in people's sharing of personal information on the internet and in their overall internet involvement, [22] most people are willing to share this information if the benefits of doing so outweigh the potential privacy and security costs. [18] [20] Consumers enjoy the personalization of products and services that this information gathering makes possible and, despite the concerns, continue to use them. [19]

International development

"From a macro-perspective, it is expected that Big Data-informed decision-making will have a similar positive effect on efficiency and productivity as ICT have had during the recent decade."

Hilbert 2013

In his study of the data revolution in international development, Martin Hilbert, a social sciences professor at UC Davis, argued that the natural next step after the information societies fueled by ICT since the late 1990s is knowledge societies informed by Big Data analysis. Decision-making informed by big data analysis has improved both efficiency and productivity in the developed world. Hilbert examines the challenges and potential of the data revolution in "the unruly world of international development." [7]

Types of data

Hilbert identified four types of data available in large quantities by 2013: words, locations, nature, and behavior. [7]

Words

Individual interactions with the internet, such as words in comments, social media postings, and Google search term volumes, offer an increasingly large source of big data. Traditionally, statistics are generated through a census or a probability survey (in the United States, for example, the Annual Social and Economic Supplement (ASEC), the Current Population Survey (CPS), the American Community Survey (ACS), and the National Health Interview Survey (NHIS)) or from administrative records, such as payroll, unemployment, Social Security income taxes, scanner data, credit card data and other commercial transaction records. [31]

"Google has analyzed clusters of search terms by region in the United States to predict flu outbreaks faster than was possible using hospital admission records."

Shaw 2014 "Why "Big Data" Is a Big Deal"

Weatherhead University Professor Gary King described how the revolution lies not just in the quantity of data available but in the ability to do something with the data to benefit society. [32]
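The search-term example quoted above can be illustrated with a toy nowcast. The sketch below is not Google's method: it swaps in a plain least-squares line that relates a search-volume index to officially reported case counts, then applies that line to the current week, for which search data is available but the official count is not yet published. The function `fit_line` and all numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical past weeks: search-volume index and reported flu cases.
search_volume = [10.0, 20.0, 30.0, 40.0]
reported_cases = [25.0, 45.0, 65.0, 85.0]

intercept, slope = fit_line(search_volume, reported_cases)

# Nowcast for the current week from its (already known) search volume.
current_volume = 50.0
nowcast = intercept + slope * current_volume
```

The speed advantage comes from the input: search volumes are available almost immediately, while official counts arrive with a delay. A single-variable linear fit is only a stand-in here; published nowcasting work uses many terms and more careful statistical learning.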

Location

Global Positioning System (GPS)-enabled mobile tablets and phones, radio-frequency identification (RFID) chips (part of automatic identification and data capture (AIDC) technologies), telematics, location-based games, and similar technologies provide data on absolute location and relative movement.

Nature

Hilbert categorizes data on natural processes under 'Nature', which includes data from sensors that measure moisture in the air and temperature. [7]

Behavior

Data can be generated from user behavior in multiplayer online games, [7] such as League of Legends, World of Warcraft, Minecraft, Call of Duty, and Dota 2. Nathan Eagle, a computer scientist at the Santa Fe Institute in New Mexico, began using cellphones in the early 2000s to collect accurate, large-scale data about real social interactions. [33] [34] [35] His reality mining project was named one of the "10 Technologies Most Likely To Change The Way We Live" by the MIT Technology Review. [36]

References

  1. Weigend, Andreas. "The Social Data Revolution". Harvard Business Review. Retrieved July 15, 2009.
  2. Hubbard, Douglas (2011). Pulse: The New Science of Harnessing Internet Buzz to Track Threats and Opportunities. John Wiley & Sons.
  3. Vasileios Lampos; Nello Cristianini (2012). "Nowcasting Events from the Social Web with Statistical Learning". ACM Transactions on Intelligent Systems and Technology. 3 (4): 1–22. doi:10.1145/2337542.2337557. S2CID 8297993. 72.
  4. Thomas Lansdall‐Welfare; Vasileios Lampos; Nello Cristianini (August 2012). "Nowcasting the mood of the nation". Significance Magazine. Vol. 9, no. 4. pp. 26–28. doi:10.1111/j.1740-9713.2012.00588.x.
  5. Swathi Dharshana Naidu (Dec 2009). "Social Data Revolution". Posterous. Retrieved 2010-07-08.
  6. Dyson, Esther (March 23, 2010). "Health, not Health Care!". Huffington Post. Retrieved 2010-06-08.
  7. Hilbert, Martin (2013). "Big Data for Development: From Information- to Knowledge Societies". SSRN Scholarly Paper (2205145). Rochester, NY: Social Science Research Network. SSRN 2205145.
  8. Detecting macropatterns in global media content
  9. Twitter Mood: The Effects of the Recession on Public Mood in the UK
  10. West, Sarah Myers (2017). "Data Capitalism: Redefining the Logics of Surveillance and Privacy". Business & Society: 1–22.
  11. Cave, Andrew (13 April 2017). "What Will We Do When the World's Data Hits 163 Zettabytes In 2025?". Forbes. Retrieved 30 May 2018.
  12. Mayer-Schönberger, Viktor; Cukier, Kenneth (2013). Big Data: A Revolution That Will Transform How We Live, Work and Think. London, UK: John Murray (Publishers).
  13. Fuchs, Christian. 2011. "Web 2.0, Prosumption, and Surveillance." Surveillance & Society 8(3): 288-309.
  14. Reid Hoffman (June 26, 2009). "Future of Jobs & Social Data Revolution". Techaffair.com. Retrieved 2010-07-02.
  15. Dyson, Esther (February 11, 2008). "The Coming Ad Revolution". The Wall Street Journal. Retrieved 2010-04-10.
  16. Donde, Deepa S., Chopade, Neha, and Ranjith, P.V. 2012. "Social networking sites: a new era of 21st century." SIES Journal of Management 8(1): 66-73.
  17. Osatuyi, Babajide. 2013. "Information sharing on social media sites." Computers in Human Behavior 29(6): 2622-2631.
  18. Jai, Tun-Min, and King, Nancy J. 2016. "Privacy versus reward: Do loyalty programs increase consumers' willingness to share personal information with third-party advertisers and data brokers?" Journal of Retailing and Consumer Services 28: 296-303.
  19. Morey, Timothy, Forbath, Theodore, and Schoop, Allison. 2015. "Customer data: designing for transparency and trust." Harvard Business Review 93(5): 96-105.
  20. Roeber, Bjoern; Rehse, Olaf; Knorrek, Robert; Thomsen, Benjamin (2015). "Personal data: How context shapes consumers' data sharing with organizations from various sectors". Electronic Markets. 25 (2): 95. doi:10.1007/s12525-015-0183-0. S2CID 28025341.
  21. Smith, Natasha. 2015. "The datafication of marketing." DM News: 16+. Retrieved from http://go.galegroup.com/
  22. Lee, Seungsin; Lee, Younghee; Lee, Joing-In; Park, Jungkun (2015). "Personalized E-Services: Consumer Privacy Concern and Information Sharing". Social Behavior and Personality. 43 (5): 729. doi:10.2224/sbp.2015.43.5.729.
  23. Tsai, Chieh-Yuan; Huang, Sheng-Hsiang (2014). "A data mining approach to optimise shelf space allocation in consideration of customer purchase and moving behaviours". International Journal of Production Research. 53 (3): 850. doi:10.1080/00207543.2014.937011. S2CID   110688389.
  24. Liu, Libo, Cheung, Christy M.K., and Lee, Matthew K.O. 2016. "An empirical investigation of information sharing behavior on social commerce sites." International Journal of Information Management 36(5): 686-699.
  25. Chen, Jie, Teng, Lefa, Yu, Ying, and Yu, Xeer. 2016. "The effect of online information sources on purchase intentions between consumers with high and low susceptibility to informational influence." Journal of Business Research 69(2): 467-475.
  26. Nguyen, Duc T., and Jung, Jai E. 2016. "Real-time event detection for online behavioral analysis of big social data." Future Generation Computer Systems 66: 137-145.
  27. Spence, Patric R., Lachlan, Kenneth A., and Rainear, Adam M. 2016. "Social media and crisis research: Data collection and directions." Computers in Human Behavior 54: 667-672.
  28. Chang, R. M., Kauffman, R.J., and Kwon, Y. 2014. Understanding the paradigm shift to computational social science in the presence of big data. Decision, 63, 67-80.
  29. Mann, A. 2016. Core concept: computational social science. PNAS, 113(3). 468-470. doi: 10.1073/pnas.1524881113
  30. Lilley, Stephen, Frances S. Grodzinsky and Andra Gumbus. 2012. "Revealing the Commercialized and Compliant Facebook User." Journal of Information, Communication & Ethics in Society 10(2):82-92
  31. "Survey Methodology" (PDF), StatsCan, December 19, 2014, retrieved December 19, 2013
  32. Shaw, Jonathan (March 2014), "Why "Big Data" Is a Big Deal: Information science promises to change the world", Harvard Magazine, retrieved December 23, 2016
  33. Nature News, April 2009
  34. Reality Mining downloads
  35. Reality mining whitepaper
  36. Eagle's Harvard Biography