Julia Silge

Born June 10, 1978
Website juliasilge.com

Julia Silge (born June 10, 1978) is an American data scientist and software engineer. She has developed tools for statistical modelling in the R programming language, [1] [2] including the text mining package tidytext. [3] Silge currently works for Posit, formerly known as RStudio. [1]

Education and career

Silge studied physics at Texas A&M University, graduating in 2000. She obtained her M.A. (2002) and PhD (2005) in astronomy from the University of Texas at Austin. [2] [4] She was an adjunct professor at the University of New Haven and Quinnipiac University from 2006 to 2008. [citation needed]

Silge has worked as a data scientist for several companies, most recently Stack Overflow and RStudio. [1] [2] At Stack Overflow, she researched the popularity of different programming languages [5] and skills for technologists. [6] She also began working on tidytext, an R package for text mining, with colleague David Robinson. Their book Text Mining with R: A Tidy Approach (2017) drew on examples of text analysis including Jane Austen novels, [7] popular songs, [8] NASA metadata, and Twitter archives. [9]

In February 2017, Silge made the news when, after failing to reach her senator, Orrin Hatch, by phone, she used a note attached to a pizza delivery to contact him and object to the nomination of Betsy DeVos as Secretary of Education. [10] [11]

Selected publications

References

  1. "About RStudio". RStudio. Archived from the original on February 20, 2021. Retrieved November 2, 2020.
  2. "Stack Overflow profile of Julia Silge". Stack Overflow. Retrieved March 3, 2021.
  3. Tache, Nicole (July 26, 2017). "R's tidytext turns messy text into valuable insight". O'Reilly Media. Retrieved March 22, 2018.
  4. "Julia Silge Resume". Julia Silge. Retrieved February 4, 2018.
  5. "Which programming languages earn you the most money? Use this calculator to check". ZDNet. Retrieved February 23, 2018.
  6. "These are the 10 skills to learn if you want to advance in a career in tech". Business Insider. Retrieved February 23, 2018.
  7. "The Life Changing Magic of Tidying Text". Julia Silge. Retrieved February 23, 2018.
  8. "The states that Americans sing about most". The Washington Post. Retrieved February 23, 2018.
  9. "R's tidytext turns messy text into valuable insight". The Washington Post. July 26, 2017. Retrieved March 31, 2018.
  10. "She had something to say about Betsy DeVos. So she sent her senator a pizza — with a message". The Washington Post. Retrieved February 4, 2018.
  11. Castrodale, Jelisa (February 6, 2017). "Senator's Voicemail Was Full, So Concerned Woman Sent Pizza to Protest DeVos". Vice. Retrieved November 2, 2020.