R – Web Data Access (with Code Explanation)


Introduction – Fetching Data from the Web in R

Data is not always stored locally. It is often published as files at web URLs or served through web APIs. R can fetch, read, and process this data directly, whether it is plain text, CSV, JSON, XML, or the response of a REST API.

In this guide, you’ll learn how to:

  • Fetch data from web URLs
  • Read files (CSV, JSON, XML) hosted online
  • Access and parse REST APIs using httr and jsonlite

Each example is followed by a line-by-line explanation.

1. Reading Web Files with Base R

Reading CSV from a URL

url <- "https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv"
data <- read.csv(url)
head(data)

Explanation:

  • The url variable stores the CSV file’s web address.
  • read.csv() works directly with URLs just like local files.
  • head(data) shows the first few rows to confirm import.
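
A remote file can be moved or temporarily unreachable, so a script that reads from a URL usually benefits from some error handling. The following is a minimal sketch, not part of the original example: read_csv_safely() is a hypothetical helper name, and the sample data is made up. It returns NULL instead of stopping the script when the read fails.

```r
# Hypothetical helper: read a CSV from a URL or path, returning NULL on failure.
read_csv_safely <- function(source, ...) {
  tryCatch(
    read.csv(source, ...),
    error = function(e) {
      message("Could not read '", source, "': ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstrated offline with a temporary local file (made-up data);
# a URL string is passed in exactly the same way.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,10", "2,20"), tmp)

ok <- read_csv_safely(tmp)
absent <- suppressWarnings(
  read_csv_safely(file.path(tempdir(), "no_such_file.csv"))
)

print(ok)
print(is.null(absent))
```

Returning NULL keeps a long-running script alive, but the caller then has to check for it with is.null() before using the result.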

2. Reading Web-Based JSON Data

Using jsonlite::fromJSON()

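library(jsonlite)
# The ggplot2 repository now lives under the tidyverse organization on GitHub
json_url <- "https://api.github.com/repos/tidyverse/ggplot2"
repo <- fromJSON(json_url)   # a JSON object becomes a named list
repo$name                    # top-level field
repo$owner$login             # nested field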

Explanation:

  • fromJSON() downloads the JSON response and converts it into R objects: a JSON object becomes a named list, and an array of objects becomes a data frame.
  • Access elements with $, as with any named list.
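
Whether fromJSON() returns a list or a data frame depends on the shape of the JSON. The sketch below illustrates this with inline JSON strings so that it runs without a network connection; the strings are made up for illustration and are not real API output. fromJSON() accepts a JSON string, a file path, or a URL in the same argument.

```r
library(jsonlite)

# Illustrative JSON strings (made up for this sketch, not real API output).
repo_json <- '{"name": "example-repo", "owner": {"login": "example-user"}}'
issues_json <- '[
  {"id": 1, "title": "First issue",  "user": {"login": "alice"}},
  {"id": 2, "title": "Second issue", "user": {"login": "bob"}}
]'

# A single JSON object becomes a named list.
repo <- fromJSON(repo_json)
print(class(repo))
print(repo$owner$login)

# An array of objects becomes a data frame; flatten = TRUE turns the
# nested "user" object into ordinary columns such as user.login.
issues <- fromJSON(issues_json, flatten = TRUE)
print(class(issues))
print(names(issues))
```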

3. Fetching REST API Data with httr

Install and load httr:

install.packages("httr")
library(httr)

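Example: GET Request to the OpenWeather API (placeholder API key)

res <- GET("https://api.openweathermap.org/data/2.5/weather?q=London&appid=your_api_key")
stop_for_status(res)               # raise an R error on a 4xx/5xx response
parsed <- content(res, "parsed")   # named `parsed` so it does not share a name with content()
parsed$weather[[1]]$main

Explanation:

  • GET() sends an HTTP GET request to the API.
  • stop_for_status() turns an HTTP error status into an R error, so a failed request is not parsed by mistake.
  • content() extracts the response body and parses the JSON into a nested R list.
  • Nested data is accessed with $ and [[ ]].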

Replace "your_api_key" with a real API key.
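
Instead of pasting parameters into the URL by hand, httr can build the query string from a named list, which also takes care of URL encoding. The following is a sketch under stated assumptions: get_weather() is a hypothetical helper, and the environment variable name OPENWEATHER_API_KEY is an assumption chosen for this example, not something the API prescribes.

```r
library(httr)

# Hypothetical helper: fetch current weather for a city.
# Assumes the API key is stored in an environment variable named
# OPENWEATHER_API_KEY (the name is an assumption made for this sketch).
get_weather <- function(city, api_key = Sys.getenv("OPENWEATHER_API_KEY")) {
  res <- GET(
    "https://api.openweathermap.org/data/2.5/weather",
    query = list(q = city, appid = api_key)  # httr URL-encodes these values
  )
  stop_for_status(res)
  content(res, as = "parsed", type = "application/json")
}

# modify_url() shows the URL that such a query list produces,
# without sending a request.
request_url <- modify_url(
  "https://api.openweathermap.org/data/2.5/weather",
  query = list(q = "London", appid = "your_api_key")
)
print(request_url)

# With a real key set, the call would look like:
# weather <- get_weather("London")
# weather$weather[[1]]$main
```

Keeping the key in an environment variable means it does not end up in the script or in version control.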


4. Reading XML Data from a Web URL

library(xml2)
xml_url <- "https://www.w3schools.com/xml/note.xml"
doc <- read_xml(xml_url)
xml_text(xml_find_all(doc, ".//body"))

Explanation:

  • read_xml() fetches and parses XML from the given URL.
  • xml_find_all() returns every node that matches the XPath expression, here the <body> element.
  • xml_text() extracts the content.
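
The same functions work on XML held in a string, which makes it easy to experiment offline. The sketch below uses a small made-up note document (its element names and values are assumptions for illustration) and collects every child element into a named character vector.

```r
library(xml2)

# Inline XML, made up for this sketch; read_xml() also accepts a URL or file path.
note_xml <- "<note>
  <to>Alice</to>
  <from>Bob</from>
  <heading>Reminder</heading>
  <body>Meeting at noon on Friday.</body>
</note>"

doc <- read_xml(note_xml)

# Collect all child elements of the root as name = text pairs.
fields <- xml_children(doc)
note <- setNames(xml_text(fields), xml_name(fields))
print(note)

# xml_find_first() returns only the first match instead of a node set.
print(xml_text(xml_find_first(doc, ".//to")))
```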

5. Downloading Web Files to Local System

download.file("https://people.sc.fsu.edu/~jburkardt/data/csv/hw_200.csv", 
              destfile = "hw_data.csv", 
              method = "auto")

Explanation:

  • Downloads the remote CSV to a local file named "hw_data.csv".
  • method = "auto" lets R pick the best available download method.
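
download.file() invisibly returns 0 on success, and for binary files such as ZIP or Excel files it is safer to pass mode = "wb" so that bytes are written unchanged on Windows. The following sketch wraps both points in a hypothetical helper, download_checked(). To run offline, it uses a file:// URL that points at a temporary file as a stand-in for a remote address; the path handling assumes a Unix-like system.

```r
# Hypothetical helper: download a file and confirm that it arrived.
download_checked <- function(url, destfile) {
  # mode = "wb" writes bytes unchanged, which matters for binary files on Windows
  status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  if (status != 0 || !file.exists(destfile)) {
    stop("Download failed for ", url)
  }
  invisible(destfile)
}

# Offline demonstration: a file:// URL pointing at a temporary file
# (made-up data) stands in for a remote address.
src <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,10", "2,20"), src)
src_url <- paste0("file://", normalizePath(src, winslash = "/"))

dest <- tempfile(fileext = ".csv")
download_checked(src_url, dest)
print(read.csv(dest))
```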

6. APIs with Headers or Tokens

For APIs requiring authentication:

res <- GET("https://api.example.com/data",
           add_headers(Authorization = "Bearer YOUR_TOKEN"))

Explanation:

  • add_headers() lets you attach authentication or custom headers to the request.
  • Useful for OAuth tokens, API keys, etc.
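
Hard-coding a token in a script makes it easy to leak. A common alternative is to read it from an environment variable, sketched below. bearer_auth() is a hypothetical helper, and the variable name API_TOKEN is an assumption made for this example.

```r
library(httr)

# Hypothetical helper: build an Authorization header from a token.
# Assumes the token is stored in an environment variable named API_TOKEN
# (the variable name is an assumption made for this sketch).
bearer_auth <- function(token = Sys.getenv("API_TOKEN")) {
  if (!nzchar(token)) {
    stop("No token found; set the API_TOKEN environment variable.")
  }
  add_headers(Authorization = paste("Bearer", token))
}

# Inspect the header that would be attached, without sending a request.
auth <- bearer_auth("example-token")
print(auth$headers)

# With a real endpoint and token, the request would look like:
# res <- GET("https://api.example.com/data", bearer_auth())
# stop_for_status(res)
```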

Tips for Working with Web Data

Task                         | Tip/Function
Check that a URL responds    | Use httr::status_code() on the response (200 means success)
Convert JSON → Data Frame    | jsonlite::fromJSON(url, flatten = TRUE)
Parse nested lists           | Use str() to explore structure
Handle rate-limited APIs     | Use Sys.sleep() between requests
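
The rate-limiting tip can be sketched as a small loop that pauses between requests. fetch_all() is a hypothetical helper that takes the fetching function as an argument, so the demonstration below can use a stand-in function and run without a network connection. The URLs are placeholders.

```r
# Hypothetical helper: call `fetch` on each URL, pausing between requests.
fetch_all <- function(urls, fetch, pause = 1) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    results[i] <- list(fetch(urls[[i]]))    # list() keeps NULL results in place
    if (i < length(urls)) Sys.sleep(pause)  # no pause needed after the last request
  }
  names(results) <- urls
  results
}

# Offline demonstration with a stand-in fetch function; in practice `fetch`
# could be something like function(u) jsonlite::fromJSON(u).
fake_fetch <- function(u) nchar(u)
urls <- c("https://api.example.com/a", "https://api.example.com/bb")
pages <- fetch_all(urls, fetch = fake_fetch, pause = 0.1)
str(pages)
```

The right pause length depends on the limits the API publishes, so it is left as a parameter here.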

Summary – Recap & Next Steps

R can read web-based resources directly, from static CSV files to dynamic REST APIs, so external data can be pulled into an R workflow without manual downloads.

Key Takeaways:

  • Use read.csv() or read_xml() for static web files
  • Use jsonlite::fromJSON() for reading JSON APIs
  • Use httr::GET() and content() for full API control
  • Use download.file() to save remote data locally
  • Add headers/tokens for authenticated API access

Real-World Relevance:
These techniques are used in web scraping, with public data portals (e.g., World Bank, GitHub, NASA APIs), with financial and weather services, and in dashboards and automated reporting.


FAQs – Web Data in R

Can I read a CSV file from a URL directly in R?
Yes:

data <- read.csv("https://example.com/data.csv")

How do I access an API that returns JSON?
Use:

fromJSON("https://api.example.com/endpoint")

What if the API requires authentication?
Pass the token in a request header using httr::add_headers():

GET("url", add_headers(Authorization = "Bearer TOKEN"))

How do I avoid downloading the file multiple times?
Use file.exists() to check before downloading:

if (!file.exists("data.csv")) download.file(url, "data.csv")

How can I check the structure of nested JSON?
Use:

str(fromJSON("file.json"))
