
🌐 R – Web Data Access (with Code Explanation)


🧲 Introduction – Fetching Data from the Web in R

Data is not always stored locally. It is often published through web APIs, plain URLs, or online datasets. R can fetch, read, and process web-based data in several formats, including plain text, CSV, JSON, XML, and responses from REST APIs.

🎯 In this guide, you’ll learn:

  • How to fetch data from web URLs
  • How to read files (CSV, JSON, XML) hosted online
  • How to access and parse REST APIs using httr and jsonlite
  • What each example does, explained line by line

🌍 1. Reading Web Files with Base R

✅ Reading CSV from a URL

url <- "https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv"
data <- read.csv(url)
head(data)

🔍 Explanation:

  • The url variable stores the CSV file’s web address.
  • read.csv() works directly with URLs just like local files.
  • head(data) shows the first few rows to confirm import.
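
Remote reads can fail in ways local reads rarely do, for example when the connection drops or the file has moved. As a minimal sketch, assuming you would rather get NULL and a message than an error, the call can be wrapped in tryCatch(). The helper name read_csv_safely and the sample rows are made up for illustration.

```r
# Hypothetical helper: read a CSV from a URL or path, returning NULL on failure.
read_csv_safely <- function(src) {
  tryCatch(
    read.csv(src, stringsAsFactors = FALSE),
    error = function(e) {
      message("Could not read ", src, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstrated on a temporary local file so the sketch runs offline;
# a web address such as the airtravel.csv URL above is passed the same way.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Month,Passengers", "JAN,340", "FEB,318"), tmp)  # made-up rows
air <- read_csv_safely(tmp)
head(air)
```

Callers can then test the result with is.null() before continuing.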

🔗 2. Reading Web-Based JSON Data

✅ Using jsonlite::fromJSON()

library(jsonlite)
json_url <- "https://api.github.com/repos/tidyverse/ggplot2"
data <- fromJSON(json_url)
data$name
data$owner$login

🔍 Explanation:

  • fromJSON() downloads and converts the API JSON response into a list or data frame.
  • Access elements with $ like regular R objects.
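
Whether fromJSON() returns a list or a data frame depends on the shape of the JSON. The sketch below illustrates this with two inline JSON strings, which are made up for the example, so it runs without a network connection.

```r
library(jsonlite)

# A JSON object becomes a named list.
repo <- fromJSON('{"name": "demo", "owner": {"login": "someone"}}')
repo$owner$login

# A JSON array of similar objects is simplified to a data frame.
langs <- fromJSON('[{"id": 1, "lang": "R"}, {"id": 2, "lang": "Python"}]')
class(langs)
nrow(langs)
```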

🔄 3. Fetching REST API Data with httr

✅ Install and load httr:

install.packages("httr")
library(httr)

✅ Example: GET Request to OpenWeather API (placeholder API key)

res <- GET("https://api.openweathermap.org/data/2.5/weather?q=London&appid=your_api_key")
# Store the result under a name that does not shadow httr's content() function
parsed <- content(res, as = "parsed")
parsed$weather[[1]]$main

🔍 Explanation:

  • GET() sends an HTTP GET request to the API.
  • content() extracts and parses the JSON response into R-readable format.
  • Nested data is accessed with $ and [[ ]].

✅ Replace "your_api_key" with a real API key.
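
In practice it helps to keep the key out of the script and to check the HTTP status before reading the body. The following is one possible sketch, not the only way to structure it. The function names fetch_weather and main_condition, the environment variable OPENWEATHER_API_KEY, and the hand-built sample response are all assumptions made for illustration.

```r
# Hypothetical helper: request current weather for a city and fail loudly
# on HTTP errors. Not called here because it needs a network and a real key.
fetch_weather <- function(city, api_key = Sys.getenv("OPENWEATHER_API_KEY")) {
  res <- httr::GET(
    "https://api.openweathermap.org/data/2.5/weather",
    query = list(q = city, appid = api_key)  # httr builds the query string
  )
  httr::stop_for_status(res)                 # turns 4xx/5xx into an R error
  httr::content(res, as = "parsed")
}

# Hypothetical helper: pull the main condition out of a parsed response,
# returning NA if the expected fields are missing.
main_condition <- function(parsed) {
  w <- parsed$weather
  if (length(w) == 0 || is.null(w[[1]]$main)) return(NA_character_)
  w[[1]]$main
}

# Hand-built stand-in with the same nesting as the example above.
sample_response <- list(weather = list(list(main = "Clouds")))
main_condition(sample_response)
```

With a real key set in the environment, main_condition(fetch_weather("London")) would chain the two steps.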


📦 4. Reading XML Data from a Web URL

library(xml2)
xml_url <- "https://www.w3schools.com/xml/note.xml"
doc <- read_xml(xml_url)
xml_text(xml_find_all(doc, ".//body"))

🔍 Explanation:

  • read_xml() fetches and parses XML from the given URL.
  • xml_find_all() selects every <body> element matched by the XPath expression ".//body".
  • xml_text() extracts the content.
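
The same functions work on XML held in a string, which is convenient for trying out XPath expressions offline. The sketch below uses a small made-up document shaped like a simple note. It also shows xml_find_first(), which returns a single node where xml_find_all() returns a node set.

```r
library(xml2)

# Inline stand-in document; read_xml() accepts literal XML as well as a URL.
doc <- read_xml("<note><to>Ana</to><from>Ben</from><body>See you Friday</body></note>")

# Text of the first <body> element, and the names of the root's children.
body_text <- xml_text(xml_find_first(doc, ".//body"))
child_names <- xml_name(xml_children(doc))

body_text
child_names
```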

📥 5. Downloading Web Files to Local System

download.file("https://people.sc.fsu.edu/~jburkardt/data/csv/hw_200.csv", 
              destfile = "hw_data.csv", 
              method = "auto")

🔍 Explanation:

  • Downloads the remote CSV to a local file named "hw_data.csv".
  • method = "auto" lets R pick the best available download method.
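
download.file() returns a status code invisibly (0 means success), and repeated runs of a script will fetch the file again each time. A minimal sketch that checks the status and skips the download when the file already exists is shown below. The helper name download_once and the sample rows are hypothetical, and the demonstration uses a file:// URL in the form used on Unix-alikes so that it runs without a network.

```r
# Hypothetical helper: download only if the file is not already present,
# and stop if download.file() reports a non-zero status.
download_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps binary files intact on Windows; harmless for CSV
    status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
    if (status != 0) stop("Download failed with status ", status)
  }
  invisible(destfile)
}

# Offline demonstration: copy a temporary file through a file:// URL.
# An https:// address is passed the same way.
src <- tempfile(fileext = ".csv")
writeLines(c("Index,Height,Weight", "1,65.8,113.0"), src)  # made-up rows
dest <- tempfile(fileext = ".csv")
download_once(paste0("file://", normalizePath(src)), dest)
file.exists(dest)
```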

🔐 6. APIs with Headers or Tokens

For APIs requiring authentication:

res <- GET("https://api.example.com/data",
           add_headers(Authorization = "Bearer YOUR_TOKEN"))

🔍 Explanation:

  • add_headers() lets you attach authentication or custom headers to the request.
  • Useful for OAuth tokens, API keys, etc.
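
Hard-coding a token in a script makes it easy to leak. One common alternative, sketched here under the assumption that the token lives in an environment variable, is to read it with Sys.getenv() and build the header from that. The names bearer_header, get_with_token, and MY_API_TOKEN are hypothetical.

```r
# Hypothetical helper: build the Authorization header from a token,
# refusing to continue if the token is empty.
bearer_header <- function(token = Sys.getenv("MY_API_TOKEN")) {
  if (!nzchar(token)) stop("No API token found; set MY_API_TOKEN first.")
  c(Authorization = paste("Bearer", token))
}

# Hypothetical helper: authenticated GET. Not called here because
# api.example.com is a placeholder host.
get_with_token <- function(url, token = Sys.getenv("MY_API_TOKEN")) {
  httr::GET(url, httr::add_headers(.headers = bearer_header(token)))
}

bearer_header("abc123")
```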

🧠 Tips for Working with Web Data

Task                        Tip/Function
Check URL validity          Use httr::status_code()
Convert JSON → Data Frame   jsonlite::fromJSON(flatten = TRUE)
Parse nested lists          Use str() to explore structure
Handle rate-limited APIs    Use Sys.sleep() between requests
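
The Sys.sleep() tip can be sketched as a small loop that pauses between requests. This is an illustration under stated assumptions, not a complete rate-limit strategy: the helper name fetch_all is hypothetical, and the fetch step is passed in as a function so the example can run offline with a stand-in.

```r
# Hypothetical helper: call `fetch` on each URL, pausing between calls.
# `fetch` is any function of one URL, e.g. function(u) httr::GET(u).
fetch_all <- function(urls, fetch, pause = 1) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    results[i] <- list(fetch(urls[[i]]))
    if (i < length(urls)) Sys.sleep(pause)  # no need to wait after the last one
  }
  names(results) <- urls
  results
}

# Offline demonstration with a stand-in fetch function and no pause.
pages <- fetch_all(c("https://api.example.com/a", "https://api.example.com/b"),
                   fetch = function(u) nchar(u), pause = 0)
str(pages)
```

With a real API, pause would be chosen from the provider's documented rate limit.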

📌 Summary – Recap & Next Steps

R can read web-based resources directly, from static CSV files to dynamic REST APIs. With the functions covered here, external data can be pulled into an R workflow at the moment it is needed.

🔍 Key Takeaways:

  • Use read.csv() or read_xml() for static web files
  • Use jsonlite::fromJSON() for reading JSON APIs
  • Use httr::GET() and content() for full API control
  • Use download.file() to save remote data locally
  • Add headers/tokens for authenticated API access

⚙️ Real-World Relevance:
Used in web scraping, public data portals (e.g., World Bank, GitHub, NASA APIs), financial and weather services, dashboards, and automated reporting.


❓ FAQs – Web Data in R

❓ Can I read a CSV file from a URL directly in R?
✅ Yes:

data <- read.csv("https://example.com/data.csv")

❓ How do I access an API that returns JSON?
✅ Use:

fromJSON("https://api.example.com/endpoint")

❓ What if the API requires authentication?
✅ Pass the token in a request header with httr::add_headers():

GET("url", add_headers(Authorization = "Bearer TOKEN"))

❓ How do I avoid downloading the file multiple times?
✅ Use file.exists() to check before downloading:

if (!file.exists("data.csv")) download.file(url, "data.csv")

❓ How can I check the structure of nested JSON?
✅ Use:

str(fromJSON("file.json"))
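
When records contain nested objects, str() shows them as a data frame inside a data frame column. Passing flatten = TRUE turns those into ordinary columns. The sketch below uses a made-up inline JSON string to compare the two results.

```r
library(jsonlite)

# Made-up JSON: each record carries a nested "owner" object.
txt <- '[{"name": "a", "owner": {"login": "x"}},
         {"name": "b", "owner": {"login": "y"}}]'

nested <- fromJSON(txt)                  # "owner" is a data-frame column
flat   <- fromJSON(txt, flatten = TRUE)  # "owner.login" is an ordinary column

str(nested)
names(flat)
```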
