
🧾 Pandas Read/Write JSON Files – Import and Export Structured JSON Data


🧲 Introduction – Why Use JSON with Pandas?

JSON (JavaScript Object Notation) is a popular format for structured data interchange. It’s widely used in APIs, configurations, logs, and NoSQL systems. Pandas provides functions that read JSON directly into DataFrames and write DataFrames back out as JSON, making it easy to process and analyze semi-structured data.

🎯 In this guide, you’ll learn:

  • How to read and write JSON files using read_json() and to_json()
  • Supported orientations (records, split, index, columns, values, table)
  • Formatting options like indenting, line-delimited JSON, and compression
  • Real-world examples and API-compatible formats

📥 1. Reading a Simple JSON File

import pandas as pd

df = pd.read_json('data.json')
print(df.head())

✅ Assumes JSON is a list of records (dictionaries).

Example data.json content:

[
  {"Name": "Alice", "Age": 25},
  {"Name": "Bob", "Age": 30}
]

👉 Output:

    Name  Age
0  Alice   25
1    Bob   30

🔄 2. Supported Orientations in read_json()

Orientation  Structure Expected
records      List of dicts, one per row
split        Dict with index, columns, data
index        Dict of dicts; outer keys = row labels
columns      Dict of dicts; outer keys = column names → default for DataFrames
values       List of lists without labels
table        JSON Table Schema-compliant format

df = pd.read_json('data.json', orient='records')

✅ Always match orient with the structure of your JSON file.
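
To make the orientation table concrete, here is a minimal sketch that parses the same two-row table from three different layouts. The JSON strings are made-up sample data, and io.StringIO is used only so the example runs without files on disk.

```python
import io
import pandas as pd

# The same two-row table serialized in three different orientations (sample data).
records_json = '[{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 30}]'
split_json = (
    '{"columns": ["Name", "Age"], "index": [0, 1], '
    '"data": [["Alice", 25], ["Bob", 30]]}'
)
index_json = '{"0": {"Name": "Alice", "Age": 25}, "1": {"Name": "Bob", "Age": 30}}'

# StringIO wraps each string as a file-like object that read_json() can consume.
df_records = pd.read_json(io.StringIO(records_json), orient="records")
df_split = pd.read_json(io.StringIO(split_json), orient="split")
df_index = pd.read_json(io.StringIO(index_json), orient="index")

# Each layout should produce a frame with two rows and the columns Name and Age.
for name, frame in [("records", df_records), ("split", df_split), ("index", df_index)]:
    print(name, frame.shape, sorted(frame.columns))
```

If orient does not match the layout, pandas will typically either raise an error or build a frame with the wrong shape, so it is worth checking the shape and columns after loading.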


📤 3. Writing JSON with to_json()

df.to_json('output.json', orient='records', indent=2)

👉 Output (formatted):

[
  {
    "Name": "Alice",
    "Age": 25
  },
  {
    "Name": "Bob",
    "Age": 30
  }
]

indent makes it human-readable. Use orient to control structure.


🧾 4. JSON Output Orient Options

Orient   Description
records  List of dictionaries (best for APIs)
split    Dict with columns, index, and data keys
index    Nested dict with index labels as outer keys
columns  Nested dict with column names as outer keys (DataFrame default)
values   Pure 2D list format
table    JSON Table Schema format

df.to_json('out.json', orient='split')
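
One way to compare the orient options is to serialize a small frame with each of them. The sketch below assumes a sample two-row DataFrame and relies on the fact that to_json() returns the JSON text as a string when no path is given.

```python
import pandas as pd

# Sample frame matching the earlier examples.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# With no path argument, to_json() returns a string instead of writing a file.
orients = ["records", "split", "index", "columns", "values"]
outputs = {orient: df.to_json(orient=orient) for orient in orients}

for orient, text in outputs.items():
    print(f"{orient:8} {text}")
```

Printing the strings side by side makes it easier to pick the layout a downstream consumer expects before committing to a file format.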

🌐 5. Read JSON from a URL or API

df = pd.read_json('https://api.example.com/data.json')

✅ Works if the endpoint returns a valid JSON structure.
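
Many APIs wrap their rows in an envelope object rather than returning a bare list, in which case read_json() may not produce the table you want. The sketch below shows one possible approach using pd.json_normalize(). The response body, the "results" key, and the nested "address" field are all hypothetical stand-ins for whatever a real endpoint returns.

```python
import json
import pandas as pd

# Hypothetical response body; in practice this would be the text returned by the endpoint.
response_text = """
{
  "count": 2,
  "results": [
    {"Name": "Alice", "Age": 25, "address": {"city": "Paris"}},
    {"Name": "Bob", "Age": 30, "address": {"city": "Berlin"}}
  ]
}
"""

payload = json.loads(response_text)

# json_normalize flattens nested objects into dotted column names such as "address.city".
df_api = pd.json_normalize(payload["results"])
print(df_api)
```

Parsing with the json module first keeps the envelope metadata (here, the assumed "count" field) available for pagination or validation.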


🧪 6. Line-Delimited JSON (lines=True)

Used when each row is a separate JSON object on its own line.

File: data_lines.jsonl

{"Name": "Alice", "Age": 25}
{"Name": "Bob", "Age": 30}

df = pd.read_json('data_lines.jsonl', lines=True)
df.to_json('output_lines.jsonl', orient='records', lines=True)

✅ Ideal for streaming and log data, and for bulk APIs such as Elasticsearch's that accept newline-delimited JSON.
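
For files too large to load at once, line-delimited JSON can also be read in pieces. The following is a minimal sketch that assumes a small sample frame written to a temporary file; it uses the chunksize parameter of read_json(), which only works together with lines=True.

```python
import os
import tempfile
import pandas as pd

# Sample data standing in for a large log file.
df = pd.DataFrame({"Name": ["Alice", "Bob", "Carol"], "Age": [25, 30, 35]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data_lines.jsonl")
    df.to_json(path, orient="records", lines=True)

    # With chunksize set, read_json() returns an iterator of DataFrames,
    # each holding at most that many rows.
    chunk_sizes = []
    with pd.read_json(path, lines=True, chunksize=2) as reader:
        for chunk in reader:
            chunk_sizes.append(len(chunk))

print(chunk_sizes)
```

Each chunk is an ordinary DataFrame, so it can be filtered or aggregated before the next one is read.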


🗜️ 7. JSON Compression Support

df.to_json('data.json.gz', compression='gzip')
df = pd.read_json('data.json.gz', compression='gzip')

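✅ Supports gzip, bz2, zip, and xz, plus zstd and tar in recent Pandas versions. The default, compression='infer', picks the codec from the file extension.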
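
As a rough illustration of compression inference, the sketch below writes and reads a .gz file without passing compression at all. The sample frame and temporary path are assumptions made for the example.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json.gz")

    # No compression argument: the default 'infer' selects gzip from the .gz suffix.
    df.to_json(path, orient="records")

    # Check the first two bytes against the gzip magic number.
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"

    restored = pd.read_json(path, orient="records")

print(is_gzip)
print(restored)
```

Passing compression explicitly is still useful when the file name does not carry a recognizable extension.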


⚠️ 8. Handle Non-UTF Encodings

df = pd.read_json('data.json', encoding='ISO-8859-1')

✅ Useful when reading data from legacy systems.
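
To see the encoding parameter in action, this sketch writes a small JSON file as ISO-8859-1 and reads it back. The file name and the sample record are hypothetical; a real legacy export would already exist on disk.

```python
import os
import tempfile
import pandas as pd

# Hypothetical legacy export: JSON text stored as ISO-8859-1 rather than UTF-8.
raw = '[{"Name": "José", "City": "Málaga"}]'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "legacy.json")
    with open(path, "w", encoding="ISO-8859-1") as fh:
        fh.write(raw)

    # The encoding must match how the file was written, or accented
    # characters will fail to decode or come out garbled.
    df_legacy = pd.read_json(path, encoding="ISO-8859-1")

print(df_legacy)
```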


📌 Summary – Recap & Next Steps

With Pandas, reading and writing JSON is as straightforward as working with CSVs. Whether you’re processing structured logs, consuming API responses, or saving analysis output, read_json() and to_json() cover the common cases.

🔍 Key Takeaways:

  • Use read_json() to load JSON objects into DataFrames
  • Use orient to match the JSON structure with DataFrame format
  • lines=True handles newline-delimited JSON rows
  • Supports compression for smaller files and explicit encodings for compatibility with legacy data

⚙️ Real-world relevance: Perfect for API data pipelines, event logs, configuration exports, and cloud-native workflows.


❓ FAQs – Reading & Writing JSON in Pandas

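❓ What’s the default format expected by read_json()?
✅ For DataFrames, the default is orient='columns' (a dict keyed by column name). A top-level list of dictionaries (records), one per row, is also parsed correctly without passing orient.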

❓ What’s the difference between records and split orientation?

  • records: List of rows as dicts (row-wise)
  • split: Dict with keys: index, columns, data

❓ Can I write line-delimited JSON for log streaming?
✅ Yes. Use orient='records', lines=True.

❓ Is JSON faster than CSV for large files?
❌ Generally no. For flat tables, CSV files are usually smaller and faster to load. JSON is the better fit when the data is nested or must keep its structure.

❓ Can I export a subset of the DataFrame to JSON?
✅ Yes. Slice the DataFrame first:

df[['Name', 'Age']].to_json('people.json', orient='records')
