JSON and CSV Formats in Python: Reading, Writing, and Data Analysis
We've already covered the basics of working with files and text files in Python. Now we're moving on to structured data — formats that are used to store and exchange data in a structured way. 🧩
We'll focus on two of the most popular formats:
- JSON (JavaScript Object Notation) — a data exchange format widely used in web applications and APIs
- CSV (Comma-Separated Values) — a simple format for representing tabular data
Both formats are common across many areas of programming, from web development to data analysis.
JSON: JavaScript Object Notation
JSON (JavaScript Object Notation) is a text-based data exchange format whose structure closely resembles Python dictionaries and lists. It is easy for both humans and machines to read.
JSON Data Structure
JSON supports the following data types:
- Objects (dictionaries): {"name": "Alice", "age": 30}
- Arrays (lists): [1, 2, 3, 4]
- Strings: "Hello, world!"
- Numbers: 42, 3.14
- Boolean values: true, false
- null (corresponds to None in Python)
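To see how these types map onto Python objects, here is a minimal sketch that parses a small JSON document with json.loads and prints the Python type of each value. The sample document and its keys are made up for illustration.

```python
import json

# A hypothetical JSON document that uses every JSON data type
raw = (
    '{"name": "Alice", "age": 30, "height": 1.68, '
    '"tags": ["a", "b"], "active": true, "manager": null}'
)

data = json.loads(raw)

# Each JSON value arrives as the corresponding Python type:
# object -> dict, array -> list, string -> str,
# number -> int or float, true/false -> bool, null -> None
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```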
The json Module in Python
Python provides a built-in json module for working with this format:
    import json

    # Simple Python dictionary
    person = {
        "name": "Anna",
        "age": 28,
        "city": "Moscow",
        "languages": ["Python", "JavaScript"]
    }

    # Converting a Python dictionary to a JSON string
    json_string = json.dumps(person, ensure_ascii=False, indent=2)
    print("JSON string:")
    print(json_string)

Output:

    JSON string:
    {
      "name": "Anna",
      "age": 28,
      "city": "Moscow",
      "languages": [
        "Python",
        "JavaScript"
      ]
    }

Continuing the same script:

    # Converting a JSON string back to a Python object
    parsed_data = json.loads(json_string)
    print("\nConverted back to Python:")
    print(f"Name: {parsed_data['name']}")
    print(f"Age: {parsed_data['age']}")
    print(f"Programming languages: {', '.join(parsed_data['languages'])}")

Output:

    Converted back to Python:
    Name: Anna
    Age: 28
    Programming languages: Python, JavaScript
Writing JSON to a File and Reading from a File
Here's how to save data to a JSON file and then read it:
    import json

    # Student data
    students = [
        {"id": 1, "name": "Ivan", "scores": [85, 90, 78]},
        {"id": 2, "name": "Maria", "scores": [92, 88, 95]}
    ]

    # Writing to a file
    with open('students.json', 'w', encoding='utf-8') as file:
        json.dump(students, file, ensure_ascii=False, indent=2)

    print("Data written to students.json file")

Output:

    Data written to students.json file

Continuing the same script:

    # Reading from a file
    with open('students.json', 'r', encoding='utf-8') as file:
        loaded_students = json.load(file)

    print(f"\nLoaded {len(loaded_students)} students:")
    for student in loaded_students:
        avg_score = sum(student['scores']) / len(student['scores'])
        print(f"  {student['name']}: average score {avg_score:.1f}")

Output:

    Loaded 2 students:
      Ivan: average score 84.3
      Maria: average score 91.7
Main Methods of the json Module
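The examples above use the module's four main functions:
- json.dumps(obj): converts a Python object to a JSON string
- json.loads(s): parses a JSON string into a Python object
- json.dump(obj, file): writes a Python object to a file as JSON
- json.load(file): reads JSON from a file and returns a Python object
The ensure_ascii=False parameter writes non-ASCII characters (for example, Cyrillic) as-is instead of escaping them as \uXXXX sequences, and indent pretty-prints the output with the given number of spaces per nesting level.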
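A quick way to see what ensure_ascii and indent actually change is to serialize the same object with different settings. This is a small sketch; the dictionary and its Cyrillic value are illustrative and not taken from the examples above.

```python
import json

city = {"city": "Москва"}  # illustrative value with non-ASCII characters

escaped = json.dumps(city)                               # default: ensure_ascii=True
readable = json.dumps(city, ensure_ascii=False)          # characters kept as-is
pretty = json.dumps(city, ensure_ascii=False, indent=2)  # multi-line, 2-space indent

print(escaped)
print(readable)
print(pretty)

# Both forms decode to the same Python object
print(json.loads(escaped) == json.loads(readable))
```

The escaped form is still valid JSON; ensure_ascii=False mainly matters when people need to read or edit the file directly.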
CSV: Comma-Separated Values
CSV (Comma-Separated Values) is a simple text format for representing tabular data, where table rows are file lines, and columns are separated by commas (or other delimiters).
CSV looks something like this:
    Name,Age,City
    Anna,28,Moscow
    Ivan,35,Saint Petersburg
The csv Module in Python
Python provides a built-in csv module for working with this format:
    import csv

    # Data to write
    data = [
        ['Name', 'Age', 'City'],  # Headers
        ['Anna', '28', 'Moscow'],
        ['Ivan', '35', 'Saint Petersburg'],
        ['Maria', '22', 'Kazan']
    ]

    # Writing to a CSV file
    with open('people.csv', 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerows(data)

    print("Data written to people.csv file")

Output:

    Data written to people.csv file

Continuing the same script:

    # Reading from a CSV file
    with open('people.csv', 'r', encoding='utf-8') as file:
        reader = csv.reader(file)

        # Reading headers (first line)
        headers = next(reader)
        print(f"\nHeaders: {headers}")

        # Reading data
        print("\nData:")
        for row in reader:
            print(f"  {row[0]}, {row[1]} years old, city {row[2]}")

Output:

    Headers: ['Name', 'Age', 'City']

    Data:
      Anna, 28 years old, city Moscow
      Ivan, 35 years old, city Saint Petersburg
      Maria, 22 years old, city Kazan
Using DictReader and DictWriter
For more convenient work with CSV, you can use DictReader and DictWriter, which allow you to work with data as dictionaries:
    import csv

    # Writing to CSV using DictWriter
    data = [
        {'Name': 'Alex', 'Profession': 'Engineer', 'Salary': 85000},
        {'Name': 'Kate', 'Profession': 'Designer', 'Salary': 75000},
        {'Name': 'Sergey', 'Profession': 'Programmer', 'Salary': 110000}
    ]

    with open('employees.csv', 'w', newline='', encoding='utf-8') as file:
        fieldnames = ['Name', 'Profession', 'Salary']
        writer = csv.DictWriter(file, fieldnames=fieldnames)

        writer.writeheader()    # Writing headers
        writer.writerows(data)  # Writing data

    print("Employee data written to file")

Output:

    Employee data written to file

Continuing the same script:

    # Reading from CSV using DictReader
    with open('employees.csv', 'r', encoding='utf-8') as file:
        reader = csv.DictReader(file)

        print("\nEmployees:")
        for row in reader:
            print(f"  {row['Name']} - {row['Profession']}, salary: {row['Salary']} units")

Output:

    Employees:
      Alex - Engineer, salary: 85000 units
      Kate - Designer, salary: 75000 units
      Sergey - Programmer, salary: 110000 units
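One detail worth keeping in mind: CSV stores no type information, so DictReader returns every value as a string, even if it was written as a number. The sketch below illustrates this with an in-memory stand-in for employees.csv (built with io.StringIO so that no file is needed); the average calculation is an added illustration, not part of the original example.

```python
import csv
import io

# In-memory stand-in for employees.csv, so the sketch needs no file on disk
csv_text = (
    "Name,Profession,Salary\n"
    "Alex,Engineer,85000\n"
    "Kate,Designer,75000\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every value comes back as str, including Salary
print(type(rows[0]["Salary"]).__name__)

# Convert explicitly before doing arithmetic
salaries = [int(row["Salary"]) for row in rows]
average_salary = sum(salaries) / len(salaries)
print(f"Average salary: {average_salary:.0f} units")
```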
Main Features of Working with CSV
- Delimiters: Although CSV stands for "Comma-Separated Values", in practice other delimiters (semicolon, tab) can be used
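- Quotes: If a value contains the delimiter, a quote character, or a line break, the whole value is enclosed in double quotes
- Escaping: A quote character inside a quoted value is escaped; by default the csv module doubles it, so " is written as ""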
    import csv

    # Example with a different delimiter
    data = [
        ['Product', 'Price', 'In Stock'],
        ['Laptop', '45000', 'Yes'],
        ['Smartphone', '25000', 'No']
    ]

    # Writing using semicolon
    with open('products.csv', 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file, delimiter=';')
        writer.writerows(data)

    print("Data written with delimiter ';'")

Output:

    Data written with delimiter ';'

Continuing the same script:

    # Reading with the correct delimiter
    with open('products.csv', 'r', encoding='utf-8') as file:
        reader = csv.reader(file, delimiter=';')
        for row in reader:
            print(' '.join(row))

Output:

    Product Price In Stock
    Laptop 45000 Yes
    Smartphone 25000 No
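The quoting and escaping rules can be observed directly by writing one row to an in-memory buffer and looking at the raw text. This is a small sketch with made-up values; it relies on the csv module's default settings (comma delimiter, double-quote character, minimal quoting).

```python
import csv
import io

# Write to an in-memory buffer so the raw CSV text can be inspected
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Book "Python"', 'Moscow, Russia', 'plain'])

raw_line = buffer.getvalue()
print(raw_line)

# Reading the line back restores the original values
parsed = next(csv.reader(io.StringIO(raw_line)))
print(parsed)
```

Only the values that need it are quoted: the first because it contains quote characters (which are doubled), the second because it contains the delimiter. The third value is written unchanged.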
Practical Example: Sales Data Analysis
Let's consider an example where we first save sales data in CSV, then analyze it and save the results in JSON:
    import csv
    import json

    # Creating sales data
    sales = [
        ['Date', 'Product', 'Category', 'Price', 'Quantity'],
        ['2023-01-05', 'HP Laptop', 'Electronics', '45000', '2'],
        ['2023-01-10', 'Apple Smartphone', 'Electronics', '85000', '3'],
        ['2023-01-15', 'Book "Python"', 'Books', '1200', '5'],
        ['2023-02-10', 'Microwave', 'Home Appliances', '7000', '1']
    ]

    # Step 1: Save data to CSV
    with open('sales.csv', 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerows(sales)

    print("Sales data saved to CSV")

Output:

    Sales data saved to CSV

Continuing the same script:

    # Step 2: Read and analyze data
    with open('sales.csv', 'r', encoding='utf-8') as file:
        reader = csv.reader(file)
        headers = next(reader)  # Skip headers

        # Preparing variables for analysis
        total_revenue = 0
        sales_by_category = {}

        # Data analysis
        for row in reader:
            date, product, category, price, quantity = row
            revenue = float(price) * int(quantity)

            # Total revenue
            total_revenue += revenue

            # Revenue by category
            if category in sales_by_category:
                sales_by_category[category] += revenue
            else:
                sales_by_category[category] = revenue

    # Output analysis results
    print(f"\nTotal revenue: {total_revenue} units")
    print("\nRevenue by category:")
    for category, rev in sales_by_category.items():
        print(f"  {category}: {rev} units")

Output:

    Total revenue: 358000.0 units

    Revenue by category:
      Electronics: 345000.0 units
      Books: 6000.0 units
      Home Appliances: 7000.0 units

Continuing the same script:

    # Step 3: Save analysis results to JSON
    results = {
        "total_revenue": total_revenue,
        "sales_by_category": sales_by_category
    }

    with open('sales_analysis.json', 'w', encoding='utf-8') as file:
        json.dump(results, file, ensure_ascii=False, indent=2)

    print("\nAnalysis results saved to JSON")

Output:

    Analysis results saved to JSON

Continuing the same script:

    # Step 4: Check saved JSON
    with open('sales_analysis.json', 'r', encoding='utf-8') as file:
        saved_results = json.load(file)

    print("\nContents of the JSON file with results:")
    print(json.dumps(saved_results, ensure_ascii=False, indent=2))

Output:

    Contents of the JSON file with results:
    {
      "total_revenue": 358000.0,
      "sales_by_category": {
        "Electronics": 345000.0,
        "Books": 6000.0,
        "Home Appliances": 7000.0
      }
    }
In this example we:
- Created a CSV file with sales data
- Read the data and calculated revenue by category
- Saved the analysis results to a JSON file
- Read the saved JSON to ensure correctness
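The same analysis can also be written more compactly. The sketch below is an alternative, not the approach used in the example: it swaps csv.reader for csv.DictReader and the if/else accumulation for collections.defaultdict. The function name summarize_sales and the shortened in-memory sample are hypothetical and exist only for illustration.

```python
import csv
import io
import json
from collections import defaultdict


def summarize_sales(csv_file):
    """Return total revenue and revenue per category from a sales CSV."""
    revenue_by_category = defaultdict(float)
    for row in csv.DictReader(csv_file):
        # Columns are addressed by header name instead of position
        revenue = float(row["Price"]) * int(row["Quantity"])
        revenue_by_category[row["Category"]] += revenue
    return {
        "total_revenue": sum(revenue_by_category.values()),
        "sales_by_category": dict(revenue_by_category),
    }


# In-memory sample with the same columns as sales.csv (shortened for brevity)
sample = io.StringIO(
    "Date,Product,Category,Price,Quantity\n"
    "2023-01-05,HP Laptop,Electronics,45000,2\n"
    "2023-01-15,Python Book,Books,1200,5\n"
)

summary = summarize_sales(sample)
print(json.dumps(summary, ensure_ascii=False, indent=2))
```

Because the function accepts any file-like object, it could presumably be called on an opened sales.csv in the same way as on the in-memory sample.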
Understanding Check
Which code correctly reads data from a JSON file in Python?
