Commit cd078ef

Add files via upload
1 parent 98523c3 commit cd078ef

3 files changed, with 165 additions and 90 deletions

‎README.md

Lines changed: 94 additions & 90 deletions
@@ -1,90 +1,94 @@
-# Visualizing-Sales-Data-with-NumPy-and-Matplotlib
-# 📊 Sales Data Analysis with NumPy and Matplotlib
-
-This project is a beginner-to-intermediate level data analysis of sales data using **pandas**, **NumPy**, and **Matplotlib**. It demonstrates how to read, clean, analyze, and visualize sales information from a CSV file.
-
----
-
-## 🧠 Objectives
-
-- Convert raw sales data into useful insights.
-- Calculate total revenue per product.
-- Use NumPy for array manipulation and slicing.
-- Visualize results with a colorful, labeled bar chart.
-
----
-
-## 🗂️ Dataset Description
-
-The dataset contains the following columns:
-
-- **Product**: The name of the product.
-- **Quantity**: Units sold.
-- **Price**: Unit price in dollars.
-- **Date**: Date of sale.
-
----
-
-## 🧮 Analysis Steps
-
-1. **Read CSV Data** using `pandas`.
-2. **Convert Columns to Numeric** types with error handling.
-3. **Calculate Revenue** per row (Price × Quantity).
-4. **Convert DataFrame to NumPy Array** for slicing and filtering.
-5. **Extract Unique Products** and compute:
-   - Total revenue per product.
-   - Percentage share of total revenue.
-6. **Visualize the Results** using `Matplotlib`:
-   - Each product is assigned a unique color.
-   - Products are displayed as numbered bars.
-   - A dynamic legend explains which number corresponds to which product.
-
----
-
-## 📈 Output Example
-
-![Bar Chart](revenue_chart.png) <!-- You can upload and link your actual chart -->
-
----
-
-## 🛠️ Technologies Used
-
-- Python
-- pandas
-- NumPy
-- Matplotlib
-
----
-
-## 💡 What You Will Learn
-
-- Data cleaning with `pandas`
-- NumPy slicing and boolean masking
-- Revenue calculation by category
-- Building clear, colorful visualizations
-- Working with legends and layout in `Matplotlib`
-
----
-
-## 🚀 Future Improvements
-
-- Group data by date and analyze revenue trends over time.
-- Add Seaborn or Plotly for interactive visualizations.
-- Build a simple dashboard using Streamlit.
-
----
-
-## 📬 Contact
-
-If you like this project or have questions, feel free to connect:
-
-- GitHub: [DataFalcon 🦅]
-
-- Email: [tammahakki700@gmail.com]
-
----
-
-## 🔖 License
-
-This project is open-source and available under the [MIT License](LICENSE).
-
+# 📊 Sales Data Analysis with NumPy and Matplotlib
+
+This project is a beginner-to-intermediate level data analysis of sales data using **pandas**, **NumPy**, and **Matplotlib**. It demonstrates how to read, clean, analyze, and visualize sales information from a CSV file.
+
+---
+
+## 📂 Files
+- `sales2.csv`: The dataset.
+- `sales2.py`: Main Python script that processes and visualizes the data.
+- `revenue_chart.png`: Output chart showing revenue per product.
+
+## 🧠 Objectives
+
+- Convert raw sales data into useful insights.
+- Calculate total revenue per product.
+- Use NumPy for array manipulation and slicing.
+- Visualize results with a colorful, labeled bar chart.
+
+---
+
+## 🗂️ Dataset Description
+
+The dataset contains the following columns:
+
+- **Product**: The name of the product.
+- **Quantity**: Units sold.
+- **Price**: Unit price in dollars.
+- **Date**: Date of sale.
+
+---
+
+## 🧮 Analysis Steps
+
+1. **Read CSV Data** using `pandas`.
+2. **Convert Columns to Numeric** types with error handling.
+3. **Calculate Revenue** per row (Price × Quantity).
+4. **Convert DataFrame to NumPy Array** for slicing and filtering.
+5. **Extract Unique Products** and compute:
+   - Total revenue per product.
+   - Percentage share of total revenue.
+6. **Visualize the Results** using `Matplotlib`:
+   - Each product is assigned a unique color.
+   - Products are displayed as numbered bars.
+   - A dynamic legend explains which number corresponds to which product.
+
+---
+
+## 📈 Output Example
+
+![Bar Chart](revenue_chart.png) <!-- You can upload and link your actual chart -->
+
+---
+
+## 🛠️ Technologies Used
+
+- Python
+- pandas
+- NumPy
+- Matplotlib
+
+---
+
+## 💡 What You Will Learn
+
+- Data cleaning with `pandas`
+- NumPy slicing and boolean masking
+- Revenue calculation by category
+- Building clear, colorful visualizations
+- Working with legends and layout in `Matplotlib`
+
+---
+
+## 🚀 Future Improvements
+
+- Group data by date and analyze revenue trends over time.
+- Add Seaborn or Plotly for interactive visualizations.
+- Build a simple dashboard using Streamlit.
+
+---
+
+## 📬 Contact
+
+If you like this project or have questions, feel free to connect:
+
+- GitHub: [DataFalcon 🦅]
+
+- Email: [tammahakki700@gmail.com]
+
+---
+
+## 🔖 License
+
+This project is open-source and available under the [MIT License](LICENSE).
+
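
Note on the Analysis Steps in the README: steps 2 to 5 can be sketched end to end on a few inline rows. This is a minimal sketch, not the code from this commit. The sample rows and the unparseable `"n/a"` quantity are hypothetical, chosen to show what `errors="coerce"` does before the boolean-mask aggregation; `clean`, `grand_total`, `revenue_by_product` and `shares` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the "n/a" quantity is invented to show coercion.
sale = pd.DataFrame({
    "Product": ["Laptop", "Phone", "Laptop", "Phone"],
    "Quantity": ["2", "1", "n/a", "3"],
    "Price": [750, 500, 750, 500],
})

# errors="coerce" turns unparseable values into NaN instead of raising.
sale["Quantity"] = pd.to_numeric(sale["Quantity"], errors="coerce")
sale["Price"] = pd.to_numeric(sale["Price"], errors="coerce")

# Revenue per row; a NaN quantity yields a NaN revenue.
sale["total"] = sale["Price"] * sale["Quantity"]

# Drop rows that could not be parsed before aggregating.
clean = sale.dropna(subset=["total"])
data = clean[["Product", "total"]].to_numpy()

grand_total = clean["total"].sum()
revenue_by_product = {}
for product in np.unique(data[:, 0]):
    mask = data[:, 0] == product  # boolean mask selecting this product's rows
    revenue_by_product[product] = float(data[mask, 1].astype(float).sum())

# Percentage share of the grand total for each product.
shares = {p: r / grand_total * 100 for p, r in revenue_by_product.items()}
print(revenue_by_product)
print(shares)
```

Dropping the coerced rows is one possible design choice; without it, a single NaN would propagate into the grand total and every percentage.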

‎sales2.csv

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+Product,Quantity,Price,Date
+Laptop,2,750,2024-06-01
+Headphones,5,50,2024-06-02
+Phone,1,500,2024-06-03
+Laptop,1,750,2024-06-03
+Phone,3,500,2024-06-04
+Smartwatch,4,200,2024-06-05
+Laptop,1,800,2024-06-06
+Phone,2,550,2024-06-07
+Tablet,2,400,2024-06-08
+Smartwatch,1,220,2024-06-09
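
Note on the `Date` column: `sales2.py` loads it but does not use it. As a hedged illustration of the "group data by date" idea from the README's Future Improvements, the rows above could be aggregated per day roughly as follows. The sketch assumes ISO-formatted dates (`YYYY-MM-DD`) and inlines the rows instead of reading the file; `csv_text` and `daily` are illustrative names, not part of the commit.

```python
import io

import pandas as pd

# Rows inlined from sales2.csv, assuming ISO-formatted dates.
csv_text = """Product,Quantity,Price,Date
Laptop,2,750,2024-06-01
Headphones,5,50,2024-06-02
Phone,1,500,2024-06-03
Laptop,1,750,2024-06-03
Phone,3,500,2024-06-04
Smartwatch,4,200,2024-06-05
Laptop,1,800,2024-06-06
Phone,2,550,2024-06-07
Tablet,2,400,2024-06-08
Smartwatch,1,220,2024-06-09
"""

# parse_dates converts the Date column to datetime64 values.
sale = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
sale["total"] = sale["Price"] * sale["Quantity"]

# Sum the row revenue for each calendar day.
daily = sale.groupby("Date")["total"].sum()
print(daily)
```

Two sales fall on 2024-06-03, so the ten rows collapse into nine daily totals.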

‎sales2.py

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+import pandas as pd
+import numpy as np
+import matplotlib.pyplot as plt
+
+# Read data
+sale = pd.read_csv("sales2.csv")
+print(sale)
+# Make sure that Price and Quantity are numeric; invalid entries become NaN.
+sale["Price"] = pd.to_numeric(sale["Price"], errors="coerce")
+sale["Quantity"] = pd.to_numeric(sale["Quantity"], errors="coerce")
+# Calculate the revenue per row
+sale["total"] = sale["Price"] * sale["Quantity"]
+print(sale["total"])
+# Convert the "Product" and "total" columns of the DataFrame into a NumPy array.
+data = sale[["Product", "total"]].to_numpy()
+
+just_5 = data[0:5, :]  # First 5 rows, all columns.
+just_product = data[:, 0]  # First column: the product names.
+just_revenue = data[:, 1]  # Second column: the "total" revenue values.
+revenue_just_last_3 = data[7:, 1]  # Revenue values from the 8th row to the end.
+
+print(f"first five lines :\n{just_5}\njust products :\n{just_product}\njust revenue : {just_revenue}\nlast three products revenue : {revenue_just_last_3}\n")
+# Calculate the total revenue
+
+total = np.sum(just_revenue)
+print(f"Total revenues : {total}")
+# Create a product list without duplicates
+
+unique = np.unique(data[:, 0])
+print(unique)
+# Lists that collect the per-product results
+
+products = []
+revenues = []
+# For each unique product: sum its revenue, print it with its share of the total, and store the result.
+
+for product in unique:
+    mask = data[:, 0] == product
+    rows = data[mask][:, 1].astype(float)
+    revenue = rows.sum()
+    print(f"{product} : {revenue:.2f} , {(revenue / total) * 100:.2f}%")
+    products.append(product)
+    revenues.append(revenue)
+# Pick a distinct color for each bar from the tab20 colormap
+colors = plt.cm.tab20(np.linspace(0, 1, len(products)))
+# Label the bars with numbers instead of product names
+x_labels = list(range(1, len(products) + 1))
+
+plt.figure(figsize=(10, 6))
+plt.bar(range(len(products)), revenues, color=colors)
+plt.title("Total Revenue per Product")
+plt.xlabel("Product")
+plt.ylabel("Revenue")
+plt.xticks(ticks=range(len(products)), labels=x_labels)
+legend_text = "\n".join([f"{i + 1} = {product}" for i, product in enumerate(products)])
+plt.figtext(0.83, 0.8, legend_text, fontsize=10, va="center")
+plt.subplots_adjust(right=0.8)
+plt.savefig("revenue_chart.png", bbox_inches="tight", dpi=300)  # PNG so the README image link resolves
+plt.show()
+
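
Note on the per-product loop: the same totals can also be computed without an explicit Python loop. The sketch below is an alternative, not the approach taken in this commit. It uses `np.unique(..., return_inverse=True)` together with `np.bincount(..., weights=...)`, with the rows of `sales2.csv` hard-coded as arrays so the example is self-contained.

```python
import numpy as np

# Rows mirror sales2.csv (Product, Quantity, Price), hard-coded for the sketch.
products = np.array(["Laptop", "Headphones", "Phone", "Laptop", "Phone",
                     "Smartwatch", "Laptop", "Phone", "Tablet", "Smartwatch"])
quantity = np.array([2, 5, 1, 1, 3, 4, 1, 2, 2, 1])
price = np.array([750, 50, 500, 750, 500, 200, 800, 550, 400, 220])

# Revenue per row.
totals = (quantity * price).astype(float)

# unique_products is sorted; inverse maps each row to its product's index.
unique_products, inverse = np.unique(products, return_inverse=True)

# bincount with weights sums the row totals that share the same index.
revenues = np.bincount(inverse, weights=totals)
shares = revenues / revenues.sum() * 100

for name, revenue, share in zip(unique_products, revenues, shares):
    print(f"{name} : {revenue:.2f} , {share:.2f}%")
```

For the rows in this commit this gives 250 for Headphones, 3050 for Laptop, 3100 for Phone, 1020 for Smartwatch and 800 for Tablet, 8220 in total.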
