In this project, we will analyze data for a telecom company called Techconnect. At the end of this project, you will have a fair understanding of how to approach data analysis when working with a large database.
Why Use SQL?
Before we dive into the project, let’s briefly discuss why we use SQL for data analysis. While Python is a reliable tool for most data analysis tasks, it can encounter inefficiencies when dealing with large datasets containing millions of rows. This is where SQL shines. SQL was designed to analyze massive amounts of data efficiently, allowing us to get insights from large databases in seconds.
We will use Microsoft SQL Server to run all our SQL code for this project. Connect to SQL Server using the following information:
- Server Name: dap-projects-database.database.windows.net
- Username: hackathon
- Password: ———————
- Database Name: dapDB
You will find a dapDB database with 3 tables named customers, location, and current_offer_balance.
Let me introduce you to TechConnect, a mobile services provider offering voice, data, SMS, and various value-added services to its vast customer base. To stay competitive, TechConnect relies on data-driven decision-making and customer-centric strategies. Our task as data analysts is to transform raw data into actionable insights that will shape the company’s decisions, product offerings, and marketing campaigns.
- Table: dbo.current_officers_balance
The “current_officers_balance” table was the backbone of TechConnect’s data infrastructure. It stored detailed information about each customer’s account and usage patterns. The table held records of customer IDs, call durations, data consumption, messaging activities, revenue details, and various other metrics. This data allowed TechConnect to gain insights into customer behavior, identify potential upselling opportunities, and provide personalized services.
- Table: dbo.customers
The “customers” table contained a comprehensive list of all TechConnect’s customers. It included activation dates, tenure information, and each customer’s spending habits. This data helped the company understand the customer lifecycle, and loyalty trends and tailor marketing strategies based on customer preferences and behavior.
- Table: dbo.location
The “location” table held vital information about the geographical distribution of customers and the network infrastructure. It stored site IDs, regions, and cell types associated with specific cellular towers and base stations.
This data will enable TechConnect to optimize network coverage, identify potential network expansion areas, and assess regional performance.
TechConnect’s data analysts and business intelligence teams (You
) will work diligently to transform raw data into actionable insights. The insights generated from the database will play a pivotal role in shaping the company’s decisions, product offerings, and marketing campaigns.
The company is still in its early stages, so even though we are making money and we have achieved product-market-fit (creating a product or service that people are willing to pay for), we must ask important questions that will ensure that the company stays afloat.
The most important question we must answer is Why people pay for our services? What exactly are they paying for, and who are our customers? To reduce costs and make more money, we must pinpoint exactly what our customers need most. This is the goal of any business analysis. Let’s dive into it!
Customer spending behavior
1.0 How do customers spend money on different services such as voice, data, SMS, and other offerings? Identify the most popular services among customers.
SELECT 'Voice' AS Service, SUM(VOICE_SPENT) AS TotalSpent FROM dbo.customers UNION ALL SELECT 'Data' AS Service, SUM(DATA_SPENT) AS TotalSpent FROM dbo.customers UNION ALL SELECT 'SMS' AS Service, SUM(SMS_SPENT) AS TotalSpent FROM dbo.customers UNION ALL SELECT 'Other Services' AS Service, SUM(OTHER_SPENT) AS TotalSpent FROM dbo.customers ORDER BY TotalSpent DESC;
Voice is the most popular service that our customers spend money on. This means our customers mostly make calls, send text messages, and buy data.
Result of the query
You only have read permissions; however, right-click on the results to save the results as a CSV file for your visualizations.
Column chart of spending Vs service
Since voice is the most popular service, lets take a closer look at the customers that make calls.
2.0 What is the device_type of customers that spend money to make calls?
--Voice Spending Variation Based on Customer's Device Type -- Query to get voice spending for smartphone and non-smartphone users SELECT CASE WHEN dbo.location.cell_type = 2 THEN 'Smartphone' --assuming 2 is smartphone WHEN dbo.location.cell_type = 3 THEN 'Non-Smartphone' ELSE 'Other-Devices' --null values as other-devices END AS device_type, SUM(VOICE_SPENT) AS sum_voice_spent --AVG(VOICE_SPENT) AS average_voice_spent we can't use mean cuz of outliers FROM dbo.customers LEFT JOIN dbo.location ON dbo.customers.Cell_ID = dbo.location.CELL_ID GROUP BY dbo.location.cell_type;
Majority of our voice customers are smartphone users.
Result of the query
Column chart of Voice_spent Vs Device_type
3.0 Top recharge method used by customers
--top recharge method used by customers SELECT SUM(RECHARGE_EVD_AMOUNT) AS evd, SUM(RECHARGE_MOMO_AMOUNT) AS momo, SUM(RECHARGE_CARD_AMOUNT) AS recharge_card FROM dbo.customers;
Most customers use recharge card as their recharge method
4.0 Can existing customers be upsold to higher-value data plans?
SELECT TOP 10 Customer_ID, SUM(Data_Vol_MB) AS Total_Data_MB, SUM(TOTAL_RECHARGE_AMOUNT) AS TOTAL_RECHARGE_AMOUNT FROM dbo.customers GROUP BY Customer_ID HAVING SUM(Data_Vol_MB) = 0 ORDER BY TOTAL_RECHARGE_AMOUNT DESC;
Yes, some customers have high recharge amount but no data at all, so they can be convinced to buy data.
It’s good to know what customers want, but where exactly is the money coming from
Now let’s look at the MONEY.
Main sources of revenue
1.0 Who are our best customers?
--Top 5 customers by total revenue: SELECT TOP 5 cob.Customer_ID AS top_customers_by_revenue, SUM(cob.tot_rev) AS total_revenue, SUM(cust.VOICE_SPENT + cust.DATA_SPENT + cust.SMS_SPENT + cust.OTHER_SPENT) AS total_spending FROM dbo.current_offers_balance AS cob JOIN dbo.customers cust ON cob.Customer_ID = cust.Customer_ID GROUP BY cob.Customer_ID ORDER BY total_revenue DESC; --Top 5 customers by spending SELECT TOP 5 cob.Customer_ID AS top_customers_by_spending, SUM(cob.tot_rev) AS total_revenue, SUM(cust.VOICE_SPENT + cust.DATA_SPENT + cust.SMS_SPENT + cust.OTHER_SPENT) AS total_spending FROM dbo.current_offers_balance AS cob JOIN dbo.customers cust ON cob.Customer_ID = cust.Customer_ID GROUP BY cob.Customer_ID ORDER BY total_spending DESC;
From the tables below, our top spenders don’t make us the most money, but customers who make us the most (total_revenue) may not be the best spenders. This goes to show that revenue does not correlate with spending.
top customers by revenue and spending
2.0 What are the main sources of revenue for the company?
--Revenue breakdown by revenue types SELECT SUM(cob.rev_data_total) AS data_revenue, SUM(cob.rev_voice_int) AS voice_revenue, SUM(cob.rev_other_vas) AS vas_revenue, SUM(cob.rev_rentals) AS rentals_revenue FROM dbo.current_offers_balance cob;
Voice is the major revenue source of the company. However, we are losing a lot of money from rentals. This means our customers borrow data, voice, and value-added services from us; this borrowing of services is causing us to lose potential revenue.
3.0 Do we make more money from incoming calls and texts sent to our customers, or otherwise?
--percentage of total revenue from services coming in Vs out SELECT (SUM(tot_rev_in) / SUM(tot_rev)) * 100 AS total_rev_in_percentage, (SUM(tot_rev_out) / SUM(tot_rev)) * 100 AS total_rev_out_percentage FROM dbo.current_offers_balance;
We make more money from outgoing services when our customers send text messages to outside networks or when our customers make calls to other networks.
4.0 How much money have we lost from customers who left (churned).
-- Calculate total revenue for each distinct value of status (pie chart) SELECT 'ACTIVE' AS status, SUM(tot_rev) AS tot_rev FROM dbo.current_offers_balance WHERE status = 'ACTIVE' UNION ALL SELECT 'DORMANT & INACTIVE' AS status, SUM(tot_rev) AS tot_rev FROM dbo.current_offers_balance WHERE status IN ('DORMANT', 'INACTIVE') UNION ALL SELECT 'CHURNED' AS status, SUM(tot_rev) AS tot_rev FROM dbo.current_offers_balance WHERE status IN ('CHURNED', 'CHURN');
From the table below, it appears that we didn’t make much money from customers who left or churned; compared to revenue from our active customers.
5.0 Do we make more money from international services and transactions?
--international vs domestic voice revenue SELECT SUM(rev_voice_int) + SUM(rev_voice_roam_incoming) + SUM(rev_voice_roam_outgoing) AS total_international_revenue, SUM(tot_rev) - (SUM(rev_voice_int) + SUM(rev_voice_roam_incoming) + SUM(rev_voice_roam_outgoing)) AS total_domestic_revenue FROM dbo.current_offers_balance;
No, we make more money locally.
6.0 Which handset models bring more revenue?
-- Calculate total revenue for each distinct value of hs_model SELECT TOP 5 hs_model, SUM(tot_rev) AS total_revenue FROM dbo.current_offers_balance GROUP BY hs_model ORDER BY SUM(tot_rev) DESC;
We make the most money from android smartphone users.
Revenue and customer engagement performance across different regions.
1.0 Compare customers’ spending patterns in different regions and identify which services are more popular in each region.
--Compare the spending patterns of customers in different regions and -- Identify which services are more popular in each region SELECT TOP 5 loc.region, SUM(cust.VOICE_SPENT) AS total_voice_spent, SUM(cust.DATA_SPENT) AS total_data_spent, SUM(cust.SMS_SPENT) AS total_sms_spent, CASE WHEN SUM(cust.VOICE_SPENT) > SUM(cust.DATA_SPENT) AND SUM(cust.VOICE_SPENT) > SUM(cust.SMS_SPENT) AND SUM(cust.VOICE_SPENT) > SUM(cust.OTHER_SPENT) THEN 'Voice' WHEN SUM(cust.DATA_SPENT) > SUM(cust.VOICE_SPENT) AND SUM(cust.DATA_SPENT) > SUM(cust.SMS_SPENT) AND SUM(cust.DATA_SPENT) > SUM(cust.OTHER_SPENT) THEN 'Data' WHEN SUM(cust.SMS_SPENT) > SUM(cust.VOICE_SPENT) AND SUM(cust.SMS_SPENT) > SUM(cust.DATA_SPENT) AND SUM(cust.SMS_SPENT) > SUM(cust.OTHER_SPENT) THEN 'SMS' ELSE 'Other' END AS popular_service FROM dbo.customers AS cust JOIN dbo.location AS loc ON cust.Cell_ID = loc.CELL_ID GROUP BY loc.region ORDER BY total_voice_spent DESC; --we order by voice since voice is the most popular service across all regions;
The top regions are the “Greater Accra Region” and the “Ashanti Region” by spending. Voice is the most popular service for all regions.
Spending rate might not determine revenue, meaning we might make more money from regions where customers spent less. So, let us look at regional revenue.
2.0 What are the most profitable regions?
--Compare the total_revenue of customers in different regions SELECT TOP 5 l.region AS Region, SUM(cob.tot_rev) AS Total_Revenue FROM dbo.current_offers_balance cob JOIN dbo.customers cu ON cu.Customer_ID = cob.Customer_ID JOIN dbo.location l ON cu.Cell_ID = l.CELL_ID GROUP BY l.region ORDER BY Total_Revenue DESC;
The top regions are the “Greater Accra Region” and the “Western Region” by revenue.
Analyze customer churn
1.0 Which regions have the highest churn rate?
--Actionable insight: Use insights from high revenue regions to create personalized offerings and improve customer experiences in other regions. --regions with the highest churn rate --The Volta Region has the highest churn rate but 3rd highest revenue SELECT TOP 5 l.region, AVG(c.tenure) AS avg_tenure, SUM(CASE WHEN co.churn_date IS NOT NULL THEN 1 ELSE 0 END) * 100.0 / COUNT(c.customer_id) AS churn_rate_percentage --if there is a date for churn, then true or 1 FROM dbo.location l LEFT JOIN dbo.customers c ON l.cell_id = c.cell_id LEFT JOIN dbo.current_offers_balance co ON c.customer_id = co.customer_id GROUP BY l.region ORDER BY churn_rate_percentage DESC;
The Volta Region has the highest churn rate but brings the 3rd highest revenue.
2.0 Evaluate the effectiveness of the loyalty program. How customers engage with it.
The loyalty program was implemented to encourage our active customers to stay with us and continue to pay for our services.
SELECT DISTINCT loyalty_points_balance FROM dbo.current_offers_balance; SELECT DISTINCT loyalty_points_redeemed FROM dbo.current_offers_balance; SELECT DISTINCT loyalty_points_earned FROM dbo.current_offers_balance;
It’s time to try something other than loyalty points because customers have not responded. Either that or it was recently introduced and needs some time to kick off.
Track customers’ adoption of different mobile network technologies (2G).
1.0 Compare the revenue from customers on 2G networks to those from others.
SELECT '2G' AS nw_2g_ind, SUM(tot_rev) AS Total_Revenue FROM dbo.current_offers_balance WHERE nw_2g_ind = -1 UNION ALL SELECT '3G' AS nw_2g_ind, SUM(tot_rev) AS Total_Revenue FROM dbo.current_offers_balance WHERE nw_2g_ind = 0 UNION ALL SELECT '4G' AS nw_2g_ind, SUM(tot_rev) AS Total_Revenue FROM dbo.current_offers_balance WHERE nw_2g_ind = 1;
We should advertise to those using 2G network to upgrade to 3G or 4G.
After queries have been written and beautiful dashboards have been built, the single most important thing a data analyst has to give is actionable insights. So from our analysis, here are some things I recommend:
- Focus on Voice Services: Voice is the most popular service among customers, so it is essential to maintain and improve the voice service quality. Additionally, consider introducing voice-based value-added services to engage customers further.
- Target Smartphone Users: the majority of voice customers are smartphone users. This presents an opportunity to create smartphone-specific offers, promotions, and services to cater to this segment.
- Promote Data Plans to High-Recharge Customers: Identify customers with high recharge amounts but do not use data. Target these customers with personalized offers to upsell higher-value data plans.
- Optimize Recharge Methods: As most customers use recharge cards, consider promoting mobile money and electronic voucher distribution (EVD) methods to increase convenience and accessibility.
- Address Churn Rate in Volta Region: The Volta Region has the highest churn rate despite being a high-revenue region. Investigate the reasons behind the churn and implement strategies to improve customer retention in this area.
- Reduce Losses from Rentals: Analyze the rental services and identify ways to reduce losses from customers borrowing services. Encourage customers to opt for regular data and voice plans instead of rentals.
- Upgrade 2G Customers: Encourage customers on 2G networks to upgrade to 3G or 4G to capitalize on the revenue potential from these advanced services.
If you want to become a programmer, then this article is your guide to becoming one. It explains everything from start to finish on how to build technical skills and what to do.