Google Data Analytics Case Study

Hi, Happy holidays. Its been months since my last post, because work.. and life.. still, over one hundred new visitors monthly on this sparse blog.. :)Thank you all..

This post is different. It’s on a case study for the Google Data Analytics Certificate I just completed. It’ll be a long write up detailing the analysis process for the case study.
The scenario is: I’m a junior data analyst for Cyclistic, a fictional Bike share company in Chicago. I’ll share the slides presentation, then walk through the data analysis process.

The tools used in this study are:
– Microsoft Excel
– Rstudio
– GitHub
– Tableau
– Adobe Reader
– Google Slides

The analysis process is categorized into 6 phases.
1. Ask
2. Prepare
3. Process
4. Analyze
5. Share
6. Act

  1. Ask
    In the Ask phase, we do two things.
    a). We define the problem to be solved (or the business case)
    The problem to be solved is usually given by your team leader or manager. For this project, it is provided by the case study assignment.
    Show how Annual Members and Casual Riders use Cylistic bikes differently.
    b). We make sure we fully understand the stakeholders expectations.
    Our stakeholders here are:
    – Lily Moreno (Team Leader/Manager)
    – The marketing team
    – Cyclistic Executives.
    They want to create marketing strategies to convert casual riders to annual members for the company to earn more money. To do this, the stakeholders need to understand how casual riders use the company product differently from annual members. To help them understand, our task is to show the differences.

  2. Prepare
    In Prepare phase, we’ll verifiy our data source integrity, credibility and reliability. Then collect and store the data for use in our analysis.
    Our data source here is provided by the case study assignment. Link with access to 12 csv files hosted on AWS for each month from Nov 2020 to Oct 2021. I downloaded the files for study in Excel. I think the best way to get to know your data is in Excel. All 12 files containing data on all rides for each month have the same structure of 13 columns.
    1. ride i_d
    2. bike_type
    3. date/time_of _ride_start
    4. date/time_of_ride_end
    5. start_station_name
    6. start_station_id
    7. end_station_name
    8. end_station_id
    9. start_station_latitude
    10. start_station_logitude
    11. end_station_latittude
    12. end_station_logitude
    13. customer_type

    After getting to know the data, I did some quick data cleaning on all 12 files, by
    1. Editing a few column names.
    2. Adding one more column on all 12 files. “day_of_the_week” to show the day of the week the customer rented a bike. The Excel formula for this is : TEXT((reference cell), “dddd”)
    3. Save editied files as an excel workbook (.xlsx).

  3. Process
    Here, we do more cleaning. Combining datasets and transforming it to make our information more useful for analysis. This includes removing inconsistencies, outliers, typos etc that could skew our analysis.
    For this phase, I chose R. SQL may probably be more practical here, but I like R a lot more, I think its very elegant. Its my preferred programming language.
    The reason we’re moving to R from Excel mainly is because we need to combine all 12 files into one and they won’t fit in Excel. Excel has a maximum capacity of about 1 million rows. These 12 files combined is around 5.4 million rows.

    In R-studio, we
    1. Combined all 12 files into one.
    2. Calculated the average length of each ride. “avg_ride_length” column added.
    3. Further cleaning by deleting inconsistencies such as negative ride lengths and “TEST” rides. Around 1,500 rows was removed, approximately 0.03% of our data.
    4. Export the clean data as a .csv file for further analysis and visualization in Tableau.
    5. Save the R-script for future reference.
    For libraries and functions I used in R, you can view my R-script on GitHub.

  4. Analyze
    In the Analyze phase, basically we try to make sense of our processed data. We transform and organize it into useful information. So that we can draw conclusions, make predictions and drive informed decision making.
    Here, I chose Tableau. Although we can do analysis and create charts in R with ggplot2, it’s amazing what you can do in R.
    Still, I don’t think writing lines of code is the best way to explore data for insights. In R, you have to already know what your output will look like, then you write the code to generate it.
    With Tableau, you don’t really know. You explore, play with, filter, mix, match and generate visuals from the data to make complex concepts and numbers easier to see and understand.

    Using Tableau Desktop, I connected to the csv file downloaded from Rstudio, then started exploring to see what we can come up with. This is the most fun part really. From our data, we were able to extract nine unique insights shown in charts. Once satisfied with our exploration and insights gotten, we then edit our visuals to make it clear, simple and to best show what the data is saying.

  5. Share
    In the share phase, we’ll interpret the results of our analysis and share our conclusion and recommendations to help stakeholders make data-driven decisions.

    We’re done editing our charts in Tableau Desktop.
    Next, we save our work to Tableau Public online. Tableau doesn’t let you save your work locally without first uploading it to your account online.
    This done, we then download it in PDF. PNG files from Tableau public is not good. Best option is PDF. Save the pdf files and open it up in Adobe reader and then export as a jpeg file. Then upload the jpegs in creating our Google Slides in Chrome.

  6. Act
    This is the final phase in Data Analysis. Here, we take all of the insights we’ve got from our data and put them to work to solve the original business case: To show how Annual Members and Casual Riders use Cyclistic bikes differently so that our stakeholders can create marketing strategies to convert casual riders to annual members for the company to earn more money. Here’s a good time to go through the Google slides again and you can see that we have successfully accomplished our task.

    Act is what we’re doing right now. Me writing this and you reading.. if it was for real, there will be a presentation of the slides and stakeholders nodding heads in comprehension like in the movies saying stuff like “excellent” then going back to the “drawing board” and creating marketing strategies/allocating budget based on the insights and recommendations from the data analysis. harharhar..

    Alright! This brings us to the end of this post. Not really sure why I’m smiling r/n but this again was fun. Thank you for being here. Your curiosity and our shared interest brought you. Its the end of the year. Happy Holidays!!

When will Covid-19 end?

Girl meeting her friends at a city bar for drinks at night

It’ll be nice to have a future date or timeframe set when everything will be back to normal. But, there isn’t.
To try and answer the question, lets look at some numbers.

Philippines Total Population 2021: 109,856,591 (April 04, 2021. CPD)
Total Number of people infected: 795,051 (April 03, 2021. DOH)
Percentage of Total Population infected: 0.72% (less than 1%)

Total number confirmed of Vaccines acquired: 2,525,600
Total number of people vaccinated: 737,569 (Source)
Percentage of Total population vaccinated: 0.67% (less than 1%)

Based on the premise that the vaccines are 100% effective against current Covid strain and new variants, for covid-19 to end, we need herd immunity.

Herd immunity: is when a large part of the population (around 67%) is immune or resistant to the virus, so the virus has no where left to go or be transmitted and eventually the virus dies out.
This is the primary target of all vaccination programs.

To reach Herd immunity in Philippines, 67% of the population: around 74.4 million people need to be vaccinated.
This is when Covid-19 will end.
At the moment, we are at less than 1% vaccinated.

Going by the report in the Manila Bulletin, the government is on track to acquire 170 million vaccine doses, with target to vaccinate 70 million people this year (Source). How the government will handle deployment and coordination to vaccinate 70 million people remains to be seen. At least, we are on our way. 737,569 confirmed vaccinated is a lot more than zero. Its a start.


Artwork credit: https://mixkit.co/@casandrabanuelos/

Market Capitalization vs Total Assets: San Miguel/ Tesla examples

This article is continued from our previous post: Largest companies in Philippines by market capitalization.


If you look up San Miguel Corporation (SMC) on Wikipedia, you’ll see that SMC Total Assets is stated as P1.8 trillion in 2020. Converted to USD, P1.8t/50 = $36 billion. Note from the previous chart that SMC estimated market cap value is $5.7B.
Does this mean SMC stock is undervalued by over $30B? Short answer: No.
Lets have a look at it.
Total Assets is all items of values and resources a company owns. Property, buildings, factories, equipment, inventory, cash, patent rights and so on are the assets. The assets generate income for the company. SMC values its total assets at $36B.
As SMC is listed on the stock exchange, the “Market” (financial institutions, investors, accountants, bankers, the general public) has looked at it, its total assets, future expected incomes, its liabilities, operation costs.. and having considered everything, right now at this moment will pay $5.7B to buy the company. And they do. Thousands of company shares are bought and sold daily on the stock exchange at the estimated market value of $5.7B and not $36B.

Tesla
Quick google search shows Tesla has a market capitalization of around $580 billion, and total assets of $52 billion. Does it mean the company is overvalued by over $520B? There is no short answer here. Tesla’s different.
The point here is the Market, having considered everything, although Tesla’s assets (everything of value the company owns) is just $52B, the market is willing to pay $580B to buy the company. And they do. Millions TSLA shares are traded daily on the stock market at price value of over $580B and not 52.

TLDR
Total assets is all the stuff a company owns.
Market capitalization is how much the market will pay to buy the company.

Photo credit: NICE GUYS

Hello World!

Hi, my name is Fola. welcome to our first blog post.

The World

“hello world!” is what a lot of us first wrote in html code when creating our first webpage.

About us:
This website is created to share data vizzes, analysis and insights. I hope you’ll find some information here useful or at least interesting.

About me: I’ve only recently gotten into creating data visuals and analytics. I spent a few months learning Python, SQL and doing some basic data analysis in Jupyterlab using libraries like Matplotlib and Seaborn before stumbling on Tableau.
At the time writing this, its March 19, 2021. I’d say its been two months since I started using Tableau. I’ve always liked graphs, infographics and read posts from Statista and CB insights for years. Since we’ve all been spending more time alone during this pandemic, we get into new stuff. Mine is creating visuals from data.. and Star Wars.. (had passed on it until the Mandalorian).

Contact:
For enquiries, ideas and suggestions on topics we should explore or just general data stuff, feel free to send an email at “[email protected]”.

Subscribe to our mailing list so we can send newsletters to your mailbox in the future. I like reading newsletters. There’s some really good ones like “While You Were Working” and “Quartz” . but creating one.. weekly/monthly.. should be fun.

At the bottom of each blog post, there’s a link for you to download a pdf copy of the viz.

Thank you. Happy reading.

Fola