
Insights from Data with BigQuery: Challenge Lab [GSP787]



Link to the Video: https://youtu.be/jdZd834oysE

STEPS

1.    In the Cloud Console, navigate to Menu > BigQuery.

2.    Click + ADD DATA > Explore public datasets from the left pane.

3.    Search for covid19_open_data and then select COVID-19 Open Data.

4.    Use Filter to locate the table covid19_open_data under the covid19_open_data dataset.


Query 1: Total Confirmed Cases

SELECT
  SUM(cumulative_confirmed) AS total_cases_worldwide
FROM
  `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE
  date = "2020-04-15"


Query 2: Worst Affected Areas

WITH deaths_by_states AS (
  SELECT subregion1_name AS state, SUM(cumulative_deceased) AS death_count
  FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE country_name = "United States of America"
    AND date = '2020-04-10'
    AND subregion1_name IS NOT NULL
  GROUP BY subregion1_name
)
SELECT COUNT(*) AS count_of_states
FROM deaths_by_states
WHERE death_count > 100


Query 3: Identifying Hotspots

SELECT * FROM (
  SELECT subregion1_name AS state, SUM(cumulative_confirmed) AS total_confirmed_cases
  FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE country_code = "US"
    AND date = '2020-04-10'
    AND subregion1_name IS NOT NULL
  GROUP BY subregion1_name
  ORDER BY total_confirmed_cases DESC
)
WHERE total_confirmed_cases > 1000


Query 4: Fatality Ratio

SELECT
  SUM(cumulative_confirmed) AS total_confirmed_cases,
  SUM(cumulative_deceased) AS total_deaths,
  (SUM(cumulative_deceased) / SUM(cumulative_confirmed)) * 100 AS case_fatality_ratio
FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE country_name = "Italy"
  AND date BETWEEN '2020-04-01' AND '2020-04-30'


Query 5: Identifying a specific day

SELECT date
FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE country_name = "Italy"
  AND cumulative_deceased > 10000
ORDER BY date ASC
LIMIT 1


Query 6: Finding days with zero net new cases

WITH india_cases_by_date AS (
  SELECT
    date,
    SUM(cumulative_confirmed) AS cases
  FROM
    `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE
    country_name = "India"
    AND date BETWEEN '2020-02-21' AND '2020-03-15'
  GROUP BY date
  ORDER BY date ASC
),

india_previous_day_comparison AS (
  SELECT
    date,
    cases,
    LAG(cases) OVER (ORDER BY date) AS previous_day,
    cases - LAG(cases) OVER (ORDER BY date) AS net_new_cases
  FROM india_cases_by_date
)

SELECT COUNT(*)
FROM india_previous_day_comparison
WHERE net_new_cases = 0
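The LAG window function pairs each day's cumulative total with the previous day's, so net_new_cases is simply the day-over-day difference. A minimal Python sketch of that logic, using made-up cumulative counts rather than the real dataset:

```python
# Cumulative confirmed cases by date (illustrative numbers, not real data).
cases_by_date = [100, 105, 105, 112, 112]

# Mimic LAG(cases) OVER (ORDER BY date): pair each value with its predecessor.
net_new = [curr - prev for prev, curr in zip(cases_by_date, cases_by_date[1:])]

# Count days where the cumulative total did not change (zero net new cases).
zero_days = sum(1 for n in net_new if n == 0)
print(net_new)    # [5, 0, 7, 0]
print(zero_days)  # 2
```

Note that the first day has no predecessor, so it produces no difference, just as LAG returns NULL for the first row.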


Query 7: Doubling rate

WITH us_cases_by_date AS (
  SELECT
    date,
    SUM(cumulative_confirmed) AS cases
  FROM
    `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE
    country_name = "United States of America"
    AND date BETWEEN '2020-03-22' AND '2020-04-20'
  GROUP BY date
  ORDER BY date ASC
),

us_previous_day_comparison AS (
  SELECT
    date,
    cases,
    LAG(cases) OVER (ORDER BY date) AS previous_day,
    cases - LAG(cases) OVER (ORDER BY date) AS net_new_cases,
    (cases - LAG(cases) OVER (ORDER BY date)) * 100 / LAG(cases) OVER (ORDER BY date) AS percentage_increase
  FROM us_cases_by_date
)

SELECT
  date,
  cases AS Confirmed_Cases_On_Day,
  previous_day AS Confirmed_Cases_Previous_Day,
  percentage_increase AS Percentage_Increase_In_Cases
FROM us_previous_day_comparison
WHERE percentage_increase > 10
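The percentage_increase column is (today − yesterday) × 100 / yesterday; the final filter keeps only days where cases grew by more than 10% over the previous day. A short Python sketch of the same arithmetic, again with illustrative numbers:

```python
# Daily cumulative counts (illustrative, not the real US series).
cases = [1000, 1150, 1200, 1400]

# percentage_increase = (cases - LAG(cases)) * 100 / LAG(cases)
increases = [(curr - prev) * 100 / prev for prev, curr in zip(cases, cases[1:])]

# Keep only days where cases grew by more than 10% over the previous day.
spikes = [round(p, 1) for p in increases if p > 10]
print(spikes)  # [15.0, 16.7]
```

The middle day (1150 → 1200, about 4.3%) is filtered out, matching the WHERE percentage_increase > 10 clause.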


Query 8: Recovery rate

WITH cases_by_country AS (
  SELECT
    country_name AS country,
    SUM(cumulative_confirmed) AS cases,
    SUM(cumulative_recovered) AS recovered_cases
  FROM
    `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE
    date = '2020-05-10'
  GROUP BY country_name
),

recovered_rate AS (
  SELECT
    country,
    cases,
    recovered_cases,
    (recovered_cases * 100) / cases AS recovery_rate
  FROM cases_by_country
)

SELECT country, cases AS confirmed_cases, recovered_cases, recovery_rate
FROM recovered_rate
WHERE cases > 50000
ORDER BY recovery_rate DESC
LIMIT 10


Query 9: CDGR - Cumulative Daily Growth Rate

WITH france_cases AS (
  SELECT
    date,
    SUM(cumulative_confirmed) AS total_cases
  FROM
    `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE
    country_name = "France"
    AND date IN ('2020-01-24', '2020-04-10')
  GROUP BY date
  ORDER BY date
),

summary AS (
  SELECT
    total_cases AS first_day_cases,
    LEAD(total_cases) OVER (ORDER BY date) AS last_day_cases,
    DATE_DIFF(LEAD(date) OVER (ORDER BY date), date, DAY) AS days_diff
  FROM france_cases
  LIMIT 1
)

SELECT
  first_day_cases,
  last_day_cases,
  days_diff,
  POWER((last_day_cases / first_day_cases), (1 / days_diff)) - 1 AS cdgr
FROM summary
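The POWER expression computes CDGR = (last_day_cases / first_day_cases)^(1 / days_diff) − 1, i.e. the constant daily growth rate that would take the case count from the first value to the last over the given number of days. A quick Python sanity check with illustrative numbers (not the real France figures):

```python
# CDGR = (last / first) ** (1 / days) - 1
# Illustrative numbers: cases grow from 10 to 1000 over 77 days.
first_day_cases = 10
last_day_cases = 1000
days_diff = 77

cdgr = (last_day_cases / first_day_cases) ** (1 / days_diff) - 1
print(round(cdgr, 4))  # 0.0616, i.e. roughly 6.2% growth per day
```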

 

First collect the points for the first nine tasks. Only then proceed to the final task.


Create a Data Studio report

Method 1:

SELECT
  date,
  SUM(cumulative_confirmed) AS country_cases,
  SUM(cumulative_deceased) AS country_deaths
FROM
  `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE
  date BETWEEN '2020-03-15' AND '2020-04-30'
  AND country_name = "United States of America"
GROUP BY date



1.    Click on EXPLORE DATA > Explore with Data Studio.

2.    Authorize Data Studio to access BigQuery.

3.    If this is your first time signing in to Data Studio, the report may fail to load. Click + Blank Report and accept the Terms of Service, then go back to the BigQuery page and click Explore with Data Studio again.

4.    In the new Data Studio report, select Add a chart > Time series Chart.

5.    Add country_cases and country_deaths to the Metric field.

6.    Click Save to commit the change.

Method 2 (if Method 1 does not work):

Step 1: Go to https://datastudio.google.com/

Step 2: Select Blank Report

Step 3: Accept the terms and continue, then choose "Yes to all" and proceed

Step 4: Select Blank Report again if you are redirected back to the previous page

Step 5: Click on BigQuery and press "Authorize"

Step 6: Select "Custom Query", then select the project and type:

SELECT
  date,
  SUM(cumulative_confirmed) AS country_cases,
  SUM(cumulative_deceased) AS country_deaths
FROM
  `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE
  date BETWEEN '2020-03-15' AND '2020-04-30'
  AND country_name = "United States of America"
GROUP BY date


After this, you will receive your points.


Congratulations, you have completed the Challenge Lab!


SAKURA SATIO








