Data Integration – Task 1
,SECTION A
In this business report, I will identify the top three most watched movies along with the revenue
generated by each of those movies. This could be useful to see the most popular movies, how
much revenue each one creates, and if you might want to increase the price based on supply
and demand. This could also tell you if it might be wise to increase your inventory of those
specific movies.
1. Describe the data used for the report.
A. The film id was counted based on the rental id to see how many times each movie
was rented. Then it was grouped by the film id and ordered by count descending.
Then the top three was chosen with the limit three. Then the sum of the payment
amount was found for each.
2. Identify two or more specific tables from the given dataset that will provide the data
necessary for the detailed and the summary sections of the report.
A. The tables that were used to generate this report was: rental, inventory, film and
payment.
3. Identify the specific fields that will be included in the detailed and the summary sections of
the report.
A. The data needed for this report was rental id, inventory id, film id, film title more for
visual, and payment amount. The film id was counted based on the rental id to see
how many times each movie was rented. Then it was grouped by the film id and
ordered by count descending. Then the top three was chosen with the limit three.
Then the sum of the payment amount was found for each.
4. Identify one field in the detailed section that will require a custom transformation and
explain why it should be transformed. For example, you might translate a field with a
value of ‘N’ to ‘No’ and ‘Y’ to ‘Yes’.
A. The transformed data was the payment amount of the summary table. It was
aggregated to find the total amount of revenue generated for each of the top three
movies. But did write a ‘SELECT’ statement that would pull the data and change the
‘amount’ column from a numeric to money using the CASE function for ease of
reading; to distinguish from all the generic numbers within the table.
B. In the detailed section, it is raw data and no reason to transform any data as of yet
in the process. The transformation would take place in the summary section.
5. Explain the different business uses of the detailed and the summary sections of the report.
A. This could be useful to see the most popular movies, how much revenue each one
creates, and if you might want to increase the price based on supply and demand.
This could also tell you if it might be wise to increase your inventory of those
specific movies. Help monitor consumer trends to assist in delivering similar movies
which will improve the customer experience. This could especially be useful to
increase traffic/business.
6. Explain how frequently your report should be refreshed to remain relevant to
stakeholders.
A. This report could possibly be ran once a month to give a good idea of what movies
are popular at that moment in time and the revenue generated. But would expect to
be based on the volume of the store.
SECTION B
Write a SQL code that creates the tables to hold your report sections.
--Create detailed Table
DROP TABLE IF EXISTS detailed;
, CREATE TABLE detailed (
rental_id int,
inventory_id int,
film_id int,
title varchar(90),
amount numeric (7,2)
);
--Verifying detailed table
SELECT * FROM detailed;
--Create the summary Table
DROP TABLE IF EXISTS summary;
CREATE TABLE summary(
rental_cnt int,
film_id int,
title VARCHAR(90),
rental_rev numeric(7,2)
);
SELECT * FROM summary;
SECTION C
Write a SQL query that will extract the raw data needed for the Detailed section of your report
from the source database and verify the data’s accuracy.
--Extract raw data and populate detailed data
INSERT INTO detailed(
rental_id,
inventory_id,
film_id,
title,
amount
)
SELECT
r.rental_id,
i.inventory_id,
f.film_id,
f.title,
CAST(p.amount AS money)
FROM payment p
INNER JOIN rental r ON r.rental_id = p.rental_id
INNER JOIN inventory i ON i.inventory_id = r.inventory_id
INNER JOIN film f ON f.film_id = i.film_id;
SECTION D
Write code for function(s) that perform the transformation(s) you identified in part A4.
--CREATING FUNCTION