Homework 3: Table Manipulation and Visualization
Reading:
Visualization (https://www.inferentialthinking.com/chapters/07/visualization.html)
Please complete this notebook by filling in the cells provided. Before you begin, execute the following cell to load
the provided tests. Each time you start your server, you will need to execute this cell again to load the tests.
Homework 3 is due Thursday, 9/13 at 11:59pm. Start early so that you can come to office hours if you're stuck.
Check the website for the office hours schedule. You will receive an early submission bonus point if you turn in
your final submission by Wednesday, 9/12 at 11:59pm. Late work will not be accepted as per the policies
(http://data8.org/fa18/policies.html) of this course.
Throughout this homework and all future ones, please be sure not to re-assign variables throughout the
notebook! For example, if you use max_temperature in your answer to one question, do not reassign it later
on. Moreover, please be sure to only put your written answers in the provided cells.
In [6]: # Don't change this cell; just run it.
import numpy as np
from datascience import *
# These lines do some fancy plotting magic.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
from client.api.notebook import Notebook
ok = Notebook('hw03.ok')
_ = ok.auth(inline=True)
=====================================================================
Assignment: Homework 3: Table Manipulation and Visualization
OK, version v1.12.5
=====================================================================
Successfully logged in as
Differences between Universities
https://datahub.berkeley.edu/user/alanlai200/nbconvert/html/materials-fa18/materials/fa18/hw/hw03/hw03.ipynb?download=false 1/16
9/12/2018 hw03
Question 1. Suppose you're choosing a university to attend, and you'd like to quantify how dissimilar any two
universities are. You rate each university you're considering on several numerical traits. You decide on a very
detailed list of 1000 traits, and you measure all of them! Some examples:
The cost to attend (per year).
The average Yelp review of nearby Thai restaurants.
The USA Today ranking of the Medical school.
The USA Today ranking of the Engineering school.
You decide that the dissimilarity between two universities is the total of the differences in their traits. That is, the
dissimilarity is:
the sum of
the absolute values of
the 1000 differences in their trait values.
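The definition above can be sketched with made-up numbers. (The three traits and values below are illustrative only; they are not from the provided data files.)

```python
import numpy as np

# Hypothetical trait values for two schools, measured on just three traits
school_a = np.array([45000, 4.2, 10.0])   # cost, Yelp rating, a ranking
school_b = np.array([44100, 3.8, 12.0])

# Dissimilarity: the sum of the absolute trait-by-trait differences
toy_dissimilarity = np.sum(np.abs(school_a - school_b))
# 900 + 0.4 + 2.0 = 902.4
```

With 1000 traits instead of three, the computation is identical: NumPy's elementwise arithmetic applies the subtraction and absolute value across the whole array at once.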
In the next cell, we've loaded arrays containing the 1000 trait values for Stanford and Berkeley. Compute the
dissimilarity (according to the above technique) between Stanford and Berkeley. Call your answer
dissimilarity . Use a single line of code to compute the answer.
Note: The data we're using aren't real -- we made them up for this exercise, except for the cost-of-attendance
numbers, which were found online.
In [7]: stanford = Table.read_table("stanford.csv").column("Trait value")
berkeley = Table.read_table("berkeley.csv").column("Trait value")
dissimilarity = sum(abs(stanford-berkeley))
dissimilarity
Out[7]: 14060.558701067917
In [8]: _ = ok.grade('q1_1')
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running tests
---------------------------------------------------------------------
Test summary
Passed: 1
Failed: 0
[ooooooooook] 100.0% passed
Question 2. Why do we sum up the absolute values of the differences in trait values, rather than just summing
up the differences?
The difference in a trait value can be either positive or negative: a difference of -4 means Berkeley is higher on
that trait, while +4 means Stanford is higher. Either way, the two schools differ by the same amount on that trait,
so both cases should contribute 4 to the dissimilarity. If we summed the signed differences instead, positive and
negative differences would cancel each other out, and two very different schools could appear similar.
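A two-trait sketch of the cancellation problem (the numbers are made up for illustration):

```python
import numpy as np

# Suppose Stanford is 4 higher on one trait and Berkeley is 4 higher on another
diffs = np.array([4, -4])

signed_total = np.sum(diffs)            # 0: the differences cancel
absolute_total = np.sum(np.abs(diffs))  # 8: both differences are counted
```

The signed sum reports 0, as if the schools were identical, while the absolute sum correctly reports a total difference of 8.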
Weighing the traits
After computing dissimilarities between several schools, you notice a problem with your method: the scale of the
traits matters a lot.
Since schools cost tens of thousands of dollars to attend, the cost-to-attend trait is always a much bigger number
than most other traits. That makes it affect the dissimilarity a lot more than other traits. Two schools that differ in
cost-to-attend by $900, but are otherwise identical, get a dissimilarity of 900. But two schools that differ in
graduation rate by 0.9 (a huge difference!), but are otherwise identical, get a dissimilarity of only 0.9.
One way to fix this problem is to assign different "weights" to different traits. For example, we could fix the
problem above by multiplying the difference in the cost-to-attend traits by .001, so that a difference of $900 in
the attendance cost results in a dissimilarity of $900 × .001, or 0.9.
Here's a revised method that does that for every trait:
1. For each trait, subtract the two schools' trait values.
2. Then take the absolute value of that difference.
3. Now multiply that absolute value by a trait-specific number, like .001 or 2.
4. Now, sum the 1000 resulting numbers.
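The four steps above can be sketched on two made-up traits (the values and weights here are illustrative, not from the weights.csv file):

```python
import numpy as np

# Hypothetical values: trait 0 is cost-to-attend (dollars), trait 1 is graduation rate
school_a = np.array([45000, 0.9])
school_b = np.array([44100, 0.0])
weights  = np.array([0.001, 1.0])   # down-weight the dollar-scale trait

# Steps 1-4: subtract, take absolute values, multiply by weights, sum
weighted_dissimilarity = np.sum(np.abs(school_a - school_b) * weights)
# 900 * 0.001 + 0.9 * 1.0 = 1.8
```

After weighting, the $900 cost difference and the 0.9 graduation-rate difference contribute equally (0.9 each), instead of the cost difference dominating.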
Question 3. Suppose you've already decided on a weight for each trait. These are loaded into an array called
weights in the cell below. weights.item(0) is the weight for the first trait, weights.item(1) is the weight
for the second trait, and so on. Use the revised method to compute a revised dissimilarity between Berkeley and
Stanford.
Hint: Using array arithmetic, your answer should be almost as short as in question 1.
In [9]: weights = Table.read_table("weights.csv").column("Weight")
revised_dissimilarity = sum(abs(stanford-berkeley) * weights)
revised_dissimilarity
Out[9]: 505.98313211458805
In [10]: _ = ok.grade('q1_3')
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running tests
---------------------------------------------------------------------
Test summary
Passed: 1
Failed: 0
[ooooooooook] 100.0% passed