COMPSCI 70 Discrete Mathematics for Computer Science Homework 3 Table Manipulation and Visualization

Pages: 16
Grade: A+
Uploaded on: 24-10-2021
Written in: 2021/2022


9/12/2018 hw03




Homework 3: Table Manipulation and Visualization

Reading:

Visualization (https://www.inferentialthinking.com/chapters/07/visualization.html)



Please complete this notebook by filling in the cells provided. Before you begin, execute the following cell to load
the provided tests. Each time you start your server, you will need to execute this cell again to load the tests.

Homework 3 is due Thursday, 9/13 at 11:59pm. Start early so that you can come to office hours if you're stuck.
Check the website for the office hours schedule. You will receive an early submission bonus point if you turn in
your final submission by Wednesday, 9/12 at 11:59pm. Late work will not be accepted as per the policies
(http://data8.org/fa18/policies.html) of this course.

Throughout this homework and all future ones, please be sure to not re-assign variables throughout the
notebook! For example, if you use max_temperature in your answer to one question, do not reassign it later
on. Moreover, please be sure to only put your written answers in the provided cells.


In [6]: # Don't change this cell; just run it.

import numpy as np
from datascience import *


# These lines do some fancy plotting magic.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

from client.api.notebook import Notebook
ok = Notebook('hw03.ok')
_ = ok.auth(inline=True)

=====================================================================
Assignment: Homework 3: Table Manipulation and Visualization
OK, version v1.12.5
=====================================================================

Successfully logged in as




Differences between Universities



https://datahub.berkeley.edu/user/alanlai200/nbconvert/html/materials-fa18/materials/fa18/hw/hw03/hw03.ipynb?download=false 1/16




Question 1. Suppose you're choosing a university to attend, and you'd like to quantify how dissimilar any two
universities are. You rate each university you're considering on several numerical traits. You decide on a very
detailed list of 1000 traits, and you measure all of them! Some examples:

The cost to attend (per year).
The average Yelp review of nearby Thai restaurants.
The USA Today ranking of the Medical school.
The USA Today ranking of the Engineering school.

You decide that the dissimilarity between two universities is the total of the differences in their traits. That is, the
dissimilarity is:

the sum of
the absolute values of
the 1000 differences in their trait values.
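The definition above can be sketched on a tiny made-up example. The three-trait arrays and all variable names below are hypothetical and only illustrate the arithmetic; they are not the homework data.

```python
import numpy as np

# Hypothetical trait values for two made-up schools (3 traits instead of 1000).
toy_school_a = np.array([3.0, 10.0, 7.5])
toy_school_b = np.array([5.0, 4.0, 7.0])

# Element-wise differences: one number per trait.
toy_differences = toy_school_a - toy_school_b        # [-2.0, 6.0, 0.5]

# Absolute values drop the sign of each difference.
toy_absolute_differences = np.abs(toy_differences)   # [2.0, 6.0, 0.5]

# The dissimilarity is the sum of those absolute differences.
toy_dissimilarity = np.sum(toy_absolute_differences)
print(toy_dissimilarity)  # 8.5
```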

In the next cell, we've loaded arrays containing the 1000 trait values for Stanford and Berkeley. Compute the
dissimilarity (according to the above technique) between Stanford and Berkeley. Call your answer
dissimilarity. Use a single line of code to compute the answer.

Note: The data we're using aren't real -- we made them up for this exercise, except for the cost-of-attendance
numbers, which were found online.



In [7]: stanford = Table.read_table("stanford.csv").column("Trait value")
berkeley = Table.read_table("berkeley.csv").column("Trait value")

dissimilarity = sum(abs(stanford-berkeley))
dissimilarity

Out[7]: 14060.558701067917


In [8]: _ = ok.grade('q1_1')

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running tests

---------------------------------------------------------------------
Test summary
Passed: 1
Failed: 0
[ooooooooook] 100.0% passed




Question 2. Why do we sum up the absolute values of the differences in trait values, rather than just summing
up the differences?







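Subtracting one school's trait value from the other's can give either a positive or a negative difference. Our goal
is to measure dissimilarity, so the sign should not matter: a difference of -4 means that Berkeley is higher for that
trait and a difference of +4 means that Stanford is higher, but both represent the same amount of dissimilarity.
If we summed the signed differences instead, positive and negative values would cancel each other out and
understate how different the two schools are.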
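A small sketch of why the sign matters, using two hypothetical schools with made-up values for two traits. One school is higher on the first trait and the other is higher on the second, by the same amount.

```python
import numpy as np

# Hypothetical two-trait schools: each is 4 higher than the other on one trait.
first_school = np.array([4.0, 1.0])
second_school = np.array([0.0, 5.0])

# Signed differences are +4 and -4, so they cancel when summed.
signed_total = np.sum(first_school - second_school)

# Absolute differences are 4 and 4, so both traits count toward dissimilarity.
absolute_total = np.sum(np.abs(first_school - second_school))

print(signed_total)    # 0.0
print(absolute_total)  # 8.0
```

Summing the signed differences would report these two clearly different schools as identical.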



Weighing the traits

After computing dissimilarities between several schools, you notice a problem with your method: the scale of the
traits matters a lot.

Since schools cost tens of thousands of dollars to attend, the cost-to-attend trait is always a much bigger number
than most other traits. That makes it affect the dissimilarity a lot more than other traits. Two schools that differ in
cost-to-attend by $900, but are otherwise identical, get a dissimilarity of 900. But two schools that differ in
graduation rate by 0.9 (a huge difference!), but are otherwise identical, get a dissimilarity of only 0.9.

One way to fix this problem is to assign different "weights" to different traits. For example, we could fix the
problem above by multiplying the difference in the cost-to-attend traits by .001, so that a difference of $900 in
the attendance cost results in a dissimilarity of $900 × .001, or 0.9.

Here's a revised method that does that for every trait:

1. For each trait, subtract the two schools' trait values.
2. Then take the absolute value of that difference.
3. Now multiply that absolute value by a trait-specific number, like .001 or 2.
4. Now, sum the 1000 resulting numbers.
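The four steps above can be sketched one at a time on a made-up example. The two schools, their trait values, and the weights are all hypothetical; they reuse the $900 cost gap and 0.9 graduation-rate gap from the discussion above.

```python
import numpy as np

# Hypothetical two-trait schools: [cost to attend in dollars, graduation rate].
campus_one = np.array([50000.0, 0.95])
campus_two = np.array([50900.0, 0.05])

# Hypothetical weights: shrink the cost trait, leave graduation rate unchanged.
trait_weights = np.array([0.001, 1.0])

step_1 = campus_one - campus_two    # 1. subtract the two schools' trait values
step_2 = np.abs(step_1)             # 2. take the absolute value of each difference
step_3 = step_2 * trait_weights     # 3. multiply by each trait's weight
weighted_total = np.sum(step_3)     # 4. sum the resulting numbers

# For comparison: the unweighted total is dominated by the cost trait.
unweighted_total = np.sum(step_2)
```

Without weights the total is about 900.9, almost all of it from cost. With these weights each trait contributes about 0.9, for a total of about 1.8.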

Question 3. Suppose you've already decided on a weight for each trait. These are loaded into an array called
weights in the cell below. weights.item(0) is the weight for the first trait, weights.item(1) is the weight
for the second trait, and so on. Use the revised method to compute a revised dissimilarity between Berkeley and
Stanford.

Hint: Using array arithmetic, your answer should be almost as short as in question 1.
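To illustrate the hint, here is a sketch comparing an explicit loop with array arithmetic on hypothetical three-trait arrays. The names and values are made up for this illustration; both versions compute the same weighted sum.

```python
import numpy as np

# Hypothetical trait values and weights for illustration only.
values_x = np.array([1.0, 4.0, 2.0])
values_y = np.array([3.0, 1.0, 2.5])
example_weights = np.array([2.0, 0.5, 4.0])

# Loop version: handle one trait at a time using .item(i).
loop_total = 0.0
for i in range(len(example_weights)):
    gap = abs(values_x.item(i) - values_y.item(i))
    loop_total += gap * example_weights.item(i)

# Array version: subtraction, abs, and multiplication all work element-wise.
array_total = np.sum(np.abs(values_x - values_y) * example_weights)

print(loop_total, array_total)  # 7.5 7.5
```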


In [9]: weights = Table.read_table("weights.csv").column("Weight")

revised_dissimilarity = sum(abs(stanford-berkeley) * weights)
revised_dissimilarity

Out[9]: 505.98313211458805


In [10]: _ = ok.grade('q1_3')

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running tests

---------------------------------------------------------------------
Test summary
Passed: 1
Failed: 0
[ooooooooook] 100.0% passed

