Summary of Data Science Skills Python DataCamp modules (325235-M-3)

Pages: 83
Uploaded on: 24-05-2023
Written in: 2022/2023

This document covers all DataCamp modules for the Data Science Skills course.

Summary Data Science Skills
Contents
Course 1: Introduction
1.1 Python basics
1.2 Python lists
1.3 Functions and packages
1.4 NumPy (Numeric Python)
Course 2: Intermediate Python
2.1 Matplotlib
2.2 Dictionaries & pandas
2.3 Logic, Control Flow and Filtering
2.4 Loops
2.5 Case study: hacker statistics
2.5 Summary
Course 3: DataFrames
3.1 Transforming DataFrames
3.2 Aggregating DataFrames; Summary statistics
3.3 Slicing and Indexing DataFrames
3.4 Creating and Visualizing DataFrames
Course 4: Supply Chain Analytics in Python
4.1 Basics of supply chain optimization and PuLP
4.2 Modeling in PuLP
4.3 Solve and evaluate model
4.4 Sensitivity and simulation testing of model
Course 5: Cleaning Data in Python
5.1 Common data problems
5.2 Text and categorical data problems
5.3 Advanced data problems
5.4 Record linkage
Course 6: Cluster analysis
6.1 Introduction to clustering
6.2 Hierarchical Clustering
6.3 K-Means clustering
6.4 Clustering in the real world
Course 7: Machine Learning with scikit-learn (model testing)
7.1 Classification
7.2 Regression
7.3 Fine-tuning your model
7.4 Preprocessing and pipelines
Course 8: Linear classifiers
8.1 Applying logistic regression and SVM
8.2 Loss functions
8.3 Logistic regression
8.4 Support Vector Machines (SVMs in detail)

Course 1: Introduction
1.1 Python basics
IPython shell: interactive; each command runs as soon as you enter it.
Python script: a text file of commands; use print() to generate output.
Use # to add comments in a Python script.

Calculator
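A minimal sketch of Python used as a calculator. The numbers are illustrative and do not come from the notes; only the standard arithmetic operators are shown.

```python
# Illustrative values only; none of these numbers come from the notes.
total = 5 + 3        # addition
difference = 5 - 3   # subtraction
product = 5 * 3      # multiplication
quotient = 10 / 4    # true division always returns a float
power = 4 ** 2       # exponentiation
remainder = 18 % 7   # modulo: remainder after division

print(total, difference, product, quotient, power, remainder)
```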




Variables and types
• Variables: a named piece of memory that stores a value.
- Syntax: name = value

Usage:
- Compute an expression's result,
- Store that result into a variable,
- And use that variable later in the program.
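The three usage steps above could look like the following sketch. The variable names and values (a height in metres, a weight in kilograms) are hypothetical and chosen only for illustration.

```python
# Hypothetical example values (height in metres, weight in kilograms).
height = 1.79
weight = 68.7

# Compute an expression's result and store it in a variable.
bmi = weight / height ** 2

# Use the stored variable later in the program.
print(round(bmi, 2))
```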

• Types: check with type(variable)
- Float: decimal number
- Integer: whole number
- String: text, written between quotes ('...' or "...")
- Boolean: True/False

> Operators behave differently for different types (e.g. + adds numbers but concatenates strings).
> When working with different types -> convert if necessary (int(), float(), str(), bool()) before using operators.
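As a rough illustration of the four types and of type conversion, assuming made-up variable names and values:

```python
# Hypothetical values, one per basic type.
savings = 100          # int
growth = 1.1           # float
label = "compound"     # str
profitable = True      # bool

print(type(savings), type(growth), type(label), type(profitable))

# The same operator behaves differently depending on the type.
number_sum = 2 + 3       # numeric addition
text_sum = "ab" + "cd"   # string concatenation
repeated = "ab" * 2      # string repetition

# Adding a str and an int raises a TypeError, so convert first.
message = "I started with $" + str(savings)
pi_float = float("3.14")  # str -> float
```

Without the str() call, the line building message would fail, because Python does not convert an integer to text implicitly.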

1.2 Python lists
Lists: store multiple values
• Lists: used for storing small amounts of one-dimensional data; a single list can contain different types.



- But arithmetic (matrix) operators (+, -, *, /, ...) do not work element-wise on lists.
- If you need efficient arrays with arithmetic and better multidimensional tools, use NumPy (section 1.4).

• Sublists: a list can contain other lists (sublists).
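A small sketch of a flat list and a list of sublists. The notes refer to a list called fam; the entries not quoted in the notes ("liz", 1.73, 1.71) are assumed here for illustration.

```python
# Flat list mixing strings and floats. Values not quoted in the notes
# ("liz", 1.73, 1.71) are assumed for illustration.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# The same data as a list of sublists, one [name, height] pair each.
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]

# Arithmetic operators are not element-wise on lists:
# * repeats the list instead of multiplying each height.
heights = [1.73, 1.68, 1.71, 1.89]
doubled = heights * 2

print(len(fam), len(fam2), len(doubled))
```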

Subsetting lists (accessing information in a list; indexes)

• Element: an item in the list, counted from 1. 1.68 is the fourth element.
• Index: the position of an element in the list, counted from 0. 1.68 has index 3.



> To select an element by index: fam[3] gives 1.68
> Negative indexes count from the end: fam[-1] gives 1.89

• Slicing: select multiple elements from a list, creating a new list.
Example: fam[3:5] returns [1.68, 'mom'] (indexes 3 and 4)

> [start:end] -> start is included, end is excluded!
> [:4] returns indexes 0, 1, 2 and 3 (elements 1, 2, 3, 4)
> [5:] returns indexes 5, 6, 7 (elements 6, 7, 8)
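The indexing and slicing rules above can be checked with a short sketch. The fam list is reconstructed from the values quoted in the notes; the entries at indexes 0, 1 and 5 are assumptions.

```python
# Entries at indexes 0, 1 and 5 are assumed; the rest appear in the notes.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fourth = fam[3]       # index 3 is the fourth element: 1.68
last = fam[-1]        # negative index counts from the end: 1.89

middle = fam[3:5]     # indexes 3 and 4; index 5 is excluded
first_four = fam[:4]  # indexes 0, 1, 2, 3
from_five = fam[5:]   # indexes 5, 6, 7

print(fourth, last, middle, first_four, from_five)
```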




Subsetting lists of lists
x = [["a", "b", "c"],
     ["d", "e", "f"],
     ["g", "h", "i"]]
x[row][column]
x[2][0] returns 'g' (sublist at index 2, index 0)
x[2][:2] returns ['g', 'h'] (sublist at index 2, indexes 0 and 1)
Manipulating lists (updating lists with commands)
• Changing the elements in a list (e.g. change, add, remove elements)


1. Change: fam[7] = 1.86 Changes the height of dad
2. Change slice: fam[0:2] = ["lisa", 1.74] Changes the elements at indexes 0 and 1

3. Add/extend: fam + ["me", 1.79] Returns a new list with "me" and 1.79 appended; assign the result to keep it

4. Remove: del(fam[2]) Removes "emma" from the list
> Watch out: the indexes of the remaining elements have now changed!
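The four manipulations, applied in the order listed, might look like this sketch. The starting list is reconstructed from the notes, with the entries at indexes 0, 1 and 5 assumed; fam_ext is a hypothetical name for the extended list.

```python
# Entries at indexes 0, 1 and 5 are assumed; the rest appear in the notes.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam[7] = 1.86                 # 1. change one element (dad's height)
fam[0:2] = ["lisa", 1.74]     # 2. change a slice (indexes 0 and 1)

# 3. + builds a new list; fam itself is left unchanged.
fam_ext = fam + ["me", 1.79]

del fam[2]                    # 4. remove "emma"

# After the deletion every later element has moved one index down,
# so fam[2] is now the height 1.68.
print(fam)
print(fam_ext)
```

Note that del fam[2] and del(fam[2]) are equivalent; del is a statement, so the parentheses are optional.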

How lists work




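> A list variable holds a reference to the list, not the values themselves. After y = x, x and y refer to the same list, so changing one also changes the other. > Solution: create y as a new list, e.g. y = list(x) or y = x[:].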
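A minimal sketch of the reference behaviour, using made-up list contents. The names x2 and y2 are hypothetical and only serve to show the copy case next to the shared case.

```python
# Assignment copies the reference, not the list.
x = ["a", "b", "c"]
y = x
y[1] = "z"          # also changes x, because x and y are the same object

# list() (or a full slice x2[:]) creates a new, independent list.
x2 = ["a", "b", "c"]
y2 = list(x2)
y2[1] = "z"         # x2 is left untouched

print(x, y, x is y)
print(x2, y2, x2 is y2)
```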
