
Summary Python Data Operations 3: Filtering

Pages: 10
Uploaded on: 09-12-2022
Written in: 2022/2023

Notes on pandas data operations covered in the Principles of Programming course, part of the Computer Science and AI bachelor's degree. The notes were originally written in Jupyter Notebook. They contain practical examples of data operations in Python and images that explain the structures and processes. This third section contains: - Conditional Selection - pd.Series and Operators - Basic Filters - Missing Values - Identify Nulls - Filter Nulls - Fill in Nulls - Remove Nulls - Unique Values


Document information

Summarized whole book? No
Chapters summarized: Pandas data wrangling
Uploaded on: December 9, 2022
Type: Summary


Python Data Operations 3: Filtering
(Using the numpy and pandas packages imported in section one.)

This third section contains:

Conditional Selection
pd.Series and Operators
Basic Filters
Missing Values

Identify Nulls
Filter Nulls
Fill in Nulls
Remove Nulls
Unique Values
# create test dataframe
import pandas as pd  # already imported in section one; repeated so this cell runs on its own

test_df = pd.DataFrame(
    [
        ['A3', 0, -1, 0, 'si'],
        ['B1', 1, None, 0, 'no'],
        ['B3', 4, None, 0, 'no'],
        ['B3', 5, 1, 0, 'si'],
        ['A1', 4, 0, None, None],
        ['A3', 1, 2, 1, 'si'],
        ['C2', 4, 1, 1, 'no'],
    ],
    columns=['A', 'B', 'C', 'D', 'E'],
    index=[f'R{i}' for i in range(7)],
)
test_df


     A  B    C    D     E
R0  A3  0 -1.0  0.0    si
R1  B1  1  NaN  0.0    no
R2  B3  4  NaN  0.0    no
R3  B3  5  1.0  0.0    si
R4  A1  4  0.0  NaN  None
R5  A3  1  2.0  1.0    si
R6  C2  4  1.0  1.0    no
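
The table shows -1.0 and NaN in column C even though the input held integers and None. A minimal sketch of why, assuming the same test_df and pandas imported as pd: pandas stores a missing numeric entry as NaN, which is a float, so a numeric column containing None is upcast to float64. The name null_counts is introduced here only for illustration.

```python
import pandas as pd

# same data as the test dataframe above
test_df = pd.DataFrame(
    [
        ['A3', 0, -1, 0, 'si'],
        ['B1', 1, None, 0, 'no'],
        ['B3', 4, None, 0, 'no'],
        ['B3', 5, 1, 0, 'si'],
        ['A1', 4, 0, None, None],
        ['A3', 1, 2, 1, 'si'],
        ['C2', 4, 1, 1, 'no'],
    ],
    columns=['A', 'B', 'C', 'D', 'E'],
    index=[f'R{i}' for i in range(7)],
)

# B has no missing entries and stays an integer column;
# C and D contain None, so they are stored as float64 with NaN
print(test_df.dtypes)

# number of missing entries per column
null_counts = test_df.isna().sum()
print(null_counts)
```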




Conditional selection
In pandas, conditional selection means filtering records according to certain criteria.

The syntax is df[filter], where filter is a sequence of boolean values of the same length
as the table (one value per row). Rows whose value is True are kept and rows whose value
is False are dropped, which lets us select records according to a certain condition.



# create simple filter: one boolean per row (note that this name shadows Python's built-in filter)
filter = [True, True, False, False, True, True, False]
filter

[True, True, False, False, True, True, False]


# apply simple filter
# with a plain list of booleans, .iloc and .loc select the same rows
test_df.iloc[filter, :]
test_df.loc[filter, :]


     A  B    C    D     E
R0  A3  0 -1.0  0.0    si
R1  B1  1  NaN  0.0    no
R4  A1  4  0.0  NaN  None
R5  A3  1  2.0  1.0    si
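
A hand-typed list of booleans is useful for illustration, but a filter is more often derived from the data. The following is a minimal sketch under assumptions: it rebuilds the same test_df, and the condition B > 1 and the names mask and filtered are chosen only for illustration. Comparing a column with a value yields a boolean pd.Series with one entry per row, which can be used as the filter.

```python
import pandas as pd

# same data as the test dataframe above
test_df = pd.DataFrame(
    [
        ['A3', 0, -1, 0, 'si'],
        ['B1', 1, None, 0, 'no'],
        ['B3', 4, None, 0, 'no'],
        ['B3', 5, 1, 0, 'si'],
        ['A1', 4, 0, None, None],
        ['A3', 1, 2, 1, 'si'],
        ['C2', 4, 1, 1, 'no'],
    ],
    columns=['A', 'B', 'C', 'D', 'E'],
    index=[f'R{i}' for i in range(7)],
)

# the comparison returns a boolean Series aligned with the row labels
mask = test_df['B'] > 1

# keep only the rows where the mask is True
filtered = test_df[mask]
print(filtered)

# .loc accepts the boolean Series directly; for .iloc it is
# converted to a plain NumPy array first
same_by_loc = test_df.loc[mask]
same_by_iloc = test_df.iloc[mask.to_numpy()]
```

Here the rows with B equal to 4 or 5 are kept, that is R2, R3, R4 and R6.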



# select columns by passing a list of column names
columns = ['A', 'B', 'C']
# apply the column filter
test_df.loc[:, columns]
test_df[columns]  # equivalent to the previous line
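
The row filter and the column list can also be combined in a single .loc call. This is a minimal sketch, assuming the same test_df, boolean list and column list as above; the names row_filter, by_loc, by_brackets and subset are introduced only for illustration.

```python
import pandas as pd

# same data as the test dataframe above
test_df = pd.DataFrame(
    [
        ['A3', 0, -1, 0, 'si'],
        ['B1', 1, None, 0, 'no'],
        ['B3', 4, None, 0, 'no'],
        ['B3', 5, 1, 0, 'si'],
        ['A1', 4, 0, None, None],
        ['A3', 1, 2, 1, 'si'],
        ['C2', 4, 1, 1, 'no'],
    ],
    columns=['A', 'B', 'C', 'D', 'E'],
    index=[f'R{i}' for i in range(7)],
)

columns = ['A', 'B', 'C']
row_filter = [True, True, False, False, True, True, False]

# the two column selections return the same table
by_loc = test_df.loc[:, columns]
by_brackets = test_df[columns]
print(by_loc.equals(by_brackets))

# rows and columns filtered in one step: .loc[rows, columns]
subset = test_df.loc[row_filter, columns]
print(subset)
```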
Seller: beatricemossberg (IE University)