
Summary Unit 10 Big Data - Assignment 2 Distinction

Pages: 19
Uploaded on: 12-06-2022
Written in: 2021/2022


Unit 10 – Assignment 2

Discrete data: whole numbers only, e.g. 1, 5, 2, 3.
Continuous data: numerical values that can include decimals, e.g. 1.45 or
52.35. This gives a more precise measurement of the data.
Ungrouped data: the raw data, which has not been sorted or categorised. It is
simply a list of numbers.
Grouped data: data that has been put together into categories and shown in
tables or graphs.
Central tendency: a single value that attempts to describe a set of data by
identifying its central position.


Excel formulas

Mode (single value)                    =MODE(Num1, Num2)
Median                                 =MEDIAN(Num1, Num2)
Mode (multiple values)                 =MODE.MULT(Num1, Num2)
Range                                  =MAX(range)-MIN(range)
Q1 (lower quartile, 25th percentile)   =QUARTILE.INC(range, 1)
Q3 (upper quartile, 75th percentile)   =QUARTILE.INC(range, 3)
Interquartile range (IQR)              =Q3cell-Q1cell
Variance                               =VAR.S(Num1, Num2)
Standard deviation                     =STDEV.S(Num1, Num2)
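
For readers who want to check the spread formulas outside Excel, the following is a minimal Python sketch of the range, quartile and interquartile range calculations. The scores are hypothetical values invented for illustration, not figures from the charity's spreadsheet, and the function name spread_summary is an assumption. Python's statistics.quantiles with method="inclusive" uses the same interpolation as QUARTILE.INC.

```python
import statistics

# Hypothetical percentages, not taken from the charity's spreadsheet.
scores = [40, 45, 50, 55, 60]


def spread_summary(values):
    """Return the range, Q1, Q3 and IQR for a list of numbers."""
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),  # like =MAX(range)-MIN(range)
        "q1": q1,                            # like =QUARTILE.INC(range, 1)
        "q3": q3,                            # like =QUARTILE.INC(range, 3)
        "iqr": q3 - q1,                      # Q3 minus Q1
    }


summary = spread_summary(scores)
print(summary)
```

The IQR is a plain subtraction, so no SUM wrapper is needed in either Excel or Python.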


Variance measures how far each number in the data set is from the mean and,
as a result, from every other number in the set. It is calculated from the
squared differences between each value and the mean.

Standard deviation is a measure of the amount of dispersion of the data set. Low
standard deviation means that the values are close to the mean, high standard
deviation means that the values are spread out.
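
To make the two definitions above concrete, here is a small Python sketch that works out the sample variance and sample standard deviation step by step, which is the calculation VAR.S and STDEV.S perform. The scores are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math
import statistics

# Hypothetical values, chosen so the mean is a whole number (5).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)

# Squared distance of every value from the mean.
squared_diffs = [(x - mean) ** 2 for x in scores]

# Sample variance divides by n - 1, as VAR.S does.
sample_variance = sum(squared_diffs) / (len(scores) - 1)

# Standard deviation is the square root of the variance, as in STDEV.S.
sample_std_dev = math.sqrt(sample_variance)

print(sample_variance, statistics.variance(scores))
print(sample_std_dev, statistics.stdev(scores))
```

Each print line shows the hand calculation next to the result from Python's statistics module, so the two can be compared.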

Introduction to the assignment.

In this assignment, I take the role of a new intern at an educational charity.
The director of the charity is interested in how big data and data analytics
might be used by the organisation to improve the way they target their efforts,
so that the work they do can be as effective and organised as possible.
Following the brief, the director has asked me to look into statistical tools
and techniques that can be used to analyse and manipulate data. The charity has
given me access to a data set that contains GCSE results in England. They want
to see how the results differ by local authority and by gender.

To judge how reliable each method is, we must first research the methods,
understand their properties and how they work, and understand how to apply them
to our data. We start with the median, which is the middle number in a set of
data once the numbers have been sorted in ascending or descending order. The
median can be more descriptive of the data set than the average (the mean),
which we will cover next. It can be used instead of the mean when there are
anomalies in the sequence that might distort the average of the values. To
determine the median of a sequence of numbers, the numbers must first be
arranged in order from the lowest to the highest. If the set has an even number
of values, the median is the average of the two middle values.
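
The sort-then-take-the-middle procedure can be sketched in Python as follows. The results list is hypothetical, and median_by_hand is an illustrative helper name, not part of any library. The second list adds one anomalous value to show why the median can be preferred over the mean.

```python
import statistics


def median_by_hand(values):
    """Sort the values and return the middle one (or the mean of the middle two)."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)  # lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical percentages, not taken from the charity's spreadsheet.
results = [52, 61, 47, 58, 55]
with_anomaly = results + [5]  # one unusually low value

print(median_by_hand(results), statistics.mean(results))
print(median_by_hand(with_anomaly), statistics.mean(with_anomaly))
```

In this sketch the single anomaly moves the median from 55 to 53.5, while the mean drops by more than 8 points.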




Luckily for us, we have access to software that can make our life easier by
arranging the numbers for us using a simple formula. The software used in this
example is Excel, and as you can see, we have a set of data for boys and their
GCSE results. For this example, we will be using the 5 A*-C column. When we
enter =MEDIAN(G4:G14), the software takes the numbers from cell G4 to cell G14,
which is all of the numbers in the set, and arranges them from the lowest to
the highest. The =MEDIAN function then calculates the median value for us, and
the result is shown to 1 decimal place.

The second method that we used on our set of data is the average. In
statistics, the mean of a set of numbers is the average value of those numbers.
To find the average, or mean, we add up all the numbers and divide the total by
how many numbers there are in the set. Like the median, the mean is a measure of
central tendency. It tells us the most typical number in a data set, or which
number best represents all of the numbers included in the set.

As we can see, we have access to software that will do the job for us. By
selecting the cells G4 to G14 and using the =AVERAGE function, we tell the
program to add up all of the numbers from G4 to G14 and divide the total by how
many there are. The average we receive is rounded to 1 decimal place.
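
The same add-up-and-divide calculation can be sketched in Python. The results below are hypothetical, and mean_1dp is an illustrative helper name chosen for this example.

```python
def mean_1dp(values):
    """Add up the values, divide by how many there are, round to 1 decimal place."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return round(sum(values) / len(values), 1)


# Hypothetical percentages, not taken from the charity's spreadsheet.
results = [52, 61, 47, 58, 55, 64, 50]

print(sum(results), len(results), mean_1dp(results))
```

Here the total is 387 across 7 values, so the mean is 55.2857..., which rounds to 55.3.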




Rounding the set of data. There are ways we can simplify our data, at least
visually, and make it easier to understand by reducing the number of decimal
places, in other words by rounding it. In Excel there is an option that allows
us to do this by simply clicking a button.

Let's say we want to reduce the number of digits we see in the columns for the
standard deviation and the variance. As shown, we have a lot of decimals, and
this can be confusing, especially for people who are not familiar with the
data. To reduce them, we simply click the Decrease Decimal button, which
reduces the decimal places shown when a field is highlighted. We can also
increase the decimals by using the Increase Decimal button next to it. An
example of what a reduced version of this data might look like is the following:

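As you can see, we went all the way from 27.57472.. to 28 and from 28.77072..
to 29. This can be useful when we have to add up and subtract data by eye: the
rounded figures give simpler, but less accurate, results. Note that the button
only changes how the number is displayed; Excel still stores the full value.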
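
The trade-off between simplicity and accuracy can be illustrated with a short Python sketch. The two numbers are the figures quoted above, used exactly as truncated in the text, so the real spreadsheet values have further decimals that are not known here.

```python
# Figures as quoted in the text; the original values continue past these digits.
value_a = 27.57472
value_b = 28.77072

# Round each figure to a whole number first, then add.
rounded_then_added = round(value_a) + round(value_b)

# Add the full figures first, then round the total.
added_then_rounded = round(value_a + value_b)

print(round(value_a), round(value_b))
print(rounded_then_added, added_then_rounded)
```

Adding the rounded figures gives 57, while rounding the true total gives 56, which is why rounding early produces simpler but less accurate results.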

Throughout this data set we have some anomalies. For example, when using the
mode, as shown in the diagram, Excel returns the error #N/A. The mode is the
value that appears most often in a data set, and it can be used as a measure of
central tendency. Excel's MODE function returns #N/A when no value occurs more
than once, so, as the image shows, no result is displayed for the mode. The
mode is therefore not a reliable measure for this data, because no value in the
range repeats.
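
A minimal Python sketch of this behaviour is shown below. The function name excel_style_mode and the sample values are assumptions made for illustration; the function returns None where Excel would show #N/A.

```python
from collections import Counter


def excel_style_mode(values):
    """Return the most frequent value, or None if no value repeats."""
    if not values:
        return None
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    # A value must appear at least twice to count as a mode.
    return value if count > 1 else None


# Hypothetical percentages: every value is unique, so there is no mode.
unique_results = [52.1, 61.4, 47.9, 58.3]
# Hypothetical percentages where 50 appears twice.
repeated_results = [50, 60, 50, 70]

print(excel_style_mode(unique_results))
print(excel_style_mode(repeated_results))
```

With continuous data such as percentages to several decimal places, repeated values are rare, which is why the mode often gives no result.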

Throughout this data set, we will be working with standard deviation. Standard
deviation is a measure that shows how much variation (spread or dispersion)
from the mean exists; it shows a typical deviation from the mean. It is a
popular measure of variability as it returns to