PART I: DATA ANALYTICS
Ask the Question: Types of Analysis
Descriptive - What happened?
Diagnostic - Why did it happen?
Predictive - Will it happen and if so, when?
Prescriptive - What should we do, based on what we expect to happen?

PART II: INTRO TO ACCOUNTING DATA
Data v. Information
● Data are simply raw facts that describe an event
○ May have little meaning on their own
● Information is defined as data organized in a meaningful way to be useful to the decision maker
○ Data serves as an input
Information = Processed Data + Context
What is Big Data? - Four V’s
1. Volume - the scale / amount of the data
2. Veracity - the certainty / trustworthiness of the data
3. Velocity - the speed of the data
4. Variety - the diversity of the data
Structured v. Unstructured Data
● Structured: Consolidated Statement of Cash Flows
● Unstructured: Blog and / or Tweets
The Fifth “V”?
● Big Data = the ability to achieve greater Value through insights from superior analytics
Sources of Accounting Data
● Financial Statements (publicly available)
● Financial Accounting-Related Data (publicly available)
● General Journal, Special Journal and General Ledger
● Managerial Accounting Data (privately available)
● Tax Data (privately available)
Four Principal Financial Statements
1. Balance Sheet
2. Income Statement
3. Statement of Cash Flows
4. Statement of Stockholders’ Equity
Where to find Financial Statements
● 10-K, 10-Q, 8-K, etc.
● SEC EDGAR
● Company’s Website
What is XBRL? eXtensible Business Reporting Language is a language for electronic communication of business data that allows companies to report financial information in a structured, machine-readable format.
Managerial Accounting Data
● Budget Data
● Standard Cost Data
● Point-of-Sale Transaction Data
● Supply Chain Data
● Customer Relationship Management Data
● Human Resource Data
Sources of Non-Accounting Data
● Macroeconomic Data
○ Gross Domestic Product
■ A measure of economy-wide performance
○ Unemployment Numbers
■ A measure of labor availability
○ Consumer Price Index
■ A measure of inflation
○ Housing Market Starts and Price Levels
■ A key measure of economic status
● Current and Historical Stock Prices
● Social Media
● Analyst Research Reports and Earnings Forecasts
● Bureau of Economic Analysis (BEA)
● Bureau of Labor Statistics (BLS)
● World Development Indicators (WDI)
● IMF, OECD, etc.
Potential (Structured) Data Sources: Helpful for Final Project
● Kaggle
○ Stock Market
○ IMDB Movies
● Data.world
○ Amazon Product Reviews
● NYC Open Data
○ Restaurant Safety

PART III: TYPES OF DATA
Categorical - tend to be represented by words
● Ex. Gender, Transaction Types
Numerical - meaningful numbers
● Ex. Transaction Amount, Net Income, Age
Rule of Thumb: Does it make sense to sum up two values? If yes, the variable is numerical.
Categorical Data
● Nominal Data - Categorical data that cannot be ranked
○ Gender - Male or Female
○ Transaction Type - Sale or Return
○ Depreciation Method - Straight-line, Declining Balance
● Ordinal Data - Categorical data that implies ranking and sorting
○ Gold, Silver and Bronze
○ Survey Answers: Agree, Indifferent, Disagree
○ Transaction Dates: How are the dates stored in Excel?
How to Summarize Categorical Data
● Nominal Data: Counting and Grouping, Proportion
● Ordinal Data: Counting and Grouping, Proportion, Ranking
Numerical Data
● Interval Data - an equal interval between each observation
○ SAT Scores, Temperature in Celsius and Fahrenheit
○ Calendar Years / Dates
● Ratio Data - numerical data with absolute “zero” as the point of origin
○ Height, Weight
○ Most Accounting figures: sales, net income, depreciation expense
Rule of Thumb: Is there a meaningful zero in this variable? If yes, the variable is ratio data; if not, it is interval data.
How to Summarize Numerical Data
● Counting and Grouping, Proportion, Summing, Averaging
Data Dictionaries
A Data Dictionary is a centralized repository of information about a data set
● Data about the Data. “Read Me”
It contains a separate record for each field (or variable)

PART IV: PREPARING FOR DATA ANALYSIS
Excel, Tableau and Databases
Excel: Best at data analysis and exploring data. Tool for creating PivotTables and graphs and for performing statistical analysis.
Tableau: Best for data visualization; does not allow original data entry.
What about databases?
A database is a structured data set that can be accessed from a computer system.
● Designed to hold big data (Excel is limited to 1,048,576 rows by 16,384 columns)
● Allow multiple users to access the data at the same time
● The most secure method of storing data
The most popular database model is relational.
A database query is a request for data so we can retrieve it.
● Structured Query Language (SQL) is the most popular query language
Relational Databases
Relational databases break the data into separate tables, each containing a unique list of the items stored
● Customer Table
● Billing Table
Tables are linked with each other by primary keys and foreign keys
Components of a Relational Database
Tables: data organized into sets of columns and rows
● Primary Key
○ Unique Identifier in a table
○ Each table must have a Primary Key
○ Ex. SSN, Costco Membership, Phone #, EmplID, student ID
● Foreign Key
○ “Linking variable” that serves as a bridge between two tables
○ Each table does NOT need a foreign key
“Read-Me” for Relational Databases
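The way primary keys and foreign keys link a Customer Table to a Billing Table can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table names, column names (customer_id, invoice_id, amount) and sample rows are assumptions made up for the example, not part of the notes.

```python
import sqlite3

# In-memory database; every table, column and row below is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Customer table: customer_id is the primary key (unique identifier).
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")

# Billing table: has its own primary key, plus a foreign key that links
# each invoice back to exactly one customer.
conn.execute("""
    CREATE TABLE billing (
        invoice_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    )
""")

conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Avery"), (2, "Blake")])
conn.executemany("INSERT INTO billing VALUES (?, ?, ?)",
                 [(101, 1, 250.0), (102, 1, 100.0), (103, 2, 75.0)])

# A SQL query: join the two tables on the key and total the billings per customer.
rows = conn.execute("""
    SELECT c.name, SUM(b.amount)
    FROM customer AS c
    JOIN billing AS b ON b.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# An invoice that points to a customer_id with no matching customer is rejected.
try:
    conn.execute("INSERT INTO billing VALUES (104, 99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows)
print(orphan_rejected)
```

Here the customer table needs no foreign key at all, while the billing table carries one, which matches the point that every table must have a primary key but not every table needs a foreign key.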
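The summaries listed under How to Summarize Categorical Data and How to Summarize Numerical Data can be tried out in a few lines of Python. The sketch below uses only the standard library; the sample transactions and the ranking assigned to the survey answers are hypothetical values chosen for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical records: (transaction type, survey answer, transaction amount)
transactions = [
    ("Sale", "Agree", 120.0),
    ("Sale", "Disagree", 80.0),
    ("Return", "Indifferent", 40.0),
    ("Sale", "Agree", 160.0),
]

# Nominal data (transaction type): counting/grouping and proportion only.
type_counts = Counter(t[0] for t in transactions)
type_share = {k: v / len(transactions) for k, v in type_counts.items()}

# Ordinal data (survey answer): counting plus ranking.
# The order must be stated explicitly; the words alone do not sort correctly.
rank = {"Disagree": 0, "Indifferent": 1, "Agree": 2}
answers_sorted = sorted((t[1] for t in transactions), key=rank.__getitem__)

# Ratio data (transaction amount): summing and averaging are meaningful.
amounts = [t[2] for t in transactions]
total_amount = sum(amounts)
average_amount = mean(amounts)

print(type_counts, type_share)
print(answers_sorted)
print(total_amount, average_amount)
```

Summing the transaction types or averaging the survey answers would not make sense, which is the same test the rule of thumb applies when separating categorical from numerical variables.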