Big Data Questions and Answers
What is Big Data? - Answer-Big data is a term used to describe any data set that is so large and complex that it is difficult to process using traditional applications.
T/F: Big Data is an objective term? - Answer-False.
Describe at least three sources of Big Data. - Answer-Archives, machine logs, the public web, sensor data, social media.
State and explain the characteristics of Big Data: Volume - Answer-The vast amount of data that must be dealt with.
State and explain the characteristics of Big Data: Velocity - Answer-The speed at which data is being received and processed.
State and explain the characteristics of Big Data: Variety - Answer-The many sources from which Big Data can be drawn.
State and explain the characteristics of Big Data: Variability - Answer-The inconsistencies which are often found in Big Data sets.
State and explain the characteristics of Big Data: Veracity - Answer-The inaccuracies which are often found within big data.
State and explain the characteristics of Big Data: Complexity - Answer-The challenges of linking various sources of data to infer a trend.
How is Big Data used? - Answer-It is used to draw trends and patterns from large and varied data sets.
Why is Big Data used? - Answer-It is used to show relationships and dependencies between events.
Provide a real world example of Big Data. - Answer-Obama's 2012 reelection campaign.
What is Hadoop and why is it used? - Answer-Hadoop is an open source software product for distributed storage and processing of Big Data. It builds a parallel database architecture running across many different nodes.
Why do traditional DMSs fail when big data is involved? - Answer-They aren't flexible enough to handle the variety and velocity of the data.
T/F: The big data itself can provide information on the domain it represents. - Answer-False.
Briefly explain how big data analytics can be used to benefit a business. - Answer-It can be used to predict customer behaviours and preferences.
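The Hadoop answer above describes processing spread across many nodes. As a rough illustration only, the sketch below imitates the map, shuffle, and reduce steps of a word count on a single machine. It does not use Hadoop itself; the function names and the idea that each string in `chunks` stands in for a block held by a different node are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(chunks):
    # Assumption: each chunk represents data stored on a separate node.
    # Here the chunks are processed in a simple loop instead of in parallel.
    pairs = []
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


chunks = ["big data is big", "data is varied"]
counts = word_count(chunks)
print(counts)
```

In a real cluster the map calls would run on the nodes that hold each chunk, and only the grouped intermediate pairs would travel over the network.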
Briefly explain how big data analytics can be used in the financial industry. - Answer-It can be used to make strategic trading decisions.
Describe some benefits of using a column-oriented database for storing big data. - Answer-Reduces the computation required for queries.
Why are NoSQL databases good for implementing big data storage solutions? - Answer-They are designed to be scalable, which helps facilitate big data storage.
State a benefit and drawback to using direct-attached storage in a hyperscale computing environment. - Answer-They allow mirrors to support constant availability.
T/F: Direct-attached hyperscale computer environments include shared storage. - Answer-False.
State the four main architectures of parallel databases. - Answer-Shared memory, shared disk, shared nothing, hierarchical (hybrid).
List a few major open source technologies used to manipulate big data. - Answer-Hadoop, MongoDB, CouchDB, Cassandra.
What advice would you give to someone about to venture into big data analytics? - Answer-Gather as much data relevant to the domain being analyzed as possible, and avoid queries that will not provide any value.
What are Enterprise Resource Planning Systems and when were they first developed? - Answer-Software solutions used by businesses to assist in organizing the resources used by a firm. 1976.
When was MapReduce developed and what purpose did it serve? - Answer-Developed by Google and published in 2004; it is a programming model that splits a job into map and reduce tasks that run in parallel over data spread across a distributed network.
Explain the difference between Shared Disk, Shared Memory, and Shared Nothing Architectures. - Answer-In shared disk, nodes share only the disks; in shared memory, processors share both memory and disks; in shared nothing, nodes share neither and only communicate with one another over the network.
Dirty data is defined as "unreadable data or attributes due to irrelevant data and becomes inconsistent with other data". What is one negative effect of that? - Answer-Data sets can't be merged.
Why are E-R Models not scalable with Big Data? - Answer-E-R tables in SQL take longer to search for relations than clustering.
Given lots of sharing of Big Data, what is it called when network speeds are at a loss? A) Big Data Research and Development Initiative B) Time-Sensitive Network Cleaning C) Distributed Systems D) Bottleneck Networking - Answer-D) Bottleneck Networking
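The column-oriented database answer above says that less computation is needed for queries. A minimal sketch of why, using plain Python lists instead of a real database: when values are stored per column, an aggregate over one attribute reads only that column. The sample records and function names are made up for illustration.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "EU", "sales": 120.0},
    {"id": 2, "region": "US", "sales": 300.0},
    {"id": 3, "region": "EU", "sales": 80.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_sales_row_store(rows):
    # Every full record has to be visited to pick out one field.
    return sum(row["sales"] for row in rows)


def total_sales_column_store(columns):
    # Only the "sales" column is read; "id" and "region" are never touched.
    return sum(columns["sales"])


columns = to_columns(rows)
print(total_sales_row_store(rows), total_sales_column_store(columns))
```

Both functions return the same total; the difference in a real system is how much data has to be read from storage to get it.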
Written for
- Institution: Big Data
- Course: Big Data
Document information
- Uploaded on: May 1, 2024
- Number of pages: 4
- Written in: 2023/2024
- Type: Exam (elaborations)
- Contains: Questions & answers