100% Pass | Revised
Hadoop - ✔✔- Software that is used to efficiently process large datasets
- Allows clustering commodity hardware together to analyze massive data sets in parallel
- Amazon EMR makes it easy to create and manage fully configured, elastic clusters in the Hadoop
ecosystem
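The "split the data, process the pieces in parallel, merge the results" idea behind Hadoop can be sketched in plain Python. This is only an illustration of the map/reduce pattern using the standard library. It does not use Hadoop or Amazon EMR, threads stand in for cluster nodes, and the names `map_chunk` and `word_count` and the sample data are made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map step: count the words in one chunk of the dataset."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def word_count(lines, workers=2):
    """Split the data into chunks, count each chunk in parallel, then merge."""
    size = max(1, len(lines) // workers)
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total = Counter()
    for partial in partials:  # Reduce step: merge the per-chunk counts.
        total.update(partial)
    return total


if __name__ == "__main__":
    data = ["big data big clusters", "data in parallel", "big jobs"]
    print(word_count(data))
```

In a real cluster each chunk would live on a different machine, which is what lets the work scale beyond a single computer.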
Cache - ✔✔- A temporary storage space or memory that allows fast access to data
Latency - ✔✔- The delay before a transfer of data begins following an instruction for its transfer
- Want to reduce this
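One rough way to observe latency is to time how long an operation takes to respond. The sketch below is an assumption-laden illustration: `measure_latency` is a hypothetical helper, and it times the whole call rather than only the delay before the transfer begins.

```python
import time


def measure_latency(operation, repeats=5):
    """Return the average time in seconds that `operation` takes to complete."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        total += time.perf_counter() - start
    return total / repeats


if __name__ == "__main__":
    # time.sleep stands in for a slow network or disk request.
    average = measure_latency(lambda: time.sleep(0.01))
    print(f"average latency: {average * 1000:.1f} ms")
```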
Hardware virtualization - ✔✔- Virtualization of computers as complete hardware platforms, or only the
functionality required to run various operating systems
- Virtualization hides the physical characteristics of a computing platform from the users, presenting
instead an abstract computing platform
How to protect against database failures - ✔✔- Back up your backups (nightly)
Crafted for Academic Insight by KatelynWhitman. All rights reserved © 2025
- Storage in multiple places
Sharding - ✔✔- Type of database partitioning
- Separates very large databases into smaller, faster, more easily managed parts called data shards
- Can store data across multiple machines
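A minimal sketch of how records might be routed to shards, assuming hash-based sharding (one common scheme; range-based and directory-based sharding also exist). `shard_for` and `ShardedStore` are hypothetical names, and each in-memory dict stands in for a database on a separate machine.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy key-value store split across several in-memory shards."""

    def __init__(self, num_shards=4):
        # Each dict stands in for a database on a separate machine.
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    for user in ["alice", "bob", "carol", "dave"]:
        store.put(user, {"name": user})
    print(store.get("carol"))
    print([len(shard) for shard in store.shards])  # Records per shard.
```

Because the same key always hashes to the same shard, a lookup only has to touch one of the smaller parts instead of the whole database.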
Caching - ✔✔- Process of storing data in a cache (a temporary storage area in a computing environment)
- Shortens data access times, reduces latency, and improves input/output
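The effect of caching can be sketched with a small in-memory cache placed in front of a slow lookup. Everything here is illustrative: `SimpleCache` and `slow_lookup` are hypothetical names, and the 50 ms sleep stands in for a database or network call.

```python
import time


class SimpleCache:
    """Toy in-memory cache placed in front of a slow lookup function."""

    def __init__(self, loader):
        self.loader = loader  # Slow function that fetches the real data.
        self.data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.data:
            self.hits += 1  # Served from the cache, no slow call needed.
            return self.data[key]
        self.misses += 1
        value = self.loader(key)
        self.data[key] = value
        return value


def slow_lookup(key):
    """Hypothetical stand-in for a database or network call."""
    time.sleep(0.05)
    return key.upper()


if __name__ == "__main__":
    cache = SimpleCache(slow_lookup)
    for attempt in range(2):
        start = time.perf_counter()
        cache.get("user:1")
        print(f"attempt {attempt + 1}: {time.perf_counter() - start:.3f}s")
    print("hits:", cache.hits, "misses:", cache.misses)
```

The first access pays the full cost of the slow lookup; the repeat access is answered from memory, which is where the latency reduction comes from.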
Autoscaling AWS - ✔✔Helps you maintain application availability and scale your Amazon EC2 capacity
up or down automatically
TCP and UDP - ✔✔Two types of IP traffic
TCP (Transmission Control Protocol) - ✔✔- Connection oriented
- Once a connection is established, data can be sent bidirectionally
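A small sketch of the connection-oriented, bidirectional behaviour, using Python's standard `socket` module on the loopback interface. `tcp_echo_once` is a hypothetical helper written for this example, and it assumes the short message arrives in a single `recv` call, which is not guaranteed for large payloads.

```python
import socket
import threading


def tcp_echo_once(message):
    """Send `message` over a loopback TCP connection and return the echoed reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # Port 0 lets the OS pick a free port.
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()  # The connection is established here.
        with conn:
            conn.sendall(conn.recv(1024))  # Reply on the same connection.

    worker = threading.Thread(target=serve)
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)    # Client -> server
        reply = client.recv(1024)  # Server -> client

    worker.join()
    server.close()
    return reply


if __name__ == "__main__":
    print(tcp_echo_once(b"hello"))
```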
UDP (User Datagram Protocol) - ✔✔- Simpler, connectionless transport protocol
- Messages are sent as independent packets (datagrams), with no guarantee of delivery or ordering
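For comparison, a sketch of sending a single UDP packet with Python's standard `socket` module. `udp_send_once` is a hypothetical helper; note that there is no listen/accept step, and the timeout exists because UDP itself does not guarantee that the packet arrives.

```python
import socket


def udp_send_once(message):
    """Send one datagram over loopback UDP and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # Port 0 lets the OS pick a free port.
    receiver.settimeout(2.0)         # Give up if the packet never arrives.
    port = receiver.getsockname()[1]

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connection setup: the packet is simply addressed and sent.
        sender.sendto(message, ("127.0.0.1", port))
        data, _ = receiver.recvfrom(1024)
    finally:
        sender.close()
        receiver.close()
    return data


if __name__ == "__main__":
    print(udp_send_once(b"ping"))
```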
Bastion host / jump box - ✔✔Bastion host - computer on a network specifically designed and configured
to withstand attacks