Class notes

All you need to know for WGU Computer Architecture

Pages
8
Uploaded on
25-02-2025
Written in
2024/2025

This document describes all of the vocabulary and formulas needed to pass the C952 Computer Architecture course at Western Governors University online for a Bachelor's of Science degree in Computer Science.











Document information

Professor(s)
James Santorini
Contains
All classes

WGU Computer Architecture C952 2025
machine language - The language made up of binary-coded instructions that is used directly by the computer
system software - set of programs that enables a computer's hardware devices & application software to work together; it includes the operating system & utility programs.
operating system - software that controls the execution of computer programs and may provide various services
Assembly Language Programming - language that has the same structure & set of commands as machine languages but allows programmers to use symbolic representations of
numeric machine code.
IBM 360/91 - Introduced many new concepts, including dynamic detection of memory hazards, generalized forwarding, and reservation stations; its dynamic scheduling scheme is known as Tomasulo's algorithm. The internal organization of the 360/91 shares many features with the Pentium III and Pentium 4, as well as with several other microprocessors. One major difference was that there was no branch prediction in the 360/91 and hence no speculation. Another major difference was that there was no commit unit, so once the instructions finished execution, they updated the registers.
Dynamic Random Access Memory (DRAM) - Memory built as an integrated circuit; it provides random access to any location. Access times are 50 nanoseconds & cost per gigabyte in 2012 was $5 to $10. Multiple DRAMs are used together to contain the instructions and data of a program. In contrast to sequential access memories, such as magnetic tapes, the RAM portion of the term DRAM means that memory accesses take basically the same amount of time no matter what portion of the memory is read. Modern DRAMs are organized in banks, and each bank consists of a series of rows.
frame buffer - A portion of RAM containing a bitmap that drives a video display. It is a memory buffer containing a complete frame of data. The image to be represented onscreen is stored in the frame buffer, and the bit pattern per pixel is read out to the graphics display at the refresh rate. A simplified design might use just 4 bits per pixel.
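Since a frame buffer holds one complete frame, its size follows from the resolution and the bits per pixel. A minimal sketch, assuming a hypothetical 640 x 480 display (the resolution is not from the notes) and the 4 bits per pixel mentioned above:

```python
def frame_buffer_bytes(width, height, bits_per_pixel):
    """Return the number of bytes needed to hold one complete frame."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical 640 x 480 display with the simplified 4 bits per pixel.
print(frame_buffer_bytes(640, 480, 4))  # 153600
```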
Datapath - The component of the processor that performs arithmetic operations
Control - The component of the processor that commands the datapath, memory, and I/O devices according to the instructions of the program.
Register File - A state element that consists of a set of registers that can be read and written by supplying a register number to be accessed.
provides 1024 scalar 32-bit registers for up to 64 threads.
Integrated circuit - Also called a chip. A device combining dozens to millions of transistors.
Central processor unit (CPU) - Also called processor. The active part of the computer, which contains the datapath & control and which adds numbers, tests numbers, signals I/O
devices to activate, and so on.
Static random access memory (SRAM) - Also memory built as an integrated circuit, but faster and less dense than DRAM.
Instruction set architecture - Also called architecture. An abstract interface between the hardware and the lowest-level software that encompasses all the information necessary to
write a machine language program that will run correctly, including instructions, registers, memory access, I/O, and so on.
Application binary interface (ABI) - The user portion of the instruction set plus the operating system interfaces used by application programmers. It defines a standard for binary
portability across computers. An application binary interface (ABI) is the interface between a user program and the kernel.
Volatile memory - Storage, such as DRAM, that retains data only if it is receiving power.
Nonvolatile Memory - A form of memory that retains data even in the absence of a power source and that is used to store programs between runs. A DVD disk is nonvolatile.
Magnetic disk - Also called hard disk. A form of nonvolatile secondary memory composed of rotating platters coated with a magnetic recording material. Because they are rotating
mechanical devices, access times are about 5 to 20 milliseconds and cost per gigabyte in 2012 was $0.05 to $0.10
Main memory - Also called primary memory. Memory used to hold programs while they are running; typically consists of DRAM in today's computers.
Secondary memory - Nonvolatile memory used to store programs and data between runs; typically consists of flash memory in PMDs and magnetic disks in servers.
Flash memory - A nonvolatile semiconductor memory. It is cheaper and slower than DRAM but more expensive per bit and faster than magnetic disks. Access times are about 5 to 50
microseconds and cost per gigabyte in 2012 was $0.75 to $1.00.
Single Instruction Single Data (SISD) - A uniprocessor
Multiple Instruction Multiple Data (MIMD) - A multiprocessor.
Single Program, Multiple Data Streams (SPMD) - The conventional MIMD programming model, where a single program runs across all processors.
Single Instruction Stream, Multiple Data Streams (SIMD) - The same instruction is applied to many data streams, as in a vector processor.
Data-level parallelism - Parallelism achieved by performing the same operation on independent data
vector-based code - Code designed to take advantage of a processor's vector processing capabilities, so that a single instruction performs the same operation on multiple data elements simultaneously. It operates on entire "vectors" of data at once rather than processing each data point individually, which significantly speeds up data-intensive tasks like scientific calculations or image processing.
LEGv8 - The assembly instruction set used in these notes; a subset of the ARMv8 instruction set.
multimedia extensions (MMX) - An expanded set of instructions supported by a processor that provides multimedia-specific functions.
data hazard (pipeline data hazard) - When a planned instruction cannot execute in the proper clock cycle because data that is needed to execute the instruction are not yet available.
forwarding (bypassing) - A method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer-
visible registers or memory
Structural hazard - When a planned instruction cannot execute in the proper clock cycle because the hardware does not support the combination of instructions that are set to
execute.
Pipelining - Technique that allows the CPU to work on more than one instruction at a time. Formula: total process time = [longest task time * (number of loads - 1)] + time for one complete load.
R-format ALU operations - Require the register file and the ALU.
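The pipelining formula can be checked with a few lines of code. This is a sketch only: the function names are made up for illustration, and it assumes a clocked pipeline in which every stage takes as long as the longest stage, so one complete load takes (number of stages * longest stage).

```python
def pipelined_total_time(stage_times, num_loads):
    """Total time using: longest * (loads - 1) + time for one complete load."""
    longest = max(stage_times)
    # Assumption: each stage is stretched to the longest stage time.
    one_load = longest * len(stage_times)
    return longest * (num_loads - 1) + one_load


def sequential_total_time(stage_times, num_loads):
    """Total time when each load finishes before the next one starts."""
    return sum(stage_times) * num_loads


# Laundry analogy: 4 stages of 30 minutes each, 4 loads.
print(pipelined_total_time([30, 30, 30, 30], 4))   # 210
print(sequential_total_time([30, 30, 30, 30], 4))  # 480

# Five instruction stages in picoseconds, 3 instructions.
print(pipelined_total_time([200, 100, 200, 200, 100], 3))   # 1400
print(sequential_total_time([200, 100, 200, 200, 100], 3))  # 2400
```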
Output - The results of the operation of any system.
Program Counter - The register that contains the address of the next instruction to be executed
temporal locality - The principle stating that if a data location is referenced then it will tend to be referenced again soon.
spatial locality - The principle stating that if a data location is referenced, data locations with nearby addresses will tend to be referenced soon.
Memory hierarchy - A structure that uses multiple levels of memories; as the distance from the processor increases, the size of the memories and the access time both increase.
Block (or line) - The minimum unit of information that can be either present or not present in a cache.
Hit rate - The fraction of memory accesses found in a level of the memory hierarchy.
Miss rate - The fraction of memory accesses not found in a level of the memory hierarchy
miss penalty - The time required to fetch a block into a level of the memory hierarchy from the lower level, including the time to access the block, transmit it from one level to the
other, insert it in the level that experienced the miss, and then pass the block to the requestor.
Hit time - The time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss.
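Hit rate and miss rate can be measured by replaying a list of addresses against a small cache model. The sketch below assumes a direct-mapped cache (each block has exactly one possible location); the function name, block size, and address trace are hypothetical.

```python
def hit_and_miss_rate(addresses, block_size, num_blocks):
    """Replay byte addresses on a direct-mapped cache; return (hit rate, miss rate)."""
    cache = {}  # cache index -> block number currently stored there
    hits = 0
    for addr in addresses:
        block = addr // block_size   # which memory block holds this byte
        index = block % num_blocks   # the single place that block may live
        if cache.get(index) == block:
            hits += 1
        else:
            cache[index] = block     # miss: fetch the block, replacing the old one
    hit_rate = hits / len(addresses)
    return hit_rate, 1 - hit_rate


# Nearby addresses (4, 8, 12) hit thanks to spatial locality;
# the repeated 0 and 4 hit thanks to temporal locality.
trace = [0, 4, 8, 12, 16, 0, 4, 64]
print(hit_and_miss_rate(trace, block_size=16, num_blocks=4))  # (0.625, 0.375)
```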
Parallelization - consists of dividing a program into separate components that run in parallel on individual computers in the cluster
Superscalar - Technique primarily associated with hardware. Functional units (ALU, Floating Point Unit, Load/Store Unit) are duplicated in the pipeline of a superscalar processor, which allows the hardware to issue multiple instructions in the same clock cycle, each to a different unit.
ARM architecture - A type of reduced instruction set computing (RISC) architecture: a set of rules that define how hardware & software interact when a program is run on a processor. It can support 16-bit instruction encodings.
Amdahl's Law - A formula used to find the maximum improvement possible by improving a particular part of a system. In parallel computing, Amdahl's law is mainly used to predict the theoretical maximum speedup for program processing using multiple processors.
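In its usual parallel-computing form, Amdahl's law gives speedup = 1 / ((1 - p) + p / n), where p is the fraction of the program that can be parallelized and n is the number of processors. A short sketch; the function names and the 90% example are illustrative assumptions, not values from the notes.

```python
def amdahl_speedup(parallel_fraction, num_processors):
    """Speedup when only parallel_fraction of the work is sped up."""
    serial_fraction = 1 - parallel_fraction
    return 1 / (serial_fraction + parallel_fraction / num_processors)


def amdahl_max_speedup(parallel_fraction):
    """Upper bound on speedup as the number of processors grows without limit."""
    return 1 / (1 - parallel_fraction)


# Hypothetical program that is 90% parallelizable.
print(round(amdahl_speedup(0.9, 10), 2))   # 5.26
print(round(amdahl_max_speedup(0.9), 2))   # 10.0
```

Even with unlimited processors, the serial 10% caps the speedup at about 10x.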
Multiprocessor - A term used to refer to a computer with more than one CPU.
Uniform Memory Access (UMA) - A multiprocessor in which latency to any word in main memory is about the same no matter which processor requests the access; all processors in the system have equal & consistent access to a shared memory pool.
Non-Uniform Memory Access (NUMA) - A multiprocessor in which memory access times vary, because of the system hardware, depending on which processor requests which word.
loop unrolling - A technique to get more performance from loops that access arrays, in which multiple copies of the loop body are made & instructions from different iterations are
scheduled together.
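The shape of an unrolled loop can be shown in any language. The sketch below unrolls an array sum by a factor of 4 (an arbitrary choice). It only illustrates the transformation: the performance benefit comes from a compiler or processor scheduling the independent copies together, not from Python itself.

```python
def sum_rolled(values):
    """Original loop: one array element per iteration."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


def sum_unrolled_by_4(values):
    """Unrolled loop: four copies of the loop body per iteration."""
    n = len(values)
    limit = n - n % 4            # largest multiple of 4 that fits
    t0 = t1 = t2 = t3 = 0        # independent accumulators, one per copy
    i = 0
    while i < limit:
        t0 += values[i]
        t1 += values[i + 1]
        t2 += values[i + 2]
        t3 += values[i + 3]
        i += 4
    total = t0 + t1 + t2 + t3
    for j in range(limit, n):    # leftover elements when n is not a multiple of 4
        total += values[j]
    return total


data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))  # 45 45
```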
Blocking - An optimization that restructures a program to operate on blocks (submatrices) of data small enough to fit in the cache, so the data is reused before it is replaced; it can help reduce the cache miss rate.
Set Associative Cache - A cache that has a fixed number of locations (at least two) where each block can be placed.
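A block's set is commonly found as (block number) modulo (number of sets), and the block may go in any way of that set. A small sketch under that assumption; the function name and cache shape are hypothetical.

```python
def candidate_locations(block_number, num_sets, ways):
    """Return every (set, way) pair where the block is allowed to be placed."""
    set_index = block_number % num_sets
    return [(set_index, way) for way in range(ways)]


# 8-block cache organized as 4 sets of 2 ways (2-way set associative).
print(candidate_locations(12, num_sets=4, ways=2))  # [(0, 0), (0, 1)]
print(candidate_locations(13, num_sets=4, ways=2))  # [(1, 0), (1, 1)]
```

With ways=1 this degenerates to a direct-mapped cache, and with num_sets=1 to a fully associative one.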
RAID 0 (disk striping) - Requires at least two drives. It does not provide redundancy to data. If any one drive fails, all data is lost.
RAID 1 (mirroring) - Two drives are used in unison, and all data is written to both drives, giving you a mirror or extra copy of the data in case one drive fails.
RAID 2 - Bit-level striping with dedicated Hamming-code parity. OBSOLETE.
RAID 3 - Byte-level striping with dedicated parity. OBSOLETE, replaced with RAID 5.
RAID 4 - Block-level striping with dedicated parity. Not often used, replaced with RAID 5.
RAID 5 - Disk striping with distributed parity. RAID 5 uses three or more disks & provides fault tolerance; it can survive the failure of one drive.
RAID 6 - Disk striping with double parity. RAID 6 uses four or more disks and provides fault tolerance. It can survive the failure of two drives.
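Single-parity RAID levels can rebuild one lost block because the parity is the XOR of the data blocks. A minimal sketch of that idea; the function names and the sample blocks are hypothetical, and real arrays also rotate parity across disks, which is not modeled here.

```python
def parity_block(blocks):
    """XOR equally sized blocks together, byte by byte."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing block by XORing the survivors with the parity."""
    return parity_block(list(surviving_blocks) + [parity])


stripe = [b"ABCD", b"EFGH", b"IJKL"]   # data blocks on three disks
parity = parity_block(stripe)          # stored on a fourth disk

# The disk holding b"EFGH" fails; rebuild it from the other disks.
print(rebuild_missing([stripe[0], stripe[2]], parity))  # b'EFGH'
```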
silicon crystal ingot - A rod composed of a silicon crystal that is between 8 and 12 inches in diameter and about 12 to 24 inches long.
Wafer - A slice from a silicon ingot no more than 0.1 inches thick, used to create chips.

Transistor - An on/off switch controlled by an electric signal.
very large-scale integrated (VLSI) circuit - A device containing hundreds of thousands to millions of transistors.
Silicon - A natural element that is a semiconductor; it can conduct electricity under certain conditions, allowing for precise control over electrical current flow. With a special chemical process, materials can be added to silicon so that tiny areas become one of three devices: excellent conductors of electricity (using either microscopic copper or aluminum wire), excellent insulators from electricity (like plastic sheathing or glass), or areas that can conduct or insulate under special conditions (as a switch).
Semiconductor - A substance that can conduct electricity under some conditions
Die - The individual rectangular sections that are cut from a wafer, more informally known as chips.
complementary metal-oxide semiconductor (CMOS) - Dominant technology for integrated circuits.
LEGv8 word - A natural unit of access in a computer, usually a group of 32 bits.
LEGv8 doubleword - Another natural unit of access in a computer, usually a group of 64 bits (8 bytes); corresponds to the size of a register in the LEGv8 architecture.
LEGv8 register - 64 bits wide. More registers will lead to a slower clock frequency.
Smaller is faster - A very large number of registers may increase the clock cycle time simply because it takes electronic signals longer when they must travel farther. Guidelines such as "smaller is faster" are not absolutes; 31 registers may not be faster than 32. Even so, the truth behind such observations causes computer designers to take them seriously. In this case, the designer must balance the craving of programs for more registers with the designer's desire to keep the clock cycle fast. Another reason for not using more than 32 is the number of bits it would take in the instruction format.
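The instruction-format cost of extra registers is easy to quantify: naming one of N registers takes ceil(log2(N)) bits, and an instruction with several register fields pays that cost once per field. A sketch; the function name is made up, and the three-field example assumes an R-format style instruction with three register operands.

```python
def register_field_bits(num_registers):
    """Bits needed in an instruction field to name one of num_registers."""
    return max(1, (num_registers - 1).bit_length())


print(register_field_bits(32))  # 5
print(register_field_bits(33))  # 6 (one extra register costs a whole bit)

# Assumption: an instruction with three register fields.
print(3 * register_field_bits(32))  # 15
print(3 * register_field_bits(64))  # 18
```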
Data transfer instruction - A command that moves data between memory and registers.
Address - A value used to delineate the location of a specific data element within a memory array.
Load - Data transfer instruction that copies data from memory to a register.
LEGv8 LDUR - Load register instruction. The sum of the constant portion of the instruction and the contents of the second register forms the memory address. The U in LDUR stands for unscaled immediate.
base address - Starting address of an array in memory (for example, 5000).
base register - Register that holds an array's base address (for example, X22).
Offset - A constant value added to a base address to locate a particular array element (for example, 8).
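Putting the three terms together, the memory address of a load is the base register's contents plus the offset. The sketch below models this with plain dictionaries, using the example values from the entries above (base address 5000 in X22, offset 8); the function name and the memory contents are hypothetical.

```python
def load_from(registers, memory, base_register, offset):
    """Model a load: address = contents of base register + offset."""
    address = registers[base_register] + offset
    return address, memory[address]


registers = {"X22": 5000}   # X22 holds the array's base address
memory = {5008: 42}         # hypothetical value stored at address 5008

print(load_from(registers, memory, "X22", 8))  # (5008, 42)
```

With 8-byte doublewords, an offset of 8 selects the second array element.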
Big Endian - A CPU or memory architecture in which the most significant byte is stored at the lowest memory address and the least significant byte is stored at the highest memory address.
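Byte order is easy to see by converting an integer to bytes. This sketch uses Python's built-in int.to_bytes; the value 0x0A0B0C0D is just an illustrative choice, and little-endian order is shown only for contrast.

```python
value = 0x0A0B0C0D

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

# Index 0 stands for the lowest memory address.
print([hex(b) for b in big])     # ['0xa', '0xb', '0xc', '0xd']
print([hex(b) for b in little])  # ['0xd', '0xc', '0xb', '0xa']
```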
store register - instruction complementary to load. It copies data from register to memory. Format is similar to load; name of the operation, followed by the register to be stored,
then the base register, & finally the offset to select the array element.
LEGv8 STUR - Store register instruction.
spilling registers - The process of putting less frequently used variables (or those needed later) into memory.
reservation station - A buffer within a functional unit that holds the operands and the operation.
commit unit - unit in a dynamic or out-of-order execution pipeline that decides when it is safe to release the result of an operation to programmer-visible registers and memory.
Reorder Buffer - Buffer that holds results in a dynamically scheduled processor until it is safe to store the results to memory or a register.
out-of-order execution - A situation in pipelined execution when an instruction blocked from executing does not cause the following instructions to wait.
in-order commit - A commit in which the results of pipelined execution are written to the programmer-visible state in the same order that instructions are fetched.
VLIW - Style of instruction set architecture that launches many operations that are defined to be independent in a single wide instruction, typically with many separate opcode fields.
ARMv8 virtual memory - Virtual addresses are 64 bits wide, but the upper 16 bits are not used, so only 48 bits are used. A process's address space is selected by loading the page table register to refer to the page table of that process. ARMv8 allows implementations with a smaller virtual address. It also allows physical addresses as large as 48 bits. It supports three options for a minimum page or granule size: 4, 16, and 64 KiB.
address translation (address mapping) - The process by which a virtual address is mapped to an address used to access memory.
Exception Syndrome Register (ESR) - Records the cause of the exception.
FADDS, FSUBS - Single-precision arithmetic.
FADDD, FSUBD, FMULD, FDIVD - Double-precision arithmetic.
FCMPS, FCMPD - Single- and double-precision comparison.
motivations for virtual memory - 1. To allow efficient and safe sharing of memory among several programs and to remove the programming burdens of a small, limited amount of main memory (still in use today). 2. To allow a single user program to exceed the size of primary memory.
virtual memory - A technique that uses main memory as a "cache" for secondary storage. The address is broken into a virtual page number and a page offset.
physical address - An address in main memory.
Protection - A set of mechanisms for ensuring that multiple processes sharing the processor, memory or I/O devices cannot interfere, intentionally or unintentionally, with one another by reading or writing each other's data. These mechanisms also isolate the operating system from a user process.
Page - A virtual memory block. All virtual memory systems relocate the program as a set of fixed-size blocks.
page fault - A virtual memory miss.
virtual address - An address that corresponds to a location in virtual space & is translated by address mapping to a physical address when memory is accessed.
page table - The table containing the virtual to physical address translations in a virtual memory system. The table, which is stored in memory, is typically indexed by the virtual page number (the page number from the virtual address); each entry in the table contains the physical page number for that virtual page if the page is currently in memory.
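Address translation with a page table can be sketched in a few lines: split the virtual address into a virtual page number and a page offset, look up the physical page number, and reattach the offset. The dictionary page table, the 4 KiB page size, and the names below are illustrative assumptions.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class PageFault(Exception):
    """Raised when the virtual page is not currently in main memory."""


def translate(virtual_address, page_table, page_size=PAGE_SIZE):
    """Map a virtual address to a physical address using a page table."""
    virtual_page, offset = divmod(virtual_address, page_size)
    physical_page = page_table.get(virtual_page)  # indexed by virtual page number
    if physical_page is None:
        raise PageFault(f"virtual page {virtual_page} is not in memory")
    return physical_page * page_size + offset     # the offset is unchanged


# Hypothetical table: virtual page -> physical page.
page_table = {0: 5, 1: 2}
print(hex(translate(0x1234, page_table)))  # 0x2234
```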
swap space - The space on the disk reserved for the full virtual memory space of a process.
Reference Bit (Use Bit or Access Bit) - A field that is set whenever a page is accessed and that is used to implement LRU or other replacement schemes. ARMv8 calls it an access bit.
Techniques for reducing the total storage required for the page table -
1. Keep a limit register that restricts the size of the page table for a given process & add more entries as needed.
2. Use segments, each with a limit register that specifies the current size of the segment, which grows in units of pages. This type of segmentation is used by many architectures, including ARMv8 and MIPS. Unlike other forms of segmentation, it is invisible to the application program, although not to the operating system. It does not work well when the address space is used sparsely rather than contiguously.
3. Apply a hashing function to the virtual address so that the table needs to be only the size of the number of physical pages in main memory. AKA inverted page table. The lookup process can be more complex because the table is not indexed directly.
4. Allow the page tables to be paged. This works by allowing the page tables to reside in the virtual address space.
5. Use multiple levels of page tables. This is the solution that ARMv8 uses to reduce the memory footprint of address translation. This scheme allows the address space to be used in a sparse fashion (multiple noncontiguous segments can be active) without having to allocate the entire page table. Useful with very large address spaces & in software systems that require noncontiguous allocation. The primary disadvantage of this multi-level mapping is the more complex process for address translation.
Least Recently Used (LRU) - A replacement scheme in which the block replaced is the one that has been unused for the longest time.
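LRU replacement can be modeled with an ordered dictionary that keeps blocks in order of last use. This is a software sketch only (real hardware usually approximates LRU); the class name, capacity, and access trace are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Holds up to `capacity` blocks and evicts the least recently used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # oldest use first, newest use last

    def access(self, block):
        """Return True on a hit, False on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # now the most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[block] = True
        return False


cache = LRUCache(capacity=2)
results = [cache.access(b) for b in ["A", "B", "A", "C", "B", "A"]]
print(results)             # [False, False, True, False, False, False]
print(list(cache.blocks))  # ['B', 'A']
```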
dirty bit - Indicates if a page has been written since being read into memory.
Translation Lookaside Buffer (TLB) - A cache that keeps track of recently used address mappings to try to avoid an access to the page table. Can be used to improve access performance by relying on locality of reference. Typical values:
TLB size: 16-512 entries
Block size: 1-2 page table entries (typically 4-8 bytes each)
Hit time: 0.5-1 clock cycle
Miss penalty: 10-100 clock cycles
Miss rate: 0.01%-1%
Exception Enable (Interrupt Enable) - A signal or action that controls whether the process responds to an exception or not; necessary for preventing the occurrence of exceptions during intervals before the processor has safely saved the state needed for restart.
The Intrinsity FastMATH TLB - The memory system uses 4 KiB pages and just a 32-bit address space; thus, the virtual page number is 20 bits long. The physical address is the same size as the virtual address. The TLB contains 16 entries, it is fully associative, and it is shared between the instruction and data references. Each entry is 64 bits wide and contains a 20-bit tag (which is the virtual page number for that TLB entry), the corresponding physical page number (also 20 bits), a valid bit, a dirty bit, and other bookkeeping bits. Like most MIPS systems, it uses software to handle TLB misses.
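The 20-bit virtual page number follows from the numbers in this entry: 4 KiB pages need 12 offset bits, leaving 32 - 12 = 20 bits. The sketch below derives that and models a fully associative lookup as a search over every entry. The entry layout (a dict with valid, tag, and ppn keys) and the sample mapping are hypothetical simplifications.

```python
ADDRESS_BITS = 32
PAGE_SIZE = 4 * 1024                       # 4 KiB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12
VPN_BITS = ADDRESS_BITS - OFFSET_BITS      # 20


def tlb_lookup(tlb, virtual_address):
    """Return the physical address on a TLB hit, or None on a TLB miss."""
    vpn = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    for entry in tlb:  # fully associative: the tag may match any entry
        if entry["valid"] and entry["tag"] == vpn:
            return (entry["ppn"] << OFFSET_BITS) | offset
    return None


# Hypothetical TLB with one valid mapping.
tlb = [{"valid": True, "tag": 0x12345, "ppn": 0x00ABC}]

print(OFFSET_BITS, VPN_BITS)              # 12 20
print(hex(tlb_lookup(tlb, 0x12345678)))   # 0xabc678
print(tlb_lookup(tlb, 0x54321000))        # None
```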
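TLB miss - Indicates that the translation for a page is not in the TLB. The missing translation is then looked up in the page table and loaded into the TLB; if the page itself is not in memory, the miss is a page fault.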