CMSI 3510: Operating Systems - Cheat Sheet

CPU Architecture

Key Components

  • Control Unit (CU): Directs operations of the processor, fetches instructions from memory, decodes them, and coordinates execution
  • Arithmetic Logic Unit (ALU): Performs arithmetic and logical operations
  • Registers: Small, fast storage locations within the CPU
    • Program Counter (PC): Holds address of next instruction
    • Instruction Register (IR): Holds current instruction being executed
    • Stack Pointer (SP): Points to top of the stack
    • General Purpose Registers: Store operands and intermediate results
  • Cache Memory: Fast memory between CPU and main memory
    • L1 Cache: Smallest, fastest (~32-64 KB per core, ~1-2ns access time)
    • L2 Cache: Medium size (~256 KB - 1 MB per core, ~4-10ns)
    • L3 Cache: Largest, shared across cores (~8-32 MB, ~10-20ns)

Instruction Cycle (Fetch-Decode-Execute)

1. FETCH:    PC β†’ MAR β†’ Memory β†’ MDR β†’ IR
             PC = PC + 1
2. DECODE:   IR β†’ Control Unit (decode instruction)
3. EXECUTE:  Perform operation (ALU, memory access, etc.)
4. STORE:    Write results back to registers/memory

MAR: Memory Address Register | MDR: Memory Data Register
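
The cycle above can be sketched for a toy one-register machine. The opcodes, encoding, and dict-backed memory here are invented for illustration; real ISAs are far more involved.

```python
def run(program, memory):
    """Run (opcode, operand) pairs until HALT; returns memory."""
    pc = 0                         # Program Counter
    acc = 0                        # single general-purpose register
    while True:
        ir = program[pc]           # FETCH: instruction into the IR
        pc += 1                    # PC now points at the next instruction
        op, arg = ir               # DECODE: split opcode and operand
        if op == "LOAD":           # EXECUTE / STORE
            acc = memory[arg]
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory

mem = {0: 5, 1: 7, 2: 0}
run([("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)], mem)
print(mem[2])  # 12
```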

CPU Modes

  • User Mode: Limited privileges, cannot execute privileged instructions, used for user applications
  • Kernel Mode (Supervisor Mode): Full privileges, can execute all instructions, access all memory, control hardware
  • Mode Switching: Triggered by system calls, interrupts, or exceptions

Interrupts

  • Hardware Interrupts: Generated by hardware devices (I/O completion, timer, etc.)
  • Software Interrupts (Traps): Generated by software (system calls, exceptions)
  • Interrupt Handling:
    1. Save current state (PC, registers)
    2. Load interrupt handler address from Interrupt Vector Table (IVT)
    3. Execute interrupt handler in kernel mode
    4. Restore saved state and return to user mode

Pipelining

CPU technique to increase instruction throughput by overlapping execution stages

Classic 5-Stage Pipeline:
1. IF  - Instruction Fetch
2. ID  - Instruction Decode
3. EX  - Execute
4. MEM - Memory Access
5. WB  - Write Back

Time  | IF | ID | EX | MEM | WB |
------|----|----|----|-----|----|
T1    | I1 |    |    |     |    |
T2    | I2 | I1 |    |     |    |
T3    | I3 | I2 | I1 |     |    |
T4    | I4 | I3 | I2 | I1  |    |
T5    | I5 | I4 | I3 | I2  | I1 |

Hazards: Data hazards, control hazards (branches), structural hazards
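
On an ideal k-stage pipeline with no stalls, n instructions finish in k + n - 1 cycles (in the table above, I5's WB would land at T9). A one-line sketch:

```python
def pipeline_cycles(n, k=5):
    """Cycles for n instructions on an ideal k-stage pipeline (no stalls):
    the first instruction takes k cycles; each later one completes one
    cycle after its predecessor."""
    return k + n - 1

print(pipeline_cycles(5))   # 9 (vs. 5 * 5 = 25 without pipelining)
print(pipeline_cycles(1))   # 5 (a single instruction still takes all stages)
```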

Multi-Core & Multiprocessing

  • Symmetric Multiprocessing (SMP): Multiple identical processors share memory and I/O
  • Multi-Core: Multiple CPU cores on a single chip
  • Hyper-Threading: Single physical core appears as multiple logical cores
  • Cache Coherence: Ensures consistency when multiple cores cache same memory location (MESI protocol: Modified, Exclusive, Shared, Invalid)

Memory (RAM)

Memory Hierarchy

Fastest/Smallest (top) β†’ Slowest/Largest (bottom)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Registers  β”‚ <1 ns, bytes to KB
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  L1 Cache   β”‚ 1-2 ns, ~32-64 KB
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  L2 Cache   β”‚ 4-10 ns, ~256 KB - 1 MB
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  L3 Cache   β”‚ 10-20 ns, ~8-32 MB
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  Main RAM   β”‚ 50-100 ns, GBs
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  SSD        β”‚ 25-100 ΞΌs, 100s GB - TBs
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  HDD        β”‚ 5-10 ms, TBs
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Memory Types

  • DRAM (Dynamic RAM): Main memory, requires periodic refresh, slower but cheaper
  • SRAM (Static RAM): Used for cache, faster, doesn't need refresh, more expensive
  • DDR SDRAM: Double Data Rate, transfers on both clock edges (DDR3, DDR4, DDR5)

Virtual Memory

  • Purpose: Abstraction that gives each process the illusion of a large, private, contiguous address space
  • Benefits: Process isolation, efficient memory use, allows programs larger than physical RAM
  • Translation: Virtual addresses β†’ Physical addresses via page tables
  • Translation Lookaside Buffer (TLB): Cache for page table entries, speeds up address translation

Paging

Memory management scheme that divides memory into fixed-size blocks

  • Page: Fixed-size block in virtual memory (typically 4 KB)
  • Frame: Fixed-size block in physical memory (same size as page)
  • Page Table: Maps virtual pages to physical frames
  • Page Fault: Occurs when accessing page not in physical memory
    1. Trap to OS kernel
    2. Save process state
    3. Find page on disk
    4. Load page into free frame
    5. Update page table
    6. Resume process

Virtual Address Format:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Page Number  β”‚    Offset    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Page Table Entry (PTE):
β”Œβ”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚Validβ”‚Dirtyβ”‚R/Wβ”‚U/Kβ”‚ Frame Number β”‚
β””β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Page Replacement Algorithms

  • FIFO (First-In-First-Out): Replace oldest page, can suffer from Belady's anomaly
  • LRU (Least Recently Used): Replace page not used for longest time, good performance but expensive
  • LFU (Least Frequently Used): Replace page with lowest reference count
  • Clock (Second Chance): Approximates LRU, uses reference bit, circular list
  • Optimal: Replace page that won't be used for longest time (theoretical, requires future knowledge)
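
A small simulation (illustrative, not any OS's actual implementation) shows FIFO's fault counts and Belady's anomaly on the classic reference string, where adding a frame makes FIFO perform worse:

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults under FIFO replacement."""
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:       # memory full: evict oldest
                frames.discard(order.popleft())
            frames.add(page)
            order.append(page)
    return faults

# Classic reference string for Belady's anomaly: more frames, more faults.
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9
print(fifo_faults(refs, 4))   # 10
```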

Segmentation

  • Segments: Variable-size logical units (code, data, stack, heap)
  • Segment Table: Maps segment numbers to base address and limit
  • Segmentation + Paging: Combine benefits of both (classic IA-32 x86; x86-64 keeps paging but largely drops segmentation in favor of a flat address space)
  • Fragmentation: External fragmentation possible with pure segmentation

Memory Allocation

  • Contiguous Allocation:
    • First Fit: Allocate first hole that's big enough
    • Best Fit: Allocate smallest hole that fits
    • Worst Fit: Allocate largest hole
  • Non-Contiguous Allocation: Paging and segmentation allow non-contiguous allocation
  • Fragmentation:
    • External: Free memory scattered, can't satisfy requests
    • Internal: Allocated memory larger than needed, waste within allocation unit
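
A sketch of the three contiguous-allocation policies over a list of free hole sizes (bookkeeping is simplified; a real allocator also splits and coalesces holes):

```python
def allocate(holes, request, strategy):
    """Return the index of the hole chosen for `request` units, or None.
    `holes` is a list of free-hole sizes in address order."""
    fits = [(size, i) for i, size in enumerate(holes) if size >= request]
    if not fits:
        return None
    if strategy == "first":
        return min(fits, key=lambda f: f[1])[1]   # lowest address
    if strategy == "best":
        return min(fits)[1]                       # smallest hole that fits
    if strategy == "worst":
        return max(fits)[1]                       # largest hole

holes = [100, 500, 200, 300, 600]
print(allocate(holes, 212, "first"))  # 1 (500 is the first hole >= 212)
print(allocate(holes, 212, "best"))   # 3 (300 is the smallest that fits)
print(allocate(holes, 212, "worst"))  # 4 (600 is the largest)
```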

Storage Systems

Hard Disk Drives (HDD)

  • Components:
    • Platters: Rotating magnetic disks
    • Spindle: Rotates platters (5400-15000 RPM)
    • Read/Write Heads: Move across platter surface on actuator arm
    • Tracks: Concentric circles on platter
    • Sectors: Arc segments of tracks (typically 512B or 4KB)
    • Cylinders: Set of tracks at same position on all platters
  • Access Time = Seek Time + Rotational Latency + Transfer Time
    • Seek Time: Time to move head to correct track (3-12 ms)
    • Rotational Latency: Time for sector to rotate under head (avg = 0.5 Γ— rotation time)
    • Transfer Time: Time to read/write data
  • Typical Performance: 80-160 MB/s throughput, 5-10 ms latency
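
Plugging the access-time formula in (drive parameters here are illustrative):

```python
def avg_access_time_ms(seek_ms, rpm, transfer_mb_s, request_kb):
    """Access time = seek + rotational latency + transfer.
    Average rotational latency is half of one full revolution."""
    rotation_ms = 60_000 / rpm            # one revolution, in ms
    latency_ms = rotation_ms / 2
    transfer_ms = request_kb / 1024 / transfer_mb_s * 1000
    return seek_ms + latency_ms + transfer_ms

# 7200 RPM: one rotation = 8.33 ms, so average latency is about 4.17 ms
t = avg_access_time_ms(seek_ms=9, rpm=7200, transfer_mb_s=100, request_kb=4)
print(round(t, 2))   # 13.21 -- seek and latency dominate for small requests
```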

Disk Scheduling Algorithms

  • FCFS (First-Come-First-Served): Process requests in order, no optimization
  • SSTF (Shortest Seek Time First): Service request closest to current head position, can starve distant requests
  • SCAN (Elevator): Head moves in one direction, services requests, then reverses
  • C-SCAN (Circular SCAN): Like SCAN but only services in one direction, returns to start without servicing
  • LOOK/C-LOOK: Like SCAN/C-SCAN but only goes as far as last request in each direction

Example: Current head at 53, queue: [98, 183, 37, 122, 14, 124, 65, 67]

FCFS: 53β†’98β†’183β†’37β†’122β†’14β†’124β†’65β†’67   (total: 640)
SSTF: 53β†’65β†’67β†’37β†’14β†’98β†’122β†’124β†’183   (total: 236)
SCAN: 53β†’37β†’14β†’0β†’65β†’67β†’98β†’122β†’124β†’183 (total: 236)
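
Totals like these can be checked by summing successive head movements (head starting at cylinder 53, as in the classic textbook version of this queue):

```python
def total_movement(start, order):
    """Total cylinders traversed when servicing requests in `order`."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(total_movement(53, queue))                                 # FCFS: 640
print(total_movement(53, [65, 67, 37, 14, 98, 122, 124, 183]))   # SSTF: 236
```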

Solid State Drives (SSD)

  • Technology: Flash memory (NAND), no moving parts
  • Structure:
    • Pages: Smallest read/write unit (4-16 KB)
    • Blocks: Smallest erase unit (128-256 pages, ~512 KB - 4 MB)
    • Planes/Dies: Organized for parallel access
  • Performance: ~200-550 MB/s (SATA) or ~3000-7000 MB/s (NVMe), 25-100 ΞΌs latency
  • Characteristics:
    • Must erase before write (block-level erase)
    • Limited write endurance (~1,000-100,000 program/erase cycles per cell)
    • Write amplification issue
    • Garbage collection needed
  • Flash Translation Layer (FTL): Maps logical blocks to physical pages, handles wear leveling, garbage collection

RAID (Redundant Array of Independent Disks)

  • RAID 0 (Striping): Data split across disks, no redundancy, high performance
  • RAID 1 (Mirroring): Data duplicated on multiple disks, 100% overhead, high read performance
  • RAID 5 (Striping + Parity): Data + parity striped across β‰₯3 disks, can survive 1 disk failure, (n-1)/n capacity
  • RAID 6 (Double Parity): Like RAID 5 but 2 parity blocks, survives 2 disk failures, (n-2)/n capacity
  • RAID 10 (1+0): Mirrored pairs striped, combines benefits of RAID 0 and 1, 50% capacity
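
The capacity fractions above fit in one lookup (RAID 1 is treated here as an n-way mirror; the common 2-disk mirror gives 1/2):

```python
def usable_fraction(level, n):
    """Fraction of raw capacity left for data with n disks."""
    return {0: 1.0,                # striping, no redundancy
            1: 1 / n,              # n-way mirror (2 disks -> 1/2)
            5: (n - 1) / n,        # one disk's worth of parity
            6: (n - 2) / n,        # two disks' worth of parity
            10: 0.5}[level]        # mirrored pairs, then striped

for level in (0, 1, 5, 6, 10):
    print(level, usable_fraction(level, 4))
# e.g. four 4 TB disks under RAID 5 give 4 * 4 TB * 0.75 = 12 TB usable
```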

File System Basics

  • File: Named collection of related information stored on secondary storage
  • Directory: Organization structure containing files and subdirectories
  • Metadata: File name, size, timestamps, permissions, owner, location
  • Inode (Unix/Linux): Data structure storing file metadata and pointers to data blocks
  • File Allocation Methods:
    • Contiguous: Consecutive blocks, fast but fragmentation
    • Linked: Each block points to next, no fragmentation, slow random access
    • Indexed: Index block contains pointers to all data blocks
    • Multi-level Indexed: Combines direct, indirect, double indirect pointers (ext2/3/4)

Common File Systems

  • ext4: Linux default, journaling, extent-based allocation, up to 1 EB volume
  • NTFS: Windows, journaling, ACLs, compression, encryption
  • APFS: macOS/iOS, optimized for SSDs, snapshots, encryption
  • FAT32: Simple, widely compatible, 4 GB file limit
  • Btrfs/ZFS: Advanced features: snapshots, checksums, RAID, compression

I/O Systems

I/O Hardware Components

  • I/O Devices: Keyboard, mouse, display, disk, network card, etc.
  • Device Controller: Hardware interface between device and computer
    • Contains registers (status, command, data)
    • May have local buffer
    • Communicates via I/O ports or memory-mapped I/O
  • Device Driver: OS software module that interfaces with device controller
  • I/O Buses:
    • System Bus: CPU to memory
    • PCIe (Peripheral Component Interconnect Express): High-speed expansion bus
    • USB (Universal Serial Bus): External peripheral connection
    • SATA/NVMe: Storage connections

I/O Methods

  • Programmed I/O (Polling):
    • CPU continuously checks device status
    • Simple but CPU-intensive, wasteful
    • Busy waiting
  • Interrupt-Driven I/O:
    • Device signals CPU when ready via interrupt
    • CPU can do other work while waiting
    • Still involves CPU for data transfer
    • Better CPU utilization than polling
  • Direct Memory Access (DMA):
    • Dedicated DMA controller handles data transfer
    • CPU only involved at start and completion
    • Best for large transfers
    • Single interrupt after entire transfer complete

DMA Transfer Process:
1. CPU programs DMA controller (source, dest, size)
2. DMA controller takes control of bus
3. DMA transfers data directly between device & memory
4. DMA sends interrupt when complete
5. CPU handles interrupt, transfer finished

I/O Software Layers

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  User-Level I/O Software    β”‚ (printf, scanf, etc.)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  Device-Independent OS      β”‚ (buffering, naming, error)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  Device Drivers             β”‚ (device-specific code)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  Interrupt Handlers         β”‚ (process interrupts)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€─
β”‚  Hardware                   β”‚ (controllers, devices)
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  • User Level: Libraries and system calls (read, write, open, close)
  • Device-Independent: Uniform naming, buffering, error handling, allocation
  • Device Driver: Device-specific commands, initialization, status checks
  • Interrupt Handler: Save context, handle interrupt, restore context

Buffering

  • Purpose: Cope with speed mismatch, allow asynchronous operation, support copy semantics
  • Single Buffering: OS buffer, process waits for I/O completion
  • Double Buffering: Two buffers alternate (fill one while processing other)
  • Circular Buffering: Ring of buffers for producer-consumer scenarios
  • Buffer Cache: Keep frequently accessed blocks in memory (page cache in Linux)
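
A minimal circular-buffer sketch of the producer-consumer scheme above (single-threaded for clarity; a real kernel buffer needs locking or lock-free index updates):

```python
class RingBuffer:
    """Fixed-size circular buffer: producer writes at `tail`,
    consumer reads at `head`, both wrapping around."""
    def __init__(self, size):
        self.data = [None] * size
        self.head = self.tail = self.count = 0

    def put(self, item):
        if self.count == len(self.data):
            raise BufferError("buffer full")       # producer must wait
        self.data[self.tail] = item
        self.tail = (self.tail + 1) % len(self.data)
        self.count += 1

    def get(self):
        if self.count == 0:
            raise BufferError("buffer empty")      # consumer must wait
        item = self.data[self.head]
        self.head = (self.head + 1) % len(self.data)
        self.count -= 1
        return item

rb = RingBuffer(2)
rb.put("a"); rb.put("b")
print(rb.get(), rb.get())  # a b
```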

I/O Performance

  • Reduce number of context switches
  • Reduce data copying: Zero-copy techniques, DMA
  • Reduce interrupt frequency: Interrupt coalescing, polling under high load
  • Use DMA for large transfers
  • Balance CPU and I/O operations
  • Asynchronous I/O: Allow overlapping computation and I/O
  • I/O Scheduling: Reorder requests for efficiency (disk scheduling algorithms)

Memory-Mapped I/O vs Port-Mapped I/O

  • Memory-Mapped I/O (MMIO):
    • Device registers mapped into memory address space
    • Use regular load/store instructions
    • Single address space for memory and I/O
    • Common in modern systems
  • Port-Mapped I/O (PMIO):
    • Separate address space for I/O devices
    • Special I/O instructions (IN/OUT on x86)
    • Better isolation between memory and I/O
    • Legacy approach on x86

Block vs Character Devices

  • Block Devices:
    • Store information in fixed-size blocks (typically 512B - 4KB)
    • Random access possible
    • Can be buffered/cached
    • Examples: Hard disks, SSDs, USB drives
  • Character Devices:
    • Stream of characters, no block structure
    • Sequential access, not seekable
    • Examples: Keyboards, mice, serial ports, terminals
  • Network Devices: Often handled separately with socket interface

Quick Reference Summary

  • CPU: Control Unit + ALU + Registers + Cache | Modes: User/Kernel | Interrupts & System Calls
  • Memory: Hierarchy: Registers β†’ Cache (L1/L2/L3) β†’ RAM β†’ Disk | Virtual Memory + Paging | TLB
  • Storage: HDD (mechanical, 5-10ms) vs SSD (flash, 25-100ΞΌs) | RAID for redundancy | File Systems
  • I/O: Polling β†’ Interrupts β†’ DMA | Device Drivers | Buffering | Block vs Character Devices