Basic Computer Architecture
- Length: 682 pages
- Edition: 1
- Language: English
- Publisher: White Falcon Publishing
- Publication Date: 2021-09-01
- ISBN-10: 1636403034
- ISBN-13: 9781636403038
This book is a comprehensive text on basic, undergraduate-level computer architecture. It starts with theoretical preliminaries and simple Boolean algebra. After a quick discussion of logic gates, it describes three assembly languages: a custom RISC ISA called SimpleRisc, ARM, and x86. In the next part, a processor for the SimpleRisc ISA is designed from scratch, covering the combinational units, ALUs, the processor itself, a basic 5-stage pipeline, and a microcode-based design. The last part of the book discusses caches, virtual memory, parallel programming, multiprocessors, storage devices, and modern I/O systems. The book’s website links to slides for each chapter and to video lectures hosted on YouTube.
- Introduction to Computer Architecture: What is a Computer?; Structure of a Typical Desktop Computer; Computers are Dumb Machines; The Language of Instructions; Instruction Set Design; Complete – The ISA should be able to Implement all User Programs; Concise – Limited Size of the Instruction Set; Generic – Instructions should Capture the Common Case; Simple – Instructions should be Simple; How to Ensure that an ISA is Complete?; Towards a Universal ISA*; Turing Machine*; Universal Turing Machine*; A Modified Universal Turing Machine*; Single Instruction ISA*; Multiple Instruction ISA*; Summary of Theoretical Results; Design of Practical Machines; Harvard Architecture; Von Neumann Architecture; Towards a Modern Machine with Registers and Stacks; The Road Ahead; Representing Information; Processing Information; Processing More Information; Summary and Further Reading; Summary; Further Reading

I Architecture: Software Interface
- The Language of Bits: Logical Operations; Basic Operators; Derived Operators; Boolean Algebra; De Morgan's Laws; Logic Gates; Implementing Boolean Functions; The Road Ahead; Positive Integers; Ancient Number Systems; Binary Number System; Adding Binary Numbers; Sizes of Integers; Negative Integers; Sign-Magnitude based Representation; The 1's Complement Approach; Bias-based Approach; The 2's Complement Method; Floating Point Numbers; Fixed Point Numbers; Generic Form of Floating Point Numbers; IEEE 754 Format for Representing Floating Point Numbers; Denormal Numbers; Double Precision Numbers; Floating Point Mathematics; Strings; ASCII Format; UTF-8; UTF-16 and UTF-32; Summary and Further Reading; Summary; Further Reading
- Assembly Language: Why Assembly Language; Software Developer's Perspective; Hardware Designer's Perspective; The Basics of Assembly Language; Machine Model; View of Memory; Assembly Language Syntax; Types of Instructions; Types of Operands; SimpleRisc; Different Instruction Sets; Model of the SimpleRisc Machine; Register Transfer Instruction – mov; Arithmetic Instructions; Logical Instructions; Shift Instructions – lsl, lsr, asr; Data Transfer Instructions: ld and st; Unconditional Branch Instructions; Conditional Branch Instructions; Functions; Function Call/Return Instructions; The nop Instruction; Modifiers; Encoding the SimpleRisc Instruction Set; Summary and Further Reading; Summary; Further Reading
- ARM® Assembly Language: The ARM® Machine Model; Basic Assembly Instructions; Simple Data Processing Instructions; Advanced Data-Processing Instructions; Compare Instructions; Instructions that Set CPSR Flags – The 'S' Suffix; Data Processing Instructions that use CPSR Flags; Simple Branch Instructions; Branch and Link Instruction; Conditional Instructions; Load-Store Instructions; Advanced Features; Arrays; Functions; Encoding the Instruction Set; Data Processing Instructions; Load-Store Instructions; Branch Instructions; Summary and Further Reading; Summary; Further Reading
- x86 Assembly Language: Overview of the x86 Family of Assembly Languages; Brief History; Main Features of the x86 ISA; x86 Machine Model; Integer Registers; Floating Point Registers; View of Memory; Addressing Modes; x86 Assembly Language; Integer Instructions; Data Transfer Instructions; ALU Instructions; Branch/Function Call Instructions; Advanced Memory Instructions; Floating Point Instructions; Data Transfer Instructions; Arithmetic Instructions; Instructions for Special Functions; Compare Instruction; Stack Cleanup Instructions; Encoding the x86 ISA; High Level View of x86 Instruction Encoding; Summary and Further Reading; Summary; Further Reading

II Organisation: Processor Design
- Logic Gates, Registers, and Memories: Silicon based Transistors; Doping; P-N Junction; NMOS Transistor; PMOS Transistor; A Basic CMOS based Inverter; NAND and NOR Gates; Combinational Logic; XOR Gate; Decoder; Multiplexer; Demultiplexer; Encoder; Priority Encoder; Sequential Logic; SR Latch; The Clock; Clocked SR Latch; Edge Sensitive SR Flip-flop; JK Flip-flop; D Flip-flop; Master-slave D Flip-flop; Metastability; Registers; Memories; Static RAM (SRAM); Content Addressable Memory (CAM); Dynamic RAM (DRAM); Read Only Memory (ROM); Programmable Logic Arrays; Summary and Further Reading; Summary; Further Reading
- Computer Arithmetic: Addition; Addition of Two 1-bit Numbers; Addition of Three 1-bit Numbers; Ripple Carry Adder; Carry Select Adder; Carry Lookahead Adder; Multiplication; Overview; Iterative Multiplier; Booth Multiplier; An O(log(n)^2) Time Algorithm; Wallace Tree Multiplier; Division; Overview; Restoring Division; Non-Restoring Division; Floating Point Addition and Subtraction; Simple Addition with Same Signs; Rounding; Implementing Rounding; Addition of Numbers with Opposite Signs; Generic Algorithm for Adding Floating Point Numbers; Multiplication of Floating Point Numbers; Division of Floating Point Numbers; Simple Division; Goldschmidt Division; Division Using the Newton-Raphson Method; Summary and Further Reading; Summary; Further Reading
- Processor Design: Design of a Basic Processor; Overview; Units in a Processor; Instruction Fetch – Fetch Unit; Data Path and Control Path; Operand Fetch Unit; Execute Unit; Memory Access Unit; Register Writeback Unit; The Data Path; The Control Unit; Microprogram-Based Processor; Microprogrammed Data Path; Fetch Unit; Decode Unit; Register File; ALU; Memory Unit; Overview of the Data Path; Microassembly Language; Machine Model; Microinstructions; Implementing Instructions in the Microassembly Language; 3-Address Format ALU Instructions; 2-Address Format ALU Instructions; The nop Instruction; ld and st instructions; Branch Instructions; Shared Bus and Control Signals; Control Signals; Functional Unit Arguments; The Microcontrol Unit; Vertical Microprogramming; Horizontal Microprogramming; Tradeoffs between Horizontal and Vertical Microprogramming; Summary and Further Reading; Summary; Further Reading
- Principles of Pipelining: A Pipelined Processor; The Notion of Pipelining; Overview of Pipelining; Performance Benefits; Design of a Simple Pipeline; Splitting the Data Path; Timing; The Instruction Packet; Pipeline Stages; IF Stage; OF Stage; EX Stage; MA Stage; RW Stage; Putting it All Together; Pipeline Hazards; The Pipeline Diagram; Data Hazards; Control Hazards; Structural Hazards; Solutions in Software; RAW Hazards; Control Hazards; Pipeline with Interlocks; A Conceptual Look at a Pipeline with Interlocks; Ensuring the Data-Lock Condition; Ensuring the Branch-Lock condition; Pipeline with Forwarding; Basic Concepts; Forwarding Paths in a Pipeline; Data Hazards with Forwarding; Implementation of a Pipeline with Forwarding; Forwarding Conditions; Support for Interrupts/Exceptions*; Interrupts; Exceptions; Precise Exceptions; Saving and Restoring Program State; SimpleRisc Assembly Code of an Interrupt Handler; Processor with Support for Exceptions; Performance Metrics; The Performance Equation; Performance of an Ideal Pipelined Processor; Performance of a Non-Ideal Pipeline; Performance of a Suite of Programs; Inter-Relationship between Performance, the Compiler, Architecture, and Technology; Power and Temperature Issues; Overview; Dynamic Power; Leakage Power; Modeling Temperature*; The ED^2 Metric; Advanced Techniques*; Branch Prediction; Multiple Issue In-Order Pipeline; EPIC and VLIW Processors; Out-of-Order Pipelines; Summary and Further Reading; Summary; Further Reading

III Organisation: System Design
- The Memory System: Overview; Need for a Fast Memory System; Memory Access Patterns; Temporal and Spatial Locality of Instruction Accesses; Characterising Temporal Locality; Characterising Spatial Locality; Utilising Spatial and Temporal Locality; Exploiting Temporal Locality – Hierarchical Memory System; Exploiting Spatial Locality – Cache Blocks; Caches; Overview of a Basic Cache; Cache Lookup and Cache Design; Data read and data write Operations; The insert Operation; The replace Operation; The evict Operation; Putting all the Pieces Together; The Memory System; Mathematical Model of the Memory System; Cache Misses; Reduction of Hit Time and Miss Penalty; Summary of Memory System Optimisation Techniques; Virtual Memory; Process – A Running Instance of a Program; The "Overlap" and "Size" Problems; Implementation of Virtual Memory with Paging; Swap Space; Memory Management Unit (MMU); Advanced Features of the Paging System; Summary and Further Reading; Summary; Further Reading
- Multiprocessor Systems: Background; Moore's Law; Implications of the Moore's Law; Software for Multiprocessor Systems; Strong and Loosely Coupled Multiprocessing; Shared Memory vs Message Passing; Amdahl's Law; Design Space of Multiprocessors; MIMD Multiprocessors; Logical Point of View; Coherence; Memory Consistency; Physical View of Memory; Shared Caches; Coherent Private Caches; Implementing a Memory Consistency Model*; Multithreaded Processors; SIMD Multiprocessors; SIMD – Vector Processors; Software Interface; A Practical Example using SSE Instructions; Predicated Instructions; Design of a Vector Processor; Interconnection Networks; Overview; Bisection Bandwidth and Network Diameter; Network Topologies; Summary and Further Reading; Summary; Further Reading
- I/O and Storage Devices: I/O System – Overview; Overview; Requirements of the I/O System; Design of the I/O System; Layers in the I/O System; Physical Layer – Transmission Sublayer; Single Ended Signalling; Low Voltage Differential Signalling (LVDS); Transmission of Multiple Bits; Return to Zero (RZ) Protocols; Manchester Encoding; Non Return to Zero (NRZ) Protocol; Non Return to Zero (NRZI) Inverted Protocol; Physical Layer – Synchronisation Sublayer; Synchronous Buses; Source Synchronous Bus*; Asynchronous Buses; Data Link Layer; Framing and Buffering; Error Detection and Correction; Arbitration; Transaction-Oriented Buses; Split Transaction Buses; Network Layer; I/O Port Addressing; Memory Mapped Addressing; Protocol Layer; Polling; Interrupts; DMA; Case Studies – I/O Protocols; PCI Express®; SATA; SCSI and SAS; USB; FireWire Protocol; Storage; Hard Disks; RAID Arrays; Optical Disks – CD, DVD, Blu-ray; Flash Memory; Summary and Further Reading; Summary; Further Reading

IV Appendix
- Case Studies of Real Processors: ARM® Processors; ARM® Cortex®-M3; ARM® Cortex®-A8; ARM® Cortex®-A15; AMD® Processors; AMD Bobcat; AMD Bulldozer; Intel® Processors; Intel® Atom™; Intel Sandy Bridge
- Graphics Processors: Overview; Graphics Applications; Graphics Pipeline; Fusion of High Performance Computing and Graphics Computing; NVIDIA Tesla Architecture; Work Distribution; GPU Compute Engines; Interconnection Network, DRAM Modules, L2 Caches, and ROPs; Streaming Multiprocessors (SMs); Computation on a GPU; CUDA Programs