#ACA RGPV Advanced Computer Architecture Unit 4 Part 3
Vector Instruction types
• Vector-Vector Instructions
• Vector-Scalar Instructions
• Vector-Memory Instructions
• Vector Reduction Instructions
• Gather and Scatter Instructions
• Masking Instructions
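The six instruction classes above can be sketched in software. The following is a minimal illustration using NumPy, whose array operations mirror the semantics of each class (the array values here are arbitrary examples, not from any particular vector machine):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# Vector-Vector: elementwise operation on two vector operands
vv = a + b                       # [11. 22. 33. 44.]

# Vector-Scalar: one scalar operand broadcast across a vector
vs = 2.0 * a                     # [2. 4. 6. 8.]

# Vector-Memory: move a whole vector between memory and a "register"
mem = np.zeros(4)
mem[:] = vv                      # vector store into memory

# Vector Reduction: collapse a vector into a single scalar result
s = np.sum(a)                    # 10.0

# Gather: indexed load from memory; Scatter: indexed store to memory
idx = np.array([3, 0, 2])
g = a[idx]                       # gather -> [4. 1. 3.]
out = np.zeros(4)
out[idx] = g                     # scatter back using the same indices

# Masking: apply the operation only where a boolean mask is true
mask = a > 2.0
masked = np.where(mask, a, 0.0)  # [0. 0. 3. 4.]
```

On real vector hardware each of these maps to one instruction (or a short sequence) operating on vector registers; the NumPy calls only model the data movement and element selection involved.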
SIMD organization: distributed memory model and shared memory model
Various architectures support parallel processing; they are broadly classified as multiprocessors and multicomputers. The shared-memory multiprocessor models are UMA (uniform memory access, e.g. typical SMP servers), NUMA (non-uniform memory access, e.g. Stanford DASH, SGI Origin 2000, Cray T3E), and COMA (cache-only memory architecture, e.g. KSR), all of which offer relatively low remote-memory-access latency.
The distributed-memory multicomputer model has no shared address space: nodes must communicate over a message-passing network, which makes the design highly scalable. It is also called the NORMA (no-remote-memory-access) model; examples include the IBM SP2, Intel Paragon, TMC CM-5, Intel ASCI Red, and PC clusters.
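The contrast between the two models can be sketched in ordinary Python. This is only an analogy, not real hardware: threads sharing a lock-protected variable stand in for shared-memory multiprocessors, while threads that exchange explicit messages through a queue (with no shared state) stand in for message-passing multicomputers.

```python
import threading
import queue

# Shared-memory model: workers communicate through a common variable,
# guarded by a lock (analogous to UMA/NUMA multiprocessors).
total = 0
lock = threading.Lock()

def shared_worker(values):
    global total
    for v in values:
        with lock:          # synchronize access to shared memory
            total += v

t1 = threading.Thread(target=shared_worker, args=([1, 2, 3],))
t2 = threading.Thread(target=shared_worker, args=([4, 5, 6],))
t1.start(); t2.start(); t1.join(); t2.join()
print(total)                # 21

# Message-passing model: "nodes" share nothing and exchange explicit
# messages over a channel (analogous to NORMA multicomputers).
mailbox = queue.Queue()

def sender(values):
    for v in values:
        mailbox.put(v)      # send a message over the "network"
    mailbox.put(None)       # end-of-stream marker

def receiver(result):
    while True:
        v = mailbox.get()   # receive a message
        if v is None:
            break
        result.append(v)

received = []
s = threading.Thread(target=sender, args=([7, 8, 9],))
r = threading.Thread(target=receiver, args=(received,))
s.start(); r.start(); s.join(); r.join()
print(sum(received))        # 24
```

Note the structural difference: the shared-memory version needs synchronization (the lock) around a common location, while the message-passing version needs no locks because all communication is through explicit send/receive operations.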
Principles of Multithreading: Multithreading Issues and Solutions
In computer architecture, multithreading is the ability of a central processing unit (CPU) or a single core in a multi-core processor to execute multiple processes or threads concurrently, appropriately supported by the operating system. This approach differs from multiprocessing, as with multithreading the processes and threads share the resources of a single or multiple cores: the computing units, the CPU caches, and the translation lookaside buffer (TLB).
Where multiprocessing systems include multiple complete processing units, multithreading aims to increase utilization of a single core by using thread-level as well as instruction-level parallelism. As the two techniques are complementary, they are sometimes combined in systems with multiple multithreading CPUs and in CPUs with multiple multithreading cores.
The multithreading paradigm has become more popular as efforts to further exploit instruction-level parallelism have stalled since the late 1990s. This allowed the concept of throughput computing to re-emerge from the more specialized field of transaction processing; even though it is very difficult to further speed up a single thread or single program, most computer systems are actually multitasking among multiple threads or programs. Thus, techniques that improve the throughput of all tasks result in overall performance gains.
Two major techniques for throughput computing are multithreading and multiprocessing.
If a thread gets a lot of cache misses, the other threads can continue taking advantage of the unused computing resources, which may lead to faster overall execution as these resources would have been idle if only a single thread were executed. Also, if a thread cannot use all the computing resources of the CPU (because instructions depend on each other’s result), running another thread may prevent those resources from becoming idle.
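The latency-hiding effect described above can be emulated in Python. In this sketch, `time.sleep` stands in for a long-latency stall such as a cache miss; running the workers sequentially exposes every stall, while running them as concurrent threads lets the stalls overlap, just as hardware multithreading keeps a core busy during one thread's miss. (The stall length is an arbitrary value chosen for the demo.)

```python
import threading
import time

STALL = 0.1  # emulated long-latency "cache miss", in seconds

def worker():
    time.sleep(STALL)  # thread is stalled, using no compute resources

# Sequential execution: each stall is fully exposed (~4 * STALL total)
start = time.perf_counter()
for _ in range(4):
    worker()
sequential = time.perf_counter() - start

# Multithreaded execution: stalls overlap (~1 * STALL total)
start = time.perf_counter()
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential {sequential:.2f}s, threaded {threaded:.2f}s")
```

The analogy is loose: Python threads here overlap sleep time, whereas a multithreaded core overlaps memory latency with useful instructions from other threads, but the throughput argument is the same.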
Multiple threads can interfere with each other when sharing hardware resources such as caches or translation lookaside buffers (TLBs). As a result, execution times of a single thread are not improved but can be degraded, even when only one thread is executing, due to lower frequencies or additional pipeline stages that are necessary to accommodate thread-switching hardware.
Overall efficiency varies; Intel claims up to 30% improvement with its Hyper-Threading Technology, while a synthetic program just performing a loop of non-optimized dependent floating-point operations actually gains a 100% speed improvement when run in parallel. On the other hand, hand-tuned assembly language programs using MMX or AltiVec extensions and performing data prefetches (as a good video encoder might) do not suffer from cache misses or idle computing resources. Such programs therefore do not benefit from hardware multithreading and can indeed see degraded performance due to contention for shared resources.
From the software standpoint, hardware support for multithreading is more visible to software, requiring more changes to both application programs and operating systems than multiprocessing does.