Multicore DSP

From Algorithms to Real-time Implementation on the TMS320C66x SoC
1st edition

By: Naim Dahnoun

105,99 €

Publisher: Wiley
Format: EPUB
Published: 30 November 2017
ISBN/EAN: 9781119003854
Language: English
Number of pages: 648

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>The only book to offer special coverage of the fundamentals of multicore DSP for implementation on the TMS320C66x SoC</b></p> <p>This unique book provides readers with an understanding of the TMS320C66x SoC as well as its constraints. It offers a critical analysis of each element, which not only broadens readers' knowledge of the subject but also helps them understand how these elements work together.</p> <p>Written by Texas Instruments' first DSP Educator Award winner, Naim Dahnoun, the book teaches readers how to use the development tools and how to take advantage of the maximum performance and functionality of this processor. Its content spans architecture, development tools, programming models such as OpenCL and OpenMP, and debugging tools. It also covers various multicore audio and image applications in detail. Additionally, the book is supplemented with:</p> <ul> <li>A rich set of tested laboratory exercises and solutions</li> <li>Audio and image processing application source code for Code Composer Studio (the integrated development environment from Texas Instruments)</li> <li>Multiple tables and illustrations</li> </ul> <p>With no other book on the market covering the subject, and with its rich content of twenty chapters, <i>Multicore DSP: From Algorithms to Real-time Implementation on the TMS320C66x SoC</i> is a rare and much-needed source of information for undergraduates and postgraduates in the field, allowing them to make real-time applications work in a relatively short period of time. It is also highly beneficial to hardware and software engineers involved in programming real-time embedded systems.</p>
<p>Preface xviii</p> <p>Acknowledgements xxi</p> <p>Foreword xxii</p> <p>About the Companion Website xxiii</p> <p><b>1 Introduction to DSP 1</b></p> <p>1.1 Introduction 1</p> <p>1.2 Multicore processors 3</p> <p>1.2.1 Can any algorithm benefit from a multicore processor? 3</p> <p>1.2.2 How many cores do I need for my application? 5</p> <p>1.3 Key applications of high-performance multicore devices 6</p> <p>1.4 FPGAs, Multicore DSPs, GPUs and Multicore CPUs 8</p> <p>1.5 Challenges faced for programming a multicore processor 9</p> <p>1.6 Texas Instruments DSP roadmap 10</p> <p>1.7 Conclusion 11</p> <p>References 12</p> <p><b>2 The TMS320C66x architecture overview 14</b></p> <p>2.1 Overview 14</p> <p>2.2 The CPU 15</p> <p>2.2.1 Cross paths 16</p> <p>2.2.1.1 Data cross paths 17</p> <p>2.2.1.2 Address cross paths 18</p> <p>2.2.2 Register file A and file B 20</p> <p>2.2.2.1 Operands 20</p> <p>2.2.3 Functional units 21</p> <p>2.2.3.1 Condition registers 21</p> <p>2.2.3.2 .L units 22</p> <p>2.2.3.3 .M units 22</p> <p>2.2.3.4 .S units 23</p> <p>2.2.3.5 .D units 23</p> <p>2.3 Single instruction, multiple data (SIMD) instructions 24</p> <p>2.3.1 Control registers 24</p> <p>2.4 The KeyStone memory 24</p> <p>2.4.1 Using the internal memory 27</p> <p>2.4.2 Memory protection and extension 29</p> <p>2.4.3 Memory throughput 29</p> <p>2.5 Peripherals 30</p> <p>2.5.1 Navigator 32</p> <p>2.5.2 Enhanced Direct Memory Access (EDMA) Controller 32</p> <p>2.5.3 Universal Asynchronous Receiver/Transmitter (UART) 32</p> <p>2.5.4 General purpose input–output (GPIO) 32</p> <p>2.5.5 Internal timers 32</p> <p>2.6 Conclusion 33</p> <p>References 33</p> <p><b>3 Software development tools and the TMS320C6678 EVM 35</b></p> <p>3.1 Introduction 35</p> <p>3.2 Software development tools 37</p> <p>3.2.1 Compiler 38</p> <p>3.2.2 Assembler 39</p> <p>3.2.3 Linker 40</p> <p>3.2.3.1 Linker command file 40</p> <p>3.2.4 Compile, assemble and link 42</p> <p>3.2.5 Using the Real-Time Software Components (RTSC) 
tools 42</p> <p>3.2.5.1 Platform update using the XDCtools 42</p> <p>3.2.6 KeyStone Multicore Software Development Kit 47</p> <p>3.3 Hardware development tools 47</p> <p>3.3.1 EVM features 47</p> <p>3.4 Laboratory experiments based on the C6678 EVM: introduction to Code Composer Studio (CCS) 51</p> <p>3.4.1 Software and hardware requirements 51</p> <p>3.4.1.1 Key features 52</p> <p>3.4.1.2 Download sites 53</p> <p>3.4.2 Laboratory experiments with the CCS6 53</p> <p>3.4.2.1 Introduction to CCS 55</p> <p>3.4.2.2 Implementation of a DOTP algorithm 63</p> <p>3.4.3 Profiling using the clock 65</p> <p>3.4.4 Considerations when measuring time 67</p> <p>3.5 Loading different applications to different cores 67</p> <p>3.6 Conclusion 72</p> <p>References 72</p> <p><b>4 Numerical issues 74</b></p> <p>4.1 Introduction 74</p> <p>4.2 Fixed- and floating-point representations 75</p> <p>4.2.1 Fixed-point arithmetic 76</p> <p>4.2.1.1 Unsigned integer 76</p> <p>4.2.1.2 Signed integer 77</p> <p>4.2.1.3 Fractional numbers 77</p> <p>4.2.2 Floating-point arithmetic 78</p> <p>4.2.2.1 Special numbers for the 32-bit and 64-bit floating-point formats 81</p> <p>4.3 Dynamic range and accuracy 82</p> <p>4.4 Laboratory exercise 83</p> <p>4.5 Conclusion 85</p> <p>References 85</p> <p><b>5 Software optimisation 86</b></p> <p>5.1 Introduction 86</p> <p>5.2 Hindrance to software scalability for a multicore processor 88</p> <p>5.3 Single-core code optimisation procedure 88</p> <p>5.3.1 The C compiler options 90</p> <p>5.4 Interfacing C with intrinsics, linear assembly and assembly 91</p> <p>5.4.1 Intrinsics 91</p> <p>5.4.2 Interfacing C and assembly 92</p> <p>5.5 Assembly optimisation 97</p> <p>5.5.1 Parallel instructions 98</p> <p>5.5.2 Removing the NOPs 99</p> <p>5.5.3 Loop unrolling 99</p> <p>5.5.4 Double-Word Access 100</p> <p>5.5.5 Optimisation summary 100</p> <p>5.6 Software pipelining 101</p> <p>5.6.1 Software-pipelining procedure 105</p> <p>5.6.1.1 Writing linear assembly code 105</p> 
<p>5.6.1.2 Creating a dependency graph 105</p> <p>5.6.1.3 Resource allocation 108</p> <p>5.6.1.4 Scheduling table 108</p> <p>5.6.1.5 Generating assembly code 109</p> <p>5.7 Linear assembly 111</p> <p>5.7.1 Hand optimisation of the dotp function using linear assembly 112</p> <p>5.8 Avoiding memory banks 118</p> <p>5.9 Optimisation using the tools 118</p> <p>5.10 Laboratory experiments 123</p> <p>5.11 Conclusion 126</p> <p>References 126</p> <p><b>6 The TMS320C66x interrupts 127</b></p> <p>6.1 Introduction 127</p> <p>6.1.1 Chip-level interrupt controller 129</p> <p>6.2 The interrupt controller 135</p> <p>6.3 Laboratory experiment 140</p> <p>6.3.1 Experiment 1: Using the GPIOs to trigger some functions 140</p> <p>6.3.2 Experiment 2: Using the console to trigger an interrupt 140</p> <p>6.4 Conclusion 143</p> <p>References 144</p> <p><b>7 Real-time operating system: TI-RTOS 145</b></p> <p>7.1 Introduction 146</p> <p>7.2 TI-RTOS 146</p> <p>7.3 Real-time scheduling 148</p> <p>7.3.1 Hardware interrupts (Hwis) 148</p> <p>7.3.1.1 Setting an Hwi 149</p> <p>7.3.1.2 Hwi hook functions 149</p> <p>7.3.2 Software interrupts (Swis), including clock, periodic or single-shot functions 155</p> <p>7.3.3 Tasks 155</p> <p>7.3.3.1 Task hook functions 157</p> <p>7.3.4 Idle functions 158</p> <p>7.3.5 Clock functions 158</p> <p>7.3.6 Timer functions 158</p> <p>7.3.7 Synchronisation 158</p> <p>7.3.7.1 Semaphores 159</p> <p>7.3.7.2 Semaphore_pend 159</p> <p>7.3.7.3 Semaphore_post 159</p> <p>7.3.7.4 How to configure the semaphores 159</p> <p>7.3.8 Events 159</p> <p>7.3.9 Summary 163</p> <p>7.4 Dynamic memory management 163</p> <p>7.4.1 Stack allocation 165</p> <p>7.4.2 Heap allocation 165</p> <p>7.4.3 Heap implementation 165</p> <p>7.4.3.1 HeapMin implementation 165</p> <p>7.4.3.2 HeapMem implementation 165</p> <p>7.4.3.3 HeapBuf implementation 167</p> <p>7.4.3.4 HeapMultiBuf implementation 171</p> <p>7.5 Laboratory experiments 172</p> <p>7.5.1 Lab 1: Manual setup of the clock (part 1) 172</p> 
<p>7.5.2 Lab 2: Manual setup of the clock (part 2) 172</p> <p>7.5.3 Lab 3: Using Hwis, Swis, tasks and clocks 174</p> <p>7.5.4 Lab 4: Using events 187</p> <p>7.5.5 Lab 5: Using the heaps 189</p> <p>7.6 Conclusion 190</p> <p>References 191</p> <p>References (further reading) 191</p> <p><b>8 Enhanced Direct Memory Access (EDMA3) controller 192</b></p> <p>8.1 Introduction 192</p> <p>8.2 Type of DMAs available 193</p> <p>8.3 EDMA controllers architecture 194</p> <p>8.3.1 The EDMA3 Channel Controller (EDMA3CC) 194</p> <p>8.3.2 The EDMA3 transfer controller (EDMA3TC) 201</p> <p>8.3.3 EDMA prioritisation 201</p> <p>8.3.3.1 Trigger source priority 202</p> <p>8.3.3.2 Channel priority 203</p> <p>8.3.3.3 Dequeue priority 203</p> <p>8.3.3.4 System (transfer controller) priority 203</p> <p>8.4 Parameter RAM (PaRAM) 203</p> <p>8.4.1 Channel options parameter (OPT) 203</p> <p>8.5 Transfer synchronisation dimensions 203</p> <p>8.5.1 A – Synchronisation 204</p> <p>8.5.2 AB – Synchronisation 204</p> <p>8.6 Simple EDMA transfer 204</p> <p>8.7 Chaining EDMA transfers 208</p> <p>8.8 Linked EDMAs 208</p> <p>8.9 Laboratory experiments 210</p> <p>8.9.1 Laboratory 1: Simple EDMA transfer 211</p> <p>8.9.2 Laboratory 2: EDMA chaining transfer 211</p> <p>8.9.3 Laboratory 3: EDMA link transfer 213</p> <p>8.10 Conclusion 213</p> <p>References 213</p> <p><b>9 Inter-Processor Communication (IPC) 214</b></p> <p>9.1 Introduction 215</p> <p>9.2 Texas Instruments IPC 217</p> <p>9.3 Notify module 219</p> <p>9.3.1 Laboratory experiment 222</p> <p>9.4 MessageQ 222</p> <p>9.4.1 MessageQ protocol 224</p> <p>9.4.2 Message priority 229</p> <p>9.4.3 Thread synchronisation 229</p> <p>9.5 ListMP module 233</p> <p>9.6 GateMP module 234</p> <p>9.6.1 Initialising a GateMP parameter structure 234</p> <p>9.6.1.1 Types of gate protection 235</p> <p>9.6.2 Creating a GateMP instance 236</p> <p>9.6.3 Entering a GateMP 236</p> <p>9.6.4 Leaving a gate 236</p> <p>9.6.5 The list of functions that can be used by GateMP 
237</p> <p>9.7 Multi-processor Memory Allocation: HeapBufMP, HeapMemMP and HeapMultiBufMP 237</p> <p>9.7.1 HeapBuf_Params 238</p> <p>9.7.2 HeapMem_Params 239</p> <p>9.7.3 HeapMultiBuf_Params 239</p> <p>9.7.4 Configuration example for HeapMultiBuf 239</p> <p>9.8 Transport mechanisms for the IPC 241</p> <p>9.9 Laboratory experiments with KeyStone I 241</p> <p>9.9.1 Laboratory 1: Using MessageQ with multiple cores 241</p> <p>9.9.1.1 Overview 242</p> <p>9.9.2 Laboratory 2: Using ListMP, ShareRegion and GateMP 243</p> <p>9.10 Laboratory experiments with KeyStone II 249</p> <p>9.10.1 Laboratory experiment 1: Transferring a block of data 249</p> <p>9.10.1.1 Set the connection between the host (PC) and the KeyStone 249</p> <p>9.10.1.2 Explore the ARM code 250</p> <p>9.10.1.3 Explore the DSP code 259</p> <p>9.10.1.4 Compile and run the program 263</p> <p>9.10.2 Laboratory experiment 2: Transferring a pointer 267</p> <p>9.10.2.1 Explore the ARM code 267</p> <p>9.10.2.2 Explore the DSP code 271</p> <p>9.10.2.3 Compile and run the program 278</p> <p>9.11 Conclusion 278</p> <p>References 278</p> <p><b>10 Single and multicore debugging 280</b></p> <p>10.1 Introduction 281</p> <p>10.2 Software and hardware debugging 282</p> <p>10.3 Debug architecture 282</p> <p>10.3.1 Trace 282</p> <p>10.3.1.1 Standard trace 282</p> <p>10.3.1.2 Event trace 283</p> <p>10.3.1.3 System trace 285</p> <p>10.4 Advanced Event Triggering 286</p> <p>10.4.1 Advanced Event Triggering logic 289</p> <p>10.4.2 Unified Breakpoint Manager 294</p> <p>10.5 Unified Instrumentation Architecture 295</p> <p>10.5.1 Host-side tooling 295</p> <p>10.5.2 Target-side tooling 295</p> <p>10.5.2.1 Software instrumentation APIs 297</p> <p>10.5.2.2 Predefined software events and metadata 297</p> <p>10.5.2.3 Event loggers 297</p> <p>10.5.2.4 Transports 297</p> <p>10.5.2.5 SYS/BIOS event capture and transport 297</p> <p>10.5.2.6 Multicore support 297</p> <p>10.6 Debugging with the System Analyzer tools 298</p> <p>10.6.1 
Target-side coding with UIA APIs and the XDCtools 299</p> <p>10.6.2 Logging events with Log_write() functions 300</p> <p>10.6.3 Advance debugging using the diagnostic feature 301</p> <p>10.6.4 LogSnapshot APIs for logging state information 302</p> <p>10.7 Instrumentation with TI-RTOS and CCS 302</p> <p>10.7.1 Using RTOS Object Viewer 302</p> <p>10.7.2 Using the RTOS Analyzer and the System Analyzer 303</p> <p>10.7.2.1 RTOS Analyzer 303</p> <p>10.7.2.2 System Analyzer 303</p> <p>10.8 Laboratory sessions 305</p> <p>10.8.1 Laboratory experiment 1: Using the RTOS ROV 305</p> <p>10.8.2 Laboratory experiment 2: Using the RTOS Analyzer 305</p> <p>10.8.3 Laboratory experiment 3: Using the System Analyzer 312</p> <p>10.8.4 Laboratory experiment 4: Using diagnosis features 314</p> <p>10.8.5 Laboratory experiment 5: Using a diagnostic feature with filtering 317</p> <p>10.9 Conclusion 321</p> <p>References 322</p> <p>Further reading 323</p> <p><b>11 Bootloader for KeyStone I and KeyStone II 324</b></p> <p>11.1 Introduction 324</p> <p>11.2 How to start the boot process 325</p> <p>11.3 The boot process 325</p> <p>11.4 ROM Bootloader (RBL) 328</p> <p>11.4.1 The boot configuration format 336</p> <p>11.4.1.1 Creating the boot parameter table 336</p> <p>11.4.1.2 Creating the boot table 338</p> <p>11.4.1.3 The boot configuration table 338</p> <p>11.5 Boot process 340</p> <p>11.5.1 Initialisation stage for the KeyStone I 340</p> <p>11.5.2 Second-level bootloader 341</p> <p>11.5.2.1 Intermediate bootloader 341</p> <p>11.5.2.2 How to use the IBL 342</p> <p>11.6 Laboratory experiment 1 345</p> <p>11.6.1 Initialisation stage for the KeyStone II 350</p> <p>11.6.1.1 Bootloader initialisation after power-on reset 350</p> <p>11.6.1.2 Bootloader initialisation process after hard or soft reset 350</p> <p>11.6.2 Second bootloader for the KeyStone II 350</p> <p>11.6.2.1 U-Boot 351</p> <p>11.7 Laboratory experiment 2 352</p> <p>11.7.1 Printing the U-Boot environment 360</p> <p>11.7.2 Using the 
help for U-Boot 362</p> <p>11.8 TFTP boot with a host-mounted Network File System (NFS) server – NFS booting 363</p> <p>11.8.1 Laboratory experiment 3 364</p> <p>11.9 Conclusion 372</p> <p>References 372</p> <p><b>12 Introduction to OpenMP 374</b></p> <p>12.1 Introduction to OpenMP 375</p> <p>12.2 Directive formats 376</p> <p>12.3 Forking region 377</p> <p>12.3.1 omp parallel – parallel region construct 377</p> <p>12.3.1.1 Clause descriptions 378</p> <p>12.4 Work-sharing constructs 382</p> <p>12.4.1 omp for 382</p> <p>12.4.1.1 OpenMP loop scheduling 383</p> <p>12.4.2 omp sections 385</p> <p>12.4.3 omp single 386</p> <p>12.4.4 omp master 386</p> <p>12.4.5 omp task 387</p> <p>12.5 Environment variables and library functions 390</p> <p>12.6 Synchronisation constructs 392</p> <p>12.6.1 atomic 393</p> <p>12.6.1.1 Clauses 393</p> <p>12.6.2 barrier 395</p> <p>12.6.3 critical 396</p> <p>12.7 OpenMP accelerator model 397</p> <p>12.7.1 Supported OpenMP device constructs 397</p> <p>12.7.1.1 #pragma omp target 397</p> <p>12.7.1.2 #pragma omp target data 399</p> <p>12.7.1.3 #pragma omp target update 400</p> <p>12.7.1.4 #pragma omp declare target 401</p> <p>12.8 Laboratory experiments 402</p> <p>12.8.1 Laboratory experiment 1 402</p> <p>12.8.2 Laboratory experiment 2 402</p> <p>12.8.3 Laboratory experiment 3 404</p> <p>12.8.4 Laboratory experiment 4 405</p> <p>12.8.5 Laboratory experiment 5 405</p> <p>12.9 Conclusion 417</p> <p>References 419</p> <p><b>13 Introduction to OpenCL for the KeyStone II 420</b></p> <p>13.1 Introduction 421</p> <p>13.2 Operation of OpenCL 421</p> <p>13.3 Command queue 424</p> <p>13.3.1 Creating a command queue 427</p> <p>13.3.1.1 Command-queue properties 429</p> <p>13.3.2 Enqueueing a kernel 430</p> <p>13.4 Kernel declaration 431</p> <p>13.5 How do the kernels access data? 
431</p> <p>13.6 OpenCL memory model for the KeyStone 432</p> <p>13.6.1 Creating a buffer 433</p> <p>13.6.1.1 Cl_mem_flags 434</p> <p>13.7 Synchronisation 435</p> <p>13.7.1 Event with a callback function 436</p> <p>13.7.2 User event 439</p> <p>13.7.3 Waiting for one command or all commands to finish 439</p> <p>13.7.4 wait_group_events 440</p> <p>13.7.5 Barrier 440</p> <p>13.8 Basic debugging profiling 440</p> <p>13.9 OpenMP dispatch from OpenCL 443</p> <p>13.9.1 OpenMP for the kernel code 443</p> <p>13.9.2 OpenMP for the ARM code 443</p> <p>13.10 Building the OpenCL project 444</p> <p>13.11 Laboratory experiments 445</p> <p>13.11.1 Laboratory experiment 1: Hello World 446</p> <p>13.11.2 Laboratory experiment 2: dotp functions 454</p> <p>13.11.2.1 Explore the main.cpp function 454</p> <p>13.11.2.2 Explore the kernel dotp.cl 459</p> <p>13.11.2.3 Run the dotp program 460</p> <p>13.11.3 Laboratory experiment 3: USE_HOST_PTR 460</p> <p>13.11.4 Laboratory experiment 4: ALLOC_HOST_PTR 463</p> <p>13.11.5 Laboratory experiment 5: COPY_HOST_PTR 465</p> <p>13.11.6 Laboratory experiment 6: Synchronisation 467</p> <p>13.11.7 Laboratory experiment 7: Local buffer 473</p> <p>13.11.8 Laboratory experiment 8: Barrier 477</p> <p>13.11.9 Laboratory experiment 9: Profiling 479</p> <p>13.11.10 Laboratory experiment 10: OpenMP in kernel 484</p> <p>13.11.11 Laboratory experiment 11: OpenMP in ARM 487</p> <p>13.12 Conclusion 489</p> <p>References 490</p> <p><b>14 Multicore Navigator 491</b></p> <p>14.1 Introduction 491</p> <p>14.2 Navigator architecture 492</p> <p>14.2.1 The PKDMA 494</p> <p>14.2.1.1 PKDMA transmit side 495</p> <p>14.2.1.2 PKDMA receive side 495</p> <p>14.2.1.3 Infrastructure PKDMA 497</p> <p>14.2.2 Descriptors 497</p> <p>14.2.2.1 Host packet descriptors 498</p> <p>14.2.2.2 Monolithic packet descriptor 498</p> <p>14.2.2.3 Setting up the memory regions for the descriptors 498</p> <p>14.2.3 Queue Manager Subsystem 500</p> <p>14.2.4 Queue Manager 503</p> <p>14.2.4.1 Queue 
peek registers 503</p> <p>14.2.4.2 Link RAM 504</p> <p>14.2.5 Accumulator packet data structure processors 504</p> <p>14.2.5.1 Accumulation 506</p> <p>14.2.5.2 Quality of service 506</p> <p>14.2.5.3 Event management (resource sharing and job load balancing) 506</p> <p>14.2.6 Interrupt distributor module 506</p> <p>14.3 Complete functionality of the Navigator 506</p> <p>14.4 Laboratory experiment 511</p> <p>14.5 Conclusion 513</p> <p>References 514</p> <p><b>15 FIR filter implementation 515</b></p> <p>15.1 Introduction 515</p> <p>15.2 Properties of an FIR filter 516</p> <p>15.2.1 Filter coefficients 516</p> <p>15.2.2 Frequency response of an FIR filter 516</p> <p>15.2.3 Phase linearity of an FIR filter 517</p> <p>15.3 Design procedure 518</p> <p>15.3.1 Specifications 518</p> <p>15.3.2 Coefficients calculation 519</p> <p>15.3.2.1 Window method 519</p> <p>15.3.3 Realisation structure 522</p> <p>15.3.3.1 Direct structure 525</p> <p>15.3.3.2 Linear phase structures 525</p> <p>15.3.3.3 Cascade structures 527</p> <p>15.4 Laboratory experiments 528</p> <p>15.4.1 Filter implementation 529</p> <p>15.4.2 Synchronisation 530</p> <p>15.4.3 Building and running the DSP project 532</p> <p>15.4.4 Building and running the PC project 534</p> <p>15.5 Conclusion 540</p> <p>References 540</p> <p><b>16 IIR filter implementation 542</b></p> <p>16.1 Introduction 542</p> <p>16.2 Design procedure 543</p> <p>16.3 Coefficients calculation 543</p> <p>16.3.1 Pole–zero placement approach 543</p> <p>16.3.2 Analogue-to-digital filter design 543</p> <p>16.3.3 Bilinear transform (BZT) method 544</p> <p>16.3.3.1 Practical example of the bilinear transform method 547</p> <p>16.3.3.2 Coefficients calculation 547</p> <p>16.3.3.3 Realisation structures 548</p> <p>16.3.4 Impulse invariant method 552</p> <p>16.3.4.1 Practical example of the impulse invariant method 553</p> <p>16.4 IIR filter implementation 556</p> <p>16.5 Laboratory experiment 561</p> <p>16.6 Conclusion 561</p> <p>Reference 562</p> 
<p><b>17 Adaptive filter implementation 563</b></p> <p>17.1 Introduction 563</p> <p>17.2 Mean square error 564</p> <p>17.3 Least mean square 565</p> <p>17.4 Implementation of an adaptive filter using the LMS algorithm 565</p> <p>17.5 Implementation using linear assembly 567</p> <p>17.6 Implementation in C language with compiler switches 572</p> <p>17.7 Laboratory experiment 572</p> <p>17.8 Conclusion 573</p> <p>References 573</p> <p><b>18 FFT implementation 574</b></p> <p>18.1 Introduction 574</p> <p>18.2 FFT algorithm 574</p> <p>18.2.1 Fourier series 574</p> <p>18.2.2 Fourier transform 575</p> <p>18.2.3 Discrete Fourier transform 575</p> <p>18.2.4 Fast Fourier transform 576</p> <p>18.2.4.1 Splitting the DFT into two DFTs 576</p> <p>18.2.4.2 Exploiting the periodicity and symmetry of the twiddle factors 577</p> <p>18.3 FFT implementation 579</p> <p>18.4 Laboratory experiment 582</p> <p>18.4.1 Part 1: Implementation of DIF FFT 582</p> <p>18.4.2 Part 2: Using ping-pong EDMA 585</p> <p>18.5 Conclusion 590</p> <p>References 590</p> <p><b>19 Hough transform 591</b></p> <p>19.1 Introduction 591</p> <p>19.2 Theory 591</p> <p>19.3 Limits of r and θ 593</p> <p>19.4 Hough transform implementation 595</p> <p>19.5 Laboratory experiment 596</p> <p>19.6 Conclusion 603</p> <p>References 603</p> <p><b>20 Stereo vision implementation 604</b></p> <p>20.1 Introduction 604</p> <p>20.2 Algorithm for performing depth calculation 605</p> <p>20.3 Cost functions 606</p> <p>20.4 Implementation 607</p> <p>20.4.1 Laboratory experiment 610</p> <p>20.4.1.1 SAD implementation 610</p> <p>20.4.1.2 NCC implementation 611</p> <p>20.4.1.3 ZNCC implementation 611</p> <p>20.5 Conclusion 613</p> <p>References 616</p> <p>Index 617</p>
<p><strong>Naim Dahnoun</strong> is Reader in Teaching and Learning in Signal Processing in the Faculty of Engineering at the University of Bristol, UK.</p>
<p><strong>This text offers special coverage of the fundamentals of multicore DSP for implementation on the TMS320C66x SoC</strong></p> <p>This content provides readers with an understanding of the TMS320C66x SoC as well as its constraints. It offers a critical analysis of each element, which not only broadens readers' knowledge of the subject but also helps them understand how these elements work together.</p> <p>Written by Texas Instruments' first DSP Educator Award winner, Naim Dahnoun, the text teaches readers how to use the development tools and how to take advantage of the maximum performance and functionality of this processor. Its content spans architecture, development tools, programming models such as OpenCL and OpenMP, and debugging tools. The text also covers various multicore audio and image applications in detail and is supplemented with:</p> <ul><li>A rich set of tested laboratory exercises and solutions</li> <li>Audio and image processing application source code for Code Composer Studio (the integrated development environment from Texas Instruments)</li> <li>Multiple tables and illustrations</li> </ul> <p>With its rich content of twenty chapters, <em>Multicore DSP: From Algorithms to Real-time Implementation on the TMS320C66x SoC</em> is a rare and much-needed source of information for undergraduates and postgraduates in the field, allowing them to make real-time applications work in a relatively short period of time. This content is also highly beneficial to hardware and software engineers involved in programming real-time embedded systems.</p>
