Special Feature
January 25, 2006
Small is in, big is out
Ujjwal K Dey
NEW DELHI -- Gone are the days when “big is better” was the rule. We are now in an era where everyone is looking for small yet productive things to use in daily life. Even big gadgets run on tiny processors. But we still use different processors for different functions within the same gadget.
Multiprocessor microprocessors, as the name suggests, are tiny chips that combine the functions of several different processors, or that embed still smaller processors performing all the functions expected of the gadget.
Immense research in the field of nanotechnology has brought us closer to chips so small that we would need a high-powered microscope to look at them. Their size would probably be about one-thousandth the width of a human hair. Imagine where we are headed: maybe towards Lilliput, the world of the little people, where miniature and micro articles are used!
Microprocessors are used in the technology of large rockets and in small home appliances like irons, mixer-grinders, etc. It is the period of convergence, where we see voice, video and data coming together. Without doubt, it is microprocessors that play a pivotal role in bringing everything alive.
The world is looking towards India for skilled manpower at economical rates. It is a huge opportunity for India to come to the forefront once again, to take the lead…
Cost a challenge
Cost has always posed a challenge to different industries in various forms, but in the area of multicore microprocessors India has both an advantage and a challenge. Prakash Vaswani, research analyst from In-Stat, pointed out that cost would be one of the biggest challenges for the multiprocessor microchip in the Indian market.
He explained, “Though there will surely be some early adopters for this technology, most of the companies will go in only after the performance and applications deliver promised results. Opportunities for the multiprocessor microchip market are immense in India. With the multiprocessor microchip offering vastly improved performance in databases, digital applications, security, etc., it will be the multimedia, animation, medical and analytics industries that will benefit the most. Moreover, with a lot of work being done in India in the field of EDA, semiconductor design, and developments in the telecom field, the multiprocessor chip will proffer benefits in these areas.”
Talking challenges
On the issue of India adapting to this multiprocessor microchip and how it is doing vis-à-vis its global adoption, Vaswani said, “Though there are lots of opportunities in India, in different application domains for the multiprocessor microchip, most companies will adopt this technology only after applications have been built to harness their true potential. It is likely that India will take to this technology only after it becomes a success in the western world.”
“The Indian multiprocessor microchip market will face the same challenges and opportunities as the rest of the world, at least as far as the Multicore Association is concerned,” said Markus Levy, president of the Embedded Microprocessor Benchmark Consortium (EEMBC).
The issues generally discussed at the association concern software for multicore/multiprocessor systems. Levy explained, “The Association (at least in its initial phases) will focus on the standardisation of different aspects of multi-core/processor software to enhance source code software portability.”
Steve Roddy, vice president - marketing, Tensilica, Inc., said, “While many Indian system designers might be interested in buying off-the-shelf multiprocessor chips, some are designing complex SOCs and ASICs themselves and they are increasingly turning to Tensilica’s technology to produce the optimised processor cores for their multiprocessor SOCs.”
“Tensilica’s latest processor technology is the Xtensa LX processor, the highest performance microprocessor and DSP technology. Tensilica has invested over US $100 million in technology and methodologies that automate the process of optimising a processor for a particular application. We also concentrate on optimisations for processor cores,” added Roddy.
Putting forward his point of view, Phil Casini, vice president, marketing and business development, Sonics Inc, said, “We have only seen Indian companies acting as subcontractors to our large semiconductor companies (such as TI, Toshiba, and Samsung). We have not seen any technologies explicitly penetrating the consumer or the communications markets we serve.”
He added, “Our R&D has been focused on creating intelligent interconnects that combine data flow services with leading edge bus fabrics for a complete solution.”
Cellomania
Hiroko Mochida of Toshiba Corporate Communications talked about the company’s revolutionary microprocessor, Cell. He said: “Cell is a revolutionary microprocessor, a core device that will sustain a whole spectrum of advanced information-rich broadband applications, from consumer electronics to home entertainment and through to various industrial systems. Cell’s breakthrough multi-core architecture and ultra high-speed communications capabilities deliver vastly improved, real-time response for entertainment and rich media applications, in many cases ten times the performance of the latest PC processors. The Cell will become the broadband processor used for everything from industrial applications to the new digital home.
“Another advantage of Cell is to support multiple operating systems, such as conventional operating systems (including Linux), real-time operating systems for computer entertainment and consumer electronics applications as well as guest operating systems for specific applications, simultaneously,” added Mochida.
Mochida also pointed to Toshiba’s foray into the nano-world, where Toshiba and Sony have announced the development of essential technologies for system LSI based on next-generation 45-nanometer (nm) process technology. The advances cover the development of carrier mobility enhancement technologies and wiring process technology for boosting LSI performance.
Memory technique
Asked whether picking the right memory technologies for multicore processors would pose a big challenge for the hardware, Casini commented, “The biggest issue today with multiprocessing is the pipe to memory. Sonics has introduced a product called MemMax that enables multiple threads to access the same memory bank(s) and uses queuing theory to optimise these multi-sourced requests. An improvement of up to 40 percent on memory bandwidth can be achieved, and seems to be good for streaming applications given market requirements today. Choosing memory technologies is important as well, but not as important as ensuring that the interconnect from the memories to the processing elements provides enough bandwidth to satisfy the requirements.”
Commenting on the requirements of the new breed of instruction set architectures (ISAs) that combine digital signal processing and microcontroller (DSP/MCU) functions, Roddy said configurable processors are ideal for such ISAs because they can perform common control (MCU) functions as well as be optimised for DSP, all in one processor, so there is no overhead from moving data between the DSP and MCU. This, he said, makes Tensilica’s configurable processors much more power-efficient and higher-performing than DSP/MCU combo architectures.
“Firewalls and digital rights management are key elements of any strategy and are actually now required in the chip itself. SonicsMX contains security firewalls and has assistance for DRM schemes,” added Casini.
Parallel processing the enhancer
Parallel processing has long been talked about as a way to improve performance and make multitasking easier. But the experts are not all of the same view on this issue.
Talking about parallel processing, Vaswani said that it would considerably improve the performance of the system. “Though not much improvement on speed, parallel processing will enable multitasking more efficiently and effectively.”
He further added, “This technology will help in applications involving complex computations, large databases, digital imaging, multitasking, improved networking and security. Thus, data centers, space technology, medical and biotechnology developments, multimedia and networking segments will benefit the most. Also, additional processors can be more effective in handling spam, viruses and other e-security issues.”
Levy explained that efficiently implemented parallel processing provides improved MIPS/Watt characteristics. For development efforts, many technology disciplines must come together to ensure compatibility and ease of integration. For example, when combining different processor types along with different operating systems, there needs to be better standardisation to support a common interface; this is something that the Multicore Association is working on with its communication API and resource management API.
Roddy took a somewhat different view of parallel processing and pointed out, “The key conceptual leap that designers across the globe are starting to realise is that classical 1980s-style parallel processing using symmetric, homogeneous processor cores is inefficient. Parallel processing failed to become mainstream in the 1980s and will fail to become mainstream in the high volume consumer markets of the 21st century. Outside of the PC market, classical parallel processing is not taking off. Instead, consumer applications demand the power, efficiency and performance of asymmetric, heterogeneous multi-core architectures.
“The simple comparison of the mobile phone to the PC shows this to be true. Mobile phones have shipped billions of units of multi-core architectures, while the PC is just beginning to use 20 million. The de facto dominant architecture for mobile phones today is that of an asymmetric, heterogeneous architecture. Mobile phones contain DSPs for the radio interface, controller processors for the operating system and user interface and power-efficient dedicated processors for compute intensive media tasks like video playback, digital still camera image processing, and audio,” added Roddy.
“We cannot really comment on this, as it depends on the segment and the development work, but, generally speaking, parallel processing is faster and more efficient,” said Mochida.
Casini explained, “Parallel processing is now a requirement to meet end user feature demands and is being employed in every performance SoC being designed today. This in turn is driving the need for more complex interconnects and that is why Sonics is growing very rapidly.”
Generation gap of the old and new
Bridging the gap between hardware generations has always been a concern for experts: how to update the old and prepare it for new needs. Rational decisions have always helped to bridge that gap. On this point Vaswani said, “The new architecture is fully compatible with old-generation hardware, as per the claims.” “With the right software tools, it should be very compatible at the user level. Even at the hardware level, vendors are doing things to virtualise the different processor structure,” added Levy.
Roddy pointed out that today’s software engineers increasingly rely on C code, not assembly code, for its productivity and portability. Similarly, system designers rely on operating systems to further aid in code portability. Thus most systems vendors will not have much difficulty in adopting newer processor hardware. Mochida believes Cell has a breakthrough multi-core architectural design and will be directed to next generation information-rich broadband applications.
Microprocessors increase use of power-aware systems
The microprocessor plays the role of a power tool for various power-aware systems, ranging from big rockets to small MP3 players. How much performance it delivers depends on how it is used.
It is a matter of taming small, complex chip technology and using it for complex tasks as well as for simple household matters.
Along the same lines, Vaswani explained, “The use for multicore processors in servers is very clear. Most scalable server applications are multithreaded and are instantly amenable to multicore processors. The need for increasing network and compute density plus increased processing complexity is pushing designers to multicore. Multicore server processors can increase the compute performance per processor socket, which can decrease maintenance costs and improve performance per watt.”
He added, “Multicore processors will be used more for server applications initially. They will also be helpful for PCs and other mobile devices provided the software is developed accordingly.” “Increasing software complexity and battery life would benefit from improvements of MIPS/Watt in multi-processing systems,” said Levy.
“We think the symmetric multicore microprocessor as defined by AMD and Intel will be used in the PC market, while other more power-aware systems, like mobile phones, will use multiple, asymmetric, heterogeneous processor system-on-chip (MPSOC) techniques,” commented Roddy.
“Multiple embedded processors are already in use in these power aware systems. While an entire cell phone might be able to be implemented on one large processor, it would entail a huge power penalty. Multiple optimised configurable processors consume much less battery life than one large processor. That’s why most cell phones today contain three to six processors (control, audio and video at least, or more),” he added.
Mochida said, “Our Cell is developed for information-rich broadband applications.”
Casini noted that heterogeneous processing is already a standard in the mobile and digital consumer markets. For example, TI’s OMAP 2 2420 applications processor contains four main engines and many interfaces. Multicore microprocessor solutions will only be used in PC applications because the nature of embedded system processing requires different processing elements.
Multithreaded software vs separate embedded processors
As the names suggest, multithreaded software handles multiple tasks on the same chip, while the separate-processor approach dedicates a different embedded processor to each task. Experts hold different opinions on this matter.
Vaswani said, “In multicore processors, two separate caches help to do away with the need to design a centralised cache for the processors. Moreover, two caches have shown good results for storing and processing data for the respective processing. Multi-threaded software applications, programs that run multiple tasks (threads) at the same time to increase performance for heavy workload scenarios, such as data mining, mathematical analysis, and Web serving, are already positioned to take advantage of multi-core processors.”
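To make the idea of a multithreaded application concrete, here is a minimal sketch in Python. It is an illustration only and is not drawn from any vendor quoted here; the function names `count_words` and `count_all` are hypothetical. Each document is an independent task handed to a pool of worker threads. Note that in standard CPython, threads mostly help with I/O-bound work, so this shows the structure of such a program rather than a guaranteed speed-up on a multi-core chip.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(document):
    """One independent task: count the words in a single document."""
    return len(document.split())


def count_all(documents, workers=2):
    """Run count_words over every document using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with documents.
        return list(pool.map(count_words, documents))


if __name__ == "__main__":
    docs = ["small is in", "big is out", "many threads share one chip"]
    print(count_all(docs))
```

Because the tasks do not depend on one another, the same program can in principle be spread over more workers without changing its logic, which is the property Vaswani describes.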
Roddy explained, “For embedded applications, Tensilica is not a proponent of multithreaded software and our customers have shown no interest in this area. Instead of multithreaded software, we recommend having various software functions running on separate embedded processors.” Mochida pointed out that Toshiba supports customers by providing the software that maximises Cell’s multi-core performance.
Specialised cores save power and improve performance
A lot of work is being done on the microprocessor front with the aim of saving power and improving performance. Specialised cores make it possible to achieve that with “less input, more output”.
But on this issue the experts have different points to make. Vaswani felt that “applications hold the key”. On the other hand, Levy believes higher performance and longer battery life are key.
Roddy pointed out that this is exactly where Tensilica fits in. “By using multiple specialised cores, designers can significantly save power and improve performance. Many SOCs are designed with multiple Xtensa processors. On an average, Tensilica’s customers have employed six Xtensa processors per design in 130nm technology, and one of our customers has designed a chip that employs almost 200 processors.
“Whether they’re homogeneous arrays of processors performing high-throughput communications tasks or a group of heterogeneous processors performing different tasks in an image processing chain, the allocation of performance among all of the tasks in a SOC design is much easier with multiple Xtensa processors than with just one control processor and multiple blocks of logic.
“There are several advantages to using multiple processors as SOC task building blocks. One of the biggest is that processors are inherently programmable, so functional changes can be made to the chip’s operation using firmware after the chip design is finished and even after the chip has been fabricated. Complex state machines can be implemented in firmware running on the processors, greatly reducing verification time,” said Roddy.
“In addition, a multiple-processor-based design approach promotes the flexible sharing and reuse of on-chip memories while reducing the overall amount of memory needed. Design with multiple processors facilitates system modeling with instruction-set simulators, which are much faster and more efficient than RTL-based system simulation.
“Additionally, by employing multiple processors in SOC designs, it’s easier to develop one SOC that works for several different related products. One of the hidden benefits of spreading tasks across multiple processors is that breaking the SOC’s overall task into smaller subtasks, and spreading these subtasks across multiple processors, actually speeds the process of writing and debugging the required software,” he continued.
Mochida explained, “Cell’s breakthrough multi-core architecture, which allows cores to be programmed for specific tasks, plus its ultra high-speed communications capabilities deliver vastly improved, real-time response for entertainment and rich media applications.”
“This is essentially what heterogeneous processing has evolved into. It is paramount to adopt this architecture for embedded applications,” pointed out Casini.
Helping vendors for better performance
Give and take is the rule in every phase of life, and the same applies here. Commenting on Tensilica’s role in this respect, Roddy stated, “The most powerful way Tensilica helps SOC designers to realise highly efficient designs is through the architectural (ISA level) customisation that Tensilica pioneered. Tensilica automated and patented the concept of simultaneous creation of hardware and software for processor cores.
“Thus the SOC designer gets a processor optimised for his/her application, with no excess baggage and no compromises. At the chip architecture level, Tensilica’s Xtensa LX processors deliver direct I/O capability equivalent to RTL design, eliminating the wasted power of moving data into and out of conventional DSP/RISC cores using Load/Store operations.
“At the circuit implementation level, Tensilica has automated the insertion of fine-grain clock gating for every functional element of the Xtensa processors including functions conceived of and created by the designer.
“Clock gating is a very effective power reduction technique that shuts down the power to parts of the logic that are not in use on a particular clock cycle. We also apply coarse-grain clock gating for further power optimisation,” he said.
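As a rough guide to why clock gating saves power, the standard first-order approximation for the switching power of CMOS logic can be written as below. This is a textbook relation, not a figure supplied by Tensilica, and the symbols are introduced here only for illustration: α is the fraction of the circuit that switches on a given clock cycle, C is the switched capacitance, V_dd is the supply voltage and f is the clock frequency.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V_{dd}^{2} \, f
```

Gating the clock to an idle block pushes that block's α towards zero for the cycles in which it is not needed, so its contribution to switching power falls accordingly, while leakage power is unaffected.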
Levy observed that programming APIs would provide access to a broader spectrum of multi-processing system software and tools.
“As a major supplier of SoC, we provide our customers with optimised support from the design stage on, in order to maximise power efficiency and system performance,” commented Mochida, whereas Casini felt, “SonicsMX has several levels of power management that enable SoC developers to implement very detailed power management schemes without having to design a lot of the infrastructure themselves. This in turn enables our customers to achieve very aggressive low power budgets.”