<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Steve Leibson &#187; Low-Power</title>
	<atom:link href="http://low-powerdesign.com/sleibson/index.php/category/low-power/feed/" rel="self" type="application/rss+xml" />
	<link>http://low-powerdesign.com/sleibson</link>
	<description>Leibson's Laws and the Penalties for Breaking Them</description>
	<lastBuildDate>Wed, 01 Feb 2012 00:01:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Do you believe in 21st century Intelligent Design?</title>
		<link>http://low-powerdesign.com/sleibson/2012/02/01/do-you-believe-in-21st-century-intelligent-design/</link>
		<comments>http://low-powerdesign.com/sleibson/2012/02/01/do-you-believe-in-21st-century-intelligent-design/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 00:01:15 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[Design]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[Honeywell]]></category>
		<category><![CDATA[Learning Thermostat]]></category>
		<category><![CDATA[Nest]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=779</guid>
		<description><![CDATA[Late last month, columnist Mike Cassidy wrote about visionary Clayton Christensen’s Innovator’s Dilemma in the San Jose Mercury News and his words reminded me that it was time, past time, to make yet another blog-based plea for intelligent design. No, &#8230; <a href="http://low-powerdesign.com/sleibson/2012/02/01/do-you-believe-in-21st-century-intelligent-design/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Late last month, columnist Mike Cassidy wrote about visionary Clayton Christensen’s Innovator’s Dilemma in the <strong>San Jose Mercury News</strong> and his words reminded me that it was time, past time, to make yet another blog-based plea for intelligent design. No, I’m not talking about “intelligent design” in the form of an alternative to evolutionary theory. Not that one. I’m talking about “intelligent design” in the form of adding more microprocessors and more software to all electronic designs in a valiant attempt to produce products that are more aware of the context of their surroundings. In other words, products that are far less stupid. At the same time, I believe that this form of intelligent design can help you cut product manufacturing cost.</p>
<p>More value at lower cost. Doesn’t that seem like a good deal?</p>
<p>Here’s what Cassidy wrote that triggered this blog:</p>
<p>“Clay Christensen has an idea: Scare the hell out of yourselves.</p>
<p>OK, that&#8217;s not precisely the way he put it. But the author of &#8220;The Innovator&#8217;s Dilemma&#8221; is all about new ideas. Not just new &#8212; but different, unorthodox, radical, uncertain, frightening and disruptive. You don&#8217;t solve old problems with old ideas. The other day, Christensen held a one-man teach-in for non-profits and their supporters at San Jose&#8217;s Mexican Heritage Plaza, preaching the gospel of &#8220;disruptive innovation.&#8221; It&#8217;s an idea that is embedded in Silicon Valley&#8217;s DNA. It is also an idea that is a lot easier to talk about than to actually deploy.”</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2012/01/Honeywell-Thermostat-circa-1952.png"><img class="alignright size-full wp-image-780" title="Honeywell Thermostat circa 1952" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2012/01/Honeywell-Thermostat-circa-1952.png" alt="" width="200" height="201" /></a>So what’s the connection with “intelligent design”? I was immediately reminded of a great new product, a home thermostat of all things, that I’d just written up from last month’s CES show. (See “<a href="http://eda360insider.wordpress.com/2012/01/13/two-minutes-of-system-design-expertise-from-matt-rogers-vp-of-engineering-and-founder-of-nest-and-designer-of-the-thermostat-of-the-future/" target="_blank">Friday Video: Two minutes of system-design expertise from Matt Rogers, VP of Engineering and founder of Nest and designer of the thermostat of the future</a>”) The thermostat is from a new company called <a href="www.nest.com " target="_blank">Nest</a> and the product is called the Nest Learning Thermostat. It looks like an updated 21<sup>st</sup> century version of the old golden Honeywell manual thermostats that were common in the 1950s and 1960s. (Noted industrial designer Henry Dreyfuss created the Honeywell thermostat’s iconic circular industrial design in 1952.)</p>
<p>However, unlike those old Honeywell thermostats that were based on bimetallic temperature-sensing coil springs and mercury tilt switches, the Nest Learning Thermostat is based on a 32-bit microprocessor. And a TCP/IP stack. And WiFi. In adding these specific technologies to its Learning Thermostat, Nest demonstrates a grasp of Christensen’s concept of the “Innovator’s Dilemma.”</p>
<p>How?</p>
<p>First, understand that the context of what we mean by “home” has changed. Frequently in the modern Western world, there’s no one home. That means the home’s heating and cooling requirements are different. Also, our definition of “home” increasingly includes a home WiFi network, possibly with access to the outside world through the home’s broadband router. Add some intelligence and connectivity to a thermostat, and you can disruptively change how we heat and cool our homes, with large resulting energy savings.</p>
<p>After installation, the Nest Learning Thermostat needs to know three things:</p>
<ol>
<li>What’s your ZIP code? (Yes, I know there’s a US bias built in here. Early intelligence has its limits.)</li>
<li>Should the thermostat start to heat or cool your home?</li>
<li>What temperatures should the Nest Thermostat use to heat or cool your home while you are away?</li>
</ol>
<p>After that, you set the thermostat to the desired current temperature and the Nest Learning Thermostat then starts to observe your daily habits (with respect to heating and cooling only!). It monitors when you turn up the heat (like in the morning) and when you turn it down (like before you go to sleep). It notes when you turn on the cooling (like when the afternoon sun starts to make it overly warm for you).</p>
<p>The thermostat learns your daily routine and your weekend routine (weekend habits are different for most people) and by the eighth day, the thermostat has developed a pretty good idea of your habits and strives to maintain a comfortable home for you by catering to those habits. The Nest Learning Thermostat can start to heat your home several minutes before you rise in the morning. (This is a major feature for those of us whose first job in the morning is to turn up the heat for our spouses so they can get up.)</p>
<p>You can log into your thermostat from your office to adjust the heat so that your home is warm by the time you return from work or to let the thermostat know you’re going out for dinner and the heat can be delayed for a few hours. And of course, you can do the same from your smartphone no matter where you are on the planet (assuming you’ve got cellular coverage wherever you are).</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2012/02/Nest-Learning-Thermostat.png"><img class="alignright size-full wp-image-782" title="Nest Learning Thermostat" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2012/02/Nest-Learning-Thermostat.png" alt="" width="250" height="244" /></a>The Nest Learning Thermostat also tries to train you after it learns your habits. It wants you to learn better habits in terms of energy consumption. It does this by starting to display a small green leaf when you turn down the heat or taper off on the air conditioning relative to your habitual heating and cooling use. This leaf tells you that you’re saving energy and thus money. It’s a subtle form of coercion and some people won’t like being told what to do by a thermostat. Others—the ones most likely to buy this product—will appreciate the watchful eye.</p>
<p>None of this would be possible if the Nest Learning Thermostat did not have a 32-bit microprocessor and a WiFi connection. Note that it took an entirely new company to build a thermostat like this. Although Honeywell still makes thermostats, microprocessor-based ones at that, it’s not building anything like the Nest Learning Thermostat. At least not this year. We just had to replace the thermostat in our condo and it is a Honeywell thermostat. It’s a standard setback thermostat design with an LCD. I have to tell it the time. I have to program it for a setback sequence. And Honeywell knew that the user interface on this product was so unintuitive that it kindly included a fold-out instruction booklet that pokes out of the top left of the thermostat, as you can see from this photo.</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2012/02/New-Honeywell-Thermostat.png"><img class="alignright size-full wp-image-783" title="New Honeywell Thermostat" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2012/02/New-Honeywell-Thermostat.png" alt="" width="560" height="378" /></a></p>
<p>Frankly, I find the display of the new Honeywell thermostat extremely confusing. It shows the current temperature on the left side in large characters, the setpoint temperature in somewhat smaller characters on the right side, and the time in even smaller characters in the middle. To me, it looks like the display information was simply thrown on the display with little consideration to how the information is portrayed. The time is not the central piece of information on a thermostat yet that information is central to the display, albeit in small characters. The two important and conjoined pieces of information, the current and setpoint temperatures, are spaced nearly as far apart on the display as possible. And there’s no fixing this design with a firmware update. That LCD’s permanently configured to save manufacturing cost.</p>
<p>Not, repeat not intelligent design.</p>
<p>You can bet that the Nest Learning Thermostat does not have an ungainly instruction booklet poking out of its sleek, smooth industrial design. To anyone who has ever used a regular dumb thermostat, the Nest Learning Thermostat’s everyday operational use appears to be entirely intuitive. And if you want to do more complex things with the Nest Learning Thermostat, you interact with it through a Web page, not a handful of multi-use rubber buttons and a limited (for cost reasons) LCD. For these reasons (and a few more), I think the Nest Learning Thermostat is one of the best examples of 21<sup>st</sup> century intelligent design I’ve seen. It offers up several lessons in thoughtful design that I hope you will appreciate as much as I do.</p>
<p>Note: To read Mike Cassidy’s entire column, click on the following link: “<a href="http://www.mercurynews.com/mike-cassidy/ci_19778468" target="_blank">Clay Christensen sees Silicon Valley non-profits&#8217; dilemma as the innovator&#8217;s dilemma</a>”</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2012/02/01/do-you-believe-in-21st-century-intelligent-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3-Hour, $50 Short course in Low-Power Design with Prof. Jan Rabaey. Silicon Valley, Jan 31</title>
		<link>http://low-powerdesign.com/sleibson/2012/01/10/3-hour-50-short-course-in-low-power-design-with-prof-jan-rabaey-silicon-valley-jan-31/</link>
		<comments>http://low-powerdesign.com/sleibson/2012/01/10/3-hour-50-short-course-in-low-power-design-with-prof-jan-rabaey-silicon-valley-jan-31/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 19:27:07 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[Low-Power]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=773</guid>
		<description><![CDATA[The Santa Clara Valley (SCV) Chapter of the IEEE Solid State Circuits Society is hosting a 3-hour short course in low-power design at the end of this month. The course is divided into two parts: Fundamentals of low-power design and &#8230; <a href="http://low-powerdesign.com/sleibson/2012/01/10/3-hour-50-short-course-in-low-power-design-with-prof-jan-rabaey-silicon-valley-jan-31/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The Santa Clara Valley (SCV) Chapter of the IEEE Solid State Circuits Society is hosting a 3-hour short course in low-power design at the end of this month. The course is divided into two parts:</p>
<ol>
<li>Fundamentals of low-power design and a review of well-established low-power design techniques</li>
<li>New low-power design techniques that will come into their own based on current technology and application trends.</li>
</ol>
<p>This is a rare opportunity to get a concentrated presentation from Professor Jan Rabaey of UC Berkeley, who happens to be an excellent and engaging speaker. The short course will be taught at the TI Silicon Valley Auditorium (formerly National Semiconductor), 2900 Semiconductor Drive, Santa Clara, CA.</p>
<p>&nbsp;</p>
<p>The charge is $50 plus a $3.74 event fee. Register <a href="http://www.eventbrite.com/event/2737252195" target="_blank">here</a>.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2012/01/10/3-hour-50-short-course-in-low-power-design-with-prof-jan-rabaey-silicon-valley-jan-31/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2011: A great year for low-power design, wasn’t it? Part B</title>
		<link>http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it-part-b/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it-part-b/#comments</comments>
		<pubDate>Sat, 17 Dec 2011 22:13:31 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[2.5D]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[28nm]]></category>
		<category><![CDATA[Altera]]></category>
		<category><![CDATA[Arria]]></category>
		<category><![CDATA[Virtex]]></category>
		<category><![CDATA[Wide I/O]]></category>
		<category><![CDATA[Xilinx]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=748</guid>
		<description><![CDATA[2011 was a great year for low-power design. I don’t think I can remember a year as good to low-power designers and I thought I’d devote this second part of my blog post on this topic to review some major &#8230; <a href="http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it-part-b/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>2011 was a great year for low-power design. I don’t think I can remember a year as good to low-power designers and I thought I’d devote this second part of my blog post on this topic to review some major process and packaging technology advances that occurred this year, which I think will have major implications for years to come.</p>
<h3><strong>28nm</strong></h3>
<p>Perhaps the biggest process innovation to hit the semiconductor industry, with a long-tail effect on low-power semiconductor design, will turn out to be the 28nm process node. Here are the clues on which I’m basing this opinion.</p>
<p>First, Xilinx provided a very long and detailed explanation as to why it selected the TSMC 28nm HPL process technology for all three members of its series 7 FPGAs (Virtex-7, Kintex-7, and Artix-7) over the TSMC 28nm HP and LP process technologies. The 28nm HP process variant is strictly for high-performance designs and the 28nm LP process variant, which employs PolySiON (polysilicon/silicon oxy-nitride) gate oxide, delivers low leakage but also lower-performance transistors. The 28nm HPL process variant is a high-K metal-gate technology that produces high-speed, low-leakage transistors. (See “<a href="http://eda360insider.wordpress.com/2011/07/04/xilinx-28nm-low-power-soc-design-class-part-2-process-technology/" target="_blank">Xilinx 28nm low-power SoC design class, part 2: Process Technology</a>”)</p>
<p>It’s not Xilinx’ selection of the 28nm HPL process that’s the highlight here—it’s the fact that there are two low-power variants of the 28nm process available to IC design teams. They have the flexibility to pick the process variant that best meets the objectives for a specific IC.</p>
<p>Altera took a different approach in developing a low-power FPGA based on 28nm process technology. Late last month, the company <a href="http://www.altera.com/corporate/news_room/releases/2011/products/nr-arria-v-shipping.html " target="_blank">announced</a> that it had started shipping low-power Arria V FPGAs based on the TSMC 28nm LP process variant because it allows the Arria V FPGA family to deliver “the lowest total power, lowest static power, and lowest transceiver power of any midrange FPGA family, consuming up to 40% less power compared to previous generation devices.”</p>
<p>It’s this flexibility in 28nm process variants that I think will allow all sorts of interesting low-power design techniques to be used in future products designed with 28nm process technology. And the 28nm process node is critically important for another reason: It appears to be the last process node to be manufactured without taking extraordinary measures in lithography. By that, I mean that after the 28nm node, we will be seeing a big bump in lithographic complexity, first through double, triple, and quadruple patterning and then through EUV (extreme ultraviolet, aka X-rays) lithography. Scaling is about to get much more difficult after 28nm.</p>
<h3><strong>3D IC Assembly</strong> </h3>
<p>Which leads me to the other big semiconductor manufacturing innovation whose time has apparently arrived: 3D IC assembly. I believe that the time has arrived for 3D IC assembly to become mainstream and that we will see a revolution in SoC design and development based on adding 3D design to the mix. There are just too many benefits to ignore and I discussed some of these in my previous blog post (Part A).</p>
<p>To recap, the big advantages are:</p>
<ul>
<li>A huge reduction in I/O power to transfer signals from chip to chip</li>
<li>A huge increase in chip-to-chip bandwidth without increasing I/O power consumption or packaging costs</li>
<li>Ability to intermix logic, memory, analog, and RF functions by implementing them on separately optimized die (using separately optimized process technologies) and then creating appropriate 3D assemblies</li>
<li>Cost advantages by reducing average die size</li>
<li>Reduction in pressure to jump to the next process node (and higher design and NRE costs) by providing an alternative path to SoC-level integration</li>
</ul>
<p>These advantages cannot be ignored and indeed companies are not ignoring them. TSMC itself has already stepped in and proposed itself as a 1-stop shop for IC die manufacture and 3D assembly. (See “<a href="http://eda360insider.wordpress.com/2011/12/16/3d-week-the-state-of-3d-ic-assembly-december-2011/" target="_blank">3D Week: The State of 3D IC assembly—December 2011</a>”) Xilinx is already solidly in the 2.5D IC assembly camp with the Virtex-7 2000T FPGA and you can bet that technology will make its way down the product line as quickly as the costs of early adoption can be reduced.</p>
<p>JEDEC has announced the finalization of the Wide I/O SDRAM specification, which is essential to the development of 3D memory stacks with high-speed, low-power data-transfer features. (See “<a href="http://eda360insider.wordpress.com/2011/12/14/3d-week-jedec-wide-io-memory-spec-cleared-for-use/" target="_blank">3D Week: JEDEC Wide I/O Memory spec cleared for use</a>”)</p>
<p>The economics of the entire industry now solidly point to the adoption of 3D IC assembly as a mainstream packaging technology. (See “<a href="http://eda360insider.wordpress.com/2011/12/14/3d-week-driven-by-economics-its-now-one-minute-to-3d/" target="_blank">3D Week: Driven by economics, it’s now one minute to 3D</a>”) Between the increasing availability of 28nm process technology and the rise of 3D IC assembly, I see a new flowering of electronic design the likes of which we have not seen since the early 1990s, when surface-mount technology and ASIC design both came of age nearly simultaneously. That was truly a great era for the industry. The same sort of simultaneous advance in semiconductor and packaging technology is taking place with 28nm process technology and 3D IC assembly.</p>
<p>It’s a great time for low-power design.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it-part-b/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2011: A great year for low-power design, wasn’t it?</title>
		<link>http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it/#comments</comments>
		<pubDate>Sat, 17 Dec 2011 18:46:25 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[2.5D]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Design]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[Microcontroller]]></category>
		<category><![CDATA[Multicore]]></category>
		<category><![CDATA[SOC]]></category>
		<category><![CDATA[2000T]]></category>
		<category><![CDATA[NXP]]></category>
		<category><![CDATA[Virtex-7]]></category>
		<category><![CDATA[Xilinx]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=734</guid>
		<description><![CDATA[2011 was a great year for low-power design. I don’t think I can remember a year as good to low-power designers. I thought I’d devote this blog to a review of some major developments in 2011 that made low-power designers’ &#8230; <a href="http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>2011 was a great year for low-power design. I don’t think I can remember a year as good to low-power designers. I thought I’d devote this blog to a review of some major developments in 2011 that made low-power designers’ lives easier. In fact, there’s so much to talk about that I’m splitting this blog post in two. In the first half, I’ll write about significant developments in standard silicon offerings including microcontrollers, embedded application processors, and FPGAs. In part B, I’ll discuss some of the year’s most significant developments in design at the silicon level and the implications for people who design ASICs, SoCs, and ASSPs. It truly was a bountiful year.</p>
<h3><strong>Low-power microcontrollers</strong></h3>
<p>If there ever was a year for microcontroller advancement, this was it. Every major microcontroller vendor had something new and exciting on the low-power front. So many developments that I can only hit the highlights:</p>
<p>In August, ARM’s Alan Rampon wrote a blog post listing 17 microcontroller vendors that were offering a broad range of low-power devices based on various ARM Cortex-M series processor cores. The vendor list includes:</p>
<ul>
<li>Analog Devices (Cortex-M3)</li>
<li>Atmel (Cortex-M3)</li>
<li>Broadcom (Cortex-M3)</li>
<li>Cypress Semiconductor (Cortex-M3)</li>
<li>Dust Networks (Cortex-M3)</li>
<li>Ember (Cortex-M3)</li>
<li>Energy Micro (Cortex-M0, M3)</li>
<li>Freescale Semiconductor (Cortex-M4)</li>
<li>Fujitsu (Cortex-M3)</li>
<li>Holtek (Cortex-M3)</li>
<li>Nuvoton (Cortex-M0)</li>
<li>NXP (Cortex-M0, M3, M4)</li>
<li>ON Semiconductor (Cortex-M3)</li>
<li>Samsung (Cortex-M0, M3)</li>
<li>ST Microelectronics (Cortex-M3)</li>
<li>Texas Instruments (Cortex-M3)</li>
<li>Toshiba (Cortex-M3)</li>
</ul>
<p>That list is probably somewhat dated already, but you get the idea. The proliferation of low-power microcontrollers greatly accelerated during 2011. One such development that really sticks in my mind (because it’s recent) is the onset of shipments of the <a href="http://eda360insider.wordpress.com/2011/12/05/asymmetric-dual-core-nxp-lpc4300-microcontrollers-split-tasks-between-arm-cortex-m4-and-m0-cores-cost-3-75-and-up/" target="_blank">NXP Semiconductor LPC4350</a>, which packs an ARM Cortex-M4 and an ARM Cortex-M0 into one microcontroller that costs less than $4 in quantities of 10,000.</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/NXP-LPC4350-block-diagram.jpg"><img class="aligncenter size-full wp-image-738" title="NXP LPC4350 block diagram" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/NXP-LPC4350-block-diagram.jpg" alt="" width="560" height="454" /></a></p>
<p>This microcontroller is on the forefront of a new wave of processor design called “asymmetric multiprocessing” and there’s a real “wave-of-the-future” look to this development. (See “<a href="http://eda360insider.wordpress.com/2011/12/07/more-news-on-the-asymmetric processing-soc-front/" target="_blank">More news on the asymmetric processing SoC front</a>”)</p>
<h3><strong>Asymmetric Multiprocessing</strong></h3>
<p>The microprocessor is 40 years old (last month!) and silicon microprocessor implementations have really advanced over those four decades while many of our design memes have not. In particular, I’m thinking of the meme that says “processors are expensive, so layer as many tasks as possible on a processor to save money.” The net effect of this meme is to make us develop increasingly complex multitasking schemes in an attempt to get processor utilization up to 80% or 90% or perhaps even 95%.</p>
<p>Now any engineer can tell you that when you load any component to near 100%, you have just sent an engraved, gold-plated invitation to Murphy, asking for an audience. In other words, something will go wrong. You won’t always get the latency you expected. You won’t always get the bandwidth you need.</p>
<p>So you’d better ask yourself: Are complex multitasking systems really worth the effort when I can get two 32-bit microprocessor cores in one device for less than $4? You’d better be serious coming up with that answer. I believe that asymmetric multiprocessing will remake all of design, including low-power design, during this coming decade.</p>
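<p>One way to see why loading a processor to near 100% invites trouble is a textbook queueing model. The sketch below is an illustration only: it assumes a single processor behaving like an M/M/1 queue (random task arrivals, exponentially distributed service times), which real embedded workloads rarely match exactly, and the 10-microsecond service time is a hypothetical number chosen for readability.</p>

```python
def mean_response_time(service_time_us, utilization):
    """Mean response time of an M/M/1 queue: S / (1 - rho).

    service_time_us: average time to service one task (assumed value).
    utilization: fraction of time the processor is busy, in [0, 1).
    """
    if utilization >= 1.0 or 0.0 > utilization:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_us / (1.0 - utilization)


SERVICE_TIME_US = 10.0  # hypothetical average task service time

for utilization in (0.50, 0.80, 0.90, 0.95, 0.99):
    latency = mean_response_time(SERVICE_TIME_US, utilization)
    print(f"{utilization:.0%} utilized: {latency:.0f} us average response time")
```

<p>Under these assumptions, average response time doubles between 80% and 90% utilization and doubles again by 95%. Splitting the work across two lightly loaded cores keeps each one on the flat part of that curve.</p>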
<p>Asymmetric multiprocessor design wasn’t the only innovation that loomed in 2011. Xilinx finally <a href="http://low-powerdesign.com/sleibson/2011/03/01/xilinx-zynq-epps-create-a-new-category-that-fits-in-among-socs-fpgas-and-microcontrollers/" target="_blank">announced</a> the first four members of its new Zynq 7000 EPP (Extensible Processing Platform) family, which fuses a processor complex containing two ARM Cortex-A9 processor cores with an FPGA fabric.</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/Xilinx-EPP-Block-Diagram-v4.jpg"><img class="aligncenter size-full wp-image-737" title="Xilinx EPP Block Diagram v4" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/Xilinx-EPP-Block-Diagram-v4.jpg" alt="" width="587" height="469" /></a></p>
<p>Now assembling systems with microprocessors and FPGAs isn’t new. In fact, putting processor cores and FPGA fabrics onto the same piece of silicon isn’t particularly new either. However, doing it right? That is new. And this development fits into the low-power design world because putting the processor complex and the FPGA fabric on chip with a massive on-chip interconnect between the two cuts interface power significantly by reducing the interconnect frequency. You don’t need GHz interconnect clock rates when you have thousands of wires for parallel interconnect.</p>
<h3><strong>2.5D IC Assembly</strong></h3>
<p>Speaking of Xilinx, the company started shipping engineering samples of the Virtex-7 2000T FPGAs to customers last month and this too is a low-power design story. The story is completely told in this graphic:</p>
<p>&nbsp;</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/Xilinx-Virtex-7-2000T.jpg"><img class="aligncenter size-full wp-image-735" title="Xilinx Virtex 7 2000T" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/Xilinx-Virtex-7-2000T.jpg" alt="" width="560" height="319" /></a></p>
<p>&nbsp;</p>
<p>The Xilinx Virtex-7 2000T is a very large FPGA with two million logic elements. But it’s not a monolithic piece of silicon. Rather, the Virtex-7 2000T consists of four “identical” FPGA tiles, each with half a million logic elements (and a ton of other stuff). The FPGA tiles are mounted on a silicon interposer, which establishes more than 10,000 connections between each tile (56,000 connections in total). The silicon interposer is a fascinating piece of technology. It’s a 65nm IC with four layers of metal on each side of the die and no transistors. It’s a silicon circuit board that must be made in a wafer fab. In this case, TSMC owns the fab. The interposer is as large as the stepper reticle will allow. The advantage here is that each FPGA tile is a quarter of the size of the interposer, and die yield has an exponential relationship to die size. The smaller the die, the better the yield percentage. So 2.5D assembly makes a lot of sense in several different ways.</p>
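<p>The exponential relationship between die yield and die size can be sketched with the simple Poisson yield model, Y = exp(-A * D), where A is die area and D is defect density. This is only an illustration: the reticle-sized area and the defect density below are hypothetical values, not figures from Xilinx or TSMC, and production fabs use more refined yield models.</p>

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Poisson yield model: fraction of die expected to have zero defects."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


# Assumed, illustrative numbers (not from Xilinx or TSMC).
RETICLE_AREA_CM2 = 8.0  # hypothetical full-reticle die area
DEFECT_DENSITY = 0.25   # hypothetical defects per square centimeter

monolithic_yield = poisson_yield(RETICLE_AREA_CM2, DEFECT_DENSITY)
tile_yield = poisson_yield(RETICLE_AREA_CM2 / 4, DEFECT_DENSITY)

print(f"full-size die yield:     {monolithic_yield:.1%}")
print(f"quarter-size tile yield: {tile_yield:.1%}")
```

<p>With these assumed numbers, a full-size die yields about 13.5% while a quarter-size tile yields about 60.7%. In this model, four untested tiles assembled together would fare no better than the monolithic die. The benefit comes from testing tiles individually and assembling only good ones, so a single defect scraps one tile rather than the whole device.</p>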
<p>The 2.5D IC assembly-with-interposer approach taken to create the Xilinx Virtex-7 2000T allows the FPGA tiles to use lower power I/O drivers because these drivers will only be driving short, closely controlled traces between adjacent tiles. That system-design knowledge saves power. Although the Xilinx Virtex-7 2000T uses four identical die fabricated with a 28nm process technology to realize the active elements, 2.5D IC assembly permits heterogeneous die assembly as well, as shown in this <a href="http://eda360insider.wordpress.com/2011/11/16/3d-thursday-how-xilinx-developed-a-2-5d-strategy-for-making-the-worlds-largest-fpga-and-what-the-company-might-do-next-with-the-technology/" target="_blank">image</a> from Xilinx:</p>
<p>&nbsp;</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/Xilinx-2000T-next-step.jpg"><img class="aligncenter size-full wp-image-736" title="Xilinx 2000T next step" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/12/Xilinx-2000T-next-step.jpg" alt="" width="560" height="171" /></a></p>
<p>&nbsp;</p>
<p>As you can see, 2.5D IC assembly allows designers the freedom to intermix die from radically different IC technologies such as logic, memory (DRAM, Flash, SRAM, etc.), analog, and RF. It’s a pc-board-like technology but on a much smaller scale. The resulting 2.5D device may well be better optimized and cost less than it might if the design team attempted to place everything on one monolithic die. That’s a topic I’ll take up in Part B of this blog entry.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/12/17/2011-a-great-year-for-low-power-design-wasn%e2%80%99t-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What if 2.5D got really cheap? How would that affect low-power design?</title>
		<link>http://low-powerdesign.com/sleibson/2011/11/17/what-if-2-5d-got-really-cheap-how-would-that-affect-low-power-design/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/11/17/what-if-2-5d-got-really-cheap-how-would-that-affect-low-power-design/#comments</comments>
		<pubDate>Thu, 17 Nov 2011 18:09:37 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[2.5D]]></category>
		<category><![CDATA[CMOS]]></category>
		<category><![CDATA[Design]]></category>
		<category><![CDATA[Flash]]></category>
		<category><![CDATA[Low-Power]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=727</guid>
		<description><![CDATA[Last week, silicon-interposer foundry Deca Technologies unstealthed. I found out from an article in the San Jose Mercury News and just published a blog about the announcement in my other blog, the EDA360 Insider. Deca is a subsidiary of Cypress &#8230; <a href="http://low-powerdesign.com/sleibson/2011/11/17/what-if-2-5d-got-really-cheap-how-would-that-affect-low-power-design/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Last week, silicon-interposer foundry Deca Technologies unstealthed. I found out from an <a href="http://www.mercurynews.com/business/ci_19297216" target="_blank">article</a> in the San Jose Mercury News and just published a <a href="http://eda360insider.wordpress.com/2011/11/17/is-cypress-subsidiary-deca-technologies-onto-2-5d-packaging-in-a-big-way/" target="_blank">post</a> about the announcement on my other blog, the <strong>EDA360 Insider</strong>. Deca is a subsidiary of Cypress Semiconductor and the outspoken President and CEO of Cypress, TJ Rodgers, was good for a quote, as always:</p>
<p>“We want to use the dense, reliable silicon interconnect inherent in Moore’s Law to integrate the dissimilar chips used in today’s systems, but we face an economic barrier because the interconnect on silicon chips is 1,000 times more expensive than the interconnect on PC boards.</p>
<p>“We could enable a new silicon-based interconnect paradigm if we could make silicon interconnect wafers for $10, just what silicon solar wafers cost today. The problem of mapping solar technology onto Moore’s Law is straightforward, but difficult, and we believe DecaTech has the answer.”</p>
<p>Now don’t take that $10 per interposer wafer to the bank. I get the impression that’s a long-term goal, not a short-term pricing roadmap. However, even a 10x drop in interposer costs will have a big influence on the future of 2.5D assembly technology.</p>
<p>And why should we as low-power systems designers care? Because interconnect is expensive and because interconnect now largely determines system performance. First, think about expense. Let your mind go back 40 years (if it can) to the birth of the microprocessor, which we celebrate this month. In the 1970s, microprocessor interconnect meant a bus. Not on a board but in a system. One of the most successful early microprocessor buses was the S-100 bus. It was named for the 100-pin edge connectors and the 100-conductor bus used to interconnect system boards in the original Altair 8800 microcomputer, introduced in 1975. The bus was subsequently adopted by several microcomputer vendors including Imsai, Vector Graphics, North Star Computers (formerly known as Kentucky Fried Computers), and Processor Technology and by board vendors including Godbout Electronics/Compupro and Morrow Micro-Stuff/ThinkerToys.</p>
<p>Back then, due to the nascent state of semiconductor integration, you needed a processor board, one or (more likely) several memory boards, a video board, and one or more I/O boards. A major system expense was just the half dozen or so 100-pin edge connectors and the simple but large circuit board that implemented the bus. The S-100 connectors were expensive, and you needed a lot of energy to drive the bus lines because they were physically large. As bus speeds increased, the lines also required resistive termination to prevent ringing, and you needed even more energy to drive the termination resistors.</p>
<p>By 1981 when the IBM PC appeared, things were getting somewhat better. We still had half a dozen edge connectors but we were down to 62 edge-connector pins (for 8-bit systems). Add another 36 pins when we jumped to 16 bits and we found ourselves right back at close to a 100-pin bus. So much for progress.</p>
<p>For board-to-board interconnect, things are going serial (think of the PCI evolution to PCIe) but chip-to-chip interconnect on a board is still largely parallel with lots of pins on a chip looking to connect to lots of pins on other chips. There is still impedance in those pcb traces and you still need relatively big drivers that consume significant amounts of energy to drive those traces. Hence a movement to go serial for chip-to-chip interconnect on a board—an extension of the migration of buses to serialized versions.</p>
<p>High-speed serial buses incur their own costs. There’s the energy cost of driving even a few wires at multi-GHz speeds and there’s the performance hit in the form of latency increase that you get when you serialize and then deserialize a data stream. It’s not all wine and roses.</p>
<p>So we try to put as much as we can on one IC, but that’s not an ideal solution either. Not all IC processing is the same and chips with different functions are really better off being fabricated with different IC fabrication processes. For example, NAND Flash and DRAM processes push as far down the Moore’s Law curve as they can get, as fast as they can to boost density and cut cost per bit. CMOS logic processes are right behind the memory processes but use more layers of on-chip metal interconnect because they require more random connectivity than memories. Analog ICs typically operate at higher supply voltages and they’re nowhere near the leading/bleeding edge of IC process technology. It’s not economical to put all of these different functions on one die, and so we see renewed interest in multichip modules, known by the 21<sup>st</sup>-century name: 2.5D IC assembly using silicon interposers.</p>
<p>2.5D assembly using bare semiconductor die attached to small interposers instead of big circuit boards significantly changes the parallel/serial I/O equation. Suddenly, you don’t need such big I/O drivers on the chip because there’s no wire bond, no IC package interconnect, and significantly shorter traces to drive. Suddenly, massively parallel I/O consumes only a fraction of the energy it previously needed, and the breakeven point between parallel I/O and serialized, high-speed I/O shifts. The balance now favors parallel I/O more and serial I/O less.</p>
<p>And when major changes like that happen to such equations, the way we design systems also changes.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/11/17/what-if-2-5d-got-really-cheap-how-would-that-affect-low-power-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>“Watt’s Next?” asks Chris Malachowsky, co-founder, NVIDIA Fellow, and Senior VP of Research</title>
		<link>http://low-powerdesign.com/sleibson/2011/11/10/%e2%80%9cwatt%e2%80%99s-next%e2%80%9d-asks-chris-malachowsky-co-founder-nvidia-fellow-and-senior-vp-or-research/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/11/10/%e2%80%9cwatt%e2%80%99s-next%e2%80%9d-asks-chris-malachowsky-co-founder-nvidia-fellow-and-senior-vp-or-research/#comments</comments>
		<pubDate>Thu, 10 Nov 2011 05:04:49 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[ARM]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[Multicore]]></category>
		<category><![CDATA[SOC]]></category>
		<category><![CDATA[Cortex-A9]]></category>
		<category><![CDATA[GeForce]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[Kal-El]]></category>
		<category><![CDATA[NVIDIA]]></category>
		<category><![CDATA[QUADRO]]></category>
		<category><![CDATA[Supercomputer]]></category>
		<category><![CDATA[Supercomputing]]></category>
		<category><![CDATA[Tegra]]></category>
		<category><![CDATA[Tesla]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=712</guid>
		<description><![CDATA[Everything—literally everything—we design today is defined by its power consumption said Chris Malachowsky, an NVIDIA co-founder, fellow, and senior VP of research. Malachowsky spoke yesterday at a luncheon during the ICCAD conference held this week in San Jose, California. At &#8230; <a href="http://low-powerdesign.com/sleibson/2011/11/10/%e2%80%9cwatt%e2%80%99s-next%e2%80%9d-asks-chris-malachowsky-co-founder-nvidia-fellow-and-senior-vp-or-research/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Everything, literally everything, we design today is defined by its power consumption, said Chris Malachowsky, an NVIDIA co-founder, fellow, and senior VP of research. Malachowsky spoke yesterday at a luncheon during the ICCAD conference held this week in San Jose, California. At the low end of the system spectrum, mobile devices are defined by how much you can do with a watt. At the high end, supercomputers and supercomputer performance are now defined by how much electricity you can afford. Malachowsky joked that supercomputers now use so much electricity that the local power company is giving them away for free when you sign up for a 2-year service contract, just like mobile phone handsets are subsidized by the carriers here in the US. That’s funny enough to hurt.</p>
<p>Coincidentally, NVIDIA makes chips that go into both wireless handsets at the low end (the company’s Tegra series of processors) and supercomputers at the high end (the company’s Tesla series of GPU—graphics processing unit—processing chips). In between are the original NVIDIA products, the GeForce and QUADRO series of graphics chips and boards. Here’s a graphic of the NVIDIA product line from Malachowsky’s talk:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/NVIDIA-Product-Line.jpg"><img class="aligncenter size-full wp-image-713" title="NVIDIA Product Line" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/NVIDIA-Product-Line.jpg" alt="" width="560" height="266" /></a></p>
<p>Tegra mobile application processors go into mobile handsets that cost roughly $100 but deliver about 2x the CPU performance and 4x the GPU performance of a PC that sold 10 years ago for about $3200. Here are the specs for comparison:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/PC-versus-handset.jpg"><img class="aligncenter size-full wp-image-714" title="PC versus handset" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/PC-versus-handset.jpg" alt="" width="560" height="334" /></a> </p>
<p>The quest for performance isn’t going to stop in the handset space so NVIDIA has a roadmap for future processors. The product currently on the market is the Tegra 2 and NVIDIA has already previewed the next step up, code-named Kal-El (Superman’s original name on Krypton). (Note: for more information on Kal-El, see my blog posts “<a href="http://eda360insider.wordpress.com/2011/09/20/processor-wars-nvidia-reveals-a-phantom-fifth-arm-cortex-a9-processor-core-in-kal-el-mobile-processor-ic-guess-why-it%E2%80%99s-there/" target="_blank">Processor Wars: NVIDIA reveals a phantom fifth ARM Cortex-A9 processor core in Kal-El mobile processor IC. Guess why it’s there?</a>” and “<a href="http://eda360insider.wordpress.com/2011/09/23/friday-video-why-do-you-need-four-arm-cortex-a9-processorcores-in-a-mobile-processor-soc/" target="_blank">Friday Video: Why do you need four ARM Cortex-A9 processor cores in a mobile processor SoC?</a>”)</p>
<p>Here’s an NVIDIA roadmap for Tegra processors from Malachowsky’s talk:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Tegra-Roadmap.jpg"><img class="aligncenter size-full wp-image-715" title="Tegra Roadmap" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Tegra-Roadmap.jpg" alt="" width="560" height="352" /></a></p>
<p>Note that NVIDIA likes to use superhero alter-ego names for future Tegra processors.</p>
<p>One thing that’s not progressing quickly on the mobile handset front is battery capacity. Batteries are just not getting better as fast as we’re adding transistors to silicon die thanks to Moore’s Law. As a result, the Tegra processors, like all mobile application processors, are constrained by the amount of power available in a handset.</p>
<p>On the supercomputer front, NVIDIA Tesla GPU chips already power three of the five fastest supercomputers in the world: the Tianhe-1A, the Titan (evolved from the x86 Jaguar), and the Nebulae. These supercomputers use a lot of processors, as you can see from this image:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Top-Five-supercomputers.jpg"><img class="aligncenter size-full wp-image-716" title="Top Five supercomputers" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Top-Five-supercomputers.jpg" alt="" width="560" height="323" /></a></p>
<p>The reason that NVIDIA chips are in supercomputers at all is that researchers and students recognized that NVIDIA’s evolving line of graphics chips contained a lot of parallel processing power and that, if certain tough math problems and algorithms could be re-expressed to look like problems of drawing and shading triangles, then GPUs could be pressed into service for these other sorts of problems. This conceptual leap resulted in the development of the NVIDIA Tesla line of supercomputing GPU chips.</p>
<p>However, supercomputers are also being constrained by power. Not by how much power is available (it takes megawatts to run a supercomputer) but by how much power is affordable. And don’t forget, for every megawatt needed to power the supercomputer, you need a comparable amount of power to cool the supercomputer.</p>
<p>Even the US Department of Energy (DOE) is concerned. It recently put out a Request for Information (RFI) to find out how we might build a 1-Exaflop (an Exaflop is a billion Gigaflops) supercomputer that “only” consumes 20MW (!!!). On the current commercial trajectory, with no extra DOE help, we will eventually be able to build an Exaflop supercomputer, but it will consume four or five times that much power, said Malachowsky.</p>
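<p>The arithmetic behind that DOE target is worth a quick sketch. The 1-Exaflop and 20MW figures are the RFI numbers quoted above, and the 4x and 5x multipliers are Malachowsky’s estimate for the unassisted trajectory. The variable names are my own.</p>

```python
EXAFLOP = 1e18            # floating-point operations per second
GIGAFLOP = 1e9
DOE_POWER_WATTS = 20e6    # the 20MW target

# An Exaflop expressed in Gigaflops
gigaflops_per_exaflop = EXAFLOP / GIGAFLOP

# Efficiency needed to hit the DOE target, in Gigaflops per watt
target_gflops_per_watt = EXAFLOP / DOE_POWER_WATTS / GIGAFLOP

# Power on the current trajectory, per Malachowsky: 4x to 5x the target
trajectory_mw = [multiplier * DOE_POWER_WATTS / 1e6 for multiplier in (4, 5)]

print(f"{gigaflops_per_exaflop:.0e} Gigaflops per Exaflop")
print(f"{target_gflops_per_watt:.0f} Gigaflops per watt needed")
print(f"{trajectory_mw[0]:.0f} to {trajectory_mw[1]:.0f} MW without new approaches")
```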
<p>Why do we need an Exaflop supercomputer? Because simulation has replaced the wet lab, said Malachowsky. Science, all science, needs simulation and the more the better. The more the faster. As Malachowsky said in his talk, science needs 1000x more computing (but without 1000x the power consumption) because simulation or “computational science” has become the third pillar of science.</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Third-Pillar-of-Science.jpg"><img class="aligncenter size-full wp-image-717" title="Third Pillar of Science" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Third-Pillar-of-Science.jpg" alt="" width="560" height="353" /></a></p>
<p>(Note: Theory and Experimentation are the first two pillars. Yeah, I didn’t know that either, but I have it on the authority of the <a href="http://www.nitrd.gov/pitac/reports/20050609_computational/computational.pdf" target="_blank">President’s Information Technology Advisory Committee</a>.)</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Process-Fairy.jpg"><img class="alignright size-full wp-image-718" title="Process Fairy" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Process-Fairy.jpg" alt="" width="120" height="161" /></a>And get this: no more lazy, lazy processor or system architects. The “process fairies” aren’t working as hard as they used to, said Malachowsky. Oh sure, they’re still bringing us 2x the transistor count with each new IC process step just like Gordon Moore promised way back in 1965. Sure, the process fairies are keeping that promise. But poor Dennard. His observation about power and speed scaling with lithographic geometry—that’s dead. It died at 90nm. Party’s over.</p>
<p>So what? Here’s what. We’re going to have to rethink our approaches to getting more processing performance using less power. Scaling is out and here’s the graphic proof from Malachowsky’s talk:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/A-new-approach-is-needed.jpg"><img class="aligncenter size-full wp-image-719" title="A new approach is needed" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/A-new-approach-is-needed.jpg" alt="" width="560" height="377" /></a></p>
<p>Without architectural innovation, the average annual rate of processor performance improvement appears to be dropping from 52% to 20%. Architecture is in and we’ve got to get smarter because the process fairies aren’t working as hard as they used to.</p>
<p>What can we do? Well, one approach is already evident in the design of the multicore NVIDIA Kal-El mobile application processor. The Kal-El chip contains five ARM Cortex-A9 processor cores. All five cores are architecturally similar, but one of them is synthesized for low-power operation. The other four identical cores are synthesized for maximum performance and consequently draw more power. When the Kal-El chip has a lot of work to do, one or more of the high-performance cores are operating. When there’s just a little work to do, the operating system transfers the workload to the low-power core and shuts down all four of the high-performance cores. The Android OS already knows how to do this.</p>
<p>Kal-El’s low-power “companion” ARM Cortex-A9 core is an example of an emerging SoC design style called “dark silicon.” Fortunately, dark silicon is much easier to understand than dark matter or dark energy. Dark silicon simply describes sections of an SoC that are shut down and powered off. In earlier days when there weren’t enough transistors to go around, letting a piece of silicon go dark was unthinkable. In fact, we loaded up a processor with as much work as it could do and perhaps even a little more if we needed to push things. Dark silicon? Fugetaboutit. But now in the multicore era, we’re getting quite used to the idea.</p>
<p>However, dark silicon isn’t going to save us by itself. We need to get smarter about what we do inside of a single core as well, said Malachowsky. We’re going to have to get smart about the energy cost of everything we do inside of a processor core. Malachowsky didn’t directly explain what this means but he did provide a clue.</p>
<p>Here’s a table from Malachowsky’s presentation that shows the energy cost of typical processor transactions:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Energy-cost-of-actions.jpg"><img class="aligncenter size-full wp-image-720" title="Energy cost of actions" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/11/Energy-cost-of-actions.jpg" alt="" width="560" height="366" /></a></p>
<p>Note the pattern in this table. The energy costs for moving operands around on chip are comparable to those for performing a computation. This ratio actually gets worse for data movement as lithographic scaling progresses because gates get smaller but the average wire length and cross-sectional resistance get larger. The energy cost for moving an operand on or off chip is higher still. It takes power to wiggle those printed-circuit board traces.</p>
<p>As I said, it was just a clue.</p>
<p>One of the most interesting parts of Malachowsky’s talk for me was where the funding for this architectural research will come from. I would never have guessed.</p>
<p>Video games.</p>
<p>That would be the two middle NVIDIA product lines shown in the first image in this blog post: GeForce and QUADRO. It seems that the video gaming market is pretty big, about $35 billion per year. That’s bigger than the movie market (and way bigger than EDA). Hard-core gamers will apparently pay handsomely for architectural advances as long as those advances let them shoot faster.</p>
<p>So when we cure cancer, you can thank a gamer. Meanwhile, give some thought to Malachowsky’s words. There are a lot of really sharp ideas for designers of low-power systems in this presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/11/10/%e2%80%9cwatt%e2%80%99s-next%e2%80%9d-asks-chris-malachowsky-co-founder-nvidia-fellow-and-senior-vp-or-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generation-jumping 2.5D Xilinx Virtex-7 2000T FPGA delivers 1,954,560 logic cells, consumes only 20W</title>
		<link>http://low-powerdesign.com/sleibson/2011/10/25/generation-jumping-2-5d-xilinx-virtex-7-2000t-fpga-delivers-1954560-logic-cells-consumes-only-20w/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/10/25/generation-jumping-2-5d-xilinx-virtex-7-2000t-fpga-delivers-1954560-logic-cells-consumes-only-20w/#comments</comments>
		<pubDate>Tue, 25 Oct 2011 14:00:10 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[FPGA]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[2.5D]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[Virtex]]></category>
		<category><![CDATA[Virtex-7]]></category>
		<category><![CDATA[Xilinx]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=690</guid>
		<description><![CDATA[Xilinx announced today that it is shipping Virtex-7 2000T FPGAs to customers. This is one monster FPGA. Its 6.8 billion transistors deliver 1,954,560 logic cells, 21.55 Mbits of distributed SRAM, 2160 DSP slices, 46,512Kbits of block RAM, four PCIe ports, 36 &#8230; <a href="http://low-powerdesign.com/sleibson/2011/10/25/generation-jumping-2-5d-xilinx-virtex-7-2000t-fpga-delivers-1954560-logic-cells-consumes-only-20w/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><strong>Xilinx</strong> announced today that it is shipping <strong>Virtex-7 2000T FPGAs</strong> to customers. This is one monster FPGA. Its 6.8 billion transistors deliver 1,954,560 logic cells, 21.55 Mbits of distributed SRAM, 2160 DSP slices, 46,512Kbits of block RAM, four PCIe ports, 36 12.5Gbps GTX serial transceivers, and 1200 user I/O pins. All in about 20W (!!!). The only fly in the ointment, if you want to call it that, is that no one on this planet can make this FPGA as a monolithic device. The Virtex-7 2000T FPGA is a 2.5D assembly that combines four FPGA tiles on a silicon interposer. The interposer provides 10,000 connections between each of the four FPGA tiles.</p>
<p>Here’s an exploded diagram of the FPGA assembly:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Xilinx-Virtex-7-2000T-Exploded-Diagram.jpg"><img class="aligncenter size-full wp-image-691" title="Xilinx Virtex-7 2000T Exploded Diagram" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Xilinx-Virtex-7-2000T-Exploded-Diagram.jpg" alt="" width="540" height="350" /></a></p>
<p>I’ll be covering the 2.5D/3D assembly aspects of this new FPGA in more detail on my <strong>EDA360 Insider</strong> blog this coming Thursday (for 3D Thursday), but I want to discuss the low-power aspects of this device here, now.</p>
<p>The obvious contributor to the Virtex-7 2000T FPGA’s power profile is the use of the 28nm TSMC HPL (high-performance, low-power) high-K, metal-gate (HKMG) process technology. Xilinx chose TSMC’s 28 HPL process to make all of its Series 7 FPGA family members (Virtex-7, Kintex-7, and Artix-7) over TSMC’s 28 HP and 28 LP process technologies. Instead of letting us know this fact and leaving it there, Xilinx published a White Paper that goes into great detail about the decision (“<a href="http://www.xilinx.com/support/documentation/white_papers/wp389_Lowering_Power_at_28nm.pdf" target="_blank">Lowering Power at 28nm with Xilinx 7 Series FPGAs</a>”).</p>
<p>Xilinx considered all three of the TSMC 28nm process technologies for the 7-series FPGA families but the company quickly locked onto the two HKMG processes (HP and HPL) as being the “best” for FPGA design. Because Xilinx wanted to use just one process technology to cover all of the planned Series-7 FPGA families from high-performance to low-power, HKMG promised the best mix of performance and leakage for the company’s unified approach to designing all of the Series-7 FPGA families. TSMC’s 28 LP process uses PolySiON (polysilicon/silicon oxy-nitride) gate insulation and is best suited for designs that require less performance than FPGAs. The PolySiON 28 LP process produces transistors that are about 13% slower than those produced in the 28 HPL and 28 HP processes (for the types of transistors Xilinx would be using to build its Series-7 FPGAs) while exhibiting more than twice the leakage. The advantage of the 28 LP process is that it’s less expensive.</p>
<p>Eliminating the 28 LP process as a possibility left the choice between TSMC’s 28 HPL and 28 HP processes. Both processes can produce equally fast transistors but the 28 HP process produces transistors with about twice the leakage of the 28 HPL process for the types of transistors Xilinx would be using to build its Series-7 FPGAs. According to the White Paper, TSMC’s 28 HP process is better suited to GPU and CPU designs that require the ultimate performance and that have the power budget (~100W) to achieve that performance. The maximum Xilinx Series-7 FPGA power budget is 40W, so the company selected TSMC’s 28 HPL process technology. However, in a demo last week, Xilinx showed the Virtex-7 2000T FPGA simultaneously running 3600 copies of its 8-bit PicoBlaze processor core, delivering 180,000 MIPS while consuming 20W. That’s a really impressive amount of computation for the power.</p>
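<p>Those demo figures reduce to a few handy ratios. This is only back-of-the-envelope arithmetic on the numbers quoted above, and the variable names are mine. The per-core power is an upper bound because the 20W covers the entire device, not just the processor cores.</p>

```python
CORES = 3600          # processor cores running in the demo
TOTAL_MIPS = 180_000  # aggregate throughput quoted for the demo
POWER_WATTS = 20      # power consumed by the whole device during the demo

mips_per_core = TOTAL_MIPS / CORES
mips_per_watt = TOTAL_MIPS / POWER_WATTS
milliwatts_per_core = POWER_WATTS / CORES * 1000  # upper bound per core

print(f"{mips_per_core:.0f} MIPS per core")
print(f"{mips_per_watt:.0f} MIPS per watt")
print(f"{milliwatts_per_core:.2f} mW per core at most")
```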
<p>However, it’s the system implications of the Virtex-7 2000T FPGA’s huge capacity that really have an impact on system power consumption. Here is an example diagram that Xilinx used to showcase the power-consumption advantage of the Virtex-7 2000T FPGA:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Xilinx-Virtex-7-2000T-Power-Advantage.jpg"><img class="aligncenter size-full wp-image-692" title="Xilinx Virtex-7 2000T Power Advantage" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Xilinx-Virtex-7-2000T-Power-Advantage.jpg" alt="" width="560" height="319" /></a></p>
<p>In this image, Xilinx claims that one Virtex-7 2000T FPGA delivers the equivalent capability of four “competitive” FPGAs, each with nearly 1M logic cells. Xilinx came to this conclusion based on characteristics other than logic-cell count and I’m not going to tell you that this diagram shows an apples-to-apples comparison. That’s not my point in using this diagram.</p>
<p>For my purposes, the diagram shows four smaller FPGAs tied together with many high-speed differential pairs. Each composite FPGA-to-FPGA link burns 8W. That’s 32W total used just for chip-to-chip communications. To me, that’s the secret low-power sauce that the 2.5D assembly approach of the Xilinx Virtex-7 2000T has. The extremely wide I/O provided by the silicon interposer drastically reduces the power consumption of tile-to-tile communications and this power consumption is included in the device’s overall power consumption number: 20W. That means the FPGA power consumption is essentially “free” if you look through the wrong end of the telescope.</p>
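<p>Here is the comparison as a small sketch. The 8W per link and the 20W device figure come from the Xilinx material described above. The link count of four is my inference from the 32W total, and the variable names are mine.</p>

```python
LINKS = 4                 # composite FPGA-to-FPGA links (inferred from 32W / 8W)
WATTS_PER_LINK = 8        # power burned by each link, per the Xilinx diagram
VIRTEX7_2000T_WATTS = 20  # whole-device figure, tile-to-tile I/O included

interconnect_watts = LINKS * WATTS_PER_LINK
ratio = interconnect_watts / VIRTEX7_2000T_WATTS

print(f"chip-to-chip links alone: {interconnect_watts}W")
print(f"that is {ratio:.1f}x the power of the entire Virtex-7 2000T")
```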
<p>Where might this reduction in power consumption come in handy? At the announcement, Xilinx VP of FPGA Development and Silicon Technology Liam Madden discussed two example cases relevant to this discussion.</p>
<p>The first example involved a customer looking at developing a large ASIC for a communications application. The defining performance characteristic was the need to handle a terabit/sec aggregate data rate using an estimated 20M gates for the logic. The power budget was about 30W. The customer knew that it wanted some amount of programmability and was therefore considering a 3-chip solution with one ASIC and two FPGAs. The estimated time to develop this design was three years and the estimated power consumption was 70W. Way out of the ballpark. One Xilinx Virtex-7 2000T filled the bill and beat the power budget by 30%.</p>
<p>The second example involved the replacement of six chips with one device. One Xilinx Virtex-7 FPGA swallowed all six devices and delivered 5x the performance of the existing multi-chip system while consuming 1/7 of the power.</p>
<p>Here’s the “before” picture:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/6-chip-solution.jpg"><img class="aligncenter size-full wp-image-693" title="6-chip solution" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/6-chip-solution.jpg" alt="" width="424" height="258" /></a></p>
<p>And here’s the “after” picture showing the function of all six chips compiled into one Xilinx Virtex-7 2000T FPGA:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/6-chips-in-a-Virtex-7-2000T.jpg"><img class="aligncenter size-full wp-image-694" title="6 chips in a Virtex-7 2000T" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/6-chips-in-a-Virtex-7-2000T.jpg" alt="" width="326" height="401" /></a></p>
<p>Clearly, 2.5D and 3D assembly are going to have a major influence on the way we design low-power systems in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/10/25/generation-jumping-2-5d-xilinx-virtex-7-2000t-fpga-delivers-1954560-logic-cells-consumes-only-20w/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Tesla? Hah! How about a brand new, 2012 all-electric DeLorean? $90K</title>
		<link>http://low-powerdesign.com/sleibson/2011/10/23/tesla-hah-how-about-a-brand-new-2012-all-electric-delorean-90k/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/10/23/tesla-hah-how-about-a-brand-new-2012-all-electric-delorean-90k/#comments</comments>
		<pubDate>Sun, 23 Oct 2011 23:44:59 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[DeLorean]]></category>
		<category><![CDATA[electric vehicle]]></category>
		<category><![CDATA[Mr Fusion]]></category>
		<category><![CDATA[Tesla]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=701</guid>
		<description><![CDATA[McFly!!! This was just too good to pass up. DeLorean Motor Company (Humble, TX), the establishment that bought the remains of the original DeLorean car company including tons of finished stainless-steel body parts, plans to bring out an electric version &#8230; <a href="http://low-powerdesign.com/sleibson/2011/10/23/tesla-hah-how-about-a-brand-new-2012-all-electric-delorean-90k/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Electric-DeLorean.jpg"><img class="alignleft size-full wp-image-702" title="Electric DeLorean" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Electric-DeLorean.jpg" alt="" width="270" height="204" /></a>McFly!!! This was just too good to pass up. DeLorean Motor Company (Humble, TX), the establishment that bought the remains of the original DeLorean car company including tons of finished stainless-steel body parts, plans to bring out an electric version next year. A brand new electric version, re-engineered with a 260 hp electric motor that supposedly drives the car from 0 to 60 mph in 4.9 seconds. Wowzers!</p>
<p>The electric DeLorean debuted at the DMC Texas Open House Event on Oct.14th, 2011. Current range is said to be 70 miles without the Mr. Fusion option but there’s no word on when that will become available. Probably sometime in the future.</p>
<p>Read about the Jalopnik test drive of the electric DeLorean <a href="http://jalopnik.com/5850448/electric-delorean-first-drive" target="_blank">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/10/23/tesla-hah-how-about-a-brand-new-2012-all-electric-delorean-90k/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Altera introduces SoC FPGA melding ARM Cortex-A9 dual-core processor complex with a 28nm FPGA fabric</title>
		<link>http://low-powerdesign.com/sleibson/2011/10/11/altera-introduces-soc-fpga-melding-arm-cortex-a9-dual-core-processor-complex-with-a-28nm-fpga-fabric/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/10/11/altera-introduces-soc-fpga-melding-arm-cortex-a9-dual-core-processor-complex-with-a-28nm-fpga-fabric/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 12:00:33 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[ARM]]></category>
		<category><![CDATA[FPGA]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[SDRAM]]></category>
		<category><![CDATA[SOC]]></category>
		<category><![CDATA[Altera]]></category>
		<category><![CDATA[SoC FPGA]]></category>
		<category><![CDATA[Xilinx]]></category>
		<category><![CDATA[Zynq]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=668</guid>
		<description><![CDATA[Xilinx first started to talk publicly about the fusion of processors and FPGAs—a product now known as Zynq—in 2010 and has announced plans to roll out parts by the end of this year. It was inevitable that Altera would eventually &#8230; <a href="http://low-powerdesign.com/sleibson/2011/10/11/altera-introduces-soc-fpga-melding-arm-cortex-a9-dual-core-processor-complex-with-a-28nm-fpga-fabric/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Xilinx first started to talk publicly about the fusion of processors and FPGAs—a product now known as Zynq—in 2010 and has announced plans to roll out parts by the end of this year. It was inevitable that Altera would eventually counter with a competing product line. Today the company revealed plans for a line of chips called SoC FPGAs and comparisons between the Altera and Xilinx offerings are inevitable, but let’s look at the details for the Altera offerings.</p>
<p>The SoC FPGA line will include at least 18 different chips with various configurations for the “Hard Processor System” (HPS) and various sizes for the FPGA fabrics connected to the HPS block. In addition, the SoC FPGA product line will be based on two of the Altera 28nm FPGA fabrics—Cyclone V and Arria V—for two different speed grades within the product line. Here’s a generalized block diagram of a device in the product line:</p>
<p><a href="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Altera-SoC-FPGA-Block-Diagram.jpg"><img class="aligncenter size-full wp-image-669" title="Altera SoC FPGA Block Diagram" src="http://low-powerdesign.com/sleibson/wp-content/uploads/2011/10/Altera-SoC-FPGA-Block-Diagram.jpg" alt="" width="513" height="757" /></a></p>
<p>The SoC FPGAs’ HPS is based on two 800MHz ARM Cortex-A9 processor cores with ARM Neon and single/double-precision FPU extensions. Each ARM Cortex-A9 processor has its own L1 caches—separate 32Kbyte L1 caches for instructions and data. The two processor cores share a unified 512Kbyte L2 cache. Each processor also has private interval and watchdog timers. To keep the two processor cores fed with instructions and data, there’s a hard-core, multiport DDR SDRAM controller in the HPS that supports the DDR2 and DDR3 and the LPDDR1 and LPDDR2 SDRAM interface protocols. There’s also a Flash memory controller with a built-in DMA engine. The NAND Flash controller supports NOR and NAND Flash memories including ONFi 1.0 devices and SD, SDIO, and MMC memory cards. In addition, there’s ECC support for the SDRAM and the NAND Flash interfaces.</p>
<p>Next up are the hard-core peripherals within the HPS. There are a lot of them:</p>
<ul>
<li>Two 10/100/1000 Ethernet MACs with DMA</li>
<li>Two USB 2.0 On-the-Go (OTG) controllers with DMA</li>
<li>Four I2C controllers</li>
<li>Two CAN (Controller Area Network) controllers</li>
<li>SPI Master and SPI Slave ports</li>
<li>Two UARTs</li>
<li>General-purpose ports</li>
</ul>
<p>On-chip memory includes 64Kbytes of RAM and a boot ROM.</p>
<p>That’s already quite a lot, but then there’s the FPGA section of the SoC FPGA to consider. On-chip FPGA capacity varies depending on whether the particular SoC FPGA device is based on the Cyclone V or the Arria V FPGA fabric. Devices based on the Cyclone V FPGA fabric will be offered with 25K, 40K, 85K, and 110K logic elements. Devices based on the Arria V FPGA fabric will be offered with 350K and 460K logic elements.</p>
<p>The HPS in the Altera SoC FPGA connects to the on-chip FPGA fabric through two 128-bit AXI buses, one for reads and one for writes. As you can see from the block diagram above, the hard-core peripherals not included in the HPS block separately connect to the FPGA fabric. What’s not apparent from the diagram is that the two ARM Cortex-A9 processors share a Snoop Control Unit (SCU) and there&#8217;s an ACP (accelerator coherency port) linking the HPS to the FPGA fabric, so it’s possible to engineer accelerators that maintain coherency with the ARM Cortex-A9 processor cores&#8217; caches and implement them using the on-chip FPGA fabric.</p>
<p>In addition to the six FPGA array sizes (four for the devices based on the Cyclone V FPGA fabric and two for devices based on the Arria V FPGA fabric), Altera plans to offer parts with three HPS subsystem configurations: base, mid, and high. Combined with the six FPGA fabric sizes, that means there are at least 18 Altera SoC FPGA parts planned for the initial product lineup. Altera says that there will also be 1-processor variants in the SoC FPGA lineup. Just in case you suspect that’s perhaps a bit underpowered, keep in mind that essentially 100% of all system designs based on microcontrollers use a far less capable processor core than one 800MHz ARM Cortex-A9 core. You might want to check to make sure you’re not becoming overly acclimatized to multicore designs. On the other hand, if you’re running Android then two capable processor cores will come in handy.</p>
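<p>The "at least 18 parts" figure is simply six fabric sizes times three HPS tiers. Here is a minimal Python sketch of that count. It assumes every fabric size is offered with every HPS tier, which the announcement implies but does not spell out. The function and constant names are made up for illustration and are not Altera part numbers.</p>

```python
from itertools import product

# Fabric sizes in thousands of logic elements, as listed in the article.
CYCLONE_V_SIZES_K = [25, 40, 85, 110]
ARRIA_V_SIZES_K = [350, 460]

# HPS configuration tiers named in the article.
HPS_TIERS = ["base", "mid", "high"]


def enumerate_parts():
    """Return every (fabric family, size in K LEs, HPS tier) combination.

    Assumption: each fabric size pairs with each HPS tier.
    """
    fabrics = [("Cyclone V", k) for k in CYCLONE_V_SIZES_K]
    fabrics += [("Arria V", k) for k in ARRIA_V_SIZES_K]
    return [(family, size_k, tier)
            for (family, size_k), tier in product(fabrics, HPS_TIERS)]


parts = enumerate_parts()
print(len(parts))  # 18
```

<p>The 1-processor variants Altera mentions would add to this count, which is why the lineup is described as "at least" 18 parts.</p>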
<p>As the block diagram above shows, there are additional hard-core peripherals connected to the SoC FPGA chip’s FPGA array: as many as three more multiport SDRAM controllers, a Gen2 x4 PCIe port (supplemented with the possibility of implementing a soft Gen2 x8 PCIe port in the FPGA fabric), as many as six 10Gbps high-speed, differential serial transceivers, and as many as thirty 6Gbps high-speed, differential serial transceivers. These additional peripheral ports have separate access paths into the FPGA fabric of the SoC FPGA devices.</p>
<p>Perhaps the most interesting news is that low-end members of the Altera SoC FPGA family will sell for $15 in “high volumes.” That’s a lot of capability for a relatively low price. In fact, that’s a very low price in the FPGA world. The bad news is that Altera doesn’t plan to ship devices until the second half of 2012.</p>
<p>So that’s the Altera SoC FPGA. Now for the inevitable comparison based on my previous write-ups of the Xilinx Zynq. (See “<a href="http://low-powerdesign.com/sleibson/2011/03/01/xilinx-zynq-epps-create-a-new-category-that-fits-in-among-socs-fpgas-and-microcontrollers/" target="_blank">Xilinx Zynq EPPs create a new category that fits in among SoCs, FPGAs, and microcontrollers</a>”.) First, there’s the processor complex—what Altera calls the HPS. The two products are remarkably similar here: two 800MHz ARM Cortex-A9 processor cores with Neon DSP and FPU extensions, 512Kbytes of unified L2 cache, Flash controller, one SDRAM controller, Snoop Control Unit, timers and watchdog, DMA, etc. Both processor complexes support an ACP (Accelerator Coherency Port) interface into the FPGA fabrics.</p>
<p>There’s some difference in the processor complex-to-FPGA connection scheme: Altera offers one 128-bit read and one 128-bit write AXI bus and a 32-bit APB (Advanced Peripheral Bus) port plus additional ports that go directly from the FPGA fabric to the multiport SDRAM controller in the HPS. Xilinx offers four 32-bit and four 64-bit AXI ports plus direct access from the FPGA fabric to the SDRAM controller. So the Xilinx parts theoretically provide more raw interconnect bandwidth between the processor complex and the FPGA fabric than do the Altera parts. It remains to be seen if that raw capability can deliver more bandwidth in practice, but the potential is clearly there.</p>
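<p>To put rough numbers on that comparison, here is a small Python sketch that sums the AXI port widths quoted above. It is a back-of-the-envelope estimate, not vendor data. It counts only the AXI ports (the APB port and the direct SDRAM-controller paths are left out), it ignores port direction and protocol overhead, and the 100MHz fabric clock is a purely hypothetical value chosen for illustration.</p>

```python
# AXI port widths in bits, taken from the paragraph above.
ALTERA_AXI_PORTS = [128, 128]            # one read bus, one write bus
XILINX_AXI_PORTS = [32] * 4 + [64] * 4   # four 32-bit and four 64-bit ports


def aggregate_width_bits(ports):
    """Total bits that could move per clock if every port were busy."""
    return sum(ports)


def raw_bandwidth_gbytes_per_s(ports, clock_mhz):
    """Peak bytes per second at the given clock, in units of 10^9 bytes.

    clock_mhz is an assumed interconnect clock, not a published figure.
    """
    bits_per_second = aggregate_width_bits(ports) * clock_mhz * 1e6
    return bits_per_second / 8 / 1e9


print(aggregate_width_bits(ALTERA_AXI_PORTS))  # 256
print(aggregate_width_bits(XILINX_AXI_PORTS))  # 384

# Hypothetical 100MHz clock applied to both devices for a like-for-like ratio.
ASSUMED_CLOCK_MHZ = 100
print(raw_bandwidth_gbytes_per_s(ALTERA_AXI_PORTS, ASSUMED_CLOCK_MHZ))
print(raw_bandwidth_gbytes_per_s(XILINX_AXI_PORTS, ASSUMED_CLOCK_MHZ))
```

<p>Under these assumptions the Xilinx ports add up to 1.5 times the aggregate width of the Altera ports at any shared clock, which is the "theoretical" edge mentioned above. Real throughput depends on actual clock rates and traffic patterns.</p>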
<p>But wait! Hold on there! The Altera SoC FPGA parts offer as many as three more SDRAM controllers outside of the hard-core HPS processor complex and those SDRAM controllers can be connected either to devices implemented as soft cores in the on-chip FPGA fabric or through the FPGA fabric to the HPS. That added SDRAM control capability could really be an advantage in systems with extremely high SDRAM bandwidth requirements.</p>
<p>Then there’s the PCIe controller. On the Altera SoC FPGAs, there’s one hard-core Gen2 x4 PCIe port and the possibility of implementing a second, soft-core Gen2 x8 PCIe port in the FPGA fabric. The Xilinx Zynq parts will provide a hard-core Gen2 x4 or x8 PCIe port, depending on the family member. There are additional 10.3Gbps serial channels available on the Xilinx Zynq components, so a soft-core PCIe controller is a possibility, as it is for the Altera SoC FPGAs.</p>
<p>Since I’ve brought up the topic of the FPGA fabric, let’s compare those as well. The various Altera SoC FPGA family members offer six FPGA fabric sizes: 25K, 40K, 85K, 110K, 350K, and 460K logic elements. The announced Xilinx Zynq family offers four fabric sizes: 30K, 85K, 125K, and 235K logic elements. So if you need really big FPGA fabrics to complement the capabilities provided by the processor complexes, then the Altera SoC FPGA family seems to offer more capacity for now. However, should a battlefield form at the high end, you can bet that Xilinx will be filling out the product line at the high end, where there’s more margin to be made.</p>
<p>Finally, there’s pricing and availability. Both companies have announced high-volume unit pricing “below $15,” but the Xilinx parts are supposed to be available this year and the Altera parts are scheduled to appear in the latter part of next year.</p>
<p>Together, today&#8217;s Altera SoC FPGA announcement and the previous Xilinx Zynq announcements create a truly exciting new product category—one that fuses FPGAs with high-performance microprocessors in a way guaranteed to dramatically extend the reach of FPGAs. The resulting mixture of capability, performance, power consumption, and cost simply cannot be replicated with a 2-chip design.</p>
<p>I predict that many system designers will be unable to resist this combination. Naysayers will point to previous failed attempts at merging FPGAs and hard microprocessor cores and some will predict a similar fate for this new generation of parts. Much has changed. First, the embedded industry has adopted the ARM architecture and there is a large body of programming talent available for this architecture. Second, these new parts are not FPGAs with processor cores tacked on. They are very capable and complete processor complexes, application processors in their own right, augmented with FPGA fabrics. From my perspective, the Altera SoC FPGAs and Xilinx Zynq parts stand a very good chance of defining a new and vibrant component category.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/10/11/altera-introduces-soc-fpga-melding-arm-cortex-a9-dual-core-processor-complex-with-a-28nm-fpga-fabric/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>1972: When scientific calculators truly went low power</title>
		<link>http://low-powerdesign.com/sleibson/2011/09/19/1972-when-scientific-calculators-truly-went-low-power/</link>
		<comments>http://low-powerdesign.com/sleibson/2011/09/19/1972-when-scientific-calculators-truly-went-low-power/#comments</comments>
		<pubDate>Mon, 19 Sep 2011 02:04:51 +0000</pubDate>
		<dc:creator>sleibson321</dc:creator>
				<category><![CDATA[Design]]></category>
		<category><![CDATA[Low-Power]]></category>
		<category><![CDATA[calculator]]></category>
		<category><![CDATA[HP]]></category>

		<guid isPermaLink="false">http://low-powerdesign.com/sleibson/?p=660</guid>
		<description><![CDATA[Dave Cochran recently wrote about his long engineering career at Hewlett-Packard on the www.hpmemory.org Web site. Who? What Web site? Well, the Web site is an amazing living museum that&#8217;s a tribute to Bill and Dave&#8217;s HP. And Dave Cochran is &#8230; <a href="http://low-powerdesign.com/sleibson/2011/09/19/1972-when-scientific-calculators-truly-went-low-power/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Dave Cochran recently <a href="http://www.hpmemory.org/timeline/dave_cochran/a_quarter_century_at_hp_00.htm" target="_blank">wrote about</a> his long engineering career at Hewlett-Packard on the <a href="http://www.hpmemory.org/">www.hpmemory.org</a> Web site. Who? What Web site? Well, the Web site is an amazing living museum that&#8217;s a tribute to Bill and Dave&#8217;s HP. And Dave Cochran is likely one of the most important people you’ve never heard about in the annals of low-power design. He spent 25 years at HP, starting as a part-time test technician in 1956 and departing as a celebrated HP engineer in 1981.</p>
<p>Between those two years, Cochran worked on a huge number of projects including the HP 204B audio oscillator, where he used transistors and a hugely ingenious double-spiral-cam potentiometer actuator to design the famous tungsten light bulb out of Bill Hewlett’s original audio oscillator design. (That double-cam actuator makes a linear pot do unnatural things and you need to see the short, 3-second video to believe it!) But it’s what he did in the 1960s and 1970s that makes him the topic of this particular blog post.</p>
<p>Cochran was working at HP Labs when Malcolm McMillan and Tom Osborne dropped by with two very different calculator prototypes. It was the legendary Barney Oliver, Grand Wizard of HP Labs who conjured the idea of merging these two very different machines into one massively powerful scientific calculator. McMillan’s design—called Athena—could perform transcendentals using Jack Volder’s CORDIC algorithms but it had a fixed-point architecture so it was not deemed accurate enough for repeated engineering calculations. Osborne’s design was barely more than a simple 4-banger calculator but it had a really elegant, floating-point hardware design based on the 1960s version of a VLIW processor.</p>
<p>Cochran got the job of trying to come up with a way to unify the two architectures. He writes “I was looking at Osborne&#8217;s architecture and trying to figure out what an algorithm was. I even flew down to Southern California to talk with Jack Volder who had developed the <a href="http://en.wikipedia.org/wiki/CORDIC" target="_blank">CORDIC</a>  transcendental functions used in the Athena machine and talked to him for about an hour. He referred me to the original papers by Meggitt where he&#8217;d gotten the pseudo division, pseudo multiplication generalized functions.</p>
<p>My job was to determine how many digits and what the operation time was required; what the architecture had to be; how many registers did we need, clock speed, etc? Other people were coming back with their inputs on cathode ray tube display, keyboards [and so on]. Should we use transistors or small-scale integration? There was no large-scale integration, but there was medium-scale integration, MSI which meant maybe 10 transistors in a chip.”</p>
<p>From Cochran’s architectural contributions, plus substantial work from other engineers in HP Labs and Tom Osborne (who remained an HP consultant for many more years), HP introduced the HP 9100 Scientific Calculator in late 1968. It’s a marvelous machine and it was a real design breakthrough for its day. It also weighed 40 pounds and drew 70 Watts out of a wall socket.</p>
<p>Now 70W isn’t all that much compared to the amount of electrical power you’d need to replicate the HP 9100’s computational abilities with a minicomputer or a mainframe, so you could consider this a lower-power design for its day. But it’s what happened next that really takes us to the domain of miraculously low power.</p>
<p>Cochran writes: “As soon as the 9100 started showing success in the market place Hewlett started to bug me personally. I know he also talked to Tom Osborne about it, what do you think, and so on. But he would come into the lab and he&#8217;d look for me and he&#8217;d say, ‘Hey, how are you coming with putting the 9100 in my shirt pocket?’ He said, ‘I want all that computational power in my shirt pocket.’”</p>
<p>OK. From 40 pounds and 70W to something that runs off batteries and fits in your shirt pocket. Now that’s a stretch. Oh, and one thing I forgot to mention. The HP 9100 calculator design had exactly one integrated circuit in it. That IC was used in the magnetic card reader. Osborne didn’t like the primitive ICs offered at the time. He didn’t design the card reader but the rest of the machine was implemented with discrete transistors, magnetic-core RAM, rope memory (go look that one up), and a large capacitive ROM fabricated from a 16-layer printed-circuit board. None of that technology was headed for a shirt pocket. Not in this universe.</p>
<p>Cochran then writes about his shirt-pocket epiphany: “Tom Whitney and I went down to Fairchild Semiconductor on Ellis Street, Mountain View, and they wanted to show us a calculator architecture that they were planning to provide to various companies that wanted to build calculators and semiconductors. So we went down to look at it. And I looked at it, and oh, this could do the algorithms. See, I knew. By this time, I had already fit the algorithms into a small-scale integrated machine, the 9100. So I knew exactly what architecture I needed, the capabilities of the architecture. I didn&#8217;t know what it was going to look like, but I knew what its capabilities had to be.</p>
<p>It was I think September of 1970; I saw a design that was different than anything else. It was not your classic computer architecture as taught at the universities. It was all shift register. It was designed for the technology at the time. When talking to the people at Fairchild I meet a fellow, Rich Whicker, who later came to work at HP. I said, &#8220;God, this design, did you think of this?&#8221; He says, &#8220;No. We got it from Sweda, the cash register company.&#8221; Sweda at the time was trying to make an electronic cash register or Point-of-Sale products and they were using shift registers.</p>
<p>Shift registers were the densest form of integration of integrated circuits at the time; you had to keep the clock moving and so on. It had no static memory. So here was a design using shift registers a 20-digit chipset that could satisfy anybody making a four-function machine. Add, subtract, multiply, and divide. It could give you the numbers as big as most people wanted, but it was all fixed point. 20 digits should be more than enough for anybody. You could have the decimal point anywhere in that stream.</p>
<p>I got really cranked up about seeing that architecture at Fairchild, I got very excited. And I&#8217;m whispering in Tom Whitney&#8217;s ear, ‘God, this is great.’ And I&#8217;m trying not to be too excited while I&#8217;m there. When we drove away from there and, I said, ‘God, that&#8217;s exactly—you know, I can tweak that architecture just a little bit. We don&#8217;t need the full 20 digits, we can do this and this and this. And gosh, yes, I can do it, I can do it, I can do it.’”</p>
<p>And that’s when the HP 35 Pocket Scientific Calculator crossed over from the realm of the impossible to the realm of the possible. When David Cochran thought that it could. Two years later, in 1972, it became a reality. A pocket scientific calculator that ran on three NiCd batteries in a pack. The rest, as they say, is history.</p>
<p>Be sure to read the whole story at <a href="http://www.hpmemory.org/timeline/dave_cochran/a_quarter_century_at_hp_00.htm#chapter_08">http://www.hpmemory.org/timeline/dave_cochran/a_quarter_century_at_hp_00.htm#chapter_08</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://low-powerdesign.com/sleibson/2011/09/19/1972-when-scientific-calculators-truly-went-low-power/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>


