Archive for January, 2008

Embedded RAM

[Image: Qualcomm Digital Baseband Processor]

For years, there has been speculation that traditional SRAM would be replaced with a denser type of memory for SoC devices. In fact, TI once announced that they would use ferroelectric RAM or FeRAM beginning at 90nm for their devices. Fear of ever-worsening process variability has been widespread. Alas, we have entered the 45nm period, and SRAM continues to be the workhorse of the industry.

I’m not suggesting there has been no competition. Once upon a time, it was popular to use a DRAM cell, hide the refresh inside a circuit macro, and call it 1T-SRAM.

But why have these alternatives either never been used (FeRAM and others), lost favor (as seems to be the case with 1T-SRAM and eFlash), or been relegated to only very high density, very high speed applications (as with DRAM)? Don’t expect an answer; I’m just posing these questions, at least for this week. Winter has weighed me down too much to think about such things.

DRAM has enjoyed some success, so let’s take a closer look. The bits are denser: a DRAM cell occupies only eight times the minimum area unit of a given process technology, versus about 120 or more for an SRAM cell. Looking at it this way, SoC DRAM should be a no-brainer, right? Wrong. DRAM is dynamic RAM. That is, leakage in the cell access transistor drains the stored charge and will erase the contents of the cell, so the cell has to be refreshed. The circuit overhead for cell refresh, along with some other operations, means that the DRAM taken as a complete circuit macro will only be smaller than an SRAM for densities beyond about 4MB.
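To make the break-even argument concrete, here is a rough sketch of the area trade-off. The cell sizes and the 4MB crossover come from the paragraph above. The fixed DRAM overhead is not a published number: it is an assumption, back-solved so that the two macros tie at 4MB. SRAM periphery is ignored entirely, and the function names are purely illustrative.

```python
# Rough area model for embedded DRAM vs. SRAM macros, in units of F^2
# (F = minimum feature size of the process).

DRAM_CELL_F2 = 8       # from the post
SRAM_CELL_F2 = 120     # from the post ("about 120 or more")
BREAK_EVEN_MB = 4      # from the post
BITS_PER_MB = 8 * 1024 * 1024

# ASSUMPTION: refresh and support circuitry is modeled as one fixed cost,
# chosen so DRAM and SRAM macros have equal area at the 4MB crossover.
DRAM_OVERHEAD_F2 = (SRAM_CELL_F2 - DRAM_CELL_F2) * BREAK_EVEN_MB * BITS_PER_MB


def sram_area_f2(megabytes):
    """SRAM macro area: cell array only (periphery ignored)."""
    return SRAM_CELL_F2 * megabytes * BITS_PER_MB


def dram_area_f2(megabytes):
    """DRAM macro area: cell array plus the assumed fixed overhead."""
    return DRAM_CELL_F2 * megabytes * BITS_PER_MB + DRAM_OVERHEAD_F2


def smaller_macro(megabytes):
    """Return which macro is smaller at the given capacity."""
    sram, dram = sram_area_f2(megabytes), dram_area_f2(megabytes)
    if dram < sram:
        return "DRAM"
    if sram < dram:
        return "SRAM"
    return "tie"


for mb in (1, 4, 16):
    print(mb, "MB:", smaller_macro(mb))
```

Under these assumptions, the dense cell only pays off once the array is large enough to amortize the overhead, which is the point of the 4MB figure.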

Until very recently, SoC DRAM has been narrowly confined to graphics processors in gaming consoles. Microsoft, Nintendo, and Sony game systems have all used graphics engines that integrated large embedded DRAM (eDRAM) arrays onto their chips. The semicos that actually produced the chips were ATi/NEC, IBM, and Sony/Toshiba. IBM has enjoyed a long history in the development of DRAM and embedded DRAM. IBM invented trench capacitor DRAM, and this type has become the de facto standard for SoC devices. I say that despite NEC’s stacked capacitor structure used in one of the Xbox 360 chips. In fact, the ATi/NEC chip is a very special case of eDRAM. It’s really more of a DRAM with integrated memory controller. Dick James has some interesting thoughts on eDRAM in game consoles that you can reach with this link.

[Image: Sony PS3 Integrated Emotion Engine and Graphics Processor Die (diffusion level)]

The Sony gaming consoles are an interesting case study. Consider this. PS2 and PS3 consoles contained graphics engines with eDRAM. However, the hand-held PSP does not. Instead, a commodity DRAM die is packaged, SiP style, along with a pure logic LSI device. You might say that portable devices need to be more energy efficient and would demand the use of single-chip solutions over multiple die which typically suck more power. But there is more price and manufacturing cost pressure on the PSP driving Sony’s choice for the portable platform. High development and process integration costs are certainly limiting the use of eDRAM. I think the lesson from Sony is actually more general and addresses the whole SoC versus SiP debate.

[Image: PSP graphics processor die micrograph]

My take is that eDRAM will only ever exist in a couple of places. It will continue to provide extremely large on-chip arrays for memory-hungry special purpose graphics processors. It will also become more common where bits need to be neatly packed along the columns of an LCD driver. But once again, this will only be for extremely memory intensive applications such as mobile drivers with very large color depth.

Other than that, SRAM will reign supreme. For the best example, look at the Qualcomm MSM7500 die micrograph at the top of this post. For a complicated SoC architecture like this, embedding DRAM makes no sense. Another example of SRAM maintaining its foothold is the latest 45nm MPU. Since Intel still did not integrate a memory controller onto the Penryn, the design relies on a huge 6MB L2 cache, and that cache is still good, old-fashioned SRAM.


45nm: What Intel Didn’t Tell You

This article originally appeared in EETimes Under the Hood. Unfortunately, there was an editing hiccup, so I have decided to post the complete text of the original article here. -Ed 

Two months after Semiconductor Insights provided the first public view of Intel’s 45nm technology and nearly a month after Intel’s IEDM presentation, it seems appropriate to revisit both the technology itself and what Intel was willing (and perhaps less willing) to reveal about it.

As noted on EETimes almost one month prior to the 2007 IEDM, the main features of Intel’s 45nm technology are the incorporation of high-k hafnium-based dielectric material, titanium nitride (TiN) for the PFET replacement gate, and a TiN barrier alloyed with a work function tuning metal for the NFET replacement gate.

Although not the first 45nm node technology available on the open market, Intel’s process is the first to incorporate high-k metal gate (HKMG) technology. Panasonic (Matsushita) was actually the first to the 45nm node with an ASIC-optimized process using traditional polysilicon gates with SiON dielectrics.  This process is designed for density over performance and probably more indicative of the direction most manufacturers will be heading at 45nm. Density is certainly the key for the Panasonic process as it boasts several critical dimensions smaller than Intel’s.

Some high points of Intel 45nm HKMG technology are:

  • High-k first, metal gate last integration
  • Hafnium oxide (HfO2) gate dielectric (1.0nm EOT)
  • Dual band edge workfunction metal gates:
      • TiN for PMOS
      • TiAlN for NMOS

The gate last integration is one point that needs a bit of clarification in the Intel process flow.

Process Integration

Polysilicon gates may be gone in the final Intel 45nm products, but they are far from forgotten. A great deal of the transistor formation still depends on the polysilicon techniques that have dominated the industry for the last 40 years. In fact, the references to “first” and “last” refer to the order of the high-k and metal gate formation with respect to the polysilicon deposition.

It is now well-known that Intel uses a gate last or replacement gate process flow at 45nm. But there is an opportunity for a great debate of the semantics of the terms, whether it’s “gate” or “last.” I’m not predicting that the lawyers are already on their way, but there’s bound to be a patent out there that will create just such an argument.

The replacement gate flow allows Intel to reuse many process steps and tools from the age-old polysilicon gate technology. Patterning polysilicon and forming traditional silicon oxide and nitride sidewall spacers leverages tried and true self-aligned processes for source and drain formation and their lightly doped extension regions. Once these steps are completed, the polysilicon is removed and workfunction metals are deposited in their stead.

But there is something interesting going on even before the first poly deposition. Contrary to the suggestion in their IEDM paper, Intel deposits the first workfunction metal prior to the sacrificial gate polysilicon. For the P-channel transistor, titanium nitride (TiN) is deposited immediately after the HfO2 dielectric. Adding aluminum to form TiAlN tunes the workfunction for the N-channel transistors. There are a couple of ways to get the aluminum into the NFET’s gate, but I will not mention those here. In general terms, these primary workfunction metals are blanket deposited in their associated conductivity regions on the die.

Intel’s process protects HfO2 from the polysilicon etch by depositing the first workfunction layers before forming and patterning polysilicon. SI engineers refer to the first gate metal layer as the top interface layer (TIL) because of the undeniable protection it provides the HfO2 dielectric. The P-type metal gates are TiN while Al is added to create TiAlN and the appropriate workfunction for NMOS. Thicker layers of both metals are deposited in their respective N- and P-channel transistors after removing the sacrificial polysilicon and a barrier layer is formed on the bottom and sidewalls of the trench left behind by the polysilicon etch.

Making a final determination about whether the first or second layer of the workfunction metals is the most important in the Intel device would require additional mathematical treatment or computer simulations which are beyond the scope of this article. Is the primary gate the metal layer deposited before polysilicon or the one that comes after? To be fair, no one expects manufacturers to publicly disclose specific details of their processes. Either way, comments about the meaning of “gate” are arguably less important than the electrical performance of the finished product. Intel 45nm technology is certainly impressive in that regard. SI’s extraction of transistor electrical parameters indicates the following saturated drive currents at 1.0V and room temperature:

  • PFET IDSAT = 1.08 mA/µm
  • NFET IDSAT = 1.36 mA/µm

Intel confirmed these values at their IEDM presentation in December (although our PFET number is actually 10µA/µm higher than Intel reported). Not surprisingly, our results show higher drive currents at low temperature (-20°C) and reduced current at high temperature (85°C).

These high values for drive current evoke more questions regarding the gate structure. There has always been a discrepancy between the physical gate length, LG, of transistors and the shorter electrically active channel length, Lelec. But before the advent of modern metal gate technology, it was relatively easy to specify LG and compare transistor performance between fabs. The Intel gate structure creates some new problems for analysts.

Intel reports a gate length of 35nm which fits well with the 1.36mA/µm drive current generated by their NFET. However, the edge-to-edge dimension of their gate structure is closer to 45nm if measured in a fashion similar to the standard used for polysilicon gates. So what gives? The ratios of LG, Lelec and source/drain extension lengths would be out of whack to produce such large saturation currents.

The answer appears related to the question about the location of the metal gate’s edge. In the past, it was assumed that the entire width of the poly gate influenced carriers in the transistor channel. Since polysilicon is etched and replaced with a metal gate filling the trench in the gate last process, the situation is less straightforward. The first material deposited into the gate trench is not metal for the gate, but actually a barrier material, so the active portion of the gate is less than the traditional length measurement that would essentially run between the sidewall spacer on either side of the gate. The barrier is quite thin, though, so that would not account for the gate measurement difference.

What appears to set the electrically active gate length is the bird’s beak formed where the sidewall spacer meets the TIL. SI analysis concluded that this bird’s beak is the result of TIL and high-k etches undercutting the polysilicon. Re-oxidation of the polysilicon sidewall prior to silicon nitride spacer formation exacerbates the undercut. For the metal gate deposited into the trench, there is a thick, relatively low-k path toward the channel at this point that obviously could not electrically influence charge carriers in the region directly underneath the bird’s beak.

The critical portion of the metal gate could also be the TIL itself. Since this layer is composed of the same workfunction metal as the gate last layer, perhaps its edge defines the metal gate length. Fortunately, the edge of the TIL layer approximately aligns with the bird’s beak above it, so the choice of measurement point will not affect the value you get for LG.

The punch line to all of this is that the gap between gate trench edge and the electrically active edge of the workfunction metal (whether first or last) accounts for somewhere between 8 and 10nm. And that appears to explain the difference between Intel’s reported value for LG and what the rest of us have been looking at.
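The arithmetic behind that punch line is simple enough to write down. The sketch below uses only the numbers quoted in this article. It assumes, since the text does not say so explicitly, that the 8 to 10nm figure is the total gap across both edges of the gate rather than a per-side value. The function name is illustrative.

```python
# All values in nanometres, taken from the article.
TRENCH_EDGE_TO_EDGE_NM = 45   # measured the way a polysilicon gate would be
INTEL_REPORTED_LG_NM = 35     # Intel's stated gate length
GAP_RANGE_NM = (8, 10)        # ASSUMPTION: total gap, both edges combined


def active_gate_length_range(trench_nm, gap_range_nm):
    """Return (min, max) electrically active gate length after removing the gap."""
    smallest_gap, largest_gap = gap_range_nm
    return trench_nm - largest_gap, trench_nm - smallest_gap


lg_min, lg_max = active_gate_length_range(TRENCH_EDGE_TO_EDGE_NM, GAP_RANGE_NM)
print(f"Active gate length: {lg_min} to {lg_max} nm")
print("Consistent with Intel's figure:", lg_min <= INTEL_REPORTED_LG_NM <= lg_max)
```

With the gap read as a total, the range brackets Intel's reported value. If the gap were per side, the same subtraction would overshoot, so the total reading appears to be the one the text intends.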
 
Despite its cure for leakage power, adding hafnium creates new headaches for the process integration engineer. Intel avoided hafnium’s downsides – threshold voltage pinning and reduced carrier mobility – by creating a silicon oxide (or possibly oxynitride) bottom interface layer (BIL) between the silicon substrate and the HfO2 layer. The BIL not only gets hafnium into the gate stack, it also gives the process engineer one more tuning knob. Since the gate dielectric’s influence on the transistor channel and electrical performance is a function of the individual contributions of the various layers, threshold voltages can be controlled by varying the BIL thickness for different transistor applications.
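The "sum of individual contributions" idea can be sketched with the standard equivalent oxide thickness (EOT) relation for stacked dielectrics, where each layer's physical thickness is scaled by the ratio of the SiO2 permittivity to its own. The layer thicknesses and the HfO2 permittivity below are illustrative assumptions, not measured values from the Intel device. They were picked only so that the stack lands near the 1.0nm EOT quoted earlier.

```python
K_SIO2 = 3.9    # relative permittivity of SiO2, the reference for EOT
K_HFO2 = 20.0   # ASSUMPTION: commonly quoted HfO2 values are roughly 20 to 25


def stack_eot_nm(bil_nm, hfo2_nm, k_bil=K_SIO2, k_hfo2=K_HFO2):
    """EOT of a two-layer stack: interface layer (BIL) plus high-k layer.

    Layers act as capacitors in series, so their equivalent oxide
    thicknesses simply add.
    """
    return bil_nm * (K_SIO2 / k_bil) + hfo2_nm * (K_SIO2 / k_hfo2)


# Hypothetical thicknesses: vary the BIL while holding the HfO2 at 2.5 nm.
HFO2_NM = 2.5
for bil_nm in (0.4, 0.5, 0.6):
    eot = stack_eot_nm(bil_nm, HFO2_NM)
    print(f"BIL {bil_nm:.1f} nm + HfO2 {HFO2_NM:.1f} nm -> EOT {eot:.2f} nm")
```

In this simple model a pure SiO2 BIL contributes its full physical thickness to the EOT, so small changes in the BIL move the overall stack noticeably. That is one way to see why it works as a tuning knob.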

DFM

Process variability and designing for it are now hot topics as problems like line edge roughness and random dopant fluctuations become more problematic at 45nm.  This was addressed in Intel’s second presentation at IEDM 2007. Kelin Kuhn discussed improving yield by process improvements as well as design changes. The SRAM cell illustrated Kelin’s point as she showed the evolution from 90nm to 45nm design. The “tall” cell layout used at 90nm was replaced with a “wide” cell at 65nm.  The 65nm wide SRAM cell design improved dimension control and variability by aligning the polysilicon in a single direction and removing the corners in the active area patterns. At 45nm, Intel’s process removed “dog bone” and “icicle” shapes by employing only square end caps. These uniform structures are also easier to fill reliably in the gate last process.

Intel continues to use 193nm dry lithography at 45nm. Restricted design rules create “structured” gate layouts as Dr. Kuhn mentioned in her discussion of the SRAM cell. This DFM technique of uniform, regular arrangement of metal gates improves yields for the advanced HKMG technology without investing in new immersion tooling. Creating strictly rectangular gate patterns did require an extra step as double-patterning was used for the sacrificial polysilicon layer.

Many features of Intel’s 65nm process remain in evolved form. “Third-generation” strained silicon is used, which is structurally similar to the embedded SiGe PMOS of their 65nm process. Nickel-salicide is also used again at 45nm. Intel employs dual damascene copper up to metal nine. A SiCN barrier with carbon-doped oxide (CDO) creates the low-k inter-level dielectric integration scheme.

Final Thoughts

However you slice it (pun intended), the Intel process is truly innovative. For technology analysts and pundits, it brings something fresh to the discussion: Moore’s (never-ending) Law, future trends, scaling and arguments about how they did it.

I want to thank Fayez Elchamaa, Vu Ho, Xu Chang and the rest of the crack Intel 45nm project team for their hard work. The SI analytical team has managed to piece together a large and complex set of data in order to provide both this brief overview and the detailed analysis available to our clients.


ITRS Pre-release

[Image: Alan Allan’s Slide from ITRS]

Today Laura Peters from Semiconductor International (the other SI) hosted a webcast providing an overview of the contents of the upcoming 2007 Edition of the Semiconductor Industry Association’s (SIA) roadmap for technology - the ITRS. If there is anyone who really knows what’s in that upcoming document, it is Intel’s Alan Allan, and he gave the presentation. It’s the second time I’ve had the chance to listen to this Intel guru (see first time, here). This time Allan focused his thoughts on diversification of the ITRS to “more than Moore.” Scaling and Moore’s Law have been the rails for the semiconductor industry since its early days, but there is only so much that physical scaling can do and some places it’s just not at home.

The most obvious technology that gives us more without Moore is MEMS. Micro (or nano) electromechanical systems (MEMS) are at the heart of many systems from air bag deployment to Nintendo Wii game controllers. For Alan Allan and other industry luminaries, turning their attention to the part of consumer electronics systems where the rubber meets the road is part recognition of the importance of the overall system or product and part realization that many aspects of scaling are creating more trouble than they are worth. Those troubles include rising power consumption and the rising capital cost of getting a cutting-edge chip designed, fabbed and into the market.

Sensors and actuators, or the way microelectronics actually interacts with our physical world, are a critical aspect of everything in our new digital age. Portable music players are ubiquitous primarily because of digital representation and storage of content, but they still need to drive something that can render an analog signal (that’s sound to a headphone if that last bit was too obscure). You can add as much digital signal processing horsepower as you want to a car, but it isn’t going to detect and correct a skid if there isn’t an analog detector at the front end of the signal path. And sensors - in particular MEMS devices - have no need for the latest lithography or the fastest transistors. MEMS technology is most at home in older, sometimes fully depreciated fabs.

Most of the “more than Moore” diatribe above is my own. I apologize and reward you for reading this far with a couple of comments made by the other ITRS representatives who were on hand for answering questions at the webcast. These actually relate more to Moore and traditional scaling issues.

When will EUV be ready? Will Conley, one of Freescale’s members of the lithography working group, answered that the next generation of litho will be ready for 22nm but not before. As for nanoimprint, he said this will only be used for “early device learning” until throughput can be improved.

I took note of a couple of other questions about interconnect. Chris Case (of the Linde Group) said there would not be a “materials” solution to inter-level dielectric constant below 2.0. He quickly added that there will be an air gap combination approach to get below k=2.0. Alan Allan also addressed the slowing of reductions in effective k values and many unresolved problems in technologies that are getting closer to their required introduction date (red brick wall for 2012). Chris Case also pointed out that 3D interconnect technology is well-proven in development fabs and will be ready when the industry is ready to take the plunge.

