'"It's a clone. It's implemented in an FPGA. It runs almost every ARM instruction," said Collin. "It's rather slow: 15 to 17 MHz," he added.'
This kind of effort is going to be a puzzlement for IP companies. With today's excellent design implementation tools, scalar processor design and implementation is not rocket science. (See also my DesignCon paper and the Circuit Cellar articles.)
I believe that very substantial subsets of any of the famous RISC processor architectures can be reimplemented to run at 50-100 MHz in an FPGA -- often in cheap Spartan-II parts -- in short order by a skilled practitioner.
I think the "Star IP" companies should try to get ahead of the curve. These companies will need to determine a licensing regime and a business model that embraces, or at any rate, anticipates, these soft core reimplementations.
Perhaps they should think of these reimplementations as complementary to their deluxe hard core macros for custom VDSM VLSI. A popular 50-100 MHz FPGA soft core is not going to displace their high value hard cores, and in fact will help to win market share dominance for their instruction sets.
Or perhaps they could try to stem the tide of potentially popular and incompatible open source reimplementations by writing, distributing, and licensing free (gratis) high quality (and FPGA optimized) "reference" soft cores themselves, albeit perhaps not free (libre) in their usage terms.
For example, had there existed a high quality FPGA soft core for the architecture in question, easily licensed for free to universities (does such exist?), this Swedish MPSOC reimplementation might never have come to pass.
For an example of an enlightened approach to this, look at Sparc International and the success of LEON SPARC. Quoting from Peter Clarke's 3/6/00 article in EE Times,
'"The Sparc architecture is an IEEE standard so it is an open standard. There are no limitations on designing around it," said Gaisler. "However the trademark belongs to Sparc International Inc. We can't call Leon a Sparc but we can call it Sparc-compatible. I have been in touch with Sparc International regarding the licensing issues."'
Consider also these paragraphs from the same article:
'ARM's spokeswoman said the company does not see availability of free 32-bit processor cores as similar to its own business or competitive with it.'
'"ARM's business model is not just about cores," she said. "It's based on a full product road map and also on third-party support. That includes design companies, development tool support and operating systems support. It's not simple to build all the support that ARM now has in place."'
Well, in the case of SPARC, this is (at least conceptually) 1) LEON SPARC; 2) GCC; 3) Linux (ucLinux) or RTEMS or eCOS etc. Want commercial support? See Gaisler Research or Metaflow's new LeonCenter.com.
See also the predictions in Why FPGA CPUs?, and see another valuable perspective in this EBN article by Jack Robertson, Free downloadable cores pose difficulties, exec says.
The paper abstract
Mikael Collin et al, SoCrates - A Multiprocessor SoC in 40 days.
See also our 8-CPU MP FPGA design and my 1997 posting on FPGA Multiprocessors.
LEON in the News
The LEON SPARC synthesizable CPU (often implemented in a big Virtex part) is also the subject of an editorial by Max Baron in Microprocessor Report.
"It's not very likely that these new free engines will replace the cores that require license and royalty payments. Extreme results in design must still be hand-tuned, whether for performance, power dissipation, or wide and multiple datapaths and deep pipelines. But a sizable number of volume applications don't need the extremes, and that situation will establish the free cores."
On this point, I disagree. Non-free soft cores have no monopoly on hand-tuned optimizations or craftsmanship.
New Joel on Software: Daily Builds are Your Friend:
"One rule we followed on the Microsoft Excel team, to great effect, was that whoever broke the build became responsible for babysitting builds until somebody else broke it. In addition to serving as a clever incentive to keep the build working, it rotated almost everybody through the job of buildmaster so everybody learned about how builds are produced."
Yep.
With a multi-hundred-million-transistor budget to play with, it will be quite fascinating to see what Virtex-II 'immerses' next. I can think of several interesting hard cores.
Steven Fyffe, Electronic News (EDN Access): Hitachi, Triscend Partner On Programmable Chip.
Will Cummings of Insight announced the board here on the mailing list.
This board has an XCV600EHQ260-6, 50 MHz oscillator, DS1073 100 MHz programmable oscillator, Infineon PDSP1880 8-character 5x7 alpha LED, 4 LEDs, RS-232 drivers and connector, some switches, the all important DONE LED, DIME option module connector, JTAG port, Multilinx port, XC18V04 ISP PROM, and other goodies.
It has on-board 5V, 3.3V, 2.5V, and 1.8V regulators, which you can jumper out to provide your own power supplies. No doubt I will have to do that when/if I get the big chip-multiprocessor designed and running.
There are also plenty of headers to get at FPGA pins and jumpers to disconnect almost everything from the FPGA.
The board is accompanied by a thorough 26 page User's Guide and a full set of schematics, and by a 7V DC 1.2A AC adapter.
Out of the box, the board is jumper configured in master serial mode, and downloads a design from the config FLASH ROM that displays INSIGHT on the alpha LEDs and flashes the other LEDs.
I wish it had a FLASH ROM and a bank of 32K or more x32 sync SRAM, or a 4Mx32 SDRAM, but you can't always get what you want.
Adventures of a JTAG Newbie
Last night I had my first exposure to JTAG programming.
Previously I've used XChecker or the XESS download cable.
After more than an hour of grief, I gave up trying to get the Xilinx JTAG Programmer to recognize my JTAG Parallel Cable III from my notebook PC. I had read or skimmed every JTAG related document and support page on the Xilinx web site, and fiddled again and again with my notebook's parallel port I/O address and settings (should it be ECP, EPP, bidir, unidir, and why can't the silly programmer recognize LPT3?), all to no avail. My last theory is that there is some Windows NT/2000 driver to access the parallel port that I never installed, but if so, it seems thoroughly undocumented by Xilinx.
So I moved to my desktop machine, installed the JTAG Programmer there, and it recognized my cable immediately.
I set up a JTAG scan chain which bypasses the 18V04 with a dummy MCS file, and programs the V600E with my leds.bit test design. This design simply divides CLK by 2^24, driving the upper 4 bits of the counter to the 4 LEDs, which should count at ~3, ~6, ~12, ~24 Hz.
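The expected LED rates are easy to check by hand. Here is a minimal Python sketch of the arithmetic, assuming the counter is clocked by the board's 50 MHz oscillator and is exactly 24 bits wide with its top four bits wired to the LEDs (the names CLK_HZ, COUNTER_BITS and bit_toggle_hz are mine, for illustration only):

```python
# Assumptions: 50 MHz clock, 24-bit free-running counter,
# upper four counter bits drive the four LEDs.
CLK_HZ = 50_000_000
COUNTER_BITS = 24


def bit_toggle_hz(bit: int, clk_hz: int = CLK_HZ) -> float:
    """Frequency of the square wave seen on counter bit `bit` (bit 0 = LSB)."""
    # Bit n completes one full on/off cycle every 2**(n + 1) clock ticks.
    return clk_hz / 2 ** (bit + 1)


# The four most significant bits, from bit 23 down to bit 20.
led_bits = list(range(COUNTER_BITS - 1, COUNTER_BITS - 5, -1))
led_hz = [bit_toggle_hz(b) for b in led_bits]

for bit, hz in zip(led_bits, led_hz):
    print(f"counter[{bit}] blinks at {hz:.2f} Hz")
```

The slowest LED (the counter's top bit) comes out just under 3 Hz and each lower bit doubles that, which matches the ~3, ~6, ~12, ~24 Hz estimate.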
When I executed that JTAG chain, it properly returned the ID of the 18V04, but when programming the V600E, DONE did not go high.
After some head scratching, I remembered that I had not changed the configuration options of BITGEN to reflect that I was configuring with JTAG. I changed the startup clock to JTAGCLK, rebuilt and redownloaded, and sure enough, DONE went high, and my LEDs lit up but didn't blink -- indicating the counter was stuck at 0x000000.
A bit more head scratching and I had it. Where the User's Guide says "The ... on-board 50 MHz oscillator ... is enabled when the JP36 jumper is closed." -- it means "when the JP36 jumper is absent". I had looked at the schematic, but it wasn't clear what grounding the oscillator's ENABLE pin would do -- enable it or disable it?
I removed JP36 and my little design worked. Call it a night.
Lessons
"The XC2V6000, which is set for release during 1Q2001, contains 76,032 logic cells, or 67,584 LUT/register units. This capacity is essentially equal to the 0.18-micron XCV3200E and can be compared to 51,840 LUT/register units for Altera's EP20K1500E. The largest planned Virtex-II devices, the XC2V10000 will contain 138,240 logic cells."
My point exactly.
FPGA 2001 is coming soon.
Xilinx has an XtremeDSP/Virtex-II Technical Simulcast on 1/25: "It's a delicious plate of DSP and system design goodies not to be missed...".
During February and March, Atmel is offering free technical seminars on its FPSLIC (AVR+FPGA) and AVR MCU products.
"In the morning session, the seminar will cover FPSLIC architecture, co-design and co-verification design methodologies, features and documentation of Atmel's System Designer EDA tool and a step-by-step design tutorial using Atmel's FPSLIC Starter Kit."
I found links to several MISC (minimal instruction set computer) FPGA CPU implementations at UltraTechnology's Forth Chips page, and added a new category for them on the Links page.
The XKL TOAD-1 DEC-20 FPGA reimplementation (two XC4010E's, if I recall correctly) was noted on slashdot.org.
'FPGA boards require at least one resistor per I/O. If you multiply the number of I/Os by 1,000 for every pin on a high-end FPGA, times the number of ICs, the problem becomes clear. "We found out that Cisco has more engineers on pc-board layout than on the FPGAs," said Bruce Weyer, senior director of marketing for Xilinx (San Jose, Calif.).'
Alex Romanelli, Electronic News Online via EDN: Xilinx Virtex-II FPGAs Finally Ship.
Marketing gates redux
You might think that the new "1 M" "system gate" XC2V1000 would have
about the same number of programmable logic cells as the "1,124,022"
"system gate" XCV1000. Or perhaps you might think the 10 M system gate
XC2V10000 would provide about 1,000 times more logic cells than
the 10,000 gate XC4010, more or less.
The following table shows some representative Xilinx devices, the device CLB matrix, the number of CLBs, the no. of 4-LUTs/CLB, the no. of 4-LUTs, the approximate no. of "ASIC equivalent gates" or "system gates" stated at the time those devices were brought out, and the result of dividing stated gates by the number of 4-LUTs.
Device     Matrix   CLBs   LUTs/CLB  4-LUTs  Gates   Gates/LUT
XC4010     20x20    400    2         800     10K     12.5
XC40150XV  72x72    5184   2         10368   150K    14.4
XCS20      20x20    400    2         800     20K     25.0
XCV1000    64x96    6144   4         24576   1.124M  45.7
XC2V1000   40x32    1280   8         10240   1M      97.6
XC2V2000   56x48    2688   8         21504   2M      93.0
XC2V10000  128x120  15360  8         122880  10M     81.4
Draw your own conclusions.
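For anyone who wants to redo the division, here is a small Python sketch that recomputes the 4-LUT counts and the gates-per-LUT ratio from the CLB matrix, LUTs/CLB and stated gate counts in the table. The data layout and the helper name gates_per_lut are illustrative choices of mine, not anything from a Xilinx tool.

```python
# (device, CLB rows, CLB cols, 4-LUTs per CLB, stated gates), from the table.
DEVICES = [
    ("XC4010",     20,  20, 2,     10_000),
    ("XC40150XV",  72,  72, 2,    150_000),
    ("XCS20",      20,  20, 2,     20_000),
    ("XCV1000",    64,  96, 4,  1_124_022),
    ("XC2V1000",   40,  32, 8,  1_000_000),
    ("XC2V2000",   56,  48, 8,  2_000_000),
    ("XC2V10000", 128, 120, 8, 10_000_000),
]


def gates_per_lut(rows: int, cols: int, luts_per_clb: int, gates: int):
    """Return (number of 4-LUTs, stated gates divided by 4-LUTs)."""
    luts = rows * cols * luts_per_clb
    return luts, gates / luts


results = {}
for name, rows, cols, luts_per_clb, gates in DEVICES:
    luts, ratio = gates_per_lut(rows, cols, luts_per_clb, gates)
    results[name] = (luts, ratio)
    print(f"{name:10s} {luts:7d} 4-LUTs  {ratio:6.1f} gates/LUT")

# How many times more 4-LUTs does the "10 M gate" part have than the
# "10,000 gate" XC4010?
lut_growth = results["XC2V10000"][0] / results["XC4010"][0]
print(f"XC2V10000 / XC4010 4-LUT ratio: {lut_growth:.1f}")
```

Last-digit differences from the table are rounding effects. The final line is the telling one: going by stated gates the XC2V10000 is 1,000 times the XC4010, but by 4-LUT count it is only about 154 times larger.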
References
XC4010: Xilinx, The Programmable Logic Data Book, 1994, 3rd ed., p.2-47,
(table), "Appr. Gate Count, XC4010/10D: 10,000".
XC40150XV: Xilinx, The Programmable Logic Data Book, 1998, p.4-151,
(table), "Max Logic Gates (No RAM), XC40150XV: 150,000",
"Typical Gate Range (Logic and RAM), XC40150XV: 100,000-300,000".
XCS20: Xilinx, The Programmable Logic Data Book, 1998, p.4-173,
(table), "Max System Gates, XCS20 & XCS20XL: 20,000";
"Typical Gate Range (Logic and RAM), XCS20 & XCS20XL: 7,000-20,000".
XCV1000: Xilinx, The Programmable Logic Data Book, 1999, p.3.3,
(table), "System Gates, XCV1000: 1,124,022".
XC2V1000, XC2V2000, XC2V10000: Xilinx, Virtex-II 1.5V Field-Programmable Gate Arrays, DS031 {v1.2}, January 15, 2000 (sic),
p.2, (table), "System Gates, XC2V1000: 1 M; XC2V2000: 2 M; XC2V10000: 10M".
Today we also have what will probably be the biggest programmable logic news story of the year -- the launch of the 1.5V Xilinx Virtex-II product family.
Xilinx links: launch PR, Q&A, backgrounder (PDF), handbook and data sheets, main data sheet.
Virtex-II is big and fast. For example, the XC2V6000 has 76K logic cells, 144 18 Kb BlockRAMs, 144 18x18 multipliers, and 1104 IOBs.
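That "logic cells" figure deserves a second look. The sketch below uses the XC2V6000 counts quoted earlier in this issue (76,032 logic cells versus 67,584 LUT/register units) to back out the implied cells-per-LUT factor, and then applies it to the XC2V10000. It assumes the same factor holds across the family, and takes the XC2V10000's 128x120 CLB matrix and 8 LUTs/CLB from the marketing gates table. The variable names are mine.

```python
# Quoted figures for the XC2V6000.
LOGIC_CELLS_6000 = 76_032
LUT_REG_UNITS_6000 = 67_584

# Implied number of "logic cells" counted per real LUT/register unit.
cells_per_lut = LOGIC_CELLS_6000 / LUT_REG_UNITS_6000

# Assumption: XC2V10000 = 128 x 120 CLBs with 8 LUTs each,
# and the same cells-per-LUT factor applies to it.
luts_10000 = 128 * 120 * 8
implied_cells_10000 = luts_10000 * cells_per_lut

print(f"logic cells per LUT/register unit: {cells_per_lut}")
print(f"implied XC2V10000 logic cells:     {implied_cells_10000:.0f}")
```

Under those assumptions the factor works out to 1.125, and the implied XC2V10000 figure is 138,240, the same "logic cells" number quoted for that device.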
Here are some entries in the Virtex-II Performance Characteristics (p.48 of the above linked data sheet):
Description            Reg-to-reg Perf (MHz)
16-bit addr decoder    457
32-bit addr decoder    311.7
16-bit adder           287.4
reg to LUT to reg      708.7
18x18 multiply, reg'd  154.9
(Performance in MHz in XC2V1000-5 device)
"It's a great time to be us." -- Steve Ballmer
(Re)configurable computing
Farewell twentieth century. So long and thanks for all the fish.
FPGA CPU News, Vol. 2, No. 1
Opinions expressed herein are those of Jan Gray, President, Gray Research LLC.