Is custom-built really faster than general purpose?

Many people think that custom chips for appliances are faster at their respective task than general purpose chips. For areas like graphics processing this is true: GPUs are so powerful that many people are thinking about using them for general computing. But I made an interesting observation in the SPECjbb2005 list. The company Azul Systems developed a special purpose processor for Java. Its top-of-the-line 7280 system consists of 16 processors and delivers 872972 SPECjbb2005 bops, so the performance per processor is 54561 SPECjbb2005 bops. A single T2000 has a performance of 96523 SPECjbb2005 bops. Thus the general purpose processor UltraSPARC T1 (okay, almost general purpose) has almost twice the performance (roughly 1.8 times) of the custom-built Vega2 processor on a per processor basis.
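To make the arithmetic explicit, here is a minimal Python sketch of the per processor comparison. The variable names are made up for illustration; the figures are the ones quoted above, and the T2000 is treated as a single-processor system.

```python
# Figures quoted from the SPECjbb2005 list in the text above.
azul_7280_bops = 872972      # whole Azul 7280 system
azul_7280_processors = 16    # Vega2 processors in the 7280
t2000_bops = 96523           # one T2000 with one UltraSPARC T1

# Performance per Vega2 processor.
vega2_bops_per_processor = azul_7280_bops / azul_7280_processors

# How much faster is one UltraSPARC T1 than one Vega2?
t1_vs_vega2 = t2000_bops / vega2_bops_per_processor

print(f"Vega2 per processor: {vega2_bops_per_processor:.0f} bops")
print(f"UltraSPARC T1 / Vega2: {t1_vs_vega2:.2f}x")
```

Note that this compares processors, not cores, so it says nothing about per core performance.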
Another comparison is interesting, too: Assume a workload that can be scaled horizontally. You need an aggregate performance of at least 850,000 SPECjbb2005 bops. You could use 1 Azul 7280 or 9 T6300 blades. You would assume that the Azul solution is more efficient. But the 7280 takes 14 rack units and is rated at a typical power draw of 3250 watts (as specified on their website). 9 T6300 blades would take up 10 rack units and draw roughly 2700 watts. With Intel C2D the performance calculation would be similar, although not in the same power envelope. The Azul system has a different advantage: it presents a single system image to the application, so a Java application would be able to use the full 756 gigabytes of memory. But in the end, it's interesting evidence that custom-built hardware isn't necessarily faster or more efficient than general purpose hardware.
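The sizing comparison can be sketched the same way. This is a rough sketch, not a benchmark: it assumes a T6300 blade delivers the same SPECjbb2005 result as the T2000 quoted earlier (which is what the blade count above implies), and it takes the rack unit and wattage figures from the text as given. All names are made up for illustration.

```python
import math

TARGET_BOPS = 850_000   # required aggregate performance
AZUL_7280_BOPS = 872972 # one Azul 7280
BLADE_BOPS = 96523      # ASSUMPTION: a T6300 blade scores like the T2000

# Number of blades needed to reach the target when scaling horizontally.
blades_needed = math.ceil(TARGET_BOPS / BLADE_BOPS)
blade_total_bops = blades_needed * BLADE_BOPS

# Space and power figures as quoted in the text.
configs = {
    "1 x Azul 7280": {"bops": AZUL_7280_BOPS, "rack_units": 14, "watts": 3250},
    f"{blades_needed} x T6300": {"bops": blade_total_bops, "rack_units": 10, "watts": 2700},
}

def efficiency(cfg):
    """Return (bops per watt, bops per rack unit) for one configuration."""
    return cfg["bops"] / cfg["watts"], cfg["bops"] / cfg["rack_units"]

for name, cfg in configs.items():
    per_watt, per_ru = efficiency(cfg)
    print(f"{name}: {cfg['bops']} bops, "
          f"{per_watt:.0f} bops/watt, {per_ru:.0f} bops/rack unit")
```

Under these assumptions the blades come out ahead on both bops per watt and bops per rack unit, which is the point of the comparison.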