There are essentially two types of problems supercomputers can solve: problems where the CPUs only have to talk to each other occasionally, if at all, and problems where the CPUs have to talk to each other all the time. Here are two examples:
Shrek. Computer-animated movies like Shrek are created frame by frame by a computer. The computer has to calculate the color and brightness of each pixel on the screen. There are a lot of pixels in a movie frame and twenty-four frames per second of movie, so a ninety-minute movie like Shrek or Toy Story or Finding Nemo has about 130,000 frames and takes a lot of computing power. The good news is that the pixel on the upper left of the screen doesn't have much to do with the pixel on the lower right. So if you wanted to speed up the process of drawing one frame, you could assign one computer to draw the left side of the screen and another to draw the right side. This would cut the time it takes to draw one frame roughly in half. If you cut the screen into four areas you could put four computers on the job, cut it into eight and you could use eight, and so on. Because the computers drawing the various parts of the screen don't have to talk to each other, the computers can be far apart, and if one of the computers fails it's no big deal because the other computers aren't waiting for each other's results. The failed computer's job can just be resubmitted at the end.
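To make the "cut the screen into pieces" idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny frame size, the made-up shade_pixel function standing in for the real rendering math, and the use of threads on one machine in place of separate computers. It is not how any studio's renderer actually works.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 4  # a tiny, hypothetical frame


def shade_pixel(x, y):
    """Stand-in for the real rendering math: the result depends only on (x, y)."""
    return (x * 31 + y * 17) % 256


def render_strip(x_start, x_end):
    """Render columns x_start..x_end-1 without looking at any other strip."""
    return [[shade_pixel(x, y) for x in range(x_start, x_end)]
            for y in range(HEIGHT)]


def render_frame(workers):
    """Cut the frame into vertical strips, one per worker, then stitch them together."""
    bounds = [(i * WIDTH // workers, (i + 1) * WIDTH // workers)
              for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        strips = list(pool.map(lambda b: render_strip(*b), bounds))
    # Join the strips row by row to rebuild the full frame.
    return [sum((strip[y] for strip in strips), []) for y in range(HEIGHT)]


if __name__ == "__main__":
    # The picture is the same no matter how many workers drew it.
    print(render_frame(1) == render_frame(4))
```

Notice that render_strip never asks another strip for anything. That is why a failed strip could simply be rendered again later without holding up the rest.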
Billiard Balls. Now imagine you wanted a computer to calculate where all the balls would end up after the initial break in a game of eight-ball. The result depends on how all the balls hit each other and the walls of the pool table. Unlike Shrek, the movement of one ball is very much dependent on the movements of all the other balls. Imagine you assigned one computer to calculate the low balls and one computer to calculate the high balls. As soon as a low ball hit a high ball, the two computers would have to exchange data to determine how to change the balls' paths based on their impact. Since lots of high balls hit lots of low balls on a break, the two computers have to talk constantly. Moreover, if one computer failed, the other computer couldn't finish its job because it would be waiting for the failed computer. Consequently, the computers need to be close together so they can talk very quickly, and they have to be very reliable because they're counting on each other. The billiard ball example is simple: there are only sixteen balls, counting the cue ball. But now imagine there were 1000 balls, or a million balls, or the balls weren't balls at all but instead molecules of water flowing in a river. In this case the computers could easily spend more time talking than computing. So the network is very important, and so is the reliability of the nodes. If one computer is slow or one fails, everyone waits.
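The "everyone waits" behavior can be sketched in a few lines of Python. This is a toy under loud assumptions: the balls live on a straight line, and instead of real collision physics each ball is simply nudged toward the center of all the balls, which is just a stand-in for "every ball depends on every other ball". The names step_all and simulate_split are made up for this example, and threads stand in for separate computers.

```python
import threading


def step_all(positions, pull=0.1):
    """Reference version on one computer: nudge every ball toward the center of all balls."""
    center = sum(positions) / len(positions)
    return [p + pull * (center - p) for p in positions]


def simulate_split(positions, steps, workers=2, pull=0.1):
    """Same calculation, but the balls are divided among workers that sync every step."""
    n = len(positions)
    current = list(positions)  # read by every worker
    nxt = [0.0] * n            # each worker writes only its own slice
    exchanges = 0

    def swap_buffers():
        # Runs once per step, after ALL workers have arrived at the barrier.
        nonlocal current, nxt, exchanges
        current, nxt = nxt, current
        exchanges += 1

    barrier = threading.Barrier(workers, action=swap_buffers)

    def worker(lo, hi):
        for _ in range(steps):
            # Needs the positions of every ball, including the other workers' balls.
            center = sum(current) / n
            for i in range(lo, hi):
                nxt[i] = current[i] + pull * (center - current[i])
            barrier.wait()  # nobody moves on until everybody has finished this step

    threads = [threading.Thread(target=worker,
                                args=(w * n // workers, (w + 1) * n // workers))
               for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return current, exchanges


if __name__ == "__main__":
    final, exchanges = simulate_split([0.0, 1.0, 4.0, 9.0], steps=5)
    print(final, exchanges)
```

The barrier.wait() call is the whole story: the workers have to meet once per time step, so a slow worker sets the pace for all of them, and in this sketch a worker that died mid-step would leave the others waiting at the barrier.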







