
MIT's New Compiling Method Can Optimize Code Before Parallel Execution

MIT's Stata Center, Home of CSAIL | News.mit.edu

Researchers at MIT have found a way to improve optimization in parallel programs by throwing conventional ‘wisdom’ out the window. This new development could greatly improve the performance of software on multi-core processors, which most modern computers utilize.

As a front end content writer, I sit on the opposite side of the lunch table from computer programmers. Regardless, when an advancement is made in compiling such as this one, even I can understand its significance.

Compiling and Parallel Computing

Researchers at MIT have found a way to advance the field of Computer Science by introducing new methods for optimizing code for parallel computing. What does that mean?

Here we go.

Next week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory will unveil a new variation on an open-source compiler that can optimize code before it adds the extra instructions needed to manage parallel execution.

To understand the new research, you need to know about two things (experts skip ahead):

  • Compilers convert code that humans can read into instructions that machines can execute. Contemporary compilers also analyze that code to find ways to make the resulting program run more efficiently.
  • Parallel computing is a form of computation in which a large problem is split into smaller pieces that can be solved at the same time and then combined into a single result. Multi-core processors work this way: spreading the load across cores gets the overall job done faster (a minimal sketch follows this list).
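
To make that second point concrete, here is a minimal sketch of divide-and-combine parallelism in plain C with POSIX threads. It isn't from the MIT work; it just shows one big sum being split into two halves, each half summed on its own thread, and the partial results combined at the end.

```c
/* Not from the MIT paper: a generic divide-and-combine sketch using
 * POSIX threads. One big sum is split into two halves, each half is
 * summed on its own thread, and the partial results are combined at
 * the end. Compile with:  cc -pthread sum.c
 */
#include <pthread.h>
#include <stdio.h>

#define N 1000000

static long long data[N];

struct chunk {
    long long *start;   /* first element this thread should sum   */
    long long  count;   /* how many elements it is responsible for */
    long long  sum;     /* its partial result                      */
};

static void *sum_chunk(void *arg) {
    struct chunk *c = arg;
    c->sum = 0;
    for (long long i = 0; i < c->count; i++)
        c->sum += c->start[i];
    return NULL;
}

int main(void) {
    for (long long i = 0; i < N; i++)
        data[i] = i;

    /* Split the problem: each half of the array goes to one thread. */
    struct chunk halves[2] = {
        { data,         N / 2,     0 },
        { data + N / 2, N - N / 2, 0 },
    };

    pthread_t workers[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&workers[i], NULL, sum_chunk, &halves[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(workers[i], NULL);

    /* Combine the partial results into the total solution. */
    printf("total = %lld\n", halves[0].sum + halves[1].sum);
    return 0;
}
```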

The problem with modern compilers is that they tend to lose the benefit of their optimization methods in the face of parallel computing. Managing parallel execution requires the use of extra code to handle the separation and combination of a given problem, and optimization happens after that additional code is introduced.

That extra code tends to confuse the optimizers: they can’t reason about what the runtime invocations do, so they play it safe and leave the surrounding code largely as it is.
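
Here is a rough illustration of why. The sketch contrasts what a serial optimizer can do with two identical pure calls against what it sees once those calls are wrapped in runtime bookkeeping. The runtime-call names (rt_enter_task, rt_exit_task, rt_sync) are made up for this sketch and defined as no-ops so it runs; in a real compiled program they would be external runtime calls whose effects the optimizer cannot see.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the runtime's bookkeeping calls; the real
 * Cilk runtime API is more involved. Defined as no-ops here so the sketch
 * compiles and runs. In a real compiled program these are external calls
 * the optimizer cannot see into. */
static void rt_enter_task(void) {}
static void rt_exit_task(void)  {}
static void rt_sync(void)       {}

static long work(long n) {            /* some pure computation */
    long s = 0;
    for (long i = 1; i <= n; i++) s += i;
    return s;
}

/* What a serial optimizer sees: two identical pure calls, so it can
 * compute work(n) once and reuse the result. */
static long serial_version(long n) {
    long a = work(n);
    long b = work(n);
    return a + b;
}

/* Roughly what the optimizer sees after a conventional parallel front end
 * has already lowered a spawned call: the same computation, but buried
 * behind opaque runtime calls. It can no longer prove the two calls are
 * redundant, so it conservatively leaves both in place. */
static long lowered_version(long n) {
    long a, b;
    rt_enter_task(); a = work(n); rt_exit_task();
    rt_enter_task(); b = work(n); rt_exit_task();
    rt_sync();
    return a + b;
}

int main(void) {
    printf("%ld %ld\n", serial_version(1000), lowered_version(1000));
    return 0;
}
```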

A Better Optimizer

Here’s where the new research comes in. The researchers are going to present a new variation on a well-known compiler that optimizes before all of that extra parallel-execution code is introduced. As a result, parallel programs finally get the full benefit of the compiler’s optimizations.

According to Charles E. Leiserson, the lead researcher on the team, the compiler “now optimizes parallel code better than any commercial or open-source compiler, and it also compiles where some of these other compilers don’t.”

The concept of optimization before adding the extra code for parallel processing has been floating around for decades, but the people who develop compilers didn’t think it was possible.

It’s too hard, they said, yet Leiserson’s group was having none of that noise.

The research flies in the face of the traditional stance on the subject, and it took modifying only 6,000 lines of a 4-million-line code base.

Breaking Things Down to Put Them Back Together

A compiler normally has three components: a front end, a back end, and a middle end.

The front end is tailored to a specific language, C for example, while the back end targets a particular processor architecture, and the middle end acts as a universal bridge that connects the two. Generally speaking, optimization happens in the middle end, so that is where the researchers focused the bulk of their efforts.

Your typical middle end uses what is called an “intermediate representation” to connect many different front and back ends, and the researchers built theirs around what they call a fork-join model of parallelism.

In a fork-join model, programs fork at certain points, splitting into operations that can be performed in parallel, then join back together later so the program can run serially until the next fork comes along.

The current version of the research team’s compiler uses a customized front end built for a fork-join language called Cilk (pronounced: silk). Leiserson’s cadre developed Cilk, which made it a prime choice for the new method, but ultimately they could have tailored the front end to any other fork-join language.

Cilk extends C by adding two new keywords: spawn and sync.

Spawn initiates a fork, while sync joins the forked work back together. That makes things easier for programmers using Cilk, but much harder for Cilk’s developers. Cilk relies on a management program called a runtime, and a compiled Cilk program has to tell the runtime explicitly when to check on the progress of computations and rebalance the cores’ assignments.
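
To see spawn and sync in action, here is the classic Cilk fibonacci example. It assumes a modern Cilk toolchain such as OpenCilk (the open-source continuation of this line of work), where the keywords are spelled cilk_spawn and cilk_sync and the program is built with something like clang -fopencilk -O3 fib.c.

```c
/* The classic Cilk fork-join example. Assumes a modern Cilk toolchain
 * (e.g. OpenCilk), where the keywords are spelled cilk_spawn / cilk_sync.
 * Build with something like:  clang -fopencilk -O3 fib.c
 */
#include <cilk/cilk.h>
#include <stdio.h>
#include <stdlib.h>

long fib(long n) {
    if (n < 2)
        return n;
    long x = cilk_spawn fib(n - 1);  /* fork: this call may run on another core  */
    long y = fib(n - 2);             /* the continuation runs alongside it       */
    cilk_sync;                       /* join: wait until the spawned call is done */
    return x + y;
}

int main(int argc, char **argv) {
    long n = (argc > 1) ? atol(argv[1]) : 30;
    printf("fib(%ld) = %ld\n", n, fib(n));
    return 0;
}
```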

Now, you might think that the only alternative to adding the runtime invocations in the front end would be to rewrite all the middle-end optimization algorithms to accommodate the fork-join model. That would allow the middle end to understand them, after all. Instead, the team added three simple commands to the intermediate representation in the middle end: detach, reattach, and sync.

With that little adjustment, many of the existing serial optimization algorithms work on parallel code without modification, and those that don’t need only minor alterations.
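
For a feel of what that buys, here is an illustration-only sketch: annotated C standing in for the middle end’s LLVM-based intermediate representation, with comments marking roughly where detach, reattach, and sync would sit for the fib example above. The function name and comments are mine, not the researchers’; the point is simply that these markers are ordinary control flow that the existing serial passes can walk past.

```c
/* Illustration only: annotated C standing in for the middle end's IR.
 * detach, reattach, and sync are plain control-flow markers in the
 * intermediate representation, not opaque runtime calls, so existing
 * serial optimization passes can analyze the code around them much as
 * they always have. */
long fib(long n);  /* the Cilk fib from the sketch above */

long fib_as_the_middle_end_sees_it(long n) {
    if (n < 2)
        return n;
    long x, y;

    /* detach:   the spawned call becomes a task that may run in parallel */
    x = fib(n - 1);
    /* reattach: marks the end of that task's code                        */

    y = fib(n - 2);   /* the continuation, visible to the optimizer as-is */

    /* sync:     wait here until the detached task has finished           */
    return x + y;
}
```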

Compiling Code and Getting Results

The team tested their method with two versions of the popular LLVM compiler: one with a modified front end that added the runtime invocations up front, and one with the new intermediate representation in the middle end, which added the runtime invocations only after optimization.

LLVM Compiler | Aosabook.org

On all but three of the Cilk programs run through both compilers, the new intermediate representation produced much more efficient software. Where it didn’t boost efficiency, the falloff was less than 2%.

According to Guy Blelloch, a professor of computer science at Carnegie Mellon University, “For the last ten years, all machines have had multicores in them. Before that, there was a huge amount of work on infrastructure for sequential compilers and sequential debuggers and everything. When multicore hit, the easiest thing to do was just to add libraries [of reusable blocks of code] on top of existing infrastructure. The next step was to have the front end of the compiler put the library calls in for you.”

Now, compilers will be able to do their optimizations where parallelism occurs, enabling previously impossible optimizations. Blelloch can’t say for sure how much benefit we’ll gain from this, but when the impossible is made possible a whole new world of potential opens up.

For those of us who know as much about compiling code as we do about the true purpose of existence, this news may not mean much. Yet, for the programmers who code our modern, technology-infused lives, this research will break the optimization of multi-core programming wide open.
