
Facebook’s TransCoder Converts Code Between Programming Languages

Facebook researchers say they have developed a neural transcompiler called TransCoder, a system that converts code from one high-level programming language, such as Java, Python, or C++, into another. The system is unsupervised: it looks for previously undetected patterns in unlabeled data sets with minimal human supervision. Reportedly, it outperforms rule-based baselines by a significant margin.

Migrating an existing codebase to a modern or more efficient language, such as C++ or Java, requires expertise in both the source and target languages, and it is often very costly. For instance, the Commonwealth Bank of Australia spent around $750 million over five years to move its platform from COBOL to Java. In theory, transcompilers can help by eliminating the need to rewrite code from scratch.

In practice, however, they are challenging to build. Different languages have different syntax, and they rely on distinct standard-library functions, variable types, and platform APIs.

Facebook’s new system tackles the challenge with an unsupervised learning approach. TransCoder, which can translate between Python, C++, and Java, is first initialized by pretraining a cross-lingual language model. This model maps pieces of code that express the same instructions to identical representations, regardless of the language they are written in. A process called denoising auto-encoding then trains the system to generate valid sequences even when it is fed noisy input. Finally, back-translation lets TransCoder generate parallel data that can be used for training.
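To make the denoising step more concrete, here is a minimal Python sketch of how a token sequence might be corrupted before a model is asked to reconstruct it. The article does not say which kinds of noise are applied, so the masking, dropping, and local shuffling below, along with the `MASK` token, the function name `add_noise`, and all probabilities, are illustrative assumptions rather than TransCoder’s actual settings.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, not taken from the article


def add_noise(tokens, rng, p_mask=0.15, p_drop=0.1, window=3):
    """Corrupt a token list by dropping, masking, and locally shuffling tokens."""
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < p_drop:
            continue                # drop the token entirely
        elif r < p_drop + p_mask:
            noisy.append(MASK)      # hide the token behind a mask
        else:
            noisy.append(tok)       # keep the token unchanged
    # Local shuffle: add a small random offset to each position and re-sort,
    # so tokens stay close to where they started.
    keys = [i + rng.uniform(0, window) for i in range(len(noisy))]
    return [tok for _, tok in sorted(zip(keys, noisy), key=lambda kv: kv[0])]


clean = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
corrupted = add_noise(clean, random.Random(0))
# A denoising auto-encoder would be trained to map `corrupted` back to `clean`.
print(corrupted)
```

The training pair is always (corrupted input, clean output), which is why the approach needs no labeled data: the clean code itself serves as the target.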

Facebook’s TransCoder

TransCoder’s cross-lingual nature arises from the large number of tokens that programming languages have in common. These include keywords such as “try,” “if,” “while,” and “for,” as well as mathematical operators, digits, and English strings that appear in source code. Back-translation serves to improve the translation quality of the system. It works by coupling a source-to-target model with a backward target-to-source model trained in parallel. The target-to-source model translates target sequences into the source language, producing noisy source sequences. The source-to-target model then learns to reconstruct the target sequences from these noisy sources, and training continues until the two models converge.
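The point about shared tokens is easy to check on a small example. The sketch below splits a Python function and a roughly equivalent Java method into tokens and intersects the two vocabularies. The regular-expression tokenizer and both snippets are assumptions made for illustration; the article does not describe how TransCoder actually tokenizes code.

```python
import re

# Rough tokenizer: identifiers/keywords, integers, or single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\w\s]")


def tokenize(code):
    return TOKEN_RE.findall(code)


PY = '''
def sum_even(n):
    total = 0
    for i in range(n):
        if i % 2 == 0:
            total += i
    return total
'''

JAVA = '''
static int sumEven(int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0) {
            total += i;
        }
    }
    return total;
}
'''

# Tokens that appear in both snippets act as anchor points across languages.
shared = set(tokenize(PY)) & set(tokenize(JAVA))
print(sorted(shared))
```

Keywords such as `for`, `if`, and `return`, the digits, and the operators all land in the intersection, while language-specific tokens such as `def` and `static` do not.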


Researchers from Facebook trained TransCoder on a public GitHub corpus that contains over 2.8 million open source repositories. The system targets translation at the function level. The researchers pretrained the model on all the available source code, then trained the denoising auto-encoding and back-translation components only on functions, alternating between the two components with batches of around 6,000 tokens.

To evaluate TransCoder’s performance, the researchers extracted 852 parallel functions in Python, C++, and Java from GeeksforGeeks, an online platform that gathers coding problems and presents solutions in several programming languages. They also developed a new metric that tests whether hypothesis functions generate the same outputs as a reference when given the same inputs.
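The idea behind such a metric can be sketched in a few lines of Python. The helper names `same_outputs` and `computational_accuracy`, the toy functions, and the test inputs below are all hypothetical; the researchers’ real evaluation harness is not described in this article.

```python
def same_outputs(reference, hypothesis, test_inputs):
    """True if the hypothesis matches the reference on every test input."""
    for args in test_inputs:
        expected = reference(*args)
        try:
            actual = hypothesis(*args)
        except Exception:
            return False            # a crashing translation counts as a failure
        if actual != expected:
            return False
    return True


def computational_accuracy(pairs, test_inputs):
    """Fraction of (reference, hypothesis) pairs that agree on all inputs."""
    if not pairs:
        return 0.0
    passed = sum(same_outputs(ref, hyp, test_inputs) for ref, hyp in pairs)
    return passed / len(pairs)


# Toy example: one reference function and two candidate "translations".
def reference_max(a, b):
    return max(a, b)

def good_translation(a, b):
    return a if a > b else b        # different text, same behaviour

def bad_translation(a, b):
    return a                        # wrong whenever b is larger

inputs = [(1, 2), (5, 3), (4, 4)]
pairs = [(reference_max, good_translation), (reference_max, bad_translation)]
print(computational_accuracy(pairs, inputs))
```

Note that `good_translation` is not textually identical to the reference, yet it passes. A metric of this kind rewards behaviour rather than surface similarity.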

The best-performing version of TransCoder did not generate many functions strictly identical to the references. Even so, its translations had high computational accuracy, meaning they produced the same outputs as the references. The researchers attribute this to the use of beam search, a decoding method that maintains a set of partially decoded sequences instead of committing to a single one.
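For readers unfamiliar with beam search, the following Python sketch shows the general technique on a toy probability table. The `toy_model` function, its probabilities, and the `</s>` end marker are invented for illustration and have nothing to do with TransCoder’s real model or decoding settings.

```python
import math


def beam_search(next_token_probs, beam_size, max_len, eos="</s>"):
    """Return up to `beam_size` (tokens, log-probability) pairs, best first."""
    beams = [([], 0.0)]             # partially decoded sequences
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, p in next_token_probs(tokens).items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the `beam_size` most probable extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            (finished if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break
    ranked = sorted(finished + beams, key=lambda c: c[1], reverse=True)
    return ranked[:beam_size]


def toy_model(prefix):
    """Hypothetical next-token distribution, keyed by the tokens so far."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "x": 0.1},
    }
    return table.get(tuple(prefix), {"</s>": 1.0})


greedy = beam_search(toy_model, beam_size=1, max_len=5)[0]
wide = beam_search(toy_model, beam_size=2, max_len=5)[0]
print(greedy[0], math.exp(greedy[1]))
print(wide[0], math.exp(wide[1]))
```

With a beam of one, the search commits to “a” because it looks best at the first step and ends with a sequence of probability 0.3. With a beam of two, it keeps “b” alive long enough to find a sequence of probability 0.36, which is the kind of benefit a wider beam can bring.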

 
