We propose GypSum, a new deep learning model that learns hybrid representations using graph neural networks and a pre-trained programming and natural language model. GypSum uses two encoders to learn from the AST-based graph and the token sequence of source code, respectively, and modifies the encoder-decoder sublayer in the Transformer's decoder to fuse the representations.
This repository was developed in the following environment:
- Python 3.7
- PyTorch 1.7.0
- How to construct graphs for Python and Java programs?
  You can use the function `get_graph_from_source` from `preprocess/java(python)_graph_construction.py`. After processing the code, save the data as `pkl` files in the `data` folder.
- How to run?
  `./run $GPU_ID$ $DATASET$ $TASK$`
  For example, to train the Java model from scratch on `cuda:0`, run the command `./run 0 java train`.
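The graph-construction step described above (build a graph per program, then save the results as a pickle in the data folder) could be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual pipeline: the signature and return type of `get_graph_from_source` are not documented here, so the sketch accepts any one-argument callable in its place. The helper `build_and_save_graphs`, the stand-in builder `toy_ast_graph`, and the output file name are all hypothetical.

```python
import ast
import pickle
from pathlib import Path
from typing import Any, Callable, Iterable, List


def build_and_save_graphs(
    sources: Iterable[str],
    build_graph: Callable[[str], Any],
    out_path: str,
) -> int:
    """Apply a graph builder to each source string and pickle the results.

    `build_graph` stands in for the repository's `get_graph_from_source`.
    Treating it as a one-argument callable is an assumption.
    """
    graphs: List[Any] = [build_graph(src) for src in sources]
    out = Path(out_path)
    # Create the target folder (e.g. data/python) if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("wb") as f:
        pickle.dump(graphs, f)
    return len(graphs)


def toy_ast_graph(source: str) -> dict:
    """Hypothetical stand-in builder: node types and parent-to-child edges
    of a Python AST. It is NOT the graph format GypSum uses."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    index = {id(node): i for i, node in enumerate(nodes)}
    edges = [
        (index[id(parent)], index[id(child)])
        for parent in nodes
        for child in ast.iter_child_nodes(parent)
    ]
    return {"nodes": [type(node).__name__ for node in nodes], "edges": edges}
```

With the repository on the import path, one would pass the real `get_graph_from_source` as `build_graph` (assuming it accepts a source string) and point `out_path` at a file under `data/java` or `data/python`, for example `build_and_save_graphs(sources, toy_ast_graph, "data/python/example_graphs.pkl")`.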
Gypsum-main
├── README.md
├── c2nl
│   ├── __init__.py
│   ├── __pycache__
│   ├── config.py
│   ├── decoders
│   ├── encoders
│   ├── eval
│   ├── inputters
│   ├── models
│   ├── modules
│   ├── objects
│   ├── tokenizers
│   ├── translator
│   └── utils
├── config
│   ├── general_config.yml
│   ├── java_xxx_xxx.yml
│   └── ...
├── data
│   ├── java
│   └── python
├── evaluation
│   ├── bleu
│   ├── evaluate.py
│   ├── meteor
│   └── rouge
├── gypsum
│   ├── __pycache__
│   ├── data
│   ├── metor.ipynb
│   ├── model.py
│   ├── modules
│   ├── predict.py
│   ├── train.py
│   └── utils
├── modules
│   ├── __pycache__
│   └── attention_zoo.py
├── preprocess
│   ├── generate_java_graph.ipynb
│   ├── java_graph_construct.py
│   ├── python_ast.ipynb
│   └── python_graph.py
└── run
We have uploaded our dataset to Google Drive: Dataset For Experiment
We borrowed and modified code from DrQA and OpenNMT. We would like to express our gratitude to the authors of these repositories.