
Commit 94a5483
committed Jan 18, 2021 by sia
1 parent 34c9484

File tree: 5 files changed, +450 −662 lines
 
Lines changed: 47 additions & 41 deletions
@@ -1,14 +1,12 @@
# CodeXGLUE -- Code Search (WebQueryTest)

## Task Description

Code Search aims to find the code snippet that best matches the demand of a query. This task can be formulated in two scenarios: a retrieval scenario and a text-code classification scenario. In WebQueryTest, we present code search in the text-code classification scenario.

In WebQueryTest, a trained model needs to judge whether a code snippet answers a given natural language query, which can be formulated as a binary classification problem.

Most existing code search datasets use code documentation or questions from online communities for software developers as queries, which still differ from real user search queries. Therefore we provide the WebQueryTest testing set.
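For intuition, here is a minimal sketch of one classification instance; the field names are hypothetical, not the dataset's actual schema:

```python
# A hypothetical WebQueryTest-style instance: does the code answer the query?
example = {
    "query": "python check if file exists",            # real web search query
    "code": "import os\nos.path.exists('/tmp/x.txt')", # candidate Python snippet
    "label": 1,                                        # 1 = the code answers the query
}
```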
## Dependency

@@ -19,9 +17,9 @@

## Data

Here we present the WebQueryTest dataset, a testing set for Python code search of 1,046 query-code pairs with code search intent and their human annotations. The real-world user queries are collected from Bing query logs, and the code for the queries comes from CodeSearchNet. You can find our testing set in `./data/test_webquery.json`.

Since there is no direct training set for our WebQueryTest dataset, we fine-tune the models on an external training set, using the documentation-function pairs in the training set of CodeSearchNet AdvTest as positive instances. For each documentation, we also randomly sample 31 more functions to form negative instances (see the sketch after the commands below). You can run the following command to download and preprocess the data:

```shell
cd data
# ... (the remaining download and preprocessing commands are elided in this diff)
cd ..
```

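As a rough illustration of the instance construction described above (a hedged sketch, not the repository's actual preprocessing code; the field names are assumptions):

```python
import random

def build_training_instances(pairs, num_negatives=31):
    """pairs: list of (documentation, function) positives from CodeSearchNet AdvTest."""
    functions = [func for _, func in pairs]
    instances = []
    for doc, func in pairs:
        instances.append({"doc": doc, "code": func, "label": 1})
        # Randomly sampled functions serve as negatives for this documentation.
        # (A real implementation would also exclude the paired function itself.)
        for neg in random.sample(functions, num_negatives):
            instances.append({"doc": doc, "code": neg, "label": 0})
    random.shuffle(instances)
    return instances
```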
#### Data statistics

Data statistics of WebQueryTest are shown in the table below:

|              | #Examples |
| :----------: | :-------: |
| WebQueryTest |   1,046   |

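A quick way to inspect the test set (assuming `test_webquery.json` holds a JSON list of labeled query-code records; the exact schema may differ):

```python
import json

with open("./data/test_webquery.json") as f:
    examples = json.load(f)

print(len(examples))  # expected: 1,046 query-code pairs
print(examples[0])    # one annotated query-code record
```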
## Fine-tuning
@@ -50,73 +48,81 @@

You can use the following command to finetune:

```shell
python code/run_classifier.py \
--model_type roberta \
--do_train \
--do_eval \
--eval_all_checkpoints \
--train_file train_codesearchnet_7.json \
--dev_file dev_codesearchnet.json \
--max_seq_length 200 \
--per_gpu_train_batch_size 16 \
--per_gpu_eval_batch_size 16 \
--learning_rate 1e-5 \
--num_train_epochs 3 \
--gradient_accumulation_steps 1 \
--warmup_steps 5000 \
--evaluate_during_training \
--data_dir ./data/ \
--output_dir ./model \
--encoder_name_or_path microsoft/codebert-base
```

## Evaluation

To test on WebQueryTest, run the following command; it will also automatically write predictions to `--prediction_file`:

```shell
python code/run_classifier.py \
--model_type roberta \
--do_predict \
--test_file test_webquery.json \
--max_seq_length 200 \
--per_gpu_eval_batch_size 2 \
--data_dir ./data \
--output_dir ./model/checkpoint-best-aver/ \
--encoder_name_or_path microsoft/codebert-base \
--pred_model_dir ./model/checkpoint-last/ \
--prediction_file ./evaluator/webquery_predictions.txt
```

After generating predictions for WebQueryTest, you can use our provided script to evaluate:

```shell
python evaluator/evaluator.py \
--answers_webquery ./evaluator/webquery_answers.txt \
--predictions_webquery ./evaluator/webquery_predictions.txt
```
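Roughly, the evaluator compares the two label files example by example; a hedged sketch of the accuracy computation, assuming one `idx<TAB>label` line per example (the actual file format may differ), is:

```python
def read_labels(path):
    """Parse 'idx<TAB>label' lines into a dict; this format is an assumption."""
    labels = {}
    with open(path) as f:
        for line in f:
            idx, label = line.strip().split("\t")
            labels[idx] = int(label)
    return labels

answers = read_labels("./evaluator/webquery_answers.txt")
preds = read_labels("./evaluator/webquery_predictions.txt")

correct = sum(1 for idx in answers if preds.get(idx) == answers[idx])
print(f"Accuracy: {correct / len(answers):.2%}")
```
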
## Results

The results on WebQueryTest are shown below:

|   dataset    |  model   |  F1   | Accuracy |
| :----------: | :------: | :---: | :------: |
| WebQueryTest | RoBERTa  | 57.49 |  40.92   |
| WebQueryTest | CodeBERT | 58.95 |  53.37   |

## Cite

If you use this code or our WebQueryTest dataset, please consider citing CodeXGLUE and CodeSearchNet:

```
@article{CodeXGLUE,
  title={CodeXGLUE: An Open Challenge for Code Intelligence},
  journal={arXiv},
  year={2020},
}
```

```
@article{husain2019codesearchnet,
  title={CodeSearchNet challenge: Evaluating the state of semantic code search},
  author={Husain, Hamel and Wu, Ho-Hsiang and Gazit, Tiferet and Allamanis, Miltiadis and Brockschmidt, Marc},
  journal={arXiv preprint arXiv:1909.09436},
  year={2019}
}
```
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
import torch
import torch.nn as nn
from transformers.modeling_utils import PreTrainedModel


class Model(PreTrainedModel):
    """Binary classifier: does a code snippet answer a natural language query?"""

    def __init__(self, encoder, config, tokenizer, args):
        super(Model, self).__init__(config)
        self.encoder = encoder  # e.g. a RoBERTa/CodeBERT encoder
        self.config = config
        self.tokenizer = tokenizer
        # Classification head over the concatenated pair features
        # (768 is the encoder hidden size).
        self.mlp = nn.Sequential(nn.Linear(768 * 4, 768),
                                 nn.Tanh(),
                                 nn.Linear(768, 1),
                                 nn.Sigmoid())
        self.loss_func = nn.BCELoss()
        self.args = args

    def forward(self, code_inputs, nl_inputs, labels, return_vec=False):
        bs = code_inputs.shape[0]
        # Encode code and NL in a single batch; token id 1 is RoBERTa's padding token.
        inputs = torch.cat((code_inputs, nl_inputs), 0)
        outputs = self.encoder(inputs, attention_mask=inputs.ne(1))[1]  # pooled output
        code_vec = outputs[:bs]
        nl_vec = outputs[bs:]
        if return_vec:
            return code_vec, nl_vec

        # Pair features: [nl; code; nl - code; nl * code]
        logits = self.mlp(torch.cat((nl_vec, code_vec, nl_vec - code_vec, nl_vec * code_vec), 1))
        logits = logits.squeeze(-1)  # (batch,), matching the shape of `labels`
        loss = self.loss_func(logits, labels.float())
        predictions = (logits > 0.5).int()  # (batch,)
        return loss, predictions
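
A minimal smoke test of this head with dummy inputs (a sketch; the real batching and training loop live in `code/run_classifier.py`, and `args` is unused here):

```python
import torch
from transformers import RobertaConfig, RobertaModel, RobertaTokenizer

config = RobertaConfig.from_pretrained("microsoft/codebert-base")
tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base")
encoder = RobertaModel.from_pretrained("microsoft/codebert-base")
model = Model(encoder, config, tokenizer, args=None)

query = "how to read a file line by line in python"
code = "with open(path) as f:\n    for line in f:\n        print(line)"
nl_inputs = tokenizer(query, return_tensors="pt", padding="max_length",
                      truncation=True, max_length=200)["input_ids"]
code_inputs = tokenizer(code, return_tensors="pt", padding="max_length",
                        truncation=True, max_length=200)["input_ids"]
labels = torch.tensor([1])

loss, predictions = model(code_inputs, nl_inputs, labels)
print(loss.item(), predictions)  # prediction: 1 if the code answers the query
```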
