text2sql-data/systems/sequence-to-sequence at master · jkkummerfeld/text2sql-data

This directory contains the code for seq2seq with attention-based copying from the input sequence.
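As a rough illustration of what attention-based copying means, the sketch below shows one common formulation: the decoder's scores for vocabulary words and its attention scores over the input tokens share a single softmax, so a token that appears only in the question can still be emitted. This is an assumption-laden toy, not code taken from this directory; the function name `copy_augmented_softmax` and all of the scores are hypothetical.

```python
import math


def copy_augmented_softmax(vocab, vocab_scores, input_tokens, attention_scores):
    """Return one probability per output token, pooling generate and copy actions.

    Hypothetical helper: generation scores and attention (copy) scores are
    concatenated and normalized together with a single softmax.
    """
    scores = list(vocab_scores) + list(attention_scores)
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)

    probs = {}
    candidates = list(vocab) + list(input_tokens)
    for token, e in zip(candidates, exps):
        # A token reachable by both generating and copying gets the summed mass.
        probs[token] = probs.get(token, 0.0) + e / total
    return probs


# Made-up scores for a single decoding step.
vocab = ["SELECT", "FROM", "WHERE", "course"]
question = ["who", "teaches", "EECS", "280"]
dist = copy_augmented_softmax(vocab, [2.0, 1.0, 0.5, 0.1],
                              question, [0.0, 0.2, 3.0, 1.5])
best = max(dist, key=dist.get)
```

In this toy step the highest-scoring action is copying "EECS", a token that is not in the output vocabulary at all, which is the situation copying is meant to handle.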

Requirements:

tensorflow==1.3.0 OR tensorflow-gpu==1.3.0
nltk==3.2.5 and punkt tokenizer models
numpy==1.13.1
PyYAML==3.11
scikit-learn==0.18.2
scipy==0.19.0
sqlparse==0.2.4
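
Because the versions above are pinned exactly, it can help to compare them against the active environment before running anything. The following is a minimal sketch, assuming the environment is described by `pip freeze` style text (`name==version` per line) and that package names are spelled exactly as in the list above; `parse_freeze`, `find_mismatches`, `PINNED`, and `ALTERNATIVES` are hypothetical names, not part of this repository. It does not check for the punkt tokenizer models.

```python
# Pins copied from the requirements list above.
PINNED = {
    "tensorflow": "1.3.0",
    "nltk": "3.2.5",
    "numpy": "1.13.1",
    "PyYAML": "3.11",
    "scikit-learn": "0.18.2",
    "scipy": "0.19.0",
    "sqlparse": "0.2.4",
}

# tensorflow-gpu at the same version is accepted in place of tensorflow.
ALTERNATIVES = {"tensorflow": ["tensorflow-gpu"]}


def parse_freeze(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    installed = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # skip lines that are not exact pins
            installed[name] = version
    return installed


def find_mismatches(installed):
    """Return {package: (wanted, found)}; found is None if nothing is installed."""
    mismatches = {}
    for name, wanted in PINNED.items():
        candidates = [name] + ALTERNATIVES.get(name, [])
        found = [installed[c] for c in candidates if c in installed]
        if wanted not in found:
            mismatches[name] = (wanted, found[0] if found else None)
    return mismatches
```

Passing the captured output of `pip freeze` through `parse_freeze` and then `find_mismatches` yields an empty dict when every pin is satisfied. The code avoids Python 3 only syntax, since the example commands in this README use python2.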

To run an example, starting from this directory:

export S2S_HOME=$(pwd)
./prep_advising.sh
python2 config_builder.py experimental_configs/example_config.yml
cd models/copy_input/advising_query_split/
./experiment.sh

The console output of prep_advising.sh should resemble the contents of sample_output/expected_prep_script_output.txt, and the console output of config_builder.py should resemble sample_output/expected_config_builder_output.txt.

The experiment will take several hours to run. Its output will be in models/copy_input/advising_query_split/quick_eval.txt (dev set) and models/copy_input/advising_query_split/quick_eval_train.txt (train set). Examples of these output files are in sample_output/.

To run your own experiments, modify prep_advising.sh to refer to the dataset of your choice, and adjust the hyperparameters in the config YAML file. Note that the hyperparameters in example_config.yml are an example only. Actual hyperparameters used in the paper are available here.

If you use this code, please cite our ACL paper:

@InProceedings{data-sql-advising,
 author    = {Catherine Finegan-Dollak and Jonathan K. Kummerfeld and Li Zhang and Karthik Ramanathan and Sesh Sadasivam and Rui Zhang and Dragomir Radev},
 title     = {Improving Text-to-SQL Evaluation Methodology},
 booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
 month     = {July},
 year      = {2018},
 address   = {Melbourne, Victoria, Australia},
 pages     = {351--360},
 url       = {http://aclweb.org/anthology/P18-1033},
}

This code is built on top of tf-seq2seq, which is documented here and available here as of July 2018.
