Description
I tried to run this project and ran into some problems. Following the instructions in README.md, I installed the requirements and ran the following command:
python run.py --run Balsa_JOBRandSplit --local
The only difference is that I'm using PostgreSQL 13.5, and I modified the default connection settings in pg_executor/pg_executor/pg_executor.py to fit my environment:
LOCAL_DSN = "dbname=imdb user=postgres"
An error occurred the first time I ran it, with the following traceback:
Traceback (most recent call last):
File "run.py", line 2155, in <module>
app.run(Main)
File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "run.py", line 2151, in Main
agent.Run()
File "run.py", line 2100, in Run
has_timeouts = self.RunOneIter()
File "run.py", line 1831, in RunOneIter
model, dataset = self.Train()
File "run.py", line 1221, in Train
log=not train_from_scratch)
File "run.py", line 883, in _MakeDatasetAndLoader
skip_training_on_timeouts=p.skip_training_on_timeouts)
File "/xxx/balsa/balsa/experience.py", line 560, in featurize
skip_training_on_timeouts=skip_training_on_timeouts)
File "/xxx/balsa/balsa/experience.py", line 393, in _featurize_dedup
self.featurizer, all_subtrees)
File "/xxx/balsa/balsa/experience.py", line 38, in TreeConvFeaturize
plan_featurizer)
File "/xxx/balsa/balsa/models/treeconv.py", line 268, in make_and_featurize_trees
indexes = torch.from_numpy(_batch([_make_indexes(x) for x in trees])).long()
File "/xxx/balsa/balsa/models/treeconv.py", line 268, in <listcomp>
indexes = torch.from_numpy(_batch([_make_indexes(x) for x in trees])).long()
File "/xxx/balsa/balsa/models/treeconv.py", line 218, in _make_indexes
preorder_ids, _ = _make_preorder_ids_tree(root)
File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
root_index=root_index + 1)
File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
root_index=root_index + 1)
File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
root_index=root_index + 1)
[Previous line repeated 1 more time]
File "/xxx/balsa/balsa/models/treeconv.py", line 199, in _make_preorder_ids_tree
root_index=lhs_max_id + 1)
File "/xxx/balsa/balsa/models/treeconv.py", line 197, in _make_preorder_ids_tree
root_index=root_index + 1)
File "/xxx/balsa/balsa/models/treeconv.py", line 199, in _make_preorder_ids_tree
root_index=lhs_max_id + 1)
File "/xxx/balsa/balsa/models/treeconv.py", line 198, in _make_preorder_ids_tree
rhs, rhs_max_id = _make_preorder_ids_tree(curr.children[1],
IndexError: list index out of range
I used print() to inspect curr and found that the loaded experience contains a Bitmap Heap Scan node, which has only one child:
print(curr, curr.children, root_index)
// Bitmap Heap Scan [movie_keyword AS mk] cost=1131.96
// Bitmap Index Scan [movie_keyword AS mk] cost=6.74
// [Bitmap Index Scan [movie_keyword AS mk] cost=6.74
//] 10
I skipped the issue by treating the Bitmap Heap Scan node as a leaf node, but I hit another error when I restarted run.py.
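For reference, the workaround amounts to something like the sketch below. It is not the actual code in balsa/models/treeconv.py: the `Node` class, the function name `make_preorder_ids`, and the tuple return format are all hypothetical stand-ins. Only the recursion pattern (`root_index + 1` for the left child, `lhs_max_id + 1` for the right child) is taken from the traceback above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a plan node; not Balsa's actual class."""
    node_type: str
    children: List["Node"] = field(default_factory=list)


def make_preorder_ids(curr, root_index=1):
    """Assigns preorder IDs to a binary plan tree.

    Returns (tree, max_id), where tree is (id,) for a leaf and
    (id, left, right) for a join node. This return format is an assumption.
    """
    # Workaround: any node with fewer than two children is treated as a leaf.
    # This covers a Bitmap Heap Scan, whose only child is a Bitmap Index Scan,
    # so curr.children[1] is never accessed for it.
    if len(curr.children) < 2:
        return (root_index,), root_index
    lhs, lhs_max_id = make_preorder_ids(curr.children[0], root_index + 1)
    rhs, rhs_max_id = make_preorder_ids(curr.children[1], lhs_max_id + 1)
    return (root_index, lhs, rhs), rhs_max_id


# A join whose left input is a Bitmap Heap Scan with a single child.
bitmap = Node("Bitmap Heap Scan", [Node("Bitmap Index Scan")])
join = Node("Hash Join", [bitmap, Node("Seq Scan")])
tree, max_id = make_preorder_ids(join)
```

With that change in place, the next run failed with the traceback below.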
Traceback (most recent call last):
File "run.py", line 2155, in <module>
app.run(Main)
File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/xxx/.conda/envs/balsa/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "run.py", line 2151, in Main
agent.Run()
File "run.py", line 2100, in Run
has_timeouts = self.RunOneIter()
File "run.py", line 1831, in RunOneIter
model, dataset = self.Train()
File "run.py", line 1221, in Train
log=not train_from_scratch)
File "run.py", line 883, in _MakeDatasetAndLoader
skip_training_on_timeouts=p.skip_training_on_timeouts)
File "/xxx/balsa/balsa/experience.py", line 560, in featurize
skip_training_on_timeouts=skip_training_on_timeouts)
File "/xxx/balsa/balsa/experience.py", line 393, in _featurize_dedup
self.featurizer, all_subtrees)
File "/xxx/balsa/balsa/experience.py", line 38, in TreeConvFeaturize
plan_featurizer)
File "/xxx/balsa/balsa/models/treeconv.py", line 269, in make_and_featurize_trees
_batch([_featurize_tree(x, node_featurizer) for x in trees
File "/xxx/balsa/balsa/models/treeconv.py", line 269, in <listcomp>
_batch([_featurize_tree(x, node_featurizer) for x in trees
File "/xxx/balsa/balsa/models/treeconv.py", line 255, in _featurize_tree
_bottom_up(curr_node)
File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
left_vec = _bottom_up(curr.children[0])
File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
left_vec = _bottom_up(curr.children[0])
File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
left_vec = _bottom_up(curr.children[0])
[Previous line repeated 1 more time]
File "/xxx/balsa/balsa/models/treeconv.py", line 250, in _bottom_up
right_vec = _bottom_up(curr.children[1])
File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
left_vec = _bottom_up(curr.children[0])
File "/xxx/balsa/balsa/models/treeconv.py", line 250, in _bottom_up
right_vec = _bottom_up(curr.children[1])
File "/xxx/balsa/balsa/models/treeconv.py", line 249, in _bottom_up
left_vec = _bottom_up(curr.children[0])
File "/xxx/balsa/balsa/models/treeconv.py", line 246, in _bottom_up
vec = node_featurizer.FeaturizeLeaf(curr)
File "/xxx/balsa/balsa/util/plans_lib.py", line 726, in FeaturizeLeaf
scan_operator_idx = np.where(self.scan_ops == node.node_type)[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0
I found that this error is caused by scan methods that are not included in the search_space_scan_ops parameter. In BalsaAgent._MakeWorkload() in run.py, the JOB queries are loaded and their PostgreSQL plans are obtained through explain (costs, format json). The loaded train_nodes are then used to initialize the experience set (run.py, line 826).
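To illustrate the failure, here is a minimal sketch. The contents of `scan_ops` are assumed for illustration (I have not checked the real value of search_space_scan_ops), and `scan_op_index` and `scan_types_in_plan` are hypothetical helpers, not functions from Balsa. The `np.where(...)[0][0]` lookup is the one shown in the traceback; the plan dictionary follows the standard `Node Type` / `Plans` layout of PostgreSQL's JSON explain output.

```python
import numpy as np

# Assumed search space, for illustration only. The real value comes from the
# search_space_scan_ops parameter of the experiment config.
scan_ops = np.array(["Seq Scan", "Index Scan", "Index Only Scan"])


def scan_op_index(scan_ops, node_type):
    """Looks up a scan operator and fails with a readable message if missing."""
    matches = np.where(scan_ops == node_type)[0]
    if matches.size == 0:
        # Without this check, matches[0] raises
        # "IndexError: index 0 is out of bounds for axis 0 with size 0".
        raise ValueError(
            f"scan operator {node_type!r} is not in {scan_ops.tolist()}")
    return int(matches[0])


def scan_types_in_plan(plan):
    """Collects scan node types from one "Plan" dict of JSON explain output."""
    found = set()
    if plan["Node Type"].endswith("Scan"):
        found.add(plan["Node Type"])
    for child in plan.get("Plans", []):
        found |= scan_types_in_plan(child)
    return found


# Hand-written plan fragment shaped like the node printed earlier.
example_plan = {
    "Node Type": "Hash Join",
    "Plans": [
        {"Node Type": "Bitmap Heap Scan",
         "Plans": [{"Node Type": "Bitmap Index Scan"}]},
        {"Node Type": "Seq Scan"},
    ],
}
unsupported = sorted(scan_types_in_plan(example_plan) - set(scan_ops.tolist()))
```

Running a check like `scan_types_in_plan` over the expert plans before they are added to the experience set would show up front which scan types the featurizer cannot handle.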
I'm also confused about the overall procedure. According to the paper, Balsa bootstraps from a simulator and never uses an expert optimizer. So is it OK to simply replace train_nodes with an empty list? And if I want to use a different train/test split, how can I train the model with the simulator to get a checkpoint?