01. User Guide
- Create and activate a conda environment with Python 3.13:

  ```shell
  conda create -n repoaudit python=3.13
  conda activate repoaudit
  ```
- Install the required dependencies:

  ```shell
  cd RepoAudit
  pip install -r requirements.txt
  ```
- Ensure you have the Tree-sitter library and language bindings installed:

  ```shell
  cd lib
  python build.py
  ```
- Configure the OpenAI API key and Anthropic API key (the `echo` form appends the exports to `~/.bashrc` so they persist across sessions):

  ```shell
  echo 'export OPENAI_API_KEY=xxxxxx' >> ~/.bashrc
  echo 'export ANTHROPIC_API_KEY=xxxxxx' >> ~/.bashrc
  source ~/.bashrc
  ```
- We have prepared several benchmark programs in the `benchmark` directory for a quick start. Some of these are submodules, so you may need to initialize them using the following commands:

  ```shell
  cd RepoAudit
  git submodule update --init --recursive
  ```
- We provide the script `src/run_repoaudit.sh` to scan files in the `benchmark/Java/toy/NPD` directory. You can run the following commands:

  ```shell
  cd src
  sh run_repoaudit.sh
  ```
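If a run fails immediately with authentication errors, first confirm that the API keys exported during setup are visible in the current shell. A minimal check:

```shell
# Print whether each required key is present in the environment
# (values are intentionally not echoed).
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set" || echo "OPENAI_API_KEY is MISSING"
[ -n "$ANTHROPIC_API_KEY" ] && echo "ANTHROPIC_API_KEY is set" || echo "ANTHROPIC_API_KEY is MISSING"
```

Keys exported only in `~/.bashrc` require a new shell or `source ~/.bashrc` before they appear here.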
- After the scanning is complete, you can check the resulting JSON and log files.
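The report location depends on your run configuration, so the path below is a placeholder; check the log output of your run for the actual file RepoAudit wrote. Once located, a report can be pretty-printed like this:

```shell
# NOTE: 'result/report.json' is a placeholder path, not RepoAudit's
# documented output location -- substitute the path from your run's logs.
REPORT=result/report.json
if [ -f "$REPORT" ]; then
  python3 -m json.tool "$REPORT" | head -40   # show the first lines, formatted
else
  echo "No report found at $REPORT"
fi
```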
For a large repository, a sequential analysis can be quite time-consuming. To accelerate the analysis, you can enable parallel auditing by setting the option `--max-neural-workers` to a larger value. By default, this option is set to 6.
Also, the parsing-based analysis runs in parallel mode by default, with a maximum of 10 workers.
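When choosing a value for `--max-neural-workers`, the machine's core count is a reasonable starting point (assuming, as seems likely, that the per-worker LLM calls are mostly I/O-bound). A small sketch for deriving a candidate value:

```shell
# Derive a candidate worker count from the number of online CPUs,
# falling back to the documented default of 6 if detection fails.
WORKERS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 6)
echo "Suggested: --max-neural-workers $WORKERS"
```

Pass the resulting value wherever `run_repoaudit.sh` invokes the analysis.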
We also provide a web interface to help users check the bug reports generated by RepoAudit. You can execute the following commands to start the Web UI:
```shell
cd RepoAudit
streamlit run src/ui/web_ui.py
```

Streamlit prints output similar to:

```
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8505
  Network URL: http://10.145.21.28:8505
  External URL: http://128.210.0.165:8505
```
Open the webpage via one of the above links. You will see the following UI page.

By clicking each bug report, you can further examine the functions in the bug trace and the LLM-generated explanation. Lastly, you can classify each report as TP/FP/Unknown, and the labeled results are stored locally on your machine.
