Add TensorBoard logging for AutoTuner sweep mode #3780
Conversation
* Introduced `TensorBoardLogger` class for logging metrics during sweeps.
* Updated `sweep` function to integrate TensorBoard logging.
* Enhanced `consumer` function to log metrics after each parameter run.

Signed-off-by: Jack Luar <jluar@precisioninno.com>
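For orientation, here is a minimal sketch of what such a logger and its use from the consumer side might look like, assuming it wraps `SummaryWriter` from `torch.utils.tensorboard` (or `tensorboardX`); the class name matches the PR description, but the method names, arguments, and directory layout are assumptions rather than the PR's actual code:

```python
from torch.utils.tensorboard import SummaryWriter


class TensorBoardLogger:
    """Logs one scalar per metric for each sweep run (illustrative sketch)."""

    def __init__(self, log_dir):
        self.writer = SummaryWriter(log_dir=log_dir)

    def log_metrics(self, metrics, step):
        # metrics: dict of name -> float collected after a parameter run
        for name, value in metrics.items():
            self.writer.add_scalar(name, value, global_step=step)
        self.writer.flush()

    def close(self):
        self.writer.close()


# Hypothetical consumer-side use: log once per finished parameter run.
def log_sweep_results(results, log_dir="./logs/sweep"):
    logger = TensorBoardLogger(log_dir)
    for step, (params, metrics) in enumerate(results):
        logger.log_metrics(metrics, step)
    logger.close()
```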
@jeffng-or Back-ported the feature, could you please check out this branch and let me know if it works?
Great, thanks! I will check it out and let you know how it goes.
It looks like the code is trying to write the SDC file into tools/AutoTuner/src/constraint.sdc, which isn't writable and also isn't a trial-specific directory. I'm running within a Docker container where I've mounted the tools/AutoTuner/src/autotuner directory, but not tools/AutoTuner/src, so the src directory is not writable.

Here's the script that I use to start the container:

Here's the Dockerfile that I used to build the autotuner:1.0 container:

To build the docker image:

To start the container:

Within the container:
vvbandeira left a comment:
@luarss Please address Jeff's concerns and request a new review when he is satisfied.
So, here are some differences that I see between tune and sweep:
Maybe we should be writing the SDC file under the experiment directory, which would be under flow/logs? At least we'd know that the directory is writable. After I make the change, the AT starts running trials. As it's running, I'm noticing the following:
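As a sketch of that suggestion, writing the constraint file into the per-experiment directory (known to be writable) rather than tools/AutoTuner/src could look roughly like this; the helper name, arguments, and exact layout are assumptions, not the PR's actual code:

```python
import os


def write_sdc(content, experiment_dir, trial_name="base"):
    """Write the SDC into the experiment's own directory (e.g. under flow/logs),
    which is known to be writable, instead of tools/AutoTuner/src/."""
    sdc_dir = os.path.join(experiment_dir, trial_name)
    os.makedirs(sdc_dir, exist_ok=True)
    sdc_path = os.path.join(sdc_dir, "constraint.sdc")
    with open(sdc_path, "w") as f:
        f.write(content)
    return sdc_path
```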
The job ran overnight without completing, so there's something off. Please use the following flow for testing: autotuner.json

Once it works for you, I can try again.
…ing sdc files to correct dir

Signed-off-by: Jack Luar <jluar@precisioninno.com>
Force-pushed 56771e4 to f1f1920.
Great, thanks! I'm able to run through a sweep with rapidus2hp gcd and view the data in TensorBoard. The "score" should roughly match the "metric" in tune mode; check out the metric name and calculation in the evaluate() method (basically, the metric should be 100 * effective_clk_period plus some other terms). Other than that, it's good to go. I'll update my chart generation code to key off the sweep results.
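For context, a rough sketch of the shape of that calculation, assuming the effective clock period is the requested period degraded by negative slack; the exact definition, extra terms, and coefficients in the real evaluate() are not reproduced here:

```python
def compute_metric(clk_period, worst_slack, num_drc=0, drc_penalty=1.0):
    """Illustrative score in the spirit of tune mode's metric:
    roughly 100 * effective clock period, plus other penalty terms."""
    # One common definition (an assumption here): negative slack makes the
    # effective period longer; positive slack leaves it at the target.
    effective_clk_period = clk_period - min(worst_slack, 0.0)
    return 100 * effective_clk_period + drc_penalty * num_drc
```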
Frequency sweep is working fine. I've ported my chart generator to use the sweep directory organization. So, make the update to the metrics calculation and I think we can call that done.

One quirk that I found when trying to run the physical sweep is that a choice of strings isn't supported in the sweep. My autotuner_phys.json looks like:

But, when I run it:

likely because the PLACE_SITE doesn't have a minmax. How do we enable this in sweep mode?
The expected config here would be to use "choice" and not "string", which I don't think the AT sweep supports in this version. The easy solution is to run AT twice, since there are only two possible values for that variable. If you would like to expand to N possibilities, we should be able to add support as well in another PR.
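To illustrate the distinction, here is what the two kinds of entries might look like, written as a Python dict mirroring the JSON config; the exact schema and key names are assumptions for illustration, not taken from the repository:

```python
# A numeric parameter swept over a minmax range, vs. a categorical one.
sweep_params = {
    "CLK_PERIOD": {
        "type": "float",
        "minmax": [3.0, 5.0],   # sweepable: has a numeric range
        "step": 0.5,
    },
    # What a "choice" entry might look like; sweep mode in this version
    # does not handle it, so each value would need its own AT run.
    "PLACE_SITE": {
        "type": "choice",
        "values": ["siteA", "siteB"],
    },
}
```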
OK, that makes sense. I'd prefer not to have to run AT twice, so let me file another GH issue for choice/string support. That way we can enable this PR, which works. Filed: #3809
@jeffng-or Fixed the scoring, could you please check? We should be using the same scoring module for both tune/sweep now.
Yeah, it looks like it works. For consistency, can we change the name of "score" to "metric" to match the tune mode? Also, I noticed that if the score/metric is 9e99, the resulting score in TensorBoard is shown as 0.
Sure, I can change the key. Should I show it as 9e99 or 0?
9e99 is probably best. Users can filter it out if they want.
* metrics: show error scores as 9e99

Signed-off-by: Jack Luar <jluar@precisioninno.com>
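A sketch of what that fix amounts to: report failed runs with the 9e99 sentinel instead of letting them collapse to 0 in TensorBoard. The constant name and call site here are illustrative assumptions, not the PR's actual code:

```python
ERROR_METRIC = 9e99  # sentinel for failed or incomplete runs


def log_run(writer, metrics, step):
    """Log a run's metric; failed runs keep the 9e99 sentinel instead of 0,
    so users can spot (and filter out) errored runs in TensorBoard."""
    value = metrics.get("metric")
    if value is None:  # run crashed or produced no result
        value = ERROR_METRIC
    writer.add_scalar("metric", value, global_step=step)
```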
@jeffng-or Could you please retry with the latest diffs?
The metric name change looks fine. I've been running with this and the string choice code for the past couple of days without a problem. Looks good to go. Thanks!

TensorBoardLoggerclass for logging metrics during sweeps.sweepfunction to integrate TensorBoard logging.consumerfunction to log metrics after each parameter run.