Conversation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
| export frule, rrule # core function | ||
| # rule configurations | ||
| export RuleConfig, HasReverseMode, NoReverseMode, HasForwardsMode, NoForwardsMode | ||
| export RuleConfig, HasReverseMode, NoReverseMode, HasForwardsMode, NoForwardsMode, HasChunkedMode, NoChunkedMode |
There was a problem hiding this comment.
[JuliaFormatter] reported by reviewdog 🐶
| export RuleConfig, HasReverseMode, NoReverseMode, HasForwardsMode, NoForwardsMode, HasChunkedMode, NoChunkedMode | |
| export RuleConfig, | |
| HasReverseMode, | |
| NoReverseMode, | |
| HasForwardsMode, | |
| NoForwardsMode, | |
| HasChunkedMode, | |
| NoChunkedMode |
| function frule(::RuleConfig{>:HasChunkedMode}, | ||
| (Δf, Δx)::Tuple{Any,ProductTangent}, f, args...) |
There was a problem hiding this comment.
[JuliaFormatter] reported by reviewdog 🐶
| function frule(::RuleConfig{>:HasChunkedMode}, | |
| (Δf, Δx)::Tuple{Any,ProductTangent}, f, args...) | |
| function frule( | |
| ::RuleConfig{>:HasChunkedMode}, (Δf, Δx)::Tuple{Any,ProductTangent}, f, args... | |
| ) |
| function rrule(::RuleConfig{>:HasChunkedMode}, args...) | ||
| y, back = rrule(args...) | ||
| return y, ApplyBack(back) | ||
| end |
There was a problem hiding this comment.
I know I suggested this RuleConfig idea, but I'm not sure it fits. If there is an un-chunked rrule which needs the configuration, then this rrule(args...) can't call it.
I think it might be cleanest to have a new function, whose fallback is chunked_rrule(config, f, args...) = rrule(config, f, args...). Any AD which may make chunks should always call this, and any AD rule to exploit chunks will overload it.
Not sure how that intersects with @opt_out. Maybe there are other weird corners to think about too.
There was a problem hiding this comment.
yeah. We might need it to be separate. It feels clunky, but this method also doesn't seem great.
There was a problem hiding this comment.
Hmm ideally we want to hit the version with this particular trait dropped, and other traits preserved (like HasReverseMode etc)
We should be able to achieve that with invoke, but can we do it elegantly?
| struct ProductTangent{P} | ||
| partials::P | ||
| end |
There was a problem hiding this comment.
How much is gained from this kind of chunking? I mean storing several partials in some tuple. For reverse mode, nothing, I think. For forward mode, maybe derivatives_using_output has the same benefits.
The big gain seems to be from making a solid matrix to represent many vectors, and e.g. getting matrix multiplication. So I think the initial design ought to make that work.
There was a problem hiding this comment.
This design does this (I think). partials here can either be a Tuple{Tuple} or an eachrow(Matrix)
There was a problem hiding this comment.
Ok, I guess a vector of vectors can have the same meaning as an EachCol. Just a different path, by dispatch on ProductTangent{<:EachSlice}.
|
Having some tests would prove illuminating I think |
based on discussion JuliaDiff/Diffractor.jl#54 and in slack. This is an initial implementation of forward and reverse chunked mode AD.