Add LazyKernelMatrix and lazykernelmatrix #515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Draft

sethaxen wants to merge 4 commits into JuliaGaussianProcesses:master

from sethaxen:lazykernelmat

Draft

Add LazyKernelMatrix and lazykernelmatrix #515

sethaxen wants to merge 4 commits into JuliaGaussianProcesses:master from sethaxen:lazykernelmat

Conversation

@sethaxen

Copy link

@sethaxen sethaxen commented Jun 2, 2023

Summary

This PR introduces functionality for lazily representing kernel matrices, which is necessary when the matrix might be too large to store in memory. Fixes #514

Proposed changes

new API function lazykernelmatrix: supports similar semantics as kernelmatrix but constructs a lazy representation
new AbstractMatrix subtype LazyKernelMatrix, constructed for the lazykernelmatrix default.

What alternatives have you considered?

I considered using one of the many existing array types in the ecosystem for lazily representing an array, but defining a novel type allows us to do things like perform scalar multiplication or add kernel matrices without densifying the array.
it is not strictly necessary to introduce lazykernelmatrix, but this could allow even more structured matrix types to be defined for specific kernel types and returned in the future.
perhaps LazyKernelMatrix should also store obsdim1 and obsdim2? Currently we require the user has passed a vector e.g. RowVecs or ColVecs.
perhaps we should support y being a nothing to define a symmetric kernel matrix? This would allow a couple checks to be done at compile time. In particular we could support +(::LazyKernelMatrix{T,Tk,Tx,Nothing}, ::Diagonal) -> LazyKernelMatrix.

To-Do

add tests

sethaxen added 4 commits

June 2, 2023 14:54

@sethaxen


 Add LazyKernelMatrix and lazykernelmatrix

ddc8994

@sethaxen


 Update API docs

70ddd03

@sethaxen


 Update kernelmatrix docstring

32f17d6

@sethaxen


 Update more docs

4d7d0b2

@codecov

Copy link

codecov bot commented Jun 2, 2023 •

edited

Loading

Codecov Report

Patch coverage has no change and project coverage change: -6.52 ⚠️

Comparison is base (ef6d459) 94.16% compared to head (4d7d0b2) 87.64%.

Additional details and impacted files

@@ Coverage Diff @@
## master #515 +/- ##
==========================================
- Coverage 94.16% 87.64% -6.52% 
==========================================
 Files 52 53 +1 
 Lines 1387 1433 +46 
==========================================
- Hits 1306 1256 -50 
- Misses 81 177 +96

Impacted Files	Coverage Δ
src/KernelFunctions.jl	`100.00% <ø> (ø)`
src/matrix/kernelmatrix.jl	`100.00% <ø> (ø)`
src/matrix/lazykernelmatrix.jl	`0.00% <0.00%> (ø)`

... and 9 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@devmotion

Copy link

Member

devmotion commented Jun 2, 2023

this could allow even more structured matrix types to be defined for specific kernel types and returned in the future

Yes, I imagine this would be useful also in the context of #93 (comment). Though there maybe often the non-lazy version might be sufficient (or even preferable).

devmotion

devmotion reviewed

Jun 2, 2023

View reviewed changes

src/KernelFunctions.jl

module KernelFunctions

export kernelmatrix, kernelmatrix!, kernelmatrix_diag, kernelmatrix_diag!

export LazyKernelMatrix, lazykernelmatrix

Copy link

Member

@devmotion devmotion Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to export both? Is lazykernelmatrix sufficient maybe?

Copy link

Author

@sethaxen sethaxen Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably.

@theogf

Copy link

Member

It looks really nice already! Only thing which is a bit uneasy is the output type... I am not sure there is a strong guarantee that the first evaluated type would correspond to the rest of the matrix. But I also don't see how it could be solved easily...

devmotion

devmotion reviewed

Jun 2, 2023

View reviewed changes

src/matrix/lazykernelmatrix.jl

The result is a matrix with the same entries as [`kernelmatrix(κ, x)`](@ref) but where the

entries are not computed until they are needed.

"""

lazykernelmatrix(κ::Kernel, x) = lazykernelmatrix(κ, x, x)

Copy link

Member

@devmotion devmotion Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to optimize this for the symmetric case, IMO, similar to kernelmatrix (which IIRC often does not use such a fallback but more optimized methods).

Copy link

@FelixBenning FelixBenning Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optimization for the symmetric case is only calculating half the matrix lets say the top half and then redirecting all queries from the bottom half to the top half. (Actually distances simply copies the top half into the bottom half).

Since this is lazy, there is probably no point in this optimization because you do not do the calculation from the start. And when you call getindex it does not matter whether you calculate the element in the top or bottom half.

Copy link

Member

@devmotion devmotion Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least with other lazy iterators in Base it's a common pattern to collect results at some point (e.g., after filtering, mapping, etc.). In this case it seems beneficial to know that the lazy matrix is symmetric.

Copy link

Author

@sethaxen sethaxen Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll see if there are ops we can optimize without too much extra code complexity.

Copy link

@FelixBenning FelixBenning Jun 2, 2023 •

edited

Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an interface for that? I mean you could use LinearAlgebra.Symmetric since that just wraps the original matrix afaik, but it also simply redirects queries so collect would still cause two calculations since you do a calculation per query.

I mean you could just specialize collect I guess

@FelixBenning

Copy link

FelixBenning commented Jun 2, 2023 •

edited

Loading

Relevant Part

For this a lazy ProductArray would also be a neat abstraction:

×ばつ3 ProductArrays.ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}: (0.417572, 0.417572) (0.417572, 0.399727) (0.417572, 0.997073) (0.399727, 0.417572) (0.399727, 0.399727) (0.399727, 0.997073) (0.997073, 0.417572) (0.997073, 0.399727) (0.997073, 0.997073)">

julia> v = rand(3)
3-element Vector{Float64}:
 0.417571623820013
 0.39972694171008405
 0.9970727095536318
julia> productArray(v,v)
3×ばつ3 ProductArrays.ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}:
 (0.417572, 0.417572) (0.417572, 0.399727) (0.417572, 0.997073)
 (0.399727, 0.417572) (0.399727, 0.399727) (0.399727, 0.997073)
 (0.997073, 0.417572) (0.997073, 0.399727) (0.997073, 0.997073)

now the lazy kernelmatrix is simply a lazy mappedarray of this product array. As mappedarray has a field for its mapping:

struct ReadonlyMappedArray{T,N,A<:AbstractArray,F} <: AbstractMappedArray{T,N}
 f::F
 data::A
end

you can write custom code for when F<:Kernel. This would allow you to add + as you do here

Synergy tangent

This is neat, because the productArray abstraction is also helpful as a multioutput abstraction:

julia> vec(productArray(v,1:2))
6-element reshape(::ProductArrays.ProductArray{Tuple{Vector{Float64}, UnitRange{Int64}}, Tuple{Float64, Int64}, 2}, 6) with eltype Tuple{Float64, Int64}:
 (0.417571623820013, 1)
 (0.39972694171008405, 1)
 (0.9970727095536318, 1)
 (0.417571623820013, 2)
 (0.39972694171008405, 2)
 (0.9970727095536318, 2)

cf. https://github.com/lazyLibraries/ProductArrays.jl (not yet a package JuliaRegistries/General#84683)

@devmotion

Copy link

Member

devmotion commented Jun 2, 2023

I have the same feeling as @sethaxen:

I considered using one of the many existing array types in the ecosystem for lazily representing an array, but defining a novel type allows us to do things like perform scalar multiplication or add kernel matrices without densifying the array.

A separate dedicated type provides more information about the context and allows us to define dispatches, special operations and optimizations that might be less relevant or not well defined in the general case.

@FelixBenning

Copy link

FelixBenning commented Jun 2, 2023

A separate dedicated type provides more information about the context and allows us to define dispatches, special operations and optimizations that might be less relevant or not well defined in the general case.

I edited my reply to explain how you can still do special dispatches

@FelixBenning

Copy link

FelixBenning commented Jun 2, 2023 •

edited

Loading

I actually found this pull request because I wanted to ask: does it ever make sense for kernelmatrix to be eager? I mean memory access is expensive and you take $n^2$ space instead of the natural $n$ for just the vector. And kernelmatrices are often only accessed once (e.g. when you want take its cholesky decomposition) and afterwards tossed. Are some kernels really so expensive to evaluate to justify this?

@FelixBenning

Copy link

FelixBenning commented Jun 2, 2023

The Transform test seem very much broken to me, it breaks here, #511 and #510 in pretty much the same way. And those have nothing to do with each other.

@sethaxen

Copy link

Author

sethaxen commented Jun 2, 2023

Only thing which is a bit uneasy is the output type... I am not sure there is a strong guarantee that the first evaluated type would correspond to the rest of the matrix. But I also don't see how it could be solved easily...

I wonder of it's safe to ask Julia to infer the return type of kernelmatrix and use that to infer the eltype.

@FelixBenning

Copy link

FelixBenning commented Jun 2, 2023

Only thing which is a bit uneasy is the output type... I am not sure there is a strong guarantee that the first evaluated type would correspond to the rest of the matrix. But I also don't see how it could be solved easily...

I wonder of it's safe to ask Julia to infer the return type of kernelmatrix and use that to infer the eltype.

mappedarrays does eltype inference 🤷

@willtebbutt

Copy link

Member

willtebbutt commented Jun 2, 2023 •

edited

Loading

A separate option would be to be a little less ambitious with this initial implementation, and not implement a new matrix type, and just provide an interface for the operation that we want.

Specifically, add the following function to the interface:

function kernel_matrix_vector_product(k::Kernel, x::AbstractVector, y::AbstractVector, v::AbstractVector{<:Real})
 return kernelmatrix(k, x, y) * v
end
kernel_matrix_vector_product(k::Kernel, x::AbstractVector, v::AbstractVector{<:Real}) = kernelmatrix(k, x) * v

where the above methods are the default implementations and specify the semantics.
For large problems we could wrap the kernel in some other type which says "don't ever instantiate me", and implements a low-memory version of this operation.

The nice thing about doing things this way is that we avoid having to e.g. guess at the output type of kernelmatrix, or whatever. It's a little verbose, but maybe it covers our initial requirements?

Does this cover our needs? I guess really I'm just asking whether we actually need to go to the trouble of implementing a new matrix type, and can instead just implement a single function that can be overloaded. If there is a range of functionality that we require, then maybe a matrix type is needed, but if we really only have one operation in mind, maybe it's not?

edit: or we could add an additional argument to the functioon that says how things should be computed when you're doing block-wise operations. e.g. kernel_matrix_vector_product(::ChunkSize, ::Kernel, ::AbstractVector, ::AbstractVector{<:Real})

@FelixBenning

Copy link

FelixBenning commented Jun 6, 2023 •

edited

Loading

ProductArrays v1.0.0 is now online, so you could do

using MappedArrays: mappedarray, ReadonlyMappedArray
using ProductArrays: productArray, ProductArray
struct Splat{T} 
 func::T
end
(s::Splat)(x) = s.func(x...)
lazykernelmatrix(k::Kernel, x, y) = mappedarray(Splat(k), productArray(x,y))
const LazyKernelMatrix{K<:Kernel, T<:Real, P<:ProductArray} = ReadonlyMappedArray{T, 2, P, Splat{K}}

and then implement additional functionality for LazyKernelmatrix (you would get a ton of functionality from MappedArrays for free (like Eltype inference)

×ばつ3 mappedarray(Splat{SqExponentialKernel{Distances.Euclidean}}(Squared Exponential Kernel (metric = Distances.Euclidean(0.0))), ::ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}) with eltype Float64: 1.0 0.879512 0.998656 0.879512 1.0 0.901716 0.998656 0.901716 1.0 julia> a isa LazyKernelMatrix true julia> eltype(a) Float64 julia> a isa AbstractArray true julia> dump(a) # very readable structure ReadonlyMappedArray{Float64, 2, ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}, Splat{SqExponentialKernel{Distances.Euclidean}}} f: Splat{SqExponentialKernel{Distances.Euclidean}} func: SqExponentialKernel{Distances.Euclidean} metric: Distances.Euclidean thresh: Float64 0.0 data: ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2} prodIt: Base.Iterators.ProductIterator{Tuple{Vector{Float64}, Vector{Float64}}} iterators: Tuple{Vector{Float64}, Vector{Float64}} 1: Array{Float64}((3,)) [0.6995475228324576, 0.19281640447854786, 0.6476909300916908] 2: Array{Float64}((3,)) [0.6995475228324576, 0.19281640447854786, 0.6476909300916908]">

julia> a = mappedarray(Splat(k), productArray(x,x))
3×ばつ3 mappedarray(Splat{SqExponentialKernel{Distances.Euclidean}}(Squared Exponential Kernel (metric = Distances.Euclidean(0.0))), ::ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}) with eltype Float64:
 1.0 0.879512 0.998656
 0.879512 1.0 0.901716
 0.998656 0.901716 1.0
julia> a isa LazyKernelMatrix
true
julia> eltype(a)
Float64
julia> a isa AbstractArray
true
julia> dump(a) # very readable structure
ReadonlyMappedArray{Float64, 2, ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}, Splat{SqExponentialKernel{Distances.Euclidean}}}
 f: Splat{SqExponentialKernel{Distances.Euclidean}}
 func: SqExponentialKernel{Distances.Euclidean}
 metric: Distances.Euclidean
 thresh: Float64 0.0
 data: ProductArray{Tuple{Vector{Float64}, Vector{Float64}}, Tuple{Float64, Float64}, 2}
 prodIt: Base.Iterators.ProductIterator{Tuple{Vector{Float64}, Vector{Float64}}}
 iterators: Tuple{Vector{Float64}, Vector{Float64}}
 1: Array{Float64}((3,)) [0.6995475228324576, 0.19281640447854786, 0.6476909300916908]
 2: Array{Float64}((3,)) [0.6995475228324576, 0.19281640447854786, 0.6476909300916908]

if you want I can also write a convenience method to skip prodIt and get the underlying iterators straight from the ProductArray wrapper around ProductIterator.

@sethaxen sethaxen mentioned this pull request

Jun 22, 2023

Using a lazy representation of the kernel matrix SebastianCallh/IterGP.jl#8

Open

Labels

None yet

5 participants

@sethaxen @devmotion @theogf @FelixBenning @willtebbutt

Uh oh!

Add LazyKernelMatrix and lazykernelmatrix #515

Are you sure you want to change the base?

Add LazyKernelMatrix and lazykernelmatrix #515

Uh oh!

Conversation

@sethaxen sethaxen commented Jun 2, 2023

Uh oh!

codecov bot commented Jun 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

devmotion commented Jun 2, 2023

Uh oh!

@devmotion devmotion Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

@sethaxen sethaxen Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

theogf commented Jun 2, 2023

Uh oh!

@devmotion devmotion Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

@FelixBenning FelixBenning Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

@devmotion devmotion Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

@sethaxen sethaxen Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

@FelixBenning FelixBenning Jun 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

FelixBenning commented Jun 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Relevant Part

Synergy tangent

Uh oh!

devmotion commented Jun 2, 2023

Uh oh!

FelixBenning commented Jun 2, 2023

Uh oh!

FelixBenning commented Jun 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

FelixBenning commented Jun 2, 2023

Uh oh!

sethaxen commented Jun 2, 2023

Uh oh!

FelixBenning commented Jun 2, 2023

Uh oh!

willtebbutt commented Jun 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

FelixBenning commented Jun 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov bot commented Jun 2, 2023 •

edited

Loading

@FelixBenning FelixBenning Jun 2, 2023 •

edited

Loading

FelixBenning commented Jun 2, 2023 •

edited

Loading

FelixBenning commented Jun 2, 2023 •

edited

Loading

willtebbutt commented Jun 2, 2023 •

edited

Loading

FelixBenning commented Jun 6, 2023 •

edited

Loading