Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with change to gelu #631

Open
bclyons12 opened this issue Mar 1, 2025 · 4 comments · May be fixed by #632
Open

Issue with change to gelu #631

bclyons12 opened this issue Mar 1, 2025 · 4 comments · May be fixed by #632

Comments

@bclyons12
Copy link

Cross posting ProjectTorreyPines/EPEDNN.jl#7
My suspicion is that making gelu a constant that aliases the name of another function breaks something in a previous made BSON file that's looking for #gelu. Instead of const gelu = gelu_tanh perhaps it should be soemthing like gelu(args...; kwargs...) = gelu_tanh(args...; kwargs...)

@ToucheSir
Copy link
Member

I don't think I've ever seen this come up with BSON before, and it seems extremely suspicious that it would do something that bypasses Julia's alias machinery. Are you able to construct a MWE that doesn't use this EPEDNN package?

@bclyons12
Copy link
Author

So if you can copy over data/EPED1NNmodel.bson from EPEDNN at least, you can reproduce this below. Note that it's looking for #gelu which I think is some compiler-generated function. It might not get generated if gelu now just resolves to gelu_tanh at compile time.

N.B., if you change to [email protected], it works fine.

(@v1.11) pkg> activate --temp
  Activating new project at `/var/folders/nt/ct_lf2n94_1c21908mmbx55w0000gq/T/jl_n5pEbL`

(jl_n5pEbL) pkg> add BSON Flux [email protected]
   Resolving package versions...
    Updating `/private/var/folders/nt/ct_lf2n94_1c21908mmbx55w0000gq/T/jl_n5pEbL/Project.toml`
  [fbb218c0] + BSON v0.3.9
  [587475ba] + Flux v0.16.3
  [872c559c] + NNlib v0.9.28
    Updating `/private/var/folders/nt/ct_lf2n94_1c21908mmbx55w0000gq/T/jl_n5pEbL/Manifest.toml`
  ...

julia> import BSON, Flux

julia> BSON.load("data/EPED1NNmodel.bson", Flux)
ERROR: UndefVarError: `#gelu` not defined in `NNlib`
Suggestion: check for spelling errors or missing imports.
Stacktrace:
  [1] (::BSON.var"#31#32")(m::Module, f::String)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:35
  [2] BottomRF
    @ ./reduce.jl:86 [inlined]
  [3] _foldl_impl(op::Base.BottomRF{BSON.var"#31#32"}, init::Module, itr::SubArray{Any, 1, Vector{…}, Tuple{…}, true})
    @ Base ./reduce.jl:58
  [4] foldl_impl
    @ ./reduce.jl:48 [inlined]
  [5] mapfoldl_impl
    @ ./reduce.jl:44 [inlined]
  [6] _mapreduce_dim
    @ ./reducedim.jl:334 [inlined]
  [7] mapreduce
    @ ./reducedim.jl:329 [inlined]
  [8] reduce
    @ ./reducedim.jl:378 [inlined]
  [9] resolve(fs::Vector{Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:35
 [10] (::BSON.var"#35#36")(d::Dict{Symbol, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:79
 [11] _raise_recursive(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:80
 [12] raise_recursive(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:93
 [13] (::BSON.var"#23#24"{IdDict{Any, Any}, Module})(x::Dict{Symbol, Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:98
 [14] applychildren!(f::BSON.var"#23#24"{IdDict{Any, Any}, Module}, x::Vector{Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/BSON.jl:26
 [15] raise_recursive(v::Vector{Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:98
 [16] (::BSON.var"#17#20"{IdDict{Any, Any}, Module})(x::Vector{Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:80
 [17] applychildren!(f::BSON.var"#17#20"{IdDict{Any, Any}, Module}, x::Dict{Symbol, Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/BSON.jl:19
--- the above 7 lines are repeated 2 more times ---
 [32] _raise_recursive(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:80
 [33] raise_recursive(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:93
 [34] (::BSON.var"#49#50")(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:193
 [35] raise_recursive(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:92
 [36] (::BSON.var"#23#24"{IdDict{Any, Any}, Module})(x::Dict{Symbol, Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:98
 [37] applychildren!(f::BSON.var"#23#24"{IdDict{Any, Any}, Module}, x::Vector{Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/BSON.jl:26
 [38] raise_recursive(v::Vector{Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:98
 [39] (::BSON.var"#45#46"{IdDict{Any, Any}, Module})(x::Vector{Any})
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:179
 [40] iterate
    @ ./generator.jl:48 [inlined]
 [41] collect_to!(dest::Vector{Vector{…}}, itr::Base.Generator{Vector{…}, BSON.var"#45#46"{…}}, offs::Int64, st::Int64)
    @ Base ./array.jl:849
 [42] collect_to_with_first!(dest::Vector{…}, v1::Vector{…}, itr::Base.Generator{…}, st::Int64)
    @ Base ./array.jl:827
 [43] _collect(c::Vector{…}, itr::Base.Generator{…}, ::Base.EltypeUnknown, isz::Base.HasShape{…})
    @ Base ./array.jl:821
 [44] collect_similar(cont::Vector{Any}, itr::Base.Generator{Vector{Any}, BSON.var"#45#46"{IdDict{Any, Any}, Module}})
    @ Base ./array.jl:720
 [45] map(f::Function, A::Vector{Any})
    @ Base ./abstractarray.jl:3371
 [46] newstruct_raw(cache::IdDict{Any, Any}, T::Type, d::Dict{Symbol, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:179
 [47] (::BSON.var"#49#50")(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/extensions.jl:195
 [48] raise_recursive(d::Dict{Symbol, Any}, cache::IdDict{Any, Any}, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:92
 [49] raise_recursive
    @ ~/.julia/packages/BSON/mzJoC/src/read.jl:103 [inlined]
 [50] load(x::String, init::Module)
    @ BSON ~/.julia/packages/BSON/mzJoC/src/read.jl:108
Some type information was truncated. Use `show(err)` to see complete types.

@ToucheSir
Copy link
Member

The reason to provide a MWE that doesn't require third-party packages is that serialization libraries like BSON might require those to be loaded if types from them are used. In any case, I created a simpler one:

using Flux, BSON

m = Dense(1, 1, gelu)
BSON.@save "model.bson" m

# In new session with newer NNlib:
BSON.load("model.bson")

For now, we can consider switching around the function and the alias. But this is not sustainable longer-term as we may want to use aliases to manage things like deprecated symbols.

To Carlo's point in ProjectTorreyPines/EPEDNN.jl#7 (comment), we created Flux.state to make model parameter sharing less susceptible to fragile magic employed by serialization libraries. That doesn't mean you should avoid serializing model layer types, but just be cognizant that you may have to do a process of load model in old version -> serialize state -> load state in new version when changes like this occur to the library (this is not a problem exclusive to FluxML).

@bclyons12
Copy link
Author

bclyons12 commented Mar 3, 2025

Thanks for looking into this and providing solutions. Making sure that some of my more ML-inlined colleagues are aware, in case they have thoughts or comments
@orso82 @tomneiser @jmcclena @mgyoo86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants