Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: clarify what happens when unmarshaling into a non-empty interface{} #26946

Open
deuill opened this issue Aug 12, 2018 · 14 comments
Labels
Documentation Issues describing a change to documentation. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@deuill
Copy link

deuill commented Aug 12, 2018

What version of Go are you using (go version)?

go version go1.10.3 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/deuill/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/deuill/.go"
GORACE=""
GOROOT="/usr/lib/go"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build634949289=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Link to Play: https://play.golang.org/p/37E1QHWofMy

Passing a pointer to a struct (anonymous or not) as type interface{} to json.Unmarshal will have the original type replaced with a map[string]interface{} (correctly decoded, however). It appears the difference between correct and incorrect result is changing:

var nopointer interface{} = Object{}

to:

var nopointer = Object{}

In the example above. A interface{} containing a pointer to a struct works as expected, e.g.:

var pointer interface{} = &Object{}

is the same as:

var pointer = &Object{}

What did you expect to see?

The following results, in order of desirability:

  • The correct struct type (Object, for the example above) populated with data from the JSON object given.
  • An error denoting the inability to unmarshal into the underlying type given.
  • The behaviour above documented in the documentation for json.Unmarshal.

What did you see instead?

None of the above? The use case that led to this issue for me was giving constructors types, where these types would be used as "templates" of sorts for populating with data in subsequent requests. For example:

type Decoder interface {
    Decode(*http.Request) (interface{}, error)
}

type RequestHandler func(interface{}) (interface{}, error)

type Request struct {
    Value interface{}
}

func (r Request) Decode(r *http.request) (interface{}, error) {
    // Read body from request
    // ...

    if err := json.Unmarshal(body, &r.Value); err != nil {
        return nil, err
    }

    return r, nil
}

type User struct {
    Name string `json:"name"`
}

func NewHandler(decoder Decoder, handler RequestHandler) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        req, err := decoder.Decode(r)
        if err != nil {
            return
        }

        resp, err := handler(req)
        if err != nil {
            return
        }

        // Encode response into ResponseWriter etc.
        // ...
    }
}

It is implied, then, that NewHandler is called with the bare type:

NewHandler(Request{Value: User{}}, userHandler)

Which will give "fresh" values to the RequestHandler each time, as they emerge from Request.Decode. If given a pointer, i.e.:

NewHandler(Request{Value: &User{}}, userHandler)

Previous requests will populate the pointer to User, leaving garbage data for future requests.
Apologies for the perhaps obtuse example.

@deuill deuill changed the title encoding/json: Unmarshaling JSON object into interface{} holding non-pointer struct results in map[string]interface{} encoding/json: Unmarshaling JSON object into interface{} holding non-pointer struct results in map[string]interface{} Aug 12, 2018
@deuill
Copy link
Author

deuill commented Aug 12, 2018

Noted that this doesn't seem to be a regression or recent change, as the same behavior can be found on all Go versions I could test (>= 1.2).

The documentation for json.Unmarshal is sort of ambiguous as to whether it considers struct values in interface{} types to be one or the other, and the results are non-orthogonal/inconsistent between Marshal and Unmarshal:

https://play.golang.org/p/cBQv0DOrPhG

Also worth noting that this isn't just an issue when essentially passing *interface{}, as nested structures are subject to the same rules:

https://play.golang.org/p/HmZXqM1gDYs

@andybons
Copy link
Member

@dsnet @bradfitz

@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 13, 2018
@andybons andybons added this to the Unplanned milestone Aug 13, 2018
@bradfitz bradfitz added the Documentation Issues describing a change to documentation. label Aug 13, 2018
@bradfitz bradfitz modified the milestones: Unplanned, Go1.12 Aug 13, 2018
@bradfitz
Copy link
Contributor

I suppose the Unmarshal docs could elaborate on what happens when a *interface{} is non-nil and pointing to a non-nil interface{} containing a pointer.

What did you expect to see?
The following results, in order of desirability:

  • The correct struct type (Object, for the example above) populated with data from the JSON object given.

I don't even think it's possible to do the first thing you want there. Once you put a struct value in an interface, it's no longer addressable and nothing can mutate it.

@deuill
Copy link
Author

deuill commented Aug 13, 2018

Correct, it doesn't seem we're able to modify struct values once placed in an interface{}, even if we were originally given a *interface{} or a pointer to the parent struct for which our interface{} is a field.

However, it appears it's possible to create a new zero value for the underlying type with reflect.New, and then assign the correct type into the interface{} (as you would the map[string]interface{} fallback). The logic happens here, and we'd have to check for v.IsNil or v.Elem().Kind() != reflect.Invalid in addition to v.NumMethod() == 0.

I can try and work on a patch if the current functionality is considered flawed, knowing that this is quite possibly a breaking change that is hard to check for statically or with go vet (however pointless it might be, relying on setting a struct and handling a map[string]interface{} on the other side), and that the encoding/json package is considered frozen.

I confess to a certain amount of confusion on the rules behind set-ability in the reflect package, as I try to avoid using it unless necessary, and had to re-read The Laws of Reflection and the documentation for the package itself, which did little to dispel my confusion for this particular case.

If the behavior cannot be changed, documenting it may help, as the underlying rules are somewhat subtle.

deuill added a commit to deuill/go that referenced this issue Sep 16, 2018
…ce types

This work extends the behaviour of `json.Unmarshal` to allow for setting into concrete types
embedded in `interface{}` struct fields. For example, the following JSON object:

    {"A": {"B": "C"}}

Being unmarshaled into the following structure:

	var t = struct{ A interface{} }{A: struct{ B string }{}}

Would have previously resulted in field `A` being replaced with a `map[string]interface{}`,
disregarding the underlying type given to the `interface{}` field, i.e.:

    struct { A interface {} }{A:map[string]interface {}{"B":"C"}}

Whereas, by resolving the underlying type of the `interface{}` field, we are able to set into and
return the correct type, i.e.:

    struct { A interface {} }{A:struct { B string }{B:"C"}}

These rules apply also to `map` types, for JSON objects, and arrays or slices, for JSON arrays. The
rules for setting into these fields are exactly the same as those that would apply to the bare types
themselves, i.e. attempting to unmarshal into an underlying type that is unable to store the values
presented in the JSON input will return a type error.

Noted that this is a breaking change for users that relied on having set a concrete type, but handle
a `map[string]interface{}` or `[]interface{}` on the other side.

Fixes golang#26946
deuill added a commit to deuill/go that referenced this issue Sep 16, 2018
This work extends the behaviour of json.Unmarshal to allow for setting into concrete types
embedded in interface{} struct fields. For example, the following JSON object:

    {"A": {"B": "C"}}

Being unmarshaled into the following structure:

	var t = struct{ A interface{} }{A: struct{ B string }{}}

Would have previously resulted in field A being replaced with a map[string]interface{},
disregarding the underlying type given to the interface{} field, i.e.:

    struct { A interface {} }{A:map[string]interface {}{"B":"C"}}

Whereas, by resolving the underlying type of the interface{} field, we are able to set into and
return the correct type, i.e.:

    struct { A interface {} }{A:struct { B string }{B:"C"}}

These rules apply also to map types, for JSON objects, and arrays or slices, for JSON arrays. The
rules for setting into these fields are exactly the same as those that would apply to the bare types
themselves, i.e. attempting to unmarshal into an underlying type that is unable to store the values
presented in the JSON input will return a type error.

Noted that this is a breaking change for users that relied on having set a concrete type, but handle
a map[string]interface{} or []interface{} on the other side.

Fixes golang#26946
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/135421 mentions this issue: encoding/json: resolve struct, map, array, or slice values in interface types

@mvdan mvdan added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Sep 17, 2018
@mvdan
Copy link
Member

mvdan commented Sep 17, 2018

I've re-labelled this issue as NeedsDecision so that we can decide if we want to apply the change in behavior proposed (and implemented) above.

@mvdan
Copy link
Member

mvdan commented Sep 17, 2018

Personally, I think that peeking into interface{} to see if there's an existing struct or map type to use seems like an okay change to make. I can't imagine why anyone would fill in a type like that before using the decoder, to then be surprised if that type ends up being used.

However, I can't help but wonder if this is really needed. Users already have several workarounds:

  • If they need to support varying types, keep on using map[string]interface{} with some extra boilerplate code
  • Similar to the solution above, use json.RawMessage to delay decoding into one of many specific types
  • Use a struct or map type directly, as long as that doesn't mean duplicating too many types

I realise that these options may result in lots of boilerplate in some situations, which is why I'm not opposed to the proposed change.

@dsnet
Copy link
Member

dsnet commented Sep 18, 2018

I think we should document this. The change in https://golang.org/cl/135421 seems like just teaching the json package about more special cases.

  • If structs, maps, array, or slice values are special, what about types that satisfy json.Unmarshaler, but do not happen to be addressable. Should those types be newly created, have UnmarshalJSON called on them and stored back into the interface?
  • The new behavior is inconsistent with regard the behavior of pre-populated structs of non-interfaces, where the contents are merged into the existing struct. In order to preserve behavior similar to that, you need copy the struct, mutate the copy, and store it back. However, the json package should avoid assuming that it is safe to shallow copy a value.

@mvdan
Copy link
Member

mvdan commented Oct 1, 2018

Thanks for pitching in, @dsnet. You're right that the CL above would make the JSON package less consistent.

@deuill do you want to add anything else, before we just amend the documentation to be clearer?

@deuill
Copy link
Author

deuill commented Oct 1, 2018

Nope, updating the documentation to cover these cases will help with avoiding confusion in the future. Agreed that the updated implementation is inconsistent with expectations when structs provided are pre-populated.

@mvdan mvdan added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Oct 1, 2018
@mvdan mvdan changed the title encoding/json: Unmarshaling JSON object into interface{} holding non-pointer struct results in map[string]interface{} encoding/json: clarify what happens when unmarshaling into a non-empty interface{} Oct 1, 2018
@mvdan
Copy link
Member

mvdan commented Oct 1, 2018

Thanks - I've abandoned the CL and repurposed this issue. Feel free to send a godoc CL to fix this issue, and we can take it from there.

@deuill
Copy link
Author

deuill commented Oct 1, 2018

Will do, thanks!

@ianlancetaylor ianlancetaylor modified the milestones: Go1.12, Go1.13 Dec 11, 2018
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@jxsl13
Copy link

jxsl13 commented Mar 31, 2022

Unmarshalling into non-empty interfaces would help in this case a lot:
https://go.dev/play/p/STQZhHDzqQd

package main

import (
	"encoding/json"
	"fmt"
)

var typeMap = map[string]TypedInterface{
	"myType": MyImpl{}, // multiple implementations with different properties but same meta data (with different ObjectTypes)
}

type Properties struct {
	P1 []string
	P2 string
}

type MetaData struct {
	ObjectType string
}

func (md MetaData) Type() string {
	return md.ObjectType
}

type TypedInterface interface {
	Type() string
}

type MyImpl struct {
	MetaData
	Properties Properties
}

func main() {
	from := MyImpl{
		MetaData: MetaData{"myType"},
		Properties: Properties{
			P1: []string{"P11", "P12"},
			P2: "P2",
		},
	}
	data, err := json.Marshal(from)
	if err != nil {
		fmt.Println(err)
		return
	}

        // the type string can be determined by unmarshaling the meta data object 
       // and take the ObjectType field or Type() method
	to := typeMap["myType"] 
	err = json.Unmarshal(data, &to)
	if err != nil {
		fmt.Printf("from: %T, to: %T\n", from, to)
		fmt.Println(err)
		return
	}
	fmt.Printf("%v\n", to)
}

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/619995 mentions this issue: encoding/json: fix Unmarshal of value assigned to an interface variable.

moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues describing a change to documentation. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

9 participants