reload!() doesn't publish DRAKE_VIEWER_ADD_ROBOT message on Linux #19

tkoolen · 2016-12-13T21:43:33Z

I think this may have to do with the maximum UDP packet size.

This is with the Valkyrie model. Everything works correctly on OSX. Also, on Linux, if I just do

lcm = PyLCM.LCM()
msg = DrakeVisualizer.drakevis[:lcmt_viewer_load_robot]()
PyLCM.publish(lcm, "DRAKE_VIEWER_ADD_ROBOT", msg)

bot-spy shows that the message is indeed being published, but the message DrakeVisualizer.reload() tries to publish never gets picked up by bot-spy or drake-visualizer. If I change reload! to only add e.g. the first 60 links (out of 81) added to msg.link, then everything is fine as well.


tkoolen · 2016-12-13T22:05:37Z

The limit seems to be 66 links for Valkyrie. With 66 links, sizeof(msg[:encode]()) is 264544 bytes, with 65 it is 261910 bytes. Both are well over the IPv4 max packet length. And it doesn't seem like there's something wrong with link 66 specifically, because if I just add that one, the message publishes fine.
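A quick arithmetic check on the two sizes reported above, as a sketch only (the variable names are made up for illustration). It shows how much link 66 adds and that the two sizes fall on either side of 256 KiB (262144 bytes). Whether that boundary is actually the limit being hit is an assumption, not something established here.

```shell
# Encoded message sizes reported above, in bytes.
size_65_links=261910
size_66_links=264544

# Bytes added by going from 65 to 66 links.
delta=$(( size_66_links - size_65_links ))
echo "link 66 adds ${delta} bytes"

# 256 KiB, a common power-of-two buffer size (ASSUMED to be relevant here).
boundary=$(( 256 * 1024 ))
echo "256 KiB is ${boundary} bytes"

# Check whether the working and failing sizes sit on opposite sides of it.
if [ "$size_65_links" -lt "$boundary" ] && [ "$size_66_links" -gt "$boundary" ]; then
    straddles=yes
else
    straddles=no
fi
echo "sizes straddle 256 KiB: ${straddles}"
```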

rdeits · 2016-12-13T22:23:55Z

Crap. I've probably never seen this because all of my computers have their UDP packet sizes turned up for various DRC-related reasons. The issue is presumably because we're pushing the mesh data through with the load_robot message (whereas Drake just sends the filenames). I really really don't want to have to serialize every mesh to disk in order to display it.

A few options:

1. Write meshes to disk (or to ramdisk or something?) and then send filenames like Drake does. I'm not fond of this.
2. Force users to tweak their MTU settings (if they're using openhumanoids, then they probably already have). This is an obnoxious burden on users, though.
3. Create an add_link message and publish 81 of those instead of one big load_robot message. This should work fine, but requires more changes to drake-visualizer. It will also fail if one link has a really really really big mesh (as in, one that's ~60 times bigger than Val's meshes).
4. Switch to a more serious messaging layer like ZeroMQ which knows how to break up large messages. This would require significant changes to drake-visualizer, but would also let us confirm that there is a visualizer listening to what we're sending (and presumably automatically open one if there isn't).

patmarion · 2016-12-13T22:47:46Z

There is one more possibility. Director has a "mesh manager" which listens on lcm for mesh data. This was used by the affordance server in director. So you could send meshes to the mesh manager, and then when you load a robot, you could set the filename field to something customized like: director_mesh://<mesh id> and drakevisualizer is taught how to resolve that. I think this would definitely be more complicated than just implementing option 3, but could be an alternate way to do it if you didn't want to modify the load_robot protocol.

rdeits · 2016-12-13T22:55:48Z

Interesting; I hadn't thought of that, thanks. I think that would work fine for Val, but not as well for my other use case, which is visualizing soft robots. For the soft robots, I generate an entirely new mesh at every time step (that's also why I don't want to write the meshes to disk). It seems like registering new meshes with the manager every time I want to draw them would add extra complexity and state (especially if the load_robot message happened to be delivered before the corresponding load_mesh message).

tkoolen · 2016-12-13T23:04:41Z

I'd be happy with 3 above. Happier still with 4, but I know that's a lot of work.

For now, how does 2 work? I tried sudo ip link set mtu 9000 dev lo (and sudo ip link set mtu 9000 dev eno1) to no avail. Just trying to make absolutely sure that message size is indeed the problem.
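To confirm that an `ip link set mtu` change actually took effect, the current MTUs can be read back from sysfs. This is a minimal sketch, assuming a Linux system that exposes one directory per interface under /sys/class/net; `show_mtus` is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Print "<interface>: <mtu>" for each interface under a sysfs-style directory.
# Defaults to /sys/class/net, where Linux exposes one directory per interface.
show_mtus() {
    root=${1:-/sys/class/net}
    for dev in "$root"/*; do
        # Skip entries without a readable mtu file (or an unmatched glob).
        if [ -r "$dev/mtu" ]; then
            echo "$(basename "$dev"): $(cat "$dev/mtu")"
        fi
    done
}

show_mtus
```

The optional directory argument is only there so the function can be pointed at a different location for testing.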

rdeits · 2016-12-13T23:04:45Z

@tkoolen does running the setup_loopback_multicast.sh script fix the issue?

rdeits · 2016-12-13T23:06:47Z

I notice that my Linux desktop has MTU 1500 on the eth0 interface, but 65536 on the loopback interface.

Full output of ifconfig:

➜ ifconfig
eth0      Link encap:Ethernet  HWaddr <redacted>
          inet addr:<redacted>  
          inet6 addr: <redacted>
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:18

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MULTICAST  MTU:65536  Metric:1
          RX packets:641854 errors:0 dropped:0 overruns:0 frame:0
          TX packets:641854 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:133835514 (133.8 MB)  TX bytes:133835514 (133.8 MB)

virbr0    Link encap:Ethernet  HWaddr ba:<redacted>
          inet addr: <redacted>
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

tkoolen · 2016-12-13T23:23:40Z

No luck with the script, and no luck with MTU 65536. I thought 9000 was the maximum, but I guess that's only for ethernet devices?

patmarion · 2016-12-13T23:47:07Z

Did you try running these commands?

sudo sysctl -w net.core.rmem_max=2097152 
sudo sysctl -w net.core.rmem_default=2097152

rdeits · 2016-12-15T05:54:17Z

From discussion with @patmarion I'm leaning towards 2 being the solution for now, but I've started looking into what 4 would require. ZeroMQ + Msgpack is a pretty nice combination.

tkoolen · 2016-12-15T19:28:43Z

sudo sysctl -w net.core.rmem_max=2097152 
sudo sysctl -w net.core.rmem_default=2097152

Thanks, that made the LCM message show up in bot-spy, but it seems drake-visualizer still isn't getting the message.

tkoolen · 2016-12-15T19:56:19Z

@rdeits investigated the above. It turns out it's due to me being on 16.04, for which the drake-visualizer binary is out of date, so drake-visualizer doesn't listen to the same LCM channel on which DrakeVisualizer.jl is publishing the lcmt_viewer_load_robot message.

patmarion · 2016-12-15T19:57:28Z

I'll make you guys an updated binary. I have to do it manually for ubuntu-16. @rdeits and I are discussing ways to automate.

patmarion · 2016-12-15T20:03:23Z

updated:

http://people.csail.mit.edu/patmarion/software/director/releases/director-0.1.0-ubuntu16.tar.gz

patmarion · 2016-12-15T20:05:44Z

FYI, to avoid running those sudo commands again, edit /etc/sysctl.conf and add these lines:

net.core.rmem_max=2097152
net.core.rmem_default=2097152

See https://lcm-proj.github.io/multicast_setup.html
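To check whether the settings are in effect (for example after a reboot), the current values can be compared against the ones above. This is a rough sketch: `check_value` is a hypothetical helper, and it assumes the `sysctl` command is available; if it is not, the value is reported as unknown.

```shell
#!/bin/sh
# Value suggested in this thread for both settings.
wanted=2097152

# Print "ok", "too small", or "unknown" for a current value.
check_value() {
    current=$1
    case "$current" in
        # Empty or non-numeric input: the value could not be read.
        ''|*[!0-9]*) echo "unknown" ;;
        *)
            if [ "$current" -ge "$wanted" ]; then
                echo "ok"
            else
                echo "too small"
            fi
            ;;
    esac
}

for key in net.core.rmem_max net.core.rmem_default; do
    # sysctl -n prints only the value; fall back to empty if unavailable.
    current=$(sysctl -n "$key" 2>/dev/null || true)
    echo "$key = ${current:-?} ($(check_value "$current"))"
done
```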

tkoolen · 2016-12-15T20:12:38Z

Thanks, Pat!

rdeits · 2016-12-20T17:23:24Z

@tkoolen did Pat's workaround fix the issue? If so, I think we should add it to the readme and then close this. I think the ZMQ approach might be the right thing eventually, but with the workaround LCM can limp along for a little while longer.

tkoolen · 2016-12-20T23:33:20Z

It did, yeah. I was waiting to test the updated 16.04 build before closing this issue, but forgot to do so before leaving for home. I trust that it has been fixed though. Yeah, if you could add it to the readme, that would be great, and feel free to close after that.

rdeits · 2016-12-21T06:24:36Z

Ok, done.

rdeits closed this as completed in c41b522 Dec 21, 2016
