KVM Forum day 1

Wednesday, October 25, 2017

First keynote

First speaker is 15 years old, started programming at 9. Keila and Phillip Banks.

General comments about women and minorities in tech, followed by a specific minority: kids. Talk about age restrictions when using Slack. Actually went to Slack and started complaining about it. Keila has been speaking in front of audiences since she was 11. Visited the White House "under president Obama" (applause). At the time the CTO for "America" was a woman.

Talk about how they do it:

  • Goals. Each year, she tries to learn a new language. Strive to be a better person. Learn JavaScript more and teach other kids.
  • Embrace the fear. Remember to laugh at yourself. Talked about impostor syndrome: why am I able to fly to Prague when I'm not as smart as the people in the room? Compare yourself to no one, be inspired by other people.

Father and girl talk together in a very harmonious way. If you are mentoring, make sure to have them have some investment in their own goals. Father interrupting himself to ask questions to his daughter about whether she has something to add. Feels not overly scripted, but quite interactive (between them, not the audience).

Second keynote

The Tao of HashiCorp (https://www.hashicorp.com). A bit lengthy at the beginning, talking about his company and how he learned about programming and how open source enabled him to learn how to program.

  1. Workflows, not technologies. Technologies change, the goal remains the same. There is a big set of problems that stay the same. Can span paradigm shifts. Is eating his own dog food.
  2. Simple, modular, composable. A sort of Unix-style definition. Smaller components with well-defined scope that are functional on their own, that can be composed for a larger purpose.
  3. Versioning through codification. Encoding knowledge through code. The problem with oral tradition is that it's slow and fickle. So knowledge should be represented with the code and that should be used as the source for truth. Also, truth should be encapsulated in automation, not in source code (e.g. scripts).
  4. Resilient systems. Notion of a desired state. Must create a plan or path to move from current system to desired state.
  5. Pragmatism: A set of ideals that we should fight for, but if it's not the right solution, be able to re-evaluate them for the given problem. Immutability is not always practical.
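The desired-state idea in point 4 can be sketched as a diff between two descriptions of a system. This is only an illustration of the principle, not how any HashiCorp tool is implemented; the `plan` and `apply` functions and the resource dictionaries are hypothetical.

```python
def plan(current, desired):
    """Return the ordered steps that turn `current` into `desired`.

    Both arguments map a resource name to its configuration.
    """
    steps = []
    for name in sorted(desired):
        if name not in current:
            steps.append(("create", name))
        elif current[name] != desired[name]:
            steps.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            steps.append(("delete", name))
    return steps


def apply(current, desired, steps):
    """Apply the steps to a copy of `current` and return the new state."""
    state = dict(current)
    for action, name in steps:
        if action == "delete":
            del state[name]
        else:
            state[name] = desired[name]
    return state


# Hypothetical resources, for illustration only.
current = {"web": {"count": 1}, "cache": {"size": "small"}}
desired = {"web": {"count": 3}, "db": {"size": "large"}}
steps = plan(current, desired)
for action, name in steps:
    print(action, name)
```

Planning again from the resulting state yields an empty list of steps, which is what makes the approach resilient: the same description can be re-applied after a failure without side effects.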

Jan Kiszka

Challenges in industrializing OSS

Use OSS quite a bit in their software. Constant rise in open source usage in embedded products.

There is more and more software in Siemens products. Less and less is for differentiation, more and more for commodity. Also have to take into account the long life of their systems. That means that the updates can change a lot beyond what they wanted to change themselves. Even if the changes are correct, the result may not be a configuration that had been certified, with expectations for some layer that are not met by the neighbouring layers.

Changes can be self-inflicted, but also supplier provided. Message is "upstream first".

Also want to make sure that they are not just consumers, but also providers of software to the open-source ecosystem. Easier to share issues and demo problems to suppliers with open-source problems.

Example: "I have a bug in GCC". Answer: "File a bug", but then the source code that exhibited the bug was closed.

See the GitHub page.

Civil infrastructure platform: here.

Some branding changes, Mentor Graphics now part of Siemens. Ask Mark Mitchell if he's still there?

Software license compliance as free software. Use FOSSology and software catalog. They take license compliance very seriously.

Linus Torvalds conversation

Linus Torvalds on stage, cameras popping up.

Linus hates public speaking, and only does conversations, because he does not want to do slides.

Q1: 4.14-rc6 is out, is it good when we have more rcs? Problem is that it's an LTS, so many companies want to put more stuff in.

Q2: Who is causing problems? ARM seems to be a shining star, names will be named tomorrow.

Q3: Do we do enough to get new maintainers, younger blood? Being a maintainer is painful. Not hard once you get used to it, but you need to have a lot of experience. In order to handle all the flow, you need to have done it for a long time. "I love maintainers, it may not appear that way in my emails". Looking for people. "You don't have to be perfect", but if then you stand up and fix it, that's how trust shows up.

Tries to put his vacation at the end of rc series. Usually gets fewer pull requests on Thursdays. Likes to have multiple maintainers to allow people to not be there all the time. Maintainer mentoring program?

Q4: Security: Have recent events changed your approach to security? Does not like security, because a lot is about "look at me, I found this bug, look how great I am". Often, being a self-serving person is a way to attract business. Good security people are not visible. Talks about compiler-based tools being quite good at finding bugs that no human would ever find.

Q5: Community projects: Why do some community projects take off, but the vast majority never does? Used to think that nobody would ever do an open-source database because it's boring. Was wrong. OSS projects that tend to be the most successful find a lot of commonalities. Infrastructure, things where it's easy to find agreement. Everybody needs a kernel, not much of an argument about what the needs are. More disagreement about UIs.

The open source community used to be very uniform, geeky white male with a beard. That has changed. More diversity now. If you started programming when you were 10 or 11, you are probably a good programmer. True for everything. If you want to be an athlete, you need dedication. Linus accepts his extra baggage (laughs). If you are in marketing, you may have to like to talk to people.

Q6: Explain your new love of C++. Used to wonder how anybody could develop with C++. "My kernel compiles faster than this".

Q7: Don't do end-user support. Not a people person, does not like to interact with people face to face. Likes interacting over technical matters. No longer the case for kernel. By the time it reaches him, it has been filtered a lot. Not so with Subsurface.

Q8: What makes you wake up every morning and pull through hundreds of emails? Negative emails are the ones being showcased, but enjoys interacting with people. The kernel matters to people. Gives your life meaning. Would be bored if he was not doing kernel development. Very occasionally, is incommunicado for a week or two, feels very odd. So ready to get back after a week.

Q9: Asked for a prediction about how things are going to shape up for Linux in the coming year. Not how he works. Knows what's coming in a very big view, talks both to hardware companies and developers. But most of his life is reacting, not predicting. Not trying to do the next big thing. Doesn't know what we'll be doing in the next year.

Q10: When is Linux 5.0? Been known to start losing track of numbers. When you can't recall the difference between x.22 and x.23. The numbers don't mean anything at all. Was stressful when they did.

Next year summit is in Edinburgh.

GPGPU On OpenStack

Sort of curious what we are going to see here. Room is not very full.

Runs a demo video: an OpenStack instance with M60 GPU passthrough. Several images are running: Win7, Ubuntu, Kube Controller.

First demo is a CUDA device query (not very complex)

Then brings up Windows 7 guest. Control panel shows M60, but hard to read, because everything on screen is in Japanese!!!

Benchmark with a game within Windows. Explains that the FPS is very good in the VM (237 FPS), but bad on screen because of remote viewing: network bandwidth is not enough.

End of demo, switching to slides. Slides are in moderately good English, e.g. "Project Orgizined". Sometimes below I will use the original phrasing when I'm not sure what it means.

Masafumi Ohta, Presales engineer for automotive company, looking for GPGPU uses. Session has been presented a few times, OpenStack Japan, in China, etc.

Now evaluating Tesla + DellEMC PowerEdge C4130 + OpenStack. Note that the talk is about GPGPU use, but it looks like normal GPU uses are being covered quite a bit too (e.g. games).

Why GPGPU on OpenStack?

Specific use of OpenStack: Hadoop (Sahara), HPC.

No information available (search says "Document lost" on openstack.org). Would be nice to gather information at docs.openstack.org.

How does GPGPU work in OpenStack

Using many cores in GPU, servers are more compact. Lower electric consumption. [Nothing really surprising here]

PCI Passthrough or nvidia(GPGPU) docker. PCI Passthrough for KVM. vSphere and Xen can split GPU cores to each VM, but OpenStack can only add each GPU unit to a given VM (not sure I understood the slide text, "OpenStack can only add t=qwith each 'GPU unit' onto VM", trying to translate)

nvidia(GPGPU) Docker is "share GPU with each container" but not split. Windows hasn't got worked as docker VM. Problem that some automotive applications are only running on Windows.

Instant HPC use:

  • Try some calculates and then destroy vm
  • Orchestrate a bunch of vms to try HPC grid computings

Use it internal cloud with GPU

  • Use it internal use - some manufacturers can't have some systems on public cloud because of strict security policies

What GPGPU mechanism works on OpenStack

PCI devices directly connects to VM via Linux Host. Depends on Hypervisor, not OpenStack.

One device to one VM in KVM, GPU itself cannot share and split the cores each VMs like docker, Xen or VMware. It is a limitation in KVM, not OpenStack.

Red Hat officially supports passthrough, but does not recommend it. Ubuntu does not document it.

Passthrough all GPU units, including HDMI / Audio. Suggests to manually unbind from physical host and bind the identifiers to pci-stub. I prefer the kernel command-line approach documented here.

Explains how to configure nova-compute in /etc/nova/nova.conf
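For reference, a hedged sketch of what such a nova.conf fragment usually looks like. It is not copied from the slides. The `[pci]` options `passthrough_whitelist` and `alias` are the names used by Nova releases of that period (older releases spelled them `pci_passthrough_whitelist` and `pci_alias` under `[DEFAULT]`). The product ID and the alias name `m60` are assumptions: check the actual IDs with `lspci -nn` on the compute host.

```python
import configparser
import io
import json

VENDOR_ID = "10de"   # NVIDIA's PCI vendor ID
PRODUCT_ID = "13f2"  # assumed product ID for the GPU, verify with `lspci -nn`
ALIAS_NAME = "m60"   # hypothetical alias name, referenced later by flavors


def pci_passthrough_fragment(vendor_id, product_id, alias_name):
    """Build the [pci] section of nova.conf as text."""
    device = {"vendor_id": vendor_id, "product_id": product_id}
    alias = dict(device, device_type="type-PCI", name=alias_name)
    config = configparser.ConfigParser()
    config["pci"] = {
        # Which host devices nova-compute may hand over to guests.
        "passthrough_whitelist": json.dumps(device),
        # A name that flavors can use to request such a device.
        "alias": json.dumps(alias),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


fragment = pci_passthrough_fragment(VENDOR_ID, PRODUCT_ID, ALIAS_NAME)
print(fragment)
```

A flavor then requests the device through the `pci_passthrough:alias` extra spec, for example `m60:1` for one GPU per instance.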

Issues we must take care using GPGPU on OpenStack:

Paolo's talk about FC and NPIV

Good comparison between FC and Ethernet

A few interesting questions and comments:

  • How to deal with in-flight I/Os in the HBA / LUN. Apparently not there yet, Paolo told me about 200ms worth of in-flight I/Os...
  • Why would the host need to see ports that are for the guest? Noticed that my colleagues would assume that if visible to the guest, it has to be visible to the host. Not true with FC I believe.
  • How is FC managed for Linux? Not sure how this works.

virtio

Michael Tsirkin is not here, replaced.

Explain compatibility and feature masks.
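The feature-mask mechanism boils down to a bitwise AND: the device offers a set of feature bits, the driver acknowledges the subset it understands, and anything outside the intersection stays off. A minimal sketch, in which the bit numbers follow the virtio specification while `mask`, `negotiate`, `has_feature` and the sample masks are illustrative names only:

```python
# Feature bit numbers as defined in the virtio specification.
VIRTIO_RING_F_INDIRECT_DESC = 28
VIRTIO_RING_F_EVENT_IDX = 29
VIRTIO_F_VERSION_1 = 32


def mask(*bits):
    """Build a feature mask from bit numbers."""
    value = 0
    for bit in bits:
        value |= 1 << bit
    return value


def negotiate(device_features, driver_features):
    """Only features known to both sides end up enabled."""
    return device_features & driver_features


def has_feature(features, bit):
    return bool(features & (1 << bit))


# A device offering three features, and an older driver that knows only one.
device = mask(VIRTIO_RING_F_INDIRECT_DESC,
              VIRTIO_RING_F_EVENT_IDX,
              VIRTIO_F_VERSION_1)
old_driver = mask(VIRTIO_RING_F_INDIRECT_DESC)
negotiated = negotiate(device, old_driver)

print(has_feature(negotiated, VIRTIO_RING_F_INDIRECT_DESC))
print(has_feature(negotiated, VIRTIO_F_VERSION_1))
```

This is what gives compatibility in both directions: a new device keeps working with an old driver, and a new driver simply does not get features an old device never offered.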

Priorities:

  • Code compatibility
  • Intellectual Property Rights (IPR) compatibility

Myth #3: Queues with carefully tuned cache layout descriptor. Interesting ideas about cache-sensitive layout, but I am a bit puzzled by how this is done. Need to read the mails and rationale.

Relatively low battery (40%), need to be more careful about my note taking and usage of Spice and Tao3D, or I will not finish the day.

Maximizing VM performance

Not an enormous amount to say here that won't be in the slides.

Transactions with BTRFS

Transactional here means a) atomic and b) can be rolled back
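The two properties can be illustrated without any Btrfs specifics. In the sketch below, a dictionary stands in for the filesystem and a deep copy stands in for a snapshot; the `SnapshotStore` class is hypothetical and is not the interface of any tool from the talk.

```python
import copy


class SnapshotStore:
    """Toy model: updates are all-or-nothing and can be undone."""

    def __init__(self, data):
        self.current = data
        self.snapshots = []

    def transaction(self, update):
        """Run `update` on a copy and publish the copy only on success."""
        candidate = copy.deepcopy(self.current)
        update(candidate)  # if this raises, self.current is left untouched
        self.snapshots.append(self.current)
        self.current = candidate

    def rollback(self):
        """Return to the state before the last successful transaction."""
        self.current = self.snapshots.pop()


def broken_update(data):
    data["kernel"] = "half-installed"
    raise RuntimeError("update failed half-way")


store = SnapshotStore({"kernel": "4.13"})
store.transaction(lambda data: data.update(kernel="4.14"))

try:
    store.transaction(broken_update)
except RuntimeError:
    pass
after_failure = dict(store.current)  # a) atomic: the failed update left no trace

store.rollback()
after_rollback = dict(store.current)  # b) rolled back to the previous state
print(after_failure, after_rollback)
```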

Remote access

Connected to the internal server, noticed that Tao3D had a number of weird errors about the DBus being disconnected. After restarting the spice streaming agent, it crashed with an error message I had not seen before:

spice-streaming-agent.cpp:113: [5326 29.043051] poll: Polled, *pfd=-1
spice-streaming-agent.cpp:631: [5327 30.614538] frame: Got frame, 570 bytes, capture 1571487 us, interval 1571504 us
spice-streaming-agent.cpp:102: [5328 30.614552] poll: Polling streamfd 15, nfds 2, timeout 0
spice-streaming-agent.cpp:113: [5329 30.614556] poll: Polled, *pfd=-1
spice-streaming-agent.cpp:631: [5330 30.617805] frame: Got frame, 571 bytes, capture 3249 us, interval 3267 us
spice-streaming-agent.cpp:102: [5331 30.617817] poll: Polling streamfd 15, nfds 2, timeout 0
spice-streaming-agent.cpp:113: [5332 30.617820] poll: Polled, *pfd=-1
nvidia-frame-capture.cpp:355: [5333 53.786051] nvidia_fbc_error: Function FBCToH264GrabFrame failed: Unable to lock bitstream (status: 8) (14)
recorder/recorder.c:1572: [5334 53.786454] signals: Received signal Aborted (6) si_addr=0x402000020af, dumping recorder
recorder/recorder.c:735: [5335 53.786462] recorders: Recorder dump
Aborted (core dumped)

The new (unexpected) error message is

 nvidia_fbc_error: Function FBCToH264GrabFrame failed: Unable to lock bitstream (status: 8) (14)
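When going through the full log, it may help to parse the flight recorder lines and look at the time elapsed before the failure. The sketch below assumes the `file:line: [sequence timestamp] name: message` layout seen in the excerpt above; the regular expression and the helper names are illustrative, not part of the agent.

```python
import re

# Assumed layout: file:line: [sequence timestamp] name: message
ENTRY = re.compile(
    r"^(?P<source>[^:]+):(?P<line>\d+): "
    r"\[(?P<seq>\d+) (?P<time>\d+\.\d+)\] "
    r"(?P<name>\w+): (?P<message>.*)$"
)


def parse(log_text):
    """Return one dictionary per recorder entry, skipping other lines."""
    entries = []
    for raw in log_text.splitlines():
        match = ENTRY.match(raw.strip())
        if match:
            entry = match.groupdict()
            entry["seq"] = int(entry["seq"])
            entry["time"] = float(entry["time"])
            entries.append(entry)
    return entries


def largest_gap(entries):
    """Return the pair of consecutive entries separated by the longest time."""
    pairs = zip(entries, entries[1:])
    return max(pairs, key=lambda pair: pair[1]["time"] - pair[0]["time"])


# Last lines of the excerpt above.
SAMPLE = """\
spice-streaming-agent.cpp:113: [5332 30.617820] poll: Polled, *pfd=-1
nvidia-frame-capture.cpp:355: [5333 53.786051] nvidia_fbc_error: Function FBCToH264GrabFrame failed: Unable to lock bitstream (status: 8) (14)
recorder/recorder.c:1572: [5334 53.786454] signals: Received signal Aborted (6) si_addr=0x402000020af, dumping recorder
Aborted (core dumped)
"""

entries = parse(SAMPLE)
before, after = largest_gap(entries)
gap = after["time"] - before["time"]
print(f"{gap:.3f} s without entries before {after['name']}")
```

On the excerpt, this points at a gap of roughly 23 seconds between the last poll and the `nvidia_fbc_error` entry, which may be worth keeping in mind when investigating.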

Full log is here. First time I have a non-provoked flight recorder crash in the streaming agent. We need to look at it.