misc: IBM Virtual Management Channel Driver (VMC)

This driver is a logical device which provides an
interface between the hypervisor and a management
partition. This interface is like a message
passing interface. This management partition
is intended to provide an alternative to HMC-based
system management.

VMC enables the Management LPAR to provide basic
logical partition functions:
- Logical Partition Configuration
- Boot, start, and stop actions for individual
  partitions
- Display of partition status
- Management of virtual Ethernet
- Management of virtual Storage
- Basic system management

This driver is to be used for the POWER Virtual
Management Channel Virtual Adapter on the PowerPC
platform. It provides a character device which
allows for both request/response and async message
support through the /dev/ibmvmc node.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Reviewed-by: Adam Reznechek <adreznec@linux.vnet.ibm.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Taylor Jakobson <tjakobs@us.ibm.com>
Tested-by: Brad Warrum <bwarrum@us.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bryant G. Ly 2018-04-25 16:32:57 -05:00 committed by Greg Kroah-Hartman
parent 5b7d127726
commit 0eca353e7a
8 changed files with 2876 additions and 0 deletions


@ -327,6 +327,7 @@ Code Seq#(hex) Include File Comments
0xCA 80-BF uapi/scsi/cxlflash_ioctl.h
0xCB 00-1F CBM serial IEC bus in development:
<mailto:michael.klein@puffin.lb.shuttle.de>
0xCC 00-0F drivers/misc/ibmvmc.h pseries VMC driver
0xCD 01 linux/reiserfs_fs.h
0xCF 02 fs/cifs/ioctl.c
0xDB 00-0F drivers/char/mwave/mwavepub.h


@ -0,0 +1,226 @@
.. SPDX-License-Identifier: GPL-2.0+
======================================================
IBM Virtual Management Channel Kernel Driver (IBMVMC)
======================================================
:Authors:
Dave Engebretsen <engebret@us.ibm.com>,
Adam Reznechek <adreznec@linux.vnet.ibm.com>,
Steven Royer <seroyer@linux.vnet.ibm.com>,
Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Introduction
============
Note: Knowledge of virtualization technology is required to understand
this document.
A good reference document would be:
https://openpowerfoundation.org/wp-content/uploads/2016/05/LoPAPR_DRAFT_v11_24March2016_cmt1.pdf
The Virtual Management Channel (VMC) is a logical device which provides a
message-passing interface between the hypervisor and a management
partition. This management partition is intended to provide an alternative
to Hardware Management Console (HMC) based system management.
The primary hardware management solution that is developed by IBM relies
on an appliance server named the Hardware Management Console (HMC),
packaged as an external tower or rack-mounted personal computer. In a
Power Systems environment, a single HMC can manage multiple POWER
processor-based systems.
Management Application
----------------------
In the management partition, a management application exists which enables
a system administrator to configure the system's partitioning
characteristics via a command line interface (CLI) or Representational
State Transfer (REST) APIs.
The management application runs on a Linux logical partition on a
POWER8 or newer processor-based server that is virtualized by PowerVM.
System configuration, maintenance, and control functions which
traditionally require an HMC can be implemented in the management
application using a combination of HMC to hypervisor interfaces and
existing operating system methods. This tool provides a subset of the
functions implemented by the HMC and enables basic partition configuration.
The set of HMC to hypervisor messages supported by the management
application component is passed to the hypervisor over a VMC interface,
which is defined below.
The VMC enables the management partition to provide basic partitioning
functions:
- Logical Partitioning Configuration
- Start and stop actions for individual partitions
- Display of partition status
- Management of virtual Ethernet
- Management of virtual Storage
- Basic system management
Virtual Management Channel (VMC)
--------------------------------
A logical device, called the Virtual Management Channel (VMC), is defined
for communicating between the management application and the hypervisor. It
basically creates the pipes that enable virtualization management
software. This device is presented to a designated management partition as
a virtual device.
This communication device uses the Command/Response Queue (CRQ) and
Remote Direct Memory Access (RDMA) interfaces. A three-way handshake is
defined that must take place to establish that both the hypervisor and
management partition sides of the channel are running prior to
sending/receiving any of the protocol messages.
This driver also utilizes Transport Event CRQs. These CRQ messages are sent
when the hypervisor detects that one of the peer partitions has abnormally
terminated, or that one side has called H_FREE_CRQ to close its CRQ.
Two new classes of CRQ messages are introduced for the VMC device. VMC
Administrative messages are used for each partition using the VMC to
communicate capabilities to its partner. HMC Interface messages are used
for the actual flow of HMC messages between the management partition and
the hypervisor. As most HMC messages far exceed the size of a CRQ buffer,
a virtual DMA (RDMA) of the HMC message data is done prior to each HMC
Interface CRQ message. Only the management partition drives RDMA
operations; the hypervisor never directly causes the movement of message
data.
Terminology
-----------
RDMA
Remote Direct Memory Access is a DMA transfer from the server to its
client or from the server to its partner partition. DMA refers
to both physical I/O to and from memory operations and to memory
to memory move operations.
CRQ
Command/Response Queue is a facility which is used to communicate
between partner partitions. Transport events which are signaled
from the hypervisor to a partition are also reported in this queue.
Example Management Partition VMC Driver Interface
=================================================
This section provides an example for the management application
implementation where a device driver is used to interface to the VMC
device. This driver consists of a new device, for example /dev/ibmvmc,
which provides interfaces to open, close, read, write, and perform
ioctls against the VMC device.
VMC Interface Initialization
----------------------------
The device driver is responsible for initializing the VMC when the driver
is loaded. It first creates and initializes the CRQ. Next, an exchange of
VMC capabilities is performed to indicate the code version and number of
resources available in both the management partition and the hypervisor.
Finally, the hypervisor requests that the management partition create an
initial pool of VMC buffers, one buffer for each possible HMC connection,
which will be used for management application session initialization.
Prior to completion of this initialization sequence, the device returns
EBUSY to open() calls. EIO is returned for all other open() failures.
::
Management Partition Hypervisor
CRQ INIT
---------------------------------------->
CRQ INIT COMPLETE
<----------------------------------------
CAPABILITIES
---------------------------------------->
CAPABILITIES RESPONSE
<----------------------------------------
ADD BUFFER (HMC IDX=0,1,..) _
<---------------------------------------- |
ADD BUFFER RESPONSE | - Perform # HMCs Iterations
----------------------------------------> -
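As a rough illustration of the open() behaviour described above, a
management application might retry while the device still reports EBUSY.
This is only a sketch: the helper name ``vmc_open_with_retry``, the retry
count, and the delay are assumptions made for the example and are not part
of the driver.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the driver): open the VMC node,
 * retrying while the driver is still initializing and returns EBUSY.
 * Returns a file descriptor, or -1 with errno set.
 */
static int vmc_open_with_retry(const char *path, int max_tries,
			       long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int fd;
	int i;

	for (i = 0; i < max_tries; i++) {
		fd = open(path, O_RDWR);
		if (fd >= 0 || errno != EBUSY)
			return fd;	/* success, or an error retrying won't fix */
		if (i + 1 < max_tries)
			nanosleep(&delay, NULL);
	}
	errno = EBUSY;	/* still initializing after max_tries attempts */
	return -1;
}
```

A caller would pass the device node named in this document, for example
``vmc_open_with_retry("/dev/ibmvmc", 10, 100)``, and treat any error other
than EBUSY as final.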
VMC Interface Open
------------------
After the basic VMC channel has been initialized, an HMC session level
connection can be established. The application layer performs an open() to
the VMC device and executes an ioctl() against it, indicating the HMC ID
(32 bytes of data) for this session. If the VMC device is in an invalid
state, EIO will be returned for the ioctl(). The device driver creates a
new HMC session value (ranging from 1 to 255) and HMC index value (starting
at index 0 and ranging to 254) for this HMC ID. The driver then does an
RDMA of the HMC ID to the hypervisor, and then sends an Interface Open
message to the hypervisor to establish the session over the VMC. After the
hypervisor receives this information, it sends Add Buffer messages to the
management partition to seed an initial pool of buffers for the new HMC
connection. Finally, the hypervisor sends an Interface Open Response
message, to indicate that it is ready for normal runtime messaging. The
following illustrates this VMC flow:
::
Management Partition Hypervisor
RDMA HMC ID
---------------------------------------->
Interface Open
---------------------------------------->
Add Buffer _
<---------------------------------------- |
Add Buffer Response | - Perform N Iterations
----------------------------------------> -
Interface Open Response
<----------------------------------------
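The ioctl() step of the open flow above could look roughly like the
following from user space. The ioctl number is copied from
drivers/misc/ibmvmc.h in this commit; the helper names
``vmc_pack_hmc_id`` and ``vmc_set_hmc_id`` are hypothetical, and
zero-padding a shorter ID to 32 bytes is an assumption of this sketch
rather than documented behaviour.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Values copied from drivers/misc/ibmvmc.h in this commit. */
#define HMC_ID_LEN		32
#define VMC_BASE		0xCC
#define VMC_IOCTL_SETHMCID	_IOW(VMC_BASE, 0x00, unsigned char *)

/*
 * Hypothetical helper: copy an HMC ID string into a zero-padded
 * HMC_ID_LEN-byte buffer. Returns 0, or -1 if the ID is empty or
 * longer than HMC_ID_LEN bytes.
 */
static int vmc_pack_hmc_id(const char *id, unsigned char out[HMC_ID_LEN])
{
	size_t len = strlen(id);

	if (len == 0 || len > HMC_ID_LEN)
		return -1;
	memset(out, 0, HMC_ID_LEN);
	memcpy(out, id, len);
	return 0;
}

/* Hypothetical helper: associate an open VMC descriptor with an HMC ID. */
static int vmc_set_hmc_id(int fd, const char *id)
{
	unsigned char buf[HMC_ID_LEN];

	if (vmc_pack_hmc_id(id, buf) < 0) {
		errno = EINVAL;
		return -1;
	}
	/* Per the text above, EIO here means the device state is invalid. */
	return ioctl(fd, VMC_IOCTL_SETHMCID, buf);
}
```

Because the header lives under drivers/misc rather than in a uapi
directory, the sketch repeats the constants instead of including it.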
VMC Interface Runtime
---------------------
During normal runtime, the management application and the hypervisor
exchange HMC messages via the Signal VMC message and RDMA operations. When
sending data to the hypervisor, the management application performs a
write() to the VMC device, and the driver RDMAs the data to the hypervisor
and then sends a Signal Message. If a write() is attempted before VMC
device buffers have been made available by the hypervisor, or no buffers
are currently available, EBUSY is returned in response to the write(). A
write() will return EIO for all other errors, such as an invalid device
state. When the hypervisor sends a message to the management partition, the
data is put into a VMC buffer and a Signal Message is sent to the VMC driver
in the management partition. The driver RDMAs the buffer into the partition
and passes the data up to the appropriate management application via a
read() to the VMC device. The read() request blocks if there is no buffer
available to read. The management application may use select() to wait for
the VMC device to become ready with data to read.
::
Management Partition Hypervisor
MSG RDMA
---------------------------------------->
SIGNAL MSG
---------------------------------------->
SIGNAL MSG
<----------------------------------------
MSG RDMA
<----------------------------------------
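The runtime flow above suggests a simple pattern for the management
application: wait with select() until the device is readable, and retry a
write() a bounded number of times when EBUSY is returned. The sketch below
shows one way to do that; the helper names, the 1 ms pause, and the retry
limit are assumptions of the example and not part of the driver.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int vmc_wait_readable(int fd, int timeout_ms)
{
	struct timeval tv = {
		.tv_sec = timeout_ms / 1000,
		.tv_usec = (timeout_ms % 1000) * 1000,
	};
	fd_set rfds;
	int ret;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	ret = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (ret < 0)
		return -1;
	return ret > 0 && FD_ISSET(fd, &rfds);
}

/*
 * Hypothetical helper: write one HMC message, retrying while no VMC
 * buffer is available (EBUSY). Any other error is returned at once.
 */
static ssize_t vmc_write_msg(int fd, const void *msg, size_t len,
			     int max_tries)
{
	const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000L };
	ssize_t n;
	int i;

	for (i = 0; i < max_tries; i++) {
		n = write(fd, msg, len);
		if (n >= 0 || errno != EBUSY)
			return n;
		if (i + 1 < max_tries)
			nanosleep(&pause, NULL);
	}
	errno = EBUSY;	/* no buffer became available in time */
	return -1;
}
```

A reader loop would call ``vmc_wait_readable()`` and then read() the
message; since read() itself blocks when no buffer is available, the
select() step mainly matters when the application also watches other
descriptors.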
VMC Interface Close
-------------------
HMC session level connections are closed by the management partition when
the application layer performs a close() against the device. This action
results in an Interface Close message flowing to the hypervisor, which
causes the session to be terminated. The device driver must free any
storage allocated for buffers for this HMC connection.
::
Management Partition Hypervisor
INTERFACE CLOSE
---------------------------------------->
INTERFACE CLOSE RESPONSE
<----------------------------------------
Additional Information
======================
For more information on CRQ Messages, VMC Messages, HMC interface Buffers,
and signal messages, please refer to Section F of the Linux on Power
Architecture Platform Reference.


@ -6757,6 +6757,12 @@ L: linux-scsi@vger.kernel.org
S: Supported
F: drivers/scsi/ibmvscsi/ibmvfc*
IBM Power Virtual Management Channel Driver
M: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
M: Steven Royer <seroyer@linux.vnet.ibm.com>
S: Supported
F: drivers/misc/ibmvmc.*
IBM Power Virtual SCSI Device Drivers
M: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
L: linux-scsi@vger.kernel.org


@ -279,6 +279,7 @@
#define H_GET_MPP_X 0x314
#define H_SET_MODE 0x31C
#define H_CLEAR_HPT 0x358
#define H_REQUEST_VMC 0x360
#define H_RESIZE_HPT_PREPARE 0x36C
#define H_RESIZE_HPT_COMMIT 0x370
#define H_REGISTER_PROC_TBL 0x37C


@ -113,6 +113,20 @@ config IBM_ASM
for information on the specific driver level and support statement
for your IBM server.
config IBMVMC
tristate "IBM Virtual Management Channel support"
depends on PPC_PSERIES
help
This is the IBM POWER Virtual Management Channel.
This driver is to be used for the POWER Virtual
Management Channel virtual adapter on the PowerVM
platform. It provides both request/response and
async message support through the /dev/ibmvmc node.
To compile this driver as a module, choose M here: the
module will be called ibmvmc.
config PHANTOM
tristate "Sensable PHANToM (PCI)"
depends on PCI


@ -4,6 +4,7 @@
#
obj-$(CONFIG_IBM_ASM) += ibmasm/
obj-$(CONFIG_IBMVMC) += ibmvmc.o
obj-$(CONFIG_AD525X_DPOT) += ad525x_dpot.o
obj-$(CONFIG_AD525X_DPOT_I2C) += ad525x_dpot-i2c.o
obj-$(CONFIG_AD525X_DPOT_SPI) += ad525x_dpot-spi.o

2418
drivers/misc/ibmvmc.c Normal file

File diff suppressed because it is too large

209
drivers/misc/ibmvmc.h Normal file

@ -0,0 +1,209 @@
/* SPDX-License-Identifier: GPL-2.0+
*
* linux/drivers/misc/ibmvmc.h
*
* IBM Power Systems Virtual Management Channel Support.
*
* Copyright (c) 2004, 2018 IBM Corp.
* Dave Engebretsen <engebret@us.ibm.com>
* Steven Royer <seroyer@linux.vnet.ibm.com>
* Adam Reznechek <adreznec@linux.vnet.ibm.com>
* Bryant G. Ly <bryantly@linux.vnet.ibm.com>
*/
#ifndef IBMVMC_H
#define IBMVMC_H
#include <linux/types.h>
#include <linux/cdev.h>
#include <asm/vio.h>
#define IBMVMC_PROTOCOL_VERSION 0x0101
#define MIN_BUF_POOL_SIZE 16
#define MIN_HMCS 1
#define MIN_MTU 4096
#define MAX_BUF_POOL_SIZE 64
#define MAX_HMCS 2
#define MAX_MTU (4 * 4096)
#define DEFAULT_BUF_POOL_SIZE 32
#define DEFAULT_HMCS 1
#define DEFAULT_MTU 4096
#define HMC_ID_LEN 32
#define VMC_INVALID_BUFFER_ID 0xFFFF
/* ioctl numbers */
#define VMC_BASE 0xCC
#define VMC_IOCTL_SETHMCID _IOW(VMC_BASE, 0x00, unsigned char *)
#define VMC_IOCTL_QUERY _IOR(VMC_BASE, 0x01, struct ibmvmc_query_struct)
#define VMC_IOCTL_REQUESTVMC _IOR(VMC_BASE, 0x02, u32)
#define VMC_MSG_CAP 0x01
#define VMC_MSG_CAP_RESP 0x81
#define VMC_MSG_OPEN 0x02
#define VMC_MSG_OPEN_RESP 0x82
#define VMC_MSG_CLOSE 0x03
#define VMC_MSG_CLOSE_RESP 0x83
#define VMC_MSG_ADD_BUF 0x04
#define VMC_MSG_ADD_BUF_RESP 0x84
#define VMC_MSG_REM_BUF 0x05
#define VMC_MSG_REM_BUF_RESP 0x85
#define VMC_MSG_SIGNAL 0x06
#define VMC_MSG_SUCCESS 0
#define VMC_MSG_INVALID_HMC_INDEX 1
#define VMC_MSG_INVALID_BUFFER_ID 2
#define VMC_MSG_CLOSED_HMC 3
#define VMC_MSG_INTERFACE_FAILURE 4
#define VMC_MSG_NO_BUFFER 5
#define VMC_BUF_OWNER_ALPHA 0
#define VMC_BUF_OWNER_HV 1
enum ibmvmc_states {
ibmvmc_state_sched_reset = -1,
ibmvmc_state_initial = 0,
ibmvmc_state_crqinit = 1,
ibmvmc_state_capabilities = 2,
ibmvmc_state_ready = 3,
ibmvmc_state_failed = 4,
};
enum ibmhmc_states {
/* HMC connection not established */
ibmhmc_state_free = 0,
/* HMC connection established (open called) */
ibmhmc_state_initial = 1,
/* open msg sent to HV, due to ioctl(1) call */
ibmhmc_state_opening = 2,
/* HMC connection ready, open resp msg from HV */
ibmhmc_state_ready = 3,
/* HMC connection failure */
ibmhmc_state_failed = 4,
};
struct ibmvmc_buffer {
u8 valid; /* 1 when DMA storage allocated to buffer */
u8 free; /* 1 when buffer available for the Alpha Partition */
u8 owner;
u16 id;
u32 size;
u32 msg_len;
dma_addr_t dma_addr_local;
dma_addr_t dma_addr_remote;
void *real_addr_local;
};
struct ibmvmc_admin_crq_msg {
u8 valid; /* RPA Defined */
u8 type; /* ibmvmc msg type */
u8 status; /* Response msg status. Zero is success and on failure,
* either 1 - General Failure, or 2 - Invalid Version is
* returned.
*/
u8 rsvd[2];
u8 max_hmc; /* Max # of independent HMC connections supported */
__be16 pool_size; /* Maximum number of buffers supported per HMC
* connection
*/
__be32 max_mtu; /* Maximum message size supported (bytes) */
__be16 crq_size; /* # of entries available in the CRQ for the
* source partition. The target partition must
* limit the number of outstanding messages to
* one half or less.
*/
__be16 version; /* Indicates the code level of the management partition
* or the hypervisor with the high-order byte
* indicating a major version and the low-order byte
* indicating a minor version.
*/
};
struct ibmvmc_crq_msg {
u8 valid; /* RPA Defined */
u8 type; /* ibmvmc msg type */
u8 status; /* Response msg status */
union {
u8 rsvd; /* Reserved */
u8 owner;
} var1;
u8 hmc_session; /* Session Identifier for the current VMC connection */
u8 hmc_index; /* A unique HMC Idx would be used if multiple management
* applications running concurrently were desired
*/
union {
__be16 rsvd;
__be16 buffer_id;
} var2;
__be32 rsvd;
union {
__be32 rsvd;
__be32 lioba;
__be32 msg_len;
} var3;
};
/* an RPA command/response transport queue */
struct crq_queue {
struct ibmvmc_crq_msg *msgs;
int size, cur;
dma_addr_t msg_token;
spinlock_t lock;
};
/* VMC server adapter settings */
struct crq_server_adapter {
struct device *dev;
struct crq_queue queue;
u32 liobn;
u32 riobn;
struct tasklet_struct work_task;
wait_queue_head_t reset_wait_queue;
struct task_struct *reset_task;
};
/* Driver wide settings */
struct ibmvmc_struct {
u32 state;
u32 max_mtu;
u32 max_buffer_pool_size;
u32 max_hmc_index;
struct crq_server_adapter *adapter;
struct cdev cdev;
u32 vmc_drc_index;
};
struct ibmvmc_file_session;
/* Connection specific settings */
struct ibmvmc_hmc {
u8 session;
u8 index;
u32 state;
struct crq_server_adapter *adapter;
spinlock_t lock;
unsigned char hmc_id[HMC_ID_LEN];
struct ibmvmc_buffer buffer[MAX_BUF_POOL_SIZE];
unsigned short queue_outbound_msgs[MAX_BUF_POOL_SIZE];
int queue_head, queue_tail;
struct ibmvmc_file_session *file_session;
};
struct ibmvmc_file_session {
struct file *file;
struct ibmvmc_hmc *hmc;
bool valid;
};
struct ibmvmc_query_struct {
int have_vmc;
int state;
int vmc_drc_index;
};
#endif /* IBMVMC_H */
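
The comments on struct ibmvmc_admin_crq_msg describe a version field whose
high-order byte is the major version and whose low-order byte is the minor
version. The following is a small sketch of that layout and decoding, not
code from the driver: the struct and function names are hypothetical, the
struct is a user-space mirror using stdint types, and the version helpers
expect a value already converted from big-endian to host byte order.

```c
#include <stdint.h>

#define IBMVMC_PROTOCOL_VERSION 0x0101	/* value from ibmvmc.h above */

/*
 * Illustrative user-space mirror of struct ibmvmc_admin_crq_msg.
 * The multi-byte fields are big-endian in the real message.
 */
struct vmc_admin_msg_sketch {
	uint8_t  valid;
	uint8_t  type;
	uint8_t  status;
	uint8_t  rsvd[2];
	uint8_t  max_hmc;
	uint16_t pool_size;
	uint32_t max_mtu;
	uint16_t crq_size;
	uint16_t version;
};

/* The fields above add up to 16 bytes with natural alignment. */
_Static_assert(sizeof(struct vmc_admin_msg_sketch) == 16,
	       "admin CRQ message sketch should be 16 bytes");

/* High-order byte of the host-order version is the major number. */
static unsigned int vmc_version_major(uint16_t version)
{
	return (version >> 8) & 0xff;
}

/* Low-order byte of the host-order version is the minor number. */
static unsigned int vmc_version_minor(uint16_t version)
{
	return version & 0xff;
}
```

With the value defined in this header, ``IBMVMC_PROTOCOL_VERSION`` decodes
to major 1, minor 1.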