Re: [RFC PATCH v5 00/19] virtio/vsock: introduce SOCK_SEQPACKET support
From: Arseny Krasnov
Date: Wed Feb 24 2021 - 03:40:35 EST
On 24.02.2021 11:35, Stefano Garzarella wrote:
> On Wed, Feb 24, 2021 at 11:28:50AM +0300, Arseny Krasnov wrote:
>> On 24.02.2021 11:23, Stefano Garzarella wrote:
>>> On Wed, Feb 24, 2021 at 07:29:25AM +0300, Arseny Krasnov wrote:
>>>> On 23.02.2021 17:50, Stefano Garzarella wrote:
>>>>> On Mon, Feb 22, 2021 at 03:23:11PM +0100, Stefano Garzarella wrote:
>>>>>> Hi Arseny,
>>>>>>
>>>>>> On Thu, Feb 18, 2021 at 08:33:44AM +0300, Arseny Krasnov wrote:
>>>>>>> This patchset implements support of SOCK_SEQPACKET for virtio
>>>>>>> transport.
>>>>>>> As SOCK_SEQPACKET guarantees to preserve record boundaries, two
>>>>>>> new packet operations were added: the first marks the start of a
>>>>>>> record and the second marks the end of a record (SEQ_BEGIN and
>>>>>>> SEQ_END later). Also, both operations carry metadata to maintain
>>>>>>> boundaries and payload integrity. Metadata is introduced by adding
>>>>>>> a special header with two fields, message count and message length:
>>>>>>>
>>>>>>> struct virtio_vsock_seq_hdr {
>>>>>>> 	__le32 msg_cnt;
>>>>>>> 	__le32 msg_len;
>>>>>>> } __attribute__((packed));
>>>>>>>
>>>>>>> This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>>>>>> packets (buffer of second virtio descriptor in chain) in the same
>>>>>>> way as data transmitted in RW packets. Payload was chosen as buffer
>>>>>>> for this header to avoid touching the first virtio buffer, which
>>>>>>> carries the header of the packet, because someone could check that
>>>>>>> the size of this buffer is equal to the size of the packet header.
>>>>>>> To send a record, a packet with the start marker is sent first (its
>>>>>>> header contains the length of the record and the counter), then the
>>>>>>> counter is incremented and all data is sent as usual 'RW' packets,
>>>>>>> and finally SEQ_END is sent (it also carries the counter of the
>>>>>>> message, which is the counter of SEQ_BEGIN + 1). After sending
>>>>>>> SEQ_END the counter is incremented again. On the receiver's side,
>>>>>>> the length of the record is known from the packet with the start
>>>>>>> record marker. To check that no packets were dropped by the
>>>>>>> transport, the counters of two sequential SEQ_BEGIN and SEQ_END are
>>>>>>> checked (the counter of SEQ_END must be bigger than the counter of
>>>>>>> SEQ_BEGIN by 1) and the length of data between the two markers is
>>>>>>> compared to the length in the SEQ_BEGIN header.
>>>>>>> Since packets of one socket are not reordered on either the vsock
>>>>>>> or the vhost transport layer, such markers allow the receiver to
>>>>>>> restore the original record. If the user's buffer is smaller than
>>>>>>> the record length, then all out-of-size data is dropped.
>>>>>>> As with a stream socket, the maximum length of a datagram is not
>>>>>>> limited, because the same credit logic is used. The difference from
>>>>>>> a stream socket is that the user is not woken up until the whole
>>>>>>> record is received or an error occurs. The implementation also
>>>>>>> supports the 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>>>>> Tests are also implemented.
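The receive-side check described in the cover letter above can be sketched
roughly as follows. This is a minimal userspace model, not code from the
patchset: 'struct seq_hdr' is a host-endian stand-in for
virtio_vsock_seq_hdr, and 'struct seq_rx_state' and the seq_rx_*() helpers
are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-endian stand-in for virtio_vsock_seq_hdr (the real one uses __le32). */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Hypothetical per-socket state for the record currently being received. */
struct seq_rx_state {
	bool in_record;
	uint32_t begin_cnt;    /* msg_cnt taken from SEQ_BEGIN */
	uint32_t expected_len; /* msg_len taken from SEQ_BEGIN */
	uint32_t received_len; /* bytes seen in RW packets so far */
};

/* SEQ_BEGIN: remember the counter and the announced record length. */
static void seq_rx_begin(struct seq_rx_state *st, const struct seq_hdr *hdr)
{
	st->in_record = true;
	st->begin_cnt = hdr->msg_cnt;
	st->expected_len = hdr->msg_len;
	st->received_len = 0;
}

/* RW packet: only the payload length matters for the integrity check. */
static void seq_rx_data(struct seq_rx_state *st, uint32_t len)
{
	assert(st->in_record);
	st->received_len += len;
}

/*
 * SEQ_END: the record is intact only if the counter is exactly one more
 * than the SEQ_BEGIN counter and the byte count matches the announced length.
 */
static bool seq_rx_end(struct seq_rx_state *st, const struct seq_hdr *hdr)
{
	bool ok = st->in_record &&
		  hdr->msg_cnt == st->begin_cnt + 1 &&
		  st->received_len == st->expected_len;

	st->in_record = false;
	return ok;
}
```

A receiver built this way would report an error for a record whose SEQ_END
counter or byte count does not match, instead of delivering it to the user.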
>>>>>> I reviewed the first part (af_vsock.c changes), tomorrow I'll review
>>>>>> the rest. That part looks great to me, only found a few minor issues.
>>>>> I reviewed the rest of it as well, left a few minor comments, but I
>>>>> think we're well on track.
>>>>>
>>>>> I'll take a better look at the specification patch tomorrow.
>>>> Great, Thank You
>>>>> Thanks,
>>>>> Stefano
>>>>>
>>>>>> In the meantime, however, I'm getting a doubt, especially with regard
>>>>>> to other transports besides virtio.
>>>>>>
>>>>>> Should we hide the begin/end marker sending in the transport?
>>>>>>
>>>>>> I mean, should the transport just provide a seqpacket_enqueue()
>>>>>> callback?
>>>>>> Inside it then the transport will send the markers. This is because
>>>>>> some transports might not need to send markers.
>>>>>>
>>>>>> But thinking about it more, they could actually implement stubs for
>>>>>> those calls, if they don't need to send markers.
>>>>>>
>>>>>> So I think for now it's fine since it allows us to reuse a lot of
>>>>>> code, unless someone has some objection.
>>>> I thought about that, I'll try to implement it in next version. Let's see...
>>> If you want to discuss it first, write down the idea you want to
>>> implement, I wouldn't want to make you do unnecessary work. :-)
>> The idea is simple: the iov iterator of the 'struct msghdr' passed to
>> the enqueue callback has two fields: 'iov_offset', which is the byte
>> offset inside the io vector where the next data must be picked, and
>> 'count', which is the number of unprocessed bytes left in the io
>> vector. So in the seqpacket enqueue callback, if 'iov_offset' is 0
>> I'll send SEQBEGIN, and if 'count' is 0 I'll send SEQEND.
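The marker logic proposed above can be modeled in userspace roughly like
this. It is only a sketch under stated assumptions: 'struct iter_model'
mimics just the two iterator fields mentioned (it is not the kernel's
'struct iov_iter'), the message is assumed to sit in a single io vector
segment so that 'iov_offset' is 0 only at the start of the record, and
seqpacket_enqueue_model() and the SENT_* flags are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two iterator fields discussed in the thread. */
struct iter_model {
	size_t iov_offset; /* byte offset of the next data to pick */
	size_t count;      /* unprocessed bytes left */
};

/* Which packets one enqueue call would emit (bit flags, for illustration). */
enum {
	SENT_BEGIN = 1 << 0,
	SENT_RW    = 1 << 1,
	SENT_END   = 1 << 2,
};

/*
 * Consume 'chunk' bytes of the message and report which packets would go
 * out: SEQBEGIN if nothing has been consumed yet, an RW packet for the
 * data itself, and SEQEND once nothing is left.
 */
static int seqpacket_enqueue_model(struct iter_model *it, size_t chunk)
{
	int sent = 0;

	assert(chunk > 0 && chunk <= it->count);

	if (it->iov_offset == 0)
		sent |= SENT_BEGIN;

	it->iov_offset += chunk;
	it->count -= chunk;
	sent |= SENT_RW;

	if (it->count == 0)
		sent |= SENT_END;

	return sent;
}
```

With this shape the vsock core never has to know about the markers; a
transport that does not need them could simply leave the two checks out.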
>>
> Got it, makes sense and it's definitely more transparent for the vsock
> core!
> Go ahead, maybe adding a comment in the vsock core explaining this, so
> other developers can understand better if they want to support SEQPACKET
> in other transports.
Ack
>
> Thanks,
> Stefano