<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3261 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3261.xml">
<!ENTITY rfc3550 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc4103 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4103.xml">
<!ENTITY rfc4575 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4575.xml">
<!ENTITY rfc4579 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4579.xml">
<!ENTITY I-D.hellstrom-textpreview SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.hellstrom-textpreview.xml">
]>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="bcp" docName="draft-hellstrom-text-conference-00"
     ipr="full3978">
  <front>
    <title abbrev="Text conference handling">Text media handling in RTP based
    real-time and message conferences</title>

    <author fullname="Gunnar Hellstrom" initials="G" surname="Hellstrom">
      <organization>Omnitor</organization>

      <address>
        <postal>
          <street>Box 92054</street>

          <city>Stockholm</city>

          <code>SE-120 06</code>

          <country>Sweden</country>
        </postal>

        <phone>+46858900056</phone>

        <facsimile>+4658900051</facsimile>

        <email>gunnar.hellstrom@omnitor.se</email>

        <uri>www.omnitor.se</uri>
      </address>
    </author>

    <author fullname="Arnoud van Wijk" initials="A. T." surname="van Wijk">
      <organization>RealTimeText.org</organization>

      <address>
        <email>arnoud@realtimetext.org</email>

        <uri>http://www.realtimetext.org</uri>
      </address>
    </author>

    <date day="2" month="January" year="2008" />

    <abstract>
      <t>This memo specifies methods for text media handling in multi-party
      calls, where the text is carried by the RTP protocol. Real-time text is
      carried in a time-sampled mode according to RFC 4103. Centralized
      multi-party handling of real-time text is achieved through a media
      control unit coordinating multiple RTP text streams into one RTP
      session, identifying each stream with its own SSRC. Identification for
      the streams are provided through the RTCP messages. This mechanism
      enables the receiving application to present the received real-time text
      medium in different ways according to user preferences. Some
      presentation related features are also described explaining suitable
      variations of transmission and presentation of text. Call control
      features are described for the SIP environment, while the transport
      mechanisms should be suitable for any IP based call control environment
      using RTP transport.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>Real-time text is a medium in real-time conversational sessions. Text
      entered by participants in a session is transmitted in a time-sampled
      fashion, so that no specific user action is needed to cause
      transmission. This gives a direct flow of text that is suitable in a
      real-time conversational setting. The real-time text medium can be
      combined with other media in multimedia sessions.</t>

      <t>A number of multimedia sessions can be combined in a multi-party
      session. This memo specifies how the real-time text streams are handled
      in such multi-party calls.</t>

      <t>The description is mainly focused on the transport level, but also
      describes a few presentation level features.</t>

      <t>Transport of real-time text is specified in RFC 4103 <xref
      target="RFC4103"></xref> RTP Payload for text conversation. It makes use
      of RFC 3550 <xref target="RFC3550"></xref> Real Time Protocol, for
      transport, and is usually used in the SIP RFC 3261<xref
      target="RFC3261"></xref> Session Initiation Protocol environment, even
      if it is also used in other call control environments. Call control
      aspects in this specification are explained with examples from SIP. The
      specifications about how to handle multi-party text transport,
      identification and presentation are valid also for other call control
      environments where RTP and RTCP are used.</t>

      <t>A very brief overview of functions for both real-time and messaging
      text handling in multi-party sessions is described in RFC 4597
      Conferencing Scenarios <xref target="RFC4579"></xref>. This
      specification builds on that description and indicates what existing
      protocol mechanisms should be used to implement multi-party handling of
      text in real-time sessions.</t>

      <t></t>
    </section>

    <section title="Centralized conference model">
      <t>In the centralized conference model, one function co-ordinates the
      sessions with participants in the multi-party session. This function
      also controls media mixer functions for the media appearing in the
      session. The central function is common for control of all media, while
      the media mixers may work differently for each medium.</t>

      <t>The central function is called the Focus UA and may be co-located in
      an advanced terminal including multi-party control functions, or it may
      be located in a separate location. Many variants exist for setting up
      sessions including the multipoint control centre, It is not within scope
      of this description to describe these, but rather the media specific
      handling in the mixer required to handle multi-party calls.</t>

      <t>The main principle for handling real-time text media in a centralized
      conference is that one RTP session for real-time text is established
      between the multipoint media control centre and each participant who is
      going to have real-time text exchange with the others.</t>

      <t>Within each RTP session, text from each participant is transmitted
      from the media mixer as a separate RTP stream, thus all using the same
      destination address/port combination, but using different RTP SSRCs as
      described in Section 7.1 of RTP RFC 3550 <xref target="RFC3550"></xref>
      about the Translator function. This methods enables the receiver to
      freely select display characteristics of the text conversation.</t>

      <t>General session control aspects for multi-party sessions are
      described in RFC 4575 A Session Initiation Protocol (SIP) Event Package
      for Conference State<xref target="RFC4575"></xref>, and RFC 4579 Session
      Initiation Protocol (SIP) Call Control - Conferencing for User Agents
      <xref target="RFC4579"></xref>. The nomenclature of these specifications
      are used here.</t>
    </section>

    <section title="Identification of the source of text">
      <t>The Focus UA co-ordinates the media flow. Real-time text media from
      different sources are combined in one text media session by the Focus
      UA. The Focus UA acts as an RTP Translator as described in RFC 3550 RTP
      <xref target="RFC3550"></xref> Section 7.1.</t>

      <t>The RTP text stream from each participant who transmits text is
      allocated one unique SSRC. The SSRC is used by the receiver to identify
      text packets originating from one source. Each RTP packet MUST contain
      text from only one source.</t>

      <t>The redundancy mechanism for increased robustness used by the RFC
      4103 transport makes use of the RTP sequence number for detection of
      loss. One sequence number series is maintained per RTP stream identified
      by one SSRC. The RTP Translation mechanism maintains a separate SSRC for
      each source RTP stream in the combined RTP session. Therefore the RTP
      Translation mechanism can be used for conveying text from multiple
      sources to one destination, with maintained possibility to detect and
      recover loss and identify text from the different sources.</t>

      <t>As soon as a new member is added to the RTP session, its
      characteristics shall be transmitted in RTCP SDES reports according to
      section 6.5 in RFC 3550.</t>

      <t>In the RTCP SDES report, SHOULD contain identification of the source
      represented by the SSRC identifier. This identification MUST contain the
      CNAME field and MAY contain the NAME field and other defined fields of
      the SDES report.</t>

      <t>A focus UA SHOULD primarily convey SDES information received from the
      sources of the session members. When such information is not available,
      the focus UA SOULD compose CNAME and NAME information from available
      information from the SIP session with the participant.</t>

      <t></t>
    </section>

    <section title="Presentation of multi-party text">
      <t>All session participants MUST observe the SSRC field of incoming text
      RTP packets, and make note of what source they came from in order to be
      able to present text in a way that makes it easy to read text from each
      participant in a session, and get information about the source of the
      text.</t>

      <section title="Associating identities with text streams">
        <t>A source identity SHOULD be composed from available information
        sources and displayed together with the text as indicated in <xref
        target="T.140">ITU-T T.140 Appendix </xref>.</t>

        <t>The source should primarily be the NAME field from incoming SDES
        packets. If this information is not available, and the session is a
        two-party session, then the T.140 source identity SHOULD be composed
        from the SIP session participant information. For multi-party sessions
        the source identity may be composed by local information if sufficient
        information is not available in the session.</t>

        <t>Applications may abbreviate the presented source identity to a
        suitable form for the available display.</t>
      </section>

      <section title="Selecting what to present">
        <t>Display space limitations and other considerations may call for an
        opportunity for the user to select what sources of text to present, at
        what stage in the reception process to display them and how to present
        them. The specification draft-hellstrom-text-preview <xref
        target="I-D.hellstrom-textpreview"></xref> specifies such presentation
        aspects.</t>
      </section>

      <t></t>
    </section>

    <section title="Transmission of text from each user">
      <t>UAs participating in sessions with real-time text, SHOULD send SDES
      packets in RTCP giving values to appropriate identification fields. The
      NAME field should be given a value that is suitable as an identifier of
      text from the user of the UA.</t>

      <t></t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no request of IANA.</t>

      <t>Note to RFC Editor: this section may be removed on publication as an
      RFC.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The security considerations valid for RFC 4103 and RFC 3550 are valid
      also for the multi-party sessions with text.</t>
    </section>

    <section title="Congestion considerations">
      <t>The congestion considerations described in RFC 4103 are valid also
      for multi-party use of the real-time text RTP transport. A risk for
      congestion may appear if a number of conference participants are active
      transmitting text simultaneously, because this multi-party transmission
      method does not allow multiple sources of text to contribute to the same
      packet.</t>

      <t>In situations of risk for congestion, the Focus UA MAY combine
      packets from the same source to increase the transmission interval per
      source up to one second. Local conference policy in the Focus UA may be
      used to decide on which streams shall be selected for such transmission
      frequency reduction.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t></t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      &rfc2119;

      &rfc3261;

      &rfc3550;

      &rfc4103;

      &rfc4575;

      &rfc4579;

      &I-D.hellstrom-textpreview;

      <reference anchor="T.140" target="http://www.itu.int/rec/T-REC-T.140/en">
        <front>
          <title>Protocol for multimedia application text conversation</title>

          <author>
            <organization>ITU-T.</organization>
          </author>

          <date year="1998" />
        </front>
      </reference>
    </references>
  </back>
</rfc>