Songcast Direct
Overview
Songcast Direct (SCD) can be used to send decoded audio from any computing device to an OpenHome device. The Sender device is responsible for decoding audio to PCM, framing this in a simple protocol and making it available via a simple TCP server. The Receiver pulls data from this server, giving the Receiver complete control of the audio clock.
A Songcast Direct sender performs a broadly similar role to a Songcast sender. Applications that perform their own decoding to PCM will find SCD is easier to integrate and offers higher audio performance:
- There is a constant TCP connection between sender and receiver, avoiding the need for application-level resend support
- It uses the clock of the receiver, allowing for higher quality playback if the sender is on a desktop computer
- It offers the receiver limited control over the stream – initially seeking within a track or skipping between tracks. While the primary control UI may reside on the sender device, this allows for integration with any IR handset for the receiver.
Sample Code
Sample code exists for all key aspects of the Sender – server, framing and control.
Discovery
SSDP
First search for av-openhome-org:service:Product:2, then either:
- Check Attributes on Product service for "Transport"
- Check Modes on Transport service for "scd"
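The SSDP search step can be sketched as follows. This is a minimal illustration, not a reference implementation: the `urn:` prefix on the search target follows the usual UPnP service-type convention and is an assumption here, and `build_msearch` and `discover` are hypothetical helper names. Checking the Product Attributes or Transport Modes afterwards is left to whichever UPnP stack is in use.

```python
import socket

SSDP_ADDR = ("239.255.255.250", 1900)  # standard SSDP multicast group and port
# ASSUMPTION: the service type from the text, written with the usual "urn:" prefix.
PRODUCT_ST = "urn:av-openhome-org:service:Product:2"


def build_msearch(search_target: str = PRODUCT_ST, mx: int = 3) -> bytes:
    """Build an SSDP M-SEARCH request for the given search target."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR[0]}:{SSDP_ADDR[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",
        f"ST: {search_target}",
        "",
        "",  # the two empty entries produce the terminating blank line
    ]
    return "\r\n".join(lines).encode("ascii")


def discover(timeout: float = 3.0) -> list[str]:
    """Send one M-SEARCH and return the LOCATION header of each response."""
    locations = []
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_msearch(), SSDP_ADDR)
        while True:
            try:
                data, _addr = sock.recvfrom(65507)
            except socket.timeout:
                break  # no more responses within the timeout
            for line in data.decode("utf-8", "replace").split("\r\n"):
                name, _, value = line.partition(":")
                if name.strip().lower() == "location":
                    locations.append(value.strip())
    finally:
        sock.close()
    return locations
```

Each returned location points at a device description, from which the Product and Transport services can then be queried.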
mDNS
Search for the _openhome._odp service to identify the ODP endpoint.
Control
Control of the Receiver is out of band, via the standard OpenHome network APIs. These are available over UPnP or ODP (a single connected socket).
Protocol
The protocol used for communication between Sender and Receiver contains the following message types:
Type | Description | Sent by |
---|---|---|
Ready | Signals availability and version support to other party | Receiver & Sender |
MetadataDidl | Metadata relevant until the end of the next track. Will cause the receiver to reset its reported time indicator for the audio stream. Should be sent after all Audio for any preceding track. Uses DIDL-Lite. | Sender |
MetadataOh | Metadata relevant until the end of the next track. Will cause the receiver to reset its reported time indicator for the audio stream. Should be sent after all Audio for any preceding track. Uses OpenHome Metadata format. | Sender |
Format | Format for the following audio. Must be sent before any audio. | Sender |
Audio | Decoded audio. Can only be sent after a Format message describing its sample rate etc. | Sender |
MetatextDidl | Metadata relevant to a portion of a track. May be sent [0..n] times during a track. Uses DIDL-Lite. | Sender |
MetatextOh | Metadata relevant to a portion of a track. May be sent [0..n] times during a track. Uses OpenHome Metadata format. | Sender |
Halt | Indicates that a break in audio follows. E.g. at the end of a track with no further tracks to be played, or when the Sender has paused. | Sender |
Disconnect | Indicates that the originator of the message is closing its SCD session. No further audio will be available. The message receiver should disconnect its socket. | Receiver & Sender |
Seek | Indicates that the Receiver wants to jump to a different point in the current track. Is only sent for tracks for which the Sender has indicated that seeking is supported. | Receiver |
Skip | Indicates that the Receiver wants to immediately jump to the next or previous track. | Receiver |
An SCD session is initiated by the Sender. It instantiates a simple TCP server then instructs the Receiver to start playing from it.
The Receiver connects to the Sender and sends a Ready message, indicating the protocol version it supports. If the Sender is compatible with this version, it responds with a Ready message stating the same version. Otherwise, the Sender sends a Ready message stating the version it supports. If the Receiver cannot support this version, it must close the connection.
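The Ready exchange described above might look like the sketch below. The text does not define what "compatible" means, so the rule used here (same major version, offered minor version no newer than the Sender's) is an assumption, as are the names `Version`, `sender_reply` and `receiver_must_close`.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int


SENDER_VERSION = Version(1, 0)  # the version this hypothetical Sender implements


def sender_is_compatible(offered: Version, own: Version = SENDER_VERSION) -> bool:
    # ASSUMPTION: compatible means same major and an offered minor we already know.
    return offered.major == own.major and offered.minor <= own.minor


def sender_reply(offered: Version, own: Version = SENDER_VERSION) -> Version:
    """Version the Sender states in its Ready reply to the Receiver's Ready."""
    # Echo the Receiver's version when compatible, otherwise state our own.
    return offered if sender_is_compatible(offered, own) else own


def receiver_must_close(reply: Version, supported: set[Version]) -> bool:
    """True if the Receiver cannot support the version in the Sender's reply."""
    return reply not in supported
```

For example, a Receiver offering 2.0 to a 1.0 Sender would be answered with 1.0, and would then close the connection unless 1.0 is among the versions it supports.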
After sending a Ready message, the Sender should send either Halt (if it has nothing to send yet) or one of MetadataDidl or MetadataOh plus Format. The order of Format versus Metadata* is not important.
The Sender should then send Audio. Any number of Audio messages can be sent until the end of the stream is reached. Metatext messages (either Didl or Oh) may be interleaved with Audio.
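The ordering rules stated so far can be captured in a small checker, which may be useful when testing a Sender. This is only a sketch of three rules taken from the text (a session starts with Ready, Ready is followed by Halt, Metadata* or Format, and Audio never precedes a Format); `check_sender_sequence` is a hypothetical helper, and message types are represented by their names rather than their wire values.

```python
METADATA = {"MetadataDidl", "MetadataOh"}
ALLOWED_AFTER_READY = METADATA | {"Format", "Halt"}


def check_sender_sequence(messages: list[str]) -> list[str]:
    """Return a list of ordering problems found in a Sender's message sequence."""
    problems = []
    if not messages or messages[0] != "Ready":
        problems.append("session must start with Ready")
    format_seen = False
    for index, name in enumerate(messages):
        # The message directly after Ready should be Halt, Metadata* or Format.
        if index == 1 and name not in ALLOWED_AFTER_READY:
            problems.append(f"{name} sent directly after Ready")
        if name == "Format":
            format_seen = True
        elif name == "Audio" and not format_seen:
            problems.append(f"Audio at position {index} precedes any Format")
    return problems
```

A sequence such as Ready, MetadataOh, Format, Audio, MetatextOh, Audio, Halt passes with no problems reported, while Ready followed immediately by Audio is flagged twice.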
A Halt message implies that a break in audio may follow. If this break is during a stream, the Sender must have applied attenuation to samples immediately before this to avoid any audio artifacts at the break in transmission. Senders which cannot do this should implement pausing of streams using the Transport.Pause control API instead.
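One way a Sender could attenuate the samples immediately before a Halt is a short linear fade-out over the final frames. The sketch below assumes 16-bit, big-endian, interleaved PCM; the ramp shape and length are illustrative choices that the text does not specify, and `fade_out_16bit` is a hypothetical helper.

```python
import struct


def fade_out_16bit(pcm: bytes, channels: int, ramp_frames: int) -> bytes:
    """Linearly ramp the last ramp_frames frames of 16-bit big-endian PCM to silence."""
    frame_bytes = 2 * channels
    total_frames = len(pcm) // frame_bytes  # any trailing partial frame is dropped
    ramp_frames = min(ramp_frames, total_frames)
    count = total_frames * channels
    samples = list(struct.unpack(f">{count}h", pcm[: count * 2]))
    first = total_frames - ramp_frames  # index of the first frame inside the ramp
    for i in range(ramp_frames):
        # Gain falls in equal steps and reaches exactly 0 on the final frame.
        gain = (ramp_frames - 1 - i) / ramp_frames
        for ch in range(channels):
            idx = (first + i) * channels + ch
            samples[idx] = int(samples[idx] * gain)
    return struct.pack(f">{count}h", *samples)
```

The faded block would be sent as the last Audio message before the Halt. Other bit depths need a different unpacking step, since 24-bit samples have no direct `struct` format code.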
The details of the wire protocol are shown below. Note that all multi-byte integer values are big endian.
Bytes | Name | Description |
---|---|---|
Header | Prefixes to all messages below | |
4 | Signature | 0x73, 0x63, 0x64, 0x20 ('scd ') |
1 | Type | The type of message: |
2 | Length | Length in bytes of the whole message including this header |
4 | Reserved | Unused, reserved for future use |
Ready | ||
2 | Major Version | 1 |
2 | Minor Version | 0 |
MetadataDidl | ||
2 | TrackUriLength | Length, in bytes, of the track URI |
m | TrackUri | The track URI, where m = TrackUriLength |
2 | MetadataLength | Length, in bytes, of the track metadata, in DIDL-Lite format |
n | Metadata | The track metadata, where n = MetadataLength |
MetadataOh | ||
1 | Count | Number of key-value pairs that follow |
1 | KeyLength | Length, in bytes, of the key |
p | Key | Key for the metadata element, where p = KeyLength |
2 | ValueLength | Length, in bytes, of the value |
q | Value | Value for the metadata element, where q = ValueLength |
... | ... | Repeat key-value pairs for all Count metadata elements |
Format | ||
1 | BitDepth | Bit depth of the following audio |
4 | SampleRate | Sample rate of the following audio |
1 | Channels | Number of channels of the following audio |
4 | BitRate | Bit rate of the following audio |
8 | SamplesTotal | Total number of samples in the following audio (requirement for this to be confirmed) |
8 | SampleStart | Sample position of the first sample in the next Audio message |
1 | Flags | |
1 | CodecNameLength | Length, in bytes, of the codec name |
r | CodecName | The codec name, where r = CodecNameLength |
Audio | ||
2 | NumSamples | The number of audio samples in this message |
s | AudioData | Audio data, where s = NumSamples * (BitDepth/8) * Channels. Multi-channel audio must be interleaved one sample at a time, with the left channel first. Audio is packed (i.e. no padding bytes rounding samples up to 32-bit boundaries) and big endian. |
MetatextDidl | ||
2 | MetatextLength | Length, in bytes, of the metatext, in DIDL-Lite format |
n | Metatext | The metatext, where n = MetatextLength |
MetatextOh | ||
1 | Count | Number of key-value pairs that follow |
1 | KeyLength | Length, in bytes, of the key |
p | Key | Key for the metatext element, where p = KeyLength |
2 | ValueLength | Length, in bytes, of the value |
q | Value | Value for the metatext element, where q = ValueLength |
... | ... | Repeat key-value pairs for all Count metatext elements |
Halt | ||
No message body (only header required) | ||
Disconnect | ||
No message body (only header required) | ||
Seek | ||
To be confirmed | ||
Skip | ||
To be confirmed |
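The layouts above map directly onto Python's `struct` module with big-endian format strings. The following is a sketch under several assumptions: the numeric Type codes are not listed above, so `msg_type` is left as a caller-supplied parameter; Flags is passed through as an opaque byte; the Reserved field is written as zero; strings are encoded as UTF-8; and all function names are hypothetical.

```python
import struct

SIGNATURE = b"scd "                # 0x73, 0x63, 0x64, 0x20
HEADER = struct.Struct(">4sBHI")   # Signature, Type, Length, Reserved = 11 bytes


def pack_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prefix a body with the common header; Length covers header plus body."""
    length = HEADER.size + len(body)
    if length > 0xFFFF:
        raise ValueError("message does not fit in the 2-byte Length field")
    return HEADER.pack(SIGNATURE, msg_type, length, 0) + body  # Reserved = 0 (assumed)


def unpack_header(data: bytes) -> tuple[int, int]:
    """Return (Type, Length) from the first 11 bytes of a message."""
    signature, msg_type, length, _reserved = HEADER.unpack_from(data)
    if signature != SIGNATURE:
        raise ValueError("not an SCD message")
    return msg_type, length


def ready_body(major: int = 1, minor: int = 0) -> bytes:
    return struct.pack(">HH", major, minor)


def format_body(bit_depth: int, sample_rate: int, channels: int, bit_rate: int,
                samples_total: int, sample_start: int, flags: int,
                codec_name: str) -> bytes:
    name = codec_name.encode("utf-8")  # ASSUMPTION: text encoding is not stated
    fixed = struct.pack(">BIBIQQBB", bit_depth, sample_rate, channels, bit_rate,
                        samples_total, sample_start, flags, len(name))
    return fixed + name


def audio_body(audio_data: bytes, bit_depth: int, channels: int) -> bytes:
    """Derive NumSamples from the data so that s = NumSamples * (BitDepth/8) * Channels."""
    frame_bytes = (bit_depth // 8) * channels
    if len(audio_data) % frame_bytes:
        raise ValueError("audio data is not a whole number of samples")
    return struct.pack(">H", len(audio_data) // frame_bytes) + audio_data


def key_value_body(pairs: list[tuple[str, str]]) -> bytes:
    """Body shared by MetadataOh and MetatextOh: Count, then key-value pairs."""
    parts = [struct.pack(">B", len(pairs))]
    for key, value in pairs:
        k, v = key.encode("utf-8"), value.encode("utf-8")
        parts.append(struct.pack(">B", len(k)) + k + struct.pack(">H", len(v)) + v)
    return b"".join(parts)
```

A Halt or Disconnect is then just `pack_message(msg_type)` with an empty body. Because Length is a 2-byte field, a single Audio message can carry at most 65535 minus 13 bytes of sample data, so longer buffers need to be split across several messages.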