Date Issued: | 1st July 2024 |
Latest version: | http://www.imsglobal.org/activity/question/ |
IPR and Distribution Notices
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the specification set forth in this document, and to provide supporting documentation.
1EdTech takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on 1EdTech's procedures with respect to rights in IMS specifications can be found at the 1EdTech Intellectual Property Rights web page: https://www.1edtech.org/ipr/1edtechipr_policyFinal.pdf.
Org Name | Date Election Made | Necessary Claims | Type |
---|---|---|---|
DRC | 23rd May 2024 | No | RF RAND (Required & Optional Elements) |
HMH | 10th May 2024 | No | RF RAND (Required & Optional Elements) |
New Meridian | 29th May 2024 | No | RF RAND (Required & Optional Elements) |
Copyright © 2024 1EdTech Consortium. All Rights Reserved.
Use of this specification to develop products or services is governed by the license with 1EdTech found on the 1EdTech website: https://www.1edtech.org/speclicense.html.
Permission is granted to all parties to use excerpts from this document as needed in producing requests for proposals.
The limited permissions granted above are perpetual and will not be revoked by 1EdTech or its successors or assigns.
THIS SPECIFICATION IS BEING OFFERED WITHOUT ANY WARRANTY WHATSOEVER, AND IN PARTICULAR, ANY WARRANTY OF NONINFRINGEMENT IS EXPRESSLY DISCLAIMED. ANY USE OF THIS SPECIFICATION SHALL BE MADE ENTIRELY AT THE IMPLEMENTER'S OWN RISK, AND NEITHER THE CONSORTIUM, NOR ANY OF ITS MEMBERS OR SUBMITTERS, SHALL HAVE ANY LIABILITY WHATSOEVER TO ANY IMPLEMENTER OR THIRD PARTY FOR ANY DAMAGES OF ANY NATURE WHATSOEVER, DIRECTLY OR INDIRECTLY, ARISING FROM THE USE OF THIS SPECIFICATION.
Public contributions, comments and questions can be posted here: www.1edtech.org/forums/1edtech-public-forums-and-resources.
© 2024 1EdTech Consortium, Inc.
All Rights Reserved.
Trademark information: http://www.1edtech.org/copyright.html
Document Name: 1EdTech Support for Speech Synthesis Markup Language (SSML) Using the 'data-ssml' Property v1.0
Revision: 1st July 2024
In educational contexts, students utilizing Text-To-Speech synthesis (TTS) frequently face challenges with incorrect pronunciations and unclear presentation of textual content. To address this issue, this specification introduces a mechanism enabling content creators to precisely define pronunciations and manage specific TTS presentation aspects.
The 1EdTech Question and Test Interoperability (QTI) 3.0 specification describes a data model for the representation of question (qti-assessment-item) and test (qti-assessment-test) data and their corresponding results reports. The specification therefore enables the exchange of this item, test and results data between authoring tools, item banks, test construction tools, learning systems and assessment delivery systems. The 1EdTech QTI: Assessment, Section and Item (ASI) 3.0 specification is a part of the full QTI specification. The ASI documentation defines the data-ssml property, which is used to enable information to be passed to an SSML-based processor.
By using a single attribute (i.e. data-ssml), pronunciation and presentation guidance can be applied to textual content containers in both QTI 3.0 and HTML environments. This approach enables enhanced accessibility and improved learning experiences for students who rely upon TTS.
The expected and permitted format for the data passed in the data-ssml property is defined in this document. The permitted content is derived from the W3C Speech Synthesis Markup Language (SSML) Version 1.1 [SSML-11] specification. This document contains the abstract definition of the information permitted inside the data-ssml property. JSON is the expected technology binding for the information in the data-ssml property. Therefore, this document includes the definition of the JSON format and expresses this in machine-readable JSON Schema.
1. Introduction
2.2 Root Attribute Description
2.2.1 "data-ssml" Root Attribute Description
2.3.1 "DataSSML" Root Class Description
2.3.1.1 "break" Attribute Description
2.3.1.2 "phoneme" Attribute Description
2.3.1.3 "prosody" Attribute Description
2.3.1.4 "say-as" Attribute Description
2.3.1.5 "sub" Attribute Description
2.4.1 "Break" Class Description
2.4.1.1 "strength" Attribute Description
2.4.1.2 "time" Attribute Description
2.4.2 "Phoneme" Class Description
2.4.2.1 "ph" Attribute Description
2.4.2.2 "alphabet" Attribute Description
2.4.2.3 "type" Attribute Description
2.4.3 "Prosody" Class Description
2.4.3.1 "rate" Attribute Description
2.4.4 "SayAs" Class Description
2.4.4.1 "interpret-as" Attribute Description
2.4.5 "Sub" Class Description
2.4.5.1 "alias" Attribute Description
2.5.1 "RateUnion" Class Description
2.6 Derived Class Descriptions
2.6.1 "CSSTimeFormat" Class Description
2.6.2 "RateString" Class Description
2.7 Enumerated Vocabulary Descriptions
2.7.1 "AlphabetEnum" Vocabulary Description
2.7.2 "InterpretAsEnum" Vocabulary Description
2.7.3 "RateEnum" Vocabulary Description
3. JSON Binding
3.2.1 Service Parameter Payload Properties UML/JSON Mapping
3.2.2 Service Parameter Payload Class UML/JSON Mapping
3.2.3 Payload Classes UML/JSON Mapping
3.2.3.1 Break Payload Class Mapping
3.2.3.2 Phoneme Payload Class Mapping
3.2.3.3 Prosody Payload Class Mapping
3.2.3.4 SayAs Payload Class Mapping
3.2.3.5 Sub Payload Class Mapping
3.2.4 Enumerated Class UML/JSON Mapping
3.2.5 Enumerated List Class UML/JSON Mapping
3.2.6 Primitive Type UML/JSON Mapping
3.3 JSON Payloads
3.3.1 "DataSSML" Class Payload
3.4 Description of the JSON Schema
3.4.1 General Information
3.4.2 Tags Information
3.4.3 Security Information
3.4.4 Paths Information
3.4.5 Definitions Information
3.4.5.1 "BreakDType" Definition
3.4.5.2 "DataSSMLDType" Definition
3.4.5.3 "PhonemeDType" Definition
3.4.5.4 "ProsodyDType" Definition
3.4.5.5 "SayAsDType" Definition
3.4.5.6 "SubDType" Definition
4.1.1 Creating Pauses in Spoken Presentations
4.1.2 Ensuring Text is Presented Accurately
4.1.3 Examples
5. Conformance and Certification
Appendix A Modeling Terms and Concepts
A1.1 Data Model Diagrams
A1.2 Class Descriptions
A1.3 Attribute and Characteristic Descriptions
A1.4 Enumerated Vocabulary Descriptions
A1.5 External Vocabulary Descriptions
A1.6 Import Class Descriptions
Appendix B Binding Terminology
B1.1 Service Parameter Payload Properties UML/JSON Mapping Table Definition
B1.2 UML/JSON Payload Class Mapping Table Definition
B1.3 UML/JSON Enumerated and Enumerated List Class Mapping Table Definition
B1.4 UML/JSON Primitive Types Mapping Table Definition
B2 OpenAPI Descriptions Explanations
B2.1a OpenAPI(2) General Information Table Explanation
B2.1b OpenAPI(3) General Information Table Explanation
B2.2 OpenAPI Tags Table Explanation
B2.3 OpenAPI Security Table Explanation
Appendix C JSON Schema Listings
Figure 2.3.1 DataSSML class definitions
Figure 2.4.1 Break class definitions
Figure 2.4.2 Phoneme class definitions
Figure 2.4.3 Prosody class definitions
Figure 2.4.4 SayAs class definitions
Figure 2.4.5 Sub class definitions
Figure 2.5.1 RateUnion class definitions
Figure 2.6.1 CSSTimeFormat class definitions
Figure 2.6.2 RateString class definitions
Figure 2.7.1 AlphabetEnum class definitions
Figure 2.7.2 InterpretAsEnum class definitions
Figure 2.7.3 RateEnum class definitions
Figure 2.7.4 StrengthEnum class definitions
Figure 2.7.5 TypeEnum class definitions
Figure 3.4.5.1 - OpenAPI JSON Schema description for the "BreakDType" Complex Type.
Figure 3.4.5.2 - OpenAPI JSON Schema description for the "PhonemeDType" Complex Type.
Figure 3.4.5.3 - OpenAPI JSON Schema description for the "ProsodyDType" Complex Type.
Figure 3.4.5.4 - OpenAPI JSON Schema description for the "SayAsDType" Complex Type.
Figure 3.4.5.5 - OpenAPI JSON Schema description for the "SubDType" Complex Type.
Table 2.2.1 "data-ssml" root attribute description
Table 2.3.1 DataSSML class definitions
Table 2.3.1.1 Description of the "break" attribute for the "DataSSML" class
Table 2.3.1.2 Description of the "phoneme" attribute for the "DataSSML" class
Table 2.3.1.3 Description of the "prosody" attribute for the "DataSSML" class
Table 2.3.1.4 Description of the "say-as" attribute for the "DataSSML" class
Table 2.3.1.5 Description of the "sub" attribute for the "DataSSML" class
Table 2.4.1 Break class definitions
Table 2.4.1.1 Description of the "strength" attribute for the "Break" class
Table 2.4.1.2 Description of the "time" attribute for the "Break" class
Table 2.4.2 Phoneme class definitions
Table 2.4.2.1 Description of the "ph" attribute for the "Phoneme" class
Table 2.4.2.2 Description of the "alphabet" attribute for the "Phoneme" class
Table 2.4.2.3 Description of the "type" attribute for the "Phoneme" class
Table 2.4.3 Prosody class definitions
Table 2.4.3.1 Description of the "rate" attribute for the "Prosody" class
Table 2.4.4 SayAs class definitions
Table 2.4.4.1 Description of the "interpret-as" attribute for the "SayAs" class
Table 2.4.5 Sub class definitions
Table 2.4.5.1 Description of the "alias" attribute for the "Sub" class
Table 2.5.1 RateUnion class description
Table 2.6.1 CSSTimeFormat class definitions
Table 2.6.2 RateString class definitions
Table 2.7.1 AlphabetEnum class definitions
Table 2.7.2 InterpretAsEnum class definitions
Table 2.7.3 RateEnum class definitions
Table 2.7.4 StrengthEnum class definitions
Table 2.7.5 TypeEnum class definitions
Table 3.2.3.1 - Payload UML/JSON Mapping for the "Break" Class
Table 3.2.3.2 - Payload UML/JSON Mapping for the "Phoneme" Class
Table 3.2.3.3 - Payload UML/JSON Mapping for the "Prosody" Class
Table 3.2.3.4 - Payload UML/JSON Mapping for the "SayAs" Class
Table 3.2.3.5 - Payload UML/JSON Mapping for the "Sub" Class
Table 3.2.4 - UML/JSON Mapping for the Enumerated Class Definitions
Table 3.2.6 - UML/JSON Mapping for the Primitive Type Definitions
Table 3.3.1.1 - Tabular representation of the JSON payload for the "DataSSML" class.
Table 3.4.5.1 - OpenAPI JSON Schema description for the "BreakDType" Complex Type.
Table 3.4.5.2 - OpenAPI JSON Schema description for the "PhonemeDType" Complex Type.
Table 3.4.5.3 - OpenAPI JSON Schema description for the "ProsodyDType" Complex Type.
Table 3.4.5.4 - OpenAPI JSON Schema description for the "SayAsDType" Complex Type.
Table 3.4.5.5 - OpenAPI JSON Schema description for the "SubDType" Complex Type.
Table A1.1 The key to the descriptions of data model diagrams
Table A1.2 The key to the descriptions of the data class tables
Table A1.3 The key to the descriptions of the data attribute/characteristic tables
Table A1.4 The key to the descriptions of the enumerated vocabulary tables
Table A1.5 The key to the descriptions of the external vocabulary tables
Table A1.6 The key to the descriptions of the import class tables
Table A1.7 The key to the descriptions of the link data tables
Table A1.8 The key to the descriptions of the privacy class tables
Table A1.9 The key to the descriptions of the common data model persistent identifier tables
Table B2.1a The key to the tabular description of the OpenAPI(2) general information
Table B2.1b The key to the tabular description of the OpenAPI(3) general information
Table B2.2 The key to the tabular description of the OpenAPI tags information
Table B2.3 The key to the tabular description of the OpenAPI security information.
Table B2.4 The key to the tabular description of the OpenAPI paths information for an HTTP Verb
Table B2.5 The key to the tabular description of the OpenAPI definitions information
Code 3.3.1.1 - JSON payload example for the "DataSSML" Class
A variety of methods have been used by vendors of assessment systems/tools/apps to provide pronunciation or presentational hints to assistive technologies which render text using text to speech synthesis (such as screen readers and read aloud tools). These methods include the misuse of the W3C WAI-ARIA [WAI-ARIA] standard, repurposing the 'alt' attribute of the 'img' element in HTML to provide alternate pronunciations, insertion of multiple commas to create pauses of sufficient duration, and the creation of proprietary custom attributes. As a result, solutions were often proprietary and crafted for specific platforms and speech synthesizers.
While the W3C Speech Synthesis Markup Language (SSML) provides interoperability across many speech synthesizers, there was no method for incorporating SSML into HTML. This is a key requirement for Question and Test Interoperability (QTI), which is commonly rendered (transformed) into HTML for web delivery.
This specification defines a data-attribute (data-ssml) that utilizes JSON to encapsulate SSML functions, attributes, and values in a manner that allows for easy consumption by assistive technologies. The data-ssml attribute can be applied to text container elements in QTI and HTML to allow content authors to control the pronunciation and presentation of content by text to speech synthesizers.
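As a non-normative illustration of how an authoring tool might attach the attribute, the following Python sketch serialises a set of SSML functions to JSON and escapes the result for use as an HTML attribute value. The helper name `span_with_data_ssml` and the choice of a `span` container are assumptions made for this example only; this specification does not define them.

```python
import html
import json


def span_with_data_ssml(text, functions):
    """Wrap text in a span carrying a data-ssml attribute (hypothetical helper)."""
    # The attribute value is a JSON string; escape it so the quotes survive in HTML.
    attribute_value = html.escape(json.dumps(functions), quote=True)
    return f'<span data-ssml="{attribute_value}">{html.escape(text)}</span>'


markup = span_with_data_ssml("W3C", {"sub": {"alias": "World Wide Web Consortium"}})
print(markup)
```

An HTML parser unescapes the attribute value when the page is read, so a consumer querying the attribute receives the original JSON string.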
All sections marked as non-normative, all authoring guidelines, diagrams (with the exception of the UML diagrams), examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119]. This means that from the perspective of conformance:
The Conformance and Certification Guide for this specification may introduce greater normative constraints than those defined here for specific service or implementation categories.
The SHOULD/SHOULD NOT/RECOMMENDED statements MUST NOT be used in any document, or section of a document, that is responsible for defining the information model and/or the associated bindings and/or conformance and certification.
The structure of the rest of this document is:
2. Information Model | The abstract description of the data model using the 1EdTech Model Driven Specification approach. It is from this model that the concrete binding is derived. |
3. JSON Binding | The description of how the information model is implemented using JSON as the concrete data exchange format. |
4. Best Practices | The set of recommended best practices when using the specification. This information adds to the corresponding best practices for the QTI 3.0 specification [QTI-IMPL-30]. |
5. Conformance and Certification | The process for demonstrating conformance and obtaining certification with respect to this specification. This is in the context of the corresponding conformance and certification for the QTI 3.0 specification [QTI-CERT-30]. |
References | The set of cited documents, normative and informative, that are used to support the technical details in this document. |
Appendix A Modeling Terms and Concepts | An overview of the model driven approach, the concepts and the terms used by 1EdTech to create the data model representations (based upon a profile of UML), the corresponding set of bindings and the accompanying documentation (including this information model). |
Appendix B Binding Terminology | An overview of the model driven approach, the concepts and the terms used by 1EdTech to create the service model REST/JSON binding definitions and the accompanying documentation (including this binding). |
Appendix C JSON Schema Listings | The listing of the JSON Schema that SHOULD be used to validate the content of the data-ssml property. It is this artifact that is used by 1EdTech to support the formal certification process. |
ARIA | Accessible Rich Internet Applications |
ASI | Assessment, Section and Item |
IPA | International Phonetic Alphabet |
ISO | International Organization for Standardization |
JSON | JavaScript Object Notation |
QTI | Question and Test Interoperability |
RFC | Request for Comments |
SSML | Speech Synthesis Markup Language |
TTS | Text-To-Speech |
UML | Unified Modeling Language |
URI | Uniform Resource Identifier |
VDEX | Vocabulary Description and Exchange |
W3C | World Wide Web Consortium |
WAI | Web Accessibility Initiative |
The data-ssml attribute can be applied to QTI and HTML elements containing textual content. The data contained is a JSON structure which contains the SSML function (tag), for example, say-as, and any required property-value pairs needed to fully specify the function. Assistive technologies are expected to query the data-ssml attribute of an element and use the JSON string to generate the appropriate SSML markup corresponding to the function, properties and values specified by the author. The resultant SSML is then sent to the SSML conformant text to speech synthesizer by the assistive technology.
A subset of SSML functions was selected for inclusion in this specification. This subset was chosen so that all of the following criteria would be met for each selected function:
The subset is:
break - controls the pausing or other prosodic boundaries between tokens e.g. between two words in a sentence
phoneme - provides a phonemic/phonetic pronunciation for the contained text
prosody - permits control of the speaking rate
say-as - allows the author to indicate information on the type of text construct contained within the element and to help specify the level of detail for rendering the contained text
sub - employed to indicate that the text in the alias attribute value replaces the contained text for pronunciation.
Some members of the subset, for example, prosody, are limited in functionality; in this specific case, to speech rate. Other prosody features are not consistently implemented across speech synthesizers.
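The processing flow described in this section (read the attribute, parse the JSON, emit SSML) can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: the function name `data_ssml_to_ssml` is hypothetical, the nesting order used when several functions are present is arbitrary, and placing a `break` before the text is a choice made for this sketch rather than something this document specifies.

```python
import json
from xml.sax.saxutils import escape, quoteattr

# Functions from the subset that wrap the contained text.
WRAPPING_FUNCTIONS = ("say-as", "sub", "phoneme", "prosody")


def _attributes(properties):
    """Serialise a property/value mapping as XML attributes."""
    return "".join(
        f" {name}={quoteattr(str(value))}" for name, value in properties.items()
    )


def data_ssml_to_ssml(attribute_value, text):
    """Turn a data-ssml JSON string plus the element text into SSML markup."""
    functions = json.loads(attribute_value)
    if not isinstance(functions, dict):
        raise ValueError("data-ssml must hold a JSON object")
    ssml = escape(text)
    for name in WRAPPING_FUNCTIONS:
        if name in functions:
            ssml = f"<{name}{_attributes(functions[name])}>{ssml}</{name}>"
    if "break" in functions:
        # Assumption of this sketch: the pause is emitted before the text.
        ssml = f"<break{_attributes(functions['break'])}/>" + ssml
    return ssml


print(data_ssml_to_ssml('{"say-as": {"interpret-as": "characters"}}', "QTI"))
```

The resulting fragment would still need to be placed inside a complete SSML document before being passed to a synthesizer.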
This Section is NORMATIVE.
All of the Root attributes (the root name for the instances that can be exchanged) used within this Information Model are described in this Section. The syntax and semantics for this representation are described in Appendix A1.3. The root attributes are:
data-ssml - the property that is the container for the equivalent JSON value that is to be passed to an SSML processor. This is the equivalent data-ssml property defined to provide the mapping context to the QTI ASI 3.0 specification.
The definition of the "data-ssml" root attribute is shown in Table 2.2.1.
Descriptor | Definition |
---|---|
Attribute Name | data-ssml |
Data Type | DataSSML |
Description | In the 1EdTech Question and Test Interoperability (QTI): Assessment, Section and Item (ASI) 3.0 specification the data-ssml property is the container for the equivalent JSON value that is to be passed to an SSML processor. This is the equivalent data-ssml property defined to provide the mapping context to the QTI ASI 3.0 specification. |
This Section is NORMATIVE.
All of the Root data classes (the first class objects that can be exchanged using the data model) used within this Information Model are described in this Section. The syntax and semantics for this representation are described in Appendix A1.2.
The data model for the "DataSSML" root class is shown in Figure 2.3.1 and the accompanying definition in Table 2.3.1.
Figure 2.3.1 - DataSSML class definitions.
The description of the "break" attribute for the "DataSSML" root class is given in Table 2.3.1.1.
Descriptor | Definition |
---|---|
Attribute Name | break |
Data Type | Break |
Value Space | Container [ Unordered ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | The break element controls the pausing or other prosodic boundaries between tokens e.g. the space between two words in a sentence. |
The description of the "phoneme" attribute for the "DataSSML" root class is given in Table 2.3.1.2.
Descriptor | Definition |
---|---|
Attribute Name | phoneme |
Data Type | Phoneme |
Value Space | Container [ Unordered ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | The phoneme element provides a phonemic/phonetic pronunciation for the contained text. |
The description of the "prosody" attribute for the "DataSSML" root class is given in Table 2.3.1.3.
Descriptor | Definition |
---|---|
Attribute Name | prosody |
Data Type | Prosody |
Value Space | Container [ Unordered ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | The prosody element permits control of the pitch, speaking rate and volume of the speech output. |
The description of the "say-as" attribute for the "DataSSML" root class is given in Table 2.3.1.4.
Descriptor | Definition |
---|---|
Attribute Name | say-as |
Data Type | SayAs |
Value Space | Container [ Unordered ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | The say-as element allows the author to indicate information on the type of text construct contained within the element and to help specify the level of detail for rendering the contained text. |
The description of the "sub" attribute for the "DataSSML" root class is given in Table 2.3.1.5.
Descriptor | Definition |
---|---|
Attribute Name | sub |
Data Type | Sub |
Value Space | Container [ Unordered ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | The sub element is employed to indicate that the text in the alias attribute value replaces the contained text for pronunciation. |
This Section is NORMATIVE.
All of the data classes used within this Information Model are described in this Section. The syntax and semantics for this representation are described in Appendix A1.2.
The data model for the "Break" class is shown in Figure 2.4.1 and the accompanying definition in Table 2.4.1.
Figure 2.4.1 - Break class definitions.
The description of the "strength" attribute for the "Break" class is given in Table 2.4.1.1.
Descriptor | Definition |
---|---|
Attribute Name | strength |
Data Type | StrengthEnum |
Value Space | Enumerated value set of: { none | x-weak | weak | medium | strong | x-strong } Default = "medium". |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | This attribute is used to indicate the strength of the prosodic break in the speech output. The value 'none' indicates that no prosodic break boundary should be outputted, which can be used to prevent a prosodic break which the processor would otherwise produce. The other values indicate monotonically non-decreasing (conceptually increasing) break strength between tokens. The stronger boundaries are typically accompanied by pauses. 'x-weak' and 'x-strong' are mnemonics for 'extra weak' and 'extra strong', respectively. |
The description of the "time" attribute for the "Break" class is given in Table 2.4.1.2.
Descriptor | Definition |
---|---|
Attribute Name | time |
Data Type | CSSTimeFormat |
Value Space | Container [ DerivedType ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | This attribute is used to indicate the duration of a pause to be inserted in the output in seconds or milliseconds. It follows the time value format from the Cascading Style Sheets Level 2 Recommendation [CSS-21] e.g. '250ms', '3s'. |
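As a non-normative illustration of the time value format, the following Python sketch converts values such as '250ms' and '3s' to milliseconds. The regular expression is an assumption based on the examples given in the table above (a non-negative number followed by 'ms' or 's'); the pattern in the JSON Schema of Appendix C remains the authority, and the function name `css_time_to_ms` is hypothetical.

```python
import re

# Assumed pattern: a non-negative number followed by "ms" or "s".
_CSS_TIME = re.compile(r"^(\d+(?:\.\d+)?|\.\d+)(ms|s)$")


def css_time_to_ms(value):
    """Return the duration in milliseconds for a value such as '250ms' or '3s'."""
    match = _CSS_TIME.match(value)
    if match is None:
        raise ValueError(f"not a CSS time value: {value!r}")
    number, unit = match.groups()
    return float(number) * (1000 if unit == "s" else 1)


print(css_time_to_ms("250ms"))  # 250.0
print(css_time_to_ms("3s"))     # 3000.0
```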
The data model for the "Phoneme" class is shown in Figure 2.4.2 and the accompanying definition in Table 2.4.2.
Figure 2.4.2 - Phoneme class definitions.
The description of the "ph" attribute for the "Phoneme" class is given in Table 2.4.2.1.
Descriptor | Definition |
---|---|
Attribute Name | ph |
Data Type | String (Primitive-type) |
Value Space | See Appendix A1.3. |
Scope | Local ("-") |
Multiplicity | [1] |
Privacy | There are NO privacy implications. |
Description | This is the phoneme/phone string to be used. It is designed strictly for phonemic and phonetic notations and is intended to be used to provide pronunciations for words or very short phrases. The phonemic/phonetic string does not undergo text normalization and is not treated as a token for lookup in the lexicon. |
The description of the "alphabet" attribute for the "Phoneme" class is given in Table 2.4.2.2.
Descriptor | Definition |
---|---|
Attribute Name | alphabet |
Data Type | AlphabetEnum |
Value Space | Enumerated value set of: { ipa | x-sampa } |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | Specifies the phonemic/phonetic pronunciation alphabet. A pronunciation alphabet in this context refers to a collection of symbols to represent the sounds of one or more human languages. |
The description of the "type" attribute for the "Phoneme" class is given in Table 2.4.2.3.
Descriptor | Definition |
---|---|
Attribute Name | type |
Data Type | TypeEnum |
Value Space | Enumerated value set of: { default | ruby } Default = "default". |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | Indicates additional information about how the pronunciation information is to be interpreted. |
The data model for the "Prosody" class is shown in Figure 2.4.3 and the accompanying definition in Table 2.4.3.
Figure 2.4.3 - Prosody class definitions.
The description of the "rate" attribute for the "Prosody" class is given in Table 2.4.3.1.
Descriptor | Definition |
---|---|
Attribute Name | rate |
Data Type | RateUnion |
Value Space | Container [ Union ] |
Scope | Local ("-") |
Multiplicity | [0..1] |
Privacy | There are NO privacy implications. |
Description | A change in the speaking rate for the contained text. Legal values are: a non-negative percentage or an enumerated vocabulary. Labels 'x-slow' through 'x-fast' represent a sequence of monotonically non-decreasing speaking rates. When the value is a non-negative percentage it acts as a multiplier of the default rate. |
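The two forms of the rate value can be told apart as in the following non-normative Python sketch. The label set shown is an assumption taken from the prosody rate attribute in SSML 1.1 and should be checked against Table 2.7.3; the function name `classify_rate` and the percentage pattern are likewise hypothetical.

```python
import re

# Assumption: labels as listed for the prosody rate attribute in SSML 1.1.
RATE_LABELS = ("x-slow", "slow", "medium", "fast", "x-fast", "default")

# Assumed pattern for a non-negative percentage such as "150%".
_PERCENTAGE = re.compile(r"^(\d+(?:\.\d+)?)%$")


def classify_rate(value):
    """Return ('label', name) or ('multiplier', factor) for a rate value."""
    if value in RATE_LABELS:
        return ("label", value)
    match = _PERCENTAGE.match(value)
    if match is not None:
        # A non-negative percentage acts as a multiplier of the default rate.
        return ("multiplier", float(match.group(1)) / 100)
    raise ValueError(f"not a permitted rate value: {value!r}")


print(classify_rate("150%"))  # ('multiplier', 1.5)
print(classify_rate("slow"))  # ('label', 'slow')
```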
The data model for the "SayAs" class is shown in Figure 2.4.4 and the accompanying definition in Table 2.4.4.
Figure 2.4.4 - SayAs class definitions.
The description of the "interpret-as" attribute for the "SayAs" class is given in Table 2.4.4.1.
Descriptor | Definition |
---|---|
Attribute Name | interpret-as |
Data Type | InterpretAsEnum |
Value Space | Enumerated value set of: { date | time | telephone | characters | cardinal | ordinal } |
Scope | Local ("-") |
Multiplicity | [1] |
Privacy | There are NO privacy implications. |
Description | Indicates the content type of the contained text construct. Specifying the content type helps the synthesis processor to distinguish and interpret text constructs that may be rendered in different ways depending on what type of information is intended. |
The data model for the "Sub" class is shown in Figure 2.4.5 and the accompanying definition in Table 2.4.5.
Figure 2.4.5 - Sub class definitions.
The description of the "alias" attribute for the "Sub" class is given in Table 2.4.5.1.
Descriptor | Definition |
---|---|
Attribute Name | alias |
Data Type | String (Primitive-type) |
Value Space | See Appendix A1.3. |
Scope | Local ("-") |
Multiplicity | [1] |
Privacy | There are NO privacy implications. |
Description | Specifies the string to be spoken instead of the enclosed string. The processor should apply text normalization to the alias value. |
The set of union classes used within this Information Model is described in this Section. The syntax and semantics for this representation are described in Appendix A1.2.
The data model for the "RateUnion" class is shown in Figure 2.5.1 and the accompanying definition in Table 2.5.1.
Figure 2.5.1 - RateUnion class definitions.
This Section is NORMATIVE.
All of the derived data classes used within this Information Model are described in this Section. The syntax and semantics for this representation are described in Appendix A1.2.
The data model for the "CSSTimeFormat" class is shown in Figure 2.6.1 and the accompanying definition in Table 2.6.1.
Figure 2.6.1 - CSSTimeFormat class definitions.
Descriptor | Definition |
---|---|
Class Name | CSSTimeFormat |
Class Type | Container [ DerivedType ] |
Parents | The set of parent classes are: |
Derived Classes | There are no derived classes. |
Super Classes | The set of classes from which this class is derived: |
Characteristics | There are no characteristics. |
Children | There are no children. |
Description | The data-type for the time value format from [CSS-21], e.g. '250ms', '3s', etc. |
The data model for the "RateString" class is shown in Figure 2.6.2 and the accompanying definition in Table 2.6.2.
Figure 2.6.2 - RateString class definitions.
This Section is NORMATIVE.
All of the enumerated vocabularies used within this Information Model are described in this Section. The syntax and semantics for this representation are described in Appendix A1.4.
This is the set of permitted values for the 'alphabet' property. The data model for the "AlphabetEnum" enumerated class is shown in Figure 2.7.1 and the accompanying vocabulary definition in Table 2.7.1.
Figure 2.7.1 - AlphabetEnum class definitions.
The data model for the "InterpretAsEnum" enumerated class is shown in Figure 2.7.2 and the accompanying vocabulary definition in Table 2.7.2.
Figure 2.7.2 - InterpretAsEnum class definitions.
The data model for the "RateEnum" enumerated class is shown in Figure 2.7.3 and the accompanying vocabulary definition in Table 2.7.3.
Figure 2.7.3 - RateEnum class definitions.
This is the set of permitted values for the 'strength' attribute in the 'Break' class. The data model for the "StrengthEnum" enumerated class is shown in Figure 2.7.4 and the accompanying vocabulary definition in Table 2.7.4.
Figure 2.7.4 - StrengthEnum class definitions.
This is the set of permitted values for how the pronunciation of the phoneme is to be interpreted. The data model for the "TypeEnum" enumerated class is shown in Figure 2.7.5 and the accompanying vocabulary definition in Table 2.7.5.
Figure 2.7.5 - TypeEnum class definitions.
The value of the data-ssml attribute is a JSON string containing one or more JSON objects, each representing a specific SSML function with one or more property/value pairs. The content for the data-ssml property MUST be valid as defined by the corresponding JSON Schema listed in Appendix C.
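As a non-normative illustration of the kind of checks involved, the following Python sketch tests a data-ssml value against the required properties and enumerated values stated in Section 2 of this document. It is not a substitute for validation against the JSON Schema in Appendix C, which may impose constraints not reproduced here; the function name `check_data_ssml` and the wording of the messages are hypothetical.

```python
import json

# Enumerations and required properties as stated in Section 2 of this document.
ENUMS = {
    ("break", "strength"): {"none", "x-weak", "weak", "medium", "strong", "x-strong"},
    ("phoneme", "alphabet"): {"ipa", "x-sampa"},
    ("phoneme", "type"): {"default", "ruby"},
    ("say-as", "interpret-as"): {
        "date", "time", "telephone", "characters", "cardinal", "ordinal",
    },
}
REQUIRED = {"phoneme": ["ph"], "say-as": ["interpret-as"], "sub": ["alias"]}


def check_data_ssml(attribute_value):
    """Return a list of problems found in a data-ssml JSON string."""
    try:
        data = json.loads(attribute_value)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for function, properties in data.items():
        if not isinstance(properties, dict):
            problems.append(f"{function}: value must be a JSON object")
            continue
        for name in REQUIRED.get(function, []):
            if name not in properties:
                problems.append(f"{function}: missing required property '{name}'")
        for name, value in properties.items():
            allowed = ENUMS.get((function, name))
            if allowed is not None and (
                not isinstance(value, str) or value not in allowed
            ):
                problems.append(f"{function}.{name}: '{value}' is not a permitted value")
    return problems


print(check_data_ssml('{"say-as": {"interpret-as": "characters"}}'))  # []
```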
This Section is NOT NORMATIVE.
This is a data model ONLY definition, as part of the common data model, and so there are NO Service Parameter definitions.
This is a data model ONLY definition, as part of the common data model, and so there are NO Service Parameter class definitions.
The syntax and semantics for the Data Class UML/JSON mapping representations are described in Appendix B1.2.
The Payload UML/JSON Mapping for the "Break" Class is given in Table 3.2.3.1.
Name (Information Model) | UML Artefact | Data Type | Multiplicity | Name (JSON Binding) | Type (JSON Binding) |
---|---|---|---|---|---|
Break | Core | Container [ Unordered ] | - N/A - | BreakDType | Object |
strength | Attribute | [ Enumeration (StrengthEnum) ] | [0..1] | strength | Property |
time | Attribute | DT: CSSTimeFormat (PT: NormalizedString) | [0..1] | time | Property |
The Payload UML/JSON Mapping for the "DataSSML" Class is given in Table 3.2.3.2.
Name (Information Model) | UML Artefact | Data Type | Multiplicity | Name (JSON Binding) | Type (JSON Binding) |
---|---|---|---|---|---|
DataSSML | Core | Container [ Unordered ] | - N/A - | DataSSMLDType | Object |
break | Attribute | Break | [0..1] | break | Property |
phoneme | Attribute | Phoneme | [0..1] | phoneme | Property |
prosody | Attribute | Prosody | [0..1] | prosody | Property |
say-as | Attribute | SayAs | [0..1] | say-as | Property |
sub | Attribute | Sub | [0..1] | sub | Property |
The Payload UML/JSON Mapping for the "Phoneme" Class is given in Table 3.2.3.3.
Information Model Details | JSON Binding Details | ||||
---|---|---|---|---|---|
Name | UML Artefact | Data Type | Multiplicity | Name | Type |
Phoneme | Core | Container [ Unordered ] | - N/A - | PhonemeDType | Object |
| | Attribute | PT: String | [1] | ph | Property |
| | Attribute | [ Enumeration (AlphabetEnum) ] | [0..1] | alphabet | Property |
| | Attribute | [ Enumeration (TypeEnum) ] | [0..1] | type | Property |
The Payload UML/JSON Mapping for the "Prosody" Class is given in Table 3.2.3.4.
Information Model Details | JSON Binding Details | ||||
---|---|---|---|---|---|
Name | UML Artefact | Data Type | Multiplicity | Name | Type |
Prosody | Core | Container [ Unordered ] | - N/A - | ProsodyDType | Object |
| | Attribute | [ Union (RateUnion) ] | [0..1] | rate | Property |
The Payload UML/JSON Mapping for the "SayAs" Class is given in Table 3.2.3.5.
Information Model Details | JSON Binding Details | ||||
---|---|---|---|---|---|
Name | UML Artefact | Data Type | Multiplicity | Name | Type |
SayAs | Core | Container [ Unordered ] | - N/A - | SayAsDType | Object |
| | Attribute | [ Enumeration (InterpretAsEnum) ] | [1] | interpret-as | Property |
The Payload UML/JSON Mapping for the "Sub" Class is given in Table 3.2.3.6.
Information Model Details | JSON Binding Details | ||||
---|---|---|---|---|---|
Name | UML Artefact | Data Type | Multiplicity | Name | Type |
Sub | Core | Container [ Unordered ] | - N/A - | SubDType | Object |
| | Attribute | PT: String | [1] | alias | Property |
The definition of the set of enumerated data-types used in this specification is given in Table 3.2.4. The syntax and semantics for the Enumerated Class UML/JSON mapping representations is described in Appendix B1.3.
Enumeration Class Name | Description |
---|---|
AlphabetEnum | Enumerated value set of: { ipa | x-sampa }. |
InterpretAsEnum | Enumerated value set of: { date | time | telephone | characters | cardinal | ordinal }. |
RateEnum | Enumerated value set of: { x-slow | slow | medium | fast | x-fast | default }. |
StrengthEnum | Enumerated value set of: { none | x-weak | weak | medium | strong | x-strong }. |
TypeEnum | Enumerated value set of: { default | ruby }. |
There are no enumerated list class definitions.
The definition of the set of union data-types used in this specification is given in Table 3.2.6. The syntax and semantics for the Union Class UML/JSON mapping representations is described in Appendix B1.4.
Union Class Name | Description |
---|---|
RateUnion | This is a value from one of the set of data-types: RateEnum, RateString |
The definition of the set of primitive data-types used in this specification is given in Table 3.2.7. The syntax and semantics for the Primitive Type UML/JSON mapping representations is described in Appendix B1.4.
Primitive Type Name | Description |
---|---|
NormalizedString | This is mapped to the JSON "string" data-type. |
String | This is mapped to the JSON "string" data-type. |
This Section is NOT NORMATIVE.
This is a data model definition ONLY and is a part of the common data model. The payloads defined herein are, in general, partial-only i.e. each payload will be a combination of several common data model components.
A tabular description of the class partial payload is shown in the Table below.
Property Name | Multiplicity | JSON Data-type | Description |
---|---|---|---|
break | [0..1] | Object | The break element controls the pausing or other prosodic boundaries between tokens e.g. the space between two words in a sentence. |
strength | [0..1] | Enumeration | This attribute is used to indicate the strength of the prosodic break in the speech output. The value 'none' indicates that no prosodic break boundary should be outputted, which can be used to prevent a prosodic break which the processor would otherwise produce. The other values indicate monotonically non-decreasing (conceptually increasing) break strength between tokens. The stronger boundaries are typically accompanied by pauses. 'x-weak' and 'x-strong' are mnemonics for 'extra weak' and 'extra strong', respectively. |
time | [0..1] | String | This attribute is used to indicate the duration of a pause to be inserted in the output in seconds or milliseconds. It follows the time value format from the Cascading Style Sheets Level 2 Recommendation [CSS-21] e.g. '250ms', '3s'. |
phoneme | [0..1] | Object | The phoneme element provides a phonemic/phonetic pronunciation for the contained text. |
ph | [1..1] | String | This is the phoneme/phone string to be used. It is designed strictly for phonemic and phonetic notations and is intended to be used to provide pronunciations for words or very short phrases. The phonemic/phonetic string does not undergo text normalization and is not treated as a token for lookup in the lexicon. |
alphabet | [0..1] | Enumeration | Specifies the phonemic/phonetic pronunciation alphabet. A pronunciation alphabet in this context refers to a collection of symbols to represent the sounds of one or more human languages. |
type | [0..1] | Enumeration | Indicates additional information about how the pronunciation information is to be interpreted. |
prosody | [0..1] | Object | The prosody element permits control of the pitch, speaking rate and volume of the speech output. |
rate | [0..1] | Union(RateUnion) | A change in the speaking rate for the contained text. Legal values are: a non-negative percentage or an enumerated vocabulary. Labels 'x-slow' through 'x-fast' represent a sequence of monotonically non-decreasing speaking rates. When the value is a non-negative percentage it acts as a multiplier of the default rate. |
say-as | [0..1] | Object | The say-as element allows the author to indicate information on the type of text construct contained within the element and to help specify the level of detail for rendering the contained text. |
interpret-as | [1..1] | Enumeration | Indicates the content type of the contained text construct. Specifying the content type helps the synthesis processor to distinguish and interpret text constructs that may be rendered in different ways depending on what type of information is intended. |
sub | [0..1] | Object | The sub element is employed to indicate that the text in the alias attribute value replaces the contained text for pronunciation. |
alias | [1..1] | String | Specifies the string to be spoken instead of the enclosed string. The processor should apply text normalization to the alias value. |
An example of the class partial payload is shown in the code block below.
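As an illustration only, the following Python sketch builds one partial payload from the property names in the table above and serializes it to JSON. The values are example choices permitted by the enumerations in Table 3.2.4; they are assumptions for illustration, not normative content, and a real payload would normally carry only the functions it needs.

```python
import json

# Illustrative DataSSML partial payload. Property names come from the table
# above; every value here is an example choice, not normative content.
payload = {
    "break": {"strength": "medium", "time": "250ms"},
    "phoneme": {"ph": "ˈrɛdɪŋ", "alphabet": "ipa", "type": "default"},
    "prosody": {"rate": "slow"},
    "say-as": {"interpret-as": "characters"},
    "sub": {"alias": "Speech Synthesis Markup Language"},
}

# ensure_ascii=False keeps the IPA characters readable in the JSON text.
serialized = json.dumps(payload, ensure_ascii=False, indent=2)
print(serialized)
```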
This Section is NORMATIVE.
This data model is a part of the common data model and so there is NO General Information defined in the OpenAPI description.
This data model is a part of the common data model and so there are NO Tags defined in the OpenAPI description.
This data model is a part of the common data model and so there is NO Security defined in the OpenAPI description.
This data model is a part of the common data model and so there are NO Paths defined in the OpenAPI description.
The following Tables describe the OpenAPI information for each of the JSON Schema Definitions. The syntax and semantics for these JSON Schema Definition descriptions are described in Appendix B2.5.
The OpenAPI JSON Schema description for the "BreakDType" Complex Type is given in Table 3.4.5.1.
The OpenAPI JSON Schema description for the "DataSSMLDType" Complex Type is given in Table 3.4.5.2.
The OpenAPI JSON Schema description for the "PhonemeDType" Complex Type is given in Table 3.4.5.3.
The OpenAPI JSON Schema description for the "ProsodyDType" Complex Type is given in Table 3.4.5.4.
The OpenAPI JSON Schema description for the "SayAsDType" Complex Type is given in Table 3.4.5.5.
The OpenAPI JSON Schema description for the "SubDType" Complex Type is given in Table 3.4.5.6.
The use of data-ssml to add pronunciation and presentation control to test item content and learning materials should follow best practices appropriate to the context of use. Best practices should be evidence-based and aligned with relevant standards and guidelines that address correct pronunciation of a word or phrase, presentation of acronyms or numeric values, or pacing of spoken presentation. These may be determined by the specific needs of an assessment program or by the spoken presentation requirements of specific subject matter (e.g., chemistry or mathematics). Pronunciation is also referenced as a W3C WCAG 2.2 [W3C-WCAG-22] success criterion (3.1.6 Pronunciation, Level AAA), and may be specifically required for accessibility conformance by some organizations.
The data-ssml attribute is recommended only for elements containing text, ideally words, numbers or short phrases. The data-ssml attribute is not usable in conjunction with attribute values e.g. aria-label or alt.
Embedding JSON into an XML attribute comes with some complexity. Any special characters within the JSON MUST be properly escaped. One particular problem is that XML tends to use double-quotes (") to enclose attribute values while JSON mandates use of double-quotes to delimit property keys and string values. This specification defers to the XML and JSON specifications on how JSON data is serialized within the data-ssml XML attribute and allows for any valid/well-formed XML that contains valid/well-formed JSON after deserialization. Below are four legitimate methods of serializing JSON data within an XML attribute:
1. Single quote usage example: <p>This is some text with <span data-ssml='{"sub":{"alias":"Speech Synthesis Markup Language"}}'>SSML</span> substitution! </p>
2. Double quote usage example (XML entity encoding): <p>This is some text with <span data-ssml="{&quot;sub&quot;:{&quot;alias&quot;:&quot;Speech Synthesis Markup Language&quot;}}">SSML</span> substitution! </p>
3. Double quote usage example (Hex encoding): <p>This is some text with <span data-ssml="{&#x22;sub&#x22;:{&#x22;alias&#x22;:&#x22;Speech Synthesis Markup Language&#x22;}}">SSML</span> substitution! </p>
4. Double quote usage example (Decimal encoding): <p>This is some text with <span data-ssml="{&#34;sub&#34;:{&#34;alias&#34;:&#34;Speech Synthesis Markup Language&#34;}}">SSML</span> substitution! </p>
Implementers MUST NOT give preference to a particular serialization method as each will resolve to the same target string.
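The equivalence of the four serializations can be checked with a short Python sketch. It uses the standard library XML parser to read the attribute and the standard library JSON parser to decode it; the helper names `as_span` and `read_data_ssml` are hypothetical and exist only for this illustration.

```python
import json
import xml.etree.ElementTree as ET

JSON_TEXT = '{"sub":{"alias":"Speech Synthesis Markup Language"}}'


def as_span(attribute_markup: str) -> str:
    """Wrap an already-quoted attribute value in the sample markup."""
    return (
        "<p>This is some text with "
        f"<span data-ssml={attribute_markup}>SSML</span> substitution! </p>"
    )


# The same JSON text written with each of the four serialization methods.
VARIANTS = {
    "single quotes": as_span("'" + JSON_TEXT + "'"),
    "XML entity": as_span('"' + JSON_TEXT.replace('"', "&quot;") + '"'),
    "hex reference": as_span('"' + JSON_TEXT.replace('"', "&#x22;") + '"'),
    "decimal reference": as_span('"' + JSON_TEXT.replace('"', "&#34;") + '"'),
}


def read_data_ssml(markup: str) -> dict:
    """Parse the XML, then decode the JSON held in the data-ssml attribute."""
    span = ET.fromstring(markup).find("span")
    return json.loads(span.get("data-ssml"))


decoded = {name: read_data_ssml(markup) for name, markup in VARIANTS.items()}
for name, value in decoded.items():
    print(name, "->", value)
```

Because the XML parser resolves the entity and character references before the JSON parser sees the attribute value, a consumer written this way never needs to know which method the author used.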
Best practices may dictate that pauses should be used to aid comprehension, especially where content structure is complex. For example, the break tag within data-ssml can be used to introduce a pause, signaling a new section or emphasizing a point.
Content may be authored in assessments and learning materials that may not be spoken as expected by text to speech synthesizers. The data-ssml attribute may be used by authors to ensure that content will be presented correctly. Specific examples include:
The data-ssml sub function is recommended for providing alternate text for acronyms or abbreviations to ensure clarity. While some acronyms are well handled by TTS engines, many are not and can be confusing for students, and data-ssml allows authors to precisely control the presentation.
The data-ssml say-as function allows authors to precisely specify how numbers and text strings are to be read. Numeric values, dates, time, and text which must be read by character are all possible applications of say-as.
The data-ssml prosody function can be used to adjust the speed of speech delivery. Slower rates may be necessary for complex or highly technical material. There are currently no recommended use cases for increasing presentation rate.
An example of the use of the say-as element is:
<p>Please select the choice which is the base 10 equivalent to <span data-ssml='{"say-as":{"interpret-as":"characters"}}'>10111</span></p>
An example of the use of the phoneme element is:
<p>My latest travel guide is about <span data-ssml='{"phoneme":{"alphabet":"ipa","ph":"ˈrɛdɪŋ"}}'>Reading</span>, Pennsylvania.</p>
An example of the use of the sub element is:
<span data-ssml='{"sub":{"alias":"messenger RNA"}}'>mRNA</span>
An example of the use of the break element is:
<p>Our new vocabulary word is<span data-ssml='{"break":{"time":"250ms"}}'> </span>deoxyribonucleic acid.</p>
An example of the use of the prosody element is:
<p>Tyrannosaurus Rex is a dinosaur. Let's repeat that name slowly "<span data-ssml='{"prosody":{"rate":"slow"}}'>Tyrannosaurus Rex</span>".</p>
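To show how a delivery system might hand values like those in the examples above to a speech synthesizer, the following Python sketch turns a data-ssml value and its contained text into an SSML [SSML-11] fragment. This is one plausible mapping, not one defined by this specification: the function name `to_ssml` is hypothetical, the break element is assumed to be emitted ahead of the text, and when several functions are present they are assumed to nest in key order.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def to_ssml(data_ssml: str, text: str) -> str:
    """Wrap `text` in the SSML element(s) named by a data-ssml JSON string."""
    functions = json.loads(data_ssml)
    result = escape(text)
    for name, properties in functions.items():
        # Each JSON property becomes an attribute of the same name.
        attributes = "".join(
            f" {key}={quoteattr(str(value))}" for key, value in properties.items()
        )
        if name == "break":
            # break is an empty element; assumed here to precede the text.
            result = f"<break{attributes}/>{result}"
        else:
            result = f"<{name}{attributes}>{result}</{name}>"
    return result


print(to_ssml('{"sub":{"alias":"messenger RNA"}}', "mRNA"))
print(to_ssml('{"say-as":{"interpret-as":"characters"}}', "10111"))
print(to_ssml('{"break":{"time":"250ms"}}', " "))
```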
Extensions are not permitted in this specification. Therefore, the content in the data-ssml property MUST NOT contain values undefined in this specification.
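A minimal Python sketch of how an implementation might reject undefined content is given below. It is not a substitute for validation against the JSON Schema in Appendix C. The enumerations are copied from Table 3.2.4 and the required properties from the class tables; the two regular expressions for the time and percentage formats, and the names `MODEL` and `validate`, are assumptions made for this illustration.

```python
import json
import re

STRENGTH = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}
ALPHABET = {"ipa", "x-sampa"}
PH_TYPE = {"default", "ruby"}
RATE = {"x-slow", "slow", "medium", "fast", "x-fast", "default"}
INTERPRET_AS = {"date", "time", "telephone", "characters", "cardinal", "ordinal"}

# Assumed patterns; the normative definitions live in the JSON Schema.
CSS_TIME = re.compile(r"^\d+(\.\d+)?(ms|s)$")
PERCENT = re.compile(r"^\d+(\.\d+)?%$")


def one_of(values):
    return lambda v: isinstance(v, str) and v in values


def matches(pattern):
    return lambda v: isinstance(v, str) and pattern.match(v) is not None


def any_string(v):
    return isinstance(v, str)


# function name -> {property name: (required, value check)}
MODEL = {
    "break": {"strength": (False, one_of(STRENGTH)), "time": (False, matches(CSS_TIME))},
    "phoneme": {
        "ph": (True, any_string),
        "alphabet": (False, one_of(ALPHABET)),
        "type": (False, one_of(PH_TYPE)),
    },
    "prosody": {"rate": (False, lambda v: one_of(RATE)(v) or matches(PERCENT)(v))},
    "say-as": {"interpret-as": (True, one_of(INTERPRET_AS))},
    "sub": {"alias": (True, any_string)},
}


def validate(data_ssml: str) -> list[str]:
    """Return a list of problems; an empty list means nothing was rejected."""
    try:
        document = json.loads(data_ssml)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(document, dict):
        return ["top-level value must be a JSON object"]
    errors = []
    for function, properties in document.items():
        if function not in MODEL:
            errors.append(f"undefined function: {function}")
            continue
        if not isinstance(properties, dict):
            errors.append(f"{function}: value must be a JSON object")
            continue
        spec = MODEL[function]
        for name, value in properties.items():
            if name not in spec:
                errors.append(f"{function}: undefined property: {name}")
            elif not spec[name][1](value):
                errors.append(f"{function}.{name}: value not permitted: {value!r}")
        for name, (required, _) in spec.items():
            if required and name not in properties:
                errors.append(f"{function}: missing required property: {name}")
    return errors


print(validate('{"sub":{"alias":"messenger RNA"}}'))
print(validate('{"emphasis":{"level":"strong"}}'))
```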
This specification may be profiled. A profile must be a formal subset of the base specification. This ensures that, with the exception of namespace/schema location changes, any instance which is compliant to the profile MUST also be compliant to the base specification. This means that a profile must only increase the constraints on the properties of the data model. For example, an element with a multiplicity of [0..1] can have this changed to [1..1] but NOT [0..*]. Proprietary extensions are ONLY permitted as defined by the base specification.
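The multiplicity rule in the example above amounts to an interval check: the range allowed by the profile must lie inside the range allowed by the base specification. The Python sketch below illustrates this; the function names are hypothetical and the parser only handles the bracketed notation used in this document.

```python
import math


def parse_multiplicity(text: str) -> tuple[int, float]:
    """Parse '[0..1]', '[1..*]' or the shorthand '[1]' into (lower, upper)."""
    body = text.strip().strip("[]")
    lower, _, upper = body.partition("..")
    if not upper:
        upper = lower  # '[1]' is shorthand for '[1..1]'
    upper_bound = math.inf if upper == "*" else int(upper)
    return int(lower), upper_bound


def is_valid_restriction(base: str, profile: str) -> bool:
    """True when the profile range is contained in the base range."""
    base_low, base_high = parse_multiplicity(base)
    low, high = parse_multiplicity(profile)
    return base_low <= low <= high <= base_high


print(is_valid_restriction("[0..1]", "[1..1]"))  # tightening is allowed
print(is_valid_restriction("[0..1]", "[0..*]"))  # loosening is not
```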
It is strongly recommended that a profile of this specification is undertaken either by, or with the close support of, 1EdTech. However, no matter who is responsible for creating the profile artifacts (documents, XSDs, etc.), it is strongly recommended that the 1EdTech specification tools are used. This will ensure that the artifacts are consistent with the base specifications and that useful support documentation is automatically produced e.g. creation of a document that summarises the differences between the base specification and the profile. Organizations wishing to produce a profile of this specification should contact the VP Of Operations at: operations@1edtech.org.
There is NO separate certification against the data-ssml specification. Instead, certification for the use of this specification is part of the certification for QTI 3.0 [QTI-CERT-30]. In particular, the use of the data-ssml attribute is defined as a new capability for certification of Applications and Content Packages. The data-ssml XML attribute is used on the following QTI XML tags:
[BCP 47] | Matching of Language Tags (RFC 4647) and Tags for Identifying Languages (RFC 5646), A.Phillips and M.Davis, Internet Engineering Task Force, September 2009, https://www.rfc-editor.org/info/bcp47. |
[CSS-21] | Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification W3C Recommendation, B.Bos, T.Celik, I.Hickson and H.W.Lie, World Wide Web Consortium, June 2011, http://www.w3.org/TR/CSS21/. |
[ISO 8601] | ISO8601:2004 Data elements and interchange formats - Information interchange - Representation of dates and times, ISO, International Standards Organization (ISO), 2000. |
[OAS, 14] | OpenAPI Specification (version 2), D.Miller, J.Harmon, J.Whitlock, K.Hahn, M.Gardiner, M.Ralphson, R.Dolin, R.Ratovsky and T.Tam, OpenAPI Initiative (Linux Foundation), September 2014, https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md. |
[OAS, 17] | OpenAPI Specification (version 3), D.Miller, J.Whitlock, M.Gardiner, M.Ralphson, R.Ratovsky and U.Sarid, OpenAPI Initiative (Linux Foundation), July 2017, https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md. |
[QTI-CERT-30] | Question and Test Interoperability (QTI) 3.0 Conformance and Certification 1.0, Tom Hoffmann, Paul Grudnitski, Mark Molenaar and Colin Smythe, 1EdTech Consortium, May 2022, https://www.imsglobal.org/question/qtiv3p0/imsqtiv3p0_certificationv1p0.htm. |
[QTI-IMPL-30] | 1EdTech Question and Test Interoperability (QTI) 3.0 Implementation Guide 1.0, Tom Hoffmann, Paul Grudnitski, Mark Molenaar and Padraig O'hiceadha, 1EdTech Consortium, May 2022, https://www.imsglobal.org/question/qtiv3p0/imsqtiv3p0_implv1p0.html. |
[RFC 2119] | Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, IETF (RFC 2119), March 1997, https://tools.ietf.org/html/rfc2119. |
[Ruby] | Use Cases and Exploratory Approaches for Ruby Markup: W3C Working Group Note 08, R. Ishida, World Wide Web Consortium, October 2013, http://www.w3.org/TR/ruby-use-cases/. |
[SSML-11] | Speech Synthesis Markup Language (SSML) Version 1.1 W3C Recommendation, D.C.Burnett and Z.W.Shuang, World Wide Web Consortium, September 2010, http://www.w3.org/TR/speech-synthesis11/. |
[VDEX, 04] | 1EdTech Vocabulary Definition Exchange (VDEX) 1.0, Adam Cooper, 1EdTech Consortium, February 2004, https://www.imsglobal.org/vdex/vdexv1p0/imsvdex_infov1p0.html. |
[W3C-WCAG-22] | Web Content Accessibility Guidelines (WCAG) 2.2, A.Campbell, C.Adams, R.Bradley Montgomery, M.Cooper and A.Kirkpatrick, World Wide Web Consortium, October 2023, https://www.w3.org/TR/2023/REC-WCAG22-20231005/. |
[WAI-ARIA] | Accessible Rich Internet Applications (WAI-ARIA) 1.2, J.Diggs, J.Nurthen and M.Cooper, World Wide Web Consortium, March 2021, https://www.w3.org/TR/wai-aria-1.2/. |
This section is NOT NORMATIVE.
Table A1.1 provides the key to the descriptions of data model diagrams.
Feature | Definition and Usage |
---|---|
Data Model Package | Each data model description is enclosed in a UML Package that has the stereotype of « dataModel » under which is the name of the data model diagram being described. Only one logical data model can be described. |
DerivedType Class | This is a class that is identified by the stereotype « DerivedType » under which is the name of the data-type. A derived class is one that is derived either from another derived class or a PrimitiveType class. |
Enumeration Class | This is a class that is identified by the stereotype « Enumeration » under which is the name of the enumeration data-type. The enumeration class consists of the list of tokens that are the permitted values of the assigned attribute. |
Enumerated List Class | This is a class that is identified by the stereotype « EnumeratedList » under which is the name of the enumerated list data-type. The enumeration list class consists of the list of tokens that are the permitted values of the assigned attribute. A list of tokens is permitted using comma separation. |
PrimitiveType Class | This is a class that is identified by the stereotype « PrimitiveType » under which is the name of the primitive data-type. A PrimitiveType is one of the many base data-types on which a data model can be built (see Appendix A1.3 for the set of primitive types that are available). |
Selection Class | This is a class that is identified by the stereotype « Selection » under which is the name of the data-type. The selection means that only one of the listed attributes may occur in an instance. If this is an abstract class then multiple iterations of the instance may occur and the multiplicity of the attribute defines the constraints on the number of times the attribute can occur in the full instance. If the stereotype and associated name of the class are in italics this denotes the class is abstract. |
Sequence Class | This is a class that is identified by the stereotype « Sequence » under which is the name of the data-type. The sequence means that the listed attributes must occur only in the order of the attributes listed on the class. The associated multiplicity defines the number of times the attribute may occur consecutively in the instance. If the stereotype and associated name of the class are in italics this denotes the class is abstract. |
Unordered Class | This is a class that is identified by the stereotype « Unordered » under which is the name of the data-type. The unordering means that the listed attributes may occur in any order but the associated multiplicity for the attribute must be followed (when binding to XML this requires the use of Schematron rules to enforce the multiplicity). If the stereotype and associated name of the class are in italics this denotes the class is abstract. |
List Class | This is a class that is identified by the stereotype « List » under which is the name of the data-type. A list class is one in which the associated instance will consist of a list of objects that conform to the permitted data-types of the list (the superclasses for the list class). The terms in the list are separated by a space. |
Union Class | This is a class that is identified by the stereotype « Union » under which is the name of the data-type. A union class is one in which the associated instance will consist of objects that conform to any of the permitted data-types of the union (the superclasses for the union class). |
Characteristic Description | Many classes contain a set of characteristics (the set of characteristics are listed under the stereotype « Characteristics »). Each characteristic description consists of the scope, name, data-type and multiplicity (see Appendix A1.3 for a more complete description). Note that when bound to XSD/XML, a characteristic is mapped to an XML attribute. |
Attribute Description | Many classes contain a set of attributes (the set of attributes are listed under the stereotype « Attributes »). Each attribute description consists of the scope, name, data-type and multiplicity (see Appendix A1.3 for a more complete description). Note when bound to XSD/XML, an attribute is mapped to an XML element. |
Aggregation Arrow | This is an arrow with a white diamond head to indicate that the child class is an aggregate structure to the parent class i.e. the child class may exist without the context of the parent class. This association allows complex structures to be constructed with common subcomponents. |
Composition Arrow | This is an arrow with a filled diamond head to indicate that the child class is a composite structure of the parent class i.e. the child class only exists within the context of the parent class. This association allows complex structures to be constructed with common subcomponents. |
Generalization Arrow | This is an arrow with a white arrow head to indicate the class/superclass relationship. The arrow points in the direction of generality i.e. from the class to the super class. |
Table A1.2 provides the key to the descriptions of the data class tables.
Category | Definition |
---|---|
Class Name | The name given to the class being described. |
Class Type | The nature of the class. This is described as a "Container [...]" or "Abstract Container [...]". The value of "..." being (see Appendix A1.1 for the meaning of these values):
|
Parents | This is the list of classes that contain the class being described as either the type of a child characteristic or attribute. In the case of a Root Class the entry is also labeled as "Root Class". |
Derived Classes | The set of classes that are derived from this class (there may be none). The entries are linked to the corresponding class descriptions. |
Super Classes | The set of super classes from which the class being described is derived (there may be none). The entries are linked to the corresponding class descriptions. |
Characteristics | Lists the set of characteristics for this class. The list of characteristics includes those that are inherited. Each characteristic is linked to the corresponding characteristic description table. |
Children | Lists the set of attributes for this class (the only other permitted associations are generalizations). The list of children includes those attributes that are inherited. Each child entry is linked to the corresponding attribute description table. The nature of the relationship between the children is defined by the stereotype of the parent class i.e. the class type. If the child is in italics this denotes a reference to an abstract class and that an instance would NOT contain a child of that name but would be replaced by a complex set of children as defined by the associated abstract class. The marking of [P] is used to denote that the attribute has privacy implications that will be described in the corresponding description of the attribute. |
Link Data | Lists the set of attributes for this class that are used to provide links to other data objects in the data model. Many types of link references are available. This row is ONLY shown when the class contains at least one link data definition. |
Description | Contains descriptions relating to the class and its properties and relationships. |
Table A1.3 provides the key to the descriptions of the data attributes/characteristics for the data classes.
Category | Definition |
---|---|
Attribute Name or Characteristic Name | The name given to the attribute or characteristic being described. If the name is in italics this denotes an abstract attribute or characteristic. |
Data Type | This is the data-type of the attribute or characteristic (if this is in italics it denotes an abstract class). The data-type can take many forms:
|
Value Space | The range of valid values for this attribute/characteristic (including any default value). If the value space is unspecified, it is not known or is not important. This value space must be defined in terms of the associated data-type. |
Scope | This is the scope of the attribute/characteristic with permitted values of:
|
Multiplicity | A property of an attribute/characteristic indicating the number of times it may be used or appear in a given class instance. The values of this property are expressed as a range or shorthand for a range using the notation:
|
Privacy | Identifies the nature, if any, of the privacy sensitivity. If there are no privacy implications the phrase "There are NO privacy implications" is presented. When there are privacy implications the category of the privacy is presented (the available terms are defined in Privacy Data Description Appendix Subsection) along with a description of the privacy implications. |
Description | Contains descriptions relating to the attribute/characteristic and its value space. |
Link Data | Contains the description of the link data definition. A link to the corresponding detailed link data description is supplied. This row is ONLY shown when the attribute/characteristic is a link data definition. |
Table A1.4 provides the key to the descriptions of the enumerated vocabulary classes. These are vocabularies that will be contained within the binding form itself. They are contained within a class that has a stereotype of either « Enumeration » or « EnumeratedList ».
Category | Definition |
---|---|
Term | The vocabulary token itself i.e. the vocabulary entry. |
Definition | The meaning of the term and how it should be used. |
Table A1.5 provides the key to the descriptions of the external vocabulary classes. These are vocabularies that will be contained in some independent format e.g. using the 1EdTech VDEX [VDEX, 04].
Category | Definition |
---|---|
Term | The vocabulary token itself i.e. the vocabulary entry. |
Definition | The meaning of the term and how it should be used. This consists of the "Caption" and "Description" of the vocabulary term. The caption is used to provide a human readable label for the term. |
Table A1.6 provides the key to the descriptions of the import classes.
Category | Definition |
---|---|
Import Class Name | The name of the class. |
Parent Classes | The list of parent classes, and the associated children, that use this imported class. Each class and attribute name has a link to its corresponding tabular description in the information model. |
Description | The description of how the class is used within the data model. |
Table A1.7 provides the key to the descriptions of the link data definitions.
Category | Definition |
---|---|
Target Class Name | This is the name of the target class i.e. the destination point of the link reference. |
Link Type | This is the type of link that is being used. The types of link available are:
|
Link Sources | This is the set of classes that contain attributes/characteristics which use the link data defined by this entry. A link to the attribute/characteristic is provided. |
Source Attribute | This is the attribute/characteristic in the source object that contains the identifier of the target object (a characteristic name MUST start with an "@"). This will only be supplied if the pointer is contained within a substructure within the source object. If there is no source the statement "Not Applicable" will be displayed. |
Target Attribute | This is the attribute/characteristic in the target class which is the container for the identifier of the object being identified (a characteristic name MUST start with an "@"). It is the value for this identifier which MUST be supplied in the source object. For "CPResourceId" link types the fixed value of "@identifier" will be given. If there is no target the statement "Not Applicable" will be displayed. |
Parent Class Name | This is the name of the class that contains both the source and target attributes/characteristics. This value will only be supplied for the "IntraParentClassId" link types. If there is no parent class name the statement "Not Applicable" will be displayed. |
Description | The description of how the link data is used within the data model. |
Table A1.8 provides the key to the descriptions of the privacy data definitions.
Category | Definition |
---|---|
Attribute | The name of the attribute. This is the list of ALL of the attributes in the class and NOT just those which have privacy implications. |
Multiplicity | A property of an attribute/characteristic indicating the number of times it may be used or appear in a given class instance. This information identifies which attributes MAY/MUST NOT be excluded from the data being exchanged. The values of this property are expressed as a range or shorthand for a range using the notation:
|
Data-type | The data-type of the attribute (the permitted set of values is listed in the Attribute and Characteristics Descriptions subsection in this Appendix). This information identifies those attributes which MAY be obfuscated and/or encrypted without violating the data-type. |
Privacy Implication | The set of categories that can be applied to an attribute/characteristic:
|
Description | Details of the nature of the privacy implications. |
Table A1.9 provides the key to the descriptions of the common data model persistent identifier definitions.
Category | Definition |
---|---|
Name | This is the name of the data model component which has been assigned a common data model persistent identifier. |
Type | This is the type of link that is being used. The types of link available are:
|
Persistent Identifier | The common data model persistent identifier that has been assigned to this data model component. By definition, this is a unique (within the context of the 1EdTech Common Data Model) and very long-lived identifier. |
Table B1.1 provides the key to the descriptions of UML to JSON service parameter mapping tables.
Feature | Definition and Usage |
---|---|
Operation Name | The name of the operation (this will be a list of all of the operations for the set of defined interfaces). |
Parameter Name | The name of the service parameter (these are the parameters listed for the operation that are not mapped to endpoint query parameters). |
UML Class | The name of the class, the type of the parameter, in the UML diagrams (each class will have an associated stereotype label to denote its modeling interpretation). If the information model description is contained within the same document, this value is hot-linked to that description. |
JSON Name | The equivalent name of the JSON parameter name in the JSON payload. |
JSON Type | The JSON type - this will be either "Object" or "Array of Objects". |
JSON Schema Data Type | The data-type in the context of the JSON Schema. This is hot-linked to the corresponding description table in the binding. |
Table B1.2 provides the key to the descriptions of UML to JSON payload class mapping tables. This table shows the relationship between the two modeling components:
Feature | Definition and Usage |
---|---|
Name | The name of the UML class and the associated set of attributes and characteristics. The first row is used to describe the UML class. Camel-case is used for the attribute and characteristic names. |
UML Artifact | The UML Class will be denoted as "Root" or "Core" depending on the nature of the class. The list of attributes (mapped to JSON properties) and characteristics (mapped to JSON properties) will be identified as either "Attributes" or "Characteristics". |
Data Type | The data-type has several permitted values:
|
Multiplicity | The multiplicity of the child attribute/characteristic. The value for the Class itself is "-N/A-". The multiplicity values are:
|
JSON Name | This is the equivalent name of the UML artifact in the JSON. |
JSON Type | The JSON data-type. For the Class this will have the value "Object". For the attributes the value is either "Property" or "Array of Properties" depending on the multiplicity. For the characteristics the value is either "Property" or "Array of Properties" depending on the multiplicity. |
Table B1.3 provides the key to the descriptions of UML to JSON enumerated and enumerated list class mapping tables.
Feature | Definition and Usage |
---|---|
Enumeration Class Name or Enumeration List Class Name | The name of the enumeration class or the enumeration list class. |
Description | The list of permitted tokens for the enumeration or list. Each value is separated by the "|" character. |
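As an illustration of the enumeration mapping, a token list that a mapping table would show as "ipa | x-sampa" corresponds to a JSON Schema `enum` keyword. The fragment below is taken from the `alphabet` property of `PhonemeDType` in the JSON Schema listed later in this document; the wrapping object is added only so that the fragment is well-formed JSON on its own.

```json
{
  "alphabet": {
    "type": "string",
    "enum": [ "ipa", "x-sampa" ]
  }
}
```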
Table B1.4 provides the key to the descriptions of UML to JSON primitive-type mapping tables.
Feature | Definition and Usage |
---|---|
Primitive Type Name | The name of the primitive type used in the specification. Links to the definition of the primitive types, if provided elsewhere in the document, are supplied. |
Description | The equivalent base data type that is used in the JSON binding. |
These definitions are with respect to the OpenAPI version 2 [OAS, 14] and version 3 [OAS, 17] specifications.
Tables B2.1a and B2.1b provide the key to the OpenAPI(2) and OpenAPI(3) general information respectively.
Category | Definition and Usage |
---|---|
Swagger Version | The version of the OpenAPI/Swagger specification for this OpenAPI description (this must be set as 2.0). |
Specification Title | The title of the specification being described. |
Specification Version | The version of the specification being described. |
Description | A short, human readable description of the specification being described using OpenAPI. |
Terms of Service | The Terms of Service for the API. |
Contact | The contact information for the API. For the IMS OpenAPI released files this will be set as "Lisa Mattson (IMS COO)". When used for an implementation this should be changed to the actual contact person. |
License | The URL for the associated IMS License for the use of this OpenAPI description. |
Host | The host (name or ip) serving the API. For the IMS OpenAPI(2) released files this will be set as "www.imsglobal.org". When used for an implementation this should be changed to the actual host. |
Base Path | The base path that MUST be used in the endpoint URLs (this is relative to the host). |
Schemes | The set of transfer protocols that are supported using this API. This is a comma-separated list. |
Consumes | A list of MIME types the APIs can consume. This is a comma-separated list. |
Produces | A list of MIME types the APIs can produce. This is a comma-separated list. |
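A minimal sketch of how the OpenAPI(2) general information categories could appear in an OpenAPI 2.0 file is given below. The `swagger`, `host` and contact values follow the table above and the license URL is the one given in this document's notices. The title, version, description, terms of service URL, license name, base path and MIME types are hypothetical placeholders, not values defined by this specification.

```json
{
  "swagger": "2.0",
  "info": {
    "title": "Hypothetical Example Service",
    "version": "1.0",
    "description": "A short, human readable description of the specification being described.",
    "termsOfService": "https://example.org/terms",
    "contact": { "name": "Lisa Mattson (IMS COO)" },
    "license": {
      "name": "Hypothetical license name",
      "url": "https://www.1edtech.org/speclicense.html"
    }
  },
  "host": "www.imsglobal.org",
  "basePath": "/hypothetical/v1p0",
  "schemes": [ "https" ],
  "consumes": [ "application/json" ],
  "produces": [ "application/json" ],
  "paths": {}
}
```

Note that in the OpenAPI file itself the Schemes, Consumes and Produces values are JSON arrays; the comma-separated form applies to the tabular description.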
Category | Definition and Usage |
---|---|
OpenAPI Version | The version of the OpenAPI specification for this OpenAPI description (this must be set as 3.0.0). |
Specification Title | The title of the specification being described. |
Specification Version | The version of the specification being described. |
Description | A short, human readable description of the specification being described using OpenAPI. |
Terms of Service | The URL for the associated Terms of Service for the use of this OpenAPI description. |
Contact | The contact information for the API. For the IMS OpenAPI released files this will be set as "Lisa Mattson (IMS COO)". When used for an implementation this should be changed to the actual contact person. |
License | The URL for the associated IMS License for the use of this OpenAPI description. |
Servers | The host (name or ip) serving the API. For the IMS OpenAPI(3) released files this will be set as "www.imsglobal.org/{base-path}". When used for an implementation this should be changed to the actual host. |
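The OpenAPI(3) equivalent can be sketched as follows. The main structural difference from OpenAPI(2) is that the host and base path are combined into a single entry in the `servers` array. As before, the title, version, description, terms of service URL, license name and the path portion of the server URL are hypothetical placeholders.

```json
{
  "openapi": "3.0.0",
  "info": {
    "title": "Hypothetical Example Service",
    "version": "1.0",
    "description": "A short, human readable description of the specification being described.",
    "termsOfService": "https://example.org/terms",
    "contact": { "name": "Lisa Mattson (IMS COO)" },
    "license": {
      "name": "Hypothetical license name",
      "url": "https://www.1edtech.org/speclicense.html"
    }
  },
  "servers": [
    { "url": "https://www.imsglobal.org/hypothetical/v1p0" }
  ],
  "paths": {}
}
```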
Table B2.2 provides the key to the tabular description of the OpenAPI tags information (versions 2 and 3).
Category | Definition and Usage |
---|---|
Tag Name | The title of the tag (this must be unique). The tags are derived from the set of Interfaces defined in the Behavioral Model. |
Description | A human readable description of the tag. This is the comment associated with the Interface in the Behavioral Model. The endpoints assigned to this tag are listed with links to the OpenAPI description of those endpoints. |
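For illustration, a tags declaration with this shape is valid in both OpenAPI versions 2 and 3. The tag name and description are hypothetical; in a real file they would be derived from an Interface in the Behavioral Model.

```json
{
  "tags": [
    {
      "name": "HypotheticalManager",
      "description": "The comment associated with the hypothetical Interface in the Behavioral Model."
    }
  ]
}
```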
Table B2.3 provides the key to the tabular description of the OpenAPI security information.
Category | Definition and Usage |
---|---|
Security Label | The label by which this mode is identified within the OpenAPI file. |
Type | The security mode supported. The permitted values are:
|
Description | A human readable description of the usage of this security scheme. |
Flow | The flow used by the OAuth2 security scheme. The permitted values in OAS2 are:
The permitted values in OAS3 are:
|
Token URL | The token URL to be used for this flow. A value MUST be supplied for the "password", "application" and "accessCode" flows in OAuth 2. |
Authorization URL | The authorization URL to be used for this flow. A value MUST be supplied for the "accessCode" flows in OAuth 2. |
Refresh URL | The refresh URL to be used for this flow. A value MAY be supplied for the "accessCode" flows in OAuth 2. |
Scopes | The set of labels by which the global scope will be identified. |
Global Scope | The default identification of the security mode to be applied to an endpoint. |
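A minimal sketch of an OpenAPI(2) security declaration using these categories is shown below. The `type`, `flow`, `tokenUrl` and `scopes` property names are those defined by the OpenAPI 2.0 specification for an OAuth 2 security scheme. The security label `OAuth2CC`, the token URL and the scope URI are hypothetical and are not defined by this specification.

```json
{
  "securityDefinitions": {
    "OAuth2CC": {
      "type": "oauth2",
      "description": "Hypothetical OAuth 2 scheme using the application flow.",
      "flow": "application",
      "tokenUrl": "https://example.org/oauth2/token",
      "scopes": {
        "https://example.org/scope/hypothetical.readonly": "Hypothetical permission to read resources."
      }
    }
  }
}
```

In OpenAPI(3) the same information would be declared under `components.securitySchemes`, with the flow details nested inside a `flows` object.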
Table B2.4 provides the key to the OpenAPI paths information for an HTTP Verb.
Category | Definition and Usage |
---|---|
Operation ID | A unique identifier for the service operation. This is the name of the operation defined in the Behavioral Model. |
Summary | A human readable summary of the objective of the service operation. |
Tags | The tag which has been assigned to this operation. The tag is determined by the Interface under which the operation is defined in the Behavioral Model. |
Security and Scopes | The list of security modes and scopes that MUST be used to enable access to this endpoint. |
Privacy Classification | This is the classification of the JSON Payload with respect to the nature of the confidentiality as defined in the 1EdTech Privacy framework. All privacy classifications are identified using the property name of "x-1edtech-confidentiality" for each endpoint definition under the "paths" part of the OpenAPI file. The available classifications are:
|
Description | A human readable summary of the objective of the service operation. This is derived from the associated description of the operation supplied in the Behavioral Model. |
Path Placeholders | The set of placeholders, and their meaning, in the URL path that will be replaced by the appropriate values in the request calls. |
Query Parameters | The set of query parameters that are permitted on the request calls. For each query parameter the following information is supplied:
|
Responses | The set of query responses that are permitted for the request. For each response the following information is supplied:
|
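The categories in the table above can be pictured with the following OpenAPI(2) path fragment. This is a sketch under stated assumptions: the path, operation identifier, tag, security label, scope, parameters and response schema reference are all hypothetical. The value given for "x-1edtech-confidentiality" is a placeholder and not one of the defined classifications, and placing the property at the operation level is also an assumption.

```json
{
  "paths": {
    "/hypotheticals/{sourcedId}": {
      "get": {
        "operationId": "getHypothetical",
        "summary": "A human readable summary of the objective of the service operation.",
        "tags": [ "HypotheticalManager" ],
        "security": [
          { "OAuth2CC": [ "https://example.org/scope/hypothetical.readonly" ] }
        ],
        "x-1edtech-confidentiality": "placeholder-classification",
        "description": "A longer description derived from the Behavioral Model.",
        "parameters": [
          {
            "name": "sourcedId",
            "in": "path",
            "required": true,
            "type": "string",
            "description": "Replaces the {sourcedId} placeholder in the URL path."
          },
          {
            "name": "limit",
            "in": "query",
            "required": false,
            "type": "integer",
            "description": "A hypothetical query parameter."
          }
        ],
        "responses": {
          "200": {
            "description": "The request was completed successfully.",
            "schema": { "$ref": "#/definitions/HypotheticalDType" }
          },
          "default": {
            "description": "The request was not completed."
          }
        }
      }
    }
  }
}
```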
Table B2.5 provides the key to the OpenAPI definitions information.
Category | Definition and Usage |
---|---|
Annotations | The definition of the complex-type as supplied in the data model definition for the associated class. |
Diagram | This diagram consists of two types of linked blocks. Straight link lines denote the set of unordered JSON properties. The block forms are:
|
Type Hierarchy | The identification of the superclass upon which this type is based (the superclass is shown on the top line). This indicates the source of the inherited set of JSON properties (this line is only displayed when there is a type hierarchy). |
Model | The set of child properties. This is an ordered list of properties (as per the implied or actual sequence in the object) and accompanied by their multiplicity. In the case where the type is an enumeration or primitiveType then the value is "N/A". The value may also be "Empty" to indicate that no children are permitted. In some situations the value may be "None" denoting that there are no children defined e.g. for a base class from which other classes are derived and which may have children as part of the extension. |
Privacy and Confidentiality | The set of annotations for each class/data-type defined in the OpenAPI files. Each of these annotations is placed after the list of properties for the class/data-type definition. The annotations are:
|
Source (OAS2) | The equivalent JSON Schema (OpenAPI dialect) code for the declaration of the complex-type. This is the full declaration. See the corresponding OpenAPI 2 documentation [OAS, 14] for the description of the permitted contents for this declaration. |
Source (OAS3) | The equivalent JSON Schema (OpenAPI dialect) code for the declaration of the complex-type. This is the full declaration. See the corresponding OpenAPI 3 documentation [OAS, 17] for the description of the permitted contents for this declaration. |
This is the set of JSON Schema listings used to validate the data models defined in this specification.
The JSON Schema listing is shown below (the JSON Schema is available at: https://purl.imsglobal.org/spec/qti/v3p0/schema/jsd/imsqtiv3p0_datassml_v1p0.json).
{
  "$schema": "http://json-schema.org/draft/2020-12/schema",
  "$id": "https://www.imsglobal.org/jsd/qtiv3p0/DataSSML.json",
  "title": "Support for Speech Synthesis Markup Language (SSML) Using the 'data-ssml' Property Version 1.0 Final Release JSON Schema Binding (DataSSML)",
  "description": "Author-Mark Hakkinen (ETS) and Colin Smythe (1EdTech); Version-1.0; Release Date-1st July 2024. ",
  "type": "object",
  "properties": {
    "break": {
      "description": "The break element controls the pausing or other prosodic boundaries between tokens e.g. the space between two words in a sentence.",
      "$ref": "#/$defs/BreakDType"
    },
    "phoneme": {
      "description": "The phoneme element provides a phonemic/phonetic pronunciation for the contained text.",
      "$ref": "#/$defs/PhonemeDType"
    },
    "prosody": {
      "description": "The prosody element permits control of the pitch, speaking rate and volume of the speech output.",
      "$ref": "#/$defs/ProsodyDType"
    },
    "say-as": {
      "description": "The say-as element allows the author to indicate information on the type of text construct contained within the element and to help specify the level of detail for rendering the contained text.",
      "$ref": "#/$defs/SayAsDType"
    },
    "sub": {
      "description": "The sub element is employed to indicate that the text in the alias attribute value replaces the contained text for pronunciation.",
      "$ref": "#/$defs/SubDType"
    }
  },
  "additionalProperties": false,
  "$defs": {
    "BreakDType": {
      "description": "The Break Function is used to inform the TTS to generate a pause of specified strength or time duration.",
      "type": "object",
      "properties": {
        "strength": {
          "description": "This attribute is used to indicate the strength of the prosodic break in the speech output. The value 'none' indicates that no prosodic break boundary should be outputted, which can be used to prevent a prosodic break which the processor would otherwise produce. The other values indicate monotonically non-decreasing (conceptually increasing) break strength between tokens. The stronger boundaries are typically accompanied by pauses. 'x-weak' and 'x-strong' are mnemonics for 'extra weak' and 'extra strong', respectively.",
          "type": "string",
          "enum": [ "medium", "none", "strong", "weak", "x-strong", "x-weak" ],
          "default": "medium"
        },
        "time": {
          "description": "Model Primitive Datatype = NormalizedString. This attribute is used to indicate the duration of a pause to be inserted in the output in seconds or milliseconds. It follows the time value format from the Cascading Style Sheets Level 2 Recommendation [CSS-21] e.g. '250ms', '3s'.",
          "type": "string"
        }
      },
      "additionalProperties": false
    },
    "PhonemeDType": {
      "description": "The Phoneme function provides a phonemic/phonetic pronunciation for the contained text. It is recommended that the information contain human-readable text that can be used for non-spoken rendering of the document. For example, the content may be displayed visually for users with hearing impairments.",
      "type": "object",
      "properties": {
        "ph": {
          "description": "Model Primitive Datatype = String. This is the phoneme/phone string to be used. It is designed strictly for phonemic and phonetic notations and is intended to be used to provide pronunciations for words or very short phrases. The phonemic/phonetic string does not undergo text normalization and is not treated as a token for lookup in the lexicon.",
          "type": "string"
        },
        "alphabet": {
          "description": "Specifies the phonemic/phonetic pronunciation alphabet. A pronunciation alphabet in this context refers to a collection of symbols to represent the sounds of one or more human languages.",
          "type": "string",
          "enum": [ "ipa", "x-sampa" ]
        },
        "type": {
          "description": "Indicates additional information about how the pronunciation information is to be interpreted.",
          "type": "string",
          "enum": [ "default", "ruby" ],
          "default": "default"
        }
      },
      "required": [ "ph" ],
      "additionalProperties": false
    },
    "ProsodyDType": {
      "description": "The Prosody Function is used to inform the TTS to speak the contained text using one or more of the specified prosody values as defined by the associated attributes.",
      "type": "object",
      "properties": {
        "rate": {
          "description": "A change in the speaking rate for the contained text. Legal values are: a non-negative percentage or an enumerated vocabulary. Labels 'x-slow' through 'x-fast' represent a sequence of monotonically non-decreasing speaking rates. When the value is a non-negative percentage it acts as a multiplier of the default rate.",
          "anyOf": [
            {
              "type": "string",
              "enum": [ "x-slow", "slow", "medium", "fast", "x-fast", "default" ]
            },
            {
              "description": "Model Primitive Datatype = NormalizedString.",
              "type": "string"
            }
          ]
        }
      },
      "additionalProperties": false
    },
    "SayAsDType": {
      "description": "The Say-As function allows an author to indicate information on the type of text construct contained within the element and to help specify the level of detail for rendering the contained text.",
      "type": "object",
      "properties": {
        "interpret-as": {
          "description": "Indicates the content type of the contained text construct. Specifying the content type helps the synthesis processor to distinguish and interpret text constructs that may be rendered in different ways depending on what type of information is intended.",
          "type": "string",
          "enum": [ "cardinal", "characters", "date", "ordinal", "telephone", "time" ]
        }
      },
      "required": [ "interpret-as" ],
      "additionalProperties": false
    },
    "SubDType": {
      "description": "The Sub Function is used to inform TTS to speak the supplied text string (alias) instead of speaking the contained text string.",
      "type": "object",
      "properties": {
        "alias": {
          "description": "Model Primitive Datatype = String. Specifies the string to be spoken instead of the enclosed string. The processor should apply text normalization to the alias value.",
          "type": "string"
        }
      },
      "required": [ "alias" ],
      "additionalProperties": false
    }
  }
}
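To make the schema above more concrete, the following sketch lists five candidate 'data-ssml' values, one for each top-level property. Each element of the array is a separate value intended to conform to the schema; the enclosing array is only a presentation device and is not itself a valid value. The specific strings (the pause duration, the IPA transcription and the alias text) are illustrative assumptions.

```json
[
  { "break": { "strength": "strong", "time": "250ms" } },
  { "phoneme": { "ph": "təˈmeɪtoʊ", "alphabet": "ipa" } },
  { "prosody": { "rate": "slow" } },
  { "say-as": { "interpret-as": "date" } },
  { "sub": { "alias": "World Wide Web Consortium" } }
]
```

By contrast, a value such as `{ "say-as": {} }` would be expected to fail validation because "interpret-as" is required, and a value with an unlisted top-level property such as `{ "emphasis": {} }` would fail because "additionalProperties" is false.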
Title: | 1EdTech Support for Speech Synthesis Markup Language (SSML) Using the 'data-ssml' Property v1.0 |
Editors: | Colin Smythe (1EdTech) Mark Hakkinen (ETS) Tom Hoffman (1EdTech) Susan Haught (1EdTech) |
Co-chairs: | Mark Hakkinen (ETS) Mike Powell (Pearson) Padraig O'hiceadha (Houghton Mifflin Harcourt) |
Version: | 1.0 |
Version Date: | 1st July 2024 |
Status: | 1EdTech Final Release |
Summary: | In the 1EdTech Question and Test Interoperability: Assessment, Section and Item 3.0 specification the 'data-ssml' property is the container for the equivalent JSON value that is to be passed to an SSML processor. The expected data format for that JSON is defined in this document. |
Revision Information: | The original version of this specification document. |
Purpose: | For internal review by the 1EdTech QTI Project Group. |
Document Location: | https://www.imsglobal.org/question/ |
The following individuals contributed to the development of this document:
Paul Grudnitski | amp-up.io (USA) |
Mark Hakkinen | ETS (USA) |
Susan Haught | 1EdTech Consortium (USA) |
Tom Hoffmann | 1EdTech Consortium (USA) |
Mark Molenaar | Apenutmize (Netherlands) |
Padraig O'hiceadha | HMH (Eire) |
Julien Sebire | O.A.T. (Luxembourg) |
Colin Smythe | 1EdTech Consortium (USA) |
Wyatt VanderStucken | ETS (USA) |
Sarah Wood | ETS (USA) |
Version No. | Release Date | Comments |
---|---|---|
Final Release 1.0 | 1st July, 2024 | This is the first formal release of the QTI 'Data-SSML' specification. This document should be used in the context of the 1EdTech Question and Test Interoperability (QTI): Assessment, Section and Item (ASI) specification version 3.0. |
1EdTech Consortium, Inc. ("1EdTech") is publishing the information contained in this document ("Specification") for purposes of scientific, experimental, and scholarly collaboration only.
1EdTech makes no warranty or representation regarding the accuracy or completeness of the Specification.
This material is provided on an "As Is" and "As Available" basis.
The Specification is at all times subject to change and revision without notice.
It is your sole responsibility to evaluate the usefulness, accuracy, and completeness of the Specification as it relates to you.
1EdTech would appreciate receiving your comments and suggestions.
Please contact 1EdTech through our website at https://www.1edtech.org.
Please refer to Document Name: 1EdTech Support for Speech Synthesis Markup Language (SSML) Using the 'data-ssml' Property v1.0
Date: 1st July 2024