The schema and the vocabulary tables below explain what the elements describe and how the form should be completed.

**Please note that the metadata records will be publicly searchable, and as such they should not contain information which may identify participants. No names of actors or institutions should be listed if doing so may identify a participant.**

THE METADATA FORM

The metadata record should be completed on the Excel spreadsheet named CAVA metadata form, available from [the CAVA documents page](http://www.ucl.ac.uk/ls/cava/docs.shtml). Elements in the schema appear horizontally in the top row of the Excel metadata form. Each recording (each unique file) corresponds to a row in the table, as can be seen below. In this case, 7 JC 12-03 and 8 JC 03-04 are unique AVI files.

HOW TO COMPLETE THE FORM

Each unique file should be entered on a new row. A large amount of the metadata will be repeated in longitudinal datasets, because multiple files refer to the same actor. It will save time to copy and paste blocks of elements; for instance, elements 5-15, 23, 25-42 and 44-46 will normally be the same for all files which feature the same actor. It is recommended that entries are grouped by actor in order to save effort, as in the example metadata (in the Excel document CAVA metadata form).

If you are depositing multiple versions of the same recording (for instance 7 JC 12-03 as an AVI, an MPEG-1 and a WAV file), please complete only one row on the form, as the metadata record will be the same for each version. In that case, please also complete the Associated files table on sheet 2 of the form.

Vocabularies are indicative only. If you wish to use a term that the existing vocabularies do not cover, please use it. However, bear in mind that discoverability should be the main concern in completing the form. Please do not use a term which differs only in wording from one already on the list.
For example, if the recording features an augmentative/alternative communication aid, please do not write AAC, as this is not as intuitively searchable.

Aside from those which are boolean (yes/no choices), all fields on the form are free text. This means that multiple answers to each element are encouraged. Separate these with a comma. For example, in element 20, Communication modes, you may write gesture, sign, vocalisations, eye gaze, facial expressions, deictic (pointing) gestures if all these are present.

Please leave the Rights element (element 45 in the schema) blank, as the tiers of access will be assigned by the CAVA team. Aside from Rights, please attempt to complete all the fields on the form for each recording. The more comprehensive the information you provide, the easier it will be for users of the repository to search for the data.

Any element which is listed in [brackets] can be left blank. We prefer a record that has at least all the un-bracketed elements completed. If you come across an un-bracketed field you cannot complete, leave it blank unless the open vocabularies require you to differentiate between Unknown and Unspecified.

If you have any queries, or for any further information regarding the metadata form, please contact Matt Mahon (the CAVA Project Officer) at lib-cava@ucl.ac.uk.

THE CAVA SUBSET SCHEMA

This schema shows how the elements relate to each other, and what subgroups they fall into. Elements in [brackets] may be left blank. Elements marked with (c) are subject to a controlled vocabulary. Elements marked (boolean) are subject to a yes/no choice.
| No. | Element |
| --- | --- |
| | Object + |
| 1 | Identifier |
| 2 | Date (c) |
| 3 | Original format (c) |
| 4 | Format history |
| | Location (sub) |
| 5 | Country (c) |
| 6 | Description |
| | Project + |
| 7 | Name |
| 8 | ID |
| | Contact (sub) |
| 9 | Name |
| 10 | Contact's organisation |
| 11 | Longitudinal project (boolean) |
| 12 | Description |
| | Content + |
| 13 | Genre |
| 14 | Subgenre |
| 15 | Communication context |
| | Languages (sub) |
| 16 | Number of languages (c) |
| 17 | Spoken language ID (c) |
| 18 | Sign language ID (c) |
| 19 | Language variety |
| 20 | Communication modes |
| | Transcription (sub) |
| 21 | Transcription (boolean) |
| 22 | [Transcription format] |
| | Actors + |
| 23 | ID |
| 24 | Age (c) |
| 25 | Age band (c) |
| 26 | Sex (c) |
| 27 | [Occupation or previous occupation] |
| 28 | [Actor notes] |
| | Condition (sub) |
| 29 | Condition |
| 30 | Condition subtype |
| 31 | Cause of condition |
| 32 | Onset of condition |
| 33 | Intervention history |
| 34 | Family history |
| 35 | [Hearing status] |
| 36 | [Vision status] |
| 37 | [Handedness] |
| 38 | [Sign language experience] |
| | Education (sub) |
| 39 | [Education leaving age] (c) |
| 40 | [School type] |
| 41 | [Class kind] |
| 42 | [Education model] |
| 43 | [Boarding school] (boolean) |
| 44 | Secondary actor(s) notes |
| | Access + |
| 45 | Rights (c) |
| 46 | Rights evaluation date (c) |
| 47 | Owner |

ELEMENT DESCRIPTIONS AND INDICATIVE VOCABULARIES

The table below explains what each element describes and how it should be completed. It works as follows:

| ELEMENT | DESCRIPTION | INDICATIVE VOCABULARY |
| --- | --- | --- |
| OBJECT + | | |
| Identifier | The name of the session (file). | Controlled; see Table 3. |
| Date (c) | The date the file was created. YYYY-MM, or circa. | Controlled |
| Original format (c) | The format in which the recording was first made. | Controlled |
| Format history | An open description of any changes to the format of the recording. | Free text. For example, Converted to AVI, MPEG-1 and WAV for deposit. |
| Location (sub) | | |
| Country | The country in which the recording was made. | Controlled |
| Description | An open description of the location. | Name the town or city and more specific location. For example, if Country is United Kingdom, the description might include London, Primary Care Trust clinic. It is not appropriate to name the institution where the recording took place if this may help to identify the participants. |
| PROJECT + | | |
| Name | The name of the project for which the recording was made. | Free text. For example, EAL deaf children. |
| ID | The ID number of the project. | Alphanumeric. For example, HMM-DOH or ESRC R000239306. |
| Contact (sub) | | |
| Contact name | The name of the primary researcher(s) on the project. | Free text. For example, Dr Suzanne Beeke. |
| Contact's organisation | The organisation at which the primary researcher(s) are based. | Free text. |
| Longitudinal project (boolean) | Is this session part of a longitudinal dataset? | { yes \| no } |
| Project description | An open description of the project. | Free text. |
| CONTENT + | | |
| Genre | The genre of the session. | The following open vocabulary is suggested: Alone; Group; One:One |
| Subgenre | The subgenre of the session. | The following open vocabulary is suggested: Adult and adult; Adult and speech and language therapist; Adult parent and adult child; Child and child; Child and parent; Child and sibling; Child and teacher; Child and speech and language therapist; Family group; Partners; Peer group; Spouses |
| Communication context | The communication context. | The following open vocabulary is suggested: Assessment session; Booksharing; Free play; Institutional conversation; Peer conversation; Teaching session; Therapy session |
| Languages (sub) | | |
| Number of languages (c) | The number of languages, spoken or signed, used in the recording. | Controlled |
| Spoken language ID (c) | The ID of the spoken language(s) used. | Controlled |
| Sign language ID (c) | The ID of the sign language(s) used. | Controlled |
| Language variety | The variety of languages used. | List any dialect or further language detail which is not recorded by the encoding for language IDs. For example, if Spoken language ID is eng, Language variety may include "Estuary" or "Wife using Malay English, husband responding in Tamil", and so on. |
| Communication modes | Communication modes used. | An open description of modalities used in the recording. The following open vocabulary is suggested: Augmentative/alternative communication aid; Cultural gestures; Deictic (pointing) gestures; Emotional states; Enactment; Eye gaze; Haptics (touch); Signs (from Sign Language lexicon); Speech; Writing; Drawing |
| Transcription (sub) | | |
| Transcription (boolean) | Are there any transcripts associated with the session? | { yes \| no } |
| [Transcription format] | An open description of the type of transcription documents associated with the session. | Use the list below, or name the appropriate file extension or FourCC from the controlled vocabulary Original format. The following open vocabulary is recommended: Unknown; Unspecified; Atlas TI; ELAN; Rich Text Format; Transana |
| ACTOR + | | |
| ID | Unique identifier for the primary actor in the session. | Alphanumeric. This should correspond to the owner's encoding as used in any associated transcriptions. It is not appropriate to name the actor. Please use a pseudonym or identifier. |
| Age (c) | The age of the primary actor. | Controlled |
| Age band (c) | The age band of the primary actor. | Controlled |
| Sex (c) | The sex of the primary actor. | The following open vocabulary is used: Unknown; Unspecified; Male; Female; Transsexual |
| [Occupation or previous occupation] | The occupation or previous occupation of the primary actor. | Free text. Leave blank if the actor is a child. |
| [Actor notes] | Any further notes on the actor. | Free text. |
| Condition (sub) | | |
| Condition | The general condition of the primary actor. | The following open vocabulary is used: Unknown; Unspecified; Age related hearing loss; Aphasia; Autistic spectrum disorder (Adult); Autistic spectrum disorder (Child); Cerebral Palsy; Cognitive communication disorder; Deafness (Adult); Deafness (Child); Dementia; Dysarthria; Dyslexia; Dyspraxia; Language impairment (Child); Language Impairment (Adult); Learning Disability (Adult); Learning Disability (Child); Other physical disability; Progressive neurological; Second/additional language; Stammering; Typically ageing; Typically developing |
| Condition subtype | An open description of the specific condition of the actor. | More detail on the actor's condition. For example, if the condition is Deafness (Child), then the Subtype may be Sensori-neural bilateral hearing loss; if the condition is Aphasia then the Subtype may be Agrammatic aphasia, etc. The following open vocabulary is suggested: Unknown; Unspecified; [free text] |
| Cause of condition | The cause of the condition. | The following open vocabulary is suggested: Unknown; Unspecified; Congenital; Stroke; Head injury; Brain tumour |
| Onset of condition | An open description of the onset of the condition. | If dates are included, please format as YYYY-MM or YYYY-MM-DD. The following open vocabulary is suggested: Unknown; Unspecified; [free text] |
| Intervention history | An open description of the history of interventions. | If dates are included, please format as YYYY-MM or YYYY-MM-DD. The following open vocabulary is suggested: Unknown; Unspecified; or entries of the form "YYYY-MM, [intervention]; YYYY-MM, [intervention]" |
| Family history | An open description of the history of the specific condition in the actor's family. | The following open vocabulary is suggested: Unknown; Unspecified; [free text] |
| [Hearing status] | The hearing status of the primary actor. | The following open vocabulary is suggested: Unknown; Unspecified; Deaf; Hard-of-hearing; Hearing; No reported difficulties |
| [Vision status] | The vision status of the primary actor. | The following open vocabulary is suggested: Unknown; Unspecified; Blind; Glasses for reading; Partially sighted; No reported difficulties |
| [Handedness] | The handedness of the primary actor. | The following open vocabulary is suggested: Unknown; Unspecified; Ambidextrous; Left; Right |
| [Sign language experience] | An open description of the actor's exposure to sign language. | Give dates in the form years;months, or birth. |
| Education (sub) | | |
| [Education leaving age] (c) | The age at which the (adult) actor left school. | Controlled |
| [School type] | The type of school the primary actor attends/attended. | The following open vocabulary is suggested: Bilingual (speech-sign) home programme; College; Home schooling; Preschool/nursery; Primary school; Secondary school; Special school; University; Vocational training |
| [Class kind] | The type of class the primary actor attends/attended. | The following open vocabulary is suggested: Class in mainstream school; Class in special school; Individually integrated in mainstream class; Mainstream class |
| [Education model] | The education model employed in the class. | The following open vocabulary is suggested: Bilingual (spoken); Bilingual/bimodal (speech and sign); Oral with sign language interpreter; Oral/natural language; Sign only |
| [Boarding school] (boolean) | Was/is the school a boarding school? | { yes \| no } |
| Secondary actor(s) notes | Any notes on secondary actors: their ID, roles etc. | Free text. It is not appropriate to name any secondary actors. Please use pseudonyms or identifiers. |
| ACCESS + | | |
| Rights (c) | The tier of access to which this session belongs. | Controlled |
| Rights evaluation date (c) | The date of access rights evaluation. YYYY-MM-DD. | Controlled |
| Owner | The owner of the resource. May be the same as Project . Contact . Name, or may be an institution. | Free text. |

ENCODING SCHEMES

The following encoding schemes explain how elements which conform to particular external standards should be completed. Please follow the links provided to see full details of each scheme.

| ELEMENT | ENCODING |
| --- | --- |
| Identifier | The identifier of each recording is controlled according to the owner's own encoding. This must correspond with the name of the file as deposited. |
| Date (c) | Dates are encoded in YYYY-MM or YYYY-MM-DD format, according to a profile of [ISO8601](http://en.wikipedia.org/wiki/ISO_8601) as described in [W3CDTF](http://www.w3.org/TR/NOTE-datetime). |
| Original format (c) | If the format is analogue, please name it in free text, for example VHS or Audio cassette. If the file is born digital, give the file extension or FourCC code, for example AVI, WAV, MPEG-1 etc. These are encoded by [Filext](http://filext.com/alphalist.php?extstart=%5eA). |
| Country | The country is encoded according to [ISO3166-1](http://en.wikipedia.org/wiki/ISO_3166-1) 2- or 3-letter codes, or in the longhand specified by the ISO code. |
| Number of languages (c) | An integer. |
| Spoken language ID (c) | Spoken language ID can be encoded according to either of the following two schemes: [ISO639-1](http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes), which specifies the code set for language identification in the form of a two-letter code, or [ISO639-2](http://en.wikipedia.org/wiki/List_of_ISO_639-2_codes), which specifies the code set for language identification in the form of a three-letter code. The three-letter codes from the [ETHNOLOGUE](http://www.ethnologue.com/language_code_index.asp) list from SIL International are allowed by using the prefix 'x-sil-' for the three-letter code (see [LANGID](http://www.sil.org/silewp/2000/001/) for more information). For example, one could enter the language identifier 'x-sil-dut' to indicate the Dutch language. If a language used does not appear on these lists, please name it in the Language variety field. |
| Sign language ID (c) | Sign language ID is encoded according to [ISO639-2](http://en.wikipedia.org/wiki/List_of_ISO_639-2_codes), which specifies the code set for language identification in the form of a three-letter code. See [SIGNWRITING](http://www.signwriting.org/archive/docs1/sw0033-Sign-Language-Codes.pdf) for a mapping of signed languages to the ISO standard. |
| Age (c) | Age is encoded as years;months, as specified by Codes for the Human Analysis of Transcripts [AGECHAT](http://www.mpi.nl/IMDI/documents/Proposals/IMDI_MetaData_3.0.4.pdf). |
| Age band (c) | The searchable age bands are as follows: 0-4; 5-10; 11-16; 16-19; 20-40; 41-65; 65+ |
| [Education leaving age] (c) | Age is encoded as years;months, as specified by Codes for the Human Analysis of Transcripts [AGECHAT](http://www.mpi.nl/IMDI/documents/Proposals/IMDI_MetaData_3.0.4.pdf). |
| Rights (c) | Leave blank. |
| Rights evaluation date (c) | The date is encoded according to a profile of [ISO8601](http://en.wikipedia.org/wiki/ISO_8601) as described in [W3CDTF](http://www.w3.org/TR/NOTE-datetime) and follows the YYYY-MM format. |
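Depositors who prepare many rows may find it useful to check the encoded fields before sending the form. The following is a minimal sketch only, not a CAVA tool: the function names are hypothetical, the limit of 0-11 on the months part of an age is an assumption, and "circa" dates are not handled. It checks the YYYY-MM or YYYY-MM-DD date format, splits a years;months age, and separates a comma-delimited free-text cell into individual terms.

```python
from __future__ import annotations

import re
from datetime import date

# Hypothetical helpers for illustration; none of these names come from CAVA.
DATE_RE = re.compile(r"^(\d{4})-(\d{2})(?:-(\d{2}))?$")
AGE_RE = re.compile(r"^(\d{1,3});(\d{1,2})$")


def is_valid_date(value: str) -> bool:
    """Return True for YYYY-MM or YYYY-MM-DD strings naming a real date."""
    match = DATE_RE.match(value.strip())
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # A missing day is checked as the first of the month.
        date(int(year), int(month), int(day) if day else 1)
    except ValueError:
        return False
    return True


def parse_age(value: str) -> tuple[int, int]:
    """Split a years;months age such as '7;3' into (years, months)."""
    match = AGE_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not in years;months form: {value!r}")
    years, months = int(match.group(1)), int(match.group(2))
    if months > 11:  # assumption: months run from 0 to 11
        raise ValueError(f"months must be 0-11: {value!r}")
    return years, months


def split_values(cell: str) -> list[str]:
    """Split a comma-separated cell into trimmed, non-empty terms."""
    return [term.strip() for term in cell.split(",") if term.strip()]


print(is_valid_date("2003-12"))                      # True
print(is_valid_date("2003-13"))                      # False
print(parse_age("7;3"))                              # (7, 3)
print(split_values("Speech, Eye gaze, Enactment"))   # ['Speech', 'Eye gaze', 'Enactment']
```

Note that splitting on commas suits list-style fields such as Communication modes; descriptive fields such as Language variety may contain commas within a single entry and are better left unsplit.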