Tags and Tagging
Current state of tagging
See Examples of Current Metadatas.
New tags
Asset Type
Type tag - what kind of data is represented on this page. Types: event, news, faculty, school.
There should be a type for every different kind of page there may be. This is partially useful for filtering search results (although this may be better utilised by restricting the results by url matching). However, I see this as being more useful for formatting search results.
For instance, if we know that a result is for a faculty, we could apply its styling to the heading. If it is for an event, we could include the start/end date in the search result, etc.
- Types: VIC.event, VIC.news, VIC.faculty, VIC.school, VIC.course, VIC.general (default)
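As a rough illustration of how a type tag could drive result formatting, here is a minimal Python sketch. The tag name `VIC.type`, the `result-…` CSS class names and the `format_result` function are all assumptions made for illustration; only the type values and the `v:startDate` / `v:endDate` tags come from this page.

```python
# Type values proposed on this page; anything else falls back to the default.
KNOWN_TYPES = {
    "VIC.event", "VIC.news", "VIC.faculty",
    "VIC.school", "VIC.course", "VIC.general",
}


def format_result(title, meta):
    """Return (css_class, heading) for one search result.

    `meta` is a dict of metadata tag name -> value. The tag name
    "VIC.type" and the "result-..." class names are hypothetical.
    """
    page_type = meta.get("VIC.type", "VIC.general")
    if page_type not in KNOWN_TYPES:
        page_type = "VIC.general"
    css_class = "result-" + page_type.split(".", 1)[1]

    heading = title
    if page_type == "VIC.event":
        start = meta.get("v:startDate")
        end = meta.get("v:endDate")
        # Only decorate the heading when both dates are present.
        if start and end:
            heading = "{} ({} to {})".format(title, start, end)
    return css_class, heading
```

Pages without a type tag, or with an unrecognised one, are treated as VIC.general, matching the default suggested above.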
New tags for specific assets
Note: Topic and Keywords metadata are expected to be carried across to all asset types.
News
Title | Use existing DC.title |
Date created | Use existing DCTERMS.issued |
Description | Use existing description - TODO: @andrew to look into getting this renamed or remapped to VIC.description |
News Category | E.g. whether it is a media release etc. Use existing VIC.NewsCategory |
Audience | Whether it is for public or staff. Use existing VIC.Audience. TODO @andrew to investigate whether VIC.Restrictions would be better |
Author | Person who created the news. There is already a DC.creator, however this seems to be nearly always "Victoria University of Wellington" |
Also, would be worth implementing Google Rich Snippets for Articles
Events
Title | Currently there is either twitter:title or v:summary which seem to hold the title. Would be better to get this as VIC.title |
StartDate | use existing v:startDate |
EndDate | use existing v:endDate |
Event Type | use existing v:eventType |
Description | Currently there is description, twitter:description or v:description. Would be better to get this as VIC.description |
Location | use existing v:location |
Could VIC.EventOrganiser be replaced with a Faculty/School/Organisation metadata tag?
Also, not quite metadata, but we should look at implementing Google rich snippets for events
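To give a feel for what event rich snippets might involve, the sketch below maps the event metadata listed above onto a schema.org `Event` object that could be emitted as JSON-LD. The function name `event_json_ld`, the fallback order between tags and the sample values are assumptions for illustration, not an agreed design.

```python
import json


def event_json_ld(meta):
    """Build a schema.org Event dict from a page's metadata dict.

    Falls back through the title/description tags mentioned in the
    Events table; the fallback order is an assumption.
    """
    title = (meta.get("VIC.title")
             or meta.get("twitter:title")
             or meta.get("v:summary"))
    description = (meta.get("VIC.description")
                   or meta.get("v:description")
                   or meta.get("description"))
    event = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": title,
        "startDate": meta.get("v:startDate"),
        "endDate": meta.get("v:endDate"),
        "description": description,
    }
    if meta.get("v:location"):
        event["location"] = {"@type": "Place", "name": meta["v:location"]}
    # Drop properties we have no value for.
    return {key: value for key, value in event.items() if value}


# Made-up sample values, purely for illustration.
sample = {"v:summary": "Open Day", "v:startDate": "2015-08-28"}
print(json.dumps(event_json_ld(sample), indent=2))
```

The printed JSON would go inside a `<script type="application/ld+json">` element on the event page.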
Staff
Most metadata we need is already in place. We can't really do any more on this until we know what solutions are being formed for new Staff Profiles hosting.
Course
Beautiful Metadata already in place. Can't think of any issues apart from:
- Can we merge the existing course 'school' and 'faculty' with the new school/faculty/organisation metatags?
- They currently have no prefix, would it be worthwhile to change them to VIC.description, VIC.year etc?
Faculty/School/Organisation
Title | Self Explanatory |
Description | Self Explanatory |
Color | Colored style associated with the school? |
Image | Self Explanatory |
Main Contact Details | Self Explanatory |
Address | Self Explanatory |
Keywords proposal
Keywords currently do not have a great solution. Ideally we wanted a system whereby the user could select from existing keywords, while still maintaining the freedom to create new keywords where necessary. This seems to be outside of the abilities currently provided by Squiz. Here are some other potential solutions:
1: Have the keyword field be a free text box
When entering metadata, the user has a free text box where they can enter any text they like as their keywords.
This is not ideal for a number of reasons, to name a few: it doesn't help with the issue of typos; people will be far more likely to miss keywords that are already in use, causing greater dispersal of content; and if we want to allow multiple keywords, we will need to provide training on how to properly escape keyword values.
It will, however, be the easiest to implement: simply a case of adding a free text box to the metadata schema and some form of JavaScript interpretation on results.
2: Have a set list of keywords for people to select from
This is the opposite of the previous solution: we have a curated set of keywords people can select from. If they require a keyword that does not already exist, they will need to get it approved for addition to the list.
This has the benefit of providing greater control over keyword use, and there won't be any issues with typos etc. However, it is very restricting for end users, who may have content that does not fit any existing keyword; this could in turn frustrate users and lead them to not tag their content correctly.
3: A combination of 1 and 2
Provide a set of curated keywords, but also provide a free text box for the user to add in new keywords as they see fit. The page will then have two metadata values: curatedKeywords and freeKeywords. This will be more confusing for the end user (two methods of providing keywords), but may provide the best functionality and flexibility within our platform constraints. An issue could arise if the list of curated keywords is too large: users will simply ignore it and only use the free text box, in which case we are no better off than with solution 1.
On a side note, we may be able to call a trigger within Squiz which will combine the two sets of keywords into one metadata field instead. We could also use this to keep track of new free-text keywords, making it easier to analyse their usage and consider adding them to the curated keyword list.
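The combining step for solution 3 could look roughly like the Python sketch below. It is not Squiz trigger code; the function name `merge_keywords`, the comma-separated free text format and the case-insensitive matching are all assumptions.

```python
def merge_keywords(curated_selected, free_text, curated_list):
    """Combine curatedKeywords and freeKeywords into one list.

    Returns (merged, new_keywords), where new_keywords are the
    free-text entries that are not in the curated list yet.
    Assumes free text is comma-separated and compares keywords
    case-insensitively.
    """
    known = {keyword.lower() for keyword in curated_list}
    free = [keyword.strip() for keyword in free_text.split(",")]

    merged, new_keywords, seen = [], [], set()
    for keyword in list(curated_selected) + free:
        key = keyword.lower()
        if not keyword or key in seen:
            continue  # skip blanks and duplicates
        seen.add(key)
        merged.append(keyword)
        if key not in known:
            new_keywords.append(keyword)
    return merged, new_keywords
```

The second return value is the list that could be logged to track which free-text keywords are candidates for the curated list.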
4: Squiz Javascript Plugin
Using the JavaScript plugin functionality within Squiz, we create a page/area dedicated purely to tags. Because it is JavaScript, we should hopefully be able to create and control everything we need in here. If this is possible then we can really do whatever we would like with it, possibly even providing a free text box which lets the user start typing keywords and have suggestions provided dynamically as they type, as well as allowing new keywords. These would then be converted into properly machine-readable metadata tags and saved to the asset.
This would be the most complicated solution, with the greatest development time; however, it could potentially give us what we require from tagging. We are also faced with the issue of the upgrade to Squiz 5. Because of this we will need to either develop the code purely for the new Edit+ suite, or develop it for the Easy Edit Suite but ensure it retains future compatibility. Nathan has said that it may be possible to get Edit+ installed on a server separate from the live Squiz install so we could progress this work without having to wait until the upgrade has taken place.
Topics Current State of Development
The initial workflow for selecting Topics was:
- The user selects the Super Topic - eg, accounting
- The user selects the Sub Super Topic (if there is one) from a list which has been filtered by the Super Topic - eg, accounting
- Finally the user selects the Topic from a list which has been filtered to only those values available within the Super Topic and the Sub Super Topic - eg, accounting
Unfortunately, this sort of functionality doesn't seem to be supported in Squiz. There doesn't seem to be a clear way to filter metadata options based on what was selected previously. However there is a workaround which could do until we are able to migrate tagging to a different system:
Instead of 3 lists with selectable Topic categories, there is only one sorted list of all the Topic combinations a person could select. It looks something like this:
Architecture>Architecture>Architecture
Architecture>Architecture>Architecture history and theory
Architecture>Architecture>Interior architecture
Architecture>Architecture>Landscape architecture
Architecture>Construction>Building science
Architecture>Construction>Project management (building)
Architecture>Construction>Sustainable engineering systems
This unfortunately means this list will be quite long, however it doesn't seem unmanageably so, and also allows the selection of multiple topics (if needed).
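For illustration, the single sorted list can be generated from (Super Topic, Sub Super Topic, Topic) rows along these lines. This is a Python sketch only; the function name `build_topic_options` and the row format are assumptions, and it is not the process actually used for the import.

```python
def build_topic_options(rows):
    """Turn (super, sub_super, topic) rows into sorted 'A>B>C' options."""
    options = {">".join(part.strip() for part in row) for row in rows}
    return sorted(options)


rows = [
    ("Architecture", "Construction", "Building science"),
    ("Architecture", "Architecture", "Interior architecture"),
    ("Architecture", "Architecture", "Architecture"),
]
for option in build_topic_options(rows):
    print(option)
```

Using a set before sorting also removes any duplicate rows from the source data.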
todo
As I was implementing this, I forgot that we need to be able to select Super Topics or Sub Super Topics by themselves (without a regular Topic, that is). This is not difficult to implement; I just need to remember it the next time I import the list of Topics.
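One possible way to cover this at import time is to expand every full path into its parent levels as well, so that a Super Topic or Sub Super Topic becomes a selectable entry on its own. A minimal Python sketch, with the hypothetical function name `with_parent_levels`:

```python
def with_parent_levels(paths):
    """Add every ancestor of each 'Super>Sub>Topic' path to the list."""
    expanded = set()
    for path in paths:
        parts = path.split(">")
        # Add "Super", then "Super>Sub", then the full path.
        for depth in range(1, len(parts) + 1):
            expanded.add(">".join(parts[:depth]))
    return sorted(expanded)
```

For example, `Architecture>Construction>Building science` would also yield `Architecture` and `Architecture>Construction` as standalone options.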
todo
Create documentation listing the terminal commands I used to convert the Excel file into the correct format to be uploaded to Squiz. This could be a useful brief on how to use sed, sort, egrep and paste (paste is not to be confused with pbpaste, which actually works more like the copy and paste command).
Faculties, Schools and Organisations
Similar to the Topics, have a multi-select list of all Faculties, Schools and Organisations which can be attributed to the asset:
Faculty of Architecture and Design
Faculty of Architecture and Design>School of Architecture
Faculty of Architecture and Design>School of Design
Faculty of Humanities and Social Sciences
Faculty of Humanities and Social Sciences>School of Art History, Classics and Religious Studies
Victoria University Library
Faculty of Humanities and Social Sciences
International Institute of Modern Letters
Early Childhood Services
Weir House
etc.
Formatting and Gotchas
URL construction for metadata filtering:
You can filter results by metadata using the requiredfields and partialfields URL parameters.
Examples:
If you want only items that have the metadata tag DC.publisher use:
&requiredfields=DC%252Epublisher
If you want only items where the DC.publisher = "Victoria University of Wellington" use:
&requiredfields=DC%252Epublisher:Victoria%2520University%2520of%2520Wellington
Make sure you double percent encode not only the name but also the value of the metatag. For example, . becomes %252E and : becomes %253A. See here for the rest of the codes in a handy format.
NOTE: If you use either requiredfields or partialfields then the q value is optional. E.g.:
curl -X GET 'http://search.victoria.ac.nz/search?client=new_homesite_frontend&proxystylesheet=json_frontend&output=xml_NO_DTD&filter=p&getfields=%2A&start=0&wc=0&wc_mc=0&num=100&site=global_search_collection&q=accy+111&requiredfields=DC%252Epublisher:Victoria%2520University%2520of%2520Wellington'
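The double encoding for requiredfields can be scripted rather than done by hand. The Python sketch below is one way to do it; the function names are hypothetical. Note that Python's `urllib.parse.quote` leaves `.` unencoded, so the sketch encodes every non-alphanumeric byte itself to match the `%252E` form shown above.

```python
def double_encode(text):
    """Percent-encode every non-alphanumeric byte, then encode the '%'."""
    single = "".join(
        chr(byte) if chr(byte).isascii() and chr(byte).isalnum()
        else "%{:02X}".format(byte)
        for byte in text.encode("utf-8")
    )
    # Second pass: each '%' from the first pass becomes '%25'.
    return single.replace("%", "%25")


def required_fields_param(name, value=None):
    """Build the &requiredfields=... fragment for one metadata tag."""
    param = "&requiredfields=" + double_encode(name)
    if value is not None:
        # The ':' between name and value is left unencoded,
        # as in the examples on this page.
        param += ":" + double_encode(value)
    return param


print(required_fields_param("DC.publisher", "Victoria University of Wellington"))
```

The printed fragment can be appended to the search URL in place of the hand-encoded version.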
inmeta
If using inmeta inside the q value, you need to be careful about your URL escaping:
&q=inmeta:[double escaped meta tag name][single escaped =][double escaped meta tag value]
e.g. Searching for "Victoria University of Wellington" in the "DC.publisher" metadata tag
&q=inmeta:DC%252Epublisher%3DVictoria%2520University%2520of%2520Wellington
E.g.:
curl -v -X GET 'http://search.victoria.ac.nz/search?client=new_homesite_frontend&proxystylesheet=json_frontend&output=xml_NO_DTD&filter=p&getfields=%2A&start=0&wc=0&wc_mc=0&num=10&site=global_search_collection&q=inmeta:DC%252Epublisher%3DVictoria%2520University%2520of%2520Wellington'
NOTE: Doing a search like this seems to take a long time (32 seconds at one count) so this will probably not be ideal for wide use.
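The inmeta escaping pattern above can be captured in a small helper so the mix of single and double escaping is not done by hand. A Python sketch, with the hypothetical function name `inmeta_query`; the separator is the single-escaped `=` (`%3D`) used in the example above.

```python
def double_encode(text):
    """Percent-encode every non-alphanumeric byte, then encode the '%'."""
    single = "".join(
        chr(byte) if chr(byte).isascii() and chr(byte).isalnum()
        else "%{:02X}".format(byte)
        for byte in text.encode("utf-8")
    )
    return single.replace("%", "%25")


def inmeta_query(name, value):
    """Build the &q=inmeta:... fragment for one metadata tag and value."""
    # Name and value are double escaped; the '=' between them is
    # escaped only once (%3D).
    return "&q=inmeta:" + double_encode(name) + "%3D" + double_encode(value)


print(inmeta_query("DC.publisher", "Victoria University of Wellington"))
```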
Issues
GSA doesn't seem to be indexing some MT tags
E.g.: victoria.ac.nz/study/course-career/career-options
This is the metadata that is returned from a GSA search:
"DC_creator": "Victoria University of Wellington",
"DC_publisher": "Victoria University of Wellington",
"VIC_Audience": "public",
"VIC_keyImage": "",
"description": "Work backwards to find the right degree to get you the career you want, or find out what careers different degrees could lead to.",
"twitter:card": "summary",
"twitter:description": "Work backwards to find the right degree to get you the career you want, or find out what careers different degrees could lead to.",
"twitter:site:id": "218343330",
"twitter:title": "Exploring career options",
"viewport": "width=device-width, initial-scale=1, maximum-scale=1",
Whereas the metadata displayed on the site front end is
"DC.creator": "Victoria University of Wellington",
"DC.publisher": "Victoria University of Wellington",
"VIC.Audience": "public",
"VIC.keyImage": "",
> "article:modified_time": "2015-07-31T09:07:31+12:00",
> "article:published_time": "2015-08-03T14:36:28+12:00",
> "article:publisher": "164979016849070",
> "article:section": "Future Students",
"description": "Work backwards to find the right degree to get you the career you want, or find out what careers different degrees could lead to.",
> "fb:profile_id": "164979016849070",
> "og:description": "Work backwards to find the right degree to get you the career you want, or find out what careers different degrees could lead to.",
> "og:image": "http://www.victoria.ac.nz/__data/assets/image/0003/198246/social_media_default.png",
> "og:site_name": "Victoria University of Wellington",
> "og:title": "Exploring career options",
> "og:type": "article",
> "og:url": "http://www.victoria.ac.nz/study/course-career/career-options",
"twitter:card": "summary",
"twitter:description": "Work backwards to find the right degree to get you the career you want, or find out what careers different degrees could lead to.",
"twitter:site:id": "218343330",
"twitter:title": "Exploring career options",
"viewport": "width=device-width, initial-scale=1, maximum-scale=1",
Missing metadata is highlighted with >.
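This comparison can be automated for other pages. The Python sketch below lists the front end tags that do not appear in the GSA result. It assumes, based only on the sample above, that the GSA returns names with `.` replaced by `_` (e.g. DC.creator as DC_creator); the function name `missing_from_gsa` is hypothetical.

```python
def missing_from_gsa(front_end_meta, gsa_meta):
    """Return front end tag names that the GSA result does not contain.

    Assumes the GSA reports '.' in tag names as '_', as seen in the
    sample output on this page.
    """
    indexed = {name.replace(".", "_") for name in gsa_meta}
    return sorted(
        name for name in front_end_meta
        if name.replace(".", "_") not in indexed
    )


front_end = {
    "DC.creator": "Victoria University of Wellington",
    "og:title": "Exploring career options",
    "twitter:title": "Exploring career options",
}
gsa = {
    "DC_creator": "Victoria University of Wellington",
    "twitter:title": "Exploring career options",
}
print(missing_from_gsa(front_end, gsa))
```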
Looking at the source code, this seems to be because these tags are labelled as meta property instead of meta name:
<meta property="fb:profile_id" content="164979016849070" />
Whereas the other tags are labelled like this:
<meta name="twitter:title" content="Exploring career options" />
The question is why it has been done this way. Should it be changed in Squiz to name, or should/can the GSA be configured to index property as well as name metadata?
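To check how widespread this is on other pages, the meta tags in a page's HTML can be split into those using name and those using property. A minimal Python sketch using the standard library `html.parser`; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaTagAudit(HTMLParser):
    """Collect meta tag identifiers, split by name vs property."""

    def __init__(self):
        super().__init__()
        self.by_name = []
        self.by_property = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "name" in attributes:
            self.by_name.append(attributes["name"])
        elif "property" in attributes:
            self.by_property.append(attributes["property"])


def audit_meta_tags(html):
    """Return (names, properties) found in the given HTML string."""
    parser = MetaTagAudit()
    parser.feed(html)
    parser.close()
    return parser.by_name, parser.by_property


html = (
    '<meta property="fb:profile_id" content="164979016849070" />'
    '<meta name="twitter:title" content="Exploring career options" />'
)
print(audit_meta_tags(html))
```

Tags that land in the second list are the ones that, on the evidence above, would not show up in GSA results.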