Magic Site Integration is a THRON extension that shortens activation time by automatically importing URL content from existing website pages, so that strategic data is collected from day one. Add a single line of code to the head of any website and THRON will automatically create one content item for each page, analyzing its text and applying semantic tags to it.
To activate the application via “Extensions and Connectors” in your Dashboard, you will have to provide the following information:
- Name for the application: The name to be assigned to this integration in order to identify it quickly within lists. We recommend that you use the name of the linked website.
- Application prettyID: An identifier to be assigned to the application for API usage.
- Folder: The folder in which the "Magic Site Integration" application will publish generated content. We recommend that you select a “temporary” folder so that the Content Intelligence Manager can check new URL content and tag it properly before moving it into other folders.
- Domain: The reference domain of the website to be linked (e.g. www.mydomain.org). Do not include the protocol (http or https) or any query parameters.
- Content Owner: Define the user to whom the application will assign ownership of generated content. Only one user can be selected.
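As an illustration of the Domain field format described above, here is a minimal sketch that reduces a full page address to a bare host name. The function name `to_domain_value` is hypothetical and is not part of THRON; it only shows what "no protocol, no query parameters" means in practice.

```python
from urllib.parse import urlsplit


def to_domain_value(url: str) -> str:
    """Reduce a URL to a bare host name: no protocol, path, or query."""
    candidate = url.strip()
    # urlsplit only recognizes the host when a scheme separator is present,
    # so prepend "//" for inputs such as "www.mydomain.org/page".
    if "://" not in candidate:
        candidate = "//" + candidate
    host = urlsplit(candidate).hostname
    if host is None:
        raise ValueError(f"no host found in {url!r}")
    return host
```

For example, both `https://www.mydomain.org/page?id=1` and `www.mydomain.org` would yield `www.mydomain.org`.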
At the end of the setup process, you will be given a code snippet to copy and paste into the <head> section of the linked website.
From that moment on, "Magic Site Integration" will start scanning the website, and will automatically generate URL content for each page as soon as it receives a visit.
Available parameters for the Magic Site Configuration are:
- Use OG tags for content enrichment: tells the scraper to use OG tags (Open Graph Protocol) where possible.
- Use metatag keywords: tells the scraper to try to retrieve keywords from specific HTML tags within the page.
- Analyze dynamic language pages in: tells the scraper which language to prefer. A default value will be provided if the "lang" attribute cannot be retrieved from the HTML.
- Scan pages using the following User Agent: to set a custom User Agent to be used by the scraper when scanning the website.
One of the pillars of Magic Site Integration is retrieving the URL on which the content is created and which is used for tracking events.
There are three different ways to retrieve such information:
Unless manual mode is set, Magic Site Integration will first try to retrieve the URL from the "canonical" tag, even if the “Use OG tags” flag is enabled and the page contains an og:url tag:
<link rel="canonical" href="http://www.example.com/somePath"/>
OG url tag
If the "Use OG tags for content enrichment" flag is active, Magic Site Integration will try to retrieve the URL from the "content" field of the "og:url" tag:
<meta property="og:url" content="http://www.someUrl.com/ToBe/tracked" />
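The precedence between the canonical tag and the og:url tag described above can be sketched as follows. This is only an illustration of the documented order, not THRON's actual scraper: the names `resolve_page_url` and `_UrlTagCollector` are hypothetical, and the "Website type" based retrieval is not modeled (the sketch returns `None` at that point).

```python
from html.parser import HTMLParser
from typing import Optional


class _UrlTagCollector(HTMLParser):
    """Records the first canonical link and the first og:url meta tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None
        self.og_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = self.canonical or attributes.get("href")
        elif tag == "meta" and attributes.get("property") == "og:url":
            self.og_url = self.og_url or attributes.get("content")


def resolve_page_url(html: str, use_og_tags: bool) -> Optional[str]:
    """Return the canonical URL, else og:url (if the flag is on), else None."""
    collector = _UrlTagCollector()
    collector.feed(html)
    if collector.canonical:
        # The canonical tag wins even when og:url is present.
        return collector.canonical
    if use_og_tags and collector.og_url:
        return collector.og_url
    # Neither tag is usable: the caller would fall back to the
    # "Website type" based retrieval, which this sketch does not model.
    return None
```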
If this tag is unavailable, Magic Site Integration will try to dynamically retrieve the url based on the "Website type" parameter which is available in the application's management:
Blacklisted query params are:
An additional option is "Manual". In this mode you have to manually select the pages of the website to be imported by invoking the following track method:
Magic Site Integration scrapes the whole source code of each page of the website, looking for relevant information with which to enrich URL content. For better and faster retrieval of this information, we recommend that you make sure the following information is included in the source code of each page you want to import.
First of all, make sure that the Semantic Engine is enabled, otherwise Magic Site won't be able to extract tags from your web pages.
If the “Use metatag keywords” flag is on, content tags will be extracted from the HTML.
<meta name="keywords" content="tag1 tag2 tag3 tag4">
If the "Use OG tags for content enrichment" flag is also on, content tags will be extracted from the following OG tags: “video:tag”, “article:tag” or “book:tag”; otherwise they will be extracted from the "keywords" meta tag, as in the example above.
<meta property="og:video:tag" content="tag1 tag2 tag3 tag4" />
<meta property="og:article:tag" content="tag1 tag2 tag3 tag4" />
<meta property="og:book:tag" content="tag1 tag2 tag3 tag4" />
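The interplay of the two flags for tag extraction can be sketched as below. The names `extract_tags` and `_TagCollector` are hypothetical. Splitting the "content" value on whitespace is an assumption based on the examples above, and the sketch does not model any fallback to keywords when the OG flag is on but no OG tag is present, since that case is not described here.

```python
from html.parser import HTMLParser
from typing import List

# Property names as written in the examples above.
OG_TAG_PROPERTIES = {"og:video:tag", "og:article:tag", "og:book:tag"}


class _TagCollector(HTMLParser):
    """Collects tag values from keywords and OG tag meta elements."""

    def __init__(self) -> None:
        super().__init__()
        self.keywords: List[str] = []
        self.og_tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumption: tags are whitespace-separated, as in the examples.
        values = (attributes.get("content") or "").split()
        if (attributes.get("name") or "").lower() == "keywords":
            self.keywords.extend(values)
        elif attributes.get("property") in OG_TAG_PROPERTIES:
            self.og_tags.extend(values)


def extract_tags(html: str, use_metatag_keywords: bool, use_og_tags: bool) -> List[str]:
    """Pick the tag source according to the two configuration flags."""
    if not use_metatag_keywords:
        return []
    collector = _TagCollector()
    collector.feed(html)
    return collector.og_tags if use_og_tags else collector.keywords
```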
The title will be extracted from the og:title tag, if it exists:
<meta property="og:title" content="Page Title" />
If that tag is not present, the title will be extracted from the HTML <title> tag.
If neither of those tags is present, the title will be created using the actual page address.
If the flag "Use OG tags for content enrichment" is enabled, content description will be extracted from the “og:description” tag (if present):
<meta property="og:description" content="Page description" />
If that tag is not present, the content description will be extracted from the meta description tag:
<meta name="description" content="Page description" />
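The fallback chains for title and description described above can be sketched as follows. The names `title_and_description` and `_HeadCollector` are hypothetical, and this is an illustration rather than THRON's implementation. Note that the text above does not tie og:title to the "Use OG tags for content enrichment" flag, so the sketch does not either.

```python
from html.parser import HTMLParser
from typing import Dict, Optional, Tuple


class _HeadCollector(HTMLParser):
    """Collects meta tags (keyed by property or name) and the <title> text."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, str] = {}
        self.title: Optional[str] = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            key = attributes.get("property") or attributes.get("name")
            content = attributes.get("content")
            if key and content:
                self.meta.setdefault(key, content)  # keep the first occurrence
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.title = (self.title or "") + data.strip()


def title_and_description(
    html: str, page_url: str, use_og_tags: bool
) -> Tuple[str, Optional[str]]:
    """Apply the documented fallbacks for title and description."""
    head = _HeadCollector()
    head.feed(html)
    # Title: og:title, then <title>, then the page address.
    title = head.meta.get("og:title") or head.title or page_url
    # Description: og:description only when the flag is on, then meta description.
    description = head.meta.get("og:description") if use_og_tags else None
    if not description:
        description = head.meta.get("description")
    return title, description
```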
If the flag "Use OG tags for content enrichment" is enabled, the content thumbnail will be extracted from the “og:image” tag.
<meta property="og:image" content="http://someimageurl.com/imageName.jpg" />
If the flag is not enabled or the og:image tag is not present, the thumbnail will be generated from a page screenshot.
Language will be retrieved from the "lang" attribute of the html tag:
<html lang="en">
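A minimal sketch of the language lookup, assuming that the value configured under "Analyze dynamic language pages in" acts as the default when no "lang" attribute is found. That link between the parameter and the default is an assumption, and the names `detect_language` and `_LangCollector` are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class _LangCollector(HTMLParser):
    """Records the lang attribute of the first <html> tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.lang: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang") or None


def detect_language(html: str, default_language: str) -> str:
    """Return the page's lang attribute, or the configured default."""
    collector = _LangCollector()
    collector.feed(html)
    return collector.lang or default_language
```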