Magic Site Integration

Introduction

Magic Site Integration is a THRON extension that shortens activation time by automatically importing URL content from existing website pages, so that you can collect strategic data from day one. Add a single line of code to the <head> of any website and THRON will automatically create one piece of content for each page, analyze its text and apply semantic tags to it.

 

Activation

To activate the application from “Extensions and Connectors” in your Dashboard, provide the following information:

  • Name for the application: The name to be assigned to this integration in order to identify it quickly within lists. We recommend that you use the name of the linked website.
  • Application prettyID: An identifier to be assigned to the application for API usage.
  • Folder: The folder in which the "Magic Site Integration" application will publish generated content. We recommend that you select a “temporary” folder so that the Content Intelligence Manager can check new URL content and tag it properly before moving it into other folders.
  • Domain: The reference domain of the website to be linked (e.g. www.mydomain.org). Do not include protocol (http or https) or any query parameter.
  • Content Owner: Define the user to whom the application will assign ownership of generated content. Only one user can be selected.
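
As an illustration of the format expected by the "Domain" field, the sketch below reduces a pasted address to its bare host name. The helper name toDomainField is hypothetical and is not part of THRON; it only shows what "no protocol and no query parameters" means in practice.

```javascript
// Hypothetical helper (not part of THRON): reduces a pasted address to the
// bare host name expected by the "Domain" field.
function toDomainField(input) {
  const trimmed = input.trim();
  // The URL parser needs a scheme, so add one when the input lacks it.
  const withScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed)
    ? trimmed
    : `https://${trimmed}`;
  // hostname drops the protocol, port, path, query string and fragment.
  return new URL(withScheme).hostname;
}

console.log(toDomainField("https://www.mydomain.org/shop?utm_source=mail"));
// prints "www.mydomain.org"
```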

 

At the end of the setup process, a code snippet will be provided to be copied and pasted within the <head> section of the linked website.

From that moment on, "Magic Site Integration" will start scanning the website, and will automatically generate URL content for each page as soon as it receives a visit.

 

Configuration

The following parameters are available in the Magic Site Integration configuration:

  • Use og tags for content enrichment: tells the scraper to use OG tags (Open Graph Protocol) where possible.
  • Use metatag keywords: tells the scraper to try to retrieve keywords from specific HTML tags within the page.
  • Analyze dynamic language pages in: tells the scraper which language to prefer. A default value is used if the "lang" attribute cannot be retrieved from the HTML.
  • Scan pages using the following User Agent: sets a custom User Agent to be used by the scraper when scanning the website.

 

Functioning

A core step in Magic Site Integration is retrieving the URL for which the content has to be created and which has to be used for tracking events.

There are three different ways to retrieve this information:

 

Canonical

Unless manual mode is set, Magic Site Integration will first try to retrieve the URL from the "canonical" tag, even if the “Use OG tags” flag is enabled and the page contains an og:url tag:

<link rel="canonical" href="http://www.example.com/somePath"/>

 

OG url tag

If the "Use OG tags for content enrichment" flag is active, Magic Site Integration will try to retrieve the url from the "content" field of the OG tag "url":

<meta property="og:url" content="http://www.someUrl.com/ToBe/tracked" />
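
The precedence between the canonical tag and the og:url tag can be sketched as follows. This is only an illustration of the order described in this article, not THRON's actual code: the function name pickTrackedUrl and the shape of the page and settings objects are assumptions, and manual mode is not modeled.

```javascript
// Sketch of the lookup order described in this article; not THRON's code.
// `page` is a hypothetical object holding values already read from the page,
// `settings.useOgTags` mirrors the "Use OG tags for content enrichment" flag.
function pickTrackedUrl(page, settings) {
  // 1. The canonical link wins, even when og:url is present and enabled.
  if (page.canonical) {
    return { source: "canonical", url: page.canonical };
  }
  // 2. og:url is considered only when the OG flag is active.
  if (settings.useOgTags && page.ogUrl) {
    return { source: "og:url", url: page.ogUrl };
  }
  // 3. Otherwise the URL is derived from the "Website type" setting.
  return { source: "website type", url: null };
}

console.log(
  pickTrackedUrl(
    { canonical: "http://www.example.com/somePath", ogUrl: "http://www.someUrl.com/ToBe/tracked" },
    { useOgTags: true }
  ).source
);
```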

 

Website type

If neither tag is available, Magic Site Integration will try to dynamically retrieve the URL based on the "Website type" parameter, which is available in the application's management:

  1. Static: the url from which the javascript is invoked is retrieved, and all query params and anchors are removed.
  2. Dynamic: the url from which the javascript is invoked is retrieved, and all blacklisted query params and anchors are removed.
  3. Single Page Application: the url from which the javascript is invoked is retrieved, and all blacklisted query params are removed.

 

Blacklisted query params are:

  • 'utm_source'
  • 'utm_medium'
  • 'utm_term'
  • 'utm_content'
  • 'utm_campaign'
  • '_'
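
The three "Website type" rules and the blacklist above can be sketched as a single cleaning function. This is a minimal illustration under stated assumptions: the name cleanPageUrl and the lowercase type strings are hypothetical, and the real scraper may normalize URLs differently.

```javascript
// Query parameters listed as blacklisted in this article.
const BLACKLISTED_PARAMS = [
  "utm_source", "utm_medium", "utm_term", "utm_content", "utm_campaign", "_",
];

// Hypothetical helper: applies the "Website type" rules to a page address.
function cleanPageUrl(pageUrl, websiteType) {
  const url = new URL(pageUrl);
  switch (websiteType) {
    case "static":
      // Static: drop every query parameter and the anchor.
      url.search = "";
      url.hash = "";
      break;
    case "dynamic":
      // Dynamic: drop blacklisted parameters and the anchor.
      BLACKLISTED_PARAMS.forEach((name) => url.searchParams.delete(name));
      url.hash = "";
      break;
    case "spa":
      // Single Page Application: drop blacklisted parameters, keep the anchor.
      BLACKLISTED_PARAMS.forEach((name) => url.searchParams.delete(name));
      break;
    default:
      throw new Error(`Unknown website type: ${websiteType}`);
  }
  return url.toString();
}

const sample = "http://www.example.com/somePath?id=7&utm_source=mail&_=123#top";
console.log(cleanPageUrl(sample, "static"));
console.log(cleanPageUrl(sample, "dynamic"));
console.log(cleanPageUrl(sample, "spa"));
```

Keeping the anchor only for single page applications matters because such sites often encode the current view in the fragment.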

 

Manual mode

An additional option is "Manual". In this mode you select the pages of the website to be imported yourself, by invoking the following track method:

trackMS('http://www.example.com/somePath');
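
One possible way to use manual mode is to call the track method only for pages on an allow-list. The sketch below assumes that trackMS is exposed as a global function by the installed snippet; the list PAGES_TO_IMPORT and the helper trackIfSelected are hypothetical names used for illustration only.

```javascript
// Hypothetical allow-list of paths to import; replace with your own.
const PAGES_TO_IMPORT = ["/products", "/about"];

// `track` is passed in so the sketch can run without the real snippet;
// on a live page you would pass the trackMS function.
function trackIfSelected(pageUrl, track) {
  const { origin, pathname } = new URL(pageUrl);
  if (!PAGES_TO_IMPORT.includes(pathname)) {
    return false; // Page not selected: nothing is tracked.
  }
  // Track the address without query string or anchor.
  track(origin + pathname);
  return true;
}
```

On a page of the linked website, the call would look like trackIfSelected(window.location.href, trackMS).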

 

Pages optimization

Magic Site Integration scrapes the whole source code of each page of the website, looking for relevant information that can be used to enrich URL content. For better and faster retrieval of this information, we recommend that you make sure the following information is included in the source code of each page you want to import.

 

Tags

First of all make sure that the Semantic Engine is enabled, otherwise Magic Site won't be able to extract tags from your web pages.

 

If the “Use metatag keywords” flag is on, content tags will be extracted from the "keywords" meta tag in the HTML:

<meta name="keywords" content="tag1 tag2 tag3 tag4">

 

If the "Use OG tags for content enrichment" flag is also on, content tags will be extracted from the following OG tags: “video:tag”, “article:tag” or “book:tag”. Otherwise they will be extracted from the "keywords" meta tag, as in the example above.

 

<meta property="og:video:tag" content="tag1 tag2 tag3 tag4" />
<meta property="og:article:tag" content="tag1 tag2 tag3 tag4" />
<meta property="og:book:tag" content="tag1 tag2 tag3 tag4" />

 

Title

The title will be extracted from the og:title tag, if it exists:

<meta property="og:title" content="Page Title" />

 

If this tag is not present, the title will be extracted from the HTML "title" tag:

<title>Page title</title>

 

If neither of these tags is present, the title will be created from the actual page address.

 

Description

If the flag "Use OG tags for content enrichment" is enabled, content description will be extracted from the “og:description” tag (if present):

<meta property="og:description" content="Page description" />

If this tag is not present, content description will be extracted from the meta description tag:

<meta name="description" content="Page description" />

 

Thumbnail

If the flag "Use OG tags for content enrichment" is enabled, content thumbnail will be extracted from the “og:image” tag.

<meta property="og:image" content="http://someimageurl.com/imageName.jpg" />

If the flag is not enabled or the og:image tag is not present, the thumbnail will be generated from a page screenshot.
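
The Title, Description and Thumbnail rules above can be summarized in one sketch. It is an illustration, not THRON's implementation: the function pickMetadata and the field names of the page object are hypothetical, and using the meta description when the OG flag is off is an assumption, since the article does not describe that case.

```javascript
// Sketch only: `page` holds values already read from the page's HTML and
// `useOgTags` mirrors the "Use OG tags for content enrichment" flag.
function pickMetadata(page, useOgTags) {
  return {
    // Title: og:title, then <title>, then the page address.
    title: page.ogTitle || page.htmlTitle || page.url,
    // Description: og:description when the flag is on, else the meta
    // description (assumed also for the flag-off case); null if none.
    description: (useOgTags && page.ogDescription) || page.metaDescription || null,
    // Thumbnail: og:image when the flag is on; null stands for
    // "generate the thumbnail from a page screenshot".
    thumbnail: (useOgTags && page.ogImage) || null,
  };
}

const examplePage = {
  url: "http://www.example.com/somePath",
  htmlTitle: "Page title",
  metaDescription: "Page description",
  ogImage: "http://someimageurl.com/imageName.jpg",
};
console.log(pickMetadata(examplePage, true));
console.log(pickMetadata(examplePage, false));
```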

 

Language

Language will be retrieved from the "lang" attribute of the html tag:

<html lang="en">