Creating a Knowledge Base

To start an Insight Booster project, you need a Knowledge Base. This article explains how to create one from various data sources.

Written by Öykü Aygül
Updated over 2 months ago

A Knowledge Base (KB) is the data used as input for an Insight Booster (IB) project. Creating a KB is a prerequisite for utilizing the Insight Booster. You can create KBs from a variety of data sources, including:

  1. Knowledge Base based on social media

  2. Knowledge Base based on PDF reports

  3. Knowledge Base based on news

  4. Knowledge Base based on Word documents

  5. Knowledge Base based on structured datasets

To create a Knowledge Base, navigate to the Knowledge Base icon in the blue ribbon at the top of the page. Alternatively, you can open an existing Insight Booster project and create a Knowledge Base directly within that project.

Creating a Knowledge Base through an Insight Booster project

Select the “Insight Booster” card to navigate to the Insight Booster menu.

Click “Create a new Insight Booster project”. You can also click an existing project and create the Knowledge Base there.

Write a name for the new Insight Booster project and click “Create”.

Select one of the data sources to create the Knowledge Base.

Once the Knowledge Base has been created, it will be automatically linked to the Insight Booster you used to create it.

Creating a Knowledge Base through the Knowledge Base menu

You can create a Knowledge Base by navigating to the Knowledge Base icon in the blue ribbon at the top of the page.

Click “Create a new Knowledge Base” to see the list of available data sources.

Knowledge Base based on social media

Dcipher Analytics offers data from Twitter, Facebook, Instagram, and YouTube. Click “Create” to create a Knowledge Base that draws on social media posts from one or more platforms.

Next, click “Select input data” and choose the sources you wish to include in your Knowledge Base. You can select one, multiple, or all sources from the list provided.

In this section, you define the search space through keywords and other restrictions.

Number 1 - Import keywords: Use this button to upload keywords from an Excel spreadsheet. This method lets you create detailed search queries using Boolean operators.

Number 2 - All of these words/phrases: Add keywords here that must be in every post. You can copy and paste them from an Excel column, with each row becoming a separate keyword, or enter them one by one. In Boolean terms, these terms are connected with "AND".

Number 3 - Any of these words/phrases: At least one of these keywords must appear in each post. In the example shown, posts containing any of the five terms are downloaded. You can copy and paste keywords from an Excel column, with each row becoming a separate keyword, or add them individually. In Boolean terms, these keywords are connected with "OR". If you fill in both keyword sections, the query combines the “All of these words/phrases” and “Any of these words/phrases” keywords.

Number 4 - None of these words/phrases: Posts containing any of these keywords are excluded from your search, even if they also contain your search terms. Use this field for keywords that tend to bring in irrelevant content. For instance, to avoid posts about a celebrity with the same name as the company you’re analyzing, add the celebrity’s full name here.

Number 5 - Language: You can specify the language of the posts you want to scan so that it matches the terms used in your search query. Choosing the appropriate language helps avoid ambiguities caused by false friends, that is, words that have different meanings in different languages. For example, the word "gift" means "present" in English but "poison" or "wedded" in Swedish. Setting the language properly gives more accurate and relevant search results.

Number 6 - Country: You can choose the countries from which to download data using the provided list. Note, however, that some posts do not carry country information. When you select countries, the analysis includes only posts that carry country information and come from those countries. Posts from those countries whose country data is not encoded will be missing from your analysis.

For this reason, we recommend refraining from adding any countries when comprehensive information is critical. This advice is specifically relevant to social media posts; however, it does not apply to news articles, as all news articles include a source country value.

Number 7 - Posted after: Select the earliest date to be included in your analysis. Please note that social media posts are only available for the past month; earlier posts cannot be accessed.

Number 8 - Posted before: Select the latest date to be included in your analysis.

Number 9 - Sort by: Sort the collected posts by overall score or publication date. This is particularly useful for sampling data.

Number 10 - Post order: This setting controls the order of posts when you sample data. By default, posts are sorted from the most recent to the oldest. If you choose to sample data, the system will select the latest posts based on this order. If you aren’t using the sampling feature, you can keep this setting as it is, as it won’t impact the contents of your Knowledge Base.
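
The three keyword fields (Numbers 2 to 4 above) can be read as a single Boolean filter. The following Python sketch is only an illustration of that logic, not the platform's implementation. The function name `matches_query`, the case-insensitive substring matching, and the choice to combine the filled-in fields with AND are all assumptions made for the example.

```python
def matches_query(text, all_of=(), any_of=(), none_of=()):
    """Return True if `text` satisfies the three keyword fields.

    Assumptions (not confirmed by the product): matching is a
    case-insensitive substring test, and the fields are combined
    with AND when more than one of them is filled in.
    """
    lowered = text.lower()

    def contains(term):
        return term.lower() in lowered

    has_all = all(contains(t) for t in all_of)                      # "AND" field
    has_any = any(contains(t) for t in any_of) if any_of else True  # "OR" field
    has_none = not any(contains(t) for t in none_of)                # exclusion field
    return has_all and has_any and has_none


# Hypothetical posts used only to exercise the filter.
posts = [
    "Apple unveils a new phone",
    "Apple pie recipe with cinnamon",
    "Samsung unveils a new phone",
]
kept = [
    p for p in posts
    if matches_query(p, all_of=["apple"], any_of=["phone", "laptop"], none_of=["recipe"])
]
print(kept)
```

Only the first post is kept: the second contains neither "phone" nor "laptop", and the third lacks the required keyword "apple".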

Once you are set, click “Find posts”.

After clicking “Find posts”, you'll see the number of matches for your query, which helps you gauge the available data volume for your analysis. If the total number of posts exceeds your desired volume, you can refine your selection.
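
When the number of matches exceeds the volume you want, sampling keeps only part of the result. The sketch below illustrates the default behavior described under “Post order” (latest posts first). The function name `sample_latest` and the dictionary layout of a post are assumptions made for the example.

```python
def sample_latest(posts, sample_size):
    """Return the `sample_size` most recent posts, newest first.

    Each post is a dict with an ISO-formatted "date" string, which
    sorts chronologically as plain text.
    """
    if sample_size < 0:
        raise ValueError("sample_size must not be negative")
    ordered = sorted(posts, key=lambda p: p["date"], reverse=True)
    return ordered[:sample_size]


# Hypothetical posts used only to exercise the function.
posts = [
    {"date": "2024-05-01", "text": "oldest"},
    {"date": "2024-05-03", "text": "newest"},
    {"date": "2024-05-02", "text": "middle"},
]
sample = sample_latest(posts, 2)
print([p["text"] for p in sample])
```

Asking for more posts than there are matches simply returns every match.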

Once you’ve entered the preferred file name, click “Import”.

Click “Continue” to proceed to workflow settings.

You can create a Knowledge Base for one-time use, schedule it for automatic updates, or trigger it through an API call for dynamic data integration.

Immediate single run

This is the default option, allowing you to run the workflow once. The results will automatically be converted into a Knowledge Base. Please note that to schedule a workflow, you must first create it through an immediate single run.

Click “Continue” to configure further settings.

You can either create a new Knowledge Base or update an existing one. If you are creating a new one, provide a clear name for your Knowledge Base, and it will appear under this name in the Knowledge Base list.

Next, enter a name for the workflow, which will be displayed under “My Workflows”. You may also add a description for the workflow if desired.

If you are updating an existing Knowledge Base, select “Update” as the action type and choose the Knowledge Base name from the list below.

Once you have filled out these details, click “Create” to finalize the setup.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases” accessible through the blue ribbon at the top. Successfully created Knowledge Bases will have a green check mark next to their name.

At scheduled intervals

You can also update Knowledge Bases at scheduled intervals. Please note that this functionality can only be applied to existing Knowledge Bases, as outlined in the previous section, “Immediate single run”.

At this stage, you can select your preferred frequency for updating. The settings shown below will create a workflow that updates the Knowledge Base every Monday at 8 AM with data from the previous week.
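
To make the weekly example concrete, the sketch below computes when such a workflow would next run and which period it would cover. It assumes that "the previous week" means the seven days ending at the run time. The function names are hypothetical, and the platform handles this scheduling for you.

```python
from datetime import datetime, timedelta


def next_monday_run(now, hour=8):
    """Return the next Monday at `hour`:00 that lies strictly after `now`."""
    days_ahead = (0 - now.weekday()) % 7  # Monday is weekday 0
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


def previous_week_window(run_time):
    """Return (start, end) covering the seven days before `run_time`."""
    return run_time - timedelta(days=7), run_time


now = datetime(2024, 5, 15, 12, 30)  # a Wednesday
run = next_monday_run(now)
start, end = previous_week_window(run)
print("next run:", run)
print("covers:", start, "to", end)
```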

Click “Continue” to proceed.

Select the Knowledge Base you want to automate from the dropdown menu.

If you want the results in the Insight Booster project to update automatically, enable the setting indicated by the arrow to regenerate the results.

Provide a name and optionally a description for the workflow, and click “Create”.

Trigger through API call

Select the “Trigger through API call” option, then click the plus sign to configure the variables you wish to modify for each API run.

You can choose a parameter from the dropdown menu to change the search query (i.e., the downloaded data) and assign a name to the variable. For example, if you want to modify the country for each run, simply select “country” from this menu. When you initiate a new run through the API, you will be prompted to configure a value for the country variable.

You can also define settings through operations. Click on “Add operation”, select the relevant operation, and configure the associated variables. These operations represent the steps your data undergo to transform into a Knowledge Base.

For instance, if you select the node that displays the “Sample” operation and click “Select”, you’ll be able to configure parameters related to the sample, in addition to the country.

Here, you can choose the sample size and set it for the upcoming run. Once you have selected your variables, click “Continue”.
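
Each variable you configure here becomes a value you supply on every API run. This article does not document the request format, so the sketch below only illustrates the idea of bundling per-run values. The `{"variables": ...}` layout, the variable names `country` and `sample_size`, and the function name are all hypothetical; check the API reference for the actual format before relying on any of them.

```python
import json


def build_run_payload(variables):
    """Serialize per-run variable values into a JSON string.

    The {"variables": {...}} shape is a hypothetical layout chosen
    for illustration, not the documented request format.
    """
    for name, value in variables.items():
        if value is None or value == "":
            raise ValueError(f"variable {name!r} needs a value for this run")
    return json.dumps({"variables": variables}, sort_keys=True)


payload = build_run_payload({"country": "SE", "sample_size": 500})
print(payload)
```

Rejecting empty values before sending mirrors the prompt to configure a value for each variable when a run is initiated.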

Choose the Knowledge Base where you want to trigger runs through API, enter a workflow name, and click “Create”.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases”, accessible through the blue ribbon at the top. You will also receive an email once the Knowledge Base has been created. Successfully created Knowledge Bases have a green check mark next to their name.

Knowledge Base based on news

Dcipher Analytics offers data from news sources worldwide. Click “Create” to create a Knowledge Base that utilizes news articles.

Click “Select input data” to define the search query to extract news articles.

In this section, you define the search space through keywords and other restrictions.

Number 1 - Import keywords: Use this button to upload keywords from an Excel spreadsheet. This method lets you create detailed search queries using Boolean operators.

Number 2 - All of these words/phrases: Add keywords here that must be in every article. You can copy and paste them from an Excel column, with each row becoming a separate keyword, or enter them one by one. In Boolean terms, these terms are connected with "AND".

Number 3 - Any of these words/phrases: At least one of these keywords must appear in each article. In the example shown, articles containing any of the four terms in the query are downloaded. You can copy and paste keywords from an Excel column, with each row becoming a separate keyword, or add them individually. In Boolean terms, these keywords are connected with "OR". If you fill in both keyword sections, the query combines the “All of these words/phrases” and “Any of these words/phrases” keywords.

Number 4 - None of these words/phrases: Articles containing any of these terms are excluded from your data, even if they also contain your search terms. Use this field for keywords that tend to bring in irrelevant content. For example, if you want to download data about a company whose name is also the last name of an unrelated celebrity, add the celebrity’s name in this section to avoid getting articles about that celebrity.

Number 5 - Language: You can specify the language of the articles you want to scan so that it matches the terms used in your search query. Choosing the appropriate language helps avoid ambiguities caused by false friends, that is, words that have different meanings in different languages. For example, the word "gift" means "present" in English but "poison" or "wedded" in Swedish. Setting the language properly gives more accurate and relevant search results.

Number 6 - Country: You can choose the countries from which to download data using the provided list.

Number 7 - Posted after: Select the earliest date to be included in your analysis.

Number 8 - Posted before: Select the latest date to be included in your analysis.

Number 9 - Post order: This is set to weekly sampling by default.

Number 10 - Show advanced settings: Display source-related settings.

Number 11 - Sites: If you want to restrict your analysis to specific sites, you can use this setting.

Number 12 - Lowest site rank (globally): If you set the lowest rank to 10, only the top 10 sites worldwide are included in the analysis. This global setting therefore limits the analysis to a maximum of 10 sites.

Number 13 - Lowest site rank (within country): This works similarly to the global setting but applies to each country individually. If the analysis involves 3 countries and the rank is set to 10, it will include the top 10 sites from each country, resulting in a total of 30 sites.

Number 14 - Content provider type: You can choose the content provider types from the available options, which include, among others, editorial media, magazines, and local broadcasts.

Number 15 - Media types: You can also choose the media type, such as web, blog, print, TV, or podcast.
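
The arithmetic behind the two site rank settings (Numbers 12 and 13 above) can be sketched as follows. The function name `max_sites` is hypothetical, and the result is an upper bound rather than the exact number of sites the platform will return.

```python
def max_sites(lowest_rank, countries=None):
    """Upper bound on the number of sites included by a rank cutoff.

    With no countries, the cutoff is treated as the global setting.
    Otherwise it applies once per selected country.
    """
    if lowest_rank < 1:
        raise ValueError("lowest_rank must be at least 1")
    if not countries:
        return lowest_rank
    return lowest_rank * len(set(countries))


print(max_sites(10))                      # global cutoff
print(max_sites(10, ["SE", "NO", "DK"]))  # cutoff applied per country
```

The first call gives 10 and the second gives 30, matching the figures in the two settings above.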

Once you are set, click “Find articles”.

From the total number of articles identified that match your search criteria, you can specify how many articles you would like to work with. The maximum number you can enter is equal to the number of matching entries. After setting the desired amount of data and providing a file name, click “Continue” to proceed to the workflow settings.

Click “Import” to proceed.

You can create a Knowledge Base for one-time use, schedule it for automatic updates, or trigger it through an API call for dynamic data integration.

Immediate single run

This is the default option, allowing you to run the workflow once. The results will automatically be converted into a Knowledge Base. Please note that to schedule a workflow, you must first create it through an immediate single run.

Click “Continue” to configure further settings.

You can either create a new Knowledge Base or update an existing one. If you are creating a new one, provide a clear name for your Knowledge Base, and it will appear under this name in the Knowledge Base list.

Next, enter a name for the workflow, which will be displayed under "My Workflows". You may also add a description for the workflow if desired.

If you are updating an existing Knowledge Base, select “Update” as the action type and select the Knowledge Base name from the list underneath.

Once you have filled out these details, click “Create” to finalize the setup.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases”, accessible through the blue ribbon at the top. Successfully created Knowledge Bases have a green check mark next to their name.

At scheduled intervals

You can also update Knowledge Bases at scheduled intervals. Keep in mind that this functionality can only be applied to existing Knowledge Bases created through the steps explained in the previous section, “Immediate single run”.

At this stage, you can select your preferred frequency for updating. The settings exemplified below would create a workflow that updates a Knowledge Base on the first day of each month with data covering the past month.
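
For the monthly example, the period covered by each run can be worked out as in the sketch below. It assumes that "the past month" means the full calendar month before the run date, and the function name is hypothetical.

```python
from datetime import date, timedelta


def previous_month_window(run_date):
    """Return (first_day, last_day) of the calendar month before `run_date`."""
    first_of_this_month = run_date.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


start, end = previous_month_window(date(2024, 3, 1))
print(start, "to", end)
```

Stepping back one day from the first of the month handles month lengths, leap years, and the January-to-December rollover without special cases.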

Click “Continue” to proceed.

Select the Knowledge Base you want to automate from the dropdown menu.

If you want the results in the Insight Booster project to update automatically, enable the setting indicated by the arrow to regenerate the results.

Provide a name and optionally a description for the workflow, and click “Create”.

Trigger through API call

Select the “Trigger through API call” option, then click the plus sign to configure the variables you wish to modify for each API run.

You can choose a parameter from the dropdown menu to change the search query (i.e., the downloaded data) and assign a name to the variable. For example, if you want to modify the country for each run, simply select “country” from this menu. When you initiate a new run through the API, you will be prompted to configure a value for the country variable.

You can also define settings through operations. Click on “Add operation”, select the relevant operation, and configure the associated variables. These operations represent the steps your data undergo to transform into a Knowledge Base.

For instance, if you select the node that displays the “Sample” operation and click “Select”, you’ll be able to configure parameters related to the sample, in addition to the country.

Here, you can choose the sample size and set it for the upcoming run. Once you have selected your variables, click “Continue”.

Choose the Knowledge Base where you want to trigger runs through API, enter a workflow name, and click “Create”.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases”, accessible through the blue ribbon at the top. You will also receive an email once the Knowledge Base has been created. Successfully created Knowledge Bases have a green check mark next to their name.

Knowledge Base based on PDF reports

You can also analyze your PDF files. Click “Create” to get started.

Click “Select input data”.

Upload the PDF file(s) you would like to work with. Once they have been uploaded, they will appear under “Files”. Click “Select file(s)” to continue.

Provide a file name and click “Continue”.

The PDFs you uploaded will be merged into a single file to facilitate the analysis. Although the files will be combined, you will still have access to information about the original source for each piece of text in the Knowledge Base and the Insight Booster project.
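
One way to picture merging files while keeping source information is to store the originating file name next to every passage. The sketch below uses plain strings in place of extracted PDF text. The file names, the record layout, and the function name are hypothetical and do not describe the platform's internal format.

```python
def merge_documents(documents):
    """Merge several documents into one list of passages with provenance.

    `documents` maps a file name to its list of text passages.
    Each merged record remembers which file the passage came from.
    """
    merged = []
    for filename, passages in documents.items():
        for index, passage in enumerate(passages):
            merged.append({"source": filename, "position": index, "text": passage})
    return merged


# Hypothetical file names and passages used only for illustration.
merged = merge_documents({
    "report_2023.pdf": ["Revenue grew.", "Costs fell."],
    "report_2024.pdf": ["Outlook is stable."],
})
for row in merged:
    print(row["source"], "|", row["text"])
```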

You can find both the merged PDF and the individual PDFs under “Files”.

Click “Import” to import the data.

Click “Continue” to proceed to workflow settings.

Immediate single run

Keep the default selection of “Immediate single run” and click “Continue”.

You can either create a new Knowledge Base or update an existing one. If you are creating a new one, provide a clear name for your Knowledge Base, and it will appear under this name in the Knowledge Base list.

Next, enter a name for the workflow, which will be displayed under “My Workflows”. You may also add a description for the workflow if desired.

If you are updating an existing Knowledge Base, select “Update” as the action type and choose the Knowledge Base name from the list below.

Once you have completed these details, click “Create” to finalize the setup.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases”, accessible through the blue ribbon at the top. You will also receive an email once the Knowledge Base has been created. Successfully created Knowledge Bases have a green check mark next to their name.

Knowledge Base based on structured datasets

You can analyze structured datasets in Excel, JSON, or CSV formats. Click “Create” to get started. For a dataset to be used as input for a Knowledge Base, it must include fields for text, date, and source.

Click “Select input data”.

Upload your file and click “Select file(s)”. In this example, we have uploaded an Excel file containing UN debates.

Select the relevant fields for the Knowledge Base, including the main text field, date, and source, then click “Continue”.
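
Before uploading, it can help to confirm that your dataset has a column for each of the three required fields. The sketch below checks a CSV file's header against a chosen mapping. The column names (`speech`, `year`, `country`) and the function name are hypothetical, and the check runs on your side, independently of the platform.

```python
import csv
import io

REQUIRED_ROLES = ("text", "date", "source")


def check_field_mapping(csv_text, mapping):
    """Verify that each required role maps to a column present in the CSV.

    `mapping` maps a role ("text", "date", "source") to a column name.
    Returns a list of problems; an empty list means the mapping is usable.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    problems = []
    for role in REQUIRED_ROLES:
        column = mapping.get(role)
        if column is None:
            problems.append(f"no column chosen for {role}")
        elif column not in columns:
            problems.append(f"column {column!r} not found for {role}")
    return problems


# Hypothetical dataset with one row, used only to exercise the check.
sample = "speech,year,country\nWe meet today to discuss peace.,1995,SWE\n"
print(check_field_mapping(sample, {"text": "speech", "date": "year", "source": "country"}))
print(check_field_mapping(sample, {"text": "speech", "date": "when"}))
```

The first mapping passes with no problems. The second reports that the column chosen for the date is missing from the file and that no column was chosen for the source.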

Immediate single run

Keep the default selection of “Immediate single run” and click “Continue”.

You can either create a new Knowledge Base or update an existing one. If you are creating a new one, provide a clear name for your Knowledge Base, and it will appear under this name in the Knowledge Base list.

Next, enter a name for the workflow, which will be displayed under “My Workflows”. You may also add a description for the workflow if desired.

If you are updating an existing Knowledge Base, select “Update” as the action type and choose the Knowledge Base name from the list below.

Once you have filled out these details, click “Create” to finalize the setup.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases”, accessible through the blue ribbon at the top. You will also receive an email once the Knowledge Base has been created. Successfully created Knowledge Bases have a green check mark next to their name.

Knowledge Base based on Word documents

To create a Knowledge Base from a Word (.docx) document, click “Create”.

Click “Select input data” to upload the Word (.docx) files to be used for the Knowledge Base.

To upload your file, drag and drop it into the designated drop zone, or click on the drop zone to select your .docx file. Once you’ve uploaded your file, click on “Select file(s)” to complete the upload.

Click “Import”.

Click “Continue” to proceed.

Immediate single run

Keep the default selection of “Immediate single run” and click “Continue”.

You can either create a new Knowledge Base or update an existing one. If you are creating a new one, provide a clear name for your Knowledge Base, and it will appear under this name in the Knowledge Base list.

Next, enter a name for the workflow, which will be displayed under “My Workflows”. You may also add a description for the workflow if desired.

If you are updating an existing Knowledge Base, select “Update” as the action type and choose the Knowledge Base name from the list below.

Once you have filled out these details, click “Create” to finalize the setup.

Once your Knowledge Base has been created, you will find it under “My Knowledge Bases”, accessible through the blue ribbon at the top. You will also receive an email once the Knowledge Base has been created. Successfully created Knowledge Bases have a green check mark next to their name.
