Importing data from social media

Use connectors to blogs, forums, Twitter, Facebook, Instagram, and YouTube to import data from these channels directly into your project.

Written by Tomas Larsson
Updated over 5 years ago

Keywords: data import, social media data, facebook data, twitter data, instagram data, youtube data, blog data, forum data

To make it easier to access data from social media, Dcipher provides connectors to various social media sources. Note that only public data is available. For some of the sources, this means official accounts such as those run by news organizations, companies, and public figures.

Step-by-step guide

1. Start the data import wizard

Click the "Import" button at the top of the workspace, then select "Import data from social media".

2. Select social media channels

Select the social media channels that you want to import data from. You can choose one or more among blogs, Facebook, forums, Instagram, Twitter, and YouTube.

☝️ Note: If you import data from multiple sources, only fields that are common to those sources will be available. So, if you want all fields to be available, import from one source at a time.
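The effect of combining sources can be pictured as a set intersection. The sketch below is illustrative only: the field names are hypothetical and are not taken from Dcipher's actual schemas.

```python
# Hypothetical field names per source; the real Dcipher schemas may differ.
FIELDS = {
    "twitter": {"text", "author", "date", "retweets", "hashtags"},
    "youtube": {"text", "author", "date", "views", "video_url"},
    "blogs": {"text", "author", "date", "title"},
}


def common_fields(sources):
    """Return the fields shared by every selected source."""
    if not sources:
        return set()
    field_sets = [FIELDS[name] for name in sources]
    return set.intersection(*field_sets)


# Two sources: only the shared fields survive.
print(sorted(common_fields(["twitter", "youtube"])))  # ['author', 'date', 'text']

# One source: every field of that source is kept.
print(sorted(common_fields(["twitter"])))
```

With these made-up fields, importing Twitter and YouTube together would drop "retweets" and "views", which is why a single-source import is the safer choice when you need source-specific fields.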

Click "Select source".

3. Set the search criteria

Specify the search criteria for the posts you're interested in. At least one keyword needs to be provided for the search.

The keyword fields available are:

  • "All these words/phrases": The posts have to contain all the provided keywords.

  • "Any of these words/phrases": The posts have to contain at least one of the provided keywords.

  • "None of these words/phrases": The posts are not allowed to contain any of the provided keywords.
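How the three keyword fields combine can be sketched as a small filter function. This is an approximation for illustration, not Dcipher's implementation: it assumes case-insensitive substring matching, and the function and parameter names are hypothetical.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Return True if the text satisfies all three keyword fields.

    Assumption: matching is a case-insensitive substring test. The real
    search may tokenize or normalize text differently.
    """
    lowered = text.lower()

    def has(keyword):
        return keyword.lower() in lowered

    # "All these words/phrases": every keyword must be present.
    if not all(has(k) for k in all_of):
        return False
    # "Any of these words/phrases": at least one must be present, if any are given.
    if any_of and not any(has(k) for k in any_of):
        return False
    # "None of these words/phrases": no excluded keyword may be present.
    if any(has(k) for k in none_of):
        return False
    return True


keywords = ["COVID-19", "coronavirus"]
print(matches("New COVID-19 figures released", any_of=keywords))                 # True
print(matches("Coronavirus hoax debunked", any_of=keywords, none_of=["hoax"]))   # False
print(matches("Weather update for the weekend", any_of=keywords))                # False
```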

Depending on the source, additional fields are available for the search, such as language, location, content type, and authors. Click "Show advanced settings" to access all search fields.

How far back posts are available depends on the channel:

  • Blogs: most recent year

  • Facebook: most recent month

  • Forums: most recent year

  • Instagram: most recent month

  • Twitter: most recent month

  • YouTube: most recent month
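When you search several channels at once, the shortest history window limits the date range that all of them can cover. A rough sketch of that arithmetic follows, assuming "year" means 365 days and "month" means 30 days; the exact cutoffs Dcipher applies are not documented here, and the names are hypothetical.

```python
from datetime import date, timedelta

# Assumed window lengths in days; the exact cutoffs may differ.
WINDOW_DAYS = {
    "blogs": 365,
    "facebook": 30,
    "forums": 365,
    "instagram": 30,
    "twitter": 30,
    "youtube": 30,
}


def earliest_available(channel, today):
    """Oldest date for which the channel still has posts."""
    return today - timedelta(days=WINDOW_DAYS[channel])


def usable_start(channels, today):
    """Earliest start date that every selected channel can still cover."""
    return max(earliest_available(c, today) for c in channels)


today = date(2020, 6, 30)
print(earliest_available("twitter", today))        # 2020-05-31
print(usable_start(["blogs", "twitter"], today))   # 2020-05-31
```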

Click "Find posts" when you're done with the search settings.

4. Decide how many posts to import

The import wizard now displays the number of matching posts in each selected channel, and you can set how many posts to import from each one.

If you select a smaller number than the total available, you will get the first posts based on the sorting parameters set in the previous step. For example, you may want the 1000 most recent posts or the 1000 posts with the highest engagement score.
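Conceptually, this is a sort followed by a cut-off. The sketch below illustrates the idea with made-up posts; the "date" and "engagement" field names are hypothetical.

```python
def select_posts(posts, limit, sort_key, descending=True):
    """Sort posts by the chosen field and keep the first `limit` of them."""
    ranked = sorted(posts, key=lambda post: post[sort_key], reverse=descending)
    return ranked[:limit]


# Made-up sample data; ISO dates sort correctly as strings.
posts = [
    {"id": 1, "date": "2020-06-01", "engagement": 12},
    {"id": 2, "date": "2020-06-15", "engagement": 340},
    {"id": 3, "date": "2020-06-20", "engagement": 5},
    {"id": 4, "date": "2020-06-10", "engagement": 87},
]

most_recent = select_posts(posts, limit=2, sort_key="date")
most_engaging = select_posts(posts, limit=2, sort_key="engagement")

print([p["id"] for p in most_recent])    # [3, 2]
print([p["id"] for p in most_engaging])  # [2, 4]
```

The same four posts yield different selections depending on the sort field, so it is worth checking the sorting parameters before lowering the post count.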

Importing thousands of posts may take a few minutes.

5. Name the dataset

Choose a name for your newly imported dataset.

6. Import the data

Click "Import" to import the data into your project. The imported data is displayed in the Schema workbench.

Example: Importing data from Twitter

In this example, we're interested in tweets, so we select Twitter as the source in the import step.

In the "Any of these words/phrases" keyword field, we type the keywords "COVID-19" and "coronavirus" to find posts that contain either of these keywords.

We select "English" as the language we're interested in.

We keep the default start date and end date to get tweets from the most recent month.

We make sure that "Tweet type" is set to "Tweet". This gives the original tweets, without their many retweets.

We are interested in all content types – photos, videos, and so on – so under "Show advanced settings" we leave the "Content types" field as it is.

We leave the geographical location as "all" to get posts from all over the world.

We're now happy with the search settings and click "Find posts".

Among all the posts that match the search criteria, we can now select how many we want to import. In this case, we set it to 1000 tweets, give the dataset a name of our choice, and click "Import".

One thousand English-language tweets containing "COVID-19" or "coronavirus" from the last month have now been imported into the project.
